2.4 Searching Algorithms

Searching algorithms are fundamental techniques for finding specific elements or values within data structures. Understanding different search strategies and when to apply them is crucial for efficient problem-solving.

2.4.1 Binary Search

Binary Search is a highly efficient divide-and-conquer algorithm for finding a target value in a sorted array. It repeatedly divides the search space in half, eliminating half of the remaining elements with each comparison.

2.4.1.1 Prerequisites

Before applying binary search, ensure these conditions are met:

  1. The array/list must be sorted - Either in ascending or descending order
    • This is the most critical requirement
    • If the array is unsorted, you must sort it first (\(O(n \log n)\)) or use a different search strategy
    • Interviewers often test this - always verify the data is sorted before applying binary search
  2. Random access to elements - Direct index access in \(O(1)\) time
    • Works well with arrays and array-based data structures
    • Not suitable for linked lists (cannot jump to middle element efficiently)
  3. Consistent comparison operation - The comparison logic must be well-defined and consistent
    • For custom objects, ensure proper comparator implementation
    • The sorted property must be maintained throughout the search

When the data is not sorted: Consider linear search (\(O(n)\)), a hash table with \(O(1)\) average lookup, or sorting first if multiple searches are needed.
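
The hash-table alternative can be sketched as follows (class and method names are illustrative): build the set once in \(O(n)\), then answer each membership query in \(O(1)\) average time, with no sorting required.

```java
import java.util.HashSet;
import java.util.Set;

public class UnsortedLookup {
    // For unsorted data queried many times, a hash set gives O(1) average
    // lookups after a one-time O(n) build, avoiding the O(n log n) sort
    // that binary search would require.
    static boolean contains(int[] nums, int target) {
        Set<Integer> seen = new HashSet<>();
        for (int x : nums) seen.add(x);  // O(n) one-time build
        return seen.contains(target);    // O(1) average per query
    }

    public static void main(String[] args) {
        int[] unsorted = {9, 2, 7, 4};
        System.out.println(contains(unsorted, 7));  // true
        System.out.println(contains(unsorted, 5));  // false
    }
}
```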

2.4.1.2 Core Concept

At each step, compare the target with the middle element:

  • If target equals middle: Found!
  • If target is less than middle: Search left half
  • If target is greater than middle: Search right half

This halving process continues until the target is found or the search space is empty.

Time Complexity: \(O(\log n)\) - Each comparison halves the search space

Space Complexity:
  • Iterative: \(O(1)\) - Only uses pointer variables
  • Recursive: \(O(\log n)\) - Call stack depth
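
The \(O(\log n)\) bound can be sanity-checked by counting loop iterations. The harness below (illustrative; it searches an implicit sorted array where nums[i] = i) confirms that for \(n = 1{,}000{,}000\) no search takes more than \(\lfloor \log_2 n \rfloor + 1 = 20\) steps.

```java
public class ComparisonCount {
    // Binary search over an implicit sorted array nums[i] = i,
    // counting how many loop iterations (comparisons) are performed.
    static int iterations(int n, int target) {
        int left = 0, right = n - 1, count = 0;
        while (left <= right) {
            count++;
            int mid = left + (right - left) / 2;
            if (mid == target) break;          // nums[mid] == mid here
            else if (mid < target) left = mid + 1;
            else right = mid - 1;
        }
        return count;
    }

    public static void main(String[] args) {
        int n = 1_000_000, worst = 0;
        for (int t : new int[]{0, n / 2, n - 1, 314_159}) {
            worst = Math.max(worst, iterations(n, t));
        }
        // floor(log2(1_000_000)) + 1 = 20
        System.out.println(worst <= 20);  // true
    }
}
```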

2.4.1.3 Iterative Implementation (Two Pointers)

The iterative approach uses two pointers (left and right) to define the current search space and a loop to repeatedly narrow it down.

Template:

public int binarySearch(int[] nums, int target) {
    int left = 0;
    int right = nums.length - 1;

    while (left <= right) {
        // Avoid integer overflow: (left + right) / 2 can exceed Integer.MAX_VALUE
        int mid = left + (right - left) / 2;

        if (nums[mid] == target) {
            return mid;  // Found target
        } else if (nums[mid] < target) {
            left = mid + 1;  // Search right half
        } else {
            right = mid - 1;  // Search left half
        }
    }

    return -1;  // Target not found
}

Key Details:

  1. Loop Condition: left <= right (not left < right)
    • Allows checking when left and right converge to the same index
    • Without =, we’d miss the case where target is at the last remaining position
  2. Mid Calculation: mid = left + (right - left) / 2
    • Prevents integer overflow that could occur with (left + right) / 2
    • When left + right > Integer.MAX_VALUE, direct addition overflows
    • Mathematically equivalent but safer
  3. Pointer Updates:
    • left = mid + 1: We’ve checked mid, so exclude it from the left search
    • right = mid - 1: We’ve checked mid, so exclude it from the right search
    • Using mid instead of mid ± 1 could cause infinite loops

Example Trace: Find target 6 in [1, 3, 5, 6, 8, 9, 11]

Iteration 1: left=0, right=6, mid=3
  nums[3]=6 equals target → Found! Return 3

Example Trace: Find target 7 in [1, 3, 5, 6, 8, 9, 11]

Iteration 1: left=0, right=6, mid=3
  nums[3]=6 < 7 → left=4

Iteration 2: left=4, right=6, mid=5
  nums[5]=9 > 7 → right=4

Iteration 3: left=4, right=4, mid=4
  nums[4]=8 > 7 → right=3

Iteration 4: left=4, right=3
  left > right → Not found, return -1
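
Both traces can be replayed directly against the iterative template (class name is illustrative):

```java
public class BinarySearchDemo {
    // The iterative two-pointer template from the section above.
    static int binarySearch(int[] nums, int target) {
        int left = 0, right = nums.length - 1;
        while (left <= right) {
            int mid = left + (right - left) / 2;
            if (nums[mid] == target) return mid;
            else if (nums[mid] < target) left = mid + 1;
            else right = mid - 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] nums = {1, 3, 5, 6, 8, 9, 11};
        System.out.println(binarySearch(nums, 6));  // 3  (first trace)
        System.out.println(binarySearch(nums, 7));  // -1 (second trace)
    }
}
```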

2.4.1.4 Recursive Implementation

The recursive approach divides the problem into smaller subproblems, mirroring the divide-and-conquer nature of binary search.

Template:

public int binarySearchRecursive(int[] nums, int target) {
    return binarySearchHelper(nums, target, 0, nums.length - 1);
}

private int binarySearchHelper(int[] nums, int target, int left, int right) {
    // Base case: search space is empty
    if (left > right) {
        return -1;
    }

    int mid = left + (right - left) / 2;

    if (nums[mid] == target) {
        return mid;  // Found target
    } else if (nums[mid] < target) {
        // Recurse on right half
        return binarySearchHelper(nums, target, mid + 1, right);
    } else {
        // Recurse on left half
        return binarySearchHelper(nums, target, left, mid - 1);
    }
}

Key Details:

  1. Base Case: left > right indicates empty search space
  2. Recursive Calls: Pass updated boundaries (mid + 1 or mid - 1)
  3. Return Values: Propagate the result back up the call stack

Example Trace: Find target 6 in [1, 3, 5, 6, 8, 9, 11]

Call 1: helper(nums, 6, 0, 6)
  mid=3, nums[3]=6 equals target → Return 3

2.4.1.5 Iterative vs Recursive Comparison

| Aspect | Iterative (Two Pointers) | Recursive |
|---|---|---|
| Implementation | Loop with pointers | Recursive function calls |
| Intuition | Explicit pointer manipulation | Natural divide-and-conquer |
| Space Complexity | \(O(1)\) - No extra space | \(O(\log n)\) - Call stack |
| Performance | Slightly faster (no function call overhead) | Slightly slower (call overhead) |
| Readability | More verbose | Cleaner, mirrors problem structure |
| Stack Overflow Risk | None | Possible for very large arrays (rare) |
| Debugging | Easier to step through | Harder to trace recursion |
| Interview Preference | More common in production | Shows understanding of recursion |

Pros of Iterative:
  • ✅ More space-efficient (\(O(1)\) vs \(O(\log n)\))
  • ✅ Faster in practice (no function call overhead)
  • ✅ No risk of stack overflow
  • ✅ Easier to debug with breakpoints
  • ✅ Preferred in production code

Cons of Iterative:
  • ❌ Slightly more complex to write correctly (pointer management)
  • ❌ Less elegant than recursive version

Pros of Recursive:
  • ✅ Cleaner, more readable code
  • ✅ Directly mirrors the divide-and-conquer strategy
  • ✅ Easier to understand conceptually
  • ✅ Natural for problems with recursive structure

Cons of Recursive:
  • ❌ Uses \(O(\log n)\) extra space for call stack
  • ❌ Function call overhead makes it slower
  • ❌ Risk of stack overflow (though unlikely with \(\log n\) depth)
  • ❌ Harder to debug (call stack tracing)

2.4.1.6 When to Use Which Approach

Use Iterative (Two Pointers) when:
  • Writing production code where space efficiency matters
  • Performance is critical
  • Working with very large datasets
  • You need to avoid any risk of stack overflow
  • The interviewer asks for optimal space complexity

Use Recursive when:
  • The problem naturally fits recursive thinking
  • The interviewer asks specifically for a recursive solution
  • Demonstrating understanding of recursion
  • Code clarity matters more than minor performance gains
  • The problem is a variant requiring recursive modifications

In Practice: Most production code uses iterative binary search due to better space complexity and performance. However, knowing both approaches demonstrates a deeper understanding.

2.4.1.7 Common Binary Search Variants

1. Find First Occurrence (Leftmost)

Find the leftmost (first) index where target appears.

public int findFirst(int[] nums, int target) {
    int left = 0, right = nums.length - 1;
    int result = -1;

    while (left <= right) {
        int mid = left + (right - left) / 2;

        if (nums[mid] == target) {
            result = mid;       // Record position
            right = mid - 1;    // Continue searching left
        } else if (nums[mid] < target) {
            left = mid + 1;
        } else {
            right = mid - 1;
        }
    }

    return result;
}

2. Find Last Occurrence (Rightmost)

Find the rightmost (last) index where target appears.

public int findLast(int[] nums, int target) {
    int left = 0, right = nums.length - 1;
    int result = -1;

    while (left <= right) {
        int mid = left + (right - left) / 2;

        if (nums[mid] == target) {
            result = mid;       // Record position
            left = mid + 1;     // Continue searching right
        } else if (nums[mid] < target) {
            left = mid + 1;
        } else {
            right = mid - 1;
        }
    }

    return result;
}
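
Combining the two variants gives an \(O(\log n)\) occurrence count, a common follow-up question: the number of occurrences is the last index minus the first index plus one. A sketch (helper bodies repeat the variants above; class name is illustrative):

```java
public class OccurrenceCount {
    static int findFirst(int[] nums, int target) {
        int left = 0, right = nums.length - 1, result = -1;
        while (left <= right) {
            int mid = left + (right - left) / 2;
            if (nums[mid] == target) { result = mid; right = mid - 1; }
            else if (nums[mid] < target) left = mid + 1;
            else right = mid - 1;
        }
        return result;
    }

    static int findLast(int[] nums, int target) {
        int left = 0, right = nums.length - 1, result = -1;
        while (left <= right) {
            int mid = left + (right - left) / 2;
            if (nums[mid] == target) { result = mid; left = mid + 1; }
            else if (nums[mid] < target) left = mid + 1;
            else right = mid - 1;
        }
        return result;
    }

    // Occurrences of target = last index - first index + 1
    static int count(int[] nums, int target) {
        int first = findFirst(nums, target);
        return first == -1 ? 0 : findLast(nums, target) - first + 1;
    }

    public static void main(String[] args) {
        int[] nums = {1, 2, 2, 2, 3, 5};
        System.out.println(count(nums, 2));  // 3
        System.out.println(count(nums, 4));  // 0
    }
}
```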

3. Search Insert Position

Find the index where target should be inserted to maintain sorted order.

public int searchInsert(int[] nums, int target) {
    int left = 0, right = nums.length - 1;

    while (left <= right) {
        int mid = left + (right - left) / 2;

        if (nums[mid] == target) {
            return mid;
        } else if (nums[mid] < target) {
            left = mid + 1;
        } else {
            right = mid - 1;
        }
    }

    // After loop: left is the insertion position
    return left;
}

Key Insight: When the loop exits without finding the target, left points to where the target should be inserted.
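
A few probes make the post-loop behavior of left concrete (class name is illustrative):

```java
public class InsertPosition {
    // Search-insert template from the section above: when the target is
    // missing, the loop exits with left at the insertion position.
    static int searchInsert(int[] nums, int target) {
        int left = 0, right = nums.length - 1;
        while (left <= right) {
            int mid = left + (right - left) / 2;
            if (nums[mid] == target) return mid;
            else if (nums[mid] < target) left = mid + 1;
            else right = mid - 1;
        }
        return left;
    }

    public static void main(String[] args) {
        int[] nums = {1, 3, 5, 6};
        System.out.println(searchInsert(nums, 5));  // 2 (found)
        System.out.println(searchInsert(nums, 2));  // 1 (between 1 and 3)
        System.out.println(searchInsert(nums, 7));  // 4 (past the end)
        System.out.println(searchInsert(nums, 0));  // 0 (before the start)
    }
}
```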

2.4.1.8 Common Pitfalls

1. Infinite Loop with Wrong Bounds

// ❌ WRONG: Can cause infinite loop
while (left < right) {  // Missing = sign
    int mid = left + (right - left) / 2;
    if (nums[mid] < target) {
        left = mid;      // ❌ Should be mid + 1
    } else {
        right = mid - 1;
    }
}

// ✅ CORRECT
while (left <= right) {
    int mid = left + (right - left) / 2;
    if (nums[mid] < target) {
        left = mid + 1;  // ✅ Exclude mid
    } else {
        right = mid - 1;
    }
}

2. Integer Overflow

// ❌ WRONG: Can overflow
int mid = (left + right) / 2;

// ✅ CORRECT: Prevents overflow
int mid = left + (right - left) / 2;

3. Off-by-One Errors

// Common mistakes:
right = nums.length;     // ❌ Should be nums.length - 1
left = mid;              // ❌ Should be mid + 1 (causes infinite loop)
right = mid;             // ❌ Should be mid - 1 (for standard search)

2.4.1.9 Summary

Binary Search is one of the most important algorithms in computer science:

  • When to Use: Searching in sorted data, or when you can define a monotonic search space
  • Key Insight: Eliminate half the search space with each comparison
  • Space-Time Tradeoff: Iterative uses \(O(1)\) space, recursive uses \(O(\log n)\) space
  • Production Choice: Iterative approach is preferred for efficiency
  • Interview Strategy: Know both approaches and their trade-offs
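
The "monotonic search space" point deserves an example: binary search works over candidate answers, not just array indices. A sketch (illustrative) that finds the integer square root, i.e. the largest x with x·x ≤ n:

```java
public class AnswerSearch {
    // Binary search over answers: the predicate "x*x <= n" is monotonic
    // (true up to a point, false afterwards), which is all binary search
    // needs -- no array required.
    static long intSqrt(long n) {
        // Cap right so mid * mid cannot overflow a long
        long left = 0, right = Math.min(n, 3_037_000_499L), ans = 0;
        while (left <= right) {
            long mid = left + (right - left) / 2;
            if (mid * mid <= n) {
                ans = mid;        // mid is feasible; try something larger
                left = mid + 1;
            } else {
                right = mid - 1;  // mid is too big
            }
        }
        return ans;
    }

    public static void main(String[] args) {
        System.out.println(intSqrt(24));  // 4
        System.out.println(intSqrt(25));  // 5
    }
}
```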

2.4.2 QuickSelect

QuickSelect is a selection algorithm for finding the k-th smallest (or largest) element in an unordered array. It’s based on the QuickSort partitioning technique but only recurses into one partition, making it significantly faster than full sorting for selection problems.

2.4.2.1 Core Concept

QuickSelect uses the same partition logic as QuickSort but with a key optimization: after partitioning, we only need to recurse on the partition that contains the k-th element, discarding the other half. This gives us O(n) average time compared to O(n log n) for sorting.

Key Insight: We don’t need the entire array sorted - just the k smallest (or largest) elements separated from the rest.

How It Works:

  1. Pick a pivot element (random or deterministic)
  2. Partition the array around the pivot: elements \(\le\) pivot on the left, elements \(>\) pivot on the right
  3. The pivot is now in its final sorted position
  4. If pivot position \(=\) k: Done! Return the pivot
  5. If pivot position \(<\) k: Recurse on the right partition (need larger elements)
  6. If pivot position \(>\) k: Recurse on the left partition (have too many elements)

2.4.2.2 Time and Space Complexity

  • Time Complexity:
    • Average case: O(n) - each partition reduces problem by ~half (geometric series: n + n/2 + n/4 + … = 2n)
    • Worst case: O(n²) - bad pivot choices (all elements on one side every time)
    • With randomization: O(n) with high probability (worst case becomes extremely rare)
  • Space Complexity:
    • Iterative: O(1) - in-place with no recursion
    • Recursive: O(log n) average, O(n) worst case for call stack

Comparison with Sorting:

| Approach | Time (Avg) | Time (Worst) | Space | When to Use |
|---|---|---|---|---|
| QuickSelect | O(n) | O(n²) | O(1)-O(log n) | Find k-th element only |
| Full Sort | O(n log n) | O(n log n) | O(1)-O(n) | Need sorted array or multiple queries |
| Min-Heap | O(n log k) | O(n log k) | O(k) | Small k, need stability |

2.4.2.3 Standard Implementation

Lomuto Partition Scheme (simpler, commonly used):

public int quickSelect(int[] nums, int k) {
    // Find k-th smallest (0-indexed: k-1)
    return quickSelectHelper(nums, 0, nums.length - 1, k - 1);
}

private int quickSelectHelper(int[] nums, int left, int right, int k) {
    if (left == right) {
        return nums[left];  // Only one element
    }

    // Partition and get pivot's final position
    int pivotIndex = partition(nums, left, right);

    // Check if pivot is at k-th position
    if (pivotIndex == k) {
        return nums[k];  // Found k-th smallest
    } else if (pivotIndex < k) {
        // k-th element is in right partition
        return quickSelectHelper(nums, pivotIndex + 1, right, k);
    } else {
        // k-th element is in left partition
        return quickSelectHelper(nums, left, pivotIndex - 1, k);
    }
}

private int partition(int[] nums, int left, int right) {
    // Choose a random pivot to avoid worst-case O(n²)
    // (ThreadLocalRandom avoids allocating a new Random on every call)
    int randomIndex = left + java.util.concurrent.ThreadLocalRandom.current().nextInt(right - left + 1);
    swap(nums, randomIndex, right);  // Move pivot to end

    int pivot = nums[right];
    int i = left;  // Write position for elements <= pivot

    for (int j = left; j < right; j++) {
        if (nums[j] <= pivot) {
            swap(nums, i, j);
            i++;
        }
    }

    // Place pivot at its final sorted position
    swap(nums, i, right);
    return i;
}

private void swap(int[] nums, int i, int j) {
    int temp = nums[i];
    nums[i] = nums[j];
    nums[j] = temp;
}

2.4.2.4 Iterative Implementation (Space-Optimized)

To avoid recursion overhead and achieve O(1) space:

public int quickSelectIterative(int[] nums, int k) {
    int left = 0, right = nums.length - 1;
    k = k - 1;  // Convert to 0-indexed

    while (left <= right) {
        int pivotIndex = partition(nums, left, right);

        if (pivotIndex == k) {
            return nums[k];
        } else if (pivotIndex < k) {
            left = pivotIndex + 1;  // Search right
        } else {
            right = pivotIndex - 1; // Search left
        }
    }

    return -1;  // Should never reach here
}
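
An end-to-end sketch of the iterative version (partition and swap as defined earlier; class name is illustrative). Note that quickselect reorders the array, hence the clone() calls:

```java
import java.util.concurrent.ThreadLocalRandom;

public class QuickSelectDemo {
    static int quickSelectIterative(int[] nums, int k) {
        int left = 0, right = nums.length - 1;
        k = k - 1;  // convert to 0-indexed
        while (left <= right) {
            int pivotIndex = partition(nums, left, right);
            if (pivotIndex == k) return nums[k];
            else if (pivotIndex < k) left = pivotIndex + 1;
            else right = pivotIndex - 1;
        }
        return -1;  // unreachable for valid k
    }

    // Lomuto partition with a randomized pivot, as in the section above
    static int partition(int[] nums, int left, int right) {
        int randomIndex = left + ThreadLocalRandom.current().nextInt(right - left + 1);
        swap(nums, randomIndex, right);
        int pivot = nums[right];
        int i = left;
        for (int j = left; j < right; j++) {
            if (nums[j] <= pivot) swap(nums, i++, j);
        }
        swap(nums, i, right);
        return i;
    }

    static void swap(int[] nums, int i, int j) {
        int t = nums[i]; nums[i] = nums[j]; nums[j] = t;
    }

    public static void main(String[] args) {
        int[] nums = {7, 10, 4, 3, 20, 15};  // sorted: [3, 4, 7, 10, 15, 20]
        System.out.println(quickSelectIterative(nums.clone(), 3));  // 7 (3rd smallest)
        System.out.println(quickSelectIterative(nums.clone(), 1));  // 3 (minimum)
        System.out.println(quickSelectIterative(nums.clone(), 6));  // 20 (maximum)
    }
}
```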

2.4.2.5 Why Randomization Matters

Without randomization (deterministic pivot such as the last element):
  • Sorted input: O(n²) - the pivot is always the smallest/largest, so nothing is split off
  • Reverse-sorted input: O(n²) - same issue
  • Example: [1,2,3,4,5] with the last element as pivot always partitions into [1,2,3,4] and []

With randomization:
  • Any input: O(n) average with high probability
  • Worst case: Still O(n²), but the probability of consistently bad pivots is vanishingly small
  • Practical performance: Nearly always linear

Intuition for O(n) Average Time:

Expected partitions (random pivot):
Level 1: n operations (partition entire array)
Level 2: n/2 operations (recurse on ~half)
Level 3: n/4 operations (recurse on ~quarter)
...
Total: n + n/2 + n/4 + n/8 + ... = 2n = O(n)

2.4.2.6 Finding k-th Largest vs k-th Smallest

k-th Smallest (default above): Partition in ascending order, find element at index k-1

k-th Largest: Two approaches:

  1. Convert to smallest: k-th largest = (n - k + 1)-th smallest

    return quickSelect(nums, nums.length - k + 1);
  2. Reverse partition logic: Partition in descending order

    private int partitionDescending(int[] nums, int left, int right) {
        int pivot = nums[right];
        int i = left;
        for (int j = left; j < right; j++) {
            if (nums[j] >= pivot) {  // >= instead of <=
                swap(nums, i, j);
                i++;
            }
        }
        swap(nums, i, right);
        return i;
    }
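
A sort-based reference makes the index conversion concrete: in an ascending array of length n, the k-th largest sits at index n - k, which is exactly the (n - k + 1)-th smallest. A sketch (class name is illustrative):

```java
import java.util.Arrays;

public class KthLargest {
    // Ground-truth by sorting: the k-th largest of an ascending array
    // is at index n - k, i.e. the (n - k + 1)-th smallest element.
    static int kthLargestBySort(int[] nums, int k) {
        int[] copy = nums.clone();
        Arrays.sort(copy);
        return copy[copy.length - k];
    }

    public static void main(String[] args) {
        int[] nums = {7, 10, 4, 3, 20, 15};  // sorted: [3, 4, 7, 10, 15, 20]
        // 2nd largest is 15; equivalently the (6 - 2 + 1) = 5th smallest
        System.out.println(kthLargestBySort(nums, 2));  // 15
        System.out.println(kthLargestBySort(nums, 1));  // 20
    }
}
```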

2.4.2.7 Common Use Cases

1. K-th Element Selection
  • Find k-th smallest/largest element in an unsorted array
  • Median finding (k = n/2)
  • Percentile calculation

2. Top K Elements
  • Find k smallest/largest elements (not sorted among themselves)
  • After QuickSelect, the first k elements are the k smallest (unordered)
  • K Closest Points to Origin (LeetCode 973)
  • Top K Frequent Elements (LeetCode 347)

3. Partitioning Problems
  • Separate elements meeting a criterion without full sorting
  • Dutch National Flag problem (3-way partition)
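
The top-k use case follows directly from the partition invariant: once the k-th smallest is at index k-1, indices 0..k-1 hold the k smallest elements in no particular order. A sketch (reusing the Lomuto partition from above; class name is illustrative):

```java
import java.util.Arrays;
import java.util.concurrent.ThreadLocalRandom;

public class TopK {
    // Run quickselect until the k-th smallest sits at index k-1; then
    // indices 0..k-1 contain the k smallest elements (unordered).
    static int[] kSmallest(int[] nums, int k) {
        int left = 0, right = nums.length - 1, target = k - 1;
        while (left < right) {
            int p = partition(nums, left, right);
            if (p == target) break;
            if (p < target) left = p + 1;
            else right = p - 1;
        }
        return Arrays.copyOfRange(nums, 0, k);
    }

    static int partition(int[] nums, int left, int right) {
        int randomIndex = left + ThreadLocalRandom.current().nextInt(right - left + 1);
        swap(nums, randomIndex, right);
        int pivot = nums[right];
        int i = left;
        for (int j = left; j < right; j++) {
            if (nums[j] <= pivot) swap(nums, i++, j);
        }
        swap(nums, i, right);
        return i;
    }

    static void swap(int[] nums, int i, int j) {
        int t = nums[i]; nums[i] = nums[j]; nums[j] = t;
    }

    public static void main(String[] args) {
        int[] nums = {7, 10, 4, 3, 20, 15};
        int[] smallest3 = kSmallest(nums, 3);
        Arrays.sort(smallest3);  // sorted here only to make the output deterministic
        System.out.println(Arrays.toString(smallest3));  // [3, 4, 7]
    }
}
```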

2.4.2.8 QuickSelect vs Alternatives

When to Use QuickSelect:
  • ✅ Need one k-th element or k smallest elements (unordered)
  • ✅ Average O(n) performance is acceptable
  • ✅ Can modify array in-place
  • ✅ Large datasets where O(n) vs O(n log k) matters

When to Use Min-Heap:
  • ✅ Need k largest elements (heap approach more intuitive)
  • ✅ Guaranteed O(n log k) is better than risky O(n²)
  • ✅ Small k (k << n)
  • ✅ Need stable selection (preserve order)
  • ✅ Cannot modify original array

When to Use Full Sort:
  • ✅ Need sorted k elements (not just separated)
  • ✅ Multiple k-th queries on same data
  • ✅ k is close to n

2.4.2.9 Visual Example

Find 3rd smallest in [7, 10, 4, 3, 20, 15]:

Initial: [7, 10, 4, 3, 20, 15], k=3 (find element at index 2)

Iteration 1: Choose pivot = 15 (random)
  Partition: [7, 10, 4, 3] | 15 | [20]
  Pivot at index 4, k=2 < 4 → Recurse left

Iteration 2: Array [7, 10, 4, 3], choose pivot = 3
  Partition: [] | 3 | [7, 10, 4]
  Pivot at index 0, k=2 > 0 → Recurse right

Iteration 3: Array [7, 10, 4], choose pivot = 7
  Partition: [4] | 7 | [10]
  Pivot at index 1 (global index 2), k=2 → Found!

Result: 7 (the 3rd smallest element)

2.4.2.10 Common Pitfalls

1. Not Using Randomization

// ❌ WRONG: Deterministic pivot vulnerable to O(n²)
int pivot = nums[right];

// ✅ CORRECT: Randomized pivot
int randomIndex = left + new Random().nextInt(right - left + 1);
swap(nums, randomIndex, right);
int pivot = nums[right];

2. Off-by-One Errors in k

// k-th smallest is at index k-1 (0-indexed)
quickSelectHelper(nums, 0, nums.length - 1, k - 1);  // ✅ Correct

// k-th largest: convert to smallest
quickSelect(nums, nums.length - k + 1);  // ✅ Correct

3. Forgetting Base Case

// ❌ WRONG: No base case for single element
private int quickSelectHelper(int[] nums, int left, int right, int k) {
    int pivotIndex = partition(nums, left, right);
    // ... missing left == right check

// ✅ CORRECT: Handle single element
if (left == right) {
    return nums[left];
}

2.4.2.11 Summary

QuickSelect is a powerful selection algorithm:

  • When to Use: Finding k-th smallest/largest or top k elements (unordered)
  • Key Insight: Only recurse on the partition containing k-th element
  • Performance: O(n) average with randomization, much faster than sorting
  • Critical Optimization: Random pivot selection prevents O(n²) worst case
  • Trade-off: Slightly unpredictable (avg O(n) vs guaranteed O(n log k) for heap)
  • Interview Gold: Demonstrates deep understanding of QuickSort and optimization
  • Production Use: Ideal for large datasets where average-case linear time matters

The power of binary search lies in reducing \(O(n)\) linear search to \(O(\log n)\) logarithmic search - a massive improvement for large datasets.