9.4 Top K Frequent Elements

9.4.1 Problem Metadata

9.4.2 Description

Given a non-empty integer array, return the k most frequent elements. k is guaranteed to be valid.

9.4.3 Examples

Input: nums = [1,1,1,2,2,3], k = 2
Output: [1,2]

9.4.4 Constraints

  • 1 <= nums.length <= 10^5
  • k <= number of unique values

9.4.5 Solution - Min-Heap Approach

9.4.5.1 Walkthrough

This solution uses a min-heap of size k to efficiently find the k most frequent elements without sorting all elements.

Core Strategy:

  1. Count frequencies using a HashMap: O(n)
  2. Maintain a min-heap of size k storing {frequency, value} pairs
  3. Heap strategy:
    • For each unique element, push {frequency, value} onto the heap
    • If heap size exceeds k, pop the minimum (element with smallest frequency)
    • The heap automatically maintains the k elements with largest frequencies
  4. Extract results from heap: O(k log k)

Why Min-Heap Works:

The counterintuitive part: we use a min-heap to find the k largest frequencies.

  • Min-heap property: The smallest element is always at the root
  • Size constraint: We keep the heap size \(\le k\)
  • Eviction logic: When heap size exceeds k, we remove the minimum (least frequent element)
  • Result: After processing all elements, the heap contains exactly the k most frequent elements
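This eviction pattern can be sketched in isolation with a plain `PriorityQueue` of frequencies (a minimal standalone demo; the class and variable names here are illustrative, not part of the solution code):

```java
import java.util.PriorityQueue;

public class TopKEvictionDemo {
    public static void main(String[] args) {
        int k = 2;
        int[] frequencies = {3, 2, 1}; // frequencies of the unique elements

        // Natural ordering of Integer makes this a min-heap: smallest at the root
        PriorityQueue<Integer> minHeap = new PriorityQueue<>();

        for (int freq : frequencies) {
            minHeap.offer(freq);
            if (minHeap.size() > k) {
                minHeap.poll(); // evict the smallest frequency seen so far
            }
        }

        // The heap retains exactly the k largest frequencies: 3 and 2
        System.out.println(minHeap.contains(3) && minHeap.contains(2) && !minHeap.contains(1));
    }
}
```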

Example Walkthrough:

For nums = [1,1,1,2,2,3], k = 2:

Frequencies: 1→3, 2→2, 3→1

Process each unique element:
1. Add {3, 1} → heap = [{3,1}]                    size=1
2. Add {2, 2} → heap = [{2,2}, {3,1}]             size=2
3. Add {1, 3} → heap = [{1,3}, {2,2}, {3,1}]      size=3 > k!
              → poll() removes {1,3} (minimum frequency)
              → heap = [{2,2}, {3,1}]              size=2 ✓

Extract results: [1, 2]

Key Insight: No ordering requirement - LeetCode 347 accepts the k results in any order.

9.4.5.2 Analysis

  • Time Complexity: O(n log k)
    • Build frequency map: O(n)
    • Process m unique elements with heap operations: O(m log k) where m \(\le\) n
    • Extract k results: O(k log k)
    • Overall: O(n log k) (since m \(\le\) n and k \(\le\) m)
    • Advantage over sorting: Full sort would be O(n log n). When k is small, O(n log k) is significantly faster.
  • Space Complexity: O(n)
    • Frequency map: O(m) where m \(\le\) n (number of unique elements)
    • Min-heap: O(k)
    • Total: O(n)

Why This Approach is Efficient:

  • Better than full sort when k << n
  • Heap optimization: We only maintain k elements, not all n
  • Optimal for small k: The smaller k is relative to n, the more efficient this becomes

9.4.5.3 Implementation Steps

  1. Build frequency map: Create Map<Integer, Integer> to count occurrences of each element in nums
  2. Initialize min-heap: Create PriorityQueue<int[]> with comparator that orders by frequency (first element of array)
  3. Process each unique element:
    • For each entry in the frequency map:
      • Push {frequency, value} pair onto heap
      • If heap size exceeds k, poll() to remove the element with smallest frequency
  4. Extract results:
    • Create result array of size k
    • Poll all remaining heap entries, filling the result array from the back (polls come out in ascending frequency order, so the array ends up sorted by descending frequency)
    • Extract the value (second element) from each heap entry

9.4.5.4 Code - Java

public int[] topKFrequent(int[] nums, int k) {
    Map<Integer, Integer> counts = new HashMap<>();
    for (int num : nums) {
        counts.put(num, counts.getOrDefault(num, 0) + 1);
    }

    PriorityQueue<int[]> heap = new PriorityQueue<>(Comparator.comparingInt(a -> a[0]));

    for (Map.Entry<Integer, Integer> entry : counts.entrySet()) {
        heap.offer(new int[]{entry.getValue(), entry.getKey()});
        if (heap.size() > k) {
            heap.poll();
        }
    }

    int[] result = new int[k];
    for (int i = k - 1; i >= 0; i--) {
        result[i] = heap.poll()[1];
    }
    return result;
}
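A quick check of the method on the worked example (the method body from above is repeated verbatim so the snippet compiles on its own):

```java
import java.util.*;

public class TopKFrequentHeapTest {
    public static int[] topKFrequent(int[] nums, int k) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int num : nums) {
            counts.put(num, counts.getOrDefault(num, 0) + 1);
        }

        PriorityQueue<int[]> heap = new PriorityQueue<>(Comparator.comparingInt(a -> a[0]));

        for (Map.Entry<Integer, Integer> entry : counts.entrySet()) {
            heap.offer(new int[]{entry.getValue(), entry.getKey()});
            if (heap.size() > k) {
                heap.poll();
            }
        }

        int[] result = new int[k];
        for (int i = k - 1; i >= 0; i--) {
            result[i] = heap.poll()[1];
        }
        return result;
    }

    public static void main(String[] args) {
        int[] result = topKFrequent(new int[]{1, 1, 1, 2, 2, 3}, 2);
        Arrays.sort(result); // the problem accepts any order, so normalize before printing
        System.out.println(Arrays.toString(result));
    }
}
```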

9.4.6 Solution - Sorting with Stream API

9.4.6.1 Walkthrough

This solution uses frequency counting followed by sorting to find the k most frequent elements. While less optimal than the min-heap approach, it offers a simpler, more readable implementation using Java Streams.

Core Strategy:

  1. Count frequencies using a HashMap: O(n)
  2. Sort all unique elements by frequency (descending)
  3. Extract top k elements using limit(k)

Key Difference from Min-Heap:

Unlike the min-heap approach that maintains only k elements at a time, this solution:

  • Sorts all unique elements by frequency
  • Uses stream operations to cleanly extract the top k
  • Trades efficiency for code simplicity

Step-by-step execution for nums = [1,1,1,2,2,3], k = 2:

Step 1: Build frequency map
  counts = {1→3, 2→2, 3→1}

Step 2: Stream entries and sort by frequency (descending)
  Entry stream: [(1,3), (2,2), (3,1)]
  After sort:   [(1,3), (2,2), (3,1)]  (already sorted by frequency desc)

Step 3: Limit to k=2 elements
  After limit:  [(1,3), (2,2)]

Step 4: Extract keys
  Result: [1, 2]

When to Use This Approach:

Good for:

  • Code interviews where readability matters
  • Small datasets where the performance difference is negligible
  • When you need to avoid PriorityQueue complexity

Not ideal for:

  • Large datasets with small k (min-heap is much better)
  • Performance-critical production code
  • When k << n (e.g., k=10, n=1,000,000)

9.4.6.2 Analysis

  • Time Complexity: O(n + m log m)
    • Build frequency map: O(n)
    • Sort m unique elements: O(m log m) where m \(\le\) n
    • Stream operations (limit, map): O(k)
    • Worst case: When all elements are unique (m = n), this becomes O(n log n)
    • Best case: When there are few unique elements (m << n), the sort term is negligible and the runtime approaches O(n)
  • Space Complexity: O(n)
    • Frequency map: O(m) where m \(\le\) n
    • Stream intermediate operations: O(m) for sorting
    • Result array: O(k)
    • Total: O(n)

Comparison with Min-Heap:

| Metric                     | Sorting Approach           | Min-Heap Approach               |
|----------------------------|----------------------------|---------------------------------|
| Time Complexity            | O(n log n) worst case      | O(n log k)                      |
| Code Simplicity            | High (clean streams)       | Medium (manual heap)            |
| Performance (k=10, n=100K) | ~1.7M operations           | ~330K operations                |
| Best Use Case              | Small datasets, interviews | Production, large n with small k |

Example Performance:

  • n = 100,000 elements, k = 10
  • Sorting: 100,000 × log(100,000) ≈ 1,660,000 operations
  • Min-heap: 100,000 × log(10) ≈ 332,000 operations
  • Min-heap is ~5x faster
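The ~5x figure can be verified with a throwaway calculation (this is an estimate of comparison counts, not a benchmark):

```java
public class OpCountEstimate {
    public static void main(String[] args) {
        double n = 100_000, k = 10;
        // Full sort touches all n elements: roughly n * log2(n) comparisons
        double sortOps = n * (Math.log(n) / Math.log(2));
        // Size-k heap: roughly n * log2(k) comparisons
        double heapOps = n * (Math.log(k) / Math.log(2));
        System.out.printf("sort=%.0f heap=%.0f ratio=%.1f%n", sortOps, heapOps, sortOps / heapOps);
    }
}
```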

9.4.6.3 Implementation Steps

  1. Build frequency map: Create Map<Integer, Integer> to count occurrences
  2. Stream and sort: Convert map entries to stream and sort by frequency descending
  3. Limit to k: Use limit(k) to take only top k elements
  4. Extract keys: Map entries to keys and collect as int array

9.4.6.4 Code - Java

public int[] topKFrequent(int[] nums, int k) {
    // Build frequency map
    Map<Integer, Integer> counts = new HashMap<>();
    for (int num : nums) {
        counts.put(num, counts.getOrDefault(num, 0) + 1);
    }

    // Sort by frequency (descending), limit to k, extract keys
    return counts.entrySet().stream()
        .sorted(Map.Entry.<Integer, Integer>comparingByValue().reversed())
        .limit(k)
        .mapToInt(Map.Entry::getKey)
        .toArray();
}
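Exercising the stream version on the worked example (the method is repeated so the snippet is self-contained; since the frequencies 3, 2, 1 are distinct, the output order here is deterministic):

```java
import java.util.*;

public class TopKFrequentStreamTest {
    public static int[] topKFrequent(int[] nums, int k) {
        // Build frequency map
        Map<Integer, Integer> counts = new HashMap<>();
        for (int num : nums) {
            counts.put(num, counts.getOrDefault(num, 0) + 1);
        }

        // Sort by frequency (descending), limit to k, extract keys
        return counts.entrySet().stream()
            .sorted(Map.Entry.<Integer, Integer>comparingByValue().reversed())
            .limit(k)
            .mapToInt(Map.Entry::getKey)
            .toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(topKFrequent(new int[]{1, 1, 1, 2, 2, 3}, 2)));
    }
}
```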

9.4.7 Solution - QuickSelect (Optimal)

9.4.7.1 Walkthrough

This solution uses QuickSelect to achieve O(n) average time - the optimal solution for this problem. QuickSelect is a partition-based selection algorithm that separates out the k largest (or smallest) elements without fully sorting.

Core Strategy:

Instead of sorting all unique elements by frequency, we:

  1. Build frequency map: Count occurrences → O(n)
  2. Create array of unique elements: Store as [value, frequency] pairs
  3. Run QuickSelect: Partition the array by frequency until k elements are separated
  4. Return first k elements: They're guaranteed to be the k most frequent

Key Insight: Partitioning by Frequency

QuickSelect partitions around a pivot frequency:

  • Left side: All elements with frequency \(\ge\) pivot frequency
  • Right side: All elements with frequency \(<\) pivot frequency
  • Once a pivot lands at index k-1, the first k elements are the k most frequent

Visual Example for nums = [1,1,1,2,2,3,4,4,4,4], k = 2:

Step 1: Build frequency map
  counts = {1→3, 2→2, 3→1, 4→4}

Step 2: Create array of [value, frequency] pairs
  unique = [[1,3], [2,2], [3,1], [4,4]]

Step 3: QuickSelect to partition by frequency
  Pick a pivot (say index 1, freq=2) and swap it to the end
  Partition (frequency descending): elements with freq >= 2 move to the front
    [[1,3], [4,4], [2,2], [3,1]]
                   pivot lands at index 2

  k=2: pivot index 2 > k-1, so recurse on the left range [0...1]

  Pick a pivot (index 0, freq=3) and swap it to the end
  Partition: [[4,4], [1,3]], pivot lands at index 1

  k=2: pivot index 1 == k-1: Done! First 2 elements are [4,1]

Result: [4, 1] (order doesn't matter per problem constraints)

Why QuickSelect is Optimal:

  • No unnecessary sorting: Unlike full sort O(n log n), we only partition until k elements are separated
  • Average O(n) time: Each partition reduces problem size by ~half (geometric series: n + n/2 + n/4 + … = 2n)
  • In-place: Can modify the array in place (though we create a new array for [value, freq] pairs)
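The average-case claim above follows from the shrinking partition sizes: each recursion continues into only one side of the partition, so the expected work forms a geometric series,

```latex
T(n) \approx n + \frac{n}{2} + \frac{n}{4} + \cdots
     = n \sum_{i=0}^{\infty} \frac{1}{2^{i}}
     = 2n = O(n).
```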

9.4.7.2 Analysis

  • Time Complexity:
    • Average case: O(n) - geometric series of partition operations
      • Build frequency map: O(n)
      • Create unique array: O(m) where m = number of unique elements
      • QuickSelect: O(m) average, since we partition m unique elements
      • Total: O(n + m) = O(n) since m \(\le\) n
    • Worst case: O(n + m²) - bad pivot choices (rare with randomization)
    • Best case: O(n) - optimal pivot every time
  • Space Complexity: O(n)
    • Frequency map: O(m) where m \(\le\) n
    • Unique elements array: O(m)
    • Result array: O(k)
    • Total: O(n)

Comparison with Other Solutions:

| Solution    | Time (Avg) | Time (Worst) | Space | Best For                        |
|-------------|------------|--------------|-------|---------------------------------|
| QuickSelect | O(n)       | O(n²)        | O(n)  | Optimal average performance     |
| Min-Heap    | O(n log k) | O(n log k)   | O(n)  | Guaranteed performance, small k |
| Full Sort   | O(n log n) | O(n log n)   | O(n)  | Code simplicity                 |

When to Use QuickSelect:

  • Production code where average O(n) performance matters
  • Technical interviews to demonstrate advanced algorithm knowledge
  • Large datasets where the difference between O(n) and O(n log k) is significant

9.4.7.3 Implementation Steps

Helper Function: partition(unique, left, right)

  1. Choose a random pivot index in [left, right]
  2. Swap the pivot to the end position
  3. Partition by frequency (descending) using two pointers:
    • i = write position for elements with frequency \(\ge\) pivot frequency
    • j = read position scanning all elements
  4. For each element from left to right-1: if its frequency is \(\ge\) the pivot frequency, swap it to position i and increment i
  5. Swap the pivot back to position i (its final position)
  6. Return i (the pivot's final index)

Main Function: quickSelect(unique, left, right, k)

  1. If left \(\ge\) right: return (base case)
  2. Partition the array and get the pivot index
  3. If pivot index \(=\) k-1: Done! The first k elements are the answer
  4. If pivot index \(<\) k-1: Recurse on the right: quickSelect(unique, pivotIndex+1, right, k)
  5. If pivot index \(>\) k-1: Recurse on the left: quickSelect(unique, left, pivotIndex-1, k)

Build Result: Extract the first k elements (values only, discarding frequencies)

9.4.7.4 Code - Java

class Solution {
    private Random rand = new Random();

    public int[] topKFrequent(int[] nums, int k) {
        // Step 1: Build frequency map
        Map<Integer, Integer> counts = new HashMap<>();
        for (int num : nums) {
            counts.put(num, counts.getOrDefault(num, 0) + 1);
        }

        // Step 2: Create array of [value, frequency] pairs
        int[][] unique = new int[counts.size()][2];
        int idx = 0;
        for (Map.Entry<Integer, Integer> entry : counts.entrySet()) {
            unique[idx][0] = entry.getKey();      // value
            unique[idx][1] = entry.getValue();    // frequency
            idx++;
        }

        // Step 3: QuickSelect to partition k most frequent elements
        quickSelect(unique, 0, unique.length - 1, k);

        // Step 4: Extract first k values
        int[] result = new int[k];
        for (int i = 0; i < k; i++) {
            result[i] = unique[i][0];
        }
        return result;
    }

    private void quickSelect(int[][] unique, int left, int right, int k) {
        if (left >= right) {
            return;
        }

        // Partition and get pivot's final position
        int pivotIndex = partition(unique, left, right);

        // Check if we've found exactly k elements on the left
        if (pivotIndex == k - 1) {
            return; // Perfect! First k elements are the k most frequent
        } else if (pivotIndex < k - 1) {
            // Need more elements, recurse right
            quickSelect(unique, pivotIndex + 1, right, k);
        } else {
            // Have too many elements, recurse left
            quickSelect(unique, left, pivotIndex - 1, k);
        }
    }

    private int partition(int[][] unique, int left, int right) {
        // Randomized pivot to avoid worst-case O(n²)
        int randomPivot = left + rand.nextInt(right - left + 1);
        swap(unique, randomPivot, right); // Move pivot to end

        int pivotFreq = unique[right][1];
        int i = left; // Write position for elements >= pivot frequency

        // Partition: move all more/equal frequent elements to the left
        for (int j = left; j < right; j++) {
            if (unique[j][1] >= pivotFreq) { // Descending order by frequency
                swap(unique, i, j);
                i++;
            }
        }

        // Move pivot to its final position
        swap(unique, i, right);
        return i;
    }

    private void swap(int[][] unique, int i, int j) {
        int[] temp = unique[i];
        unique[i] = unique[j];
        unique[j] = temp;
    }
}
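The randomized pivot makes individual runs differ, but the final answer set is always the same. A condensed, self-contained harness with the same logic (static methods instead of a `Solution` instance, so it runs as one file; the result is sorted before printing because the problem accepts any order):

```java
import java.util.*;

public class QuickSelectTopKTest {
    private static final Random rand = new Random();

    public static int[] topKFrequent(int[] nums, int k) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int num : nums) counts.put(num, counts.getOrDefault(num, 0) + 1);

        // [value, frequency] pairs for the unique elements
        int[][] unique = new int[counts.size()][2];
        int idx = 0;
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            unique[idx][0] = e.getKey();
            unique[idx][1] = e.getValue();
            idx++;
        }

        quickSelect(unique, 0, unique.length - 1, k);

        int[] result = new int[k];
        for (int i = 0; i < k; i++) result[i] = unique[i][0];
        return result;
    }

    private static void quickSelect(int[][] a, int left, int right, int k) {
        if (left >= right) return;
        int p = partition(a, left, right);
        if (p == k - 1) return;
        if (p < k - 1) quickSelect(a, p + 1, right, k);
        else quickSelect(a, left, p - 1, k);
    }

    private static int partition(int[][] a, int left, int right) {
        int pivot = left + rand.nextInt(right - left + 1);
        swap(a, pivot, right);
        int pivotFreq = a[right][1];
        int i = left;
        for (int j = left; j < right; j++) {
            if (a[j][1] >= pivotFreq) swap(a, i++, j); // descending by frequency
        }
        swap(a, i, right);
        return i;
    }

    private static void swap(int[][] a, int i, int j) {
        int[] t = a[i]; a[i] = a[j]; a[j] = t;
    }

    public static void main(String[] args) {
        int[] result = topKFrequent(new int[]{1, 1, 1, 2, 2, 3, 4, 4, 4, 4}, 2);
        Arrays.sort(result); // normalize: {4, 1} and {1, 4} are both valid answers
        System.out.println(Arrays.toString(result));
    }
}
```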