Leetcode 347. Top K Frequent Elements

Photo by okeykat on Unsplash

Leetcode 347. Top K Frequent Elements

I want to document my thought process on this Leetcode question, https://leetcode.com/problems/top-k-frequent-elements/.

First I'll try get an idea of what the function signature should be like while listening to the question. Then I'll start ask clarifying questions and come up with examples. After clarifying, I'll brainstorm some ideas to solve the problem. I want to try throwing different data structures at the problem for now, some may just be for the sake of considering them.

Function signature:

  • int[] topKFrequent(int[] nums, int k)

Clarifying Questions / Assumptions:

  • Input: null/empty? k always valid? Multi answers?

  • Output: How to handle null/empty/Invalid k? Should sort?

  • Assume: k is always valid

Examples:

  • nums = [1, 1, 2, 2, 3], k = 2

  • nums = [1], k = 10 // invalid

Brainstorm/Observations:

If k is always valid and k == nums.length, return the whole array.
k <= nums.length

What would I do if I had to solve this in real life with paper and pen?

(n = nums.length)

  1. Count frequency + Sort

    • Count frequency and store it as int[][] numFreq.
      Sort using an altered version of quicksort.
      Get k items from sorted array as result.

    • Example
      nums = [5,6,6,6,7,7,7,8,8]
      numFreq = { [5,1], [6,3], [7,3], [8,2] }

      aftersort = { [5,1], [8,2], [6,3], [7,3] }

    • time O(nlogn) ~ O(n^2) // for quicksort
      space O(n) // for counting frequencies

  2. Count frequency + Max heap

    • Count frequency and store num and frequency pair as int[][] numFreq or HashMap<>.
      Add all items to a max heap with custom comparator.
      e.g. - PriorityQueue pq = new PriorityQueue<>((a, b) -> freqMap.get(b) - freqMap.get(a));
      Poll k items off heap for result.

    • time O(nlogn) // for building a max heap
      space O(n) // for counting frequencies and heap

  3. Count frequency + Min heap (maintain size at k)

    • Use min heap instead of max heap.
      Add items to heap and poll whenever heap.size() > k.

    • time O(nlogk) // for building a min heap capped at size k
      space O(n) // for counting frequencies

  4. Count frequency + Bucketsort

    • Count frequency first.
      Create buckets with size "n + 1", because max bucket size will be capped at n, and the addtional 1 of for frequency 0.
      Sort the numbers into buckets by frequency,
      then go through the buckets till we get k items.

    • nums = [5,6,6,6,7,7,7,8,8]
      numFreq = { [5,1], [6,3], [7,3], [8,2] }

      minFreq = 1
      maxFreq = 3

      buckets = {

      0: 5 // frequency = 1
      1: 8 // frequency = 2
      2: 6, 7 // frequency = 3
      }

    • Max bucket size will be capped at n
      time O(n) // for counting frequency, sort into buckets, building result
      space O(n) // for counting frequency and buckets

  5. Count frequency + QuickSelect

    • Count frequency, and convert it to a int[][] freq and num pair.

      Use QuickSelect, to find the item at "n - k + 1".
      Items to the right are the results we want, albeit not sorted.

    • time O(n) ~ O(n^2) // for quickselect
      space O(n) //

Code:

  • MinHeap
class Solution {
    public int[] topKFrequent(int[] nums, int k) {
        Map<Integer, Integer> freq = countFreq(nums);
        PriorityQueue<Integer> pq = findTopK(freq, k);
        return toArray(pq);
    }

    private Map<Integer, Integer> countFreq(int[] nums) {
        Map<Integer, Integer> freq = new HashMap<>();
        for (int n : nums) {
            freq.put(n, freq.getOrDefault(n, 0) + 1);
        }
        return freq;
    }

    private PriorityQueue<Integer> findTopK(Map<Integer, Integer> freq, int k) {
        PriorityQueue<Integer> pq = new PriorityQueue<>(
            (i, j) -> Integer.compare(freq.get(i), freq.get(j)));
        for (int i : freq.keySet()) {
            pq.add(i);
            if (pq.size() > k) {
                pq.poll();
            }
        }
        return pq;
    }

    private int[] toArray(PriorityQueue<Integer> pq) {
        int[] ans = new int[pq.size()];
        for (int i = pq.size() - 1; i >= 0; i--) {
            ans[i] = pq.poll();
        }
        return ans;
    }
}
  • Bucketsort:
class Solution {
    public static int[] topKFrequent(int[] nums, int k) {
        if (nums == null || nums.length == 0 || k == 0) return new int[0];

        Map<Integer, Integer> freq = countFreq(nums);
        List<List<Integer>> buckets = toBuckets(freq, nums.length);
        return getTopBuckets(buckets, k);
    }

    private static Map<Integer, Integer> countFreq(int[] nums) {
        Map<Integer, Integer> freq = new HashMap<>();
        for (int num : nums) {
            freq.put(num, freq.getOrDefault(num, 0) + 1);
        }
        return freq;
    }

    private static List<List<Integer>> toBuckets(Map<Integer, Integer> map, int size) {
        List<List<Integer>> buckets = new ArrayList<>();
        for (int i = 0; i < size + 1; i++) {
            buckets.add(new ArrayList<>());
        }
        for (Map.Entry<Integer, Integer> item : map.entrySet()) {
            int num = item.getKey();
            int freq = item.getValue();
            buckets.get(freq).add(num);
        }
        return buckets;
    }

    private static int[] getTopBuckets(List<List<Integer>> buckets, int k) {
        int[] top = new int[k];

        int j = 0;
        for (int i = buckets.size() - 1; i >= 0; i--) {
            List<Integer> bucket = buckets.get(i);
            for (int num : bucket) {
                top[j++] = num;
                if (j >= k) {
                    break;
                }
            }
            if (j >= k) {
                break;
            }
        }
        return top;
    }
}
  • QuickSelect:
class Solution {    
    public static int[] topKFrequent(int[] nums, int k) {
        if (nums == null || nums.length == 0 || k == 0) return new int[0];

        int[][] freq = countFreq(nums);
        quickSelect(freq, freq.length - k);
        return toArray(freq, k);
    }

    private static int[][] countFreq(int[] nums) {
        Map<Integer, Integer> map = new HashMap<>();
        for (int num : nums) {
            map.put(num, map.getOrDefault(num, 0) + 1);
        }

        int[][] freq = new int[map.size()][];
        int i = 0;
        for (int num : map.keySet()) {
            freq[i++] = new int[] {map.get(num), num};
        }
        return freq;
    }

    private static void quickSelect(int[][] freq, int target) {
        int l = 0;
        int r = freq.length - 1;
        while (l < r) {
            int p = partition(freq, l, r);
            if (p == target) {
                break;
            } else if (p < target) {
                l = p + 1;
            } else {
                r = p - 1;
            }
        }
    }

    private static int partition(int[][] freq, int l, int r) {
        swap(freq, r, l + new Random().nextInt(r - l));

        int p = r;
        int lo = l;
        for (int i = l; i < p; i++) {
            if (freq[i][0] < freq[p][0]) {
                swap(freq, lo, i);
                lo++;
            }
        }
        swap(freq, lo, p);
        return lo;
    }

    private static void swap(int[][] freq, int i, int j) {
        if (i == j) return;

        int[] tmp = freq[i];
        freq[i] = freq[j];
        freq[j] = tmp;
    }

    private static int[] toArray(int[][] freq, int k) {
        int[] ans = new int[k];
        for (int i = 0; i < ans.length; i++) {
            ans[i] = freq[freq.length - 1 - i][1];
        }
        return ans;
    }
}

Let me know if there is anything I should have considered, or has an error! I'm happy to get any feedback.