Finding Top 'K' Elements
Heaps (Priority Queues)
In life and in data, not everything deserves your attention. Focus only on the top K that matter.
Imagine you have a massive stream of real-time data, like user scores in an online game, and you need to display a leaderboard of the top 10 players. Sorting the entire dataset every second is incredibly inefficient. How can you find just the "Top K" elements without all the extra work?
Problem: Kth Largest Element in an Array (LeetCode Link): Given an integer array nums
and an integer k
, return the k
th largest element in the array. 💡 HintIf you only need to keep track of k
elements, what's an efficient way to maintain the k
largest you've seen so far?
The Brute-Force Approach
The simplest solution is to sort the entire array in descending order and then pick the element at index k-1
. This works, but it has a time complexity of , which is overkill. We are doing much more work than necessary by sorting the entire array.
The Heap (Priority Queue) Solution
A heap is the perfect data structure for this job. It allows us to maintain a collection of the "Top K" elements we've seen so far in just time per element.
Note: In this chapter, we will focus on how to use a heap (in-built data structure). How heaps works is something that you can explore later. You can think of heap as a black box that always keeps the smallest (or largest) element at the top.
The core idea is to use a min-heap of size k
.
- Iterate: Go through each number in the input array.
- Build the Heap:
- If the heap currently has fewer than
k
elements, just add the number. - If the heap is full (size
k
), compare the current number to the smallest element in the heap (which is always at the top of a min-heap).
- If the heap currently has fewer than
- Compare and Replace: If the current number is larger than the smallest number in the heap, it means the current number has a right to be in the "Top K". So, we remove the smallest element and add the current one.
- Final Result: After iterating through all the numbers, the min-heap will contain the
k
largest elements from the entire array. Thek
th largest is the smallest among them, which is the root of the min-heap.
This approach is far more efficient, with a time complexity of .
Walkthrough: Finding the Kth Largest Element
import heapq
def find_kth_largest(nums: list[int], k: int) -> int:
# Python's heapq is a min-heap by default.
min_heap = []
for num in nums:
if len(min_heap) < k:
heapq.heappush(min_heap, num)
elif num > min_heap[0]:
# If the current number is larger than the smallest in the heap,
# it deserves a spot in the top k.
heapq.heapreplace(min_heap, num)
# The root of the min-heap is the kth largest element.
return min_heap[0]
import java.util.PriorityQueue;
public int findKthLargest(int[] nums, int k) {
// A min-heap by default.
PriorityQueue<Integer> minHeap = new PriorityQueue<>();
for (int num : nums) {
if (minHeap.size() < k) {
minHeap.add(num);
} else if (num > minHeap.peek()) {
minHeap.poll();
minHeap.add(num);
}
}
return minHeap.peek();
}
#include <vector>
#include <queue>
int findKthLargest(std::vector<int>& nums, int k) {
// A min-heap.
std::priority_queue<int, std::vector<int>, std::greater<int>> minHeap;
for (int num : nums) {
if (minHeap.size() < k) {
minHeap.push(num);
} else if (num > minHeap.top()) {
minHeap.pop();
minHeap.push(num);
}
}
return minHeap.top();
}
More Problems & Variations
This core "Top K" pattern can be adapted to solve many related problems.
-
Top K Frequent Elements (LeetCode Link): Find the
k
most frequent elements. 💡 HintFirst, use a hash map to count the frequency of each number. Then, use a min-heap of sizek
to find thek
elements with the highest frequencies. The heap will store pairs of(frequency, number)
. -
K Closest Points to Origin (LeetCode Link): Find the
k
points closest to the origin (0, 0). 💡 HintYou need thek
smallest distances. Use a max-heap of sizek
. The heap will store pairs of(distance, point)
. If a new point's distance is smaller than the largest distance in the heap, replace it. -
Sort Characters By Frequency (LeetCode Link): Sort a string in decreasing order based on the frequency of characters. 💡 HintThis is a "Top K" problem where elements are unique characters and frequencies are their counts. Use a max-heap to pull out characters in order of frequency.
Bonus Point: Quickselect
For finding the "Kth Largest Element" specifically, there is an even more optimized algorithm called Quickselect. It's a modification of the QuickSort algorithm. Instead of recursing into both sides of the pivot, it only recurses into the side that contains the k
th element. This gives it an average-case time complexity of , which is better than the heap solution's . However, its worst-case complexity is , whereas the heap solution is a more reliable . In an interview, mentioning both and discussing the trade-offs is a sign of a strong candidate.