Keep/Track Best N with Heaps
🎯 Goal: Track the Top N Values
You want to keep only the N best values seen so far. Depending on what "best" means (largest or smallest), the heap you use is inverted:
Goal | Heap Type | Why This Heap? |
---|---|---|
N largest | Min-heap | Smallest of arr is easy to evict |
N smallest | Max-heap | Largest of arr is easy to evict |
✅ Python Cheatsheet
N Largest → Use a min-heap
import heapq
def top_k_largest(stream, k):
heap = []
for val in stream:
if len(heap) < k:
heapq.heappush(heap, val)
elif val > heap[0]:
heapq.heappushpop(heap, val)
return sorted(heap, reverse=True)
N Smallest → Use a max-heap via negation
import heapq
def top_k_smallest(stream, k):
heap = []
for val in stream:
if len(heap) < k:
heapq.heappush(heap, -val)
elif val < -heap[0]:
heapq.heappushpop(heap, -val)
return sorted([-x for x in heap])
🧠 Key Insight
To keep the "Best N" of something, you don't sort — you evict the worst whenever a better value arrives.
The heap type you use is the one that keeps the eviction target on top:
- Keep/Track N largest? Use a min-heap to eject the smallest (@ idx 0)
- Keep/Track N smallest? Use a max-heap to eject the largest (@ idx 0)
TL;DR
- ✅ Keep a heap of size
N
- ✅ Always compare new values to the worst of the N kept so far
- ✅ Replace if the new value is better
- ✅ Use
heapq
(with-x
for max-heaps)