📚 LSM Tree vs B-Tree

When databases choose a storage/indexing structure, they usually pick between B-trees (classic choice for OLTP databases) and LSM trees (modern choice for write-heavy key–value stores).

Both solve the same fundamental problem — organizing sorted key–value pairs on disk — but they make different trade-offs based on whether the workload is read-heavy or write-heavy.

🔍 Side-by-Side Comparison

Aspect	LSM Tree	B-Tree (and B+Tree)
Write Pattern	Appends new data to memory (MemTable) + WAL, later flushed as immutable sorted SSTables → sequential writes.	Updates/insertions happen in place on disk → random writes.
Read Pattern	May need to check MemTable + several SSTables. Uses Bloom filters + indexes to cut down search. Reads can be slower if many levels exist.	Generally faster point lookups (1–2 disk seeks), since data is in one structure with well-defined pages.
Range Scans	Good (files are sorted). Might need to merge results across SSTables.	Very good — data locality in leaf pages makes range scans efficient.
Compaction / Maintenance	Requires background compaction (merging/deleting across SSTables). This adds write amplification.	No compaction step — nodes are balanced as part of normal operations.
Write Amplification	High (same data written multiple times at different levels).	Lower (modifies pages once, though splits can cascade).
Read Amplification	Higher if many SSTables exist. Mitigated by Bloom filters and leveled compaction.	Lower — usually a single tree traversal.
Memory Usage	Needs RAM for MemTable + Bloom filters + indexes.	Needs RAM for buffer cache to keep hot pages, but no Bloom filters.
Workload Fit	Optimized for write-heavy workloads (e.g., logging, time series, streaming ingestion, event data).	Optimized for read-heavy and OLTP workloads (e.g., banking, ERP, indexes for relational databases).
Examples	Cassandra, ScyllaDB, HBase, LevelDB, RocksDB.	MySQL (InnoDB), PostgreSQL, Oracle, SQL Server.

🚦 In Practice

LSM tree: “Take writes now, sort/merge later.” Best for high-ingest, append-heavy workloads.
B-tree: “Keep data balanced as we go.” Best for low-latency queries and frequent point lookups.

LSM Tree:

B-Tree:

🔍 Side-by-Side Comparison​

🚦 In Practice​

🔍 Side-by-Side Comparison

🚦 In Practice