What is LSM, and Why LSM?
                        目录
                        
                    
                    
                What is LSM, and Why LSM?
- 
本文章是对LSM的基本介绍 
- 
LSM trees全称为Log-structured merge trees,是一种用于KV存储引擎的数据结构,它被广泛应用于分布式存储系统 
- 
使用其的工业产品 - TiDB and CockroachDB
- RocksDB, based on LevelDB
 
- 
LSM tree是一种append-friendly data structure,和其他的KV存储引擎底层的数据结构,比如RB-Tree and B-Tree最大的不同就是当更改数据的时候,RB-Tree and B-Tree需要在内存或者磁盘把原有数据进行修改,但是对于LSM tree来说上面的修改操作从之前的同步操作改成了lazily applied to the storage,引擎可以一次性批量将前面修改数据的操作处理成SST (sorted string table) files,然后再将其写入磁盘(这个处理的速度要比前面的RB-Tree and B-Tree快得多,顺序追加日志要比查找到值再改快得多),一旦写入磁盘,引擎将不再直接修改它们,只需要后台的任务定时将以进行compaction即可 
- 
这种架构上的设计让LSM trees易于使用 - 数据在持久存储里面是不可变的,并发控制的问题因为这个变得简单,而且compaction的任务可以交给远程的服务器去做,这样的架构也非常适合现在的云原生(存储和compaction完全可以派发给不同的机器)
- 引擎的读和写的性能可以通过调整compaction算法/算法的参数来进行变化,这样可以让引擎适应多种不同的情境
 
- 
  lsm-structure 
- 
一个LSM存储引擎大致可以分为下面三个部分 - Write-ahead log to persist temporary data for recovery.
- SSTs on the disk to maintain an LSM-tree structure.
- Mem-tables in memory for batching small writes.
 
- 
一个存储引擎应该提供下面四个接口 - Put(key, value): store a key-value pair in the LSM tree.
- Delete(key): remove a key and its corresponding value.
- Get(key): get the value corresponding to a key.
- Scan(range): get a range of key-value pairs.
 
- 
为了保证持久化还会有一个接口 - Sync(): ensure all the operations before- syncare persisted to the disk.
 
- 
一些引擎会将 Put和Write的操作组合起来处理,称为WriteBatch
- 
本教程中LSM的压缩算法选用leveled compaction algorithm,在工业界也很常用 
- Write Path
  write-path 
- The write path of LSM contains four steps:
- Write the key-value pair to the write-ahead log so that it can be recovered after the storage engine crashes.
- Write the key-value pair to memtable. After (1) and (2) are completed, we can notify the user that the write operation is completed.
- (In the background) When a mem-table is full, we will freeze them into immutable mem-tables and flush them to the disk as SST files in the background.
- (In the background) The engine will compact some files in some levels into lower levels to maintain a good shape for the LSM tree so that the read amplification is low.
 
 
     
    