CMU 15-445 Lecture #19: Multi-Version Concurrency Control
CMU 15-445 Database Systems
Lecture #19: Multi-Version Concurrency Control
Multi-Version Concurrency Control
-
常常作为2PL和T/O的辅助手段
-
The DBMS maintains multiple physical versions of a single logical object in the database(维护多个历史版本(像git))
-
When a txn writes to an object, the DBMS creates a new version of that object. 不改动,直接创建一个新的版本
-
When a txn reads an object, it reads the newest version that existed when the txn started.
-
First implementations was Rdb/VMS and InterBase at DEC in early 1980s.
→ Both were by Jim Starkey, co-founder of NuoDB.
→ DEC Rdb/VMS is now “Oracle Rdb”.
→ InterBase was open-sourced as Firebird.
-
解决的问题
- Writers do not block readers.
- Readers do not block writers.
- 我去上面读历史版本就是了
-
Read-only txns can read a consistent snapshot without acquiring locks. 好像在读静态数据
- Use timestamps to determine visibility.用时间戳来确定可见性
- MVCC naturally supports Snapshot Isolation (SI).天然支持快照隔离读
-
Easily support time-travel queries.可以读取某一个时刻的历史版本,和IDE退到昨天的代码很像(其他方案很难做到,会把历史数据直接给写没)
- 从上面的一张图可以看到T1和T2没法做到完全串行化,T2没有读到T1commit上去的数据,所以说只靠MVCC做不到完全串行化,Oracle最高隔离级别就是上面的图,快照隔离
- There are five important MVCC design considerations:
- Concurrency Control Protocol
- Version Storage
- Garbage Collection
- Index Management
- Deletes
- Concurrency Control Protocol
- Approach #1: Timestamp Ordering: Assign txns timestamps that determine serial order.
- Approach #2: Optimistic Concurrency Control: Three-phase protocol from last class,Use private workspace for new versions.
- Approach #3: Two-Phase Locking: Txns acquire appropriate lock on physical version before they can read/write a logical tuple.
Design consideration: Version Storage
- Version Storage
-
The DBMS uses the tuples’ pointer field to create a version chain per logical tuple
- This allows the DBMS to find the version that is visible to a particular txn at runtime.
- Indexes always point to the “head” of the chain.
-
Approach #1: Append-Only Storage: New versions are appended to the same table space.
-
- 两种插法:头插法和尾插法,头插法搜索效率高(大部分txn要最新的数据)
-
Approach #2: Time-Travel Storage: Old versions are copied to separate table space.
-
Approach #3: Delta Storage: The original values of the modified attributes are copied into a separate delta record space.
- MySQL用的就是这个方案
-
Design consideration: Garbage Collection
-
Garbage Collection
-
历史版本不能一直存着(那样存储空间就会被严重浪费),所以需要定期回收无用的历史版本
-
怎么判断无用?
- 现在运行的事务都看不到这个版本了(Snapshot Isolation)
- 创建这个版本的事务回滚了
-
两个问题
- 怎么发现过期的版本?
- 决定何时回收才能保证内存安全?
-
-
Approach #1: Tuple-level
-
Find old versions by examining tuples directly.
-
Background Vacuuming vs. Cooperative Cleaning
-
-
Approach #2: Transaction-level
- Txns keep track of their old versions so the DBMS does not have to scan tuples to determine visibility
Design consideration: Index Management
-
Primary key indexes point to version chain head
-
修改主键=先删除后插入
-
Secondary indexes
-
Approach #1: Logical Pointers: 记录数据的逻辑地址,比如主键的值
-
Approach #2: Physical Pointers: 记录物理地址
-
逻辑地址就没有上面的毛病,只需要改主键索引,辅助索引就不需要变
-
还有一个折中的方案就是在物理地址索引和版本链之间加上一个表做索引到版本链的代理
-
-
Delete
- Approach #1: Deleted Flag: Maintain a flag to indicate that the logical tuple has been deleted after the newest physical version. This can either be in the tuple header or a separate column.
- Approach #2: Tombstone Tuple: Create an empty physical version to indicate that a logical tuple is deleted. Use a separate pool for tombstone tuples with only a special bit pattern in version chain pointer to reduce storage overhead.