目录

Operating System Chapter10 状态机模型的应用


Operating System

$Nanjing\ University\rightarrow Yanyan\ Jiang\newline$

Overview

复习


本次课回答的问题

  • Q: 状态机模型如此有用,还能更有用一点吗?

本次课主要内容

  • 终于做完了铺垫,是时候让你感受到 “真正的力量” 了
    • 都是没用的内容,当我口胡就行了

状态机:理解我们的世界

哲 ♂ 学探讨

  • 我们的物理世界是 “确定规则” 的状态机吗?

    • 宏观物理世界近似于 deterministic 的状态机 (经典力学)

    • 微观世界可能是 non-deterministic 的 (量子力学)

把物理世界建模成基本粒子的运动

哲 ♂ 学探讨 (cont’d)

可以在这个模型上严肃地定义很多概念:预测未来、时间旅行……

  • 成为你理解物理 (和计算机) 世界的参考

状态机模型:理解编译器和现代 CPU

  • 编译器:源代码$S$(状态机) → 二进制代码$C$(状态机)

$$ C=compile(S) $$

  • 编译 (优化) 的正确性 (Soundness):

    • S 与 C 的可观测行为严格一致 (二进制落实到硬件也是如此,严格一致不是句句对应,而是状态机一致,比如一个时钟周期内执行两条指令 $\Longrightarrow$ instruction-level parallelism
      • system calls; volatile variable loads/stores; termination
  • 超标量 (superscalar)/乱序执行处理器

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
#include <stdio.h>
#include <time.h>

#define LOOP 1000000000ul

__attribute__((noinline)) void loop() {
  for (long i = 0; i < LOOP; i++) {
    asm volatile(
      "mov $1, %%rax;"
      "mov $1, %%rdi;"
      "mov $1, %%rsi;"
      "mov $1, %%rdx;"
      "mov $1, %%rcx;"
      "mov $1, %%r10;"
      "mov $1, %%r8;"
      "mov $1, %%r9;"
    :::"rax", "rdi", "rsi", "rdx", "rcx", "r10", "r8", "r9");
  }
}

int main() {
  clock_t st = clock();
  loop();
  clock_t ed = clock();
  double inst = LOOP * (8 + 2) / 1000000000;
  double ips = inst / ((ed - st) * 1.0 / CLOCKS_PER_SEC);
  printf("%.2lfG instructions/s\n", ips);
}
  • 在优化下执行的效果
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
┌──(kali㉿kali)-[~/chapter9]
└─$ screenfetch            
..............                                  
            ..,;:ccc,.                           kali@kali
          ......''';lxO.                         OS: Kali Linux 
.....''''..........,:ld;                         Kernel: x86_64 Linux 6.0.0-kali3-amd64
           .';;;:::;,,.x,                        Uptime: 49m
      ..'''.            0Xxoc:,.  ...            Packages: 2707
  ....                ,ONkc;,;cokOdc',.          Shell: zsh 5.9
 .                   OMo           ':ddo.        Resolution: 1714x874
                    dMc               :OO;       DE: Xfce
                    0M.                 .:o.     WM: Xfwm4
                    ;Wd                          WM Theme: Kali-Dark
                     ;XO,                        GTK Theme: Kali-Dark [GTK2]
                       ,d0Odlc;,..               Icon Theme: Flat-Remix-Blue-Dark
                           ..',;:cdOOd::,.       Font: Cantarell 11
                                    .:d;.':;.    Disk: 19G / 80G (26%)
                                       'd,  .'   CPU: AMD Ryzen 9 5900HX with Radeon Graphics @ 4x 3.294GHz
                                         ;l   .. GPU: VMware SVGA II Adapter
                                          .o     RAM: 1501MiB / 1972MiB
                                            c   
                                            .'  
                                             .  
                                                                                                                                            
┌──(kali㉿kali)-[~/chapter9]
└─$ gcc -O2 ilp-demo.c -o ilp-demo   
                                                                                                                                            
┌──(kali㉿kali)-[~/chapter9]
└─$ ./ilp-demo                      
19.06G instructions/s

┌──(kali㉿kali)-[~/chapter9]
└─$ ./ilp-demo
19.38G instructions/s
                                                                                                                                            
┌──(kali㉿kali)-[~/chapter9]
└─$ ./ilp-demo
19.19G instructions/s
  • 优化过后,每一个时钟周期可以执行远远超过一条指令

查看状态机执行

Trace 和调试器

  • 程序执行 = 状态机执行

    • 我们能不能 “hack” 进这个状态机
      • 观察状态机的执行
        • strace/gdb
  • 甚至记录和改变状态机的执行

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
┌──(kali㉿kali)-[~/chapter9]
└─$ strace ./ilp-demo 
execve("./ilp-demo", ["./ilp-demo"], 0x7ffe10420e10 /* 55 vars */) = 0
brk(NULL)                               = 0x55de2e4e2000
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9eb4b2c000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=89062, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 89062, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f9eb4b16000
close(3)                                = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0Ps\2\0\0\0\0\0"..., 832) = 832
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
newfstatat(3, "", {st_mode=S_IFREG|0755, st_size=1922136, ...}, AT_EMPTY_PATH) = 0
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
mmap(NULL, 1970000, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f9eb4935000
mmap(0x7f9eb495b000, 1396736, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x26000) = 0x7f9eb495b000
mmap(0x7f9eb4ab0000, 339968, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x17b000) = 0x7f9eb4ab0000
mmap(0x7f9eb4b03000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1ce000) = 0x7f9eb4b03000
mmap(0x7f9eb4b09000, 53072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f9eb4b09000
close(3)                                = 0
mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9eb4932000
arch_prctl(ARCH_SET_FS, 0x7f9eb4932740) = 0
set_tid_address(0x7f9eb4932a10)         = 5644
set_robust_list(0x7f9eb4932a20, 24)     = 0
rseq(0x7f9eb4933060, 0x20, 0, 0x53053053) = 0
mprotect(0x7f9eb4b03000, 16384, PROT_READ) = 0
mprotect(0x55de2e4dc000, 4096, PROT_READ) = 0
mprotect(0x7f9eb4b5e000, 8192, PROT_READ) = 0
prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
munmap(0x7f9eb4b16000, 89062)           = 0
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=0, tv_nsec=1796473}) = 0
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=0, tv_nsec=514062681}) = 0
newfstatat(1, "", {st_mode=S_IFCHR|0600, st_rdev=makedev(0x88, 0), ...}, AT_EMPTY_PATH) = 0
getrandom("\x2a\x03\xa3\xe2\x7d\xa4\x84\x07", 8, GRND_NONBLOCK) = 8
brk(NULL)                               = 0x55de2e4e2000
brk(0x55de2e503000)                     = 0x55de2e503000
write(1, "19.52G instructions/s\n", 2219.52G instructions/s
) = 22
exit_group(0)                           = ?
+++ exited with 0 +++
  • strace调试最小的Hello OS,可以清楚地看到系统调用
1
2
3
4
5
6
7
┌──(kali㉿kali)-[~/chapter9]
└─$ strace ./minimal
execve("./minimal", ["./minimal"], 0x7ffefc958480 /* 55 vars */) = 0
write(1, "\33[01;31mHello, OS World\33[0m\n", 28Hello, OS World
) = 28
exit(1)                                 = ?
+++ exited with 1 +++
  • -T看每一个系统调用花费的时间
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
┌──(kali㉿kali)-[~/chapter9]
└─$ strace -T ./ilp-demo
execve("./ilp-demo", ["./ilp-demo"], 0x7ffdfd498e18 /* 55 vars */) = 0 <0.000258>
brk(NULL)                               = 0x55a0fd277000 <0.000005>
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb9c980a000 <0.000113>
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory) <0.000076>
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 <0.000077>
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=89062, ...}, AT_EMPTY_PATH) = 0 <0.000077>
mmap(NULL, 89062, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fb9c97f4000 <0.000074>
close(3)                                = 0 <0.000072>
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 <0.000076>
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0Ps\2\0\0\0\0\0"..., 832) = 832 <0.000072>
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784 <0.000071>
newfstatat(3, "", {st_mode=S_IFREG|0755, st_size=1922136, ...}, AT_EMPTY_PATH) = 0 <0.000076>
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784 <0.000071>
mmap(NULL, 1970000, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fb9c9613000 <0.000084>
mmap(0x7fb9c9639000, 1396736, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x26000) = 0x7fb9c9639000 <0.000073>
mmap(0x7fb9c978e000, 339968, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x17b000) = 0x7fb9c978e000 <0.000073>
mmap(0x7fb9c97e1000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1ce000) = 0x7fb9c97e1000 <0.000060>
mmap(0x7fb9c97e7000, 53072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fb9c97e7000 <0.000074>
close(3)                                = 0 <0.000072>
mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb9c9610000 <0.000074>
arch_prctl(ARCH_SET_FS, 0x7fb9c9610740) = 0 <0.000073>
set_tid_address(0x7fb9c9610a10)         = 8876 <0.000071>
set_robust_list(0x7fb9c9610a20, 24)     = 0 <0.000097>
rseq(0x7fb9c9611060, 0x20, 0, 0x53053053) = 0 <0.000109>
mprotect(0x7fb9c97e1000, 16384, PROT_READ) = 0 <0.000079>
mprotect(0x55a0fcb07000, 4096, PROT_READ) = 0 <0.000072>
mprotect(0x7fb9c983c000, 8192, PROT_READ) = 0 <0.000073>
prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0 <0.000074>
munmap(0x7fb9c97f4000, 89062)           = 0 <0.000094>
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=0, tv_nsec=876239}) = 0 <0.000075>
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=0, tv_nsec=511881481}) = 0 <0.000199>
newfstatat(1, "", {st_mode=S_IFCHR|0600, st_rdev=makedev(0x88, 0), ...}, AT_EMPTY_PATH) = 0 <0.000090>
getrandom("\x6e\x63\x59\xe8\xb3\xfe\x8a\x8d", 8, GRND_NONBLOCK) = 8 <0.000143>
brk(NULL)                               = 0x55a0fd277000 <0.000106>
brk(0x55a0fd298000)                     = 0x55a0fd298000 <0.000052>
write(1, "19.57G instructions/s\n", 2219.57G instructions/s
) = 22 <0.000085>
exit_group(0)                           = ?
+++ exited with 0 +++
/img/Operating System/chapter10-1.png
gdb layout src
  • 注意为了让layout src能够成功显示,编译时需要使用gcc-g编译选项
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
┌──(kali㉿kali)-[~/chapter9]
└─$ gcc -O2 ilp-demo.c -o ilp-demo -g  
                                                                                                                                            
┌──(kali㉿kali)-[~/chapter9]
└─$ gdb ilp-demo                     
GNU gdb (Debian 13.1-2) 13.1
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ilp-demo...
(gdb) start
Temporary breakpoint 1 at 0x1060: file ilp-demo.c, line 22.
Starting program: /home/kali/chapter9/ilp-demo 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Temporary breakpoint 1, main () at ilp-demo.c:22
22        clock_t st = clock();
(gdb) layout src

应用 (1): Time-Travel Debugging

  • 程序执行是随时间 “前进” 的$s_0 \rightarrow s_1 \rightarrow s_2 \rightarrow \dots$

  • 能否在时间上 “后退”? (time-travel)

    • 经常 gdb 不小心 step 过了,从头再来……
    • 记录所有的$s_i$,就能实现任意的 time-traveling

凉性循环,看一会rust

  • 记录所有$s_i$的开销太大($s_i$由内存+寄存器组成)
    • 一条指令的side-effect通常有限

应用 (1): Time-Travel Debugging (cont’d)

  • gdb 的隐藏功能 (大家读过 gdb 的手册了吗?)

    • record full - 开始记录

    • record stop - 结束记录

    • reverse-step/reverse-stepi - “时间旅行调试”


  • 例子:调试 rdrand.c
    • Reverse execution 不是万能的
      • 有些复杂的指令 (syscall) 无法保证 $\Longrightarrow$ record and replay
1
2
3
4
5
6
7
8
#include <stdio.h>
#include <stdint.h>

int main() {
  uint64_t val;
  asm volatile ("rdrand %0": "=r"(val));
  printf("rdrand returns %016lx\n", val);
}

asm volatile ("rdrand %0": "=r"(val));使用了 GCC 内联汇编(Inline Assembly)的语法,使用 Intel 的 RDRAND 指令生成一个随机数,并将该随机数保存到变量 val 中。

具体来说,%0 表示占位符,用来代表第一个输入或输出操作数,这里是 val。而 =r 则表示将 val 寄存器中的值作为输出值,同时通知编译器该变量会被修改。

因此,该代码使用 RDRAND 生成一个随机数,并将其保存到 val 变量中。最后通过 printf 打印出该随机数。

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
┌──(jungle㉿LAPTOP-A7S3TAA4)-[/mnt/d/work for vscode/chapter10]
└─$ gcc -O1 rdrand.c -g -o rdrand   
                                                                                         
┌──(jungle㉿LAPTOP-A7S3TAA4)-[/mnt/d/work for vscode/chapter10]
└─$ ./rdrand
rdrand returns d8112baa2a77dd3e

┌──(jungle㉿LAPTOP-A7S3TAA4)-[/mnt/d/work for vscode/chapter10]
└─$ ./rdrand
rdrand returns 106671f827a8f49e                                                           

┌──(jungle㉿LAPTOP-A7S3TAA4)-[/mnt/d/work for vscode/chapter10]
└─$ ./rdrand
rdrand returns 04adc0b810ec0565

volatile提醒编译器它后面所定义的变量随时都有可能改变,因此编译后的程序每次需要存储或读取这个变量的时候,告诉编译器对该变量不做优化,都会直接从变量内存地址中读取数据,从而可以提供对特殊地址的稳定访问。

如果没有volatile关键字,则编译器可能优化读取和存储,可能暂时使用寄存器中的值,如果这个变量由别的程序更新了的话,将出现不一致的现象。(简洁的说就是:volatile关键词影响编译器编译的结果,用volatile声明的变量表示该变量随时可能发生变化,与该变量有关的运算,不要进行编译优化,以免出错)

  • 更改之后进行gdb
1
2
3
4
5
6
7
8
#include <stdio.h>
#include <stdint.h>

int main() {
  uint64_t volatile val = 114514;
  asm volatile ("rdrand %0": "=r"(val));
  printf("rdrand returns %016lx\n", val);
}
  • 每一次的调试结果都不一样,那我们应该如何复现BUG
    • 打开记录模式
1
(gdb) record full
/img/Operating System/chapter10-2.png
on wsl
/img/Operating System/chapter10-3.png
layout asm
/img/Operating System/chapter10-4.png
rax值相同
  • val的值已经改变,如何向上回溯?
    • 使用rsi往回执行
/img/Operating System/chapter10-5.png
rsi
/img/Operating System/chapter10-6.png
返回旧值

应用 (2): Record & Replay

  • 在程序执行时记录信息,结束后重现程序的行为
    • 确定的程序不需要任何记录
      • 假设$s_0$执行 1,000,000 条确定的指令后得到$s^{`}$
        • 那么只要记录$s_0$和1,000,000
        • 就能通过“再执行一次” 推导出$s^{`}$

  • 记录:[指令数(确定指令) + 结果(非确定指令)] $\times n$

应用 (2): Record & Replay (cont’d)

采样状态机执行

关于性能优化

Premature optimization is the root of all evil. (D. E. Knuth)

  • 那到底怎么样才算 mature 呢?

    • 状态机的执行需要时间;对象需要占用空间

    • 需要理解好 “时间花在哪里”、“什么对象占用了空间”

  • 我们需要真实执行的性能摘要

    • 本质的回答:“为了做某件事到底花去了多少资源”

    • 简化的回答:“一段时间内资源的消耗情况”

Profiler 和性能摘要

性能摘要需要对程序执行性能影响最小(不能因为这个进程过多影响你检测的程序),往往不需要 full trace。

  • 隔一段时间 (比如一毫秒)“暂停” 程序、观察状态机的执行
    • 中断就可以做到
    • 将状态 [ss′] “记账”
      • 执行的语句
      • 函数调用栈
      • 服务的请求
    • 得到统计意义的性能摘要

例子:Linux Kernel perf (支持硬件 PMU) - ilp-demo.c

  • perf list, perf stat (-e), perf record, perf report
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
#include <stdio.h>
#include <time.h>

#define LOOP 1000000000ul

__attribute__((noinline)) void loop() {
  for (long i = 0; i < LOOP; i++) {
    asm volatile("mov $1, %%rax;"
                 "mov $1, %%rdi;"
                 "mov $1, %%rsi;"
                 "mov $1, %%rdx;"
                 "mov $1, %%rcx;"
                 "mov $1, %%r10;"
                 "mov $1, %%r8;"
                 "mov $1, %%r9;" ::
                     : "rax", "rdi", "rsi", "rdx", "rcx", "r10", "r8", "r9");
  }
}

int main() {
  clock_t st = clock();
  loop();
  clock_t ed = clock();
  double inst = LOOP * (8 + 2) / 1000000000;
  double ips = inst / ((ed - st) * 1.0 / CLOCKS_PER_SEC);
  printf("%.2lfG instructions/s\n", ips);
}
  • 安装perf
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
┌──(jungle㉿LAPTOP-A7S3TAA4)-[/mnt/d/work for vscode/chapter10]
└─$ sudo apt install linux-tools-common
[sudo] password for jungle: 
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following NEW packages will be installed:
  linux-tools-common
0 upgraded, 1 newly installed, 0 to remove and 48 not upgraded.
Need to get 295 kB of archives.
After this operation, 812 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 linux-tools-common all 5.15.0-69.76 [295 kB]
Fetched 295 kB in 0s (4448 kB/s)           
debconf: unable to initialize frontend: Dialog
debconf: (Dialog frontend requires a screen at least 13 lines tall and 31 columns wide.)
debconf: falling back to frontend: Readline
Selecting previously unselected package linux-tools-common.
(Reading database ... 81634 files and directories currently installed.)
Preparing to unpack .../linux-tools-common_5.15.0-69.76_all.deb ...
Unpacking linux-tools-common (5.15.0-69.76) ...
Setting up linux-tools-common (5.15.0-69.76) ...
Processing triggers for man-db (2.10.2-1) ...

实际中的性能优化

  • 你们遇到的大部分情况

    • 二八定律:80% 的时间消耗在非常集中的几处代码

    • L1 (pmm): 小内存分配时的 lock contention

      • profiler 直接帮你解决问题

  • 工业界遇到的大部分情况
    • 木桶效应:每个部分都已经 tune 到局部最优了
      • 剩下的部分要么 profiler 信息不完整,要么就不好解决
      • (工程师整天都对着 profiler 看得头都大了)
      • The flame graph(火焰图) (CACM'16)

Model Checker/Verifier

Model Checker 的威力大家已经知道了

  • 150 行代码的 model-checker.py

    • 证完所有《操作系统》课上涉及的并发程序

    • 复现 OSTEP 教科书上的并发 bug (条件变量错误唤醒)


Model Checker: 不仅是并发

  • 任何 “non-deterministic” 的状态机都可以检查
1
2
3
4
5
6
7
u32 x = rdrand();
u32 y = rdrand();
if (x > y)
  if (x * x + y * y == 65)
    bug();
...
assert(ptr); // 可能空指针吗?

总结

  • 本次课回答的问题
    • Q: 状态机的视角给了我们什么?

  • Take-away messages

    • 编程 (状态机) 就是全世界

    • 状态机可以帮我们

      • 建立物理世界的公理体系
        • 理解调试器、Trace, profiler
        • 自动分析程序的执行 (model checker)

声明:本文章引用资料与图像均已做标注,如有侵权本人会马上删除