Computer Organization and Architecture Instruction Sets Addressing Modes and Formats


Computer Organization and Architecture

Instruction Sets: Addressing Modes and Formats

Outline

  • Addressing

  • x86 and ARM addressing modes

  • Instruction Formats

  • x86 and ARM instruction formats

Addressing

What is addressing mode?

  • Elements in the instruction include: opcode, source operand, destination operand, and next instruction address

  • Possible positions of operands

    • Memory

    • Register

    • Immediate

    • I/O

  • Addressing mode specifies how to obtain an operand of an instruction

  • Addressing is relatively simple when the operand is in a register or is an immediate value

  • If the operand is in memory, two goals conflict

    • The address field of an operand in an instruction cannot be too long

    • The program still needs to reach a large memory space

  • Memory addressing therefore offers multiple addressing modes

    • They balance the addressable range, addressing flexibility, addressing complexity, and the amount of storage occupied

Memory addressing

  • Absolute

  • Displacement

  • Indexed

  • Register indirect

  • Memory indirect

  • Autoincrement

  • Autodecrement


Advantages

  • Expanded addressable address space

  • Improved addressing flexibility

  • Better program structure, which helps programmers write more flexible programs

    • For example, array access and pointer-based access

Common addressing modes

  • Immediate

  • Direct

  • Indirect

  • Register

  • Register Indirect

  • Displacement (Indexed)

  • Stack

Immediate addressing
  • Operand is part of instruction

    • Operand = address field
  • e.g.

    • ADD 5

    • Add 5 to contents of accumulator

    • 5 is operand

  • No memory reference to fetch data

    • Fast

    • Limited range: the operand can be no larger than the address field of the instruction

    • Inflexible

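MOV BL,10
  • The instruction contains the opcode and the immediate value

  • In more complex instructions, the operands may combine an immediate value with other addressing modes

  • This instruction moves the immediate value 10 into register BL

  • Immediate addressing appears in many instructions, but it is quite restrictive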

Direct addressing
  • Address field contains address of operand

  • Effective address (EA) = address field (A)

    • Single memory reference to access data

    • No additional calculations to work out effective address

    • Limited address space

ADD A
  • Add contents of cell A to accumulator

  • Look in memory at address A for operand

/img/Computer Organization and Architecture/chapter11-1.png
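  • The instruction gives the address of the operand in main memory

  • A single memory access retrieves the operand

  • The operand address sits in the instruction itself. Because the instruction length is limited, only a few bits are available for the address field, so the addressable space is limited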

Indirect addressing
  • Memory cell pointed to by address field contains the address of (pointer to) the operand

  • EA = (A)

    • Access the storage unit with address A to obtain the actual address of the operand

    • Access the memory according to this address to get the operand

    • Memory needs to be accessed twice

ADD (A)
  • Add contents of cell pointed to by contents of A to accumulator

  • Large address space

    • $2^n$, where $n$ = word length
  • May be nested, multilevel, cascaded

    • e.g. EA=((A))

    • The effective address is the content of the memory cell whose address is (A)

  • Multiple memory accesses to find operand

  • Hence slower

/img/Computer Organization and Architecture/chapter11-2.png
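  • The instruction contains an address A

  • Memory is accessed at A to obtain the address of the operand

  • Memory is accessed again at that address to obtain the operand

  • Two memory accesses are required, so this mode is relatively slow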
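
The difference between immediate, direct, and indirect addressing can be sketched in Python. This is a toy model rather than any real machine: the dictionary `memory`, its contents, and the function `fetch_operand` are assumptions made for illustration.

```python
# Toy memory: address -> contents (values are made up for the example).
memory = {100: 300, 300: 42}


def fetch_operand(mode, a):
    """Return (operand, memory_references) for address field `a`."""
    if mode == "immediate":   # operand = A, no memory reference
        return a, 0
    if mode == "direct":      # EA = A, one memory reference
        return memory[a], 1
    if mode == "indirect":    # EA = (A), two memory references
        ea = memory[a]        # first reference fetches the pointer
        return memory[ea], 2  # second reference fetches the operand
    raise ValueError(f"unknown mode: {mode}")


print(fetch_operand("immediate", 5))   # (5, 0)
print(fetch_operand("direct", 100))    # (300, 1)
print(fetch_operand("indirect", 100))  # (42, 2)
```

The second element of each result counts memory references, which is why indirect addressing is the slowest of the three.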

Register addressing
  • Operand is held in register named in address field

  • EA = R

  • Limited number of registers

    • Register address field is 3-5 bits, and the number of accessible registers ranges from 8 to 32
  • Very small address field needed

    • Shorter instructions
    • Faster instruction fetch
  • Similar to direct addressing, but the address field names a register instead of a memory cell

    • No memory access

    • Very fast execution

  • Very limited address space

  • Multiple registers help performance

    • Requires good assembly programming or compiler writing

    • Frequently used operands should be kept in registers

/img/Computer Organization and Architecture/chapter11-3.png
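  • Register addressing is very similar to direct memory addressing

  • The difference is that it accesses a register inside the CPU

  • Register access is much faster than memory access, but registers are few, so making good use of register addressing improves processing speed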

Register indirect addressing
  • Similar to indirect addressing

  • EA = (R)

    • Operand is in memory cell pointed to by contents of register R
  • Large address space ($2^n$)

    • n is the word length of register
  • Much faster than indirect addressing

    • One memory access + one register access
/img/Computer Organization and Architecture/chapter11-4.png
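  • The address field of the instruction names register R, and the value in R is the address of the operand in main memory

  • Two accesses are needed to obtain the operand: first a register access, then a memory access

  • Because register access time is very short, register indirect addressing takes about as long as a single memory access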

Displacement addressing
  • Add a displacement to a base address to obtain the actual address of the operand

    • EA = A + (R)
  • Address field holds two values

    • A = base value
    • R = register that holds displacement
    • or vice versa
  • The operand address is expressed relative to a base address, which is common when working with virtual addresses

/img/Computer Organization and Architecture/chapter11-5.png
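  • The instruction contains two address fields: register R and base address A

  • To form the address, the displacement is read from the register named by R and added to base address A, giving the operand's address in main memory; memory is then accessed to obtain the operand

  • Displacement addressing has three variants: relative addressing, base-register addressing, and indexed addressing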

Relative addressing
  • A version of displacement addressing

    • R = Program counter, PC

    • EA = A + (PC)

    • The operand is fetched from memory at the address formed from PC and A

  • Locality of reference & cache usage

    • Program counter is instruction address

    • By the principle of locality, the referenced location is very likely to be in the cache, so the access is fast

/img/Computer Organization and Architecture/chapter11-6.png
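  • Relative addressing implicitly uses the PC as the base address and the address field A in the instruction as the displacement

  • Adding the two gives the operand's address in main memory

  • Memory is accessed at this address to obtain the actual operand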
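
As a rough illustration of EA = A + (PC), the sketch below treats the address field as a signed two's-complement value so that the target can lie before or after the current instruction. The function name `relative_ea` and the 8-bit field width are assumptions for the example, not part of any particular instruction set.

```python
def relative_ea(pc, a, bits=8):
    """EA = A + (PC), with A read as a signed field of `bits` bits (assumed)."""
    if a >= 1 << (bits - 1):  # top bit set: sign-extend to a negative value
        a -= 1 << bits
    return pc + a


print(hex(relative_ea(0x1000, 0x10)))  # 0x1010, forward by 16
print(hex(relative_ea(0x1000, 0xF0)))  # 0xff0, backward by 16
```

A short address field is enough here because most references fall close to the current instruction.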

Base-register addressing
  • Use a register R as the base register

    • R holds pointer to base address

    • R may be explicit or implicit

    • e.g. segment registers in 80x86 are implicit

  • The address field in the instruction gives displacement A

  • Adding (R) and A gives the actual address of the operand

/img/Computer Organization and Architecture/chapter11-7.png
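  • The base register BR holds the base address, and the address field in the instruction holds the displacement

  • Adding the two gives the operand address; memory is then accessed to fetch the operand

  • Base-register addressing involves: 1. one register access; 2. one addition; 3. one main memory access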

Indexed addressing
  • One type of displacement addressing mode

  • The base address is in the address field, and the displacement is in the register

    • A = base

    • R = displacement

    • EA = A + (R)

  • Good for accessing arrays

    • EA = A + (R)
    • R++
/img/Computer Organization and Architecture/chapter11-8.png
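  • In indexed addressing, the register holds the displacement and the base is the address given in the instruction

  • The base and the displacement in the register are added to give the memory address

  • Memory is accessed at this address to obtain the operand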
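
The array-access pattern (EA = A + (R), then R++) can be sketched as a loop. The memory contents and the helper `sum_array` are hypothetical, and the memory is assumed to be word-addressed so that the index advances by 1 per element.

```python
def sum_array(memory, base, length):
    """Walk an array with indexed addressing: EA = A + (R), then R++."""
    r = 0                  # index register R
    total = 0
    while r < length:
        ea = base + r      # EA = A + (R)
        total += memory[ea]
        r += 1             # R++ moves to the next element
    return total


memory = {200: 3, 201: 5, 202: 7}
print(sum_array(memory, base=200, length=3))  # 15
```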

Combinations
  • Post-index: indexing after indirect addressing

    • First fetch the address from memory, then index it

    • EA = (A) + (R)

/img/Computer Organization and Architecture/chapter11-9.png
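  • The address field of the instruction is used to access memory and obtain the direct address of the operand
  • The direct address is then indexed by the register value to give the actual operand address, which is accessed to obtain the operand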

  • Pre-index: indirect addressing after indexing

    • Index first, then read memory at the resulting address to get the operand address

    • EA = (A + (R))

/img/Computer Organization and Architecture/chapter11-10.png
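  • The address field and the register are first combined by indexing to give the indirect address of the operand

  • Memory is accessed to obtain the actual address of the operand

  • Memory is accessed once more to obtain the operand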
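
The two combinations differ only in the order of the indirection and the addition. The sketch below works through both formulas on a made-up memory image; `post_index_ea`, `pre_index_ea`, and all the values are assumptions for illustration.

```python
# Hypothetical memory image: address -> contents.
memory = {100: 500, 103: 700, 503: 11, 700: 22}


def post_index_ea(a, r):
    """EA = (A) + (R): indirect first, then index."""
    return memory[a] + r


def pre_index_ea(a, r):
    """EA = (A + (R)): index first, then indirect."""
    return memory[a + r]


a, r = 100, 3
print(post_index_ea(a, r), memory[post_index_ea(a, r)])  # 503 11
print(pre_index_ea(a, r), memory[pre_index_ea(a, r)])    # 700 22
```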

Stack addressing
  • Operand is implicitly on top of stack

  • e.g. ADD

    • Pop the top two numbers from the stack

    • Add the two numbers

    • Push the sum
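
A stack-based ADD needs no address fields at all, which the following sketch makes visible. A Python list stands in for the stack, with the top at the end of the list; the function name `stack_add` is an assumption.

```python
def stack_add(stack):
    """ADD with implicit operands: pop two values, push their sum."""
    b = stack.pop()      # top of stack
    a = stack.pop()      # next value down
    stack.append(a + b)  # push the sum


stack = [2, 3, 4]
stack_add(stack)
print(stack)  # [2, 7]
```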

x86 and ARM addressing modes

Swapping

  • Problem: I/O is so slow compared with the CPU that even in a multiprogramming system the CPU can be idle most of the time

  • Solutions

    • Increase main memory

      • Expensive
      • Leads to larger programs
    • Swapping

Partitioning

  • Splitting memory into sections to allocate to processes (including Operating System)

  • Fixed-sized partitions

    • May not be equal size

    • Process is fitted into smallest hole that will take it (best fit)

    • Some memory is wasted

    • This waste motivates variable-sized partitions

Variable sized partitions
  • Allocate exactly the required memory to a process

    • This leads to a hole at the end of memory, too small to use

    • Only one small hole - less waste

  • When all processes are blocked, swap out a process and bring in another

    • New process may be smaller than swapped out process
    • Another hole
  • Eventually there are many holes, which is called fragmentation

  • Solutions

    • Coalesce - Join adjacent holes into one large hole

    • Compaction - From time to time go through memory and move all holes into one free block

Relocation

  • Instructions contain addresses

    • Locations of data

    • Addresses for instructions (branching)

  • No guarantee that process will load into the same place in memory

    • Logical address - relative to beginning of program

    • Physical address - actual location in memory (this time)

    • Automatic conversion using base address
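
The automatic conversion from logical to physical address amounts to one addition. In this sketch the function name `to_physical` and the base addresses are made up; the point is that the same logical address lands in different physical locations depending on where the process was loaded.

```python
def to_physical(logical, base):
    """Physical address = base address of the process + logical address."""
    return base + logical


# The same logical address after two different loads of the program.
print(hex(to_physical(0x40, base=0x8000)))  # 0x8040
print(hex(to_physical(0x40, base=0xC000)))  # 0xc040
```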

Paging

  • Use paging to solve the problem of memory waste

    • Split memory into equal-sized, small chunks called page frames

    • Split programs (processes) into equal-sized, small chunks called pages

    • Allocate the required number of page frames to a process

  • Operating System is responsible for the management of page tables

    • A process does not require contiguous page frames

    • Each process uses a page table to record which page frames in memory it uses

  • Each process has its own page table

  • Each page table entry contains the frame number of the corresponding page in main memory

  • Each entry needs two extra bits to indicate

    • Whether the page is in main memory or not

    • Whether the contents of the page have been altered since it was last loaded
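
A page-table lookup can be sketched as follows. The 4 KB page size, the table contents, and the names `translate` and `PageFault` are assumptions for illustration; each entry carries the frame number plus the present and modified bits described above.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class PageFault(Exception):
    """Raised when the referenced page is not in main memory."""


# Hypothetical per-process page table: page number -> entry.
page_table = {
    0: {"frame": 5, "present": True, "modified": False},
    1: {"frame": 9, "present": True, "modified": True},
    2: {"frame": None, "present": False, "modified": False},
}


def translate(table, logical):
    """Map a logical address to a physical address through the page table."""
    page, offset = divmod(logical, PAGE_SIZE)
    entry = table[page]
    if not entry["present"]:
        raise PageFault(page)  # the OS would load the page from disk here
    return entry["frame"] * PAGE_SIZE + offset


print(hex(translate(page_table, 0x1234)))  # 0x9234
```

Pages 0 and 1 sit in frames 5 and 9, which shows that a process does not need contiguous page frames.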

Real and virtual memory

  • Real memory

    • Main memory, the actual RAM
  • Virtual memory

    • Memory on disk

    • Allows for effective multiprogramming and relieves the user of tight constraints of main memory

  • Advantages of virtual memory

    • Not every process has to be loaded into main memory in full

    • More processes can run concurrently

    • Improved operational efficiency

Segmentation

  • Paging is not (usually) visible to the programmer

  • Segmentation is visible to the programmer

  • Usually different segments are allocated to program and data

  • There may be a number of program and data segments

x86 addressing modes

  • x86 adopts a memory management mechanism combining segments and pages

  • Virtual or effective address is offset into segment

    • Starting address plus offset gives linear address

    • This goes through page translation if paging enabled


  • 9 addressing modes available

    • Immediate

    • Register operand

    • Displacement

    • Base

    • Base with displacement

    • Scaled index with displacement

    • Base with index and displacement

    • Base with scaled index and displacement

    • Relative

/img/Computer Organization and Architecture/chapter11-11.png
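  • The logical address given by an instruction has two parts: a segment and an offset within the segment

  • Looking up the segment table gives the segment's starting address; adding the offset gives the operand's linear address

  • Linear addresses are paged, so the page translation mechanism converts the linear address to a physical address, which is finally used to access the operand. A two-level page table is used

  • There are six segment registers; which one is used for a given reference depends on the instruction and the execution context. Each segment register is associated with a segment descriptor that records the segment's access rights, starting address, and length

  • The base register and index register are used to build complex addressing modes

  • The base, the index, and the displacement in the instruction are combined to give the effective address; adding the segment's starting address gives the operand's linear address, which paging then converts to a physical address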


Terms

  • Effective address

  • Physical address

  • LA: linear address

  • SR: segment register

  • B: base register

  • I: index register

  • S: scale factor


x86 addressing modes

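  • Eight 32-bit general-purpose registers: EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP

  • Eight 16-bit general-purpose registers: AX, BX, CX, DX, SI, DI, SP, BP

  • Eight 8-bit general-purpose registers: AH, BH, CH, DH, AL, BL, CL, DL

  • The segment register determines the segment's starting address, from which the linear address is computed

  • In scaled index with displacement mode, the scale factor is 1, 2, 4, or 8. Because x86 is byte-addressed, the scale factor lets the index step through arrays of 16-bit, 32-bit, or 64-bit elements

  • Relative addressing is used mainly in transfer-of-control instructions

  • The displacement is added to the program counter, which points to the next instruction, so the target is expressed relative to the address of the next instruction

  • The displacement is a signed integer, so it can either increase or decrease the address in the program counter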
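
Using the terms defined earlier (SR, B, I, S, and the displacement A), the most general x86 mode computes LA = (SR) + (B) + (I) × S + A. The sketch below is a simplified model: the function name `x86_linear_address`, the 32-bit wraparound, and the example values are assumptions, and page translation is left out.

```python
MASK32 = 0xFFFFFFFF  # assume 32-bit addresses that wrap around


def x86_linear_address(segment_base, base=0, index=0, scale=1, disp=0):
    """LA = (SR) + (B) + (I) * S + A, before any page translation."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale factor must be 1, 2, 4, or 8")
    effective = (base + index * scale + disp) & MASK32  # offset in segment
    return (segment_base + effective) & MASK32


# Element 3 of an array of 32-bit values that starts 8 bytes past the base.
la = x86_linear_address(0x20000, base=0x1000, index=3, scale=4, disp=8)
print(hex(la))  # 0x21014
```

Leaving out `base`, `index`, or `disp` gives the simpler modes in the list, such as base with displacement or scaled index with displacement.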

ARM addressing modes

  • ARM is a RISC architecture processor

  • RISC machines usually keep addressing modes simple, but ARM provides a relatively rich set

  • Only load/store instructions can reference memory

  • Indirectly through base register plus offset

  • Base register itself may be updated during addressing

  • Three addressing modes: offset, pre-index, and post-index

Offset
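  • Offset addressing: offset only, with no update of the base register. The memory address is formed by adding the offset to, or subtracting it from, the base register
STRB r0, [r1,#12]
  • Stores the low byte of r0 to memory at the address given by the value of r1 plus the immediate 12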
Pre-index
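  • The memory address is formed as in offset addressing: the offset is added to or subtracted from the base register

  • The memory address is then written back to the base register, so the base register value increases or decreases by the offset

STRB r0, [r1,#12]!
  • The ! marks pre-indexing

  • After the access, r1 holds r1+12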

Post-index
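  • The operand address is the value in the base register
  • After the access, the offset is added to or subtracted from the base register
STRB r0, [r1],#12
  • An offset written after the closing bracket marks post-indexing

  • The access uses the address in r1, and r1 then becomes r1+12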


  • Base register acts as index register for pre-index and post-index addressing

  • Offset is either an immediate value in the instruction or another register

  • If it is a register, scaled register addressing is available

    • Offset register value scaled by shift operator

    • Instruction specifies shift size

  • Data Processing

    • Register addressing

    • Value in register operands may be scaled using a shift operator

    • Or mixture of register and immediate addressing
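
The three load/store modes differ in which address is used and whether the base register is written back. The following sketch models a byte store in the style of `STRB r0, [r1, #12]`; the function `arm_store_byte`, the register dictionary, and the values are assumptions for illustration.

```python
def arm_store_byte(regs, memory, rt, rn, offset, mode):
    """Store the low byte of regs[rt]; return the address that was used."""
    base = regs[rn]
    if mode == "offset":    # [rn, #offset]: base register unchanged
        addr = base + offset
    elif mode == "pre":     # [rn, #offset]!: use new address, write it back
        addr = base + offset
        regs[rn] = addr
    elif mode == "post":    # [rn], #offset: use old address, then update
        addr = base
        regs[rn] = base + offset
    else:
        raise ValueError(f"unknown mode: {mode}")
    memory[addr] = regs[rt] & 0xFF
    return addr


for mode in ("offset", "pre", "post"):
    regs, memory = {"r0": 0x1AB, "r1": 0x200}, {}
    addr = arm_store_byte(regs, memory, "r0", "r1", 12, mode)
    print(mode, hex(addr), hex(regs["r1"]))
# offset 0x20c 0x200
# pre 0x20c 0x20c
# post 0x200 0x20c
```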


Addressing of branch

  • Branch

    • Only immediate

    • Instruction contains 24 bit value

    • To form the address, this immediate value is shifted left by two bits so that the target falls on a 32-bit word boundary

    • The shift makes the offset effectively 26 bits, giving an address range of ±32 MB around the program counter
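
The branch arithmetic can be checked with a few lines of Python. In this sketch `pc` stands for whatever value the processor uses as the program counter when the branch executes; the function name `branch_target` and the example addresses are assumptions.

```python
def branch_target(pc, imm24):
    """Sign-extend the 24-bit field, shift left by 2, and add to the PC."""
    if imm24 & 0x800000:  # bit 23 is the sign bit
        imm24 -= 1 << 24
    return (pc + (imm24 << 2)) & 0xFFFFFFFF


print(hex(branch_target(0x8000, 0x000002)))  # 0x8008, forward 2 words
print(hex(branch_target(0x8000, 0xFFFFFE)))  # 0x7ff8, backward 2 words

# Largest reach in each direction, in bytes.
print((0x7FFFFF << 2), -(0x800000 << 2))  # 33554428 -33554432
```

The last line shows where the ±32 MB figure comes from: 2^23 words of 4 bytes each in either direction.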


ARM Load/Store Multiple Addressing

  • One instruction can load or store several registers at once

    • Load or store a set of general registers
  • A 16-bit field in the instruction specifies the list of registers

    • The listed registers map to consecutive memory words
    • The memory word with the lowest address corresponds to the lowest-numbered register
  • Base register specifies first main memory address

  • Four types

    • increment after
    • increment before
    • decrement after
    • decrement before
  • Incrementing or decrementing starts before or after first memory access

/img/Computer Organization and Architecture/chapter11-12.png
Multiple addressing diagram
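  • The three memory words starting at the address in r10 are loaded into registers r0, r1, and r4; r0 gets the lowest address and r4 the highest

  • With increment after, the three consecutive words starting at 0x20C are loaded into r0, r1, and r4. With increment before, the first address is the base register address plus one word, and the three consecutive words from there are loaded into r0, r1, and r4

  • With decrement after, the three words run downward in address starting from the base register address. With decrement before, the base address is first reduced by one word, and the three words run downward from there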
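
The four load/store multiple variants can be compared by computing which address each register uses. This sketch takes the base value 0x20C and the register list r0, r1, r4 from the example above; the function `ldm_addresses`, the mode abbreviations IA/IB/DA/DB, and the 4-byte word size are assumptions for illustration.

```python
def ldm_addresses(base, reg_numbers, mode, word=4):
    """Return {register number: address} for one block transfer."""
    regs = sorted(reg_numbers)  # lowest register gets the lowest address
    n = len(regs)
    if mode == "IA":            # increment after
        start = base
    elif mode == "IB":          # increment before
        start = base + word
    elif mode == "DA":          # decrement after
        start = base - (n - 1) * word
    elif mode == "DB":          # decrement before
        start = base - n * word
    else:
        raise ValueError(f"unknown mode: {mode}")
    return {r: start + i * word for i, r in enumerate(regs)}


for mode in ("IA", "IB", "DA", "DB"):
    addrs = ldm_addresses(0x20C, [0, 1, 4], mode)
    print(mode, {f"r{r}": hex(a) for r, a in addrs.items()})
# IA {'r0': '0x20c', 'r1': '0x210', 'r4': '0x214'}
# IB {'r0': '0x210', 'r1': '0x214', 'r4': '0x218'}
# DA {'r0': '0x204', 'r1': '0x208', 'r4': '0x20c'}
# DB {'r0': '0x200', 'r1': '0x204', 'r4': '0x208'}
```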

Instruction Formats

  • The instruction set is the interface that the processor presents to software

    • An important indicator of CPU performance

    • How well the instruction set is designed has a great impact on CPU performance

  • Therefore, instruction format design is a core part of processor design


Instruction formats

  • An instruction includes

    • Opcode

    • Operand(s) (implicit or explicit) and addressing mode

  • Instruction format: how many bits each part of the instruction occupies, and in what order

  • Layout of bits in an instruction

  • An instruction set usually has more than one instruction format

Key of instruction formats

  • The width of the opcode: determines the number of operations

    • The more opcodes, the more functions the instruction set offers, and the more opcode bits are needed
  • The width of operands: affects the instruction length

    • Operands take up a large proportion of the instruction length

    • Number of operands, addressing mode and size of addressing space have a great impact on the length of instructions

  • Addressing modes: determine the complexity and the length of the instruction

    • The more complex the addressing mode is, the more operations are required to obtain the physical address of the operand, and the longer the access takes

    • A complex addressing mode can reach a larger address space with a shorter address field, which saves instruction length


  • First step in instruction set design is to determine the length of instructions

  • Trade off between powerful instruction repertoire and saving space


Summary

  • Ideally the opcode and operand fields would have as many bits as possible

  • The longer the instruction, the more memory space it takes

  • Generally, the instruction length equals the bus width or is an integer multiple of it

  • In the design of an instruction set

    • Every part of the instruction needs to be properly planned

    • Seek the best balance among the various design choices

Allocation of bits

  • After the instruction length is fixed, its bits must be allocated carefully so that each one is put to good use

    • If the opcode is long, the operand fields are short
    • With variable-length opcodes, additional bits determine the operation
  • First, decide the number of opcodes and the number of operands
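
The trade-off between opcode bits and operand bits can be made concrete with a small calculation. The 16-bit instruction, the two operands with a 2-bit mode field each, and the helper `allocate_bits` are a hypothetical format invented for this example.

```python
def allocate_bits(instr_bits, opcode_bits, n_operands, mode_bits):
    """Split what remains after opcode and mode bits evenly among operands."""
    left = instr_bits - opcode_bits - n_operands * mode_bits
    if left < 0:
        raise ValueError("fields do not fit in the instruction")
    addr_bits = left // n_operands
    return {
        "operations": 2 ** opcode_bits,   # distinct opcodes available
        "address_bits": addr_bits,        # bits per operand address field
        "direct_range": 2 ** addr_bits,   # locations reachable directly
    }


print(allocate_bits(16, opcode_bits=4, n_operands=2, mode_bits=2))
# {'operations': 16, 'address_bits': 4, 'direct_range': 16}
print(allocate_bits(16, opcode_bits=6, n_operands=2, mode_bits=2))
# {'operations': 64, 'address_bits': 3, 'direct_range': 8}
```

Growing the opcode from 4 to 6 bits quadruples the number of operations but halves the range each operand field can address directly.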

The following factors need to be considered

  • Number of operands

  • Number of addressing modes

  • Register versus memory

  • Number of register sets

  • Address range

  • Address granularity


Number of addressing modes

  • Some opcodes implicitly specify the addressing mode of the operand, which does not need to be specified separately

  • Sometimes it is necessary to explicitly specify the addressing mode of this operand, and one or more addressing mode bits are required

  • There may be multiple addressing modes in an instruction


Number of operands

  • If instructions support only one operand, programs are awkward to write

  • Generally, two operands are supported

  • Ideally each operand has an independent addressing mode

    • Flexible

    • Needs addressing mode indicator bits

  • Some processors allow only one of the operands to carry an addressing mode indicator


Register versus memory

  • Data must be loaded into CPU registers before it can be processed
  • With only one register, it need not be specified in the instruction, but programming is very awkward
  • Several registers are generally provided
    • A few bits are enough to specify a register, which takes up fewer instruction bits
  • Most contemporary processors have 32 or more registers

Number of register sets

  • Most processors provide only one set of general-purpose registers

    • Store data

    • Store addresses for displacement addressing

  • Some processors, such as x86, provide multiple sets of registers

    • Divided by function: some hold data, some hold displacements

    • Opcode implicitly determines which set of registers to use

    • Reduces the number of bits needed in the instruction


Address range

  • In direct addressing, the address range is determined by the length of the address field in the instruction

    • Instruction length is limited

    • The address range of direct addressing is small

  • Displacement addressing is generally used instead

    • Length of the address register is critical

    • If the displacement is large, the address field in the instruction must also be long


Address granularity

  • The finer the address granularity, the more address bits are required for a given memory size

  • Addressing by byte

    • Some operations are more convenient

    • e.g. character processing

    • More address bits required

  • Addressing by word

    • Number of address bits reduced

    • Reduced operational flexibility

x86 and ARM instruction formats

x86 instruction format

/img/Computer Organization and Architecture/chapter11-13.png

Characteristic

  • Addressing mode is specified as part of the opcode sequence, not with each operand

  • Only one operand can carry addressing-mode information

  • Therefore only one memory operand can be referenced in an instruction

  • Typical CISC architecture, with a complex instruction format

    • x86 must maintain backward compatibility

    • Aims to provide richer instructions for compiler writers

ARM instruction formats

  • Typical RISC architecture

  • All instructions are 32 bits long, and the formats are very regular

  • ARM instructions are divided into four categories

    • data processing instructions

    • load/store instructions

    • load/store multiple instructions

    • branch instructions

  • All instructions are conditionally executed

Condition code
  • All instructions are conditionally executed

  • The instruction contains a 4-bit condition code in its highest 4 bits

  • Except for condition codes 1110 and 1111, an instruction executes only if its condition is met

  • Conditions are tested against four condition flags stored in the program status register

  • The four condition flags are N (negative), Z (zero), C (carry), and V (overflow)

  • Every arithmetic or logic instruction has an S bit that indicates whether the instruction updates the condition flags
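
How a 4-bit condition code is tested against the N, Z, C, and V flags can be sketched as a lookup. The function `condition_passed` is hypothetical; the mnemonics in the comments are the conventional ARM names, and code 1111 is simply treated as unconditional here, following the text.

```python
def condition_passed(cond, n, z, c, v):
    """Return True if the 4-bit condition `cond` holds for the given flags."""
    checks = {
        0b0000: z,                    # EQ: equal
        0b0001: not z,                # NE: not equal
        0b0010: c,                    # CS: carry set
        0b0011: not c,                # CC: carry clear
        0b0100: n,                    # MI: negative
        0b0101: not n,                # PL: positive or zero
        0b0110: v,                    # VS: overflow
        0b0111: not v,                # VC: no overflow
        0b1000: c and not z,          # HI: unsigned higher
        0b1001: (not c) or z,         # LS: unsigned lower or same
        0b1010: n == v,               # GE: signed greater or equal
        0b1011: n != v,               # LT: signed less than
        0b1100: (not z) and n == v,   # GT: signed greater than
        0b1101: z or n != v,          # LE: signed less or equal
        0b1110: True,                 # AL: always
        0b1111: True,                 # treated as unconditional (assumption)
    }
    return bool(checks[cond])


# After comparing two equal values, Z is set: EQ passes, NE does not.
print(condition_passed(0b0000, n=False, z=True, c=True, v=False))  # True
print(condition_passed(0b0001, n=False, z=True, c=True, v=False))  # False
```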

Data processing

/img/Computer Organization and Architecture/chapter11-14.png
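  • The instruction type field of data processing instructions is 000 or 001. The opcode is 4 bits, and S indicates whether the condition flags are updated. Every format has three operands

  • In the first format, the operands are the destination register Rd, the first operand register Rn, and the second operand register Rm; the second operand can be shifted as the shift field specifies, with shift amount giving the number of bits

  • The second format is similar, except that the shift amount comes from register Rs rather than from an immediate

  • In the third format, the second operand is an immediate value that can be rotated right; the rotate field determines the rotation amount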
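
The fields of a data processing instruction can be pulled apart with shifts and masks. The sketch assumes the classic 32-bit ARM layout (condition in bits 31-28, immediate flag in bit 25, opcode in bits 24-21, S in bit 20, Rn in bits 19-16, Rd in bits 15-12, second operand in bits 11-0); the function name and the sample word, which is believed to encode `ADD r0, r1, r2`, are illustrative.

```python
def decode_data_processing(word):
    """Split a 32-bit data processing instruction into its fields."""
    return {
        "cond": (word >> 28) & 0xF,      # condition code
        "immediate": (word >> 25) & 0x1, # 1 if operand 2 is an immediate
        "opcode": (word >> 21) & 0xF,    # 4-bit operation
        "s": (word >> 20) & 0x1,         # update condition flags?
        "rn": (word >> 16) & 0xF,        # first operand register
        "rd": (word >> 12) & 0xF,        # destination register
        "operand2": word & 0xFFF,        # shifter operand or immediate
    }


fields = decode_data_processing(0xE0810002)
print(fields)
# {'cond': 14, 'immediate': 0, 'opcode': 4, 's': 0, 'rn': 1, 'rd': 0, 'operand2': 2}
```

Here the condition 14 (1110) means always, opcode 4 (0100) is ADD, and the low 4 bits of `operand2` name Rm = r2 with no shift applied.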

Load/Store

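  • In load/store instructions, the instruction type field is 010 or 011. The next 5 bits identify the addressing mode, the data type (byte or word), and whether the instruction is a load or a store

  • The first load/store format uses an immediate offset: the instruction carries a 12-bit offset, and the memory address is base register Rn plus or minus that offset

  • The second format uses a register offset: the offset is the value of register Rm, shifted as the shift field specifies by shift amount bits, and it is then combined with base register Rn to give the memory address

  • In load/store multiple instructions, the instruction type field is 100. The instruction carries a 16-bit register list and the memory address is in Rn; the addressing mode selects increment before, decrement before, increment after, or decrement after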

Branch

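  • The instruction type field of a branch instruction is 101, and the instruction provides a 24-bit immediate value
  • A flag bit L determines whether the return address is saved in the link register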

ARM immediate constants

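  • In data processing instructions, the immediate occupies 8 bits, and a rotate value is specified along with it. This design makes a wider range of constants available

  • Through rotation, the 8-bit immediate can produce values across the full 32-bit range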
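
The effect of the rotate field can be seen by decoding a few immediates. This sketch assumes the classic ARM rule that the 4-bit rotate value is doubled and the 8-bit immediate is rotated right by that amount within a 32-bit word; the function name `arm_immediate` is made up.

```python
def arm_immediate(rotate, imm8):
    """Rotate the 8-bit immediate right by 2 * rotate bits in 32 bits."""
    shift = 2 * rotate
    return ((imm8 >> shift) | (imm8 << (32 - shift))) & 0xFFFFFFFF


print(hex(arm_immediate(0, 0xFF)))   # 0xff
print(hex(arm_immediate(4, 0xFF)))   # 0xff000000
print(hex(arm_immediate(15, 0xFF)))  # 0x3fc
```

Only values that fit this rotated 8-bit pattern can be encoded directly, so a constant such as 0x12345678 would have to be built another way.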

Thumb instruction set

  • Special usage: 16-bit instructions implement most of the 32-bit instructions

  • An embedded system may have only a 16-bit bus

  • Thumb instruction set: re-encoded subset of the ARM instruction set

  • Increases performance on a data bus of 16 bits or less

  • 16 bits must be removed from each instruction

  • Unconditional (4 bits saved)

  • Always update condition flags

    • Update flag not used (1 bit saved)
  • Subset of instructions

    • 2 bit opcode, 3 bit type field (2 bits saved)

    • Reduced operand specifications (9 bits saved)

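  • A 16-bit instruction of the compressed instruction set can be expanded into a standard 32-bit instruction

    • The compressed instructions are only 16 bits long and can run on lower-end hardware

    • On a standard ARM processor, they can be expanded to 32 bits as shown in the figure and then executed

  • An ARM processor can execute both 16-bit and 32-bit instructions, and the two formats can be mixed

  • One bit in a processor control register determines whether the processor is currently executing 16-bit or 32-bit instructions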