
Computer Organization and Architecture

Instruction Sets: Characteristics and Functions

Outline

  • Machine Instruction Characteristics

  • Types of Operands

  • Intel x86 and ARM Data Types

  • Types of Operations

  • Endian Support

Machine Instruction Characteristics

Language

Programming language

  • Classification of programming language

    • Machine language

    • Assembly language

    • High-level language

  • Compiler

    • Computers can only recognize machine language

    • Translation program that converts high-level/assembly language programs into machine language


Machine language

  • Defined by the computer’s hardware design

  • Consists of streams of numbers (1s and 0s)

  • Instruct the computer to perform the most basic operations

  • A computer can understand only its own machine language

  • It is difficult to remember, and generally will not be used directly


Assembly language

  • Represents machine-language instructions using English-like abbreviations

  • Replaces the address of an instruction or operand with a symbolic address or label

  • Assemblers convert assembly language to machine language

  • Each assembly language corresponds one-to-one with a specific machine-language instruction set, so assembly programs cannot be ported directly between different computers

Instruction set

  • The complete collection of instructions that are understood by a CPU

    • Machine Code

    • Binary

    • Usually represented by assembly codes

/img/Computer Organization and Architecture/chapter10-1.png
Instruction cycle state diagram
Elements of an instruction
  • Operation code (Op code)

    • Do what
  • Source Operand reference

    • From this
  • Result Operand reference

    • Put the answer here
  • Next Instruction Reference

    • When you have done that, do this…

    • Generally, it defaults to the next storage unit

Instruction representation
  • In machine code, each opcode has a unique bit string

  • For programmers, a symbolic representation is used

    • e.g. ADD, SUB, LOAD
  • The operands follow the opcode in the instruction

  • Multiple operands are separated by commas

ADD A,B

Simple instruction format

/img/Computer Organization and Architecture/chapter10-2.png
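  • The instruction is 16 bits long and is divided into three parts

    • The first part is the opcode, 4 bits, which allows at most 16 different operations

    • The second part is the reference to operand 1, 6 bits

    • The third part is the reference to operand 2, 6 bits

  • The source and destination operands of an instruction can be in memory, in CPU registers, or in I/O; an operand may also be a number contained in the instruction itself, called an immediate value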
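
The 16-bit layout described above (4-bit opcode, two 6-bit operand references) can be sketched in a few lines of Python. This is only an illustration of the field packing: the function names `encode` and `decode` and the sample opcode value are hypothetical, not part of any real instruction set.

```python
OPCODE_BITS = 4
OPERAND_BITS = 6


def encode(opcode, operand1, operand2):
    """Pack a 4-bit opcode and two 6-bit operand references into 16 bits."""
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("opcode must fit in 4 bits")
    for ref in (operand1, operand2):
        if not 0 <= ref < (1 << OPERAND_BITS):
            raise ValueError("operand reference must fit in 6 bits")
    # Layout, from the most significant bit: opcode | operand 1 | operand 2
    return (opcode << 12) | (operand1 << 6) | operand2


def decode(word):
    """Split a 16-bit instruction word back into its three fields."""
    opcode = (word >> 12) & 0xF
    operand1 = (word >> 6) & 0x3F
    operand2 = word & 0x3F
    return opcode, operand1, operand2


# Hypothetical example: opcode 1 with operand references 5 and 9.
word = encode(0b0001, 5, 9)
print(f"{word:016b}")  # 0001000101001001
print(decode(word))    # (1, 5, 9)
```

With 4 opcode bits there are 2^4 = 16 possible operations, and each 6-bit reference can name one of 64 locations.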

Operands

  • Main memory

    • Memory address must be supplied

    • If a virtual address is supplied, address translation is required

    • It may be in the cache

  • I/O device

    • The instruction must specify the I/O module and device for the operation
  • CPU register

    • If only one register exists, reference to it may be implicit

    • If more than one register exists, then each register is assigned a unique name or number

    • Instruction must contain the number of the desired register

  • Immediate

    • The value of the operand is contained in a field in the instruction being executed
Instruction types
  • The instructions can be categorized into four types

    • Data processing

    • Data storage

    • Data movement

    • Program flow control


  • Data processing instruction

    • Processing data

    • Including arithmetic and logic instructions


  • Data storage

    • Storing data

    • Mainly refers to the transfer of data between memory and CPU registers


  • Data movement
    • Mainly refers to data transfer between the CPU and I/O
    • I/O instructions

  • Program flow control

    • Instructions that control the flow of execution in the CPU

    • Test instructions

    • Branch instructions

Instruction types

  • Arithmetic instructions

  • Logic (Boolean) instructions

  • Memory instructions: moving data between memory and the registers

  • I/O instructions

  • Test instructions: used to test the value of a data word or the status of a computation

  • Branch instructions: used to branch to a different set of instructions

Number of addresses

  • The address in the instruction is used to address the operand

  • The number of addresses in instructions varies with different instruction types

  • Different computers support different numbers of operands

  • The number of operands in an instruction may be 3, 2, 1, or none

3 addresses
  • Operand 1, Operand 2, Result

  • a = b + c

  • May have a fourth address: the next instruction (usually implicit)

  • Not common

  • Needs very long words to hold everything

2 addresses
  • Two operands

  • One address doubles as operand and result

  • a = a + b

  • Reduces length of instruction

  • Requires some extra work

    • Temporary storage to hold some results
1 address
  • Implicit second address

  • Usually a register (accumulator)

  • Common on early machines

    • LOAD A: $AC \leftarrow A$

    • SUB B: $AC \leftarrow AC - B$

    • STORE Y: $Y \leftarrow AC$
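
The LOAD / SUB / STORE sequence above can be traced with a toy accumulator machine. The sketch below is an assumption-laden illustration: `run_accumulator` is a hypothetical helper, memory is modelled as a dictionary keyed by symbolic names, and only the three operations listed above are supported.

```python
def run_accumulator(program, memory):
    """Execute one-address instructions; the accumulator AC is the implicit operand."""
    ac = 0
    for op, addr in program:
        if op == "LOAD":      # AC <- memory[addr]
            ac = memory[addr]
        elif op == "SUB":     # AC <- AC - memory[addr]
            ac -= memory[addr]
        elif op == "STORE":   # memory[addr] <- AC
            memory[addr] = ac
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory


memory = {"A": 10, "B": 3, "Y": 0}
run_accumulator([("LOAD", "A"), ("SUB", "B"), ("STORE", "Y")], memory)
print(memory["Y"])  # 7
```

Each instruction names only one address; the second operand and the destination of the arithmetic are always the accumulator.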

0 addresses
  • 0 (zero) addresses: All addresses implicit

    • Usually use stack to imply the operands

    • e.g., to compute c = a + b:

      • push a
      • push b
      • add (pop a and b, add them, and push the sum onto the stack)
      • pop c

  • More addresses

    • More complex instructions

    • Fewer instructions per program

    • Inter-register operations are quicker

    • More registers

  • Fewer addresses

    • Less complex instructions

    • More instructions per program

    • Faster fetch/execution of instructions
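
The zero-address sequence for c = a + b (push a, push b, add, pop c) can be simulated with a small stack machine. This is a minimal sketch: `run_stack_machine` and the tuple-based instruction format are hypothetical, chosen only to show that `add` carries no operand addresses.

```python
def run_stack_machine(program, memory):
    """Run push/pop/add instructions; 'add' takes both operands from the stack."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        elif op == "add":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)  # the result goes back on top of the stack
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory


memory = {"a": 2, "b": 5, "c": 0}
program = [("push", "a"), ("push", "b"), ("add",), ("pop", "c")]
run_stack_machine(program, memory)
print(memory["c"])  # 7
```

The same computation takes four instructions here, compared with one instruction in a three-address format, which illustrates the trade-off between instruction complexity and instruction count.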

Instruction set design

Design issues of instruction set
  • Operation repertoire: how many and which operations to provide, and how complex the operations should be

  • Data types: the types of operands on which operations are performed

  • Instruction format: instruction length, number of addresses…

  • Registers: how many registers can be used by the instructions

  • Addressing: how to access a memory location, how many modes can be used

Types of Operands

  • Addresses

  • Numbers

    • Integer/floating point
  • Characters

    • ASCII etc
  • Logical Data

    • Bits or flags

Address

  • The data operated on by an instruction may be in memory

    • An address is used to locate the operand

    • Treated as an unsigned integer

  • In many cases, the address given in the instruction must be processed to obtain the actual address of the data

Numbers

  • Three types of numerical data are common in computers

    • Binary integer or binary fixed point

    • Binary floating point

    • Decimal

      • Packed decimal

      • Each decimal digit is represented by a 4-bit binary code

      • Only the first 10 of the 16 possible codes are used, that is, 0000 through 1001
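
As an illustration of packed decimal, the sketch below encodes each decimal digit of a non-negative integer in its own 4-bit field and decodes it again. The helper names are hypothetical, and sign handling (which real packed-decimal formats include) is deliberately left out.

```python
def to_packed_decimal(n):
    """Encode a non-negative integer with one 4-bit code per decimal digit."""
    if n < 0:
        raise ValueError("this sketch handles non-negative values only")
    packed = 0
    for position, digit in enumerate(reversed(str(n))):
        packed |= int(digit) << (4 * position)
    return packed


def from_packed_decimal(packed):
    """Decode a packed-decimal value; codes 1010 through 1111 are rejected."""
    value = 0
    place = 1
    while True:
        nibble = packed & 0xF
        if nibble > 9:
            raise ValueError("invalid packed-decimal digit")
        value += nibble * place
        place *= 10
        packed >>= 4
        if packed == 0:
            break
    return value


packed = to_packed_decimal(246)
print(f"{packed:012b}")             # 001001000110
print(from_packed_decimal(packed))  # 246
```

The decimal number 246 becomes the three codes 0010, 0100, 0110, which is different from its pure binary representation 11110110.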

Characters

  • A common form of data is text or character strings

  • The most commonly used character code

    • International Reference Alphabet (IRA)

    • Referred to in the United States as the American Standard Code for Information Interchange (ASCII)

  • Extended Binary Coded Decimal Interchange Code (EBCDIC)

    • Used on IBM mainframes

Logical data

  • Boolean or binary data items

    • Each item can take on only the values 1 (true) and 0 (false)
  • There are occasions when we wish to manipulate the bits of a data item

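All data is stored as binary strings, so the type of a piece of data depends mainly on the instruction that operates on it: the instruction determines the type of the data it works with.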

Intel x86 and ARM Data Types

X86 data types

  • 8 bit Byte

  • 16 bit word

  • 32 bit double word

  • 64 bit quad word

  • 128 bit double quadword

  • Addressing is by 8 bit unit

  • Words do not need to align at even-numbered address

  • Data accessed across 32 bit bus in units of double word read at addresses divisible by 4

  • Little endian


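  • Unsigned integers come in four formats: 8, 16, 32, and 64 bits
  • Signed integers use two's-complement representation and also come in four formats: 8, 16, 32, and 64 bits
  • Floating-point numbers include single precision (32 bits), double precision (64 bits), and extended double precision (80 bits)
  • The floating-point representation conforms to the IEEE 754 standard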

SIMD data types

  • In the x86 architecture, MMX (MultiMedia eXtension) instructions were added to improve the efficiency of multimedia data processing

  • MMX technology adds 57 instructions designed specifically for video, audio, and graphics manipulation

  • As a result, an MMX-capable CPU greatly improves multimedia processing (such as stereo audio, video, and 3D animation)

  • An MMX instruction can process multiple data items at the same time, which is called single instruction, multiple data (SIMD)

  • The basic idea of SIMD is to pack multiple operands into one addressable data item, so that a single access fetches several packed values

    • One instruction can then obtain multiple operands and process them at the same time
  • Five packed data types

    • Packed byte and packed byte integer

    • Packed word and packed word integer

    • Packed doubleword and packed doubleword integer

    • Packed quadword and packed quadword integer

    • Packed single-precision floating-point and packed double-precision floating-point

  • Packed byte and packed byte integer

    • Bytes packed into 64-bit quadword

    • or 128-bit double quadword

  • Packed word and packed word integer

    • 16-bit words packed into 64-bit quadword

    • or 128-bit double quadword

  • Packed doubleword and packed doubleword integer

    • 32-bit doublewords packed into 64-bit quadword

    • or 128-bit double quadword

  • Packed quadword and packed quadword integer

    • Two 64-bit quadwords packed into 128-bit double quadword
  • Packed single-precision floating-point and packed double-precision floating-point

    • Four 32-bit floating-point or two 64-bit floating-point values packed into a 128-bit double quadword
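
To make the packed-byte idea concrete, the sketch below packs eight 8-bit values into one 64-bit quadword and adds two such quadwords lane by lane. It only emulates in software what a packed-byte add does in a single instruction. The function names and the lane order (element 0 in the low byte) are assumptions for illustration, and the add wraps around modulo 256 in each lane.

```python
def pack_bytes(values):
    """Pack eight 8-bit values into one 64-bit quadword (values[0] in the low byte)."""
    if len(values) != 8:
        raise ValueError("exactly eight byte values are required")
    word = 0
    for lane, value in enumerate(values):
        if not 0 <= value <= 0xFF:
            raise ValueError("each value must fit in 8 bits")
        word |= value << (8 * lane)
    return word


def unpack_bytes(word):
    """Split a 64-bit quadword into its eight byte lanes."""
    return [(word >> (8 * lane)) & 0xFF for lane in range(8)]


def packed_add_bytes(x, y):
    """Lane-wise add of two packed-byte quadwords; each lane wraps modulo 256."""
    lanes = [(a + b) & 0xFF for a, b in zip(unpack_bytes(x), unpack_bytes(y))]
    return pack_bytes(lanes)


x = pack_bytes([1, 2, 3, 4, 5, 6, 7, 250])
y = pack_bytes([10] * 8)
print(unpack_bytes(packed_add_bytes(x, y)))  # [11, 12, 13, 14, 15, 16, 17, 4]
```

Note that the carry out of the last lane (250 + 10) is discarded rather than propagated into a neighbouring lane, which is what distinguishes a packed add from an ordinary 64-bit add.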

ARM data types

  • 8 (byte), 16 (halfword), 32 (word) bits

  • Halfword and word accesses should be word aligned

  • Nonaligned access alternatives

    • Default

      • Treated as truncated

      • Load single word instructions rotate right the word-aligned data transferred by a non-word-aligned address by one, two, or three bytes

    • Alignment checking

      • Data abort signal indicates alignment fault for attempting unaligned access

    • Unaligned access

      • Processor uses one or more memory accesses to generate transfer of adjacent bytes transparently to the programmer


  • Unsigned integer interpretation supported for all types

  • Twos-complement signed integer interpretation supported for all types

  • Majority of implementations do not provide floating-point hardware

    • Saves power and area

    • Floating-point arithmetic implemented in software

    • Optional floating-point coprocessor

    • Single- and double-precision IEEE 754 floating point data types

Types of Operations

  • Arithmetic

  • Logical

  • Data Transfer

  • Conversion

  • I/O

  • System Control

  • Transfer of Control

Data transfer

  • Location of source and destination must be specified

    • Memory
    • Register
    • Top of the stack
  • For the memory access, addressing mode must be specified

    • Memory has multiple addressing modes, such as direct addressing and indirect addressing
  • The length of the operands must be specified

  • Data transmission instructions need to specify

    • Source address

    • Destination address

    • Amount of data

  • Which data transfer instructions are included is one of the important issues to be considered in instruction set design

  • For example, the designer must decide whether the location of an operand is indicated by the opcode or by the operand specification itself


  • IBM 390

    • Use different instructions for different movements

    • Operation code determines the direction of data movement

  • VAX

    • Data transfers between different kinds of locations use the same instruction

    • The position of each operand must be specified separately in the instruction

Common Data Transfer Instructions

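  • Common data transfer instructions include move, store, load, exchange, clear, set, push, and pop

  • Data transfer is the most basic and simplest operation for the processor: it moves data from one location to another

  • If a data transfer involves memory, it is somewhat more complex:

    • The memory address must be calculated according to the addressing mode

    • If the address is virtual, it must be translated into a real (physical) address

    • Once the real address is known, the cache is checked for the data. On a hit, the access is served by the cache; on a miss, a main memory read or write is required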

Arithmetic

  • Single operand instruction

    • absolute, negate, increment, decrement
  • Two operands instruction

    • Add, subtract, multiply, divide
  • The operands are

    • Signed integer (fixed point) numbers

    • Floating-point numbers

    • Packed decimal numbers

  • Mainly completed by CPU

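  • An arithmetic operation may involve data transfer operations, which supply the operands to the ALU before the computation or deliver the result afterward

  • The arithmetic itself is performed in the ALU

  • After the computation, condition codes or flags are set to describe the result, for example whether it overflowed or an error occurred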
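
The way an arithmetic instruction sets flags can be illustrated with an 8-bit addition. This is a sketch under assumptions: `add8` and the flag names `carry`, `overflow`, and `zero` are hypothetical and do not model the flag register of any particular processor.

```python
def add8(a, b):
    """Add two 8-bit values; return the 8-bit result and the resulting flags."""
    total = (a & 0xFF) + (b & 0xFF)
    result = total & 0xFF
    flags = {
        # Carry: the unsigned sum did not fit in 8 bits.
        "carry": total > 0xFF,
        # Overflow: both operands have the same sign but the result's sign differs.
        "overflow": ((a ^ result) & (b ^ result) & 0x80) != 0,
        "zero": result == 0,
    }
    return result, flags


print(add8(0x7F, 0x01))  # 127 + 1 overflows as a signed value
print(add8(0xFF, 0x01))  # 255 + 1 carries out as an unsigned value
```

The first call yields 0x80 with the overflow flag set; the second yields 0 with the carry and zero flags set. A later test or branch instruction can then act on these flags.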

Logical

  • Most processors can operate on a single bit of a word or addressable unit

    • Called bit twiddling

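    • It is actually a bitwise Boolean operation

  • Basic logic operation

    • AND, OR, NOT, Exclusive-OR
  • Extended logical operations

    • Test, compare, set control variables, shift, rotate
  • Test: tests a specified condition and sets the flags accordingly

  • Compare: compares two or more operands and sets the flags accordingly

  • Set control variables: a group of instructions that set control bits, used for protection, interrupt handling, and timer control

  • Shift: shifts data left or right

  • Rotate: cyclic (circular) shift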

Shift operations

  • Logical shift: without considering the highest sign bit

    • Logical right: Move the operand to the right by n bits, and fill in 0 at the left position

    • Logical left: Move the operand to the left by n, and fill in 0 for the n bits vacated on the right

  • Arithmetic shift: consider the highest sign bit

    • Arithmetic right: Shift n bits to the right as a whole, and fill the empty n bits on the left with the highest sign bit

    • Arithmetic left: Retain the sign bit of the highest bit, and then shift the other bits by n bits to the left

  • Example: 10100110 shifted by 3 bits

    • Logical right shift: 00010100; logical left shift: 00110000

    • Arithmetic right shift: 11110100; arithmetic left shift: 10110000
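
The four shifts on the 8-bit pattern 10100110 can be reproduced with the sketch below. The function names are hypothetical, the word width is fixed at 8 bits, and the shift count `n` is assumed to satisfy 0 <= n <= 8. The arithmetic left shift follows the definition given above (keep the sign bit, shift the remaining bits).

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1   # 0xFF
SIGN = 1 << (WIDTH - 1)   # 0x80


def logical_right(x, n):
    """Shift right, filling the vacated high bits with 0."""
    return (x & MASK) >> n


def logical_left(x, n):
    """Shift left, filling the vacated low bits with 0 and dropping overflowed bits."""
    return (x << n) & MASK


def arithmetic_right(x, n):
    """Shift right, filling the vacated high bits with copies of the sign bit."""
    x &= MASK
    if x & SIGN:
        fill = (MASK << (WIDTH - n)) & MASK
        return (x >> n) | fill
    return x >> n


def arithmetic_left(x, n):
    """Keep the sign bit in place and shift the remaining 7 bits left."""
    x &= MASK
    return (x & SIGN) | ((x << n) & (MASK >> 1))


x = 0b10100110
for shift in (logical_right, logical_left, arithmetic_right, arithmetic_left):
    print(f"{shift.__name__:17s} {shift(x, 3):08b}")
```

Running it prints 00010100, 00110000, 11110100, and 10110000, in that order.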


Rotate/Cyclic shift

  • Rotate right: each bit moves one position to the right, and the rightmost bit moves to the leftmost position

  • Rotate left: each bit moves one position to the left, and the leftmost bit moves to the rightmost position

  • One possible use of rotation is to rotate left repeatedly so that each bit in turn reaches the most significant position, where testing the sign bit reveals its value

  • Example: 10100110 rotated by 3 bits

    • Rotate right: 11010100

    • Rotate left: 00110101
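
A rotation can be written as two shifts combined with OR. The sketch below applies a 3-bit rotation to 10100110; the function names and the default width of 8 bits are assumptions for illustration.

```python
def rotate_right(x, n, width=8):
    """Rotate x right by n bits within a word of the given width."""
    mask = (1 << width) - 1
    n %= width
    x &= mask
    # Bits shifted out on the right re-enter on the left.
    return ((x >> n) | (x << (width - n))) & mask


def rotate_left(x, n, width=8):
    """Rotate x left by n bits within a word of the given width."""
    mask = (1 << width) - 1
    n %= width
    x &= mask
    # Bits shifted out on the left re-enter on the right.
    return ((x << n) | (x >> (width - n))) & mask


x = 0b10100110
print(f"{rotate_right(x, 3):08b}")  # 11010100
print(f"{rotate_left(x, 3):08b}")   # 00110101
```

Unlike a shift, a rotation loses no bits: rotating by the full word width returns the original value.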


Role of shift

  • Right shift

    • Logical right shift: each position shifted is equivalent to dividing an unsigned integer by 2 (discarding the remainder)

    • Arithmetic right shift: for two's-complement numbers, each position shifted is equivalent to dividing by 2, rounding toward negative infinity

  • Left shift: overflow needs to be considered

    • When there is no overflow, it is equivalent to multiplying by 2

    • When there is overflow, it has different effects on logical shift left and arithmetic shift left

/img/Computer Organization and Architecture/chapter10-3.png

Conversion

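  • The main purpose of conversion instructions is to change the format of data or to operate on the format of data

  • For example, converting binary data to decimal, or packed decimal to binary

  • Another kind is translation: values in a block of memory are translated into other values by looking up the corresponding entries in a table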

Input/output

  • May be specific instructions

  • May be done using data movement instructions (memory mapped)

  • May be done by a separate controller (DMA)

System control

  • Privileged instructions

  • CPU needs to be in specific state

    • Kernel mode
  • For operating system use

  • Some control instructions

    • Read or write a control register

    • Read or write a storage protection key

    • Access to process control blocks in a multiprogramming system

Transfer of control

  • The first scenario is that we need to repeat some instructions

    • Vector or matrix multiplication is easy to implement with loops

    • A transfer-of-control instruction is needed at the end of the loop body to jump back to its start

    • Without such an instruction, loops are almost impossible to implement

  • The second scenario is that when we write a program, we often need to judge which operation to do next according to a calculation result

    • When calculating division, you can first verify whether the divisor is 0. If it is 0, you can directly report an error

    • Transfer instructions are required, and the instructions to be executed in the next step are determined according to the judgment results

  • The third scenario is that when we write programs, we often use procedures or functions

    • Break a large program into several small parts, and then process them separately

    • Procedures or functions can be called multiple times

    • Transfer instructions must be used when calling procedures or functions

Role of control transfer
  • Normal execution of instructions

    • The PC (program counter) stores the address of the next instruction to be executed

    • After the instruction fetch completes, the PC is automatically incremented by 1 to point to the address of the next instruction

  • Control transfer instructions

    • Determine the next instruction to be executed according to the execution result of the current instruction

    • Change the original instruction execution order
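
The interplay between the automatically incremented PC and a control transfer instruction can be sketched with a tiny interpreter. Everything here is hypothetical (the mnemonics `LOADI`, `DEC`, `BRNZ`, `HALT` and the single accumulator); the point is only that the PC advances by one after each fetch unless a taken branch overwrites it.

```python
def run(program):
    """Fetch-execute loop; returns the final accumulator and the sequence of PC values."""
    pc = 0
    acc = 0
    trace = []
    while pc < len(program):
        op, arg = program[pc]   # fetch
        trace.append(pc)
        pc += 1                 # default: the next instruction in sequence
        if op == "LOADI":       # load an immediate value
            acc = arg
        elif op == "DEC":
            acc -= 1
        elif op == "BRNZ":      # conditional branch: taken only if acc is not zero
            if acc != 0:
                pc = arg        # the branch overwrites the PC
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown operation: {op}")
    return acc, trace


program = [("LOADI", 2), ("DEC", None), ("BRNZ", 1), ("HALT", None)]
acc, trace = run(program)
print(acc, trace)  # 0 [0, 1, 2, 1, 2, 3]
```

The trace shows the loop body (addresses 1 and 2) executing twice: the first branch is taken and the second falls through to the next instruction in sequence.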

/img/Computer Organization and Architecture/chapter10-4.png
Types of control transfer
Some control transfer instructions
  • Branch

    • Also called jump

    • Take the address of next instruction to be executed as an operand of current instruction

    • For conditional branch instructions, the branch is made only if a certain condition is met

      • Otherwise, executes next instruction in sequence
    • Usually the condition is taken as a result of an operation (arithmetic or logic)

  • Skip

    • Skip execution of the next instruction

    • It is not necessary to specify the address of the next instruction in the instruction

    • The skip instruction includes an implied address

    • The skip implies that one instruction be skipped

      • The implied address equals the address of next instruction plus one instruction-length
  • Subroutine call

    • Call the procedure code to execute the procedure

    • After execution, return to the point where the call occurred and continue to execute next instruction

    • A procedure is a subroutine, that is, a computer program, which can perform certain functions

    • Write the general function as a procedure, which can be called many times in the program

      • Save the workload of programming

      • Memory space occupied by programs is also reduced

    • Through procedure writing, modular programming is carried out to improve the efficiency of programming

Procedure call instruction

  • It is invoked by a calling instruction and returned by a return instruction

    • Procedure call can be nested

    • Each procedure call is matched by a return in the called program

  • The CPU must save the return address, and parameters must also be passed to the procedure. The return address is commonly stored in one of the following places

    • Register

    • Start of called procedure

    • Top of stack

Nested procedure calls
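  • The main program calls procedure 1

  • Procedure 1 contains two calls to procedure 2

  • Every call must be matched by a return instruction that returns to the point of the call

  • Return addresses are usually saved and retrieved using a stack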

Use of stacks
/img/Computer Organization and Architecture/chapter10-5.png
stack
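  • Initially, the stack is empty

  • When procedure 1 is called, the return address 4101 is pushed onto the stack

  • When procedure 1 calls procedure 2, the return address 4601 is pushed

  • When the first call to procedure 2 finishes, 4601 is popped, restoring the state before the call

  • The second call to procedure 2 pushes its return address in the same way

  • When that call finishes, 4651 is popped. When procedure 1 finishes, 4101 is popped and execution of the main program continues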
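
The sequence of pushes and pops for the nested calls can be replayed with a plain Python list used as a stack. The class `ReturnStack` and its method names are hypothetical; the return addresses 4101, 4601, and 4651 are the ones used in the example above.

```python
class ReturnStack:
    """Minimal model of a return-address stack for nested procedure calls."""

    def __init__(self):
        self._items = []

    def call(self, return_address):
        """A call pushes the address of the instruction following the call."""
        self._items.append(return_address)

    def ret(self):
        """A return pops the most recently saved address and resumes there."""
        return self._items.pop()

    def snapshot(self):
        return list(self._items)


stack = ReturnStack()
stack.call(4101)         # main program calls procedure 1
stack.call(4601)         # procedure 1 calls procedure 2 (first time)
print(stack.snapshot())  # [4101, 4601]
print(stack.ret())       # 4601: back in procedure 1
stack.call(4651)         # procedure 1 calls procedure 2 (second time)
print(stack.ret())       # 4651: back in procedure 1
print(stack.ret())       # 4101: back in the main program
print(stack.snapshot())  # []
```

Because returns happen in the reverse order of calls, a last-in-first-out structure is exactly what is needed to match each return with its call.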

Passing parameters
  • Passing parameters is an important part of a procedure call

  • Using registers

    • Must assure that the registers are used properly
  • Using memory cells

    • It is difficult to exchange a variable number of parameters
  • Using stack

    • more flexible

    • When a procedure is called

      • Stack the return address

      • Stack parameters to be passed to the called procedure

    • When return

      • Return parameters can also be placed on the stack
    • All of the information stacked for a procedure invocation is called a stack frame (OS)


/img/Computer Organization and Architecture/chapter10-6.png
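  • When the main program calls P, the return address is pushed first, then the previous frame pointer is saved, and then the two parameters x1 and x2 are pushed onto the stack

  • When P calls Q, the return address is pushed first, then the old frame pointer is saved, and then y1 and y2, which P passes to Q, are pushed

  • In this way the stack carries both the parameters and the return addresses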

Stack

  • Queues work in two basic ways

    • FIFO: first in first out
    • LIFO: last in first out
  • Stack is a LIFO

  • A stack is an ordered set of elements, only one of which can be accessed at a time

  • The point of access is called the top of the stack

/img/Computer Organization and Architecture/chapter10-7.png
stack operation
/img/Computer Organization and Architecture/chapter10-8.png
Typical stack organization

Endian Support

  • Example of Endian
    • Suppose we want to store a 32-bit hex value 12345678 to address 184
/img/Computer Organization and Architecture/chapter10-9.png

Endian

  • Big endian

    • The most significant byte in the lowest numerical byte address

    • Equivalent to the left-to-right order of writing

  • Little endian

    • The least significant byte in the lowest numerical byte address

    • Reminiscent of the right-to-left order of arithmetic operations in arithmetic units

  • Machines from different manufacturers may adopt different byte orders
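
The two byte orders for the 32-bit value 12345678 (hex) stored at address 184 can be printed with Python's built-in `int.to_bytes`. The variable names and the table layout are just for illustration.

```python
value = 0x12345678
base_address = 184

big = value.to_bytes(4, byteorder="big")        # most significant byte first
little = value.to_bytes(4, byteorder="little")  # least significant byte first

print("address  big  little")
for offset in range(4):
    print(f"{base_address + offset:7d}  {big[offset]:02X}   {little[offset]:02X}")

# Reading the bytes back with the matching byte order recovers the value.
assert int.from_bytes(big, byteorder="big") == value
assert int.from_bytes(little, byteorder="little") == value
```

In the big-endian column address 184 holds 12 and address 187 holds 78; in the little-endian column the order is reversed.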

Standard

  • Pentium (x86), VAX, Alpha are little-endian

  • IBM 370, Motorola 680x0 (Mac), and most RISC are big-endian

  • Internet is big-endian

    • Makes writing Internet programs on PC more awkward!

    • WinSock provides host-to-network and network-to-host conversion functions (htons, htonl, ntohs, ntohl)
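
Python exposes the same kind of conversion through its standard library: `socket.htonl` and `socket.ntohl` convert 32-bit values between host and network byte order, and the `struct` format prefix `!` selects network (big-endian) order. The sketch below shows both; the variable names are arbitrary.

```python
import socket
import struct

value = 0x12345678

# "!I" packs an unsigned 32-bit integer in network (big-endian) byte order,
# regardless of the byte order of the host running the code.
wire = struct.pack("!I", value)
print(wire.hex())  # 12345678

# htonl/ntohl are inverses of each other on any host; on a big-endian host
# they leave the value unchanged, on a little-endian host they swap the bytes.
round_trip = socket.ntohl(socket.htonl(value))
print(hex(round_trip))  # 0x12345678
```

Using the explicit network-order format keeps the bytes on the wire identical whether the program runs on a little-endian PC or a big-endian machine.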

ARM endian support

  • ARM supports both big-endian and little-endian operation

  • E-bit in system control register

  • E-bit = 0 selects little endian; E-bit = 1 selects big endian