
Computer Organization and Architecture

Top Level View of Computer Function and Interconnection

Review

  • What are the factors that directly affect computer performance?

    • The running speed of the processor

    • The organization and architecture of the chip

    • Balance between the individual components


  • The meaning of Amdahl's law
    • How much of a program can execute in parallel is the key factor limiting the speedup of a multi-core processor
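
As a quick numeric illustration of Amdahl's law, the sketch below computes the speedup for a program in which a fraction of the execution time can be spread over several cores. The function name `amdahl_speedup` and its parameter names are assumptions made for this example only.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup = 1 / ((1 - f) + f / N)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Speedup for a few parallel fractions on 2, 8 and 64 cores.
for f in (0.5, 0.9, 0.99):
    print(f, [round(amdahl_speedup(f, n), 2) for n in (2, 8, 64)])
```

Even with many cores, the serial part of the program bounds the speedup at 1 / (1 - f).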

Outline

  • Key Terms

  • Computer Components

  • Computer Function

  • Interconnection Structures

  • Bus Interconnection

  • PCI Bus

Key terms

  • PC: Program counter

  • IR: Instruction register

  • MAR: Memory address register

  • MBR: Memory buffer register

  • I/O AR: I/O address register

  • I/O BR: I/O buffer register

  • AC: Accumulator

Computer Components

Program Concept

  • Logical units have certain functions

    • And

    • Or

    • Not

  • Multiple logical units are connected to complete certain functions

    • Hardwired systems

    • To complete another calculation, logical units need to be reconfigured

    • Inflexible


What is a program

  • Construct a general purpose hardware with arithmetic and logic functions

    • Can do different tasks, given correct control signals

    • For a new task, instead of re-wiring, supply a new set of control signals

  • Use a program to provide control signals

    • A sequence of steps

    • For each step, an arithmetic or logical operation is done

    • For each operation, a different set of control signals is needed

Function of Control Unit

  • For each operation a unique code is provided

    • e.g. ADD, MOVE
  • A hardware segment generates the control signals

    • Accepts code and issues control signals

    • Called controller or instruction interpreter

  • ALU performs corresponding operations according to the control signal

  • Sequence of operations is called software

  • We have a computer!


  • The Control Unit (CU) and the Arithmetic and Logic Unit (ALU) constitute the Central Processing Unit (CPU)

  • Data and instructions need to get into the system and results out

    • Input

    • Output

  • Temporary storage of code and results is needed

    • Main memory

Top Level View

  • The computer generally includes the CPU, main memory, $I/O$ module, and System Bus

  • $\text{CPU} = \text{Control Unit} + \text{Arithmetic and Logic Unit}$

  • The main memory includes multiple data storage units, each with one address

  • The $I/O$ module contains a set of buffers that temporarily hold the data exchanged with $I/O$ devices

/img/Computer Organization and Architecture/chapter3-1.png

Computer Function

  • Instruction Fetch and Execute

    • Execute program
  • Interrupts

    • Handling performance differences between CPU and other components
  • I/O Function

    • Interworking with peripherals

Instruction Cycle

  • The most basic function of a computer is to execute a program

    • Program consists of instructions

    • Instruction in memory, execution in CPU

  • Two steps involved: Fetch $\rightarrow$ Execute

  • The processing required for a single instruction is called instruction cycle

    • Fetch cycle
    • Execute cycle
Fetch Cycle
  • Program Counter (PC) holds address of next instruction to fetch

  • Processor fetches instruction from memory location pointed to by PC

  • Increment PC

    • Unless told otherwise
  • Instruction loaded into Instruction Register (IR)

  • Processor interprets instruction and performs required actions

Execute Cycle
  • Processor interprets instruction and performs required actions

  • Data processing

    • Some arithmetic or logical operation on data
  • Processor-memory

    • Data transfer between CPU and main memory
  • Processor-I/O

    • Data transfer between CPU and I/O module
  • Control

    • Alteration of sequence of operations

    • e.g. jump

  • Combination of above

Example: A Hypothetical Machine
  • Both instructions and data are 16 bits long

  • The instruction format

    • 4-bit opcode

    • 12-bit address

Actions

  • Three actions – three instruction cycles

    • Read data from address 940 to AC

    • Add the contents of 941 to AC

    • Store the result to address 941

  • Instructions to be used (partial list)

    • 0001 = Load AC from memory

    • 0010 = Store AC to memory

    • 0101 = Add to AC from memory

Cycle 1: Read Data From Address 940

/img/Computer Organization and Architecture/chapter3-2.png

Cycle 2: Adding Data From 941 to AC

/img/Computer Organization and Architecture/chapter3-3.png

Cycle 3: Writing Data From AC to 941

/img/Computer Organization and Architecture/chapter3-4.png
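
The three instruction cycles above can be traced with a minimal simulator sketch. It uses the 4-bit opcode / 12-bit address format and the three opcodes from the partial list. The following are assumptions for illustration, not values stated in these notes: addresses are treated as hexadecimal, the program is loaded at address 0x300, and locations 0x940 and 0x941 initially hold 3 and 2.

```python
LOAD, STORE, ADD = 0b0001, 0b0010, 0b0101  # opcodes from the partial list


def run(memory, pc, steps):
    """Run `steps` instruction cycles and return the final (PC, AC)."""
    ac = 0
    for _ in range(steps):
        # Fetch cycle: load the instruction into IR, then increment PC.
        ir = memory[pc]
        pc += 1
        # Execute cycle: split IR into the 4-bit opcode and 12-bit address.
        opcode = ir >> 12
        address = ir & 0x0FFF
        if opcode == LOAD:
            ac = memory[address]
        elif opcode == ADD:
            ac = (ac + memory[address]) & 0xFFFF  # keep AC within 16 bits
        elif opcode == STORE:
            memory[address] = ac
        else:
            raise ValueError(f"unknown opcode {opcode:04b}")
    return pc, ac


# Assumed memory image: three instructions followed by two data words.
memory = {
    0x300: 0x1940,  # load AC from 940
    0x301: 0x5941,  # add contents of 941 to AC
    0x302: 0x2941,  # store AC to 941
    0x940: 0x0003,
    0x941: 0x0002,
}
final_pc, final_ac = run(memory, pc=0x300, steps=3)
print(hex(final_pc), final_ac, memory[0x941])
```

Under these assumed initial values, location 941 ends up holding the sum of the two data words.
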
Conclusion
  • In this example, three instruction cycles, each consisting of

    • a fetch cycle
    • an execute cycle
  • With a more complex set of instructions, fewer cycles would be needed since

    • The execute cycle may perform more than one reference to memory

      • e.g. ADD B, A (PDP-11) adds the contents of memory locations B and A and stores the sum in A

    • An instruction may specify an I/O operation instead of a memory reference

    • An instruction may specify an operation to be performed on a vector of numbers or a string of characters
/img/Computer Organization and Architecture/chapter3-5.png
Instruction Cycle State Diagram

Multiple operands and multiple results allowed

Interrupts

  • Executing instructions requires the cooperation of different components

  • Execution speed of different components varies greatly

    • e.g. the CPU is much faster than a printer
  • Much CPU time is wasted waiting for slower components

  • Interrupt is used to improve CPU utilization


  • Mechanism by which other modules (e.g. I/O) may interrupt normal sequence of processing

  • Program

    • e.g. overflow, division by zero
  • Timer

    • Generated by internal processor timer

    • Used in pre-emptive multi-tasking

  • I/O

    • from I/O controller
  • Hardware failure

    • e.g. memory parity error
Interrupt Cycle
  • Add an interrupt cycle in instruction cycle

    • After the execution of the current instruction is completed, the processor checks for interrupts

    • Indicated by an interrupt signal

  • If no interrupt, fetch next instruction

  • If interrupt pending

    • Suspend execution of current program

    • Save context

    • Set PC to start address of interrupt handler routine

    • Process interrupt

    • Restore context and continue interrupted program
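
The interrupt cycle described above can be sketched as a loop that checks for pending interrupts only after each instruction completes. This is a simplified model: the function `run_with_interrupts`, the `arrivals` mapping (instruction index to interrupt source) and the `handler` callback are hypothetical names introduced for this example, and the saved context is reduced to the PC.

```python
from collections import deque


def run_with_interrupts(program, arrivals, handler):
    """Return the sequence of steps the processor performs."""
    pending = deque()   # interrupt signals waiting to be serviced
    trace = []          # what the processor did, in order
    pc = 0
    while pc < len(program):
        # Fetch and execute the instruction at PC, then increment PC.
        trace.append(program[pc])
        if pc in arrivals:              # a device raised an interrupt meanwhile
            pending.append(arrivals[pc])
        pc += 1

        # Interrupt cycle: runs only after the current instruction completes.
        while pending:
            saved_pc = pc               # save context
            trace.extend(handler(pending.popleft()))  # process interrupt
            pc = saved_pc               # restore context, resume program
    return trace


trace = run_with_interrupts(
    program=["i0", "i1", "i2"],
    arrivals={1: "printer"},
    handler=lambda source: [f"handle {source}"],
)
print(trace)
```

The handler runs between `i1` and `i2`, and the user program itself contains no code for the interruption.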

Transfer of Control via Interrupts
  • Suspending the user program before the interrupt is processed, and resuming it afterwards, are both done by the processor

  • User programs do not need to develop special code to accommodate interruptions

Instruction Cycle with Interrupts
/img/Computer Organization and Architecture/chapter3-6.jpg
  • After the current instruction is executed, check whether interrupts are disabled. If they are disabled, continue with the normal fetching and executing of instructions

  • If interrupts are enabled, check whether an interrupt has occurred

  • If an interrupt is detected, enter the interrupt cycle and run the interrupt handler; otherwise proceed with the next instruction

Program Timing
  • The speed of different I/O devices varies greatly, so the I/O wait behind an interrupt may be short or long

  • Short I/O Wait

    • While the I/O device is busy with the operation, the CPU continues to process the user program

    • When the I/O device is ready, it raises an interrupt. The processor handles it and then continues with the following instructions of the user program

    • The CPU has almost no waiting state, so processing efficiency is relatively high

  • Long I/O Wait

    • The I/O device takes a very long time. When the processor has completed the subsequent instructions and needs to perform an I/O operation again, the device is still not ready

    • The CPU can only wait until the device is ready and the pending I/O operation completes before continuing with the next write operation

    • Because the device is slow and further I/O operations follow, the CPU still spends some time waiting

Instruction Cycle (with Interrupts) - State Diagram

/img/Computer Organization and Architecture/chapter3-7.png

Multiple Interrupts

  • Ways to handle multiple interrupts: sequential, nested
Sequential
  • While processing an interrupt, disable other interrupts

  • Processor will ignore further interrupts while processing one interrupt

  • Interrupts remain pending and are checked after first interrupt has been processed

  • Interrupts handled in sequence as they occur

  • Drawback: high-priority interrupts cannot be handled in time

Nested
  • Allow interrupt to be interrupted

  • Define priorities

    • Low priority interrupts can be interrupted by higher priority interrupts

    • When higher priority interrupt has been processed, processor returns to previous interrupt

    • After the low priority interrupt processing is completed, execute the user program

  • Called nested interrupts

/img/Computer Organization and Architecture/chapter3-8.png
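
The difference between the two approaches can be seen in a small time-stepped sketch. Everything specific here is hypothetical: the function `completion_order`, the three devices, their priorities (a larger number means more urgent), arrival times and service times are made up for illustration.

```python
def completion_order(interrupts, nested):
    """interrupts: list of (arrival_time, priority, service_time, name).
    Returns handler names in the order in which they finish."""
    waiting = sorted(interrupts)   # not yet raised, ordered by arrival time
    pending = []                   # raised but not started: (priority, time, name)
    active = []                    # stack of running handlers: [priority, remaining, name]
    finished = []
    t = 0
    while waiting or pending or active:
        # Interrupts raised at or before time t become pending.
        while waiting and waiting[0][0] <= t:
            _, priority, service_time, name = waiting.pop(0)
            pending.append((priority, service_time, name))
        if pending:
            if nested:
                # Start the most urgent pending interrupt if nothing is
                # running or if it outranks the running handler.
                best = max(pending, key=lambda p: p[0])
                if not active or best[0] > active[-1][0]:
                    pending.remove(best)
                    active.append(list(best))
            elif not active:
                # Sequential: interrupts stay disabled while a handler runs.
                active.append(list(pending.pop(0)))
        # Run the handler on top of the stack for one time unit.
        if active:
            active[-1][1] -= 1
            if active[-1][1] == 0:
                finished.append(active.pop()[2])
        t += 1
    return finished


events = [(10, 2, 10, "printer"), (15, 5, 10, "communications"), (20, 4, 10, "disk")]
sequential_order = completion_order(events, nested=False)
nested_order = completion_order(events, nested=True)
print(sequential_order, nested_order)
```

In the sequential run the handlers finish in arrival order, while in the nested run the low-priority printer handler is suspended and finishes last.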

I/O Function

  • An I/O module can exchange data directly with the processor

    • Example: a disk controller
  • The processor can also read data from or write data to an I/O module

    • The processor identifies a specific device that is controlled by a particular I/O module

    • An instruction sequence similar in form to the memory example could occur, with I/O instructions rather than memory-referencing instructions


  • In some cases, it is desirable to allow I/O exchanges to occur directly with memory

    • The processor grants an I/O module the authority to read from or write to memory

    • The I/O-memory transfer can occur without tying up the processor

    • This relieves the processor of responsibility for the exchange and is known as direct memory access (DMA)

Interconnection Structures

  • A computer includes CPU, memory and I/O

  • All the units must be connected to execute instructions

  • Collection of paths connecting various modules of a computer (CPU, memory, I/O) is called the interconnection structure

Connecting Types

  • The interconnection structure must support the following types of transfers

    • Memory to CPU, CPU to memory

    • I/O to CPU, CPU to I/O

    • I/O to or from memory (Direct Memory Access, DMA)

Memory Connection

  • Receives and sends data

  • Receives addresses (of locations)

  • Receives control signals

    • Read
    • Write
    • Timing

Memory Operation

  • Read

    • Input: address, read signal
    • Output: data
  • Write

    • Input: address, write signal, data
    • Output: null
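
The read and write operations above can be modelled with a very small class. This is only a sketch of the interface seen from the interconnection: the class name `Memory` and the `access` method with its `read`/`write` flags standing in for the control signals are assumptions for this example.

```python
class Memory:
    """A memory module with N addressable locations."""

    def __init__(self, size):
        self.cells = [0] * size

    def access(self, address, read=False, write=False, data=None):
        # Exactly one of the two control signals must be active.
        if read == write:
            raise ValueError("assert exactly one of read or write")
        if write:
            if data is None:
                raise ValueError("a write needs data")
            self.cells[address] = data
            return None              # a write produces no output
        return self.cells[address]   # a read outputs the stored data


ram = Memory(16)
write_result = ram.access(3, write=True, data=42)
read_result = ram.access(3, read=True)
print(write_result, read_result)
```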

Input/Output Connection

  • I/O module is located between the CPU and peripherals

  • I/O connection is similar to memory from computer’s viewpoint

  • Output

    • Receive data from computer (CPU)

    • Send data to peripheral

  • Input

    • Receive data from peripheral

    • Send data to computer (CPU)


  • Receive control signals from computer (CPU)

  • Send control signals to peripherals

    • e.g. write disk
  • Receive addresses from computer (CPU)

    • e.g. port number to identify peripheral
  • Send interrupt signals (for control)

CPU Connection

  • Reads instruction and data

  • Writes out data (after processing)

  • Sends control signals to other units

  • Receives and acts on interrupts

Bus Interconnection

  • Bus: a communication pathway connecting two or more devices

  • Usually broadcast

  • Often grouped

    • A number of channels in one bus
    • e.g. a 32-bit data bus is 32 separate single-bit channels
  • Power lines may not be shown

Main points of Bus

  • Only one device at a time can successfully transmit on the bus

  • A bus consists of multiple communication pathways, or lines

  • A bus that connects major computer components (processor, memory, I/O) is called a system bus

Bus Structure! ! !

  • Data
  • Address
  • Control
Data Bus
  • The Data Bus provides a path for moving data between system modules

  • Carries data

    • Remember that there is no difference between “data” and “instruction” at this level
  • The number of lines is referred to as the width of the data bus

    • Width is a key determinant of performance

    • Capability of data transmission

Address Bus
  • Identify the source or destination of data

    • Each memory unit (or each I/O device) is assigned a unique address
  • Bus width determines maximum memory capacity of system

    • e.g. the 8080 has a 16-bit address bus, giving a $2^{16} = 64\text{K}$ address space
    • e.g. one memory unit is usually one byte, so the capacity is $1\ \text{byte} \times 64\text{K} = 64\ \text{KB}$
  • If memory is organized in multiple modules

    • Higher-order bits are used to select a particular module on the bus

    • Lower-order bits select a memory location or I/O port within the module
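
The address arithmetic above can be checked with a short sketch. The helper names `address_space` and `split_address`, and the choice of 2 module-select bits in the usage lines, are assumptions for illustration.

```python
def address_space(address_lines):
    """Number of distinct addresses that can be placed on the address bus."""
    return 2 ** address_lines


def split_address(address, address_lines, module_bits):
    """Split an address into (module number, location within the module).

    The higher-order `module_bits` select the module; the remaining
    lower-order bits select the location or port inside it.
    """
    offset_bits = address_lines - module_bits
    module = address >> offset_bits
    offset = address & ((1 << offset_bits) - 1)
    return module, offset


locations = address_space(16)                                    # 16-bit address bus
module, offset = split_address(0xC123, address_lines=16, module_bits=2)
print(locations, module, hex(offset))
```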

Control Bus
  • The control bus controls the use of data and address lines

  • Control signals transmit both command and timing information among system modules

  • Typical control lines include

    • Memory read/write

    • I/O read/write

    • Transfer ACK. Indicates that data have been accepted from or placed on bus

    • Interrupt request and acknowledge

    • Bus request and acknowledge

    • Clock used to synchronize operations

    • Reset

Operation of bus

  • Bus shared by all modules

  • If one module wishes to send data to another, it must

    • Obtain use of bus

    • Transfer data via bus

  • If one module wishes to request data from another, it must

    • Obtain the use of bus

    • Transfer a request to other module over control and address lines

    • Wait for second module to send data

Problems of single bus

  • Single bus structure: permits a number of units to be connected together through a common set of wires

  • Lots of devices on one bus leads to

    • Propagation delays

      • Long data paths mean that co-ordination of bus use can adversely affect performance

    • Bottleneck

      • The bus may become a bottleneck as the aggregate data transfer demand approaches the capacity of the bus

Multiple-bus hierarchies

  • Most systems use multiple buses to overcome these problems

  • Multiple Bus Structure: several buses laid out in hierarchy

    • Higher-speed buses are closer to the CPU

    • Lower-speed buses are farther from the CPU

Traditional Bus Structure (ISA)

/img/Computer Organization and Architecture/chapter3-9.png

High Performance Bus

  • A bridge is set between processor bus and the high-speed bus

  • The cache controller is integrated into a bridge

  • High speed devices are connected to the high speed bus

  • Lower speed devices are connected to an expansion bus

diagram

/img/Computer Organization and Architecture/chapter3-10.png

Elements of bus design

  • Type

    • Dedicated, multiplexed
  • Method of arbitration

    • Centralized, distributed
  • Timing

    • Synchronous, asynchronous
  • Bus width

    • Address, data
  • Data transfer type

    • Read, write, read-modify-write, read-after-write, block

Bus types

  • Dedicated: Separate data & address lines

  • Multiplexed: Shared lines, with an address valid or data valid control line

    • Advantage: fewer lines
    • Disadvantages: more complex control, reduced ultimate performance
  • Physical dedication: the use of multiple buses, each of which connects only a subset of modules

Bus arbitration! ! !

  • Arbitration may be centralised or distributed

  • Master - slave mechanism

    • Master controls the bus and places information onto it

    • Slave (target) receives information from the master


Two methods of arbitration

  • Centralised: Single hardware device controlling bus access - bus controller (or arbiter)

    1. The chaining method
    • Three controlling bus lines

      • BS: bus busy
      • BR: bus request
      • BG: bus grant
    • Any device sends a BR signal

    • BG passes through each module from the highest priority to the lowest priority

    • The device closest to the arbiter has the highest priority

/img/Computer Organization and Architecture/chapter3-11.png
    2. The counter polling method

    • Three controlling bus lines: BS, BR, BG

    • The arbiter uses a counter to send polling signals (instead of using the BG signal)

    • Any device sends a BR signal

    • After receiving the BR signal, when BS=0 (bus not busy), the arbiter sends a device address

    • When the address matches the address of a requesting device, that device gains the use of the bus

    • The counter stops polling

    3. Independent request method

    • Each device has its own BR, BG pair

    • A device that wants to use the bus sends BR

    • The central arbiter decides which device can use the bus and sends BG to that device
  • The Operation in General
    • Bus request
    • Bus arbitration
    • Device addressing
    • Data transfer
    • Bus release
  • Distributed: there is no central controller
    • Each module may claim the bus

    • Control logic on all modules
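
The three centralised arbitration methods can be compared with a short sketch in which each function returns the index of the device that wins the bus. The function names, the `start` parameter of the polling counter and the explicit `priority` list are assumptions for this example. Real arbiters are hardware, so this models only the decision and not the signalling.

```python
def daisy_chain_grant(requests):
    """requests[i] is True if device i asserts BR; device 0 is closest to
    the arbiter. BG travels down the chain and is taken by the first
    requesting device it reaches."""
    for position, requesting in enumerate(requests):
        if requesting:
            return position
    return None


def counter_polling_grant(requests, start=0):
    """The arbiter counts through device addresses, beginning at `start`,
    and stops at the first address that belongs to a requesting device."""
    n = len(requests)
    for step in range(n):
        address = (start + step) % n
        if requests[address]:
            return address
    return None


def independent_request_grant(requests, priority):
    """Every device has its own BR/BG pair; the arbiter grants the bus to
    the requesting device with the largest priority value."""
    candidates = [d for d, requesting in enumerate(requests) if requesting]
    if not candidates:
        return None
    return max(candidates, key=lambda d: priority[d])


requests = [False, True, False, True]   # devices 1 and 3 request the bus
chain_winner = daisy_chain_grant(requests)
polling_winner = counter_polling_grant(requests, start=2)
independent_winner = independent_request_grant(requests, priority=[0, 1, 2, 3])
print(chain_winner, polling_winner, independent_winner)
```

With daisy chaining the priority is fixed by position, with counter polling it depends on where the counter starts, and with independent requests the arbiter can apply any priority policy.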

Timing

  • Timing refers to the way in which events are coordinated on the bus

  • Synchronous timing

    • Occurrence of events on bus is determined by a clock
  • Asynchronous timing

    • Occurrence of one event on a bus follows and depends on occurrence of a previous event

Timing – synchronous

  • Events determined by clock signals

    • Control Bus includes clock line

    • A single 1-0 transmission is a clock cycle (bus cycle)

  • All devices can read clock line

    • Usually sync on leading edge

    • Usually a single cycle for an event

Bus width ! ! !

  • The width of data bus impacts system performance

    • How much information can be transferred at once

    • The wider the data bus, the greater the number of bits transferred at one time

  • The width of address bus impacts system capacity

    • The wider the address bus, the greater the range of locations that can be referenced

    • How much information can be stored

  • The width of control bus impacts system complexity

  • All buses support both reading and writing data

Bus data transfer

  • Some bus systems also support a block data transfer

    • One address cycle followed by n data cycles

    • First data item is transferred to or from specified address

    • Remaining data items are transferred to or from subsequent addresses
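
The saving from a block transfer can be estimated with a small sketch. It assumes that every address cycle and every data cycle takes one bus cycle, and that a non-block transfer needs one address cycle per data item. Both are simplifications, and the function names are made up for this example.

```python
def single_transfer_cycles(n_items):
    """Each item needs its own address cycle followed by a data cycle."""
    return 2 * n_items


def block_transfer_cycles(n_items):
    """One address cycle followed by n data cycles."""
    return 1 + n_items


def block_addresses(start_address, n_items):
    """The first item uses the specified address, the rest follow it."""
    return [start_address + i for i in range(n_items)]


print(single_transfer_cycles(8), block_transfer_cycles(8))
print([hex(a) for a in block_addresses(0x100, 4)])
```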

Block data transfer

Bus speed ! ! !

  • The bus speed reflects how many bits of information can be sent across each wire each second

  • Bandwidth, also called throughput or bus transfer rate, refers to the total amount of data that can be transferred on the bus in a second

    • Measured in bits per second or bytes per second
  • $\text{Bandwidth} = \text{bus width} \times \text{bus speed}$
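
The bandwidth formula can be applied directly, as in the sketch below. The figures used (a 32-line bus carrying 33 million bits per wire per second) are illustrative assumptions, as are the function and parameter names.

```python
def bandwidth_bits_per_second(bus_width_lines, bits_per_wire_per_second):
    """Bandwidth = bus width x bus speed."""
    return bus_width_lines * bits_per_wire_per_second


bits_per_second = bandwidth_bits_per_second(32, 33_000_000)
bytes_per_second = bits_per_second // 8
print(bits_per_second, bytes_per_second)
```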

Summary ! ! !

  • Bus system: a communication pathway connecting two or more devices

  • The operation of bus

    • Obtain the use of bus
    • Data transfer via bus
  • The bus hierarchy

    • Single bus
    • Multiple bus structures
  • Elements of bus design: type, method of arbitration, timing, bus width, data transfer type

PCI Bus

  • PCI: Peripheral Component Interconnect

    • Bus structure that Intel began developing in 1990 for Pentium-based systems

    • Intel released to public domain

    • High bandwidth, processor independent bus

  • Characteristic

    • delivers better system performance for high-speed I/O subsystems

    • PCI requires very few chips to implement

/img/Computer Organization and Architecture/chapter3-12.png

PCI Bus Lines (required)

  • Systems lines

    • Including clock and reset
  • Address & Data

    • 32 time-multiplexed lines for address/data
    • Lines used to interpret and validate the address/data signals
  • Interface Control

  • Arbitration

    • Not shared
    • Direct connection to PCI bus arbiter
  • Error lines

PCI Bus Lines (Optional)

  • Interrupt lines

    • Not shared
  • Cache support

  • 64-bit Bus Extension

    • Additional 32 lines
    • Time multiplexed
    • 2 lines to enable devices to agree to use 64-bit transfer
  • JTAG/Boundary Scan

    • For testing procedure

PCI Operations

  • Transaction between initiator (master) and target

  • Master claims bus

  • Determine type of transaction

    • e.g. I/O read/write
  • Address phase

  • One or more data phases


  • Bus commands are specified by the C/BE# lines during the address phase

Data transfers

  • Every data transfer on the PCI bus is a single transaction consisting of one address phase and one or more data phases

  • The fundamentals of all PCI data transfers are controlled with three signals

    • FRAME#: is driven by the master to indicate the beginning and end of a transaction

    • IRDY#: is driven by the master to indicate that it is ready to transfer data

    • TRDY#: is driven by the target to indicate that it is ready to transfer data
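
The role of the two ready signals can be sketched per clock edge: a data item moves only on an edge where the master and the target are both ready. The function `transfer_edges` and the sample waveforms are hypothetical. `True` stands for "asserted" (the signals are active low on the real bus), and FRAME# is left out because it frames the transaction rather than deciding when each data item moves.

```python
def transfer_edges(irdy, trdy):
    """irdy[k] / trdy[k]: whether IRDY# / TRDY# is asserted at clock edge k.
    Returns the clock edges on which a data item is transferred."""
    return [
        edge
        for edge, (initiator_ready, target_ready) in enumerate(zip(irdy, trdy))
        if initiator_ready and target_ready
    ]


# Edge 0 is the address phase; the target inserts wait states at edges 1 and 3.
irdy = [False, True, True, True, True]
trdy = [False, False, True, False, True]
edges = transfer_edges(irdy, trdy)
print(edges)
```

Either side can stretch a data phase by deasserting its ready signal, which is how a slow target inserts wait states.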

Bus arbitration

  • PCI uses a centralized arbitration scheme: each master has its own REQ# and GNT# signals

    • A simple request-grant handshake is used
  • At a given instant in time, one or more PCI bus masters may require the use of the PCI bus

  • An arbiter is used to efficiently manage access to a PCI bus that is shared by several masters