Computer Organization and Architecture Top Level View of Computer Function and Interconnection
Review
-
What are the factors that directly affect computer performance?
-
The running speed of the processor
-
The organization and architecture of chip
-
Balance between the individual components
-
- The meaning of Amdahl's law
- The fraction of a program that can execute in parallel is the key factor limiting the speedup of a multi-core processor
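Amdahl's law can be made concrete with a short sketch. The function name `amdahl_speedup` and the sample values are illustrative assumptions; the formula is the standard statement of the law, where `parallel_fraction` is the share of execution time that can run in parallel and `cores` is the number of processors.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup predicted by Amdahl's law: 1 / ((1 - f) + f / N)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Illustrative values: 90% of the program is parallelizable.
for cores in (2, 8, 1024):
    print(cores, round(amdahl_speedup(0.9, cores), 2))
# The speedup can never exceed 1 / (1 - f) = 10, however many cores are added.
```

The serial part of the program sets the ceiling, which is why the parallel execution ability of the program matters more than the core count.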
Outline
-
Key Terms
-
Computer Components
-
Computer Function
-
Interconnection Structures
-
Bus Interconnection
-
PCI Bus
Key terms
-
PC: Program counter
-
IR: Instruction register
-
MAR: Memory address register
-
MBR: Memory buffer register
-
I/O AR: I/O address register
-
I/O BR: I/O buffer register
-
AC: Accumulator
Computer Components
Program Concept
-
Logical units have certain functions
-
And
-
Or
-
Not
-
-
Multiple logical units are connected to complete certain functions
-
Hardwired systems
-
To complete another calculation, logical units need to be reconfigured
-
Inflexible
-
What is a program
-
Construct a general purpose hardware with arithmetic and logic functions
-
Can do different tasks, given correct control signals
-
For a new task, instead of re-wiring, supply a new set of control signals
-
-
Use Program to provide control signals
-
A sequence of steps
-
For each step, an arithmetic or logical operation is done
-
For each operation, a different set of control signals is needed
-
Function of Control Unit
-
For each operation a unique code is provided
- e.g. ADD, MOVE
-
A hardware segment to generate control signal
-
Accepts code and issues control signals
-
Called controller or instruction interpreter
-
-
ALU performs corresponding operations according to the control signal
-
Sequence of operations is called software
-
We have a computer!
-
The Control Unit (CU) and the Arithmetic and Logic Unit (ALU) constitute the Central Processing Unit (CPU)
-
Data and instructions need to get into the system and results out
-
Input
-
Output
-
-
Temporary storage of code and results is needed
- Main memory
Top Level View
-
The computer generally includes the CPU, main memory, $I/O$ module, and System Bus
-
$\text{CPU} = \text{Control Unit} + \text{Arithmetic and Logic Unit}$
-
The main memory includes multiple data storage units, each with one address
-
The $I/O$ module contains a set of buffers that hold data temporarily during $I/O$ transfers

Computer Function
-
Instruction Fetch and Execute
- Execute program
-
Interrupts
- Handling performance differences between CPU and other components
-
I/O Function
- Interworking with peripherals
Instruction Cycle
-
The most basic function of a computer is to execute programs
-
Program consists of instructions
-
Instruction in memory, execution in CPU
-
-
Two steps involved: Fetch $\rightarrow$ Execute
-
The processing required for a single instruction is called instruction cycle
- Fetch cycle
- Execute cycle
Fetch Cycle
-
Program Counter (PC) holds address of next instruction to fetch
-
Processor fetches instruction from memory location pointed to by PC
-
Increment PC
- Unless told otherwise
-
Instruction loaded into Instruction Register (IR)
-
Processor interprets instruction and performs required actions
Execute Cycle
-
Processor interprets instruction and performs required actions
-
Data processing
- Some arithmetic or logical operation on data
-
Processor-memory
- data transfer between CPU and main memory
-
Processor-I/O
- Data transfer between CPU and I/O module
-
Control
-
Alteration of sequence of operations
-
e.g. jump
-
-
Combination of above
Example: A Hypothetical Machine
-
Both instructions and data are 16 bits long
-
The instruction format
-
4 bit opcode
-
12 bits address
-
Actions
-
Three actions – three instruction cycles
-
Read data from address 940 to AC
-
Add the contents of 941 to AC
-
Store the result to address 941
-
-
Instructions to be used (partial list)
-
0001 = Load AC from memory
-
0010 = Store AC to memory
-
0101 = Add to AC from memory
-
Cycle 1: Read Data From Address 940

Cycle 2: Adding Data From 941 to AC

Cycle 3: Writing Data From AC to 941
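The three instruction cycles above can be traced with a minimal simulator. This is a sketch under stated assumptions: addresses are treated as hexadecimal, the program is assumed to start at address 300, and the data values 3 and 2 stored at 940 and 941 are illustrative (the notes do not give them). The name `run` is hypothetical.

```python
# Opcodes from the partial instruction list (4-bit opcode, 12-bit address).
LOAD, STORE, ADD = 0b0001, 0b0010, 0b0101


def run(memory, pc, cycles):
    """Run `cycles` instruction cycles; return the final PC and AC."""
    ac = 0
    for _ in range(cycles):
        # Fetch cycle: read the instruction at PC into IR, then increment PC.
        ir = memory[pc]
        pc += 1
        # Execute cycle: split IR into opcode and address, then act on it.
        opcode, address = ir >> 12, ir & 0x0FFF
        if opcode == LOAD:
            ac = memory[address]
        elif opcode == ADD:
            ac = (ac + memory[address]) & 0xFFFF  # keep the result to 16 bits
        elif opcode == STORE:
            memory[address] = ac
        else:
            raise ValueError(f"unknown opcode {opcode:04b}")
    return pc, ac


# Assumed memory image: program at 300-302, data at 940-941.
memory = {
    0x300: 0x1940,  # load AC from 940
    0x301: 0x5941,  # add contents of 941 to AC
    0x302: 0x2941,  # store AC to 941
    0x940: 0x0003,
    0x941: 0x0002,
}
pc, ac = run(memory, 0x300, 3)
print(hex(pc), ac, memory[0x941])  # 0x303 5 5
```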

Conclusion
-
In this example, three instruction cycles, each consisting of
- a fetch cycle
- an execute cycle
-
With a more complex set of instructions, fewer cycles would be needed since
-
Execution cycle may perform more than one reference to memory
-
An instruction may specify I/O operation instead of memory reference
-
An instruction may specify an operation to be performed on a vector of numbers or a string of characters
- e.g. ADD B, A
-

Instruction Cycle State Diagram
Multiple operands and multiple results allowed
Interrupts
-
Executing instructions need cooperation of different components
-
Execution speed of different components varies greatly
- e.g. the CPU is much faster than a printer
-
Lots of waste of CPU time
-
Interrupt is used to improve CPU utilization
-
Mechanism by which other modules (e.g. I/O) may interrupt normal sequence of processing
-
Program
- e.g. overflow, division by zero
-
Timer
-
Generated by internal processor timer
-
Used in pre-emptive multi-tasking
-
-
I/O
- from I/O controller
-
Hardware failure
- e.g. memory parity error
Interrupt Cycle
-
Add an interrupt cycle in instruction cycle
-
After the execution of the current instruction is completed, the processor checks for interrupts
-
Indicated by an interrupt signal
-
-
If no interrupt, fetch next instruction
-
If interrupt pending
-
Suspend execution of current program
-
Save context
-
Set PC to start address of interrupt handler routine
-
Process interrupt
-
Restore context and continue interrupted program
-
Transfer of Control via Interrupts
-
Suspending the user program before interrupt processing and resuming it afterwards are both done by the processor
-
User programs do not need to develop special code to accommodate interruptions
Instruction Cycle with Interrupts

-
After the current instruction is executed, check whether interrupts are disabled. If they are disabled, continue fetching and executing instructions
-
If interrupts are enabled, check whether an interrupt is pending
-
If an interrupt is pending, enter the interrupt cycle and run the interrupt handler; otherwise proceed with the next instruction
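The instruction cycle with an interrupt check can be sketched as a loop. This is a simplified model, not a real processor: the name `run_with_interrupts` is hypothetical, the saved context is reduced to the PC, and an interrupt that arrives while interrupts are disabled is simply dropped rather than held pending.

```python
def run_with_interrupts(program, interrupt_after, interrupts_enabled=True):
    """Trace the fetch/execute/interrupt cycle.

    program            -- list of instruction names
    interrupt_after    -- indices of instructions during which an
                          interrupt signal is raised (assumed input)
    """
    trace = []
    pc = 0
    while pc < len(program):
        instruction = program[pc]  # fetch
        pc += 1                    # PC now points at the next instruction
        trace.append(f"execute {instruction}")
        # The check happens only after the current instruction completes.
        if interrupts_enabled and (pc - 1) in interrupt_after:
            saved_pc = pc          # save context (only the PC in this sketch)
            trace.append("run interrupt handler")
            pc = saved_pc          # restore context, resume the user program
    return trace


print(run_with_interrupts(["A", "B", "C"], {1}))
# ['execute A', 'execute B', 'run interrupt handler', 'execute C']
```

The user program itself contains no interrupt-related code; the check and the context save/restore sit entirely in the loop.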
Program Timing
-
I/O devices differ greatly in speed, so an I/O wait may be short or long
-
Short I/O Wait
-
The I/O device does not take long to become ready, and meanwhile the CPU continues to process the user program
-
When the I/O device is ready, it raises an interrupt. The processor services the interrupt and then resumes the user program at the following instruction
-
The CPU spends almost no time waiting, so processing efficiency is relatively high
-
-
Long I/O Wait
-
The I/O device takes a long time to become ready. When the processor has finished the intervening instructions and needs to perform the next I/O operation, the device is still not ready
-
The CPU must wait until the device is ready and the I/O operation completes before it can continue with the next write operation
-
Because the device is slow and further I/O operations follow, the CPU spends part of its time waiting
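The difference between the two cases comes down to one comparison, sketched below. The function name `cpu_wait_time` and the time units are illustrative assumptions: `io_time` is how long the device needs, and `work_until_next_io` is how much user-program work the CPU has before it issues the next I/O operation.

```python
def cpu_wait_time(io_time, work_until_next_io):
    """Time the CPU sits idle before it can start the next I/O operation."""
    return max(0, io_time - work_until_next_io)


print(cpu_wait_time(io_time=3, work_until_next_io=5))   # short I/O wait: 0
print(cpu_wait_time(io_time=12, work_until_next_io=5))  # long I/O wait: 7
```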
Instruction Cycle (with Interrupts) - State Diagram

Multiple Interrupts
- Ways to handle multiple interrupts: sequential, nested
Sequential
-
While processing an interrupt, disable other interrupts
-
Processor will ignore further interrupts while processing one interrupt
-
Interrupts remain pending and are checked after first interrupt has been processed
-
Interrupts handled in sequence as they occur
-
Interrupts with high priority cannot be handled in time
Nested
-
Allow interrupt to be interrupted
-
Define priorities
-
Low priority interrupts can be interrupted by higher priority interrupts
-
When higher priority interrupt has been processed, processor returns to previous interrupt
-
After the low priority interrupt processing is completed, execute the user program
-
-
Called nested interrupts
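The sequential and nested policies can be compared with a small tick-based simulation. This is only a sketch: the function name `simulate`, the device names, priorities, arrival times, and the 10-tick handler duration are all illustrative assumptions, not values from these notes.

```python
def simulate(requests, nested):
    """Return [(name, completion_time), ...] for interrupt handlers.

    requests -- tuples of (arrival_time, name, priority, duration);
                a larger priority number means more urgent.
    nested   -- True: a higher-priority interrupt may preempt a handler.
                False: interrupts are disabled while a handler runs.
    """
    waiting = sorted(requests)  # not yet arrived, ordered by arrival time
    pending = []                # arrived, handler not started
    stack = []                  # active handlers; top of stack is running
    finished = []
    t = 0
    while waiting or pending or stack:
        while waiting and waiting[0][0] <= t:
            _, name, priority, duration = waiting.pop(0)
            pending.append([name, priority, duration])
        if pending:
            if nested:
                pending.sort(key=lambda r: -r[1])  # most urgent first
                if not stack or pending[0][1] > stack[-1][1]:
                    stack.append(pending.pop(0))   # preempt current handler
            elif not stack:
                stack.append(pending.pop(0))       # first come, first served
        if stack:
            stack[-1][2] -= 1                      # run the handler one tick
            if stack[-1][2] == 0:
                finished.append((stack.pop()[0], t + 1))
        t += 1
    return finished


demo_requests = [(10, "printer", 2, 10), (15, "comm", 5, 10), (20, "disk", 4, 10)]
print(simulate(demo_requests, nested=False))
# [('printer', 20), ('comm', 30), ('disk', 40)]
print(simulate(demo_requests, nested=True))
# [('comm', 25), ('disk', 35), ('printer', 40)]
```

With nesting, the highest-priority handler finishes five ticks earlier, at the cost of delaying the low-priority one.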

I/O Function
-
An I/O module can exchange data directly with the Processor
- Example: a disk controller
-
The processor can also read data from or write data to an I/O module
-
The processor identifies a specific device that is controlled by a particular I/O module
-
An instruction sequence similar in form to a memory access can occur, using I/O instructions rather than memory-referencing instructions
-
-
In some cases, allow I/O exchanges to occur directly with memory
-
the processor grants to an I/O module the authority to read from or write to memory
-
I/O-memory transfer can occur without tying up the processor
-
This relieves the processor of responsibility for the exchange and is called direct memory access (DMA)
-
Interconnection Structures
-
A computer includes CPU, memory and I/O
-
All the units must be connected to execute instructions
-
Collection of paths connecting various modules of a computer (CPU, memory, I/O) is called the interconnection structure
Connecting Types
-
Types of transfer that the interconnection structure must support
-
Memory to CPU, CPU to memory
-
I/O to CPU, CPU to I/O
-
I/O to or from memory (direct memory access, DMA)
-
Memory Connection
-
Receives and sends data
-
Receives addresses (of locations)
-
Receives control signals
- Read
- Write
- Timing
Memory Operation
-
Read
- Input: address, read signal
- Output: data
-
Write
- Input: address, write signal, data
- Output: null
Input/Output Connection
-
I/O module is located between the CPU and peripherals
-
I/O connection is similar to memory from computer’s viewpoint
-
Output
-
Receive data from computer (CPU)
-
Send data to peripheral
-
-
Input
-
Receive data from peripheral
-
Send data to computer (CPU)
-
-
Receive control signals from computer (CPU)
-
Send control signals to peripherals
- e.g. write disk
-
Receive addresses from computer (CPU)
- e.g. port number to identify peripheral
-
Send interrupt signals (for control)
CPU Connection
-
Reads instruction and data
-
Writes out data (after processing)
-
Sends control signals to other units
-
Receives and acts on interrupts
Bus Interconnection
-
Bus: a communication pathway connecting two or more devices
-
Usually broadcast
-
Often grouped
- A number of channels in one bus
- e.g. 32 bit data bus is 32 separate single bit channels
-
Power lines may not be shown
Main points of Bus
-
Only one device at a time can successfully transmit on the bus
-
A bus consists of multiple communication pathways, or lines
-
A bus that connects major computer components (processor, memory, I/O) is called a system bus
Bus Structure! ! !
- Data
- Address
- control lines
Data Bus
-
The data bus provides a path for moving data between system modules
-
Carries data
- Remember that there is no difference between “data” and “instruction” at this level
-
The number of lines is referred to as the width of the data bus
-
Width is a key determinant of performance
-
Capability of data transmission
-
Address Bus
-
Identify the source or destination of data
- Each memory unit (or each I/O device) is assigned to a unique address
-
Bus width determines maximum memory capacity of system
- e.g. the 8080 has a 16-bit address bus, giving a $64\text{K}$ address space ($2^{16} = 65536$ locations)
- e.g. one memory unit is usually one byte, so the capacity is $1\ \text{byte} \times 64\text{K} = 64\ \text{KB}$
-
If memory is organized in multiple modules
-
Higher-order bits are used to select a particular module on the bus
-
Lower-order bits select a memory location or I/O port within the module
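Both points about the address bus can be shown in a short sketch. The function names `address_space` and `decode` are hypothetical, and the split of a 16-bit address into 2 module-select bits and 14 offset bits is an illustrative assumption.

```python
def address_space(width_bits):
    """Number of distinct locations an address bus of this width can name."""
    return 1 << width_bits


def decode(address, width_bits, module_bits):
    """Split an address into (module number, offset within the module)."""
    offset_bits = width_bits - module_bits
    module = address >> offset_bits               # higher-order bits
    offset = address & ((1 << offset_bits) - 1)   # lower-order bits
    return module, offset


print(address_space(16))                  # 65536 locations, i.e. 64K
module, offset = decode(0xC123, width_bits=16, module_bits=2)
print(module, hex(offset))                # 3 0x123
```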
-
Control Bus
-
The control bus controls the use of data and address lines
-
Control signals transmit both command and timing information among system modules
-
Typical control lines include
-
Memory read/write
-
I/O read/write
-
Transfer ACK: indicates that data have been accepted from or placed on the bus
-
Interrupt request and acknowledge
-
Bus request and acknowledge
-
Clock used to synchronize operations
-
Reset
-
Operation of bus
-
Bus shared by all modules
-
If one module wishes to send data to another, it must
-
Obtain use of bus
-
Transfer data via bus
-
-
If one module wishes to request data from another, it must
-
Obtain the use of bus
-
Transfer a request to other module over control and address lines
-
Wait for second module to send data
-
Problems of single bus
-
Single bus structure: permits a number of units to be connected together through a common set of wires
-
Lots of devices on one bus leads to
-
Propagation delays
-
Long data paths mean that co-ordination of bus use can adversely affect performance
-
Bottlenecks: the bus may become a bottleneck as aggregate data transfer demand approaches the capacity of the bus
-
Multiple-bus hierarchies
-
Most systems use multiple buses to overcome these problems
-
Multiple Bus Structure: several buses laid out in hierarchy
-
The bus with higher speed is closer to the CPU
-
The bus with lower speed is farther from the CPU
-
Traditional Bus Structure (ISA)

High Performance Bus
-
A bridge is set between processor bus and the high-speed bus
-
The cache controller is integrated into a bridge
-
High speed devices are connected to the high speed bus
-
Lower speed devices are connected to an expansion bus

Elements of bus design
-
Type
- Dedicated, multiplexed
-
Method of arbitration
- Centralized, distributed
-
Timing
- Synchronous, asynchronous
-
Bus width
- Address, data
-
Data transfer type
- Read, write, read-modify-write, read-after-write, block
Bus types
-
Dedicated: Separate data & address lines
-
Multiplexed: shared lines, with an address-valid or data-valid control line
- More complex control
- Potential reduction in performance
-
Physical dedication: the use of multiple buses, each of which connects only a subset of modules
Bus arbitration! ! !
- Arbitration may be centralised or distributed
-
Master - slave mechanism
-
Master controls the bus and places information onto it
-
Slave (target) receives information from the master
-
Two methods of arbitration
-
Centralised: Single hardware device controlling bus access - bus controller (or arbiter)
- The chaining method
-
Three controlling bus lines
- BS: bus busy
- BR: bus request
- BG: bus grant
-
Any device sends a BR signal
-
BG passes through each module from the highest priority to the lowest priority
-
The device closer to the arbiter has higher priority
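The chaining method can be modelled in a few lines. This is a sketch, with the hypothetical name `daisy_chain_grant`: the list position stands for physical distance from the arbiter, and a `True` entry means that device is asserting BR.

```python
def daisy_chain_grant(requests):
    """Return the index of the device that receives the bus, or None.

    requests[i] is True if device i asserts BR; device 0 sits closest
    to the arbiter, so BG reaches it first.
    """
    for position, wants_bus in enumerate(requests):
        if wants_bus:
            return position  # this device keeps BG and does not pass it on
    return None              # BG travelled the whole chain unused


print(daisy_chain_grant([False, True, True]))  # 1
```

Priority is fixed by wiring: a device far from the arbiter can be starved if nearer devices keep requesting.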

-
The counter polling method
-
Three controlling bus lines: BS, BR, BG
-
Arbiter uses a counter to send polling signals(instead of using BG signal)
-
Any device sends a BR signal
-
After receiving BR signal, when BS=0 (bus not busy), the arbiter sends device address
-
When the address matches the one that the device has, that device has gained the use of bus
-
The counter stops polling
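Counter polling can be sketched the same way. The name `counter_poll` and the `start` parameter are assumptions for illustration: starting every poll at 0 gives fixed priority, while starting from where the previous poll stopped gives rotating priority.

```python
def counter_poll(requests, start=0):
    """Return the address of the first requesting device, or None.

    requests[i] is True if device i asserts BR. The arbiter's counter
    steps through device addresses beginning at `start`.
    """
    device_count = len(requests)
    for step in range(device_count):
        address = (start + step) % device_count  # value placed on the poll lines
        if requests[address]:
            return address                       # address matches: stop polling
    return None


print(counter_poll([True, False, True]))           # 0
print(counter_poll([True, False, True], start=1))  # 2
```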
-
-
Independent method
-
Each device has a BR, BG pair
-
A device that wants to use the bus sends BR
-
The arbiter decides which device can use the bus and sends BG to that device
- Each device sends a request signal independently
- Central arbiter decides which device can use the bus
- Sends a BG signal to this device
-
- The Operation in General
- Bus request
- Bus arbitration
- Device addressing
- Data transfer
- Bus release
- Distributed: there is no central controller
-
Each module may claim the bus
-
Control logic on all modules
-
Timing
-
Timing – Refers to way in which events are coordinated on bus
-
Synchronous timing
- Occurrence of events on bus is determined by a clock
-
Asynchronous timing
- Occurrence of one event on a bus follows and depends on occurrence of a previous event
Timing – synchronous
-
Events determined by clock signals
-
Control Bus includes clock line
-
A single 1-0 transmission on the clock line is a clock cycle (bus cycle)
-
-
All devices can read clock line
-
Usually sync on leading edge
-
Usually a single cycle for an event
-
Bus width ! ! !
-
The width of data bus impacts system performance
-
How much information can be transferred at once
-
The wider the data bus, the greater the number of bits transferred at one time
-
-
The width of address bus impacts system capacity
-
The wider the address bus, the greater the range of locations that can be referenced
-
How much information can be stored
-
-
The width of control bus impacts system complexity
Bus data transfer
-
All buses support both read and write transfers
-
Some bus systems also support a block data transfer
-
One address cycle followed by n data cycles
-
First data item is transferred to or from specified address
-
Remaining data items are transferred to or from subsequent addresses
-
Block data transfer
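The saving from a block transfer can be counted directly. This sketch assumes that an individual read costs one address cycle plus one data cycle; the names `block_read` and `single_reads` and the sample memory contents are illustrative.

```python
def block_read(memory, start_address, n):
    """One address cycle, then n data cycles from consecutive addresses."""
    data = [memory[start_address + i] for i in range(n)]
    bus_cycles = 1 + n
    return data, bus_cycles


def single_reads(memory, start_address, n):
    """n separate reads, each with its own address cycle and data cycle."""
    data, bus_cycles = [], 0
    for i in range(n):
        bus_cycles += 2
        data.append(memory[start_address + i])
    return data, bus_cycles


memory = list(range(100, 116))      # illustrative contents
print(block_read(memory, 4, 4))     # ([104, 105, 106, 107], 5)
print(single_reads(memory, 4, 4))   # ([104, 105, 106, 107], 8)
```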
Bus speed ! ! !
-
The bus speed reflects how many bits of information can be sent across each wire each second
-
Bandwidth, also called throughput, bus transfer rate, refers to the total amount of data that can be transferred on the bus in a second
- Measured in bits per second or bytes per second
-
$\text{Bandwidth} = \text{bus width} \times \text{bus speed}$
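A quick calculation shows how the formula is applied. The 32-bit width and 33 MHz clock are illustrative values, and the sketch assumes one transfer per clock cycle; the function name `bandwidth_bytes_per_second` is hypothetical.

```python
def bandwidth_bytes_per_second(width_bits, transfers_per_second):
    """Bandwidth = bus width x bus speed, converted from bits to bytes."""
    return width_bits * transfers_per_second / 8


rate = bandwidth_bytes_per_second(32, 33_000_000)
print(rate / 1_000_000, "MB/s")  # 132.0 MB/s
```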
Summary ! ! !
-
Bus system: a communication pathway connecting two or more devices
-
The operation of bus
- Obtain the use of bus
- Data transfer via bus
-
The bus hierarchy
- Single bus
- Multiple bus structures
-
Elements of bus design: type, method of arbitration, timing, bus width, data transfer type
PCI Bus
-
PCI: Peripheral Component Interconnect
-
Bus structure developed for Pentium in 1990
-
Intel released to public domain
-
High bandwidth, processor independent bus
-
-
Characteristic
-
delivers better system performance for high-speed I/O subsystems
-
PCI requires very few chips to implement and supports
-


PCI Bus Lines (Required)
-
Systems lines
- Including clock and reset
-
Address & Data
- 32 time-multiplexed lines for address/data
- Lines to interpret and validate the address/data signals
-
Interface Control
-
Arbitration
- Not shared
- Direct connection to PCI bus arbiter
-
Error lines
PCI Bus Lines (Optional)
-
Interrupt lines
- Not shared
-
Cache support
-
64-bit Bus Extension
- Additional 32 lines
- Time multiplexed
- 2 lines to enable devices to agree to use 64-bit transfer
-
JTAG/Boundary Scan
- For testing procedure
PCI Operations
-
Transaction between initiator (master) and target
-
Master claims bus
-
Determine type of transaction
- e.g. I/O read/write
-
Address phase
-
One or more data phases
- Bus commands are specified by the C/BE# lines during the address phase
Data transfers
-
Every data transfer on the PCI bus is a single transaction consisting of one address phase and one or more data phases
-
All PCI data transfers are fundamentally controlled by three signals
-
FRAME#: driven by the master to indicate the beginning and end of a transaction
-
IRDY#: is driven by the master to indicate that it is ready to transfer data
-
TRDY#: is driven by the target to indicate that it is ready to transfer data
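The role of the two ready signals can be sketched as follows. On PCI a data phase completes on a clock edge where both IRDY# and TRDY# are asserted; if either side is not ready, that clock is a wait state. The signals are active low on the real bus, so this model uses `True` to mean "asserted". The function name and the sample waveforms are illustrative assumptions.

```python
def completed_data_phases(irdy_asserted, trdy_asserted):
    """Return the clock indices on which a data item is transferred.

    Each argument holds one boolean per clock edge: True means the
    signal is asserted (driven low on the real bus) on that edge.
    """
    return [
        clock
        for clock, (initiator_ready, target_ready)
        in enumerate(zip(irdy_asserted, trdy_asserted))
        if initiator_ready and target_ready
    ]


irdy = [True, True, False, True]   # master inserts a wait state on clock 2
trdy = [False, True, True, True]   # target inserts a wait state on clock 0
print(completed_data_phases(irdy, trdy))  # [1, 3]
```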
-
Bus arbitration
-
PCI uses a centralized arbitration scheme: each master has its own unique REQ# and GNT# signals
- A simple request-grant handshake is used
-
At a given instant in time, one or more PCI bus masters may require the use of the PCI bus
-
An arbiter is used to efficiently manage access to a PCI bus that is shared by several masters