


## Vahid: Basic Datapath Operations

- Load operation: Loads data from data memory into the register file (RF)
- ALU operation: Transforms data by passing one or two RF register values through the ALU, performing an operation (ADD, SUB, AND, OR, etc.), and writing the result back into the RF
- Store operation: Stores an RF register value back into data memory
- Each operation can be done in one clock cycle
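
The three operations above can be sketched in a few lines of Python. This is only an illustrative model, not the slide's hardware: the memory size (256 words), the RF size (16 registers), the 16-bit word width, and the function names `load`, `alu`, and `store` are all assumptions made for the example.

```python
# Minimal sketch of the three basic datapath operations.
# Assumed sizes: 256-word data memory, 16-register RF, 16-bit words.

D = [0] * 256      # data memory
RF = [0] * 16      # register file
MASK = 0xFFFF      # assumed 16-bit word width

def load(a, d):
    """Load operation: RF[a] = D[d]."""
    RF[a] = D[d]

def alu(a, b, c, op):
    """ALU operation: RF[a] = RF[b] <op> RF[c], written back into the RF."""
    ops = {
        "ADD": lambda x, y: x + y,
        "SUB": lambda x, y: x - y,
        "AND": lambda x, y: x & y,
        "OR":  lambda x, y: x | y,
    }
    RF[a] = ops[op](RF[b], RF[c]) & MASK

def store(d, a):
    """Store operation: D[d] = RF[a]."""
    D[d] = RF[a]

# Example: D[2] = D[0] + D[1], using one load, ALU, or store per step.
D[0], D[1] = 5, 7
load(0, 0)            # RF[0] = D[0]
load(1, 1)            # RF[1] = D[1]
alu(2, 0, 1, "ADD")   # RF[2] = RF[0] + RF[1]
store(2, 2)           # D[2] = RF[2]
print(D[2])
```

Each function call stands for one datapath operation, so the example takes four operations in total.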



# Vahid: Basic Architecture – Control Unit



### Review: Three-Instruction Programmable Processor





### Exercise: Understanding the Processor Design (2)

Q1: D[8] = D[8] + RF[1] + RF[4]



# 8.4 A Six-Instruction Programmable Processor

Let's add three more instructions:

- Load-constant instruction: 0011 r<sub>3</sub>r<sub>2</sub>r<sub>1</sub>r<sub>0</sub> c<sub>7</sub>c<sub>6</sub>c<sub>5</sub>c<sub>4</sub>c<sub>3</sub>c<sub>2</sub>c<sub>1</sub>c<sub>0</sub>
  - **MOV Ra, #c**: specifies the operation RF[a] = c
- Subtract instruction: 0100 ra<sub>3</sub>ra<sub>2</sub>ra<sub>1</sub>ra<sub>0</sub> rb<sub>3</sub>rb<sub>2</sub>rb<sub>1</sub>rb<sub>0</sub> rc<sub>3</sub>rc<sub>2</sub>rc<sub>1</sub>rc<sub>0</sub>
  - **SUB Ra, Rb, Rc**: specifies the operation RF[a] = RF[b] - RF[c]
- Jump-if-zero instruction: 0101 ra<sub>3</sub>ra<sub>2</sub>ra<sub>1</sub>ra<sub>0</sub> o<sub>7</sub>o<sub>6</sub>o<sub>5</sub>o<sub>4</sub>o<sub>3</sub>o<sub>2</sub>o<sub>1</sub>o<sub>0</sub>
  - **JMPZ Ra, offset**: specifies the operation PC = PC + offset if RF[a] is 0

TABLE 8.1 Six-instruction instruction set.

| Instruction     | Meaning                       |
|-----------------|-------------------------------|
| MOV Ra, d       | RF[a] = D[d]                  |
| MOV d, Ra       | D[d] = RF[a]                  |
| ADD Ra, Rb, Rc  | RF[a] = RF[b] + RF[c]         |
| MOV Ra, #C      | RF[a] = C                     |
| SUB Ra, Rb, Rc  | RF[a] = RF[b] - RF[c]         |
| JMPZ Ra, offset | PC = PC + offset if RF[a] = 0 |

TABLE 8.2 Instruction opcodes.

| Instruction     | Opcode |
|-----------------|--------|
| MOV Ra, d       | 0000   |
| MOV d, Ra       | 0001   |
| ADD Ra, Rb, Rc  | 0010   |
| MOV Ra, #C      | 0011   |
| SUB Ra, Rb, Rc  | 0100   |
| JMPZ Ra, offset | 0101   |

Digital Design, Copyright © 2006 Frank Vahid
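
To make the encodings concrete, here is a small Python sketch that packs the six instructions into 16-bit words using the opcodes of Table 8.2 and then decodes and executes them. Several details are assumptions rather than statements from the slides: the first three instructions are assumed to use the same field layout as the three shown above, the 8-bit constant is treated as unsigned, the 8-bit offset is treated as a two's complement value added to the address of the JMPZ instruction itself, and the helper names `enc8`, `enc3`, and `run` are hypothetical.

```python
# Sketch of a six-instruction interpreter (assumptions noted in the lead-in).
OPC = {"LOAD": 0b0000, "STORE": 0b0001, "ADD": 0b0010,
       "MOVC": 0b0011, "SUB": 0b0100, "JMPZ": 0b0101}

def enc8(op, a, v):
    """Format: opcode | ra | 8-bit field (address, constant, or offset)."""
    return (OPC[op] << 12) | (a << 8) | (v & 0xFF)

def enc3(op, a, b, c):
    """Format: opcode | ra | rb | rc."""
    return (OPC[op] << 12) | (a << 8) | (b << 4) | c

def run(program, D, max_steps=1000):
    """Fetch, decode, and execute until the PC leaves the program."""
    RF, pc, steps = [0] * 16, 0, 0
    while 0 <= pc < len(program) and steps < max_steps:
        ir = program[pc]
        op, a = ir >> 12, (ir >> 8) & 0xF
        low8, b, c = ir & 0xFF, (ir >> 4) & 0xF, ir & 0xF
        next_pc = pc + 1
        if op == OPC["LOAD"]:
            RF[a] = D[low8]
        elif op == OPC["STORE"]:
            D[low8] = RF[a]
        elif op == OPC["ADD"]:
            RF[a] = (RF[b] + RF[c]) & 0xFFFF
        elif op == OPC["MOVC"]:
            RF[a] = low8                      # assumed unsigned constant
        elif op == OPC["SUB"]:
            RF[a] = (RF[b] - RF[c]) & 0xFFFF
        elif op == OPC["JMPZ"]:
            if RF[a] == 0:
                offset = low8 - 256 if low8 & 0x80 else low8
                next_pc = pc + offset         # assumed relative to this JMPZ
        else:
            raise ValueError(f"unknown opcode {op:04b}")
        pc, steps = next_pc, steps + 1
    return RF

# Example program: D[1] = 3 * D[0] by repeated addition.
program = [
    enc8("MOVC", 0, 0),      # 0: MOV R0, #0      sum = 0
    enc8("MOVC", 1, 3),      # 1: MOV R1, #3      count = 3
    enc8("MOVC", 2, 1),      # 2: MOV R2, #1      constant 1
    enc8("LOAD", 3, 0),      # 3: MOV R3, 0       R3 = D[0]
    enc8("JMPZ", 1, 4),      # 4: JMPZ R1, 4      exit loop when count is 0
    enc3("ADD", 0, 0, 3),    # 5: ADD R0, R0, R3
    enc3("SUB", 1, 1, 2),    # 6: SUB R1, R1, R2
    enc8("JMPZ", 4, -3),     # 7: JMPZ R4, -3     R4 stays 0, so always taken
    enc8("STORE", 0, 1),     # 8: MOV 1, R0       D[1] = sum
]
D = [0] * 256
D[0] = 7
run(program, D)
print(D[1])
```

Because the instruction set has no unconditional jump, the loop uses JMPZ on a register that is never written (R4) to jump back to the loop test.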







CSE 30321 - Lecture 15 - Midterm Review

# IC, CPI and IPC

Consider the processor we have worked on. What is its CPI? IPC?



**University of Notre Dame** 

Instruction Count (IC) = Number of Instructions = 10

Average number of cycles per instruction (CPI) =

Instructions per Cycle (IPC) =

Can CPI < 1?



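- $\frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}} = \frac{\text{Seconds}}{\text{Program}} = \text{CPU time}$
- CPU performance depends on:
  - Clock rate, CPI, and instruction count
- CPU time is directly proportional to instruction count, CPI, and clock cycle time (and so inversely proportional to clock rate):
  - Therefore, with the other two held constant, an x % reduction in instruction count, CPI, or clock cycle time gives an x % reduction in CPU time
- But these factors are rarely independent: changing one usually affects the others.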
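
A quick numeric sketch of the CPU time equation may help. All numbers below are hypothetical, and the function name `cpu_time` is an assumption for illustration; the point is only that doubling the clock rate does not halve CPU time if the CPI rises at the same time.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds = IC x CPI x (seconds per clock cycle)."""
    return instruction_count * cpi / clock_rate_hz

# Hypothetical baseline: 1 million instructions, CPI of 2, 1 GHz clock.
base = cpu_time(1_000_000, 2.0, 1e9)

# Hypothetical redesign: clock rate doubles, but CPI rises from 2 to 3.
faster_clock = cpu_time(1_000_000, 3.0, 2e9)

speedup = base / faster_clock   # well below the 2x the clock alone suggests
ipc = 1 / 2.0                   # IPC is the reciprocal of CPI

print(f"baseline:     {base * 1e3:.2f} ms")
print(f"faster clock: {faster_clock * 1e3:.2f} ms")
print(f"speedup:      {speedup:.2f}x, baseline IPC: {ipc}")
```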




### **Different Types of Instructions**

- Multiplication takes more time than addition
- Floating point operations take longer than integer operations
- Memory accesses take more time than register accesses
- NOTE: changing the cycle time often affects the number of cycles an instruction will take

$$\text{CPU Clock Cycles} = \sum_{i=1}^{n} CPI_i \times IC_i = \text{AvgCPI} \times IC$$
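
The sum can be checked with a small worked example. The instruction mix below (class names, counts, and per-class CPI values) is entirely hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Hypothetical instruction mix: class -> (IC_i, CPI_i)
mix = {
    "alu":    (50, 1),
    "load":   (30, 2),
    "branch": (20, 3),
}

# CPU clock cycles = sum over classes of CPI_i * IC_i
total_cycles = sum(ic * cpi for ic, cpi in mix.values())

# Total instruction count and the resulting average CPI
total_ic = sum(ic for ic, _ in mix.values())
avg_cpi = total_cycles / total_ic

print(total_cycles, total_ic, avg_cpi)
```

Multiplying `avg_cpi` by `total_ic` gives back `total_cycles`, which is the right-hand side of the equation.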


### **Question 2a - Measurement Comparison**

- Given that two machines have the same ISA, which measurement is always the same for both machines running program P?
  - Clock Rate:
  - CPI:
  - Execution Time:
  - Number of Instructions:
  - MIPS:






## Deriving the previous formula




### **MIPS Registers** (and the "conventions" associated with them)

| Name      | R#    | Usage                            | Preserved on Call |
|-----------|-------|----------------------------------|-------------------|
| \$zero    | 0     | The constant value 0             | n.a.              |
| \$at      | 1     | Reserved for assembler           | n.a.              |
| \$v0-\$v1 | 2-3   | Values for results & expr. eval. | no                |
| \$a0-\$a3 | 4-7   | Arguments                        | no                |
| \$t0-\$t7 | 8-15  | Temporaries                      | no                |
| \$s0-\$s7 | 16-23 | Saved                            | yes               |
| \$t8-\$t9 | 24-25 | More temporaries                 | no                |
| \$k0-\$k1 | 26-27 | Reserved for use by OS           | n.a.              |
| \$gp      | 28    | Global pointer                   | yes               |
| \$sp      | 29    | Stack pointer                    | yes               |
| \$fp      | 30    | Frame pointer                    | yes               |
| \$ra      | 31    | Return address                   | yes               |

[Figure: memory diagram with axis labels "Lower Mem Addr" and "Higher Mem Addr"]


## More complex cases

- Register contents across procedure calls are designated as either caller or callee saved
- MIPS register conventions:
  - \$t\*, \$v\*, \$a\*: not preserved across call
    - caller saves them if required
  - \$s\*, \$ra, \$fp: preserved across call
    - callee saves them if required
  - See P&H FIGURE 2.18 (p.88) for a detailed register usage convention
  - Save to where??
- More complex procedure calls
  - What if you have more than 4 arguments?
  - What if your procedure requires more registers than are available?
  - What about nested procedure calls?
  - What happens to \$ra if proc1 calls proc2, which calls proc3, ...?


### The stack comes to the rescue

#### Stack

- A dedicated area of memory
- First-In-Last-Out (FILO)

#### Used to

- Hold values passed to a procedure as arguments
- Save register contents when needed
- Provide space for variables local to a procedure

### Stack operations

- push: place data on stack (sw in MIPS)
- pop: remove data from stack (lw in MIPS)

### Stack pointer

- Stores the address of the top of the stack
- \$29 (\$sp) in MIPS
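
The push and pop operations, and the way they rescue \$ra in nested calls, can be modeled with a short Python sketch. The class name `MipsStack`, the starting value of the stack pointer, and the return addresses are hypothetical; the sketch assumes 4-byte words and the usual MIPS convention that the stack grows toward lower addresses.

```python
class MipsStack:
    """Toy model of a MIPS stack: memory is a dict, sp mirrors $sp."""

    WORD = 4  # bytes per word

    def __init__(self, initial_sp=0x7FFFFFFC):  # hypothetical start address
        self.mem = {}
        self.sp = initial_sp

    def push(self, value):
        # Roughly: addi $sp, $sp, -4  then  sw value, 0($sp)
        self.sp -= self.WORD
        self.mem[self.sp] = value

    def pop(self):
        # Roughly: lw value, 0($sp)  then  addi $sp, $sp, 4
        value = self.mem[self.sp]
        self.sp += self.WORD
        return value

stack = MipsStack()
start_sp = stack.sp

# Nested calls: each procedure saves its own $ra before calling the next one.
stack.push(0x00400020)   # proc1 saves its return address, then calls proc2
stack.push(0x00400058)   # proc2 saves its return address, then calls proc3

# Returns unwind in the opposite order (first in, last out).
ra_proc2 = stack.pop()   # proc2 restores $ra before returning to proc1
ra_proc1 = stack.pop()   # proc1 restores $ra before returning to its caller

print(hex(ra_proc2), hex(ra_proc1), stack.sp == start_sp)
```

The same push and pop pattern applies to saved registers, extra arguments, and local variables.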



### Procedure call essentials: Caller/Callee Mechanics




#### Where is the stack located?

[Figure: memory structure showing the reserved, instruction (PC), data, and stack (SP) segments; addresses i-2 through i+2 are shown, with the top of the stack at address i and \$sp = i]