## "Big Picture" Discussion

### CSE 30321 MIPS Single Cycle Dataflow




### Functions of Each Component

- Datapath: performs data manipulation operations
  - arithmetic logic unit (ALU)
  - floating point unit (FPU)
- Control: directs operation of other components
  - finite state machines
  - micro-programming
- Memory: stores instructions and data
  - random access vs. sequential access
  - volatile vs. non-volatile
  - RAMs (SRAM, DRAM), ROMs (PROM, EEPROM), disk
  - tradeoff between speed and cost/bit
- Input/Output and I/O devices: interface to environment
  - mouse, keyboard, display, device drivers

### The Performance Perspective

- Performance of a machine is determined by:
  - Instruction count, clock cycles per instruction, clock cycle time
- Processor design (datapath and control) determines:
  - Clock cycles per instruction
  - Clock cycle time
- We will discuss a simplified MIPS implementation
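The three factors listed above multiply together to give execution time. A minimal Python sketch of that relationship follows; the function name `cpu_time` and the program numbers are hypothetical, chosen only for illustration.

```python
def cpu_time(instruction_count, cpi, cycle_time_ns):
    """Execution time in nanoseconds: instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * cycle_time_ns


# Hypothetical program: 1,000,000 instructions, CPI of 1, 8 ns clock cycle.
single_cycle_ns = cpu_time(1_000_000, 1.0, 8.0)

# Same program on a hypothetical design with a shorter clock but a higher CPI.
multi_cycle_ns = cpu_time(1_000_000, 4.0, 2.0)
```

With these made-up numbers both designs take the same time, which is the point: processor design trades clock cycles per instruction against clock cycle time.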

- Let's talk about this generally on the board first
- Let's just look at our instruction formats and "derive" a simple datapath
  - (we need to make all of these instruction formats "work")
  - (see handout summarizing board discussion from last time)
- To simplify things a bit, we'll just look at a few instructions (the most common ones):
  - memory-reference: lw, sw
  - arithmetic-logical: add, sub, and, or, slt
  - branching: beq, j
- Organizational overview:
  - fetch an instruction based on the contents of the PC
  - decode the instruction
  - fetch operands
    - (read one or two registers)
  - execute
    - (effective address calculation / arithmetic / comparison)
  - store result
    - (write to memory / write to register)

### What we'll do

- ...look at instruction encodings...
- ...look at datapath development...
- ...discuss how we generate the control signals to make the datapath elements work...

### What Is to Be Done for Each Instruction?

- Steps:
  - Decode
  - Fetch operands
  - Execute
  - Write back
- How many cycles should the above take?
  - Fewer cycles => more to be done in one cycle
- (Digression: single cycle vs. multi-cycle with the 6-instruction processor)
- You are the architect, so you decide!

## Implementation Overview

University of Notre Dame, Computer Sci. & Engr., Lectures 11-12

## Single Cycle Implementation

- Each instruction takes one cycle to complete.
- We wait for everything to settle down so that the right thing gets done
  - the ALU might not produce the "right answer" right away (why?)
  - we use write signals along with the clock to determine when to write
- Cycle time determined by length of the longest path
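To make the "longest path" point concrete, here is a small Python sketch. The component delays and the per-instruction paths are hypothetical values picked for illustration, not figures from the lecture.

```python
# Hypothetical component delays in picoseconds (illustrative only).
DELAY = {"imem": 200, "reg_read": 100, "alu": 200, "dmem": 200, "reg_write": 100}

# Components each instruction class passes through, in order (assumed).
PATHS = {
    "R-type": ["imem", "reg_read", "alu", "reg_write"],
    "lw": ["imem", "reg_read", "alu", "dmem", "reg_write"],
    "sw": ["imem", "reg_read", "alu", "dmem"],
    "beq": ["imem", "reg_read", "alu"],
}


def path_delay(instr):
    """Total propagation delay along one instruction class's path."""
    return sum(DELAY[component] for component in PATHS[instr])


def cycle_time():
    """A single-cycle clock must be long enough for the slowest instruction."""
    return max(path_delay(instr) for instr in PATHS)
```

Under these assumed delays, `lw` touches every component and sets the cycle time, so every other instruction waits out the same long cycle.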




### **Instruction Fetch Unit**

- Fetch the instruction: mem[PC]
- Update the program counter:
  - sequential code: PC <- PC+4
  - branch and jump: PC <- "something else"
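The fetch step and the sequential PC update can be sketched as follows. Modeling instruction memory as a Python dict keyed by byte address is an assumption made for brevity; `fetch` and `next_pc_sequential` are hypothetical helper names.

```python
MASK32 = 0xFFFFFFFF


def fetch(imem, pc):
    """Return the 32-bit instruction word stored at byte address pc (mem[PC])."""
    return imem[pc]


def next_pc_sequential(pc):
    """Sequential code: PC <- PC + 4, wrapped to 32 bits."""
    return (pc + 4) & MASK32


# Two instruction words at consecutive word addresses (the values are arbitrary).
imem = {0x0: 0x00221820, 0x4: 0x00000000}
pc = 0x0
instruction = fetch(imem, pc)
pc = next_pc_sequential(pc)
```

Branches and jumps replace `next_pc_sequential` with a different next-address computation.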





## Let's say we want to fetch an R-type instruction (arithmetic)

- Instruction format:

| 31-26  | 25-21  | 20-16  | 15-11  | 10-6      | 5-0       |
|--------|--------|--------|--------|-----------|-----------|
| op (6) | rs (5) | rt (5) | rd (5) | shamt (5) | funct (6) |

- RTL:
  - Instruction fetch: mem[PC], so IR <- Memory(PC)
  - ALU operation: reg[rd] <- reg[rs] op reg[rt]
  - Go to next instruction: PC <- PC + 4
- Ra, Rb, and Rw come from the instruction's rs, rt, and rd fields.
- The actual ALU operation and register write should occur after decoding the instruction.
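The R-type RTL above can be sketched as a single step function. This is a simplified model under stated assumptions: the `state` dict layout and the names `step_r_type` and `ALU_OPS` are hypothetical, results wrap to 32 bits, and overflow trapping is ignored.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    return x - (1 << 32) if x & 0x80000000 else x


# The R-type operations covered in these notes; results are wrapped to 32 bits.
ALU_OPS = {
    "add": lambda a, b: (a + b) & MASK32,
    "sub": lambda a, b: (a - b) & MASK32,
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
    "slt": lambda a, b: int(to_signed(a) < to_signed(b)),
}


def step_r_type(state, op, rs, rt, rd):
    """reg[rd] <- reg[rs] op reg[rt]; PC <- PC + 4."""
    regs = state["reg"]
    result = ALU_OPS[op](regs[rs], regs[rt])
    if rd != 0:  # register 0 always reads as zero in MIPS
        regs[rd] = result
    state["pc"] = (state["pc"] + 4) & MASK32


state = {"pc": 0, "reg": [0] * 32}
state["reg"][1], state["reg"][2] = 5, 7
step_r_type(state, "add", rs=1, rt=2, rd=3)
step_r_type(state, "sub", rs=1, rt=2, rd=4)
```

After the two steps, register 3 holds the sum and register 4 holds the 32-bit two's-complement encoding of the negative difference.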

## During Decode ...

- Take bits from the instruction encoding in the IR and send them to different parts of the datapath

#### e.g. R-type, Add encoding:
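As a stand-in illustration, the sketch below splits a 32-bit R-type word into its fields using the bit positions from the instruction format table. The example word encodes `add $3, $1, $2`, assuming the standard MIPS operand order (rd, rs, rt) and funct code 0x20 for add; `decode_r_type` is a hypothetical helper name.

```python
def decode_r_type(word):
    """Slice a 32-bit instruction word into the six R-type fields."""
    return {
        "op": (word >> 26) & 0x3F,     # bits 31-26
        "rs": (word >> 21) & 0x1F,     # bits 25-21
        "rt": (word >> 16) & 0x1F,     # bits 20-16
        "rd": (word >> 11) & 0x1F,     # bits 15-11
        "shamt": (word >> 6) & 0x1F,   # bits 10-6
        "funct": word & 0x3F,          # bits 5-0
    }


# add $3, $1, $2: op = 0, rs = 1, rt = 2, rd = 3, shamt = 0, funct = 0x20
fields = decode_r_type(0x00221820)
```

In hardware this "decoding" is just wiring: each field's bits are routed to the register file address ports or to control.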



## Datapath for R-Type Instructions



Register timing:

- Registers can always be read.
- A register write only happens when RegWr is high, at the falling edge of the clock.
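The read/write behavior above can be modeled roughly as follows. This is a behavioral sketch, not a timing-accurate one: `clock_edge` stands for the falling clock edge, `bus_w` is an assumed name for the write-data input, and ignoring writes to register 0 reflects MIPS convention rather than anything stated on the slide.

```python
class RegisterFile:
    """32 x 32-bit register file with two read ports and one write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, ra, rb):
        """Reads are combinational: available at any time, no clock needed."""
        return self.regs[ra], self.regs[rb]

    def clock_edge(self, reg_wr, rw, bus_w):
        """Called once per falling clock edge; writes only if RegWr is high."""
        if reg_wr and rw != 0:  # register 0 stays zero (MIPS convention)
            self.regs[rw] = bus_w & 0xFFFFFFFF


rf = RegisterFile()
rf.clock_edge(reg_wr=0, rw=5, bus_w=99)   # RegWr low: nothing is written
before = rf.read(5, 0)
rf.clock_edge(reg_wr=1, rw=5, bus_w=99)   # RegWr high: register 5 is updated
after = rf.read(5, 0)
```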




## **I-Type Branch Instructions**

- Instruction format:

| 31-26  | 25-21  | 20-16  | 15-0                         |
|--------|--------|--------|------------------------------|
| Op (6) | rs (5) | rt (5) | Address/Immediate value (16) |

- RTL for branch operations: e.g., BEQ
  - Instruction fetch: mem[PC]
  - Compute condition: Cond <- reg[rs] - reg[rt]
  - Calculate the next instruction's address:

```
if (Cond eq 0) then
    PC <- PC + 4 + (SignExt(imm16) x 4)
else ?
```

  - Need to align: the 16-bit immediate counts words, so it is multiplied by 4 to form a byte offset.
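The BEQ next-address computation can be sketched in Python as below. The helper names are hypothetical, and the not-taken case, which the slide leaves as a question, is taken here to be the usual sequential update PC + 4.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16):
    """Sign-extend a 16-bit field to a Python integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def beq_next_pc(pc, rs_val, rt_val, imm16):
    """Next PC for BEQ: branch if reg[rs] - reg[rt] is zero."""
    cond = (rs_val - rt_val) & MASK32
    if cond == 0:
        # The immediate is a word offset, so multiply by 4 to get bytes.
        return (pc + 4 + sign_extend16(imm16) * 4) & MASK32
    return (pc + 4) & MASK32  # assumed fall-through: next sequential instruction


taken = beq_next_pc(0x100, 3, 3, 2)           # equal operands, offset +2 words
not_taken = beq_next_pc(0x100, 3, 4, 2)       # unequal operands
backward = beq_next_pc(0x100, 3, 3, 0xFFFF)   # offset -1 word
```

A backward offset of -1 word lands on the branch itself, since the offset is relative to PC + 4.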


## Next Address Logic






## **J-Type Jump Instructions**

- Instruction format:

| 31-26  | 25-0                |
|--------|---------------------|
| Op (6) | Target address (26) |

- RTL operations: e.g., J
  - Instruction fetch: mem[PC]
  - Set up PC: PC <- (PC + 4)<31:28> CONCAT (target<25:0> x 4)
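The jump address formation can be sketched as a few bit operations. `jump_target` is a hypothetical helper name; the computation follows the RTL above, keeping the top four bits of PC + 4 and shifting the 26-bit target left by two.

```python
def jump_target(pc, target26):
    """PC <- (PC + 4)<31:28> concatenated with (target<25:0> x 4)."""
    upper = (pc + 4) & 0xF0000000           # top 4 bits of PC + 4
    lower = (target26 & 0x03FFFFFF) << 2    # 26-bit word address -> 28-bit byte address
    return upper | lower


same_region = jump_target(0x00400000, 0x0100000)
high_region = jump_target(0x10000000, 1)
```

Because only 28 bits come from the instruction, a jump stays within the 256 MB region selected by the upper bits of PC + 4.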

## **Datapath for Branch Instructions**


## A Single Cycle Datapath






## Let's trace a few instructions


- For example:
  - Add \$5, \$6, \$7
  - SW 0(\$9), \$10
  - Sub \$1, \$2, \$3
  - LW \$11, 0(\$12)

[Figure: control path. Control inputs: Opcode (6 bits), Func (6 bits). Control outputs to the datapath include MemWrite, ALUSrc, ALUctr, Branch, and Jump.]
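The mapping from opcode and funct bits to control signals can be sketched as a lookup. Signal names follow these notes (RegWr, ALUSrc, ALUctr, MemWrite, Branch, Jump); representing ALUctr as a string and omitting any other signals the full design needs are simplifying assumptions. The opcode and funct values are the standard MIPS encodings.

```python
# Standard MIPS funct codes for the R-type instructions in these notes.
FUNCT_TO_ALU = {0x20: "add", 0x22: "sub", 0x24: "and", 0x25: "or", 0x2A: "slt"}


def control(opcode, funct=0):
    """Return control signal values for one instruction (simplified sketch)."""
    sig = dict(RegWr=0, ALUSrc=0, MemWrite=0, Branch=0, Jump=0, ALUctr=None)
    if opcode == 0x00:      # R-type: ALU operation chosen by funct
        sig.update(RegWr=1, ALUctr=FUNCT_TO_ALU[funct])
    elif opcode == 0x23:    # lw: address = reg[rs] + immediate
        sig.update(RegWr=1, ALUSrc=1, ALUctr="add")
    elif opcode == 0x2B:    # sw: address = reg[rs] + immediate
        sig.update(ALUSrc=1, MemWrite=1, ALUctr="add")
    elif opcode == 0x04:    # beq: subtract to compare the two registers
        sig.update(Branch=1, ALUctr="sub")
    elif opcode == 0x02:    # j
        sig.update(Jump=1)
    else:
        raise ValueError(f"unsupported opcode {opcode:#x}")
    return sig


sw_signals = control(0x2B)
sub_signals = control(0x00, funct=0x22)
```

Note that only instructions that write a register assert RegWr, and only `sw` asserts MemWrite; everything else stays low so no state changes by accident.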



- CPI = 1
- Clock cycle = $\sum_i (\text{fraction of type-}i\text{ instructions}) \times (\text{propagation delay of the type-}i\text{ instruction's datapath operations})$
- Better than the previous, but impractical to implement
- Disadvantages:
  - What if we have floating-point operations?
  - How about component usage?
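The weighted-sum clock cycle formula above can be evaluated with a short script. The instruction mix and per-type delays are hypothetical numbers chosen for illustration; the mix is given in whole percent so the arithmetic stays exact.

```python
# Hypothetical instruction mix (percent) and datapath delays (picoseconds).
MIX_PERCENT = {"R-type": 45, "lw": 25, "sw": 10, "beq": 15, "j": 5}
DELAY_PS = {"R-type": 600, "lw": 800, "sw": 700, "beq": 500, "j": 200}


def average_cycle_ps():
    """Sum over instruction types of (fraction of type i) x (delay of type i)."""
    return sum(MIX_PERCENT[t] * DELAY_PS[t] for t in MIX_PERCENT) / 100


def fixed_cycle_ps():
    """A fixed clock must instead accommodate the slowest instruction type."""
    return max(DELAY_PS.values())


speedup = fixed_cycle_ps() / average_cycle_ps()
```

With these assumed values the variable clock averages 625 ps against a fixed 800 ps, a speedup of 1.28, which is the sense in which this scheme is "better than the previous" even though it is impractical to build.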


## Multiple Cycle Alternative

- Break an instruction into smaller steps
- Execute each step in one cycle.
- Execution sequence:
  - Balance amount of work to be done
  - Restrict each cycle to use only one major functional unit
  - At the end of a cycle
    - Store values for use in later cycles (why?)
    - Introduce additional "internal" registers
- The advantages:
  - Cycle time much shorter
  - Different instructions take different numbers of cycles to complete
  - A functional unit can be used more than once per instruction
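To see how per-instruction cycle counts turn into an average CPI, consider the sketch below. The cycle counts (5 for lw, 4 for sw and R-type, 3 for beq and j) are assumed from a typical textbook multi-cycle MIPS design, and the instruction mix is hypothetical; neither comes from these notes.

```python
# Assumed cycles per instruction type in a multi-cycle design.
CYCLES = {"R-type": 4, "lw": 5, "sw": 4, "beq": 3, "j": 3}

# Hypothetical instruction mix in whole percent.
MIX_PERCENT = {"R-type": 45, "lw": 25, "sw": 10, "beq": 15, "j": 5}


def average_cpi():
    """Average CPI: each type's cycle count weighted by how often it occurs."""
    return sum(MIX_PERCENT[t] * CYCLES[t] for t in MIX_PERCENT) / 100


cpi = average_cpi()
```

The CPI rises above 1, so the multi-cycle design only wins if its cycle time shrinks by more than that factor relative to the single-cycle clock.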
