#### Chapter 2: Instructions: Language of the Computer 2.8 – 2.10 Procedure, String and Addressing for Wide

#### ITSC 3181 Introduction to Computer Architecture Fall 2021 <u>https://passlab.github.io/ITSC3181/</u>

Department of Computer Science Yonghong Yan <u>yyan7@uncc.edu</u> <u>https://passlab.github.io/yanyh/</u>

#### **Chapter 2: Instructions: Language of the Computer**

- Lecture
  - 2.1 Introduction
  - 2.2 Operations of the Computer Hardware
  - 2.3 Operands of the Computer Hardware

#### • Lecture

- 2.4 Signed and Unsigned Numbers
- 2.5 Representing Instructions in the Computer
- Lecture
  - 2.6 Logical Operations
  - 2.7 Instructions for Making Decisions

#### Lecture

- 2.8 Supporting Procedures in Computer Hardware
- 2.9 Communicating with People
- 2.10 RISC-V Addressing for Wide Immediate and Addresses
- Lecture
  - 2.11 Parallelism and Instructions: Synchronization
  - 2.12 Translating and Starting a Program
    - We covered before along with C Basics
  - 2.13 A C Sort Example to Put It All Together
  - 2.14 Arrays versus Pointers
    - We covered most before along with C Basics
  - 2.15 Advanced Material: Compiling C and Interpreting Java
  - 2.16 Real Stuff: MIPS Instructions
  - 2.17 Real Stuff: x86 Instructions
  - 2.18 Real Stuff: The rest of RISC-V
  - 2.19 Fallacies and Pitfalls
  - 2.20 Concluding Remarks
  - 2.21 Historical Perspective and Further Reading

#### **Three Classes of Instructions We Will Focus On:**

- **1.** Arithmetic-logic instructions
  - add, sub, addi, and, or, shift left | right, etc
- **2.** Memory load and store instructions
  - Iw and sw: Load/store word
  - Id and sd: Load/store doubleword
- 3. Control transfer instructions (changing sequence of instruction execution)
  - Conditional branch: bne, beq
  - Unconditional jump: j
  - Procedure call and return: jal and jr

# \$2.8 8 Supporting Procedures in Computer Hardware

#### Procedure Call: sum\_full.c



https://passlab.github.io/ITSC3181/exercises/sum

### **Procedure Call**

- Control is transferred when there is procedure call and return
- Steps required
  - 1. Place parameters in registers
  - 2. Transfer control to procedure
  - 3. Acquire storage for procedure
  - 4. Perform procedure's operations
  - 5. Place result in register for caller
  - 6. Return to place of call



#### Sum Example, sum\_full\_riscv.s



https://passlab.github.io/ITSC3181/exercises/sum

### **Procedure Call Instructions**

- Procedure call: jump and link
  - jal x1, ProcedureLabel
  - Address of following instruction put in x1
  - Jumps to target address
- Procedure return: jump and link register
   jalr x0, 0(x1)
  - Like jal, but jumps to 0 + address in x1
  - Use x0 as rd (x0 cannot be changed)
  - Can also be used for computed jumps
    - e.g., for case/switch statements

#### **Memory Layout of a Process**

- Text: program code
- Static data: global variables
  - e.g., static variables in C, constant arrays and strings
  - x3 (global pointer) initialized to address allowing ±offsets into this segment
- Dynamic data: heap
   E.g., malloc in C, new in Java
- Stack: automatic storage



#### Local Data on the Stack

- Local data allocated by callee
  - e.g., C automatic variables
- Procedure frame (activation record)
  - Used by some compilers to manage stack storage



# **Stack Memory Used for Function Calls**

- Stack is Last-In-First-Out (LIFO) data structure to store the info of each function of the call path
  - Main() calls foo(), foo() calls bar(), bar() calls tar()
  - Call in: push function to the stack top
  - Return: pop function from the top
- Stack frame, function frame, activation record
  - The memory and the data of the info for each function call







#### Stack Frame (Activation Record) of a Function Call



#### Leaf Procedure Example

- Leaf procedure
  - A procedure does not call other procedures
    - Thinking of procedure calls as a tree
- C code:

```
long long int leaf_example (
 long long int g, long long int h,
 long long int i, long long int j) {
   long long int f;
   f = (g + h) - (i + j);
   return f;
}
```

- Arguments g, ..., j in x10, ..., x13
- f in x20
- temporaries x5, x6
- Need to save x5, x6, x20 on stack

#### Local Data on the Stack

- Stack before, during and after the function call
  - SP (stack pointer) is the register that store the address of the long long int leaf\_example ( long long int g, long long int g

```
int main ( ... ) {
    ...
    long long int result = leaf_example( ... );
    ...
}
```

```
long long int leaf_example (
  long long int g, long long int h,
  long long int i, long long int j) {
    long long int f;
    f = (g + h) - (i + j);
    return f;
  }
- Arguments g, ..., j in x10, ..., x13
- f in x20
- temporaries x5, x6
```

Need to save x5, x6, x20 on stack

If caller uses x5, x6 or x20, we have to preserve them. They are preserved in the callee stack.



#### Leaf Procedure Example

long long int leaf\_example ( long long int g, long long int h, long long int i, long long int j) { long long int f; f = (g + h) - (i + j);return f: - Arguments g, ..., j in x10, ..., x13 - f in x20 temporaries x5, x6 Need to save x5, x6, x20 on stack Adjust stack to make room for 3 items Save x5, x6, x20 on stack x5 = g + hx6 = i + jf = x5 - x6copy f to return register Resore x5, x6, x20 from stack Adjust back to return memory for 3 items

Return to caller

**RISC-V** code: leaf\_example: addi sp,sp,-24 sd x5,16(sp) sd x6,8(sp) sd x20,0(sp) add x5,x10,x11 add x6,x12,x13 sub x20,x5,x6 addi x10,x20,0 1d x20,0(sp) ld x6,8(sp) ld x5,16(sp) addi sp, sp, 24 jalr x0,0(x1)

#### **Register Usage**

- x5 x7, x28 x31: temporary registers
  - Not automatically preserved by the callee
- x8 x9, x18 x27: saved registers
  - If used, the callee saves and restores them

| Name                 | Register<br>number | Usage                          | Preserved<br>on call? |  |
|----------------------|--------------------|--------------------------------|-----------------------|--|
| x0                   | 0                  | The constant value 0           | n.a.                  |  |
| x1 (ra)              | 1                  | Return address (link register) | yes                   |  |
| x2 (sp)              | 2                  | Stack pointer                  | yes                   |  |
| x3 (gp)              | 3                  | Global pointer                 | yes                   |  |
| x4 (tp)              | 4                  | Thread pointer                 | yes                   |  |
| x5-x7                | 5–7                | Temporaries                    | no                    |  |
| x8-x9                | 8–9                | Saved                          | yes                   |  |
| x10-x17              | 10-17              | Arguments/results              | no                    |  |
| x18-x27 <b>18-27</b> |                    | Saved                          | yes                   |  |
| x28-x31 <b>28-31</b> |                    | Temporaries                    | no                    |  |

FIGURE 2.14 RISC-V register conventions.

#### **Non-Leaf Procedures**

- Procedures that call other procedures
- For nested call, caller needs to save on the stack:
  - Its return address
  - Any arguments and temporaries needed after the call
- Restore from the stack after the call

#### **Non-Leaf Procedure Example**

- C code:
  - long long int fact (long long int n)
    {
     if (n < 1) return n;
     else return n \* fact(n 1);
    }</pre>
  - Argument n in x10
  - Result in x10
- It is a recursive function.

#### Leaf Procedure Example

- RISC-V code: fact: addi sp,sp,-16
  - sd x1,8(sp)
    sd x10,0(sp)
  - addi x5,x10,-1
  - bge x5,x0,L1
  - addi x10,x0,1
  - addi sp,sp,16
  - jalr x0,0(x1)
- L1: addi x10,x10,-1 jal x1,fact addi x6,x10,0 ld x10,0(sp) ld x1,8(sp)
  - addi sp, sp, 16
  - mul x10,x10,x6
    jalr x0,0(x1)

```
long long int fact (long long int n)
```

```
if (n < 1) return f;
else return n * fact(n - 1);
```

- Argument n in x10
- Result in x10

{

}

Adjust stack for two items Save return address and n on stack Save the argument n x5 = n - 1if  $n \ge 1$ , go to L1 Else, set return value to 1 Pop stack, don't bother restoring values Return n = n - 1call fact(n-1) move result of fact(n - 1) to x6 Restore caller's n Restore caller's return address Pop stack

return n \* fact(n-1)

#### **Character Data**

- Byte-encoded character sets
  - ASCII: 128 characters
    - 95 graphic, 33 control
  - Latin-1: 256 characters
    - ASCII, +96 more graphic characters
- Unicode: 32-bit character set
  - Used in Java, C++ wide characters, ...
  - Most of the world's alphabets, plus symbols
  - UTF-8, UTF-16: variable-length encodings

#### **ASCII Characters**

• Each character is represented by a 8-bit byte → max 256

| ASCII<br>value | Char-<br>acter |
|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|
| 32             | space          | 48             | 0              | 64             | @              | 80             | Р              | 96             |                | 112            | р              |
| 33             | 1              | 49             | 1              | 65             | A              | 81             | Q              | 97             | а              | 113            | q              |
| 34             |                | 50             | 2              | 66             | В              | 82             | R              | 98             | b              | 114            | r              |
| 35             | #              | 51             | 3              | 67             | С              | 83             | S              | 99             | с              | 115            | S              |
| 36             | \$             | 52             | 4              | 68             | D              | 84             | Т              | 100            | d              | 116            | t              |
| 37             | %              | 53             | 5              | 69             | E              | 85             | U              | 101            | е              | 117            | u              |
| 38             | &              | 54             | 6              | 70             | F              | 86             | V              | 102            | f              | 118            | v              |
| 39             |                | 55             | 7              | 71             | G              | 87             | W              | 103            | g              | 119            | w              |
| 40             | (              | 56             | 8              | 72             | н              | 88             | Х              | 104            | h              | 120            | x              |
| 41             | )              | 57             | 9              | 73             | 1              | 89             | Y              | 105            | i              | 121            | У              |
| 42             | *              | 58             | :              | 74             | J              | 90             | Z              | 106            | j              | 122            | z              |
| 43             | +              | 59             | ;              | 75             | к              | 91             | [              | 107            | k              | 123            | {              |
| 44             | ,              | 60             | <              | 76             | L              | 92             | \              | 108            | 1              | 124            | 1              |
| 45             | -              | 61             | =              | 77             | M              | 93             | 1              | 109            | m              | 125            | }              |
| 46             | •              | 62             | >              | 78             | N              | 94             | Α.             | 110            | n              | 126            | ~              |
| 47             | 1              | 63             | ?              | 79             | 0              | 95             |                | 111            | 0              | 127            | DEL            |

FIGURE 2.15 ASCII representation of characters.

# **Byte/Halfword/Word Operations**

- RISC-V byte/halfword/word load/store
  - Load byte/halfword/word: Sign extend to 64 bits in rd
    - lb rd, offset(rs1)
    - lh rd, offset(rs1)
    - lw rd, offset(rs1)
  - Load byte/halfword/word unsigned: Zero extend to 64 bits in rd
    - lbu rd, offset(rs1)
    - lhu rd, offset(rs1)
    - lwu rd, offset(rs1)
  - Store byte/halfword/word: Store rightmost 8/16/32 bits
    - sb rs2, offset(rs1)
    - sh rs2, offset(rs1)
    - sw rs2, offset(rs1)

# **String Copy Example**

- C code:
  - A string is an array of characters with` \0` as the last character
    - char x[100]; a string of 100 character
    - char \* x2; is used for refer to a string
  - Null-terminated string
- void strcpy (char x[], char y[])
  - { size\_t i; i = 0; while ((x[i]=y[i])!='\0') i += 1:
  - }
  - Base address for x and y are in x10 and x11
  - i is in x19

#### **String Copy Example**

|                                                | <pre>-void strcpy (char x[], char y[])     { size_t i;     i = 0;     while ((x[i]=y[i])!='\0')</pre> |
|------------------------------------------------|-------------------------------------------------------------------------------------------------------|
| <ul> <li>RISC-V code:</li> </ul>               | <u>i += 1;</u>                                                                                        |
| strcpy:                                        | }                                                                                                     |
| addi sp,sp,-8<br>sd x19,0(sp)<br>add x19,x0,x0 | // adjust stack for 1 doubleword<br>// save x19<br>// i=0                                             |
| L1: add x5,x19,x10                             | // x5 = addr of y[i]                                                                                  |
| lbu x6,0(x5)                                   | // x6 = y[i]                                                                                          |
| add x7,x19,x10                                 | // x7 = addr of x[i]                                                                                  |
| sb x6,0(x7)                                    | // x[i] = y[i]                                                                                        |
| beq x6,x0,L2                                   | // if y[i] == $\setminus 0$ then exit                                                                 |
| addi x19,x19,1                                 | //i = i + 1                                                                                           |
| jal x0,L1                                      | <pre>// next iteration of loop</pre>                                                                  |
| L2: ld x19,0(sp)                               | // restore saved x19                                                                                  |
| addi sp,sp,8                                   | <pre>// pop 1 doubleword from stack</pre>                                                             |
| jalr x0,0(x1)                                  | // and return 23                                                                                      |

# **32-bit Constants**

immediate funct3 rs1 rd opcode Most constants are small 12 bits 5 bits 7 bits 3 bits 5 bits I-format instructions have only 12 bits for immediate Iressing for Wide Immediates E.g. addi x6, x0, 1024 12-bit immediate is sufficient most of the time For the occasional 32-bit constant lui rd, constant (load upper immediate) Copies 20-bit constant to bits [31:12] of rd and Extends bit 31 to bits [63:32] Clears bits [11:0] of rd to 0 Addresses lui x19, 976 // 0x003D0 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0011 1101 0000 0000 0000 0000 addi x19,x19,128 // 0x500

Now x19 has 976\*212 + 128

#### SB-Format Encoding for Branch Instr (e.g. beq)

http://content.riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf#page=116

- Branch instructions, e.g. "beq x3, x4, EXIT", specify
  - Opcode, two source registers (rs1 and rs2), target address as imm
  - Most branch targets are near branch, Forward or backward
- SB-Format instructions: beq x8, x9, 4

|    | imm<br>▲ [10:5] | rs2 | rs1 | fun | ct3   | imm<br>[4:1] |     | оро   | code                 |  |  |
|----|-----------------|-----|-----|-----|-------|--------------|-----|-------|----------------------|--|--|
|    |                 |     |     |     |       |              |     |       |                      |  |  |
| im | imm[12] imm[11] |     |     |     |       |              |     |       |                      |  |  |
|    | imm[12 10:5]    | rs2 | rs1 | 000 | imm[4 | 4:1 11]      | 110 | 00011 | $\operatorname{BEQ}$ |  |  |
|    | imm[12 10:5]    | rs2 | rs1 | 001 | imm[4 | 4:1 11]      | 110 | 00011 | BNE                  |  |  |
|    | imm[12 10:5]    | rs2 | rs1 | 100 | imm[4 | 4:1 11]      | 110 | 00011 | BLT                  |  |  |
|    | imm[12 10:5]    | rs2 | rs1 | 101 | imm[4 | 4:1 11]      | 110 | 00011 | BGE                  |  |  |
|    | imm[12 10:5]    | rs2 | rs1 | 110 | imm[4 | 4:1 11]      | 110 | 00011 | BLTU                 |  |  |
|    | imm[12 10:5]    | rs2 | rs1 | 111 | imm[4 | 4:1 11]      | 110 | 00011 | BGEU                 |  |  |
|    | · · · ·         | - 1 |     |     |       | -            |     |       |                      |  |  |

#### • PC-relative addressing

- Branch target address is encoded as the offset off the the address of the branch instruction itself
- Target address = PC (Branch address) + immediate × 2

# imm of beq instruction is the offset between the address of beq instruction and the target address (page 115 of the text book)



- The Exit offset of the bne is encoded as 6 (..0110)
  - Offset is 6\*2 = 12 bytes, i.e. 3 instr forward
  - Exit's address = bne's address (80012) + 12 = 80024 (Exit)
- The Loop offset of the beq is encoded as -10 (..110110)
  - Offset is -10\*2 = -20 bytes, i.e. 5 instr backward
  - Loop's address = beq's address (80020) + -20 = 80000 (Loop)
- imm[11] is the sign bit of the imm  $\rightarrow$  help decoding
- To calculate the imm for beq: (target-PC)/2

## **Jump Addressing**

- Jump and link (jal) target uses 20-bit immediate for larger range
- UJ format:



- For long jumps, eg, to 32-bit absolute address
  - Iui: load address[31:12] to temp register
  - jair: add address[11:0] and jump to target

# Summary of RISC-V Addressing (How Operands are Specified or Provided)

1. Immediate addressing

| immediate | rs1 | funct3 | rd | ор |
|-----------|-----|--------|----|----|
|-----------|-----|--------|----|----|

2. Register addressing



3. Base addressing



4. PC-relative addressing



#### **RISC-V Encoding and Decoding: Encoding Format**

| Name         |                | Fi                | eld      |        |               | Comments                     |                               |  |
|--------------|----------------|-------------------|----------|--------|---------------|------------------------------|-------------------------------|--|
| (Field Size) | 7 bits         | 5 bits            | 5 bits   | 3 bits | 5 bits        | 7 bits                       |                               |  |
| R-type       | funct7         | rs2               | rs1      | funct3 | rd            | opcode                       | Arithmetic instruction format |  |
| I-type       | immediate      | rs1               | funct3   | rd     | opcode        | Loads & immediate arithmetic |                               |  |
| S-type       | immed[11:5]    | rs2               | rs1      | funct3 | immed[4:0]    | opcode                       | Stores                        |  |
| SB-type      | immed[12,10:5] | rs2               | rs1      | funct3 | immed[4:1,11] | opcode                       | Conditional branch format     |  |
| UJ-type      | imme           | ediate[20,10:1,11 | l,19:12] |        | rd            | opcode                       | Unconditional jump format     |  |
| U-type       |                | immediate[31::    | 12]      |        | rd            | opcode                       | Upper immediate format        |  |

#### **RISC-V Encoding and Decoding: Opcode/Funct**

|              |                |                  |         |        |               |        | Format  | Instruction | Opcode  | Funct3 | Funct6/7 |
|--------------|----------------|------------------|---------|--------|---------------|--------|---------|-------------|---------|--------|----------|
| Name         |                |                  | eld     |        |               |        |         | add         | 0110011 | 000    | 0000000  |
| (Field Size) | 7 bits         | 5 bits           | 5 bits  | 3 bits | 5 bits        | 7 bits |         | sub         | 0110011 | 000    | 0100000  |
| R-type       | funct7         | rs2              | rs1     | funct3 | rd            | opcode | $\top$  | s11         | 0110011 | 001    | 0000000  |
| -type        | immediate[     | 11:0]            | rs1     | funct3 | rd            | opcode |         | xor         | 0110011 | 100    | 0000000  |
| S-type       | immed[11:5]    | rs2              | rs1     | funct3 | immed[4:0]    | opcode | R-type  | srl         | 0110011 | 101    | 0000000  |
| SB-type      | immed[12,10:5] | rs2              | rs1     | funct3 | immed[4:1,11] | opcode | R-type  | sra         | 0110011 | 101    | 0000000  |
| JJ-type      | imme           | diate[20,10:1,11 | ,19:12] |        | rd            | opcode |         | or          | 0110011 | 110    | 0000000  |
| J-type       |                | immediate[31:1   | 12]     |        | rd            | opcode |         | and         | 0110011 | 111    | 0000000  |
|              |                |                  |         |        |               |        |         | lr.d        | 0110011 | 011    | 0001000  |
|              |                |                  |         |        |               |        |         | sc.d        | 0110011 | 011    | 0001100  |
|              |                |                  |         |        |               |        |         | lb          | 0000011 | 000    | n.a.     |
|              |                |                  |         |        |               |        |         | lh          | 0000011 | 001    | n.a.     |
|              |                |                  |         |        |               |        |         | iw          | 0000011 | 010    | n.a.     |
|              |                |                  |         |        |               |        |         | id          | 0000011 | 011    | n.a.     |
|              |                |                  |         |        |               |        |         | ibu         | 0000011 | 100    | n.a.     |
|              |                |                  |         |        |               |        |         | ihu         | 0000011 | 101    | n.a.     |
|              |                |                  |         |        |               |        | I-type  | iwu         | 0000011 | 110    | n.a.     |
|              |                |                  |         |        |               |        |         | addi        | 0010011 | 000    | n.a.     |
|              |                |                  |         |        |               |        |         | slli        | 0010011 | 001    | 000000   |
|              |                |                  |         |        |               |        |         | xori        | 0010011 | 100    | n.a.     |
|              |                |                  |         |        |               |        |         | srli        | 0010011 | 101    | 000000   |
|              |                |                  |         |        |               |        |         | srai        | 0010011 | 101    | 010000   |
|              |                |                  |         |        |               |        |         | ori         | 0010011 | 110    | n.a.     |
|              |                |                  |         |        |               |        |         | andi        | 0010011 | 111    | n.a.     |
|              |                |                  |         |        |               |        |         | jalr        | 1100111 | 000    | n.a.     |
|              |                |                  |         |        |               |        |         | sb          | 0100011 | 000    | n.a.     |
|              |                |                  |         |        |               |        | S-type  | sh          | 0100011 | 001    | n.a.     |
|              |                |                  |         |        |               |        | Grype   | SW          | 0100011 | 010    | n.a.     |
|              |                |                  |         |        |               |        |         | sd          | 0100011 | 111    | n.a.     |
|              |                |                  |         |        |               |        |         | beq         | 1100111 | 000    | n.a.     |
|              |                |                  |         |        |               |        |         | bne         | 1100111 | 001    | n.a.     |
|              |                |                  |         |        |               |        | SB-type | blt         | 1100111 | 100    | n.a.     |
|              |                |                  |         |        |               |        | on-type | bge         | 1100111 | 101    | n.a.     |
|              |                |                  |         |        |               |        |         | bltu        | 1100111 | 110    | n.a.     |
|              |                |                  |         |        |               |        |         | bgeu        | 1100111 | 111    | n.a.     |
|              |                |                  |         |        |               |        | U-type  | lui         | 0110111 | n.a.   | n.a.     |
|              |                |                  |         |        |               |        | Illtype | ial         | 1101111 | n 0    | na       |

#### **To Decode an Instruction Word**

| Name         |                             | Fi     | eld    |        |               |                              | Comments                      |  |  |
|--------------|-----------------------------|--------|--------|--------|---------------|------------------------------|-------------------------------|--|--|
| (Field Size) | 7 bits                      | 5 bits | 5 bits | 3 bits | 5 bits        | 7 bits                       |                               |  |  |
| R-type       | funct7                      | rs2    | rs1    | funct3 | rd            | opcode                       | Arithmetic instruction format |  |  |
| l-type       | immediate                   | rs1    | funct3 | rd     | opcode        | Loads & immediate arithmetic |                               |  |  |
| S-type       | immed[11:5]                 | rs2    | rs1    | funct3 | immed[4:0]    | opcode                       | Stores                        |  |  |
| SB-type      | immed[12,10:5]              | rs2    | rs1    | funct3 | immed[4:1,11] | opcode                       | Conditional branch format     |  |  |
| UJ-type      | immediate[20,10:1,11,19:12] |        |        |        | rd            | opcode                       | Unconditional jump format     |  |  |
| U-type       | immediate[31:12]            |        |        |        | rd            | opcode                       | Upper immediate format        |  |  |

- To decode an instruction word: 00578833<sub>hex</sub>
  - 1. Convert to binary: 0000 0000 0101 0111 1000 1000 0011 0011
  - 2. Determine the opcode, the rightmost 7 bits: 011 0011
  - 3. It is R-type arithmetic instruction
  - 4. Decode the rest, func3 and funct7 and then rs1, rs2, rd

| funct7  | rs2   | rs1   | funct3 | rd    | opcode  |
|---------|-------|-------|--------|-------|---------|
| 0000000 | 00101 | 01111 | 000    | 10000 | 0110011 |

5. Add x16, x15, x5