Assembly language

definition

  • Different CPUs implement different sets of instructions.  The set of instructions a particular CPU implements is an Instruction Set Architecture (ISA), and the programming language defined by the ISA is commonly known as an assembly language.
    • Examples: ARM (cell phones), Intel x86 (i9, i7, i5, i3), IBM/Motorola PowerPC (old Macs), MIPS, RISC-V, …

risc-v

CS 61C Reference Card

overall view

system:

  • CPU: compute things fast; but store little data
  • main memory: store more data, but access much slower than computed.

register

  • A register is a CPU component specifically designed to store a small amount of data. Each register stores 32 bits of data (for a 32-bit system) or 64 bits of data (for a 64-bit system). For the purposes of this class, we consider RV32 only (which uses 32-bit registers)
  • This data is purely binary; types do not exist at the assembly level
  • Registers are a hardware component, so once you make the CPU, you can’t change the number of registers available.
    • Can’t “make” a new register when defining a new variable; have to delete an existing register for each variable you make.
  • RISC-V gives access to 32 integer registers:
    • x0-x31
    • x0 always stores 0, so only 31 registers are available to hold varibles

instructions

![[Pasted image 20240627160611.png]]
Each line of RISC-V code is a single instruction, which executes a simple operation on registers
in the format:

  • <instruction name> <destination register> <operands>
  • Ex. “add x5 x6 x7” means “Add the values stored in x6 and x7, and store the result in x5”
  • Commas can be added between registers (“add x5, x6, x7”), but this is optional.
    for comments: # ...
    Far more important than in other languages!!!

addition

add x1, x2, x3 == a = b + c (x1 ->a, x2 -> b, x3 -> c)

subtraction

sub x3,x4,x5 == d = e - f
exercise:

  1. do a = b + c + d - e;
  2. do f = (g + h) - (i + j);

Immediates

numerical constants
add immediate: addi x3, x4, 10 == f = g + 10
No subtract immediates in RISC-V!

load immediate

li rd, imm
e.g.: li rd, 5 == addi rd, x0, 5

register 0

add x3, x4, x0 == f = g

data transfer

main memory

![[Pasted image 20240529085310.png]]

  • 4 bytes together make a word
  • RISC-V uses little-endian to store data: The least significant byte gets stored at the lowest address.

load word

lw rd imm(rs1): Compute imm+rs1, then load the 4 bytes at that address into rd
example: lw x10 12(x5)

  • 0x100+12=0x10C
  • bytes at 0x10C-0x10F: 0x53, 0x42, 0x56, 0x00
  • 32-bit valurte 0x0056 4253

store word

sw rs2 imm(rs1): Compute imm+rs1, then store the 4 bytes of rs2 into that address.
e.g.: sw x10 0(x5) if x5 is 0x100, x10 is 0x1234 5678

  • 0x100+0 = 0x100
  • Since RISC-V is little-endian, the 32-bit value 0x1234 5678 gets split into bytes 0x78 0x56 0x34 0x12
  • So the bytes in memory 0x100, 0x101, 0x102, and 0x103 get set to 0x78, 0x56, x34, and 0x12, respectively.

load byte

lb rd imm(rs1)

store byte

sb rs2 imm(rs1)

example: converting C code to RISC-V

C:

1
2
3
4
5
6
int a = 5;  
char b[] = "string";
int c[10];
uint8_t d = b[3];
c[4] = a+d;
c[a] = 20;

RISC-V:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
li t0 5
sw t0 0(sp)
li t0 0x69727473
sw t0 4(sp)
li t0 0x0000676E
sw t0 8(sp)
lb t0 7(sp)
sb t0 52(sp)
lw t0 0(sp)
lbu t1 52(sp)
add t2 t0 t1
sw t2 28(sp)
li t0 20
lw t1 0(sp)
slli t1 t1 2 #t1*=4
addi t1 t1 12
add t1 t1 sp
sw t0 0(t1)

control flow

new C operator

label

A label is an identifier to a particular line of code

  • Doesn’t count as a line of code itself; merely “points out” a particular line
  • Each label must have a unique name (like variable names)

goto

The goto statement changes the next line to be run to the labelled line

  • The label can be either before or after the goto statement.
    never use goto!!!
  • goto has a tendency to create completely illegible code
  • Generally considered bad practice, except in very specific situations
    • Error handling
    • Jumping out of nested loops
      simple example of goto:
      1
      2
      3
      4
      5
      6
      7
      8
      9
      10
      11
      12
      13
      int* a = malloc(sizeof(int)*1000);
      if(a == NULL) goto ErrorA;
      int* b = malloc(sizeof(int)*1000000);
      if(b == NULL) goto ErrorB;
      int* c = malloc(sizeof(int)*1000000000);
      if(c == NULL) goto ErrorC;
      FILE* d = fopen(filename);
      if(d == NULL) {
                 free(c);
      ErrorC:    free(b);
      ErrorB:    free(a);
      ErrorA:    allocation_failed();
      }

RISC-V instructions for control flow

List of branch instructions:

  • beq rs1 rs2 Label: Branch if EQual
  • bne rs1 rs2 Label: Branch if Not Equal
  • blt rs1 rs2 Label: Branch if Less Than (signed) (rs1 < rs2)
  • bge rs1 rs2 Label: Branch if Greater or Equal (signed)
  • bltu rs1 rs2 Label: Branch if Less Than (unsigned)
  • bgeu rs1 rs2 Label: Branch if Greater or Equal (unsigned)
    example:
    1
    2
    3
    4
    5
    6
    7
    8
    int a = 0;
    for(int i = 0; i < 10; i++) {
        if(i == 7) {
            break;
        }
        a = a + i;
    }
    a = a + 50;
1
2
3
4
5
6
7
8
9
10
11
	  li x10 0        #int a = 0;
      li x5 0         #int i = 0;
Loop:
      li x6 10        #int j = 10;
      bge x5 x6 End   #if(i >= j) goto End;
      li x6 7         #j = 7;
      beq x5 x6 End   #if(i == j) goto End;
      add x10 x10 x5  #a = a + i;
      addi x5 x5 1    #i = i + 1;
      j Loop          #goto Loop;
End:  addi x10 x10 50 #a = a + 50;

bitwise operations

review: C bitwise operations:

  • AND, OR, XOR, and NOT: perform the operation one bit at a time
  • left and right shift: shift the bits of the number left/right
    shift have numerial equivalents in math
  • left shift: multiply by $2^n$ –> fill the new bits with 0s
  • right shift by n -> floor divide by $2^n$ –> dividing an unsigned number should fill the bits with 0s; signed numbers fill with the sign bit

RISC-V operations

add rd rs1 rs2: Add
sub rd rs1 rs2: Subtract
and rd rs1 rs2: Bitwise AND
or rd rs1 rs2: Bitwise OR
xor rd rs1 rs2: Bitwise XOR
sll rd rs1 rs2: Shift Left Logical (rd left rs2 digit, left using the highest of rs1)
srl rd rs1 rs2: Shift Right Logical
sra rd rs1 rs2: Shift Right Arithmetic
so far: ![[Pasted image 20240617151458.png]]

precedures

C functions:

1
2
3
4
5
6
7
8
int foo(int i) {
    if(i == 0) return 0;
    int a = i + foo(i-1);
    return a;
}
int j = foo(3);
int k = foo(100);
int m = j+k;

Calling a function:

  • Set function arguments
  • Goto the start of the function
    During a function call:
  • Keep local scope separate from global scope
  • Perform the desired task of the function
    Returning from a function:
  • Place the return value in a variable that can be accessed
  • Goto the line immediately after the function call

problems in RISC-V

maintaining scope

In RISC-v, however, local scope do not exist. Therefore, we need a way to store variables that no called function can change

returning from a function

We’ll need a way to send in the return address to a function, and jump to that return address when we finish with the function

memory model of RISC-V

almost the same as C
Program is also a kind of data. It stored in text section of the memory.
in RISC-v:

  • every (real) instruction is stored as a 32-bit number
  • A special 33rd register called the Program Counter (or PC) keeps track of which line of code is currently being run
  • jumps: jump to the address of the instruction
  • jal rd label: jump and link: jump to the given label and sets rd to PC+4 (for function calls)
  • j label: jump to the given label; used for unconditional jumps
  • jalr rd rs1 imm: Jumps to the instruction at address rs1+imm, and sets rd to PC+4 (used for higher-order functions and some function calls)
  • jr rs1: jump to the instruction at address rs1 (return from a function)

stack

One of our registers (by convention x2, nicknamed sp, or “stack pointer”) is set to the bottom of the stack. A function can choose to create a stack frame, by manipulating sp.

  • Anything above the sp at the start of a function belongs to another function. You may not modify anything above the sp without permission.
  • Everything below the sp is safe to modify.
    • But anyone else can modify it, so you can’t leave data there and expect it to stay the same
  • By decrementing the sp, we can allocate as much space as we need for our function, that we can use however we want.
  • After finishing a function call, the sp must be set to its value from before the function call
    自己的理解:相当于寄存器sp存储了一个栈的底部,然后当调用函数时将这个栈的底部向下移一定的位数形成自己的栈(addo sp sp -8),调用玩了之后再把sp还原回原来的位置

converting a C function into RISC-V

the function is:

1
2
3
4
5
6
7
8
int foo(int i) {
    if(i == 0) return 0;
    int a = i + foo(i-1);
    return a;
}
int j = foo(3);
int k = foo(100);
int m = j+k;

step1: define how the function plans to use registers
Inputs:
i: x10

  • We’ll call this register “a0” for “argument”
    Return Address: x1
  • We’ll call this register “ra”
    Output: x10 (Yes, we’ll reuse a0 for the return value)
    Stack Pointer: x2
  • Nicknamed “sp”
    Register that will NOT be changed by foo: x8, x9
  • We can still use these registers, as long as they get restored by the end of the function
  • We’ll call these registers “s0” and “s1” for “saved”
    1
    2
    3
    4
    5
    6
    7
    li a0 3      # int j = foo(3);
    jal ra foo   # call foo
    mv s0 a0     # mv rd rs1 sets rd = rs1
    li a0 100    # int k = foo(100);
    jal ra foo   # call foo
    mv s1 a0     # Saves return value in s1
    add a0 s0 s1 # int m = j+k;
    the function foo:
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    foo: # int foo(int i)
    addi sp sp -8 #Prologue
    sw ra 0(sp)   #Prologue
    sw s0 4(sp)   #Prologue
    mv s0 a0      #Move i
    bne s0 x0 next
    li a0 0
    j Epilogue #     if(i == 0) return 0;
    Next:
    addi t0 s0 -1 #int j = i - 1;
    mv a0 t0
    jal ra foo    #j = foo(j);
    mv t0 a0
    add a0 s0 t0  #int a = i + j;
    Epilogue:
    lw ra 0(sp)   #Epilogue
    lw s0 4(sp)   #Epilogue
    addi sp sp 8  #Epilogue
    jr ra         #return a;

calling convention

Standardize a set of conventions that everyone agrees to follow in the functions of each register
![[Pasted image 20240623140053.png]]

  • zero: The x0 register, which always stores 0
  • ra: x1, which is used to store return addresses
    • Two new pseudoinstructions that explicitly use this:
      • jal Label -> jal ra Label
      • ret       -> jr ra
  • sp: The x2 register, which is the stack pointer
  • s0-s11: Saved registers
    • Registers that do not need to be restored by a called function (i.e. if you want to save a variable in this register, it needs to be saved somewhere before you call another function)
  • a0-a7: Registers used for function arguments
    • a0, a1 also used for function outputs
  • If a function needs more than 8 arguments, can use the stack to store more arguments
  • t0-t6: temporary Registers
  • gp: The x3 register, used to store a reference to the heap. Also called the “global pointer”
  • tp: The x4 register, used to store separate stacks for threads

all remaining RISC-V instructions

  • slt, slti, sltu, sltiu: Set Less Than
    • slt rd rs1 rs2: Compares rs1 to rs2. If rs1 < rs2 (signed), sets rd to 1. Otherwise sets rd to 0.
  • lh, lhu, sh
    • Load/Store Halfword: Same as lb/lbu/sb, but for 2-byte blocks
  • ebreak, ecall: Environment Break/Call
    • Asks the computer to do something (ex. Print data, set a breakpoint for debugging, allocate heap space)
    • We’ll provide utility functions that call ecall/ebreak for you
  • lui, auipc: will be mentioned in the following part!

RISC-V Instruction Formats

简单的概括: 程序也是数据,研究存储程序的数据
![[Pasted image 20240624121656.png]]

R-types

![[Pasted image 20240624121306.png]]

  • Designed for instructions with 3 registers and no immediate
    • Arithmetic operators like add or sub
  • Each register is identified by its number. 32 registers → 5 bits to identify one register uniquely
    • 0x0 → 0b00000
    • a0 → x10 → 0b01010
  • opcode: Instruction identifier: Always the last 7 bits of the instruction over all instruction formats
    • Can therefore be used to determine which instruction format is currently in use.
  • Some sets of similar instructions get assigned the same opcode
    • Ex. All arithmetic R-type instructions have the opcode 0x33
  • funct3: 3-bit identifier to differentiate instructions with the same opcode
  • funct7: Extra 7-bit identifier for extremely similar instructions with the same opcode and funct3 (such as sra and srl)
  • ![[Pasted image 20240624121449.png]]

I-type

![[Pasted image 20240624121318.png]]
Designed for instructions with 2 registers (rs1 and rd) and 1 immediate

  • Most components are stored the same way as before, with the addition of the imm component
  • Most instructions use signed immediates, so our range for I-type immediates is [-2048,2047].
    ![[Pasted image 20240624121541.png]]
    ![[Pasted image 20240624121526.png]]
    ![[Pasted image 20240624121549.png]]

S-type

![[Pasted image 20240624121639.png]]
similar to I-type, but store the immediate separately
![[Pasted image 20240624121726.png]]

U-type

![[Pasted image 20240624143102.png]]

  • Load Upper Immediate: lui rd imm
    • Sets rd to imm << 12
    • example: How would you translate “li t0 0xABCDEFFF” to instructions?
    • lui t0 0xABCDF #t0 stores 0xABCDF000
      addi t0 t0 0xFFF #t0 stores 0xABCDEFFF
  • Upper Immediate to Program Counter: auipc rd imm
    • Sets rd to (imm << 12) + PC
  • Primarily used in two pseudoinstructions:
    • li rd imm: Set rd to imm
    • la rd Label: Set rd to the address of Label
      ![[Pasted image 20240624143227.png]]

B-type

![[Pasted image 20240624143201.png]]

  • Branch instructions also use 2 source registers and an immediate, so the format is similar to S-Type
    • This format is sometimes referred to as SB-type for that reason
  • Note that the immediate is stored in a strange pattern
    • If we had the binary 0bA BCDE FGHI JKLM (where each letter was a bit), the first box would store 0bACD EFGH and the second box would store 0bI JKLB. Bit M isn’t stored.
    • This is also to simplify the underlying circuit; note that we put the MSB of our immediate in the MSB of our instruction (to simplify sign-extension), and other than that put 10 of the remaining 11 bits in the same position as S-type instructions
  • Branch instructions have 13-bit immediates = [-4096, 4094] range, which is up to 210 instructions up/down.
    ![[Pasted image 20240624143551.png]]

J-types

![[Pasted image 20240624143745.png]]

  • Jal instructions use only 1 destination and an immediate, so we can use the U-type format for extra immediate bits
    • This format is sometimes referred to as UJ-type for that reason
  • Note that the immediate is stored in an even stranger pattern
    • If we had the binary 0bA BCDE FGHI JKLM NOPQ RSTU (where each letter was a bit), the data would be stored as 0b AKLM NOPQ RSTJ BCDE FGHI. As before, the last bit isn’t stored
    • Note that we put the MSB of our immediate in the MSB of our instruction, bits 19-12 in the same spot as U-types, and bits 10-1 in the same spot as I-types.
  • Jumps have 21-bit immediates, so up to 218 instructions up/down

summary

you can get the summary card here.