binary

basics

  • A system of storing data using just two digits: 1 and 0.
  • Everything in a computer is ultimately stored in binary (high voltage wire = 1, low voltage wire = 0)
  • Generally rooted in the mathematical concept of binary (as a base 2 system of representing numbers)
  • Since computers “think” in binary, it is often useful to work with values in binary directly. By convention we prefix any binary value with “0b” (e.g. 0b101 = 5)
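A quick sketch of the “0b” convention, assuming Python (which happens to use the same prefix for binary literals). The variable name `value` is only for illustration:

```python
# A binary literal: 0b1011 = 8 + 0 + 2 + 1
value = 0b1011
print(value)             # 11

# Converting between decimal and binary text
print(bin(11))           # 0b1011
print(int("1011", 2))    # 11

# Zero-padded to a fixed width of 8 bits
print(format(5, "08b"))  # 00000101
```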

operations

&, |, ~, ^: convert every operand to binary, then apply the operation to each bit position independently (AND, OR, NOT, XOR)
<<, >>:

  • Left shift: Convert to binary, then move all bits left, appending 0s as needed (shifting by n is equivalent to multiplying by 2^n, as long as no bits fall off the top)
  • Right shift (logical): Convert to binary, then move all bits right, prepending 0s as needed (for unsigned values, shifting by n is equivalent to dividing by 2^n and rounding down)
    for example:
    ![[Pasted image 20240523104246.png]]
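The operators above can be tried out directly. This is a minimal sketch in Python with arbitrarily chosen 4-bit operands. Python integers have no fixed width, so the NOT result is masked to 4 bits here; that mask is an assumption made to mimic a fixed-width register.

```python
a, b = 0b1100, 0b1010

and_result = a & b         # 1 only where both bits are 1
or_result = a | b          # 1 where at least one bit is 1
xor_result = a ^ b         # 1 where the bits differ
not_result = ~a & 0b1111   # flip every bit, then keep the low 4 bits

left = 0b0011 << 2         # 3 * 2**2
right = 0b1100 >> 2        # 12 // 2**2

for name, result in [("and", and_result), ("or", or_result),
                     ("xor", xor_result), ("not", not_result),
                     ("left", left), ("right", right)]:
    print(f"{name:>5}: {result:04b}")
```

For non-negative integers, Python's `>>` behaves like the logical right shift described above.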

signed numbers

![[Pasted image 20240523110950.png]]
Formally:

  • Define a “bias” (a fixed offset, typically negative, e.g. -127 for an 8-bit field)
  • To interpret stored binary: Read the data as an unsigned number, then add the bias
  • To store a data value: Subtract the bias, then store the resulting number as an unsigned number
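The two rules above can be sketched as a pair of functions. This assumes an 8-bit field and a bias of -127. Both values are illustrative choices rather than something fixed by the notes, and the names `BIAS`, `encode`, and `decode` are hypothetical.

```python
BIAS = -127   # assumed bias for an 8-bit field
WIDTH = 8     # assumed field width in bits

def decode(stored: int) -> int:
    """Interpret stored bits: read as unsigned, then add the bias."""
    return stored + BIAS

def encode(value: int) -> int:
    """Store a value: subtract the bias, keep the result as unsigned."""
    stored = value - BIAS
    if not 0 <= stored < 2 ** WIDTH:
        raise ValueError(f"{value} does not fit in {WIDTH} biased bits")
    return stored

print(format(encode(0), "08b"))   # 01111111
print(decode(0b00000000))         # -127
print(decode(0b11111111))         # 128
```

With this choice, the representable range is -127 to 128, and encoding then decoding returns the original value.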

float

fixed point representation

![[Pasted image 20240523113405.png]]
But what about numbers that do not fit a fixed-point format?

  • very large numbers, e.g. 31,556,926,010 (3.1556926010 x 10^10)
  • very small numbers, e.g. 0.000000000052917710 (5.2917710 x 10^-11)

floating point

IEEE standard 754!
Take scientific notation as an example:
![[Pasted image 20240526155108.png]]
Similarly, floating point represents a number as $A \times 2^B$. A single-precision (32-bit) value is split into three fields:

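  • 1 bit: sign bit (s)
  • 8 bits: exponent (B), stored with a bias of 127
  • 23 bits: significand (A), the fraction after an implied leading 1
    $(-1)^s \times (1 + \text{significand}) \times 2^{\text{exponent} - 127}$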
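To check the formula against real hardware values, the three fields can be pulled out of a 32-bit float. This is a sketch using Python's standard `struct` module. The helper names `decode_float32` and `normalized_value` are hypothetical, and the formula only applies to normalized numbers (exponent field between 1 and 254).

```python
import struct

def decode_float32(x: float) -> tuple[int, int, int]:
    """Return (sign, exponent field, fraction field) of x as a 32-bit float."""
    # Pack as big-endian single precision, then reinterpret as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, exponent, fraction

def normalized_value(sign: int, exponent: int, fraction: int) -> float:
    """Apply (-1)^s * (1 + significand) * 2^(exponent - 127)."""
    return (-1) ** sign * (1 + fraction / 2 ** 23) * 2.0 ** (exponent - 127)

sign, exponent, fraction = decode_float32(-6.5)
print(sign, exponent, hex(fraction))              # 1 129 0x500000
print(normalized_value(sign, exponent, fraction)) # -6.5
```

Here -6.5 = -1.625 x 2^2, so the stored exponent is 2 + 127 = 129 and the fraction field holds 0.625.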

special cases

0

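Zero has no normalized representation, because the implied leading 1 makes every normalized value nonzero. It is encoded specially: exponent and significand all zeros.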

large & small numbers

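Exponent fields 0 and 255 are both reserved for special values, which bounds the normalized range: overflow (magnitude more than about 3.4 x 10^38) and underflow (nonzero magnitude less than about 1.2 x 10^-38)!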

$\pm \infty$

IEEE standard: exponent 1111 1111, significand zero for $\pm \infty$

not a number (NaN)

exponent 1111 1111, significand nonzero.
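The special bit patterns for zero, infinity, and NaN can be told apart by looking at the two fields. A minimal sketch, assuming single precision; the helpers `float32_bits` and `classify` are hypothetical names:

```python
import struct

def float32_bits(x: float) -> int:
    """Return the 32-bit pattern of x stored as a single-precision float."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits

def classify(bits: int) -> str:
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0xFF:
        # Exponent all ones: infinity if the significand is zero, else NaN.
        return "infinity" if fraction == 0 else "NaN"
    if exponent == 0 and fraction == 0:
        return "zero"
    return "finite nonzero"

for x in (0.0, 1.0, float("inf"), float("-inf"), float("nan")):
    print(x, classify(float32_bits(x)))
```

The sign bit is ignored here, so both infinities land in the same class.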

Another problem: there is a gap between zero and the smallest normalized FP number!

  • smallest positive normalized number: $2^{-126}$
  • smallest gap between two adjacent normalized numbers: $2^{-126} \times 2^{-23} = 2^{-149}$
    Solution: denormalized numbers (exponent field all zeros, no implied leading 1; implicit exponent for all denorms = -126)
    See the IEEE-754 Floating Point Converter (h-schmidt.net) to experiment with IEEE 754 float conversion.
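The two magnitudes above can be confirmed by building the bit patterns by hand. This sketch assumes single precision and uses a hypothetical helper `bits_to_float32`. The denormalized formula used here, fraction x 2^-23 x 2^-126, follows from dropping the implied leading 1.

```python
import struct

def bits_to_float32(bits: int) -> float:
    """Reinterpret a 32-bit pattern as a single-precision float."""
    (x,) = struct.unpack(">f", struct.pack(">I", bits))
    return x

# Exponent field 1, fraction 0: the smallest positive normalized number.
smallest_normalized = bits_to_float32(0x00800000)

# Exponent field 0, fraction 1: the smallest positive denormalized number.
smallest_denorm = bits_to_float32(0x00000001)

print(smallest_normalized == 2.0 ** -126)   # True
print(smallest_denorm == 2.0 ** -149)       # True

# Denormalized value: (0 + fraction / 2**23) * 2**-126
fraction = 0x400000                          # top fraction bit set, i.e. 0.5
print(bits_to_float32(fraction) == 0.5 * 2.0 ** -126)   # True
```

So denormalized numbers fill the gap between 0 and 2^-126 in even steps of 2^-149.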

other floating point representations

double precision floating point

extend the 32 bits to 64 bits!
sign 1 bit; exponent 11 bits (bias 1023); significand 52 bits!
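The same field extraction works for double precision. This is a sketch assuming the standard IEEE 754 64-bit layout (1 sign bit, 11 exponent bits, 52 significand bits, bias 1023). The helper name `decode_float64` is hypothetical.

```python
import struct

def decode_float64(x: float) -> tuple[int, int, int]:
    """Return (sign, exponent field, fraction field) of a 64-bit double."""
    # Pack as a big-endian double, then reinterpret as an unsigned 64-bit int.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    return sign, exponent, fraction

sign, exponent, fraction = decode_float64(-6.5)
value = (-1) ** sign * (1 + fraction / 2 ** 52) * 2.0 ** (exponent - 1023)
print(sign, exponent - 1023, value)   # 1 2 -6.5
```

Python's own `float` is a double on common platforms, so no precision is lost in the round trip.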