## Intro

Representing fractional numbers in hardware is a real challenge when we’re limited by both area and precision. The usual way to represent them is the IEEE 754 floating point standard, but that standard needs far more storage than we can afford, so we came up with our own convention for storing fixed point numbers.

The diagram above is used in the explanations that follow.

## Representation Format

We usually use the format below to represent fixed point numbers:

**Q[x].[y]**

x: Number of Bits for **Whole Number**

y: Number of Bits for **Decimal Number**

- We need a total of **x + y** bits to represent this format.

**Example:**

- **Q2.3** has 2 whole number bits and 3 decimal number bits, so the decimal (binary) point sits right after the 3rd bit from the LSB.
  - This requires 5 bits to represent.
- **The diagram above** has the format Q3.4.
  - It requires 7 bits to represent.
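As a rough illustration of the format, the sketch below converts a value to and from an unsigned Q[x].[y] bit pattern. The helper names `to_fixed` and `from_fixed` are hypothetical, and rounding to the nearest step is an assumption; a real design might truncate instead.

```python
def to_fixed(value: float, x: int, y: int) -> int:
    """Encode a non-negative value as an unsigned Q[x].[y] bit pattern."""
    raw = round(value * (1 << y))  # scale by 2^y, round to the nearest step
    if not 0 <= raw < (1 << (x + y)):
        raise ValueError(f"{value} does not fit in Q{x}.{y}")
    return raw


def from_fixed(raw: int, y: int) -> float:
    """Decode a bit pattern that has y decimal (fraction) bits."""
    return raw / (1 << y)


raw = to_fixed(2.625, x=2, y=3)   # 2.625 * 8 = 21
print(format(raw, "05b"))         # 10101, read as 10.101
print(from_fixed(raw, y=3))       # 2.625
```

The 5-bit pattern `10101` splits into `10` (whole number) and `101` (decimal number), matching the Q2.3 example.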

### SQ[x].[y]

This is the same as above but there’s an extra sign bit to represent **signed numbers.**

So, we need a total of **x + y + 1** bits to represent this type of number.

**Example:**

**SQ2.3** would require 6 bits to represent, and the decimal point would again sit right after the 3rd bit from the LSB (same as above).
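The text does not say how the sign bit is encoded. The sketch below assumes two's complement, which is only one possible choice (sign-magnitude would also fit the description). The helper names are hypothetical.

```python
def to_signed_fixed(value: float, x: int, y: int) -> int:
    """Encode value as an SQ[x].[y] pattern of x + y + 1 bits.

    Assumption: two's complement encoding of the sign.
    """
    total_bits = x + y + 1
    scaled = round(value * (1 << y))  # scale by 2^y, round to nearest step
    if not -(1 << (x + y)) <= scaled < (1 << (x + y)):
        raise ValueError(f"{value} does not fit in SQ{x}.{y}")
    return scaled & ((1 << total_bits) - 1)  # wrap negatives into total_bits


def from_signed_fixed(raw: int, x: int, y: int) -> float:
    """Decode an SQ[x].[y] two's complement pattern."""
    total_bits = x + y + 1
    if raw & (1 << (total_bits - 1)):  # sign bit is set
        raw -= 1 << total_bits
    return raw / (1 << y)


raw = to_signed_fixed(-1.5, x=2, y=3)
print(format(raw, "06b"))                 # 110100
print(from_signed_fixed(raw, x=2, y=3))   # -1.5
```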

## Divide and Conquer Method in Solving Division and N-th Root of a Number

### Division

- In solving division, we have **A/B = C**.
- We move **B** to the other side of the equation, ending up with **A = CB**.
- Our goal is to guess what **C** is.
- We start guessing **C** by assuming that the **MSB** of **C** is **1**.
- Then, we **multiply** the **“assumed C”** by **B** and compare the result with **A**.
- If the result is less than or equal to **A**, the **MSB** of **C** is **indeed 1**; otherwise, we set the **MSB** of **C** to **0**.
- We then proceed to the **next bit** and repeat until we reach the **last bit**.
- There are 2 ways this operation can stop:
  - When **A = CB** is achieved during our assumptions.
  - When we reach the **LSB** of **C**.

The diagram above is there to give a better understanding of the idea that I’m trying to explain.
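The bit-by-bit guessing for division can be sketched as follows. This is a minimal software model, not a hardware description: it assumes **A** and **B** are non-negative integers, that **C** is produced in an unsigned Q[x].[y] format, and the function name `divide_by_guessing` is hypothetical.

```python
def divide_by_guessing(a: int, b: int, x: int, y: int) -> int:
    """Return the raw Q[x].[y] bit pattern of C = a / b (a >= 0, b > 0)."""
    # C_raw / 2^y * b <= a   is the same test as   C_raw * b <= a * 2^y
    target = a << y
    c = 0
    for bit in reversed(range(x + y)):  # walk from the MSB down to the LSB
        trial = c | (1 << bit)          # assume this bit of C is 1
        product = trial * b
        if product <= target:
            c = trial                   # the assumption holds, keep the 1
            if product == target:
                break                   # A = CB reached exactly, stop early
        # otherwise the bit stays 0
    return c


print(divide_by_guessing(7, 2, x=3, y=4) / 16)  # 3.5
print(divide_by_guessing(1, 3, x=3, y=4) / 16)  # 0.3125
```

The second call shows the other stopping condition: 1/3 has no exact Q3.4 value, so the loop runs to the LSB and returns the largest representable value not above the true quotient.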

### N-th Root of a Number

- This is really similar to the method used in **division**.
- We have **A^(1/B) = C**.
- We move the **power** to the other side of the equation, ending up with **A = C^B**.
- Now, we start guessing **C** again.
- The guessing method is the same as the one we used for **division**.
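The same loop carries over to roots once the comparison is changed from **CB** to **C^B**. The sketch below again assumes a non-negative integer **A**, an unsigned Q[x].[y] result, and a hypothetical function name.

```python
def nth_root_by_guessing(a: int, n: int, x: int, y: int) -> int:
    """Return the raw Q[x].[y] bit pattern of C = a ** (1 / n) (a >= 0, n >= 1)."""
    # (C_raw / 2^y) ** n <= a   is the same test as   C_raw ** n <= a * 2^(y * n)
    target = a << (y * n)
    c = 0
    for bit in reversed(range(x + y)):  # MSB first
        trial = c | (1 << bit)          # assume this bit of C is 1
        power = trial ** n
        if power <= target:
            c = trial                   # keep the 1
            if power == target:
                break                   # A = C^B reached exactly
    return c


print(nth_root_by_guessing(27, 3, x=3, y=4) / 16)  # 3.0
print(nth_root_by_guessing(2, 2, x=3, y=4) / 16)   # 1.375
```

With only 4 decimal bits, the square root of 2 comes out as 1.375, the largest Q3.4 value whose square does not exceed 2.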

## Operating with Fixed Point Numbers

- **Multiplication** isn’t a problem, because the numbers of bits required to represent the **whole number** and the **decimal number** simply add up.
  - E.g. Q2.3 * Q4.5 = Q6.8

- **Addition** and **subtraction** are more of a problem: we have to shift the operands so that their decimal points line up before operating.
  - Usually the shifting of the decimal point in this situation depends on the precision needed by the design.
  - All bits used to represent the **whole number** should remain unchanged, since dropping bits here affects the result **significantly**.
  - The **decimal number** bits can be altered any way the designer wants: more bits give better resolution but cost more area, and vice versa.
- **Adding** two fixed point numbers gives a result with a **maximum** of **max(whole number bits of number 1, whole number bits of number 2) + 1** whole number bits, because of the possible carry.
  - E.g. Q3.4 + Q2.3 = Q4.x

- **Subtracting** two fixed point numbers gives a result with **max(whole number bits of number 1, whole number bits of number 2)** whole number bits, as subtraction produces no carry (assuming unsigned operands and a non-negative result).
  - E.g. Q3.4 - Q2.3 = Q3.x
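The format rules above can be checked with a small model. In this sketch a number is a raw integer plus an `(x, y)` format tuple, and `add` keeps the larger of the two decimal widths. That is just one possible choice for the "x" in Q4.x, since the text leaves it to the designer. Function names are hypothetical.

```python
def multiply(raw_a, fmt_a, raw_b, fmt_b):
    """Multiply two unsigned fixed point numbers; bit widths add up."""
    (xa, ya), (xb, yb) = fmt_a, fmt_b
    return raw_a * raw_b, (xa + xb, ya + yb)


def add(raw_a, fmt_a, raw_b, fmt_b):
    """Add two unsigned fixed point numbers after aligning decimal points."""
    (xa, ya), (xb, yb) = fmt_a, fmt_b
    y = max(ya, yb)                 # assumption: keep the finer resolution
    aligned_a = raw_a << (y - ya)   # shift so both use y decimal bits
    aligned_b = raw_b << (y - yb)
    return aligned_a + aligned_b, (max(xa, xb) + 1, y)  # +1 bit for the carry


# 2.625 in Q2.3 is 21, 1.5 in Q4.5 is 48
product, fmt = multiply(21, (2, 3), 48, (4, 5))
print(fmt, product / (1 << fmt[1]))   # (6, 8) 3.9375

# 5.25 in Q3.4 is 84, 2.625 in Q2.3 is 21
total, fmt = add(84, (3, 4), 21, (2, 3))
print(fmt, total / (1 << fmt[1]))     # (4, 4) 7.875
```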

## Issues with Fixed Point Numbers

- Usually, designs use **16 bits** to represent fixed point numbers.
- We have to keep track of operations like **continuous multiplications**, as they blow up the number of bits required to represent the whole number of the result.
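To make the growth concrete, the sketch below assumes a 16-bit Q4.12 format, chosen purely for illustration since the text does not name a specific split. It shows that dropping decimal bits after each multiplication keeps the resolution fixed, while the whole number width still grows with every step.

```python
def multiply_and_truncate(raw_a: int, raw_b: int, y: int) -> int:
    """Multiply two values with y decimal bits each, then drop y decimal
    bits so the result again has y decimal bits."""
    return (raw_a * raw_b) >> y


def whole_bits_after_chain(x: int, count: int) -> int:
    """Whole number bits needed for a product of `count` Q[x].[y] operands."""
    return x * count


# 1.5 in the assumed Q4.12 format is 1.5 * 4096 = 6144
squared = multiply_and_truncate(6144, 6144, y=12)
print(squared / 4096)                 # 2.25

# Four chained Q4.12 operands need 16 whole number bits on their own,
# so the result no longer fits in a 16-bit word even with 12 decimal bits.
print(whole_bits_after_chain(4, 4))   # 16
```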