VHDL integer division should be really avoided?
In VHDL there are the math primitive subtraction, addiction, and multiplication that are generally available in the libraries provided by the FPGA or ASIC vendor.
For example, in this post, we saw how to implement a pipelined multiplier. The example shows the use of multiplication and addition primitives.
The division is a bit more complex case. Generally, the deployment of the division requires a much more complex logic circuit, and for this reason, we tend to avoid, where possible, the use of the division operator unless there are special cases.
If we have to divide by 2 or power of two, the implementation is simply shifting to the right of the number to divide.
In binary representation, shifting to the right of a position corresponds to a division by two, as in a decimal representation a shifting to the right corresponds to a division by 10.
More generally when we represent a base number B a shifting to the right of a position corresponds to a division for B, shifting left corresponds to the corresponding multiplication for B.
Returning to the division obviously, you could ask:
what if I have to divide by a number other than 2 or power of two?
In this case, we need to distinguish if the number is constant or no-constant.
If we have to compute the division for a constant x / K where K is a constant we can simply calculate the multiplication
x * 1 / K as explained in this post.
Okay, I understand that dividing for 2 and powers of 2 or for constants is simple,
but if I have to divide by numbers that are not powers of two and are not constant?
Here the problem is a bit complicated…
Again, if we use an FPGA the vendor provides IPs that allow division.
We could also implement division through a cordic.
These solutions have some drawbacks. They are generally demanding in terms of area and they need many clock cycles of execution time.
If we have timing and area problem, we cannot use this strategy.
If we are using modern FPGA, there is an almost simple solution in case the bit number of the divisor number is small, say less than 10/12 bits.
In this case, we can map the division values into a ROM / LUT using the divisor number as the address to get the division output value.
Suppose we have to divide x / y.
We can rewrite the division as x * 1 / y and focus only on implementing 1 / y, the rest is a simple multiplication and we know how to do a VHDL multiplication.
Suppose y is represented by Ny bits and y is positive. In the case of negative y, it is sufficient to consider that 1 / y = – (- 1 / y).
At this point we only have to find the way to represent all y values, that is:
1, 1/2, 1/3, ... 1 / (2 ^ Ny) - 1
To do this we must also determine how many bits we want to use for the 1 / y representation. Assuming the number of output bits is NDy
If we have:
M = (2 ^ NDy) -1
the maximum number that is obtained with NDy bits, that is the value “1” quantized with NDy bit
the division 1 / y represented with NDy will be:
M * 1 / y with y = 1,2, ... (2 ^ Ny) -1
Let’s take an example so much to understand.
y represented by 5 bits: y = 1,2,3 .., 31
1 / y = 1, 1 / 2, 1/3, .., 1/31
suppose we represent the 9-bit division: M = 2 ^ 9-1 = 511 then
1 / y quantized to 9 bit will be:
511, 511/2, 511/3, ..., 511/31
It is clear that these values must be converted into fixed-point values i.e. integer values to be tabulated in a LUT within the FPGA, so we can write that the quantization of 1 / y to 9 bits will be:
round (511 / y) where y = 1,2, .., 31
If we analyze the equation
round (((2 ^ NDy) -1) / ((2 ^ N) -1))
it is clear that to get an integer from the round function
NDy> Ny
below we see an example with Ny = 4 and NDy = 5; Ny = 4 and NDy = 8
it is clear that different values of y collide at the same value 1 / y
for example, y = 10 has the same value as y = 11, y=12
like a rule of thumb NDy = 2 * Ny
in this case, on the right side of Figure3, no collision occurs
If we want to generalize, here is a possible MATLAB or Scilab code that allows us to quantify division 1 / y.
Nt=4; Nq=8; x=[0:2^Nt-1]'; x(1)=1; xInv = (1./x); xInvq = round((1./x)*2^((Nq))-1);
The Figures4 show the left 1 / y value in floating point, to the right the 8-bit quantized representation, where y is represented by 4 bits.
Note that the value of y = 0 has been forced to 2 ^ NDy-1
As we know about dividing for zero it is not possible but still, we have to handle this possibility.
In the LUT index, 0 value will depend on how we want to manage the division by 0.
VHDL code for division implementation in FPGA
We can implement the LUT performing a division directly in VHDL as a constant initialized in the VHDL code.
The LUT/ROM implementation di demanded to the VHDL synthesizer. Generally speaking, when we are going to implement a LUT with a great number of bit in terms of Ny and NDy the synthesizer will try to map the LUT into the internal ROM macro.
Vice-versa if the LUT contains few number of bits (Ny and NDy are a small number) it will be implemented directly into the FPGA logic and NO RAM/ROM block will be used.
Here below an example of VHDL code implementing a LUT division where
input: 4 bit
output: 8 bit
The LUT has been generated using the MATLAB/Scilab code above.
library ieee ; use ieee.std_logic_1164.all; use ieee.numeric_std.all; entity division_lut is port( y : in std_logic_vector(3 downto 0); y_inv : out std_logic_vector(7 downto 0)); end division_lut; architecture rtl of division_lut is constant C_NY : integer:= 4; constant C_NDY : integer:= 8; type t_divition_lut is array (0 to 2**C_NY-1) of integer range 0 to 2**C_NDY-1; constant C_DIV_LUT : t_divition_lut := ( 255, 255, 127, 84, 63, 50, 42, 36, 31, 27, 25, 22, 20, 19, 17, 16); begin y_inv <= std_logic_vector(to_unsigned(C_DIV_LUT(to_integer(unsigned(y))),C_NDY)); end rtl;
Conclusion
In this post, we learn how to implement a division in VHDL. The post showed how to deal with a division in fixed point arithmetic. When the number of bits used to represent the number is small (on the range of 8-12 bits), we can use the LUT architecture to implement the division.
The quantization has been implemented using Scilab (compatible with MATLAB) commands, and an example of VHDL implementation is given at the end of the post.
References
[1] https://www.altera.com/products/fpga/stratix-series/stratix-10/overview.html
[2] https://www.mathworks.com/
[4] https://en.wikipedia.org/wiki/Division_algorithm
[5] https://en.wikipedia.org/wiki/CORDIC
Thanks a lot for this valuable post.. it is really a big effort you are doing in this website.. keep it up and thanks again for that ..
wish you all the best .
Azoz Omar
Thank You very much, Azoz
I really appreciate your feedback!
Yes, interesting, particularly the division by a constant.
Thank you Francesco!
thanks for this tutorial.
I have one question.for this methode we must know the division output.do i speak right?
thank you for this tutorial, i have one question :
for use this methode must know the division output.
Do i speak right?
What do you mean “must know the division output”?
The code computes the division x/y. No previous knowledge of the result is needed.
In the algo you perform 1/y and then multiply x * 1/y
If you need to refresh how to multiply two number you can refer to this post:
http://surf-vhdl.com/how-to-implement-pipeline-multiplier-vhdl/
Very nice, Thanks!!!
Was very useful implementation.
Thanks for this great tutorial. Assuming we always know y = 10, now how to implement? That means we have only divided by 10 …
I don’t understand your question, may you reword, please?
Many thanks for this post. What if the number of bits is more than 12? I have a dividend of 32 bits and a divisor of 16 bits. Have you any code for that?
Thanks a lot!