How To Implement Shift-Register in VHDL Using a FIFO

When you implement a digital design, one of the most used building blocks is a pipeline, or digital delay line. For instance, you may need to compensate for the delay, in clock cycles, between two branches of a digital circuit.

Figure1 shows a possible example: a correction value has to be subtracted from each ADC input sample, and the correction is itself computed from the ADC samples. In the figure, the “Processing Block” computes the required function over the current ADC samples, and the resulting correction is then subtracted from the ADC input samples. Since the processing takes a number of clock cycles, the ADC samples have to be delayed by the same number of cycles, using a feed-forward architecture implemented as a delay line.

Figure1 – An example of digital delay line requirement
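
The feed-forward path of Figure1 could be sketched as follows. This is only an illustration under stated assumptions: the entity name "adc_correction", the generic "G_LATENCY" and the port names are hypothetical, the processing block is assumed to have a fixed, known latency, and the samples are assumed to be signed two's complement values.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Illustrative sketch only: names and generics are assumptions, not from the article.
entity adc_correction is
generic(
  G_WIDTH   : integer := 8;
  G_LATENCY : integer := 4);  -- assumed latency of the processing block, in clock cycles (>= 1)
port (
  i_clk        : in  std_logic;
  i_adc        : in  std_logic_vector(G_WIDTH-1 downto 0);  -- current ADC sample
  i_correction : in  std_logic_vector(G_WIDTH-1 downto 0);  -- processing block output, G_LATENCY cycles late
  o_corrected  : out std_logic_vector(G_WIDTH-1 downto 0));
end adc_correction;

architecture rtl of adc_correction is

type t_pipe is array(0 to G_LATENCY-1) of std_logic_vector(G_WIDTH-1 downto 0);
signal adc_pipe : t_pipe;

begin

p_delay : process(i_clk)
begin
  if(rising_edge(i_clk)) then
    -- delay the raw samples by G_LATENCY cycles so they line up with i_correction
    adc_pipe    <= i_adc & adc_pipe(0 to G_LATENCY-2);
    -- subtract the correction from the time-aligned sample (result wraps on overflow)
    o_corrected <= std_logic_vector(signed(adc_pipe(G_LATENCY-1)) - signed(i_correction));
  end if;
end process p_delay;

end rtl;
```

In this sketch, G_LATENCY has to equal the latency of the processing block; otherwise the correction is subtracted from the wrong sample.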

 

The delay line can be implemented in different ways. An FPGA offers several solutions, depending on how many clock cycles you need to compensate and on the device you are using:

  • Shift register pipeline built from flip-flops
  • Shift register pipeline mapped to block RAM or internal LUTs, depending on the technology
  • Delay line implemented as a FIFO

Shift-Register Implementation of Digital Delay Line

The register pipeline has a straightforward implementation. A VHDL example of a pipeline delay line can be:

library ieee;
use ieee.std_logic_1164.all;

entity shift_register_parametric is
generic(
  G_WIDTH                 : integer := 8;
  G_DEPTH                 : integer := 24);
port (
  i_clk                       : in  std_logic;
  i_rstb                      : in  std_logic;
  i_d                         : in  std_logic_vector(G_WIDTH-1 downto 0);
  o_q                         : out std_logic_vector(G_WIDTH-1 downto 0));
end shift_register_parametric;

architecture rtl of shift_register_parametric is

type t_q_pipe is array(0 to G_DEPTH-1) of std_logic_vector(G_WIDTH-1 downto 0);
signal q_pipe          : t_q_pipe;

begin

process_pipe : process(i_clk,i_rstb)
begin
  if(i_rstb='0') then
    q_pipe   <= (others=>(others=>'0'));
  elsif(rising_edge(i_clk)) then
  
    -- shift by one stage: i_d enters at index 0, the oldest sample moves to the last index
    q_pipe   <= i_d&q_pipe(0 to q_pipe'length-2);
    
  end if;
end process process_pipe;

o_q <= q_pipe(q_pipe'length-1);

end rtl;

 

If you remove the reset signal “i_rstb”, the synthesizer can implement the shift register in dedicated FPGA logic resources, if available. The VHDL code for this optimized shift register implementation, using internal block RAM or LUTs, can be:

 

library ieee;
use ieee.std_logic_1164.all;

entity delay_line_parametric is
generic(
  G_WIDTH                 : integer := 8;
  G_DEPTH                 : integer := 24);
port (
  i_clk                       : in  std_logic;
  i_d                         : in  std_logic_vector(G_WIDTH-1 downto 0);
  o_q                         : out std_logic_vector(G_WIDTH-1 downto 0));
end delay_line_parametric;

architecture rtl of delay_line_parametric is

type t_q_pipe is array(0 to G_DEPTH-1) of std_logic_vector(G_WIDTH-1 downto 0);
signal q_pipe          : t_q_pipe;

begin

process_pipe : process(i_clk)
begin
  if(rising_edge(i_clk)) then
  
    -- same shift as before; without a reset the pipeline can be mapped to RAM/LUT resources
    q_pipe   <= i_d&q_pipe(0 to q_pipe'length-2);
    
  end if;
end process process_pipe;

o_q <= q_pipe(q_pipe'length-1);

end rtl;

 

In Altera Cyclone III FPGA technology, for example, the shift register is implemented as shown in Figure2:

Figure2 – Quartus II MAP Viewer for Delay Line implementation in Cyclone III FPGA

If the delay line has to store a large number of bits, a FIFO implementation can be a very efficient solution.

 

 

FIFO Implementation of Digital Delay Line

In this case, the delay line is implemented using a synchronous FIFO memory. As shown in Figure3, the input signal “i_rstb” enables writing data into the FIFO when high, and resets the delay counter and the FIFO control logic when low. In other words, “i_rstb” acts as a synchronous reset/enable of the delay line. This is an efficient approach when a long delay line is required.

Figure3 – FIFO Implementation of Digital Delay Line

The input data and the reset signal are registered on the input clock, so that the FIFO is driven only by signals synchronous to that clock. Generally, the reset of the FPGA's synchronous FIFO macro is expected to be synchronous to the FIFO clock.

A possible VHDL implementation of the delay line as a FIFO in an FPGA could be:

library ieee;
use ieee.std_logic_1164.all;

entity shift_register_fifo is
generic(
  G_WIDTH                 : integer := 8;   -- shall match the FIFO data width (8 bits for alt_fifo_1kx8)
  G_DEPTH                 : integer := 24); -- G_DEPTH > 3 and less than the FIFO depth
port (
  i_clk                       : in  std_logic;
  i_rstb                      : in  std_logic;
  i_d                         : in  std_logic_vector(G_WIDTH-1 downto 0);
  o_q                         : out std_logic_vector(G_WIDTH-1 downto 0));
end shift_register_fifo;

architecture rtl of shift_register_fifo is

component alt_fifo_1kx8
port (
  aclr        : in std_logic ;
  clock       : in std_logic ;
  data        : in std_logic_vector (7 downto 0);
  rdreq       : in std_logic ;
  wrreq       : in std_logic ;
  almost_full : out std_logic ;
  empty       : out std_logic ;
  q           : out std_logic_vector (7 downto 0));
end component;

signal aclr             : std_logic ;
signal data             : std_logic_vector (7 downto 0);
signal rdreq            : std_logic ;
signal wrreq            : std_logic ;
signal delay_counter    : integer range 0 to G_DEPTH-2;  -- bounded range, so the counter is not sized as a full 32-bit integer

begin

process_control : process(i_clk)
begin
  if(rising_edge(i_clk)) then
    data            <= i_d;  -- resync data in
    if(i_rstb='0') then
      delay_counter   <= 0;
      wrreq           <= '0';
      rdreq           <= '0';
      aclr            <= '1';
    elsif(delay_counter<G_DEPTH-2) then
      delay_counter   <= delay_counter + 1;
      wrreq           <= '1';
      rdreq           <= '0';
      aclr            <= '0';
    else
      wrreq           <= '1';
      rdreq           <= '1';
      aclr            <= '0';
    end if;	
  end if;
end process process_control;

u_alt_fifo_1kx8 : alt_fifo_1kx8
port map(
  aclr        => aclr        ,
  clock       => i_clk       ,
  data        => data        ,
  rdreq       => rdreq       ,
  wrreq       => wrreq       ,
  almost_full => open        ,
  empty       => open        ,
  q           => o_q         );
end rtl;

 

In this entity/architecture pair, the FIFO width shall match the input data width, and the FIFO depth shall be greater than the delay line length.
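
These two constraints can be checked at elaboration time with concurrent assertions. The following is a minimal sketch: the entity name "fifo_delay_generic_check" and the generics "G_FIFO_WIDTH" and "G_FIFO_DEPTH" are hypothetical, and the 1024-word depth is only assumed from the name of the "alt_fifo_1kx8" macro. In practice the same assert statements could be placed directly in the architecture of "shift_register_fifo".

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Illustrative sketch only: entity and FIFO generics are assumptions, not from the article.
entity fifo_delay_generic_check is
generic(
  G_WIDTH      : integer := 8;      -- delay line data width
  G_DEPTH      : integer := 24;     -- delay line length
  G_FIFO_WIDTH : integer := 8;      -- data width of the FIFO macro
  G_FIFO_DEPTH : integer := 1024);  -- assumed depth of the FIFO macro
end fifo_delay_generic_check;

architecture rtl of fifo_delay_generic_check is
begin

-- the FIFO width shall match the input data width
assert G_WIDTH = G_FIFO_WIDTH
  report "G_WIDTH does not match the FIFO data width"
  severity failure;

-- lower bound stated in the G_DEPTH comment of shift_register_fifo
assert G_DEPTH > 3
  report "G_DEPTH shall be greater than 3"
  severity failure;

-- the FIFO depth shall be greater than the delay line length
assert G_DEPTH < G_FIFO_DEPTH
  report "G_DEPTH shall be less than the FIFO depth"
  severity failure;

end rtl;
```

Because the conditions depend only on generics, each assertion is evaluated once at the start of simulation, and a wrong generic setting stops the run immediately.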

If the number of clock cycles of delay is small, the flip-flop delay line approach should be used.


 

Simulation result of Digital Delay Line VHDL Implementation

The simulation waveform of Figure4 compares the outputs of the three delay line implementations:

 

Figure4 – Simulation result of different VHDL implementation of Shift Register

 

  • Shift register delay line implementation
  • Shift register optimized for internal FPGA block RAM implementation
  • Delay line implemented using a FIFO
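
A simulation like this can be made self-checking. The testbench below is a minimal sketch for the "delay_line_parametric" entity only, assuming it has been compiled into the library "work"; the testbench name "tb_delay_line" and its constants are hypothetical. It drives a counting pattern and checks that a synchronous observer sees each sample again G_DEPTH clock cycles later.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Illustrative sketch only: assumes delay_line_parametric is compiled into library work.
entity tb_delay_line is
end tb_delay_line;

architecture sim of tb_delay_line is

constant C_WIDTH      : integer := 8;
constant C_DEPTH      : integer := 24;
constant C_CLK_PERIOD : time    := 10 ns;

signal clk  : std_logic := '0';
signal d    : std_logic_vector(C_WIDTH-1 downto 0) := (others=>'0');
signal q    : std_logic_vector(C_WIDTH-1 downto 0);
signal done : boolean := false;

begin

-- free-running clock, stopped when the stimulus is finished
clk <= not clk after C_CLK_PERIOD/2 when not done else '0';

u_dut : entity work.delay_line_parametric
generic map(
  G_WIDTH => C_WIDTH,
  G_DEPTH => C_DEPTH)
port map(
  i_clk => clk,
  i_d   => d,
  o_q   => q);

p_stim : process
begin
  for n in 0 to 4*C_DEPTH-1 loop
    -- sample number n is applied before rising edge number n
    d <= std_logic_vector(to_unsigned(n mod 2**C_WIDTH, C_WIDTH));
    wait until rising_edge(clk);
    -- at edge n the output still holds the value registered at the previous edge,
    -- which is the sample applied C_DEPTH edges earlier
    if(n >= C_DEPTH) then
      assert q = std_logic_vector(to_unsigned((n - C_DEPTH) mod 2**C_WIDTH, C_WIDTH))
        report "delay mismatch at sample " & integer'image(n)
        severity error;
    end if;
  end loop;
  done <= true;
  report "tb_delay_line: stimulus finished" severity note;
  wait;
end process p_stim;

end sim;
```

The first C_DEPTH outputs are not checked, because the pipeline without reset holds undefined values until it has been filled.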

It is worth noting that Altera Quartus II implements both versions of the shift register delay line (with and without asynchronous reset) using the internal “altshift_taps” macro, which reduces the internal flip-flop usage.


 

 


 

If you need to contact us, please write to: surf.vhdl@gmail.com


5 thoughts to “How To Implement Shift-Register in VHDL Using a FIFO”

    1. You can code one yourself. Just make a synchronous RAM from an array of std_logic_vector, a simple controller for read/write, and some logic for the flags: full, almost_full, empty, almost_empty.

      The RAM is synchronous write/asynchronous read and is simple dual port (separate read and write addresses).

      library ieee;
      use ieee.std_logic_1164.all;
      use ieee.numeric_std.all;
      use ieee.numeric_std_unsigned.all; -- VHDL-2008 package

      entity alt_fifo_1kx8 is
        port(
          clock       : in  std_logic;
          aclr        : in  std_logic;
          data        : in  std_logic_vector(7 downto 0);
          rdreq       : in  std_logic;
          wrreq       : in  std_logic;
          almost_full : out std_logic;
          empty       : out std_logic;
          q           : out std_logic_vector(7 downto 0)
        );
      end alt_fifo_1kx8;

      architecture Behav of alt_fifo_1kx8 is
        -- 2^10 = 1024 ~ 1K... make addr 10 bits wide
        constant ADDR_WIDTH : positive := 10; -- positive {1,2...N} is a subtype of integer, as is natural {0,1...N}
        constant FULL_THLD  : std_logic_vector(ADDR_WIDTH-1 downto 0) := (others => '1'); -- Full thresh
        constant AFULL_THLD : std_logic_vector(ADDR_WIDTH-1 downto 0) := FULL_THLD-1; -- Almost Full thresh

        signal Rd_addr    : std_logic_vector(ADDR_WIDTH-1 downto 0) := (others => '0');
        signal Wr_addr    : std_logic_vector(ADDR_WIDTH-1 downto 0) := (others => '0');
        signal Wr_en      : std_logic := '0';
        signal Rd_en      : std_logic := '0';
        signal FIFO_full  : std_logic := '0';
        signal FIFO_empty : std_logic := '0';

        signal fbit_comp, overflow_set, underflow_set : std_logic := '0';
        signal Addr_eq    : std_logic := '0';
        signal Addr_diff  : std_logic_vector(ADDR_WIDTH-1 downto 0) := (others => '0');

        type mem_array is array (0 to (2**ADDR_WIDTH)-1) of std_logic_vector(7 downto 0);
        signal RAM : mem_array;

      begin

        -- synchronous write port
        RAM_P: process(clock)
        begin
          if(rising_edge(clock)) then
            if(Wr_en='1') then
              RAM(to_integer(unsigned(Wr_addr(9 downto 0)))) <= data;
            end if;
          end if;
        end process RAM_P;

        -- asynchronous read port
        q <= RAM(to_integer(unsigned(Rd_addr(9 downto 0))));

        Rd_en <= rdreq and not(FIFO_empty); -- can't rd from an empty FIFO
        process(clock,aclr)
        begin
          if(aclr='1') then
            Rd_addr <= (others => '0');
          elsif(rising_edge(clock)) then
            if(Rd_en='1') then
              Rd_addr <= Rd_addr + 1; -- numeric_std_unsigned allows + 1 incr
            end if;
          end if;
        end process;

        Wr_en <= (not FIFO_full) and wrreq; -- can't write to a full FIFO
        process(clock,aclr)
        begin
          if(aclr='1') then
            Wr_addr <= (others => '0');
          elsif(rising_edge(clock)) then
            if(Wr_en='1') then
              Wr_addr <= Wr_addr + 1; -- numeric_std_unsigned allows + 1 incr
            end if;
          end if;
        end process;

        -- status logic
        fbit_comp   <= Wr_addr(ADDR_WIDTH-1) xor Rd_addr(ADDR_WIDTH-1); -- XOR MSBs of Rd & Wr addresses
        Addr_diff   <= Wr_addr - Rd_addr; -- fill level modulo 2**ADDR_WIDTH (assumed: this line was truncated in the post)
        Addr_eq     <= '1' when (Wr_addr = Rd_addr) else '0';
        FIFO_full   <= '1' when (Addr_diff = FULL_THLD) else '0'; -- full when 2**ADDR_WIDTH-1 words are stored
        FIFO_empty  <= (not fbit_comp) and Addr_eq;
        almost_full <= '1' when (Addr_diff >= AFULL_THLD) else '0';
        empty       <= FIFO_empty;

      end Behav;
