Serial Peripheral Interface Introduction
The Serial Peripheral Interface (SPI) bus is a synchronous serial communication interface specification used for short-distance communication, primarily in embedded systems. The interface was developed by Motorola and has become a de facto standard. Typical applications include sensors, Secure Digital cards, and liquid crystal displays.
SPI devices communicate in full-duplex mode using a master-slave architecture with a single master. The SPI master device originates the frame for reading and writing. Multiple SPI slave devices are supported through selection with individual slave select (SS) lines, as in Figure 2.
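As an illustration of the individual slave select lines, the sketch below routes one active-low select from the master to one of several slaves. It is only a minimal example: the entity name spi_ss_decoder, the generic NUM_SLAVES and all port names are hypothetical and are not part of the controller presented later.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper (not part of the SPI controller in this article):
-- routes a single active-low slave select to one of NUM_SLAVES SS lines.
entity spi_ss_decoder is
  generic (
    NUM_SLAVES : positive := 4);  -- assumed number of slaves on the bus
  port (
    i_ss_n : in  std_logic;                                 -- active-low select from the master
    i_sel  : in  natural range 0 to NUM_SLAVES-1;           -- index of the addressed slave
    o_ss_n : out std_logic_vector(NUM_SLAVES-1 downto 0));  -- one SS line per slave
end spi_ss_decoder;

architecture rtl of spi_ss_decoder is
begin

  p_decode : process(i_ss_n, i_sel)
  begin
    o_ss_n        <= (others => '1');  -- deselect every slave by default
    o_ss_n(i_sel) <= i_ss_n;           -- only the addressed slave sees the select
  end process p_decode;

end rtl;
```

Only one SS line can be low at a time with this arrangement, so at most one slave drives MISO during a transfer.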

SPI is a four-wire serial bus, as shown in Figure 1 and Figure 2.

For further information, see the Wikipedia page dedicated to SPI.
SPI controller architecture overview
The SPI bus specifies four logic signals:
- SCLK : Serial Clock (output from master).
- MOSI : Master Output, Slave Input (output from master).
- MISO : Master Input, Slave Output (output from slave).
- SS : Slave Select (active low, output from master).
Other names are in use for these signals, but their functions are the same.
An SPI timing example is shown in Figure 4. MOSI can be updated on either the rising or the falling edge of SCLK; the receiving device then samples it on the opposite edge, and the same holds for MISO.

Data transmission begins on the falling edge of SS, after which N clock cycles are provided. MOSI is driven with the output data payload, which can contain either data or a command. If MOSI carries a command, for example a read command, MISO is driven after a programmed number of SCLK cycles with the serial data value read from the slave.
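The transfer described above can be mimicked in simulation with a small behavioral slave. The sketch below is simulation-only and rests on several assumptions: SCLK idles high, data is launched on the falling edge and sampled on the rising edge, and bits are sent MSB first. The entity name spi_slave_model and its ports are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical simulation-only slave model (assumptions: SCLK idles high,
-- data launched on the falling edge, sampled on the rising edge, MSB first).
entity spi_slave_model is
  generic (
    N : positive := 8);  -- number of bits per transfer
  port (
    i_sclk    : in  std_logic;
    i_ss      : in  std_logic;                        -- active low
    i_mosi    : in  std_logic;
    o_miso    : out std_logic;
    i_tx_data : in  std_logic_vector(N-1 downto 0);   -- word returned on MISO
    o_rx_data : out std_logic_vector(N-1 downto 0));  -- word received on MOSI
end spi_slave_model;

architecture sim of spi_slave_model is
  signal r_tx : std_logic_vector(N-1 downto 0) := (others => '0');
  signal r_rx : std_logic_vector(N-1 downto 0) := (others => '0');
begin

  -- Launch one MISO bit on every falling edge of SCLK while selected.
  p_tx : process(i_ss, i_sclk, i_tx_data)
  begin
    if (i_ss = '1') then
      r_tx   <= i_tx_data;  -- reload the reply while deselected
      o_miso <= 'Z';        -- release the shared MISO line
    elsif falling_edge(i_sclk) then
      o_miso <= r_tx(N-1);
      r_tx   <= r_tx(N-2 downto 0) & '0';
    end if;
  end process p_tx;

  -- Sample MOSI on every rising edge of SCLK while selected.
  p_rx : process(i_sclk)
  begin
    if rising_edge(i_sclk) then
      if (i_ss = '0') then
        r_rx <= r_rx(N-2 downto 0) & i_mosi;
      end if;
    end if;
  end process p_rx;

  o_rx_data <= r_rx;

end sim;
```

After N clock cycles with SS low, o_rx_data holds the word shifted in on MOSI, while the master has received i_tx_data on MISO.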
SPI controller VHDL implementation
Before writing the SPI controller VHDL code, let’s review the SPI controller architecture of Figure 5.

- i_clk : input clock
- i_rstb : input power-on reset, active low
- i_tx_start : input, starts sending i_data_parallel on the serial line
- o_tx_end : serial data sending terminated
- i_data_parallel : parallel data to send
- o_data_parallel : parallel received data
- o_sclk : serial clock output
- o_ss : slave select
- o_mosi : serial data output
- i_miso : serial data input

The SPI controller VHDL code implements the FSM described in Figure 6. The input parallel data is sent when the “i_tx_start” input is asserted. The FSM then goes to the “ST_TX_RX” state for a programmed number of clock cycles. During the data transmission, the MISO input is sampled into the internal shift register. At the end of the transmission, the parallelized input is available on the “o_data_parallel” output port and a pulse is generated on the “o_tx_end” port. A possible VHDL implementation of the SPI controller is given below:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity spi_controller is
generic(
  N       : integer := 8;      -- number of bits to serialize
  CLK_DIV : integer := 100 );  -- input clock divider to generate output serial clock; o_sclk frequency = i_clk/(2*CLK_DIV)
port (
  i_clk           : in  std_logic;
  i_rstb          : in  std_logic;
  i_tx_start      : in  std_logic;  -- start TX on serial line
  o_tx_end        : out std_logic;  -- TX data completed; o_data_parallel available
  i_data_parallel : in  std_logic_vector(N-1 downto 0);  -- data to send
  o_data_parallel : out std_logic_vector(N-1 downto 0);  -- received data
  o_sclk          : out std_logic;
  o_ss            : out std_logic;
  o_mosi          : out std_logic;
  i_miso          : in  std_logic);
end spi_controller;

architecture rtl of spi_controller is

type t_spi_controller_fsm is (
  ST_RESET ,
  ST_TX_RX ,
  ST_END   );

signal r_counter_clock     : integer range 0 to CLK_DIV*2;
signal r_sclk_rise         : std_logic;
signal r_sclk_fall         : std_logic;
signal r_counter_clock_ena : std_logic;
signal r_counter_data      : integer range 0 to N;
signal w_tc_counter_data   : std_logic;
signal r_st_present        : t_spi_controller_fsm;
signal w_st_next           : t_spi_controller_fsm;
signal r_tx_start          : std_logic;  -- start TX on serial line
signal r_tx_data           : std_logic_vector(N-1 downto 0);  -- data to send
signal r_rx_data           : std_logic_vector(N-1 downto 0);  -- received data

begin

w_tc_counter_data <= '0' when(r_counter_data>0) else '1';

--------------------------------------------------------------------
-- FSM
p_state : process(i_clk,i_rstb)
begin
  if(i_rstb='0') then
    r_st_present <= ST_RESET;
  elsif(rising_edge(i_clk)) then
    r_st_present <= w_st_next;
  end if;
end process p_state;

-- next-state logic
p_comb : process(
  r_st_present      ,
  w_tc_counter_data ,
  r_tx_start        ,
  r_sclk_rise       ,
  r_sclk_fall       )
begin
  case r_st_present is
    when ST_TX_RX =>
      -- leave after the last bit has been sampled
      if (w_tc_counter_data='1') and (r_sclk_rise='1') then
        w_st_next <= ST_END ;
      else
        w_st_next <= ST_TX_RX ;
      end if;

    when ST_END =>
      if(r_sclk_fall='1') then
        w_st_next <= ST_RESET ;
      else
        w_st_next <= ST_END ;
      end if;

    when others => -- ST_RESET
      if(r_tx_start='1') then
        w_st_next <= ST_TX_RX ;
      else
        w_st_next <= ST_RESET ;
      end if;
  end case;
end process p_comb;

-- registered outputs and shift registers
p_state_out : process(i_clk,i_rstb)
begin
  if(i_rstb='0') then
    r_tx_start          <= '0';
    r_tx_data           <= (others=>'0');
    r_rx_data           <= (others=>'0');
    o_tx_end            <= '0';
    o_data_parallel     <= (others=>'0');
    r_counter_data      <= N-1;
    r_counter_clock_ena <= '0';
    o_sclk              <= '1';
    o_ss                <= '1';
    o_mosi              <= '1';
  elsif(rising_edge(i_clk)) then
    r_tx_start <= i_tx_start;

    case r_st_present is
      when ST_TX_RX =>
        o_tx_end            <= '0';
        r_counter_clock_ena <= '1';
        if(r_sclk_rise='1') then
          -- rising edge of SCLK: sample MISO
          o_sclk    <= '1';
          r_rx_data <= r_rx_data(N-2 downto 0)&i_miso;
          if(r_counter_data>0) then
            r_counter_data <= r_counter_data - 1;
          end if;
        elsif(r_sclk_fall='1') then
          -- falling edge of SCLK: drive the next MOSI bit, MSB first
          o_sclk    <= '0';
          o_mosi    <= r_tx_data(N-1);
          r_tx_data <= r_tx_data(N-2 downto 0)&'1';
        end if;
        o_ss <= '0';

      when ST_END =>
        o_tx_end            <= r_sclk_fall;
        o_data_parallel     <= r_rx_data;
        r_counter_data      <= N-1;
        r_counter_clock_ena <= '1';
        o_ss                <= '0';

      when others => -- ST_RESET
        r_tx_data           <= i_data_parallel;
        o_tx_end            <= '0';
        r_counter_data      <= N-1;
        r_counter_clock_ena <= '0';
        o_sclk              <= '1';
        o_ss                <= '1';
        o_mosi              <= '1';
    end case;
  end if;
end process p_state_out;

-- clock divider: generates one-cycle pulses marking the SCLK edges
p_counter_clock : process(i_clk,i_rstb)
begin
  if(i_rstb='0') then
    r_counter_clock <= 0;
    r_sclk_rise     <= '0';
    r_sclk_fall     <= '0';
  elsif(rising_edge(i_clk)) then
    if(r_counter_clock_ena='1') then -- sclk = '1' by default
      if(r_counter_clock=CLK_DIV-1) then -- first edge = fall
        r_counter_clock <= r_counter_clock + 1;
        r_sclk_rise     <= '0';
        r_sclk_fall     <= '1';
      elsif(r_counter_clock=(CLK_DIV*2)-1) then
        r_counter_clock <= 0;
        r_sclk_rise     <= '1';
        r_sclk_fall     <= '0';
      else
        r_counter_clock <= r_counter_clock + 1;
        r_sclk_rise     <= '0';
        r_sclk_fall     <= '0';
      end if;
    else
      r_counter_clock <= 0;
      r_sclk_rise     <= '0';
      r_sclk_fall     <= '0';
    end if;
  end if;
end process p_counter_clock;

end rtl;
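A quick way to exercise the controller is a loopback testbench that wires MOSI back to MISO, so the received word should equal the transmitted one. The sketch below is not the testbench used for the figures in this article; the entity name tb_spi_controller, the constants and the reduced CLK_DIV value are assumptions chosen to keep the simulation short. It expects spi_controller to be compiled into the work library.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical loopback testbench for the spi_controller entity above.
entity tb_spi_controller is
end tb_spi_controller;

architecture sim of tb_spi_controller is
  constant C_N       : integer := 8;
  constant C_CLK_DIV : integer := 4;     -- assumed small divider, shortens the run
  constant C_PERIOD  : time    := 10 ns; -- system clock period
  constant C_TX_WORD : std_logic_vector(C_N-1 downto 0) := x"A5"; -- 8-bit test word

  signal clk      : std_logic := '0';
  signal rstb     : std_logic := '0';
  signal tx_start : std_logic := '0';
  signal tx_end   : std_logic;
  signal tx_data  : std_logic_vector(C_N-1 downto 0) := (others => '0');
  signal rx_data  : std_logic_vector(C_N-1 downto 0);
  signal sclk     : std_logic;
  signal ss       : std_logic;
  signal mosi     : std_logic;
  signal miso     : std_logic;
  signal done     : boolean := false;
begin

  -- free-running clock, stopped when the test is done
  clk <= not clk after C_PERIOD/2 when not done else '0';

  -- loopback: the controller receives what it transmits
  miso <= mosi;

  u_dut : entity work.spi_controller
    generic map (
      N       => C_N,
      CLK_DIV => C_CLK_DIV)
    port map (
      i_clk           => clk,
      i_rstb          => rstb,
      i_tx_start      => tx_start,
      o_tx_end        => tx_end,
      i_data_parallel => tx_data,
      o_data_parallel => rx_data,
      o_sclk          => sclk,
      o_ss            => ss,
      o_mosi          => mosi,
      i_miso          => miso);

  p_stim : process
  begin
    wait for 5*C_PERIOD;
    rstb <= '1';                      -- release the active-low reset
    wait until rising_edge(clk);
    tx_data  <= C_TX_WORD;            -- held stable for the whole transfer
    tx_start <= '1';                  -- one-clock start pulse
    wait until rising_edge(clk);
    tx_start <= '0';
    wait until rising_edge(clk) and tx_end = '1';
    assert rx_data = C_TX_WORD
      report "loopback mismatch on o_data_parallel" severity error;
    report "loopback test finished";
    done <= true;
    wait;
  end process p_stim;

end sim;
```

The controller keeps loading i_data_parallel while it is in ST_RESET, so the stimulus holds tx_data stable after the start pulse instead of changing it right away.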
The following figures show the SPI protocol simulation. The ModelSim waveforms make the protocol behavior clear: the input parallel data is serialized on the MOSI output port, and the MISO input data is parallelized on the o_data_parallel port of the SPI controller.
Figure 7 shows an overall simulation view of three SPI cycles.

Figure 8 shows a zoom on the second SPI cycle.

Figure 9 shows a zoom on the SPI cycle start. The system clock period is set to 10 ns in the simulation. SCLK is generated by dividing the system clock by 200: 100 cycles for the high phase and 100 cycles for the low phase, as specified by the generic “CLK_DIV”. The resulting SCLK period is 2 µs (500 kHz).
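The relation between the system clock, the generic CLK_DIV and the SCLK frequency (o_sclk frequency = i_clk/(2*CLK_DIV)) can be captured in a small helper. The package below is only a sketch: the names spi_clk_pkg and f_clk_div are hypothetical, and the integer division truncates, so the actual SCLK frequency may be slightly higher than requested.

```vhdl
-- Hypothetical helper package (not part of the original design).
package spi_clk_pkg is
  -- Returns CLK_DIV such that sclk = clk/(2*CLK_DIV).
  -- Both frequencies are given in Hz; the division truncates.
  function f_clk_div(clk_hz : positive; sclk_hz : positive) return positive;
end package spi_clk_pkg;

package body spi_clk_pkg is

  function f_clk_div(clk_hz : positive; sclk_hz : positive) return positive is
  begin
    -- A result of 0 (sclk_hz too high) violates the positive return
    -- subtype and is reported as an error by the tool.
    return clk_hz / (2 * sclk_hz);
  end function f_clk_div;

end package body spi_clk_pkg;
```

With the simulation values used here (100 MHz system clock, 500 kHz SCLK), f_clk_div(100_000_000, 500_000) evaluates to 100, which matches the default value of CLK_DIV.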

The SPI controller VHDL code above is technology independent and can be implemented either on FPGA or ASIC.
Figure 10 shows the Altera Quartus II RTL viewer representation of the SPI VHDL code above.

The SPI controller VHDL code has been tested on an Altera Cyclone III FPGA with 8-bit serial and parallel data.
The implementation uses 58 logic elements (LEs) and runs at 400 MHz, as reported in the Quartus II area report and timing report below.

