MicroZed Chronicles: Using DSP48E2 as Multiplexers

May 6, 2022 


Editor’s Note: This content is republished from the MicroZed Chronicles, with permission from the author.

 

I am a regular reader of many FPGA notice boards. A few days ago, I saw a question about how the DSP48E2 could be used as a multiplexer. The question arose because the developer was running low on logic resources while the DSP elements were unused.

I had come across the Hoplite Network on Chip a few years ago, and one version of it also used DSP48 elements as multiplexers. The Hoplite-DSP version used the DSP48 as a mux to free up logic resources for the FPGA designer.

The DSP48E2 is a very versatile element. In programmable logic, we mainly use it to implement mathematical algorithms such as filters, FFTs, and so on.

Looking at the architecture of the DSP48, however, there are several multiplexers that can be used to switch the data that is fed into the ALU.

We can multiplex a signal by controlling the settings of the X and Y multiplexers and by setting the correct mode for the ALU.

We do this by configuring the ALU to perform an addition, selecting the input we require on the X or Y mux, and setting the other mux to a constant zero. As a result, adding 0 to the desired signal performs the multiplexing.

[Figure: MZ_442_DSP48]

We can multiplex between signals on A:B and C within the DSP48, which enables multiplexing of 48 bits of data. Input A is 30 bits wide and input B is 18 bits wide; the two are concatenated into the 48-bit signal A:B after the dual A and B registers.

Signal A:B is fed into mux X, while signal C is fed into muxes W, Y, and Z. Each of the W, X, Y, and Z multiplexers also has a selectable all-zeros input.

To perform the multiplexing, we use the INMODE, OPMODE, and ALUMODE control inputs to configure the DSP48 to implement one of the following equations.

 

P = A:B + 0

P = C + 0
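As a purely behavioural illustration of these two equations, the sketch below models only the X and Y selection and the ALU addition. It is not the DSP48E2 primitive: the entity name `addmux_model` and the ports `x_sel_ab` and `y_sel_c` are hypothetical, and the pipeline registers and the W and Z multiplexers are left out.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Behavioural sketch only (hypothetical entity, not the DSP48E2 primitive).
-- Models the X/Y selection and the ALU addition; no pipeline registers,
-- no W or Z multiplexers.
entity addmux_model is
  port (
    ab       : in  std_logic_vector(47 downto 0);  -- stands in for A:B
    c        : in  std_logic_vector(47 downto 0);  -- stands in for C
    x_sel_ab : in  std_logic;                      -- '1': X = A:B, '0': X = 0
    y_sel_c  : in  std_logic;                      -- '1': Y = C,   '0': Y = 0
    p        : out std_logic_vector(47 downto 0)
  );
end entity;

architecture behav of addmux_model is
  signal x, y : unsigned(47 downto 0);
begin
  -- Each mux either passes its data input or outputs all zeros.
  x <= unsigned(ab) when x_sel_ab = '1' else (others => '0');
  y <= unsigned(c)  when y_sel_c  = '1' else (others => '0');

  -- With one operand forced to zero, the sum is simply the other operand.
  p <= std_logic_vector(x + y);
end architecture;
```

Driving exactly one of the two select inputs high gives P = A:B + 0 or P = C + 0. If both were high, the model would output the sum, which is why only one mux passes data at a time.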

 

To demonstrate this, I created a simple example in Vivado using the DSP48 template from the language templates. I configured this DSP template so that I could control the opmode to switch between inputs A:B and C. 

The code can be seen below. At the top level, the DSP mux offers the user two 48-bit ports, a and c, and a select signal. Internally, input a is split across DSP ports A (upper 30 bits) and B (lower 18 bits), while input c is connected to DSP port C.

Depending upon the state of the select signal, the OPMODE is changed to select the correct inputs on the X and Y multiplexers.

To output A:B, which is connected to the X mux, we set OPMODE[1:0] to 11 and set all other multiplexers to output zero.

The same approach is taken for C, which is connected to the Y mux: OPMODE[3:2] is set to 11 and all other multiplexers are set to output zero.

[Figures: MZ_442_OPMODE_1, MZ_442_OPMODE_2, MZ_442_OPMODE_3, MZ_442_OPMODE_4]
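One way to make the two OPMODE values easier to read is to build them from named fields. The package below is a minimal sketch: the package and constant names are hypothetical and are not part of the Vivado template. It assumes the DSP48E2 OPMODE layout of W in bits 8:7, Z in bits 6:4, Y in bits 3:2, and X in bits 1:0.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper package; all names are illustrative only.
package dspmux_opmode_pkg is
  -- OPMODE field layout: W = bits 8:7, Z = bits 6:4, Y = bits 3:2, X = bits 1:0
  constant W_ZERO : std_logic_vector(1 downto 0) := "00";   -- W mux outputs 0
  constant Z_ZERO : std_logic_vector(2 downto 0) := "000";  -- Z mux outputs 0
  constant Y_ZERO : std_logic_vector(1 downto 0) := "00";   -- Y mux outputs 0
  constant Y_C    : std_logic_vector(1 downto 0) := "11";   -- Y mux outputs C
  constant X_ZERO : std_logic_vector(1 downto 0) := "00";   -- X mux outputs 0
  constant X_AB   : std_logic_vector(1 downto 0) := "11";   -- X mux outputs A:B

  -- P = A:B + 0, equivalent to "000000011"
  constant OPMODE_SEL_AB : std_logic_vector(8 downto 0) :=
    W_ZERO & Z_ZERO & Y_ZERO & X_AB;

  -- P = C + 0, equivalent to "000001100"
  constant OPMODE_SEL_C : std_logic_vector(8 downto 0) :=
    W_ZERO & Z_ZERO & Y_C & X_ZERO;
end package;
```

With this package in scope, the select logic could be written as `OPMODE <= OPMODE_SEL_AB when sel = '0' else OPMODE_SEL_C;`, which produces the same two bit patterns as the literal strings in the listing.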

Library ieee;

use ieee.std_logic_1164.all;

Library UNISIM;

use UNISIM.vcomponents.all;

entity dspmux is port(

    clk : in std_logic;

    rst : in std_logic;

    a   : in std_logic_vector(47 downto 0);

    c   : in std_logic_vector(47 downto 0);

    sel : in std_logic;

    op  : out std_logic_vector(47 downto 0));

end entity;

architecture rtl of dspmux is

signal ain     : std_logic_vector(29 downto 0);
signal bin     : std_logic_vector(17 downto 0);
signal cin     : std_logic_vector(47 downto 0);
signal ALUMODE : std_logic_vector(3 downto 0);
signal INMODE  : std_logic_vector(4 downto 0);
signal OPMODE  : std_logic_vector(8 downto 0);

begin

INMODE <= (others =>'0');

ALUMODE <= (others =>'0');  -- "0000" selects addition (Z + W + X + Y + CIN)

ain <= a(47 downto 18);

bin <= a(17 downto 0);

cin <= c;

process(sel)

   begin

    if sel = '0' then

       OPMODE <= "000000011";  -- X mux = A:B; W, Y and Z muxes = 0

    else

       OPMODE <= "000001100";  -- Y mux = C; W, X and Z muxes = 0

    end if;

end process;

   DSP48E2_inst : DSP48E2

   generic map (

      -- Feature Control Attributes: Data Path Selection

      AMULTSEL => "A",            -- Selects A input to multiplier (A, AD)

      A_INPUT => "DIRECT",        -- Selects A input source,

      BMULTSEL => "B",            -- Selects B input to multiplier (AD, B)

      B_INPUT => "DIRECT",        -- Selects B input source,

      PREADDINSEL => "A",         -- Selects input to pre-adder (A, B)

      RND => X"000000000000",     -- Rounding Constant

      USE_MULT => "NONE",         -- Select multiplier usage

      USE_SIMD => "ONE48",        -- SIMD selection (FOUR12, ONE48, TWO24)

      USE_WIDEXOR => "FALSE",     -- Use the Wide XOR function

      XORSIMD => "XOR24_48_96",   -- Mode of operation for the Wide XOR

      -- Pattern Detector Attributes: Pattern Detection Configuration

      AUTORESET_PATDET => "NO_RESET",

      AUTORESET_PRIORITY => "RESET",   -- Priority of AUTORESET vs. CEP

      MASK => X"3fffffffffff",         -- 48-bit mask value for pattern

      PATTERN => X"000000000000",      -- 48-bit pattern match for

      SEL_MASK => "MASK",              -- C, MASK,

      SEL_PATTERN => "PATTERN",        -- Select pattern value

      USE_PATTERN_DETECT => "NO_PATDET", -- Enable pattern detect

      -- Programmable Inversion Attributes: Specifies built-in programmable inversion on specific pins

      IS_ALUMODE_INVERTED => "0000",     -- Optional inversion for ALUMODE

      IS_CARRYIN_INVERTED => '0',        -- Optional inversion for CARRYIN

      IS_CLK_INVERTED => '0',            -- Optional inversion for CLK

      IS_INMODE_INVERTED => "00000",     -- Optional inversion for INMODE

      IS_OPMODE_INVERTED => "000000000", -- Optional inversion for OPMODE

      IS_RSTALLCARRYIN_INVERTED => '0',  -- Optional inversion for RSTALLCARRYIN

      IS_RSTALUMODE_INVERTED => '0',     -- Optional inversion for RSTALUMODE

      IS_RSTA_INVERTED => '0',           -- Optional inversion for RSTA

      IS_RSTB_INVERTED => '0',           -- Optional inversion for RSTB

      IS_RSTCTRL_INVERTED => '0',        -- Optional inversion for RSTCTRL

      IS_RSTC_INVERTED => '0',           -- Optional inversion for RSTC

      IS_RSTD_INVERTED => '0',           -- Optional inversion for RSTD

      IS_RSTINMODE_INVERTED => '0',      -- Optional inversion for RSTINMODE

      IS_RSTM_INVERTED => '0',           -- Optional inversion for RSTM

      IS_RSTP_INVERTED => '0',           -- Optional inversion for RSTP

      -- Register Control Attributes: Pipeline Register Configuration

      ACASCREG => 1,                     -- Number of pipeline stages(0-2)

      ADREG => 1,                        -- Pipeline stages for pre-adder

      ALUMODEREG => 1,                   -- Pipeline stages for ALUMODE

      AREG => 1,                         -- Pipeline stages for A (0-2)

      BCASCREG => 1,                     -- Number of pipeline stages(0-2)

      BREG => 1,                         -- Pipeline stages for B (0-2)

      CARRYINREG => 1,                   -- Pipeline stages for CARRYIN

      CARRYINSELREG => 1,                -- Pipeline stages for CARRYINSEL

      CREG => 1,                         -- Pipeline stages for C (0-1)

      DREG => 1,                         -- Pipeline stages for D (0-1)

      INMODEREG => 1,                    -- Pipeline stages for INMODE

      MREG => 1,                         -- Multiplier pipeline stages

      OPMODEREG => 1,                    -- Pipeline stages for OPMODE

      PREG => 1                          -- Number of pipeline stages P

   )

   port map (

      -- Cascade outputs: Cascade Ports

      ACOUT => open,           -- 30-bit output: A port cascade

      BCOUT => open,           -- 18-bit output: B cascade

      CARRYCASCOUT => open,    -- 1-bit output: Cascade carry

      MULTSIGNOUT => open,     -- 1-bit output: Multiplier sign cascade

      PCOUT => open,           -- 48-bit output: Cascade output

      -- Control outputs: Control Inputs/Status Bits

      OVERFLOW => open,        -- 1-bit output: Overflow in add/acc

      PATTERNBDETECT => open,  -- 1-bit output: Pattern bar detect

      PATTERNDETECT => open,   -- 1-bit output: Pattern detect

      UNDERFLOW => open,       -- 1-bit output: Underflow in add/acc

      -- Data outputs: Data Ports

      CARRYOUT => open,         -- 4-bit output: Carry

      P => op,                  -- 48-bit output: Primary data

      XOROUT => open,           -- 8-bit output: XOR data

      -- Cascade inputs: Cascade Ports

      ACIN => (others =>'0'),   -- 30-bit input: A cascade data

      BCIN => (others =>'0'),   -- 18-bit input: B cascade

      CARRYCASCIN => '0',       -- 1-bit input: Cascade carry

      MULTSIGNIN => '0',        -- 1-bit input: Multiplier sign cascade

      PCIN => (others =>'0'),   -- 48-bit input: P cascade

      -- Control inputs: Control Inputs/Status Bits

      ALUMODE => ALUMODE,           -- 4-bit input: ALU control

      CARRYINSEL => (others =>'0'), -- 3-bit input: Carry select

      CLK => CLK,                   -- 1-bit input: Clock

      INMODE => INMODE,             -- 5-bit input: INMODE control

      OPMODE => OPMODE,             -- 9-bit input: Operation mode

      -- Data inputs: Data Ports

      A => AIN,                     -- 30-bit input: A data

      B => BIN,                     -- 18-bit input: B data

      C => CIN,                     -- 48-bit input: C data

      CARRYIN => '0',               -- 1-bit input: Carry-in

      D => (others =>'0'),          -- 27-bit input: D data

      -- Reset/Clock Enable inputs: Reset/Clock Enable Inputs

      CEA1 => '1',        -- 1-bit input: Clock enable for 1st stage AREG

      CEA2 => '1',        -- 1-bit input: Clock enable for 2nd stage AREG

      CEAD => '1',        -- 1-bit input: Clock enable for ADREG

      CEALUMODE => '1',   -- 1-bit input: Clock enable for ALUMODE

      CEB1 => '1',        -- 1-bit input: Clock enable for 1st stage BREG

      CEB2 => '1',         -- 1-bit input: Clock enable for 2nd stage BREG

      CEC => '1',          -- 1-bit input: Clock enable for CREG

      CECARRYIN => '1',    -- 1-bit input: Clock enable for CARRYINREG

      CECTRL => '1',       -- 1-bit input: Clock enable for OPMODEREG and CARRYINSELREG

      CED => '1',          -- 1-bit input: Clock enable for DREG

      CEINMODE => '1',     -- 1-bit input: Clock enable for INMODEREG

      CEM => '1',          -- 1-bit input: Clock enable for MREG

      CEP => '1',          -- 1-bit input: Clock enable for PREG

      RSTA => rst,         -- 1-bit input: Reset for AREG

      RSTALLCARRYIN => rst,-- 1-bit input: Reset for CARRYINREG

      RSTALUMODE => rst,   -- 1-bit input: Reset for ALUMODEREG

      RSTB => rst,         -- 1-bit input: Reset for BREG

      RSTC => rst,         -- 1-bit input: Reset for CREG

      RSTCTRL => rst,      -- 1-bit input: Reset for OPMODEREG and CARRYINSELREG

      RSTD => rst,         -- 1-bit input: Reset for DREG and ADREG

      RSTINMODE => rst,    -- 1-bit input: Reset for INMODEREG

      RSTM => rst,         -- 1-bit input: Reset for MREG

      RSTP => rst          -- 1-bit input: Reset for PREG

   );

end architecture;

Running this in a simple simulation provides the results below, where you can clearly see the output switching between the A and C inputs to the module.
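A simulation of this kind could be driven by a test bench along the following lines. This is a minimal sketch and not the test bench used for the results here: the entity name `dspmux_tb`, the 10 ns clock period, the test patterns, and the length of the reset are all assumptions. It assumes `dspmux` has been compiled into the `work` library alongside the UNISIM simulation models. Each selection is held for four clock edges before checking, which comfortably covers the input and output register stages enabled in the generic map.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical test bench for the dspmux entity shown above.
entity dspmux_tb is
end entity;

architecture sim of dspmux_tb is
  constant CLK_PERIOD : time := 10 ns;  -- assumed clock period

  signal clk  : std_logic := '0';
  signal rst  : std_logic := '1';
  signal a    : std_logic_vector(47 downto 0) := x"AAAAAAAAAAAA";
  signal c    : std_logic_vector(47 downto 0) := x"555555555555";
  signal sel  : std_logic := '0';
  signal op   : std_logic_vector(47 downto 0);
  signal done : boolean := false;
begin
  -- Free-running clock that stops once the stimulus process has finished.
  clk <= not clk after CLK_PERIOD / 2 when not done else '0';

  dut : entity work.dspmux
    port map (clk => clk, rst => rst, a => a, c => c, sel => sel, op => op);

  stimulus : process
  begin
    -- Hold reset for a generous period before starting (assumed length).
    wait for 20 * CLK_PERIOD;
    wait until rising_edge(clk);
    rst <= '0';

    -- Select input a and hold it long enough to pass through the pipeline.
    sel <= '0';
    for i in 1 to 4 loop
      wait until rising_edge(clk);
    end loop;
    assert op = a report "expected input a on the output" severity error;

    -- Select input c and check again after the same number of edges.
    sel <= '1';
    for i in 1 to 4 loop
      wait until rising_edge(clk);
    end loop;
    assert op = c report "expected input c on the output" severity error;

    done <= true;
    wait;
  end process;
end architecture;
```

Holding the inputs stable for several cycles keeps the checks independent of the exact pipeline depth, so the same test bench should still pass if the register attributes are changed.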

[Figure: MZ_442_Output_Switching]

Of course, implementing multiplexing in this way is not something we would do every day; it is appropriate only in specific cases. It is a viable tool in the FPGA developer's toolbox though, so I thought it would make for an interesting blog.

When considering implementations that use this approach, we also need to consider the width of the vector being multiplexed and the routing penalties that apply to entering and leaving the DSP48E2 element. We can, however, always use techniques such as hand placement to extract the best possible performance.
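To illustrate the width consideration: each DSP48E2 multiplexes 48 bits, so a wider vector needs one element per 48-bit slice. The wrapper below is a sketch under that assumption. The entity `dspmux_wide` and the generic `N_SLICES` are hypothetical names, and it reuses the `dspmux` entity from the listing above, assumed to be compiled into `work`.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical wrapper: N_SLICES copies of dspmux placed side by side.
entity dspmux_wide is
  generic (
    N_SLICES : positive := 2  -- total mux width is 48 * N_SLICES bits
  );
  port (
    clk : in  std_logic;
    rst : in  std_logic;
    a   : in  std_logic_vector(48 * N_SLICES - 1 downto 0);
    c   : in  std_logic_vector(48 * N_SLICES - 1 downto 0);
    sel : in  std_logic;
    op  : out std_logic_vector(48 * N_SLICES - 1 downto 0)
  );
end entity;

architecture rtl of dspmux_wide is
begin
  -- One dspmux (and therefore one DSP48E2) per 48-bit slice,
  -- all sharing the same select signal.
  gen_slice : for i in 0 to N_SLICES - 1 generate
    slice : entity work.dspmux
      port map (
        clk => clk,
        rst => rst,
        a   => a(48 * i + 47 downto 48 * i),
        c   => c(48 * i + 47 downto 48 * i),
        sel => sel,
        op  => op(48 * i + 47 downto 48 * i)
      );
  end generate;
end architecture;
```

With the default `N_SLICES` of 2, this gives a 96-bit two-input mux that uses two DSP48E2 elements. Vectors narrower than 48 bits still occupy a whole element, which is part of the trade-off to weigh against the logic resources saved.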