Introduction
The Deep Learning Processing Unit (DPU) is a configurable computation engine, optimized for convolutional neural networks, designed to support the Zynq UltraScale+ MPSoC. The degree of parallelism used in the engine is a design parameter that can be selected according to the target device and application. The DPU includes a set of highly optimized instructions and supports most convolutional neural networks, such as VGG, ResNet, GoogLeNet, YOLO, SSD, MobileNet, and FPN.
Features
The DPU has the following features:
- One AXI slave interface for accessing configuration and status registers.
- One AXI master interface for accessing instructions.
- Supports individual configuration of each channel.
- Supports optional interrupt request generation.
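Because configuration and status registers are exposed through a memory-mapped AXI slave interface, software on the processing system can read them like ordinary physical memory. The sketch below is illustrative only: the base address and 4 KB window size are hypothetical placeholders, since the real address map comes from the Vivado address editor and the DPU register reference, not from this overview.

```python
import mmap
import struct

# HYPOTHETICAL base address for the DPU's AXI slave window -- the actual
# value is assigned in the Vivado address editor / device tree.
DPU_BASE_ADDR = 0x8F000000

def page_align(addr, page_size=4096):
    """Split an absolute address into a page-aligned base and the
    remaining in-page offset, since mmap requires page-aligned offsets."""
    base = addr & ~(page_size - 1)
    return base, addr - base

def read_dpu_reg(offset):
    """Read one 32-bit DPU register via /dev/mem (requires root on a
    running target; offset is relative to the assumed base address)."""
    base, delta = page_align(DPU_BASE_ADDR + offset)
    with open("/dev/mem", "rb") as f:
        with mmap.mmap(f.fileno(), 4096, prot=mmap.PROT_READ,
                       offset=base) as m:
            (value,) = struct.unpack_from("<I", m, delta)
            return value
```

The page-alignment step matters because `mmap` rejects offsets that are not multiples of the page size; the register's position inside the mapped page is recovered from the remainder.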
Some highlights of DPU functionality include:
- Configurable hardware architecture; available cores include B512, B800, B1024, B1152, B1600, B2304, B3136, and B4096
- Maximum of four homogeneous cores
- Convolution and deconvolution
- Depthwise convolution
- Max pooling
- Average pooling
- ReLU, ReLU6, and Leaky ReLU
- Concat
- Elementwise-Sum and Elementwise-Multiply
- Dilation
- Reorg
- Fully connected layer
- Softmax
- Batch Normalization
- Split
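To make a few of the supported operations concrete, the reference sketch below shows what ReLU6, Leaky ReLU, and depthwise convolution compute. This is a plain-Python illustration of the operator semantics, not the DPU's hardware implementation (which runs these as fixed-function instructions); the function names and the `alpha` default are this sketch's own choices.

```python
def relu6(x):
    """ReLU6: clamp each activation to the range [0, 6]."""
    return [min(max(v, 0.0), 6.0) for v in x]

def leaky_relu(x, alpha=0.1):
    """Leaky ReLU: negative inputs pass through scaled by alpha."""
    return [v if v >= 0 else alpha * v for v in x]

def depthwise_conv2d(x, w):
    """Depthwise convolution: each input channel is convolved with its
    own 2-D kernel, with no summation across channels (unlike standard
    convolution). Valid padding, stride 1.
    x: list of C channel matrices (H x W); w: list of C kernels (kh x kw).
    """
    out = []
    for ch, k in zip(x, w):
        kh, kw = len(k), len(k[0])
        oh, ow = len(ch) - kh + 1, len(ch[0]) - kw + 1
        out.append([
            [sum(ch[i + di][j + dj] * k[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(ow)]
            for i in range(oh)])
    return out
```

For example, a 2-channel 3x3 input convolved depthwise with 2x2 kernels yields a 2-channel 2x2 output, one output channel per input channel.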
IP Facts
| DPU IP Facts Table | |
|---|---|
| **Core Specifics** | |
| Supported Device Family | Zynq® UltraScale+™ MPSoC Family |
| Supported User Interfaces | Memory-mapped AXI interfaces |
| Resources | See DPU Configuration. |
| **Provided with Core** | |
| Design Files | Encrypted RTL |
| Example Design | Verilog |
| Constraints File | Xilinx Design Constraints (XDC) |
| Supported S/W Driver | Included in PetaLinux |
| **Tested Design Flows** | |
| Design Entry | Vivado® Design Suite and Vitis™ unified software platform |
| Simulation | N/A |
| Synthesis | Vivado® Synthesis |
| **Support** | |
| Support | Xilinx Support web page |