Programming Examples

Application requirements vary widely, but they generally fall into three categories. The first is to use the ready-made models provided by the Xilinx® Vitis™ AI library to quickly build your own application; the second is to use your own custom models that are similar to the models in the Vitis AI library; and the third is to use new models that are entirely different from the models in the Vitis AI library. This chapter describes the detailed development steps for the first two cases. For the third case, you can still use the Vitis AI library samples and library implementations as a reference. This chapter covers the following topics:

  • Customizing pre-processing
  • Using the configuration file to set pre-processing and post-processing parameters
  • Using the Vitis AI Library's post-processing library
  • Implementing user post-processing code
  • Working with the xdputil tool

The following figure shows the relationships among the various Vitis AI library APIs and their corresponding examples. There are four kinds of APIs in this release:

  • Vitis AI Library API_0 based on VART
  • Vitis AI Library API_1 based on AI Library
  • Vitis AI Library API_2 based on DpuTask
  • Vitis AI Library API_3 based on Graph_runner
Figure 1: The Diagram of AI Library API


Developing with Vitis AI API_0

  1. Install the cross-compilation system on the host side. For details, refer to Installation.
  2. Download the model that you want to use, such as resnet50, and copy it to the board using scp.
  3. Install the model on the target side.
    tar -xzvf <model>.tar.gz
    cp -r <model> /usr/share/vitis_ai_library/models
    By default, the models are placed under the /usr/share/vitis_ai_library/models directory on the target side.
    Note: You do not need to install the Xilinx model package if you want to use your own model.
  4. Git clone the corresponding Vitis AI Library from https://github.com/Xilinx/Vitis-AI.
  5. Create a folder under your workspace, using classification as an example.
    mkdir classification
  6. Create the demo_classification.cpp source file. The main flow is shown in the following figure; a minimal code sketch also appears after this list. See Vitis-AI/demo/VART/resnet50/src/main.cc for a complete code example.
    Figure 2: Flow for Developing with Vitis AI API_0


  7. Create a build.sh file as shown below, or copy one from the Vitis AI library demo and modify it.
    #!/bin/sh
    CXX=${CXX:-g++}
    $CXX -O2 -fno-inline -I. -o demo_classification demo_classification.cpp -lopencv_core -lopencv_video -lopencv_videoio -lopencv_imgproc -lopencv_imgcodecs -lopencv_highgui -lglog -lxir -lunilog -lpthread -lvart-runner
  8. Cross compile the program.
    sh -x build.sh
  9. Copy the executable program to the target board using scp.
    scp demo_classification root@IP_OF_BOARD:~/
  10. Execute the program on the target board. Before running the program, make sure the target board has the Vitis AI Library installed, and prepare the images you want to test.
    ./demo_classification /usr/share/vitis_ai_library/models/resnet50/resnet50.xmodel resnet50_0 demo_classification.jpg
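
The following is a minimal sketch of the API_0 (VART) flow used in demo_classification.cpp, assuming the model has a single DPU subgraph. The prepare_inputs, prepare_outputs, and post_process helpers are hypothetical placeholders; the resnet50 sample implements them with a CpuFlatTensorBuffer class and a softmax/top-k routine.

  #include <xir/graph/graph.hpp>
  #include <vart/runner.hpp>

  // Deserialize the xmodel and locate the (first) DPU subgraph.
  auto graph = xir::Graph::deserialize(argv[1]);
  const xir::Subgraph* dpu_subgraph = nullptr;
  for (auto* s : graph->get_root_subgraph()->children_topological_sort()) {
    if (s->has_attr("device") && s->get_attr<std::string>("device") == "DPU") {
      dpu_subgraph = s;
      break;
    }
  }
  // Create a runner on the DPU subgraph and execute one job.
  auto runner = vart::Runner::create_runner(dpu_subgraph, "run");
  auto inputs = prepare_inputs(runner.get(), image);   // hypothetical: fill the input tensor buffers
  auto outputs = prepare_outputs(runner.get());        // hypothetical: allocate the output tensor buffers
  auto job = runner->execute_async(inputs, outputs);
  runner->wait(job.first, -1);
  post_process(outputs);                                // hypothetical: for example, softmax + top-k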

Developing with User Model and AI Library API_2

When you use your own models, note that your model framework must be within the scope supported by the Vitis AI library. The following describes, step by step, how to deploy a retrained YOLOv3 Caffe model to the ZCU102 platform based on the Vitis AI library.
  1. Download the corresponding docker image from https://github.com/Xilinx/Vitis-AI.
  2. Load and run the docker.
  3. Create a folder on the host side and place the float model under it, then use the AI Quantizer tool to quantize the model. For more details, see the Vitis AI User Guide in the Vitis AI User Documentation (UG1431).
  4. Use the AI Compiler tool to compile the model and generate the xmodel file, such as yolov3_custom.xmodel. For more information, see the Vitis AI User Guide in the Vitis AI User Documentation (UG1431).
  5. Create the yolov3_custom.prototxt, as shown in the following.
    model {
      name: "yolov3_custom"
      kernel {
         name: "yolov3_custom"
         mean: 0.0
         mean: 0.0
         mean: 0.0
         scale: 0.00390625
         scale: 0.00390625
         scale: 0.00390625
      }
      model_type : YOLOv3
      yolo_v3_param {
        num_classes: 20
        anchorCnt: 3
        layer_name: "59"
        layer_name: "67"
        layer_name: "75"
        conf_threshold: 0.3
        nms_threshold: 0.45
        biases: 10
        biases: 13
        biases: 16
        biases: 30
        biases: 33
        biases: 23
        biases: 30
        biases: 61
        biases: 62
        biases: 45
        biases: 59
        biases: 119
        biases: 116
        biases: 90
        biases: 156
        biases: 198
        biases: 373
        biases: 326
        test_mAP: false
      }
    }
    Note: The <model_name>.prototxt is effective only when you use AI Library API_1.

    When you use AI Library API_2, the parameters of the model need to be loaded and read manually by the program. See Vitis-AI/demo/Vitis-AI-Library/samples/dpu_task/yolov3/demo_yolov3.cpp for details.

  6. Create the demo_yolov3.cpp file. A minimal code sketch appears after this list; see Vitis-AI/demo/Vitis-AI-Library/samples/dpu_task/yolov3/demo_yolov3.cpp for the complete reference.
  7. Create a build.sh file as shown below, or copy one from the Vitis AI library demo and modify it.
    #!/bin/sh
    CXX=${CXX:-g++}
    $CXX -std=c++17 -O3 -I. -o demo_yolov3 demo_yolov3.cpp -lopencv_core -lopencv_video -lopencv_videoio -lopencv_imgproc -lopencv_imgcodecs -lopencv_highgui -lglog -lxnnpp-xnnpp -lvitis_ai_library-model_config -lprotobuf -lvitis_ai_library-dpu_task
  8. Exit the docker tool system and start the docker runtime system.
  9. Cross compile the program and generate the executable file demo_yolov3.
    sh -x build.sh
  10. Create the model folder under /usr/share/vitis_ai_library/models on the target side.
    mkdir /usr/share/vitis_ai_library/models/yolov3_custom
    Note: /usr/share/vitis_ai_library/models is the default location for the program to read the model. You can also place the model folder in the same directory as the executable program.
  11. Copy the yolov3_custom.xmodel and yolov3_custom.prototxt to the target and put them under /usr/share/vitis_ai_library/models/yolov3_custom.
    scp yolov3_custom.xmodel  yolov3_custom.prototxt root@IP_OF_BOARD:/usr/share/vitis_ai_library/models/yolov3_custom
  12. Copy the executable program to the target board using scp.
    scp demo_yolov3 root@IP_OF_BOARD:~/
  13. Execute the program on the target board. Before running the program, make sure the target board has the Vitis AI library installed, and prepare the images you want to test.
    ./demo_yolov3 yolov3_custom sample.jpg 
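
The following is a minimal sketch of the API_2 (DpuTask) flow for this example. The model name, image name, and input size are assumptions, and the YOLOv3 post-processing call is omitted; see demo_yolov3.cpp for the exact post-processing invocation and configuration handling.

  #include <opencv2/opencv.hpp>
  #include <vitis/ai/dpu_task.hpp>

  // Create the task from the compiled model and set the pre-processing parameters.
  auto task = vitis::ai::DpuTask::create("yolov3_custom");   // assumed model name
  task->setMeanScaleBGR({0.0f, 0.0f, 0.0f},
                        {0.00390625f, 0.00390625f, 0.00390625f});
  // Resize the image to the model input size and feed it to the DPU.
  cv::Mat img = cv::imread("sample.jpg");                    // assumed input image
  cv::Mat resized;
  cv::resize(img, resized, cv::Size(416, 416));              // assumed input size
  task->setImageRGB(resized);
  task->run(0u);
  // Retrieve the raw tensors; post-processing is then applied as in demo_yolov3.cpp.
  auto input_tensor = task->getInputTensor(0u);
  auto output_tensor = task->getOutputTensor(0u);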

Developing with Vitis AI API_3 (Graph Runner)

When a model is split into multiple subgraphs, it cannot be run automatically with API_0, API_1, or API_2; you have to deploy the model subgraph by subgraph. The graph runner is a new API introduced in this release for deploying models. It converts the model into a single graph and makes deployment easier for models with multiple subgraphs.
  1. Git clone the corresponding Vitis AI Library from https://github.com/Xilinx/Vitis-AI.
  2. Install the cross-compilation system on the host side. For details, refer to Installation.
  3. Check the model to see if it has multiple subgraphs (for example, with the xdputil xmodel -l command described later in this chapter). If it does, check whether the operations not supported by the DPU are within the supported scope of cpu_task. You can find the operations supported by the Vitis AI Library in Vitis-AI/tools/Vitis-AI-Library/cpu_task/ops.
    Note: If an operation is not in the list supported by cpu_task, you cannot use the graph_runner directly and might encounter an error when you compile the model. You must first resolve this, and then add the operation under cpu_task.
  4. Create the model_test.cpp source file. The main flow is shown in the following figure; a minimal code sketch also appears after this list. See Vitis-AI/demo/Vitis-AI-Library/samples/graph_runner/platenum_graph_runner/platenum_graph_runner.cpp for a complete code example.
    Figure 3: Flow for Developing with Vitis AI API_3


  5. Create a build.sh file as shown below, or copy one from the Vitis AI library demo and modify it.
    result=0 && pkg-config --list-all | grep opencv4 && result=1
    if [ $result -eq 1 ]; then
    	OPENCV_FLAGS=$(pkg-config --cflags --libs-only-L opencv4)
    else
    	OPENCV_FLAGS=$(pkg-config --cflags --libs-only-L opencv)
    fi
    
    CXX=${CXX:-g++}
    $CXX -std=c++17 -O2 -I. \
    	-o platenum_graph_runner \
    	platenum_graph_runner.cpp \
    	-lglog \
    	-lxir \
    	-lvart-runner \
    	-lvitis_ai_library-graph_runner \
    	${OPENCV_FLAGS} \
    	-lopencv_core \
    	-lopencv_imgcodecs \
    	-lopencv_imgproc
  6. Cross compile the program.
    sh -x build.sh
  7. Copy the executable program to the target board using scp.
    scp model_test root@IP_OF_BOARD:~/
  8. Install the latest VART. For more information, refer to Step 3: Installing AI Library Package.
  9. Execute the program on the target board. Before running the program, make sure the target board has the Vitis AI library installed, and prepare the images you want to test.
    ./model_test <model> <image>
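
The following is a minimal sketch of the API_3 (graph_runner) flow. The preprocess and postprocess helpers are hypothetical placeholders; see platenum_graph_runner.cpp for a complete example, including the tensor-buffer handling.

  #include <xir/attrs/attrs.hpp>
  #include <xir/graph/graph.hpp>
  #include <vitis/ai/graph_runner.hpp>

  // Deserialize the xmodel and create a graph runner over the whole graph.
  auto graph = xir::Graph::deserialize(argv[1]);
  auto attrs = xir::Attrs::create();
  auto runner = vitis::ai::GraphRunner::create_graph_runner(graph.get(), attrs.get());
  // The graph runner exposes its input/output tensor buffers directly.
  auto inputs = runner->get_inputs();
  auto outputs = runner->get_outputs();
  preprocess(inputs, image);                       // hypothetical: fill the input buffers
  auto job = runner->execute_async(inputs, outputs);
  runner->wait(job.first, -1);
  postprocess(outputs);                            // hypothetical: read the output buffers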

Customizing Pre-Processing

Before convolutional neural network processing, image data generally needs to be pre-processed. Some basic pre-processing techniques that can be applied to any kind of data are as follows:

  • Mean subtraction
  • Normalization
  • PCA and Whitening

Call the setMeanScaleBGR function to implement mean subtraction and normalization, as shown in the following figure. See Vitis-AI/tools/Vitis-AI-Library/dpu_task/include/vitis/ai/dpu_task.hpp for details.

Figure 4: setMeanScaleBGR Example
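
The following is a minimal usage sketch, assuming a DpuTask object named task created earlier; the mean and scale values are placeholders and should be replaced with the values your model was trained with.

  // Per-channel mean (B, G, R) and per-channel scale; placeholder values.
  task->setMeanScaleBGR({104.0f, 107.0f, 123.0f},
                        {0.00390625f, 0.00390625f, 0.00390625f});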

Call the cv::resize function to scale the image, as shown in the following figure.

Figure 5: cv::resize Example
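
For example, the following sketch scales an image to an assumed model input size of 224 x 224:

  #include <opencv2/imgproc.hpp>

  // Scale the input image to the model's input width and height.
  cv::Mat resized;
  cv::resize(img, resized, cv::Size(224, 224));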

Using the Configuration File

The Vitis™ AI Library provides a way to read model parameters from a configuration file, which facilitates uniform configuration management of model parameters. The configuration file is located at /usr/share/vitis_ai_library/models/[model_name]/[model_name].prototxt.

model
{
  name: "yolov3_voc"
  kernel {
     name: "yolov3_voc"
     mean: 0.0
     mean: 0.0
     mean: 0.0
     scale: 0.00390625
     scale: 0.00390625
     scale: 0.00390625
  }
  model_type : YOLOv3
  yolo_v3_param {
    …
  }
  is_tf: false
}
Table 1. Compiling Model and Kernel Parameters

model
  name: This name should be the same as ${MODEL_NAME}.
  model_type: This type depends on which type of model you use.
  is_tf: Bool type. If your model is trained by TensorFlow, add this parameter and set it to "true". It can be omitted from the prototxt or set to "false" when the model is Caffe.
kernel
  name: This name should be filled in with the result of your DNNC compilation. Sometimes the name has an extra postfix "_0"; in that case, fill in the name including the postfix (for example, inception_v1_0).
  mean: Normally there are three lines, each corresponding to the mean value of one of the B, G, and R channels, as pre-defined in the model.
  scale: Normally there are three lines, each corresponding to the normalization scale of one of the R, G, and B channels. If the model had no scale during training, fill in 1 here.

yolo_v3_param

  model_type : YOLOv3
  yolo_v3_param {
    num_classes: 20
    anchorCnt: 3
    layer_name: "59"
    layer_name: "67"
    layer_name: "75"
    conf_threshold: 0.3
    nms_threshold: 0.45
    biases: 10
    biases: 13
    biases: 16
    biases: 30
    biases: 33
    biases: 23
    biases: 30
    biases: 61
    biases: 62
    biases: 45
    biases: 59
    biases: 119
    biases: 116
    biases: 90
    biases: 156
    biases: 198
    biases: 373
    biases: 326
    test_mAP: false
  }

The parameters for the YOLOv3 model are listed in the following table. You can modify them as your model requires.

Table 2. YOLOv3 Model Parameters

num_classes: The actual number of the model's detection categories.
anchorCnt: The number of anchors of this model.
layer_name: The names of the kernel's output layers. When your model has more than one output, use this parameter to guarantee a fixed order. These names should be the same as those in the kernel. (If you fill in an invalid name, the model creator uses the kernel's default order.)
conf_threshold: The confidence threshold of the boxes, which can be modified to fit your practical application.
nms_threshold: The NMS threshold.
biases: These parameters are the same as the model's. Each bias must be written on a separate line. (Number of biases) = anchorCnt * (number of output nodes) * 2. Make sure the prototxt contains the correct number of lines.
test_mAP: If your model was trained with letterbox and you want to test its mAP, set this to "true". Normally it is "false" so that the model executes much faster.

SSD_param

model_type : SSD
ssd_param :
{
    num_classes : 4
    nms_threshold : 0.4
    conf_threshold : 0.0
    conf_threshold : 0.6
    conf_threshold : 0.4
    conf_threshold : 0.3
    keep_top_k : 200
    top_k : 400
    prior_box_param {
    layer_width : 60,
    layer_height: 45,
    variances: 0.1
    variances: 0.1
    variances: 0.2
    variances: 0.2
    min_sizes: 21.0
    max_sizes: 45.0
    aspect_ratios: 2.0
    offset: 0.5
    step_width: 8.0
    step_height: 8.0
    flip: true
    clip: false
    }
}

The following table lists the SSD parameters. The parameters of an SSD model include various thresholds and PriorBox requirements. You can reference your SSD deploy.prototxt to fill them in.

Table 3. SSD Model Parameters

num_classes: The actual number of the model's detection categories.
anchorCnt: The number of anchors of this model.
conf_threshold: The confidence threshold of the boxes. Each category can have a different threshold, but the number of entries must be equal to num_classes.
nms_threshold: The NMS threshold.
biases: These parameters are the same as the model's. Each bias must be written on a separate line. (Number of biases) = anchorCnt * (number of output nodes) * 2. Make sure the prototxt contains the correct number of lines.
test_mAP: If your model was trained with letterbox and you want to test its mAP, set this to "true". Normally it is "false" so that the model executes much faster.
keep_top_k: The top K boxes kept for each category of detected objects.
top_k: The top K boxes kept across all detected objects, excluding the background (the first category).
prior_box_param: There is usually more than one PriorBox in the original model (deploy.prototxt), one for each different scale. The prior_box_param blocks here should correspond to those PriorBoxes one by one.

Table 4. PriorBox Parameters

layer_width/layer_height: The input width/height of this layer. These numbers can be computed from the network structure.
variances: These numbers are used for box regression; fill them in exactly as in the original model. There should be four variances.
min_sizes/max_sizes: Fill these in as in the deploy.prototxt, but each number should be written on a separate line.
aspect_ratios: The aspect ratios (each written on a separate line). The first ratio is 1.0 by default. Each additional number you set creates two ratios: the number itself and its reciprocal. For example, if the only element set is "aspect_ratios: 2.0", the ratio vector contains three numbers: 1.0, 2.0, 0.5.
offset: Normally the PriorBoxes are created from the central point of each feature-map cell, so the offset is 0.5.
step_width/step_height: Copy these from the original file. If they are not there, you can compute them with the following formulas:

  step_width = img_width ÷ layer_width
  step_height = img_height ÷ layer_height

flip: Controls whether to flip the PriorBox, that is, whether to also use the reciprocal of each length/width ratio.
clip: Set this to "false". If true, the detection boxes' coordinates are kept within [0, 1].

Example Code

The following is the example code.

#include <iostream>
#include <opencv2/opencv.hpp>
#include <vitis/ai/yolov3.hpp>

using namespace cv;
using namespace std;

Mat img = cv::imread(argv[1]);
// Create the YOLOv3 model object and run it on the image.
auto yolo = vitis::ai::YOLOv3::create("yolov3_voc", true);
auto results = yolo->run(img);
for (auto& box : results.bboxes) {
    int label = box.label;
    // Convert the normalized box coordinates to pixel coordinates.
    float xmin = box.x * img.cols + 1;
    float ymin = box.y * img.rows + 1;
    float xmax = xmin + box.width * img.cols;
    float ymax = ymin + box.height * img.rows;
    if (xmin < 0.) xmin = 1.;
    if (ymin < 0.) ymin = 1.;
    if (xmax > img.cols) xmax = img.cols;
    if (ymax > img.rows) ymax = img.rows;
    float confidence = box.score;
    cout << "RESULT: " << label << "\t" << xmin << "\t" << ymin << "\t"
         << xmax << "\t" << ymax << "\t" << confidence << "\n";
    rectangle(img, Point(xmin, ymin), Point(xmax, ymax), Scalar(0, 255, 0), 1, 1, 0);
}
imshow("", img);
waitKey(0);

To create the YOLOv3 object, use create.

static std::unique_ptr<YOLOv3> create(const std::string& model_name, bool need_mean_scale_process = true);
Note: The model_name is the same as the name of the prototxt file. For more details about the example, refer to ~/Vitis-AI/tools/Vitis-AI-Library/yolov3/test/test_yolov3.cpp.

Implementing User Post-Processing Code

You can also call your own post-processing functions as required. Take demo_yolov3.cpp and demo_classification.cpp as examples. Use vitis::ai::DpuTask::create or vitis::ai::DpuRunner::create_dpu_runner to create the task; after the DPU processing is complete, invoke the post-processing function. The post_process function in the following figure is an example of user post-processing code.

Figure 6: User Post-Processing Code Example

For more details, see ~/Vitis-AI/demo/Vitis-AI-Library/samples/dpu_task/classification/demo_classification.cpp.
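
As a minimal illustration, a user post-processing routine for classification can be as simple as a top-1 search over the already dequantized output scores. The function name and signature below are hypothetical; obtaining and dequantizing the raw DPU output is shown in the sample referenced above.

  #include <cstddef>
  #include <utility>

  // Hypothetical user post-processing: return the index and score of the
  // highest-scoring class from a dequantized output vector.
  static std::pair<int, float> post_process(const float* scores, size_t num_classes) {
    int best = 0;
    for (size_t i = 1; i < num_classes; ++i) {
      if (scores[i] > scores[best]) best = static_cast<int>(i);
    }
    return {best, scores[best]};
  }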

Using the AI Library's Post-Processing Library

Post-processing is an important step in the whole process. Each neural network has a different post-processing method. The xnnpp post-processing library is provided in the Vitis AI Library to facilitate user calls. It is a closed-source library. It supports post-processing for the following neural networks.

  • Classification
  • Face detection
  • Face landmark detection
  • SSD detection
  • Pose detection
  • Semantic segmentation
  • Road line detection
  • YOLOv3 detection
  • YOLOv2 detection
  • Openpose detection
  • RefineDet detection
  • ReID detection
  • Multi-task
  • Face recognition
  • Plate detection
  • Plate recognition
  • Medical segmentation
  • Medical detection
  • Face quality
  • Hourglass
  • Retinaface
  • Centerpoint
  • Multitaskv3
  • Pointpillars_nuscenes
  • Rcan

There are two ways to call xnnpp:

  • One is an automatic call through vitis::ai::<model>::create, which creates the task, such as vitis::ai::YOLOv3::create("yolov3_bdd", true). After <model>->run finishes, xnnpp is processed automatically. You can modify the parameters through the model configuration file.
  • The other is a manual call through vitis::ai::DpuTask::create, which creates the task. Then, create the post-process object and run the post-process. Taking SSD post-processing as an example, the specific steps are as follows (a stitched-together sketch appears after the note at the end of this list):
    1. Create a configuration and set the correlating data to control post-process.
      using DPU_conf = vitis::ai::proto::DpuModelParam;
      DPU_conf config;
    2. If it is a Caffe model, set "is_tf" to false.
      config.set_is_tf(false);
    3. Fill the other parameters.
      fillconfig(config);
    4. Create an object of SSDPostProcess.
      auto input_tensor = task->getInputTensor();
      auto output_tensor = task->getOutputTensor();
      auto ssd = vitis::ai::SSDPostProcess::create(input_tensor, output_tensor, config);
    5. Run the post-process.
      auto results = ssd->ssd_post_process();
Note: For more details about the post processing examples, see the ~/Vitis-AI/demo/Vitis-AI-Library/samples/dpu_task/yolov3/demo_yolov3.cpp and ~/Vitis-AI/tools/Vitis-AI-Library/yolov3/test/test_yolov3.cpp files in the host system.
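
Stitched together, the manual flow looks roughly like the following sketch. The model name, the image handling, and the <vitis/ai/nnpp/ssd.hpp> header path are assumptions, and fillconfig is the user-provided function from step 3; check the files referenced in the note above for the exact calls.

  #include <opencv2/opencv.hpp>
  #include <vitis/ai/dpu_task.hpp>
  #include <vitis/ai/nnpp/ssd.hpp>                          // assumed header for SSDPostProcess

  using DPU_conf = vitis::ai::proto::DpuModelParam;

  // Create the task, run the DPU, then run the SSD post-process manually.
  auto task = vitis::ai::DpuTask::create("ssd_custom");     // assumed model name
  cv::Mat img = cv::imread("sample.jpg");                   // assumed input image
  task->setImageBGR(img);
  task->run(0u);

  DPU_conf config;
  config.set_is_tf(false);                                  // Caffe model
  fillconfig(config);                                       // user-provided (step 3)

  auto input_tensor = task->getInputTensor();
  auto output_tensor = task->getOutputTensor();
  auto ssd = vitis::ai::SSDPostProcess::create(input_tensor, output_tensor, config);
  auto results = ssd->ssd_post_process();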

Using the xdputil Tool

xdputil is introduced in this release for board development. It can be used for both Edge and Cloud devices. It is pre-installed in the latest board image or docker. The source code of xdputil is located in the Vitis-AI/tools/Vitis-AI-Library/usefultools folder. It contains the following functions.

help
It shows the usage of xdputil.
xdputil --help
status
It shows the status of the DPU.
xdputil status
run
Run the DPU with the input file. It can be used for DPU cross-checking.
xdputil run <xmodel> [-i <subgraph_index>] <input_bin>

xmodel: The model to run on the DPU
-i: The subgraph_index of the model; the default value is 0
input_bin: The input file for the model
xmodel
Check the xmodel information. You can convert the xmodel to png/svg/txt formats. Run the following command to show the usage of xmodel.
root@xilinx-zcu102-2021_1:~# xdputil xmodel -h
usage: xdputil.py xmodel [-h] [-l] [-m] [-p [PNG]] [-s [SVG]] [-t [TXT]]
                         xmodel

xmodel

positional arguments:
  xmodel                xmodel file path

optional arguments:
  -h, --help            show this help message and exit
  -l, --list            show subgraph list
  -m, --meta_info       show xcompiler version
  -p [PNG], --png [PNG]
                        the output to png
  -s [SVG], --svg [SVG]
                        the output svg path
  -t [TXT], --txt [TXT]
                        when <txt> is missing, it dumps to standard output.
mem
Read or write memory.
xdputil mem <-r|-w> <addr> <size> <output_file|input_file>
query
Shows device information, including DPU, fingerprint, and Vitis AI version.
xdputil query
benchmark
Test the performance of the model.
xdputil benchmark <xmodel> [-i subgraph_index] <num_of_threads>
Note: When you use xdputil in the docker, use /usr/bin/python3 -m xdputil instead of xdputil. For example:
/usr/bin/python3 -m xdputil query