Running Examples

For the Vitis AI development kit v1.4 release, the VART-based examples demonstrate the use of the Vitis AI unified C++/Python APIs, which are available from cloud to edge.

These samples can be found in the demo directory of the Vitis AI repository (https://github.com/Xilinx/Vitis-AI). If you are using Xilinx ZCU102 or ZCU104 boards to run the samples, make sure to enable X11 forwarding with the "ssh -X" option, or run export DISPLAY=192.168.0.10:0.0 (assuming the IP address of the host machine is 192.168.0.10) when logging in to the board using an SSH terminal, because all the examples require X11 to work properly.

Note: The examples do not work over a UART connection due to the lack of X11 support. Alternatively, you can connect a monitor directly to the board instead of using Ethernet.

Vitis AI Examples

Vitis AI provides several C++ and Python examples to demonstrate the use of the unified cloud-edge runtime programming APIs.
Note: The sample code helps you get started with the new runtime (VART). These samples are not meant for performance benchmarking.
To familiarize yourself with the unified APIs, use the VART examples. They are intended to illustrate the APIs rather than to deliver high performance. The APIs are compatible between edge and cloud, although cloud platforms may have additional software optimizations, such as batching, while the edge requires multi-threading to achieve higher performance. If you need higher performance, see the Vitis AI Library samples and demo software.

If you want to optimize for higher performance, here are some suggestions:

  • Rearrange the thread pipeline structure so that every DPU thread has its own DPU runner object (see the sketch after this list).
  • Optimize the display thread so that it skips frames when the DPU frame rate is higher than the display rate; 200 FPS is too high for video display.
  • Pre-decode the video. The video file might be H.264 encoded; because the decoder is slower than the DPU and consumes a lot of CPU resources, decode the video file into raw format first.
  • Batch mode on Alveo boards needs special consideration because it can cause video frame jitter. The ZCU102 does not support batch mode.
  • OpenCV cv::imshow is slow, so use libdrm.so instead. This applies only to a local display, not display through an X server.
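As an illustration of the first suggestion, the following minimal sketch (our own illustration, not one of the Vitis AI samples; the worker and run_pipeline names and the elided loop body are hypothetical) gives every worker thread its own vart::Runner created from the same DPU subgraph:

// Minimal sketch: one vart::Runner per worker thread.
#include <thread>
#include <vector>
#include <vart/runner.hpp>
#include <xir/graph/graph.hpp>

void worker(const xir::Subgraph* subgraph) {
  // Each thread creates and owns its own runner instead of sharing one.
  auto runner = vart::Runner::create_runner(subgraph, "run");
  // ... per-thread loop: pre-process, execute_async(), wait(), post-process ...
}

void run_pipeline(const xir::Subgraph* subgraph, int num_threads) {
  std::vector<std::thread> workers;
  for (int i = 0; i < num_threads; ++i) {
    workers.emplace_back(worker, subgraph);
  }
  for (auto& w : workers) w.join();
}

Per-thread runners avoid serializing all threads on a single shared runner object.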

The following table describes these Vitis AI examples.

Table 1. Vitis AI Examples
ID | Example Name | Models | Framework | Notes
1 | resnet50 | ResNet-50 | Caffe | Image classification with Vitis AI unified C++ APIs.
2 | resnet50_pt | ResNet-50 | PyTorch | Image classification with Vitis AI unified extension C++ APIs.
3 | resnet50_ext | ResNet-50 | Caffe | Image classification with Vitis AI unified extension C++ APIs.
4 | resnet50_mt_py | ResNet-50 | TensorFlow | Multi-threading image classification with Vitis AI unified Python APIs.
5 | inception_v1_mt_py | Inception-v1 | TensorFlow | Multi-threading image classification with Vitis AI unified Python APIs.
6 | pose_detection | SSD, Pose detection | Caffe | Pose detection with Vitis AI unified C++ APIs.
7 | video_analysis | SSD | Caffe | Traffic detection with Vitis AI unified C++ APIs.
8 | adas_detection | YOLOv3 | Caffe | ADAS detection with Vitis AI unified C++ APIs.
9 | segmentation | FPN | Caffe | Semantic segmentation with Vitis AI unified C++ APIs.
10 | squeezenet_pytorch | SqueezeNet | PyTorch | Image classification with Vitis AI unified C++ APIs.

The typical code snippet to deploy models with Vitis AI unified C++ high-level APIs is as follows:

// get dpu subgraph by parsing model file
auto runner = vart::Runner::create_runner(subgraph, "run");
// get input scale and output scale,
// they are used for fixed-floating point conversion
auto outputTensors = runner->get_output_tensors();
auto inputTensors = runner->get_input_tensors();
auto input_scale = get_input_scale(inputTensors[0]);
auto output_scale = get_output_scale(outputTensors[0]);
// do the image pre-process, convert float data to fixed point data
// populate input/output tensors
auto job_id = runner->execute_async(inputsPtr, outputsPtr);
runner->wait(job_id.first, -1);
// process outputs, convert fixed point data to float data 
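The first comment above elides how the DPU subgraph is obtained from the model file. A minimal sketch of one common approach (the get_dpu_subgraph helper is our own, assuming an .xmodel produced by the Vitis AI compiler) deserializes the model with XIR and keeps the subgraphs whose device attribute is "DPU":

#include <string>
#include <vector>
#include <xir/graph/graph.hpp>

// Sketch: collect all DPU subgraphs of a deserialized xmodel graph.
static std::vector<const xir::Subgraph*> get_dpu_subgraph(
    const xir::Graph* graph) {
  std::vector<const xir::Subgraph*> dpu_subgraphs;
  for (auto* child : graph->get_root_subgraph()->children_topological_sort()) {
    if (child->has_attr("device") &&
        child->get_attr<std::string>("device") == "DPU") {
      dpu_subgraphs.push_back(child);
    }
  }
  return dpu_subgraphs;
}

// Usage sketch:
//   auto graph = xir::Graph::deserialize("resnet50.xmodel");
//   auto subgraph = get_dpu_subgraph(graph.get())[0];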

The typical code snippet to deploy models with Vitis AI unified extension C++ high-level APIs is as follows:

// get dpu subgraph by parsing model file
std::unique_ptr<vart::RunnerExt> runner =
          vart::RunnerExt::create_runner(subgraph, attrs.get());
// get input & output tensor buffers
auto input_tensor_buffers = runner->get_inputs();
auto output_tensor_buffers = runner->get_outputs();
// get input scale and output scale,
// they are used for fixed-floating point conversion
auto input_tensor = input_tensor_buffers[0]->get_tensor();
auto output_tensor = output_tensor_buffers[0]->get_tensor();
auto input_scale = get_input_scale(input_tensor);
auto output_scale = get_output_scale(output_tensor);
// do the image pre-process, convert float data to fixed point data
setImageBGR(images[batch_idx], (void*)data_in, input_scale);
// sync data for input
input->sync_for_write(0, input->get_tensor()->get_data_size() /
                         input->get_tensor()->get_shape()[0]); 
// populate input/output tensors
auto v = runner->execute_async(input_tensor_buffers, output_tensor_buffers);
auto status = runner->wait((int)v.first, -1);
// sync data for output
output->sync_for_read(0, output->get_tensor()->get_data_size() /
                         output->get_tensor()->get_shape()[0]);
// process outputs, convert fixed point data to float data
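In the snippet above, data_in, input, and output are assumed to come from the tensor buffers returned by get_inputs() and get_outputs(). A minimal sketch of that step (the batch_data helper and variable names are our own) uses vart::TensorBuffer::data(), which returns the address of an element together with the number of bytes available from it:

#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>
#include <vart/tensor_buffer.hpp>

// Sketch: get a raw pointer to the start of one batch element.
void* batch_data(vart::TensorBuffer* buffer, int batch_idx) {
  // Index the first element of batch `batch_idx`; all other dims are zero.
  std::vector<std::int32_t> idx(buffer->get_tensor()->get_shape().size(), 0);
  idx[0] = batch_idx;
  std::uint64_t data_addr = 0;
  std::size_t size = 0;
  std::tie(data_addr, size) = buffer->data(idx);
  return reinterpret_cast<void*>(data_addr);
}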

The typical code snippet to deploy models with Vitis AI unified Python high-level APIs is shown below:

dpu_runner = runner.Runner(subgraph,"run")
# populate input/output tensors
jid = dpu_runner.execute_async(fpgaInput, fpgaOutput)
dpu_runner.wait(jid)
# process fpgaOutput
Note: The DPU consumes and produces only fixed-point data. For improved performance and more efficient memory usage, use int8 data as input and perform the float-to-fixed-point conversion as part of pre-processing. If the input data is float, VART converts it to fixed-point data, which consumes more time.
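As an example of performing the conversion along with pre-processing, a per-value conversion consistent with this note could look like the following sketch (our own illustration, not a library API): multiply by input_scale, round, and saturate to the int8 range.

#include <algorithm>
#include <cmath>
#include <cstdint>

// Sketch: quantize one float value to int8 using the scale obtained from
// get_input_scale() for the input tensor.
static std::int8_t quantize(float value, float input_scale) {
  float fixed = std::round(value * input_scale);
  fixed = std::max(-128.0f, std::min(127.0f, fixed));  // saturate to int8
  return static_cast<std::int8_t>(fixed);
}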

Running Vitis AI Examples

Before running the Vitis™ AI examples on the edge or in the cloud, download the vitis_ai_runtime_r1.4.0_image_video.tar.gz package. The images and videos used in the following examples can be found in this package.

To improve the user experience, the Vitis AI Runtime packages, VART samples, Vitis AI Library samples, and models have been built into the board image, and the examples are pre-compiled. You can run the example programs on the target directly.

For Edge (DPUCZDX8G/DPUCVDX8G)

  1. Copy the vitis_ai_runtime_r1.4.0_image_video.tar.gz package from the host to the target using scp with the following command.
    scp vitis_ai_runtime_r1.4.0_image_video.tar.gz root@[IP_OF_BOARD]:~/
  2. Untar the vitis_ai_runtime_r1.4.0_image_video.tar.gz package.
    tar -xzvf vitis_ai_runtime_r1.4.0_image_video.tar.gz -C ~/Vitis-AI/demo/VART
  3. Download the model. The download link for each model is listed in its YAML file. You can find the YAML files in Vitis-AI/models/AI-Model-Zoo and download the model for the corresponding platform. Take resnet50 as an example:
    wget https://www.xilinx.com/bin/public/openDownload?filename=resnet50-zcu102_zcu104_kv260-r1.4.0.tar.gz -O resnet50-zcu102_zcu104_kv260-r1.4.0.tar.gz
    
    scp resnet50-zcu102_zcu104_kv260-r1.4.0.tar.gz root@[IP_OF_BOARD]:~/
  4. Untar the model on the target and install it.
    Note: If the /usr/share/vitis_ai_library/models folder does not exist, create it first.
    mkdir -p /usr/share/vitis_ai_library/models

    To install the model package, run the following command:

    tar -xzvf resnet50-zcu102_zcu104_kv260-r1.4.0.tar.gz
    cp resnet50 /usr/share/vitis_ai_library/models -r
  5. Enter the samples directory on the target board. Take resnet50 as an example.
    cd ~/Vitis-AI/demo/VART/resnet50
  6. Run the example.
    ./resnet50 /usr/share/vitis_ai_library/models/resnet50/resnet50.xmodel
    Note: If the above executable program does not exist, cross-compile it on the host first.
Note: For examples with video input, only `webm` and `raw` formats are supported by default with the official system image. To support video data in other formats, install the relevant packages on the system.

The following table shows the run commands for all the Vitis AI samples.

Table 2. Launching Commands for Vitis AI Samples on ZCU102/ZCU104
ID | Example Name | Command
1 | resnet50 | ./resnet50 /usr/share/vitis_ai_library/models/resnet50/resnet50.xmodel
2 | resnet50_pt | ./resnet50_pt /usr/share/vitis_ai_library/models/resnet50_pt/resnet50_pt.xmodel ../images/001.jpg
3 | resnet50_ext | ./resnet50_ext /usr/share/vitis_ai_library/models/resnet50/resnet50.xmodel ../images/001.jpg
4 | resnet50_mt_py | python3 resnet50.py 1 /usr/share/vitis_ai_library/models/resnet50/resnet50.xmodel
5 | inception_v1_mt_py | python3 inception_v1.py 1 /usr/share/vitis_ai_library/models/inception_v1_tf/inception_v1_tf.xmodel
6 | pose_detection | ./pose_detection video/pose.webm /usr/share/vitis_ai_library/models/sp_net/sp_net.xmodel /usr/share/vitis_ai_library/models/ssd_pedestrian_pruned_0_97/ssd_pedestrian_pruned_0_97.xmodel
7 | video_analysis | ./video_analysis video/structure.webm /usr/share/vitis_ai_library/models/ssd_traffic_pruned_0_9/ssd_traffic_pruned_0_9.xmodel
8 | adas_detection | ./adas_detection video/adas.webm /usr/share/vitis_ai_library/models/yolov3_adas_pruned_0_9/yolov3_adas_pruned_0_9.xmodel
9 | segmentation | ./segmentation video/traffic.webm /usr/share/vitis_ai_library/models/fpn/fpn.xmodel
10 | squeezenet_pytorch | ./squeezenet_pytorch /usr/share/vitis_ai_library/models/squeezenet_pt/squeezenet_pt.xmodel

For Cloud

Before running the samples in the cloud, ensure that either a Versal VCK5000 evaluation board or an Alveo card, such as the U50, U50LV, or U280, is installed on the server, and that the Docker system is loaded and running.

Assuming you have downloaded Vitis-AI, entered the Vitis-AI directory, and started Docker, the VART examples are located at /workspace/demo/VART/ in the Docker container.

  1. Download the vitis_ai_runtime_r1.4.0_image_video.tar.gz package and untar it.
    tar -xzvf vitis_ai_runtime_r1.4.0_image_video.tar.gz -C /workspace/demo/VART
  2. Compile the sample. Take resnet50 as an example.
    cd /workspace/demo/VART/resnet50
    bash -x build.sh

    When the compilation is complete, the executable resnet50 is generated in the current directory.

  3. Download the model. The download link for each model is listed in its YAML file, which you can find in Vitis-AI/models/AI-Model-Zoo. Take resnet50 as an example:
    wget https://www.xilinx.com/bin/public/openDownload?filename=resnet50-u50-r1.4.0.tar.gz -O resnet50-u50-r1.4.0.tar.gz
  4. Untar the model package and install it.
    Note: If the /usr/share/vitis_ai_library/models folder does not exist, create it.
    sudo mkdir -p /usr/share/vitis_ai_library/models

    Then install the model package.

    tar -xzvf resnet50-u50-r1.4.0.tar.gz
    sudo cp resnet50 /usr/share/vitis_ai_library/models -r
  5. Run the sample.
    ./resnet50 /usr/share/vitis_ai_library/models/resnet50/resnet50.xmodel

The following table shows the run commands for all the Vitis AI samples in the cloud.

Table 3. Launching Commands for Vitis AI Samples for Cloud DPUs
ID | Example Name | Command
1 | resnet50 | ./resnet50 /usr/share/vitis_ai_library/models/resnet50/resnet50.xmodel
2 | resnet50_pt | ./resnet50_pt /usr/share/vitis_ai_library/models/resnet50_pt/resnet50_pt.xmodel ../images/001.jpg
3 | resnet50_ext | ./resnet50_ext /usr/share/vitis_ai_library/models/resnet50/resnet50.xmodel ../images/001.jpg
4 | resnet50_mt_py | /usr/bin/python3 resnet50.py 1 /usr/share/vitis_ai_library/models/resnet50/resnet50.xmodel
5 | inception_v1_mt_py | /usr/bin/python3 inception_v1.py 1 /usr/share/vitis_ai_library/models/inception_v1_tf/inception_v1_tf.xmodel
6 | pose_detection | ./pose_detection video/pose.mp4 /usr/share/vitis_ai_library/models/sp_net/sp_net.xmodel /usr/share/vitis_ai_library/models/ssd_pedestrian_pruned_0_97/ssd_pedestrian_pruned_0_97.xmodel
7 | video_analysis | ./video_analysis video/structure.mp4 /usr/share/vitis_ai_library/models/ssd_traffic_pruned_0_9/ssd_traffic_pruned_0_9.xmodel
8 | adas_detection | ./adas_detection video/adas.avi /usr/share/vitis_ai_library/models/yolov3_adas_pruned_0_9/yolov3_adas_pruned_0_9.xmodel
9 | segmentation | ./segmentation video/traffic.mp4 /usr/share/vitis_ai_library/models/fpn/fpn.xmodel
10 | squeezenet_pytorch | ./squeezenet_pytorch /usr/share/vitis_ai_library/models/squeezenet_pt/squeezenet_pt.xmodel