Optimize: Performance Analysis

Performance analysis in the Vitis software platform provides functionality for viewing and analyzing different types of performance data. Its views, graphs, and metrics help you extract useful information from the data in a way that is more user-friendly and informative than large text dumps.

Performance analysis provides the following features:

  • Support for viewing Arm data.
  • Support for viewing APM data with PS and MDM as master.
  • Support for viewing MicroBlaze data.
  • Support for viewing and analyzing live data.
  • Support for offline viewing of data.
  • Support for zooming in and out of the data.
  • Event filtering and searching.
  • Import and export of trace packages.

The performance analysis feature in the Vitis software platform supports data collection from AXI Performance Monitor (APM) event counters, the Arm Performance Monitor Unit (PMU) of a Zynq-7000 SoC processing system, and MicroBlaze performance monitoring counters. For an example of performance monitoring on a Zynq device, refer to System Performance Modeling. For a MicroBlaze design, APM can be used in a way similar to SPM.

To collect MicroBlaze performance data, the performance monitoring counters must be enabled in the Vivado hardware design. For more information, refer to the MicroBlaze Processor Reference Guide (UG984). The Vitis software platform monitors the following events for MicroBlaze processors:
  • Number of clock cycles
  • Any valid instruction executed
  • Read or write data request from/to data cache
  • Read or write data cache hit
  • Pipeline stalled
  • Instruction cache latency for memory read

The data is collected in the Vitis software platform in real time. The values from these counters are sampled every 10 ms. These values are used to calculate metrics shown in the Performance Counters view.
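
As a rough illustration of how sampled counter values could be turned into metrics, the following Python sketch computes two ratios from a pair of consecutive samples. It is not the tool's implementation: the `CounterSample` type, its field names, the choice of metrics (cycles per instruction and data cache hit rate), and the assumption that the counters are cumulative are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

SAMPLE_PERIOD_S = 0.010  # 10 ms sampling interval, as stated in the text


@dataclass
class CounterSample:
    """Hypothetical snapshot of cumulative MicroBlaze counter values."""
    cycles: int           # number of clock cycles
    instructions: int     # valid instructions executed
    dcache_requests: int  # read or write requests to the data cache
    dcache_hits: int      # read or write data cache hits


def ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator/denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def interval_metrics(prev: CounterSample, curr: CounterSample) -> dict:
    """Derive per-interval metrics from two consecutive samples."""
    d_cycles = curr.cycles - prev.cycles
    d_instr = curr.instructions - prev.instructions
    d_req = curr.dcache_requests - prev.dcache_requests
    d_hits = curr.dcache_hits - prev.dcache_hits
    return {
        "cycles_per_instruction": ratio(d_cycles, d_instr),
        "dcache_hit_rate": ratio(d_hits, d_req),
    }
```

Working on differences between samples, rather than on raw values, keeps each metric tied to a single sampling interval. The metrics actually shown in the Performance Counters view may be defined differently.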

The Vitis software platform monitors the following PMU events for each Cortex-A9 CPU:

  • Data cache refill
  • Data cache access
  • Data stall
  • Write stall
  • Instruction rename
  • Branch miss

The following two Level-2 cache controller (L2C-PL310) counters are monitored:

  • Number of cache hits
  • Number of cache accesses

The following APM counters for each HP and ACP port are monitored:

  • Write Byte Count
  • Read Byte Count
  • Write Transaction Count
  • Total Write Latency
  • Read Transaction Count
  • Total Read Latency
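
The counters listed above lend themselves to simple derived metrics such as throughput and average latency. The sketch below shows one possible calculation for a single port over one measurement interval. The function and parameter names are hypothetical, and the latency results are in whatever unit the latency counters use (assumed here to be clock cycles); the tool's own formulas may differ.

```python
def apm_port_metrics(write_bytes: int, read_bytes: int,
                     write_txns: int, total_write_latency: int,
                     read_txns: int, total_read_latency: int,
                     interval_s: float) -> dict:
    """Derive throughput and average latency for one HP or ACP port."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return {
        # Throughput in megabytes per second over the interval.
        "write_mb_per_s": write_bytes / interval_s / 1e6,
        "read_mb_per_s": read_bytes / interval_s / 1e6,
        # Average latency per transaction, in the counter's own unit.
        "avg_write_latency": (total_write_latency / write_txns
                              if write_txns else None),
        "avg_read_latency": (total_read_latency / read_txns
                             if read_txns else None),
    }
```

For example, 8000 bytes read in 20 transactions with a total read latency of 1000 over a 10 ms interval gives a read throughput of 0.8 MB/s and an average read latency of 50 per transaction.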

Working with the Performance Analysis Perspective

The Performance Analysis perspective comprises several views for collecting and analyzing performance data, which is referred to as a trace.



Explorer View

The Project Explorer view displays all the available projects in the workspace. When a performance analysis session is launched, the data from the board is collected and stored as trace files in a tracing project. Each hardware project has a corresponding tracing project, *_Traces, where the data is stored. Performance counter data from a single run is stored under a designated Run_* folder. Data from different sections is stored in different files under the run folder.



To analyze the data, double-click the trace file to open it in an Events editor view. After the file is opened, the tree under the trace file can be expanded to view the list of available analysis views.



Deleting Supplementary Files

Supplementary files are, by definition, trace-specific files that accompany a trace. These files can be temporary files, persistent indexes, or any other persistent data files created by the tool while parsing a trace.

All supplementary files are hidden from the user and are handled internally by the tool. However, you can delete the supplementary files so that they are recreated the next time the trace is opened.

To delete all supplementary files from one or many traces and experiments:

  1. Select the relevant traces and experiments in the Project Explorer view.
  2. Right-click and select Delete Supplementary Files from the context menu that appears. The Delete Resources page, with a list of supplementary files, grouped under the trace or experiment they belong to, appears.

  3. Select the file(s) to delete from the list.
  4. Click OK.

Link with Editor

The tracing projects support the Link With Editor feature of the Project Explorer view. With this feature it is now possible to do the following:

  • Select a trace element in the Project Explorer view and the corresponding Events editor will get focus, if the relevant trace is open.
  • Select an Events editor and the corresponding trace element will be highlighted in the Project Explorer view.

To enable or disable this feature toggle the Link With Editor button of the Project Explorer view as shown below.



Exporting a Trace Package

The Export Trace Package wizard allows users to select a trace and export its files and bookmarks to an archive on a storage medium. The Traces folder holds the set of traces available for a tracing project. To export traces contained in the Traces folder:

  1. Select File > Export. The Export page appears.
  2. Expand Tracing and select Trace Package Export.
  3. Click Next. The Export trace package page appears.

  4. Select the project containing the traces and then the traces to be exported.
  5. You can also open the Export trace package wizard by expanding the project in the Project Explorer view, selecting the traces under the Traces folder, and selecting Export Trace Package from the context menu that appears.

  6. You can now select the content to export and various format options for the resulting file.
  7. Click Finish to generate the package and save it to the media. The folder structure of the selected traces relative to the Traces folder is preserved in the trace package.

Importing a Trace Package

The Import Trace Package wizard allows users to select a previously exported trace package from a storage medium and import the content of the package into the workspace.

The Traces folder holds the set of traces available for a tracing project. To import a trace package to the Traces folder of a project:

  1. Select File > Import. The Import page appears.
  2. Expand Tracing and select Trace Package Import.
  3. Click Next. The Import trace package page appears.
  4. Select the archive containing the traces and the destination project.
  5. You can also open the Import Trace Package wizard by expanding the project in the Project Explorer view and selecting Import Trace Package from the context menu that appears.

  6. You can now select the content to import from the selected trace archive.
  7. Click Finish to import the trace to the target folder. The folder structure from the trace package is restored in the Traces folder of the project.

Events Editor

The Events editor shows the basic trace data elements (events) in a tabular format. The editors can be dragged in the editor area so that several traces may be shown side by side, as shown in the following figure.



The header displays the current trace name. The page displays the following fields.

Timestamp
The event timestamp.
Type
The event type (PS/APM/MicroBlaze).
Content
The raw event content obtained from the hardware server.

The first row of the table is the header row. You can search and filter the information on the page, using this row.

The highlighted event is the current event, and is synchronized with the other views. If you select another event, the other views will be updated accordingly. The properties view will display a more detailed view of the selected event.

An event range can be selected by holding the Shift key while clicking another event or using any of the cursor keys (Up, Down, PageUp, PageDown, Home, and End). The first and last events in the selection will be used to determine the current selected time range for synchronization with the other views.

The Events editor can be closed, disposing a trace. When this is done, all the tabs displaying the information will be updated with the trace data of the next event editor tab. If all the editor tabs are closed, the tabs will display their empty states.

Searching and Filtering Events

Searching and filtering of events in the table can be performed by entering matching conditions in one or multiple columns in the header row (the first row below the column header).

To toggle between searching and filtering, click on the Search or Filter button in the left margin of the header row, or right-click on the header row and select Show Filter Bar or Show Search Bar in the context menu.

To apply a matching condition to a specific column, click on the column's header row cell, type in a regular expression and press the Enter key. You can also enter a simple text string and it will automatically be replaced with a 'contains' regular expression.

When matching conditions are applied to two or more columns, all conditions must be met for the event to match (that is, 'and' behavior).
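
The matching rules above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the tool's code: the `event_matches` function, the dictionary representation of an event, and the sample content string are hypothetical. Python's `re.search` finds a pattern anywhere in the value, so a plain text string behaves like a 'contains' condition as long as it has no regular expression metacharacters.

```python
import re


def event_matches(event: dict, conditions: dict) -> bool:
    """Return True only if every column condition matches ('and' behavior).

    `conditions` maps a column name to a regular expression. A column that
    is missing from the event is treated as an empty string.
    """
    return all(
        re.search(pattern, str(event.get(column, ""))) is not None
        for column, pattern in conditions.items()
    )


# Hypothetical event row with the three columns shown in the Events editor.
sample_event = {
    "Timestamp": "0.010",
    "Type": "APM",
    "Content": "slot0 read_bytes=64",
}
```

With this sketch, `event_matches(sample_event, {"Type": "APM", "Content": r"read_bytes=\d+"})` matches because both column conditions hold, while adding a condition such as `{"Type": "PS"}` makes the whole match fail.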

To clear all matching conditions in the header row, press the Delete key.

Searching an Event

When a searching condition is applied to the header row, the table selects the next matching event starting from the top currently displayed event. Wrapping occurs if there is no match until the end of the trace.

All matching events have a Search match button in their left margin. Non-matching events are dimmed.



Press Enter to search for and select the next matching event. Press Shift+Enter to search for and select the previous matching event. Wrapping occurs in both directions.

Press Esc to cancel an ongoing search.

Press Del to clear the header row and reset all events to normal.
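
The wrap-around search described in this section can be pictured as a circular scan over the event list. The following sketch is a simplified model under that assumption; the `find_match` function and its parameters are hypothetical and do not correspond to any tool API.

```python
from typing import Callable, Optional, Sequence


def find_match(events: Sequence, predicate: Callable[[object], bool],
               start: int, forward: bool = True) -> Optional[int]:
    """Return the index of the next (or previous) matching event.

    The scan begins next to `start`, wraps around the end (or beginning)
    of the trace, and checks `start` itself last. Returns None when no
    event matches.
    """
    count = len(events)
    step = 1 if forward else -1
    for offset in range(1, count + 1):
        index = (start + step * offset) % count
        if predicate(events[index]):
            return index
    return None
```

Calling the function with `forward=True` models pressing Enter, and `forward=False` models pressing Shift+Enter.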

Filtering an Event

When a filtering condition is entered in the header row, the table will clear all events and fill itself with matching events as they are found from the beginning of the trace.

A status row will be displayed before and after the matching events, dynamically showing how many matching events were found and how many events were processed so far. When the filtering is completed, the status row icon in the left margin will change from a stop to a filter icon.



Press Esc to stop an ongoing filtering operation. In this case, the status row icon will remain a 'stop' icon to indicate that not all events were processed.

Press Del, or right-click on the table and select Clear Filters from the context menu, to clear the header row and remove the filtering. All trace events will now be shown in the table. Note that the currently selected event will remain selected even after the filter is removed.

You can also search on the subset of filtered events by toggling the header row to the Search Bar while a filter is applied. Searching and filtering conditions are independent of each other.

Bookmarking an Event

Any event of interest can be tagged with a bookmark.

To add a bookmark, double-click the left margin next to an event, or right-click the margin and select Add bookmark. Alternatively, use the Edit > Add bookmark menu. Edit the bookmark description as desired and click OK.

The bookmark will be displayed in the left margin, and hovering the mouse over the bookmark icon will display the description in a tooltip.

The bookmark will be added to the Bookmarks view. In this view, the bookmark description can be edited, and the bookmark can be deleted. Double-clicking the bookmark or selecting Go to from its context menu will open the trace or experiment and go directly to the event that was bookmarked.

To remove a bookmark, double-click its icon, select Remove Bookmark from the left margin context menu, or select Delete from the Bookmarks view.

Histogram View

The Histogram view displays the distribution of trace events (counter data) over time. When performance analysis is running, this view is dynamically updated as the events are received.



The controls on the view are described below.

Selection Start
Displays the start time of the current selection.
Selection End
Displays the end time of the current selection.
Window Span
Displays the current zoom window size in seconds.

The controls can be used to modify their respective value. After validation, the other controls and views will be synchronized and updated accordingly. To modify both selection times simultaneously, press the link button which disables the Selection End control input.

The large (full) histogram, at the bottom, shows the event distribution over the trace. It also has a smaller semi-transparent orange window, with a cross-hair, that shows the current zoom window.

The smaller (zoom) histogram, on top right, corresponds to the current zoom window, a sub-range of the event set.

The x-axis of each histogram corresponds to the event timestamps. The start time and end time of the histogram range is displayed. The y-axis shows the maximum number of events in the corresponding histogram bars.
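
To make the relationship between timestamps, bars, and the y-axis value concrete, the following sketch bins event timestamps into a fixed number of equal-width bars. It is a simplified model: the `histogram` function and the fixed bar count are assumptions, and the view's actual binning strategy is not documented here.

```python
from typing import List, Sequence


def histogram(timestamps: Sequence[float], start: float, end: float,
              num_bars: int) -> List[int]:
    """Count events per bar over the time range [start, end]."""
    if num_bars <= 0 or end <= start:
        raise ValueError("need a positive bar count and a non-empty range")
    counts = [0] * num_bars
    bar_width = (end - start) / num_bars
    for t in timestamps:
        if start <= t <= end:
            # An event exactly at `end` belongs to the last bar.
            index = min(int((t - start) / bar_width), num_bars - 1)
            counts[index] += 1
    return counts
```

In this model, the value shown on the y-axis corresponds to `max(counts)`, and zooming amounts to recomputing the counts for a narrower `start` to `end` range.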

The vertical blue line(s) show the current selection time (or range). If applicable, the region in the selection range will be shaded.

The mouse actions that can be used to control the histogram are listed below.

Left-click
Sets a selection time
Left-drag
Sets a selection range
Shift+left-click or drag
Extends or shrinks the selection range
Middle-click or Ctrl+left-click
Centers the zoom window
Middle-drag or Ctrl+left-drag
Moves the zoom window
Right-drag
Sets the zoom window
Shift+right-click or drag
Extends or shrinks the zoom window
Mouse wheel up
Zoom in
Mouse wheel down
Zoom out

Hovering the mouse over a histogram bar pops up an information window that displays the start/end time of the corresponding bar, as well as the number of events it represents. If the mouse is over the selection range, the selection span in seconds is displayed.

The actions performed by various keystrokes when they are used in the Histogram view are listed below.

Left Arrow
Moves the current event to the previous non-empty bar.
Right Arrow
Moves the current event to the next non-empty bar.
Home
Sets the current time to the first non-empty bar.
End
Sets the current time to the last non-empty histogram bar.
Plus (+)
Zoom in
Minus (-)
Zoom out

Colors View

The Colors view allows you to define a prioritized list of color settings.



A color setting associates a foreground and background color (used in any events table), and a tick color (used in the Time Chart view), with an event filter.

In an events table, any event row that matches the event filter of a color setting will be displayed with the specified foreground and background colors. If the event matches multiple filters, the color setting with the highest priority will be used.
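
The priority rule can be read as a first-match lookup over an ordered list. The sketch below assumes that the list is ordered from highest to lowest priority and models each event filter as a plain Python predicate; the `ColorSetting` type and `pick_color_setting` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ColorSetting:
    """Hypothetical color setting: colors plus an event filter."""
    foreground: str
    background: str
    tick: str
    event_filter: Callable[[dict], bool]


def pick_color_setting(event: dict,
                       settings: List[ColorSetting]) -> Optional[ColorSetting]:
    """Return the first (highest-priority) setting whose filter matches."""
    for setting in settings:
        if setting.event_filter(event):
            return setting
    return None  # no match: the event keeps its default colors
```

Reordering the list changes which setting wins when several filters match the same event.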

The same principle applies to the event tick colors in the Time Chart view. If a tick represents many events, the tick color of the highest priority matching event will be used.

Color settings can be inserted, deleted, reordered, imported and exported using the buttons in the Colors view toolbar. Changes to the color settings are applied immediately, and are persisted to disk.

Filters View

The Filters view allows you to define preset filters that can be applied to any events table.



The filters can be more complex than what can be achieved with the filter header row in the events table. The filter is defined in a tree node structure, where the node types can be any of TRACETYPE, AND, OR, CONTAINS, EQUALS, MATCHES, or COMPARE. Some node types have restrictions on their possible children in the tree.

The TRACETYPE node filters against the trace type of the trace as defined in a plug-in extension or in a custom parser. When used, any child node will have its aspect combo box restricted to the possible aspects of that trace type.

The AND node applies the logical and condition on all of its children. All children conditions must be true for the filter to match. A not operator can be applied to invert the condition.

The OR node applies the logical or condition on all of its children. At least one child condition must be true for the filter to match. A not operator can be applied to invert the condition.

The CONTAINS node matches when the specified event aspect value contains the specified value string. A not operator can be applied to invert the condition. The condition can be case sensitive or insensitive.

The EQUALS node matches when the specified event aspect value equals exactly the specified value string. A not operator can be applied to invert the condition. The condition can be case sensitive or insensitive.

The MATCHES node matches when the specified event aspect value matches against the specified regular expression. A not operator can be applied to invert the condition.

The COMPARE node matches when the specified event aspect value compared with the specified value gives the specified result. The result can be set to smaller than, equal, or greater than. The type of comparison can be numerical, alphanumerical, or based on time stamp. A not operator can be applied to invert the condition.

For numerical comparisons, strings prefixed by "0x", "0X" or "#" are treated as hexadecimal numbers and strings prefixed by "0" are treated as octal numbers.

For time stamp comparisons, strings are treated as seconds with or without fraction of seconds. This corresponds to the TTT format in the Time Format preferences. The value for a selected event can be found in the Properties view under the Timestamp property. The common 'Timestamp' aspect can always be used for time stamp comparisons regardless of its time format.
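
The prefix rules for numerical comparisons can be illustrated with a small parser. This is a sketch of the stated rules only; the `parse_filter_number` and `compare_numeric` functions are hypothetical, handle non-negative integers only, and are not part of any tool API.

```python
def parse_filter_number(text: str) -> int:
    """Parse a value using the prefix rules for numerical comparisons."""
    s = text.strip()
    if s[:2] in ("0x", "0X"):
        return int(s[2:], 16)   # hexadecimal with 0x or 0X prefix
    if s.startswith("#"):
        return int(s[1:], 16)   # hexadecimal with # prefix
    if len(s) > 1 and s.startswith("0"):
        return int(s[1:], 8)    # leading 0 means octal
    return int(s)               # plain decimal


def compare_numeric(aspect_value: str, filter_value: str) -> int:
    """Return -1, 0, or 1 for smaller than, equal, or greater than."""
    a = parse_filter_number(aspect_value)
    b = parse_filter_number(filter_value)
    return (a > b) - (a < b)
```

Note that under these rules a value such as 017 is read as octal (decimal 15), which can be surprising when a zero-padded decimal number was intended.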

Filters can be added, deleted, imported and exported using the buttons in the Filters view toolbar. The nodes in the view can be Cut (Ctrl-X), Copied (Ctrl-C) and Pasted (Ctrl-V) by using the buttons in the toolbar or by using the key bindings. This makes it easier to quickly build new filters from existing ones. Changes to the preset filters are only applied and persisted to disk when the Save filters button is pressed.
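
The node semantics described in this section can be modeled as a small recursive evaluator. The following sketch covers the AND, OR, CONTAINS, EQUALS, and MATCHES nodes together with the not operator and case sensitivity option. The `Node` class, its field names, and the dictionary representation of an event are hypothetical; TRACETYPE and COMPARE are omitted to keep the example short.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical filter tree node."""
    kind: str                  # "AND", "OR", "CONTAINS", "EQUALS", "MATCHES"
    aspect: str = ""           # event aspect (column) to test
    value: str = ""            # value string or regular expression
    negate: bool = False       # the 'not' operator
    ignore_case: bool = False  # used by CONTAINS and EQUALS only
    children: List["Node"] = field(default_factory=list)


def evaluate(node: Node, event: dict) -> bool:
    """Evaluate a filter tree against one event."""
    if node.kind == "AND":
        result = all(evaluate(child, event) for child in node.children)
    elif node.kind == "OR":
        result = any(evaluate(child, event) for child in node.children)
    elif node.kind == "MATCHES":
        actual = str(event.get(node.aspect, ""))
        result = re.search(node.value, actual) is not None
    elif node.kind in ("CONTAINS", "EQUALS"):
        actual = str(event.get(node.aspect, ""))
        expected = node.value
        if node.ignore_case:
            actual, expected = actual.lower(), expected.lower()
        result = (expected in actual) if node.kind == "CONTAINS" \
            else (actual == expected)
    else:
        raise ValueError(f"unsupported node kind: {node.kind}")
    # Applying 'not' flips the result of this node.
    return result != node.negate
```

A filter such as "type equals APM and content does not contain error" becomes an AND node with one EQUALS child and one negated CONTAINS child.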

Time Chart View

The Time Chart view allows you to visualize every open trace in a common time chart. Each trace is displayed in its own row, and ticks are displayed for every punctual event. As you zoom using the mouse wheel, or by right-clicking and dragging in the time scale, more detailed event data is computed from the traces.



Time synchronization is enabled between the time chart view and other trace viewers such as the events table.

Color settings defined in the Colors view can be used to change the tick color of events displayed in the Time Chart view.

When a search is applied in the events table, the ticks corresponding to matching events in the Time Chart view are decorated with a marker below the tick.

When a bookmark is applied in the events table, the tick corresponding to the bookmarked event in the Time Chart view is decorated with a bookmark above the tick.

When a filter is applied in the events table, the non-matching ticks are removed from the Time Chart view.

The Time Chart view only supports traces that are opened in an editor. The use of an editor is specified in the plug-in extension for that trace type, or is enabled by default for custom traces.

Analysis Views

For each of the different types of trace (PS, APM, and so on) collected, there is a set of views to help in analyzing it. There are two types of views: tabular and graphical.

You can view the analysis of trace data both in live mode, when the data collection is running, and in offline mode. In live mode, the tabular view displays analysis for the entire trace duration, whereas the graphical view displays analysis for the last 20 seconds. In offline mode, the graphical view displays the zoomed region, whereas the tabular view displays the selection region or the zoomed region, depending on which was set by the most recent user action. In live mode, to pause the views and view the past data, use the button present in the analysis views. When the views are paused, the Histogram view can be used to zoom and analyze any portion of the data.

These analysis views display the data only when the corresponding trace file is opened in the Events editor; otherwise they will be empty.

PS Performance Graphs

All the PS (Arm) metrics will be displayed using these graphs.

PS Performance Counters

Tabular representation of the PS (Arm) metrics.



APM Performance Graphs

APM metrics are displayed using the graphs.



APM Performance Counters

APM metrics displayed in tabular format.



FreeRTOS Analysis

FreeRTOS event trace displayed in different states.



Performance Session Manager

The Performance Session Manager view provides you with the capability to control the sessions. You can start and stop a performance session from this view. Each time a session is started, a set of trace files is created based on your configuration.

Whenever an application is debugged or performance analysis is launched, the view automatically populates the entry for the active configuration.

Configure Session

You can configure a session by choosing the list of modules for which the data has to be collected. Each of the modules will be enabled based on the design information.

If you wish to configure the modules prior to starting performance analysis, use the Configure Performance Analysis option on the hardware Project.

Configure APM

You can choose which APM slots to monitor by selecting the Configure APM option on the Configure Session page.

Configure MicroBlaze

You can choose the MicroBlaze instances for performance analysis by using the Configure MicroBlaze option on the Configure Session page. By default, only instances from the first MDM module will be selected.

Offline Mode

Viewing the live performance analysis is supported only for a duration of 10 minutes; the analysis stops automatically after that time has elapsed. When Offline Mode is selected, the performance analysis runs indefinitely until you stop it manually from the view.

Modify ATG Configuration

You can modify the ATG traffic configuration using the Modify ATG Configuration option available in the Performance Session Manager.

System Performance Modeling

System Performance Modeling (SPM) offers system-level performance analysis for characterizing and evaluating the performance of hardware and software systems. In particular, it enables analysis of the critical partitioning trade-offs between the Arm® Cortex-A9 processors and the programmable fabric for a variety of different traffic scenarios. It provides graphical visualizations of AXI transaction traces and system-level performance metrics such as throughput, latency, utilization, and congestion.

SPM can be used in two ways:

  • Using a predefined design provided with the Vitis software platform
  • With the user design

In the current release, SPM is supported only for baremetal/standalone applications.

The following diagram shows the system performance modeling flow.



Predefined Design Flow

The predefined flow provided with the Vitis software platform uses a fixed design and comes with a fixed bitstream. In this design, there are five AXI Traffic Generators (ATGs), with one connected to each of the four High Performance ports (HP0-3) and one connected to the Accelerator Coherency Port (ACP). The ATGs are set up and controlled using one of the General Purpose (GP) ports. In addition, an AXI Performance Monitor (APM) is included in order to monitor the AXI traffic on the HP0-3 and ACP ports.

System Performance Modeling Using the Predefined Design

Creating the System Performance Modeling Project
  1. Select File > New > Other > Xilinx > SPM Project... to start the System Performance Modeling application.
  2. Click Finish.
  3. The SPM Launcher opens.
  4. To start the SPM with the default traffic configuration, click Debug.
  5. The tool first programs the FPGA and then starts the SPM.

Selecting an ATG Traffic Configuration

To select a traffic configuration:

  1. In the Project Explorer, right-click the hardware platform and select Run As > Run Configurations.
  2. Under Performance Analysis, select Performance Analysis on <filename>.elf.
  3. You can use the ATG Configuration view to define multiple traffic configurations and select the traffic to be used for the current run. The following figure shows the traffic that is defined in the Default configuration.
  4. The port location is taken from the Hardware handoff file. If no ATG was configured in the design, the ATG Configuration view is empty.
  5. You can use the ATG Configuration page to add and edit configurations.
  6. To add a configuration to the list of configurations, click the + button.
  7. To edit a configuration, use the Configuration drop-down list to choose the configuration that you want to edit.
  8. For ease of defining an ATG configuration, you can create Configuration Templates. These templates are saved for the user workspace and can be used across the Projects for ATG traffic definitions. To create a template, do the following:
  9. Click Configure Templates.
  10. Click the + button to add a new user-defined configuration template.
  11. The newly created template is assigned a Template ID with the pattern of "UserDef_*" by default. You can change the ID and also define the rest of the fields.
  12. You can use these defined templates to define an ATG configuration. To delete a Configuration Template, select it and click the X button.
    TIP: In an ATG configuration, to set a port so that it does not have any traffic, set the Template ID for that port to None.

Configure FSBL Parameters

Changing the first stage bootloader (FSBL) configuration is only available for the fixed design flow of the System Performance Modeling application.

To invoke the FSBL Configuration Change page, right-click the configuration name and select Configure FSBL Parameters.

Below are the details about the first stage bootloader (FSBL) parameters.

Table 1. FSBL Parameters

| Parameter | Description | Default Value |
|---|---|---|
| PS Clock Frequency (MHz) | The clock frequency of the Zynq-7000 SoC PS (specified in MHz). | 666.7 MHz |
| PL Clock Frequency (MHz) | The clock frequency of the Zynq-7000 SoC PL (specified in MHz). | 100.0 MHz |
| DDR Clock Frequency (MHz) | The clock frequency of the DDR memory (specified in MHz). | 533.3 MHz |
| DDR Data Path Width | The bit width used in the DDR memory data path. Possible values are 16 and 32 bits. | 32 bits |
| DDR Port 0 - Enable HPR | Enables the usage of high priority reads on DDR port 0. This port is used by the CPUs and the ACP via the L2 Cache. | Unchecked |
| DDR Port 1 - Enable HPR | Enables the usage of high priority reads on DDR port 1. This port is used by other masters via the central interconnect. | Unchecked |
| DDR Port 2 - Enable HPR | Enables the usage of high priority reads on DDR port 2. This port is used by HP2 and HP3. | Unchecked |
| DDR Port 3 - Enable HPR | Enables the usage of high priority reads on DDR port 3. This port is used by HP0 and HP1. | Unchecked |
| HPR/LPR Queue Partitioning | Indicates the desired partitioning for high and low priority reads in the queue. Note that the queue has a depth of 32 read requests. There are four values provided in a drop-down menu. | HPR(0)/LPR(32) |
| LPR to Critical Priority Level | The number of clocks that the LPR queue can be starved before it goes critical. Unit: 32 DDR clock cycles. This value sets the DDR LPR_reg register [1]. Valid values are between 0 and 2047. | 2 |
| HPR to Critical Priority Level | The number of clocks that the HPR queue can be starved before it goes critical. Unit: 32 DDR clock cycles. This value sets the DDR HPR_reg register [1]. Valid values are between 0 and 2047. | 15 |
| Write to Critical Priority Level | The number of clocks that the write queue can be starved before it goes critical. Unit: 32 DDR clock cycles. This value sets the DDR WR_reg register [1]. Valid values are between 0 and 2047. | 2 |
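
Because the three critical priority levels are expressed in units of 32 DDR clock cycles, it can help to convert them to time. The sketch below does this under the assumption that the cycles are counted at the DDR Clock Frequency parameter listed in the table (533.3 MHz by default); the `starvation_time_ns` function is a hypothetical helper, not part of the tool.

```python
def starvation_time_ns(level: int, ddr_clock_mhz: float = 533.3) -> float:
    """Convert a critical priority level to nanoseconds.

    `level` is the parameter value (0 to 2047); each unit is 32 DDR clock
    cycles. The clock frequency is assumed to be the DDR Clock Frequency
    parameter.
    """
    if not 0 <= level <= 2047:
        raise ValueError("level must be between 0 and 2047")
    cycles = level * 32
    # cycles / MHz gives microseconds; multiply by 1000 for nanoseconds.
    return cycles / ddr_clock_mhz * 1000.0
```

Under this assumption, the default level of 2 corresponds to 64 cycles, or roughly 120 ns, and the default HPR level of 15 corresponds to 480 cycles, or roughly 900 ns.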

For more information about the FSBL, refer to Zynq-7000 SoC Software Developers Guide (UG821).

User-Defined Flow

Performance analysis can be done on any user-defined applications.

System Performance Modeling Using a User-Defined Flow

The Vitis software platform provides the capability to monitor a running target regardless of the target operating system.

Note: If no ATG is configured in the hardware, the ATG Configuration view will be empty. Make sure to remove the Breakpoints by selecting Window > Show View > Breakpoints.
  1. If your design is defined in the Vivado Design Suite, then it is recommended to create a platform specification based on the design. To do performance analysis based on the specification:
    1. Build and export your bitstream using File > Export > Export Hardware in the Vivado Design Suite.
    2. In the Vitis™ software platform, select File > New > Platform Project and import the generated file <your design>.xsa into the Vitis software platform.
    3. Right-click on the platform project and select Run As > Run Configurations.
    4. Select SPM Analysis and click the New button to create a performance analysis configuration.
    5. Select Standalone Application Debug from the Debug Type dropdown list.
    6. Select the imported hardware platform specification from the Hardware platform dropdown list.
    7. Select the Reset entire system and Program FPGA check boxes.
    8. Click Run to launch the Performance Analysis perspective.
  2. If, for any reason, you cannot create a hardware platform specification, or do not have one, you can still do performance analysis in the Vitis software platform. To do performance analysis in the absence of the specification:
    1. Select Run > Run Configurations. Click OK to save the details and close the Configure APM page.
    2. Select Single Application Debug and click the New button to create a new configuration.
    3. Select Attach to running target from the Debug Type dropdown list.
    4. Click Performance Analysis Configuration to edit the PS and APM settings.
    5. Click ATG configuration to see the ATG configuration.
    6. Click Apply and run to see the data.

Limitations

  • The Vitis software platform supports SPM only for baremetal/standalone applications.
  • Versal devices do not support SPM.