Optimize: Performance Analysis
Performance analysis in the Vitis software platform provides functionality for viewing and analyzing different types of performance data. Its goal is to provide views, graphs, metrics, etc. to help extract useful information from the data, in a way that is more user-friendly and informative than huge text dumps.
Performance analysis provides the following features:
- Support for viewing Arm data.
- Support for viewing APM data with PS and MDM as master.
- Support for viewing MicroBlaze data.
- Support for viewing and analyzing live data.
- Support for offline viewing of data.
- Support for zooming out/in of the data.
- Event filtering and searching.
- Import and export of trace packages.
The Performance analysis feature in the Vitis software platform supports data collection from AXI Performance Monitor (APM) Event Counters, Arm Performance Monitor Unit (PMU) from a Zynq-7000 SoC processing system, and MicroBlaze performance monitoring counters. For an example usage of performance monitoring on a Zynq device, refer to System Performance Modeling. For a MicroBlaze design, APM can be used in a similar way as SPM.
The following MicroBlaze performance counter events are monitored:
- Number of clock cycles
- Any valid instruction executed
- Read or write data request from/to data cache
- Read or write data cache hit
- Pipeline stalled
- Instruction cache latency for memory read
The data is collected in the Vitis software platform in real time. The values from these counters are sampled every 10 ms. These values are used to calculate metrics shown in the Performance Counters view.
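The sampling described above can be illustrated with a minimal sketch. It assumes the counters are cumulative and that metrics are derived from the difference between two consecutive 10 ms samples. The `CounterSample` type, the field names, and the chosen metrics (instructions per cycle, data cache hit rate) are illustrative assumptions, not the formulas the Performance Counters view is documented to use.

```python
from dataclasses import dataclass

SAMPLE_PERIOD_S = 0.010  # counters are sampled every 10 ms (from the text above)


@dataclass
class CounterSample:
    """One hypothetical snapshot of cumulative counter values."""
    cycles: int
    instructions: int
    dcache_requests: int
    dcache_hits: int


def interval_metrics(prev: CounterSample, curr: CounterSample) -> dict:
    """Derive per-interval metrics from two consecutive samples."""
    cycles = curr.cycles - prev.cycles
    instructions = curr.instructions - prev.instructions
    requests = curr.dcache_requests - prev.dcache_requests
    hits = curr.dcache_hits - prev.dcache_hits
    return {
        # Guard against empty intervals to avoid division by zero.
        "ipc": instructions / cycles if cycles else 0.0,
        "dcache_hit_rate": hits / requests if requests else 0.0,
        "cycles_per_second": cycles / SAMPLE_PERIOD_S,
    }
```

Working on differences rather than raw values keeps each metric tied to one sampling interval, which is what a time-based graph needs.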
The Vitis software platform monitors the following PMU events for each Cortex-A9 CPU:
- Data cache refill
- Data cache access
- Data stall
- Write stall
- Instruction rename
- Branch miss
The following two Level-2 cache controller (L2C-PL310) counters are monitored:
- Number of cache hits
- Number of cache accesses
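Counter pairs such as these are typically reduced to ratios. The sketch below is a hedged illustration only: the function names are hypothetical, and treating a data cache refill as an L1 miss is an assumption, not a statement of how the tool computes its metrics.

```python
def ratio(numerator: int, denominator: int) -> float:
    """Return numerator/denominator, or 0.0 when nothing was counted."""
    return numerator / denominator if denominator else 0.0


def l1_data_miss_rate(dcache_refills: int, dcache_accesses: int) -> float:
    # Assumption: each data cache refill corresponds to one L1 data cache miss.
    return ratio(dcache_refills, dcache_accesses)


def l2_hit_rate(l2_hits: int, l2_accesses: int) -> float:
    # Ratio of the two Level-2 cache controller counters listed above.
    return ratio(l2_hits, l2_accesses)
```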
The following APM counters for each HP and ACP port are monitored:
- Write Byte Count
- Read Byte Count
- Write Transaction Count
- Total Write Latency
- Read Transaction Count
- Total Read Latency
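The six APM counters above are enough to derive throughput and average latency per port. The following is a minimal sketch under stated assumptions: the dictionary keys are hypothetical names, the counter values are taken to cover one measurement interval, and latency totals are assumed to be in clock cycles.

```python
def apm_port_metrics(counters: dict, interval_s: float) -> dict:
    """Derive throughput and average latency for one HP or ACP port."""

    def average(total: int, count: int) -> float:
        # Avoid division by zero when no transactions were observed.
        return total / count if count else 0.0

    return {
        "write_throughput_mb_s": counters["write_bytes"] / interval_s / 1e6,
        "read_throughput_mb_s": counters["read_bytes"] / interval_s / 1e6,
        "avg_write_latency": average(
            counters["total_write_latency"], counters["write_transactions"]
        ),
        "avg_read_latency": average(
            counters["total_read_latency"], counters["read_transactions"]
        ),
    }
```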
Working with the Performance Analysis Perspective
The Performance Analysis perspective comprises many views that let you collect and analyze the performance data, referred to as a trace.
Explorer View
The Project Explorer view displays all the available projects in the workspace. When a performance analysis session is launched, the data from the board is collected and stored as trace files in a tracing project. Each hardware project has a corresponding tracing project, *_Traces, where the data is stored. Performance counter data from a single run is stored under a designated Run_* folder. Data from different sections is stored in different files under the run folder.
To analyze the data, double-click the trace file to open it in an Events editor view. After the file is opened, the tree under the trace file can be expanded to view the list of available analysis views.
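The naming pattern of the tracing projects can be used to locate runs from a script. This is a rough sketch that assumes the *_Traces projects sit directly under the workspace directory and the Run_* folders sit directly under each tracing project. The on-disk layout is an assumption and may differ in a real workspace. The function name `list_runs` is hypothetical.

```python
from pathlib import Path


def list_runs(workspace: Path) -> dict:
    """Map each tracing project name to its sorted Run_* folder names."""
    runs = {}
    for project in sorted(workspace.glob("*_Traces")):
        if project.is_dir():
            runs[project.name] = sorted(
                p.name for p in project.glob("Run_*") if p.is_dir()
            )
    return runs
```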
Deleting Supplementary Files
Supplementary files are, by definition, trace-specific files that accompany a trace. These files could be temporary files, persistent indexes, or any other persistent data files created by the tool while parsing a trace.
All supplementary files are hidden from the user and are handled internally by the tool. However, you can delete the supplementary files so that they are recreated when the trace is next opened.
To delete all supplementary files from one or many traces and experiments:
- Select the relevant traces and experiments in the Project Explorer view.
- Right-click and select Delete Supplementary Files from the context menu that appears. The Delete Resources page appears, with a list of supplementary files grouped under the trace or experiment they belong to.
- Select the file(s) to delete from the list.
- Click OK.
Link with Editor
The tracing projects support the Link With Editor feature of the Project Explorer view. With this feature it is now possible to do the following:
- Select a trace element in the Project Explorer view and the corresponding Events editor will get focus, if the relevant trace is open.
- Select an Events editor and the corresponding trace element will be highlighted in the Project Explorer view.
To enable or disable this feature toggle the Link With Editor button of the Project Explorer view as shown below.
Exporting a Trace Package
The Export Trace Package wizard allows users to select a trace and export its files and bookmarks to an archive on a storage medium. The Traces folder holds the set of traces available for a tracing project. To export traces contained in the Traces folder:
- Select . The Export page appears.
- Expand Tracing and select Trace Package Export.
- Click Next. The Export trace package page appears.
- Select the project containing the traces and then the traces to be exported.
- You can also open the Export trace package wizard by expanding the project in the Project Explorer view, selecting the traces under the Traces folder, and selecting Export Trace Package from the context menu that appears.
- You can now select the content to export and various format options for the resulting file.
- Click Finish to generate the package and save it to the media. The folder structure of the selected traces relative to the Traces folder is preserved in the trace package.
Importing a Trace Package
The Import Trace Package wizard allows users to select a previously exported trace package from their storage media and import the content of the package into the workspace.
The Traces folder holds the set of traces available for a tracing project. To import a trace package to the Traces folder of a project:
- Select File > Import from the main menu. The Import page appears.
- Expand Tracing and select Trace Package Import.
- Click Next. The Import trace package page appears.
- Select the archive containing the traces and the destination project.
- You can also open the Import Trace Package wizard by expanding the project in the Project Explorer view and selecting Import Trace Package from the context menu that appears.
- You can now select the content to import from the selected trace archive.
- Click Finish to import the trace to the target folder. The folder structure from the trace package is restored in the Traces folder of the project.
Events Editor
The Events editor shows the basic trace data elements (events) in a tabular format. The editors can be dragged in the editor area so that several traces may be shown side by side, as shown in the following figure.
The header displays the current trace name. The page displays the following fields.
- Timestamp: The event timestamp.
- Type: The event type (PS/APM/MicroBlaze).
- Content: The raw event content obtained from the hardware server.
The first row of the table is the header row. You can search and filter the information on the page, using this row.
The highlighted event is the current event, and is synchronized with the other views. If you select another event, the other views will be updated accordingly. The properties view will display a more detailed view of the selected event.
An event range can be selected by holding the Shift key while clicking another event or using any of the cursor keys (Up, Down, PageUp, PageDown, Home, and End). The first and last events in the selection will be used to determine the current selected time range for synchronization with the other views.
The Events editor can be closed, disposing a trace. When this is done, all the tabs displaying the information will be updated with the trace data of the next event editor tab. If all the editor tabs are closed, the tabs will display their empty states.
Searching and Filtering Events
Searching and filtering of events in the table can be performed by entering matching conditions in one or multiple columns in the header row (the first row below the column header).
To toggle between searching and filtering, click on the Search or Filter button in the left margin of the header row, or right-click on the header row and select Show Filter Bar or Show Search Bar in the context menu.
To apply a matching condition to a specific column, click on the column's header row cell, type in a regular expression and press the Enter key. You can also enter a simple text string and it will automatically be replaced with a 'contains' regular expression.
When matching conditions are applied to two or more columns, all conditions must be met for the event to match (that is, 'and' behavior).
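The matching rule can be sketched in a few lines. This is an illustration, not the tool's implementation: the event is modeled as a plain dictionary keyed by column name, and `re.search` is used so that a simple text string behaves like a 'contains' condition. Both choices are assumptions.

```python
import re


def matches(event: dict, conditions: dict) -> bool:
    """True when every column condition matches the event ('and' behavior)."""
    return all(
        re.search(pattern, str(event.get(column, ""))) is not None
        for column, pattern in conditions.items()
    )
```

For example, with the hypothetical event `{"Type": "APM", "Content": "Write Byte Count=64"}`, the conditions `{"Type": "APM", "Content": r"Byte Count=\d+"}` match, while `{"Type": "PS", "Content": "Write"}` do not, because the Type condition fails.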
To clear all matching conditions in the header row, press the Delete key.
Searching an Event
When a searching condition is applied to the header row, the table selects the next matching event starting from the top currently displayed event. Wrapping occurs if there is no match until the end of the trace.
All matching events have a Search match button in their left margin. Non-matching events are dimmed.
Press Enter to search for and select the next matching event. Press Shift+Enter to search for and select the previous matching event. Wrapping occurs in both directions.
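The wrap-around behavior for next and previous search can be sketched as follows. The function name `find_match` and the list-based event model are hypothetical; the sketch only shows the index arithmetic, assuming the search starts from the event after (or before) the current one.

```python
from typing import Callable, Optional, Sequence


def find_match(
    events: Sequence,
    is_match: Callable[[object], bool],
    start: int,
    forward: bool = True,
) -> Optional[int]:
    """Return the index of the next (or previous) match, wrapping around."""
    n = len(events)
    step = 1 if forward else -1
    # Visit every other event once, then the starting event last.
    for offset in range(1, n + 1):
        index = (start + step * offset) % n
        if is_match(events[index]):
            return index
    return None
```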
Press Esc to cancel an ongoing search.
Press Del to clear the header row and reset all events to normal.
Filtering an Event
When a filtering condition is entered in the header row, the table will clear all events and fill itself with matching events as they are found from the beginning of the trace.
A status row will be displayed before and after the matching events, dynamically showing how many matching events were found and how many events were processed so far. When the filtering is completed, the status row icon in the left margin will change from a stop to a filter icon.
Press Esc to stop an ongoing filter operation. In this case the status row icon will remain as a 'stop' icon to indicate that not all events were processed.
Press DEL or right-click on the table and select Clear Filters from the context menu to clear the header row and remove the filtering. All trace events will be now shown in the table. Note that the currently selected event will remain selected even after the filter is removed.
You can also search on the subset of filtered events by toggling the header row to the Search Bar while a filter is applied. Searching and filtering conditions are independent of each other.
Bookmarking an Event
Any event of interest can be tagged with a bookmark.
To add a bookmark, double-click the left margin next to an event, or right-click the margin and select Add bookmark. Alternatively, use the menu. Edit the bookmark description as desired and click OK.
The bookmark will be displayed in the left margin, and hovering the mouse over the bookmark icon will display the description in a tooltip.
To remove a bookmark, double-click its icon, select Remove Bookmark from the left margin context menu, or select Delete from the Bookmarks view.
Histogram View
The Histogram View displays the trace events (counters data) distribution with respect to time. When performance analysis is running, this view is dynamically updated as the events are received.
The controls on the view are described below.
- Selection Start: Displays the start time of the current selection.
- Selection End: Displays the end time of the current selection.
- Window Span: Displays the current zoom window size in seconds.
The controls can be used to modify their respective value. After validation, the other controls and views will be synchronized and updated accordingly. To modify both selection times simultaneously, press the link button which disables the Selection End control input.
The large (full) histogram, at the bottom, shows the event distribution over the trace. It also has a smaller semi-transparent orange window, with a cross-hair, that shows the current zoom window.
The smaller (zoom) histogram, on top right, corresponds to the current zoom window, a sub-range of the event set.
The x-axis of each histogram corresponds to the event timestamps. The start time and end time of the histogram range is displayed. The y-axis shows the maximum number of events in the corresponding histogram bars.
The vertical blue line(s) show the current selection time (or range). If applicable, the region in the selection range will be shaded.
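The relationship between event timestamps and histogram bars can be sketched as a simple binning step. This is a minimal illustration that assumes equal-width bars over the displayed time range; the function name and parameters are hypothetical, and the real view may choose its bar boundaries differently.

```python
def histogram(timestamps, start: float, end: float, num_bars: int) -> list:
    """Count events per equal-width bar over the range [start, end]."""
    counts = [0] * num_bars
    width = (end - start) / num_bars
    for t in timestamps:
        if start <= t <= end:
            # An event exactly at `end` is placed in the last bar.
            index = min(int((t - start) / width), num_bars - 1)
            counts[index] += 1
    return counts
```

The y-axis value described above would then be `max(counts)`, the largest number of events in any single bar.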
The mouse actions that can be used to control the histogram are listed below.
- Left-click: Sets a selection time.
- Left-drag: Sets a selection range.
- Shift+left-click or drag: Extends or shrinks the selection range.
- Middle-click or Ctrl+left-click: Centers the zoom window.
- Middle-drag or Ctrl+left-drag: Moves the zoom window.
- Right-drag: Sets the zoom window.
- Shift+right-click or drag: Extends or shrinks the zoom window.
- Mouse wheel up: Zooms in.
- Mouse wheel down: Zooms out.
Hovering the mouse over a histogram bar pops up an information window that displays the start/end time of the corresponding bar, as well as the number of events it represents. If the mouse is over the selection range, the selection span in seconds is displayed.
The actions performed by various keystrokes when they are used in the Histogram view are listed below.
- Left Arrow: Moves the current event to the previous non-empty bar.
- Right Arrow: Moves the current event to the next non-empty bar.
- Home: Sets the current time to the first non-empty bar.
- End: Sets the current time to the last non-empty histogram bar.
- Plus (+): Zooms in.
- Minus (-): Zooms out.
Colors View
The Colors view allows you to define a prioritized list of color settings.
A color setting associates a foreground and background color (used in any events table), and a tick color (used in the Time Chart view), with an event filter.
In an events table, any event row that matches the event filter of a color setting will be displayed with the specified foreground and background colors. If the event matches multiple filters, the color setting with the highest priority will be used.
The same principle applies to the event tick colors in the Time Chart view. If a tick represents many events, the tick color of the highest priority matching event will be used.
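The priority rule can be sketched as a first-match lookup over an ordered list. The `ColorSetting` type and its fields are hypothetical stand-ins for the settings described above, and the list is assumed to be ordered from highest to lowest priority.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class ColorSetting:
    """Hypothetical color setting: three colors plus an event filter."""
    foreground: str
    background: str
    tick: str
    event_filter: Callable[[dict], bool]


def pick_setting(
    event: dict, settings: Sequence[ColorSetting]
) -> Optional[ColorSetting]:
    """Return the highest-priority setting whose filter matches the event."""
    for setting in settings:  # ordered from highest to lowest priority
        if setting.event_filter(event):
            return setting
    return None
```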
Color settings can be inserted, deleted, reordered, imported and exported using the buttons in the Colors view toolbar. Changes to the color settings are applied immediately, and are persisted to disk.
Filters View
The Filters view allows you to define preset filters that can be applied to any events table.
The filters can be more complex than what can be achieved with the filter header row in the events table. The filter is defined in a tree node structure, where the node types can be any of TRACETYPE, AND, OR, CONTAINS, EQUALS, MATCHES, or COMPARE. Some node types have restrictions on their possible children in the tree.
The TRACETYPE node filters against the trace type of the trace as defined in a plug-in extension or in a custom parser. When used, any child node will have its aspect combo box restricted to the possible aspects of that trace type.
The AND node applies the logical and condition on all of its children. All child conditions must be true for the filter to match. A not operator can be applied to invert the condition.
The OR node applies the logical or condition on all of its children. At least one child condition must be true for the filter to match. A not operator can be applied to invert the condition.
The CONTAINS node matches when the specified event aspect value contains the specified value string. A not operator can be applied to invert the condition. The condition can be case sensitive or insensitive.
The EQUALS node matches when the specified event aspect value equals exactly the specified value string. A not operator can be applied to invert the condition. The condition can be case sensitive or insensitive.
The MATCHES node matches when the specified event aspect value matches against the specified regular expression. A not operator can be applied to invert the condition.
The COMPARE node matches when the specified event aspect value compared with the specified value gives the specified result. The result can be set to smaller than, equal or greater than. The type of comparison can be numerical, alphanumerical or based on time stamp. A not operator can be applied to invert the condition.
For numerical comparisons, strings prefixed by "0x", "0X" or "#" are treated as hexadecimal numbers and strings prefixed by "0" are treated as octal numbers.
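The prefix rules for numerical comparisons can be sketched as a small parser. The function name `parse_number` is hypothetical, sign handling is left out, and treating a lone "0" as decimal zero is an assumption.

```python
def parse_number(text: str) -> int:
    """Parse a comparison value using the prefix rules described above."""
    s = text.strip()
    if s[:2] in ("0x", "0X"):
        return int(s[2:], 16)      # hexadecimal, "0x" or "0X" prefix
    if s.startswith("#"):
        return int(s[1:], 16)      # hexadecimal, "#" prefix
    if len(s) > 1 and s.startswith("0"):
        return int(s[1:], 8)       # octal, leading "0"
    return int(s, 10)              # plain decimal
```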
For time stamp comparisons, strings are treated as seconds with or without fraction of seconds. This corresponds to the TTT format in the Time Format preferences. The value for a selected event can be found in the Properties view under the Timestamp property. The common 'Timestamp' aspect can always be used for time stamp comparisons regardless of its time format.
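The node semantics described in this section can be summarized with a small evaluator. This is a sketch under stated assumptions, not the tool's data model: the `Node` type and its fields are hypothetical, events are plain dictionaries keyed by aspect name, and the TRACETYPE and COMPARE nodes are left out to keep the example short.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical filter node; kind is AND, OR, CONTAINS, EQUALS or MATCHES."""
    kind: str
    aspect: str = ""
    value: str = ""
    negate: bool = False          # the 'not' operator
    ignore_case: bool = False
    children: List["Node"] = field(default_factory=list)


def evaluate(node: Node, event: dict) -> bool:
    """Evaluate a filter tree against one event."""
    if node.kind == "AND":
        result = all(evaluate(child, event) for child in node.children)
    elif node.kind == "OR":
        result = any(evaluate(child, event) for child in node.children)
    else:
        actual = str(event.get(node.aspect, ""))
        expected = node.value
        if node.kind == "MATCHES":
            result = re.search(expected, actual) is not None
        else:
            if node.ignore_case:
                actual, expected = actual.lower(), expected.lower()
            if node.kind == "CONTAINS":
                result = expected in actual
            elif node.kind == "EQUALS":
                result = actual == expected
            else:
                raise ValueError(f"unsupported node kind: {node.kind}")
    return not result if node.negate else result
```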
Filters can be added, deleted, imported and exported using the buttons in the Filters view toolbar. The nodes in the view can be Cut (Ctrl-X), Copied (Ctrl-C) and Pasted (Ctrl-V) by using the buttons in the toolbar or by using the key bindings. This makes it easier to quickly build new filters from existing ones. Changes to the preset filters are only applied and persisted to disk when the Save filters button is pressed.
Time Chart View
The Time Chart view allows you to visualize every open trace in a common time chart. Each trace is displayed in its own row, and ticks are displayed for every punctual event. As you zoom using the mouse wheel, or by right-clicking and dragging in the time scale, more detailed event data is computed from the traces.
Time synchronization is enabled between the time chart view and other trace viewers such as the events table.
Color settings defined in the Colors view can be used to change the tick color of events displayed in the Time Chart view.
When a search is applied in the events table, the ticks corresponding to matching events in the Time Chart view are decorated with a marker below the tick.
When a bookmark is applied in the events table, the tick corresponding to the bookmarked event in the Time Chart view is decorated with a bookmark above the tick.
When a filter is applied in the events table, the non-matching ticks are removed from the Time Chart view.
The Time Chart view only supports traces that are opened in an editor. The use of an editor is specified in the plug-in extension for that trace type, or is enabled by default for custom traces.
Analysis Views
For each of the different types of trace (PS, APM, and so on) collected, there is a set of views to help in analyzing it. There are two types of views: tabular and graphical.
You can view the analysis of trace data both in live mode, when the data collection is running, and in offline mode. In live mode, tabular view displays analysis for the entire trace duration, whereas the graphical view displays analysis for the last 20 seconds. In the offline mode, graphical view displays the zoomed region whereas the tabular view displays the selection region or zoomed region depending on whichever is the last user action. In live mode, to pause the views and view the past data, use the button present in the analysis views. When the views are paused, the Histogram view can be used to zoom and analyze any portion of the data.
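The difference between the two live-mode views can be sketched as a sliding window. This is an illustration only: samples are modeled as (timestamp in seconds, value) pairs, and the function names are hypothetical.

```python
LIVE_WINDOW_S = 20.0  # the graphical view shows the last 20 seconds in live mode


def graph_window(samples, now: float) -> list:
    """Samples visible in the live graphical view at time `now`."""
    return [(t, v) for t, v in samples if now - LIVE_WINDOW_S <= t <= now]


def table_window(samples, now: float) -> list:
    """Samples covered by the live tabular view: the entire trace so far."""
    return [(t, v) for t, v in samples if t <= now]
```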
These analysis views display the data only when the corresponding trace file is opened in the Events editor; otherwise they will be empty.
PS Performance Graphs
All the PS (Arm) metrics will be displayed using these graphs.
PS Performance Counters
Tabular representation of the PS (Arm) metrics.
APM Performance Graphs
APM metrics are displayed using the graphs.
APM Performance Counters
APM metrics displayed in tabular format.
FreeRTOS Analysis
FreeRTOS event trace displayed in different states.
Performance Session Manager
The Performance Session Manager view provides you with the capability to control the sessions. You can start and stop a performance session from this view. Each time a session is started, a set of trace files is created based on your configuration.
Whenever an application is debugged or performance analysis is launched, the view automatically populates the entry for the active configuration.
Configure Session
You can configure a session by choosing the list of modules for which the data has to be collected. Each of the modules will be enabled based on the design information.
If you wish to configure the modules prior to starting performance analysis, use the Configure Performance Analysis option on the hardware Project.
Configure APM
You can choose which APM slots to monitor by selecting the Configure APM option on the Configure Session page.
Configure MicroBlaze
You can choose the MicroBlaze instances for performance analysis by using the Configure MicroBlaze option on the Configure Session page. By default, only instances from the first MDM module will be selected.
Offline Mode
Viewing the live performance analysis is supported only for a duration of 10 minutes, after which it stops automatically. When Offline Mode is selected, the performance analysis runs indefinitely until you stop it manually from the view.
Modify ATG Configuration
You can modify the ATG traffic configuration using the Modify ATG Configuration option available in the Performance Session Manager.
System Performance Modeling
System Performance Modeling (SPM) offers system-level performance analysis for characterizing and evaluating the performance of hardware and software systems. In particular, it enables analysis of the critical partitioning trade-offs between the Arm® Cortex-A9 processors and the programmable fabric for a variety of different traffic scenarios. It provides graphical visualizations of AXI transaction traces and system-level performance metrics such as throughput, latency, utilization, and congestion.
SPM can be used in two ways:
- Using a predefined design provided with the Vitis software platform
- With the user design
In the current release, SPM is supported only for baremetal/standalone applications.
The following diagram shows the system performance modeling flow.
Predefined Design Flow
The predefined flow provided with the Vitis software platform uses a fixed design and comes with a fixed bitstream. In this design, there are five AXI Traffic Generators (ATGs), with one connected to each of the four High Performance ports (HP0-3) and one connected to the Accelerator Coherency Port (ACP). The ATGs are set up and controlled using one of the General Purpose (GP) ports. In addition, an AXI Performance Monitor (APM) is included in order to monitor the AXI traffic on the HP0-3 and ACP ports.
System Performance Modeling Using the Predefined Design
Creating the System Performance Modeling Project
- Select to start the System Performance Modeling application.
- Click Finish.
- The SPM Launcher opens.
- To start the SPM with the default traffic configuration, click Debug.
- It first programs the FPGA and then starts the SPM.
Selecting an ATG Traffic Configuration
To select a traffic configuration:
- In the Project Explorer, right-click the hardware platform and select .
- Under Performance Analysis, select Performance Analysis on <filename>.elf.
- You can use the ATG Configuration view to define multiple traffic configurations and select the traffic to be used for the current run. The following figure shows the traffic that is defined in the Default configuration.
- The port location is taken from the Hardware handoff file. If no ATG was configured in the design, the ATG Configuration view is empty.
- You can use the ATG Configuration page to add and edit configurations.
- To add a configuration to the list of configurations, click the + button.
- To edit a configuration, select the Configuration: drop-down list to choose the configuration that you want to edit.
- For ease of defining an ATG configuration, you can create Configuration Templates. These templates are saved for the user workspace and can be used across the Projects for ATG traffic definitions. To create a template, do the following:
- Click Configure Templates.
- Click the + button to add a new user-defined configuration template.
- The newly created template is assigned a Template ID with the pattern of "UserDef_*" by default. You can change the ID and also define the rest of the fields.
- You can use these defined templates to define an ATG configuration. To delete a Configuration Template, select it and click the X button.
TIP: In an ATG configuration, to set a port so that it does not have any traffic, set the Template ID for that port to None.
Configure FSBL Parameters
Changing the first stage bootloader (FSBL) configuration is only available for the fixed design flow of the System Performance Modeling application.
To invoke the FSBL Configuration Change page, right-click the configuration name and select Configure FSBL Parameters.
Below are the details about the first stage bootloader (FSBL) parameters.
Parameter | Description | Default Value |
---|---|---|
PS Clock Frequency (MHz) | The clock frequency of the Zynq-7000 SoC PS (specified in MHz). | 666.7 MHz |
PL Clock Frequency (MHz) | The clock frequency of the Zynq-7000 SoC PL (specified in MHz). | 100.0 MHz |
DDR Clock Frequency (MHz) | The clock frequency of the DDR memory (specified in MHz). | 533.3 MHz |
DDR Data Path Width | The bit width used in the DDR memory data path. Possible values are 16 and 32 bits. | 32 bits |
DDR Port 0 - Enable HPR | This enables the usage of high priority reads on DDR port 0. This port is used by the CPUs and the ACP via the L2 Cache. | Unchecked |
DDR Port 1 - Enable HPR | This enables the usage of high priority reads on DDR port 1. This port is used by other masters via the central interconnect. | Unchecked |
DDR Port 2 - Enable HPR | This enables the usage of high priority reads on DDR port 2. This port is used by HP2 and HP3. | Unchecked |
DDR Port 3 - Enable HPR | This enables the usage of high priority reads on DDR port 3. This port is used by HP0 and HP1. | Unchecked |
HPR/LPR Queue Partitioning | Indicates the desired partitioning for high and low priority reads in the queue. Note that the queue has a depth of 32 read requests. There are four values provided in a drop-down menu. | HPR(0)/LPR(32) |
LPR to Critical Priority Level | The number of clocks that the LPR queue can be starved before it goes critical. Unit: 32 DDR clock cycles. This value sets the DDR LPR_reg register [1]. Valid values are between 0 and 2047. | 2 |
HPR to Critical Priority Level | The number of clocks that the HPR queue can be starved before it goes critical. Unit: 32 DDR clock cycles. This value sets the DDR HPR_reg register [1]. Valid values are between 0 and 2047. | 15 |
Write to Critical Priority Level | The number of clocks that the write queue can be starved before it goes critical. Unit: 32 DDR clock cycles. This value sets the DDR WR_reg register [1]. Valid values are between 0 and 2047. | 2 |
For more information about the FSBL, refer to Zynq-7000 SoC Software Developers Guide (UG821).
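The critical priority levels in the table are expressed in units of 32 DDR clock cycles, so converting them to time is simple arithmetic. The sketch below assumes the DDR clock cycles in question run at the DDR Clock Frequency parameter (533.3 MHz by default). The function names are hypothetical.

```python
def starvation_clocks(level: int) -> int:
    """Convert a critical priority level to DDR clock cycles."""
    if not 0 <= level <= 2047:
        raise ValueError("level must be between 0 and 2047")
    return level * 32  # one unit is 32 DDR clock cycles


def starvation_time_ns(level: int, ddr_clock_mhz: float = 533.3) -> float:
    """Approximate starvation time in nanoseconds at the given DDR clock."""
    # cycles / MHz gives microseconds; multiply by 1000 for nanoseconds.
    return starvation_clocks(level) / ddr_clock_mhz * 1e3
```

With the default values, an LPR or write level of 2 corresponds to 64 DDR clock cycles (about 120 ns), and an HPR level of 15 corresponds to 480 DDR clock cycles (about 900 ns).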
User-Defined Flow
Performance analysis can be done on any user-defined application.
System Performance Modeling Using a User-Defined Flow
The Vitis software platform provides the capability to monitor a running target regardless of the target operating system.
- If your design is defined in the Vivado Design Suite, it is recommended to create a platform specification based on the design. To do performance analysis based on the specification:
- Build and export your bitstream using the Vivado Design Suite.
- In the Vitis™ software platform, import the generated file <your design>.xsa.
- Right-click on the platform project and select .
- Select SPM Analysis and click the New button to create a performance analysis configuration.
- Select Standalone Application Debug from the Debug Type dropdown list.
- Select the imported hardware platform specification from the Hardware platform dropdown list.
- Select the Reset entire system and Program FPGA check boxes.
- Click Run to launch the Performance Analysis perspective.
- If, for any reason, you cannot create a hardware platform specification, or do not have one, you can still do performance analysis in the Vitis software platform. To do performance analysis in the absence of the specification:
- Select . Click OK to save the details and close the Configure APM page.
- Select Single Application Debug and click the New button to create a new configuration.
- Select Attach to running target from the Debug Type dropdown list.
- Click Performance Analysis Configuration to edit the PS and APM settings.
- Click ATG configuration to see the ATG configuration.
- Click Apply and run to see the data.
Limitations
- The Vitis software platform supports SPM only for baremetal/standalone applications.
- Versal devices do not support SPM.