Use ZCU102 TRD to Accelerate Development of ZYNQ UltraScale+ MPSoC

Similar documents
Multimedia SoC System Solutions

Designing a Multi-Processor based system with FPGAs

SDSoC: Session 1

借助 SDSoC 快速開發複雜的嵌入式應用

Dramatically Accelerate 96Board Software via an FPGA with Integrated Processors

A So%ware Developer's Journey into a Deeply Heterogeneous World. Tomas Evensen, CTO Embedded So%ware, Xilinx

Copyright 2014 Xilinx

XMC-ZU1. XMC Module Xilinx Zynq UltraScale+ MPSoC. Overview. Key Features. Typical Applications

Introduction to Embedded System Design using Zynq

Zynq-7000 All Programmable SoC Product Overview

Copyright 2016 Xilinx

Optimizing HW/SW Partition of a Complex Embedded Systems. Simon George November 2015.

XMC-RFSOC-A. XMC Module Xilinx Zynq UltraScale+ RFSOC. Overview. Key Features. Typical Applications. Advanced Information Subject To Change

HKG : OpenAMP Introduction. Wendy Liang

Software Design Challenges for heterogenic SOC's

Partitioning of computationally intensive tasks between FPGA and CPUs

XMC-SDR-A. XMC Zynq MPSoC + Dual ADRV9009 module. Preliminary Information Subject To Change. Overview. Key Features. Typical Applications

SoC Systeme ultra-schnell entwickeln mit Vivado und Visual System Integrator

Heterogeneous Software Architecture with OpenAMP

VPX3-ZU1. 3U OpenVPX Module Xilinx Zynq UltraScale+ MPSoC with FMC HPC Site. Overview. Key Features. Typical Applications

Profiling and Debugging OpenCL Applications with ARM Development Tools. October 2014

i.mx 7 - Hetereogenous Multiprocessing Architecture

Isolation Methods in Zynq UltraScale+ MPSoCs

Copyright 2017 Xilinx.

Designing Multi-Channel, Real-Time Video Processors with Zynq All Programmable SoC Hyuk Kim Embedded Specialist Jun, 2014

Zynq Architecture, PS (ARM) and PL

MYC-C7Z010/20 CPU Module

VPX3-ZU1. 3U OpenVPX Module Xilinx Zynq UltraScale+ MPSoC with FMC HPC Site. Overview. Key Features. Typical Applications

Real-Timeness and System Integrity on a Asymmetric Multi Processing configuration

Zynq Ultrascale Mpsoc For The System Architect Logtel

SoC Platforms and CPU Cores

S2C K7 Prodigy Logic Module Series

Software Driven Verification at SoC Level. Perspec System Verifier Overview

ARM64 + FPGA and more: Linux on the Xilinx ZynqMP

Designing with NXP i.mx8m SoC

Avnet Zynq Mini Module Plus Embedded Design

Performance Optimization for an ARM Cortex-A53 System Using Software Workloads and Cycle Accurate Models. Jason Andrews

MYD-C7Z010/20 Development Board

Optimizing ARM SoC s with Carbon Performance Analysis Kits. ARM Technical Symposia, Fall 2014 Andy Ladd

Designing, developing, debugging ARM Cortex-A and Cortex-M heterogeneous multi-processor systems

Hardware Software Bring-Up Solutions for ARM v7/v8-based Designs. August 2015

Building High Performance, Power Efficient Cortex and Mali systems with ARM CoreLink. Robert Kaye

Design Choices for FPGA-based SoCs When Adding a SATA Storage }

Introduction to SoC+FPGA

Parallella Linux - quickstart guide. Antmicro Ltd

SoC Systeme ultra-schnell entwickeln mit Vivado und Visual System Integrator

UltraZed -EV Starter Kit Getting Started Version 1.3

Getting Started with FreeRTOS BSP for i.mx 7Dual

STM32MP1 Microprocessor Continuing the STM32 Success Story. Press Presentation

1-1 SDK with Zynq EPP

A176 Cyclone. GPGPU Fanless Small FF RediBuilt Supercomputer. IT and Instrumentation for industry. Aitech I/O

Zynq-7000 All Programmable SoC: Embedded Design Tutorial. A Hands-On Guide to Effective Embedded System Design

Embedded HW/SW Co-Development

THE LEADER IN VISUAL COMPUTING

SoC FPGAs. Your User-Customizable System on Chip Altera Corporation Public

ARM CORTEX-R52. Target Audience: Engineers and technicians who develop SoCs and systems based on the ARM Cortex-R52 architecture.

Using VxWorks BSP with Zynq-7000 AP SoC Authors: Uwe Gertheinrich, Simon George, Kester Aernoudt

Zynq Ultrascale+ Architecture

LinkSprite Technologies,.Inc. pcduino V2

Designing with ALTERA SoC Hardware

Introducing the Spartan-6 & Virtex-6 FPGA Embedded Kits

Making Full use of Emerging ARM-based Heterogeneous Multicore SoCs

Cortex-A9 MPCore Software Development

This guide is used as an entry point into the Petalinux tool. This demo shows the following:

ARMv8-A Software Development

Freescale i.mx6 Architecture

pcduino V3B XC4350 User Manual

Introduction to Sitara AM437x Processors

F28HS Hardware-Software Interface: Systems Programming

SOM PRODUCTS BRIEF. S y s t e m o n M o d u l e. Engicam. SOMProducts ver

Cortex-A15 MPCore Software Development

Simplify System Complexity

Elaborazione dati real-time su architetture embedded many-core e FPGA

Designing Security & Trust into Connected Devices

10/02/2015 PetaLinux Image with Custom Application

All Programmable: from Silicon to System

Optimizing Cache Coherent Subsystem Architecture for Heterogeneous Multicore SoCs

Cost-Optimized Backgrounder

Protecting Embedded Systems from Zero-Day Attacks

Intel SoC FPGA Embedded Development Suite (SoC EDS) Release Notes

Product Technical Brief S3C2440X Series Rev 2.0, Oct. 2003

10/02/2015 PetaLinux Linux Image Network Connection

Building blocks for 64-bit Systems Development of System IP in ARM

Simplify System Complexity

FPQ6 - MPC8313E implementation

Designing Security & Trust into Connected Devices

KeyStone II. CorePac Overview

Designing Security & Trust into Connected Devices

Zynq AP SoC Family

FA3 - i.mx51 Implementation + LTIB

64 bit Bare Metal Programming on RPI-3. Tristan Gingold

. SMARC 2.0 Compliant

«Real Time Embedded systems» Cyclone V SOC - FPGA

Integrating LogiCORE SEM IP in Zynq UltraScale+ Devices

Getting the Most out of Advanced ARM IP. ARM Technology Symposia November 2013

High-Performance, Highly Secure Networking for Industrial and IoT Applications

OpenAMP Discussion - Linaro2018HK. Copyright 2018 Xilinx

FPGA Entering the Era of the All Programmable SoC

Multicore platform towards automotive safety challenges

Introducing the AM57x Sitara Processors from Texas Instruments

Transcription:

Use ZCU102 TRD to Accelerate Development of ZYNQ UltraScale+ MPSoC

Topics Hardware advantages of ZYNQ UltraScale+ MPSoC Software stacks of MPSoC Target reference design introduction Details about one Design Modules (DM) in ZCU102 TRD

Hardware Advantages of MPSoC

Zynq UltraScale+ MPSoC Real-Time Processors 32-bit Dual-Core Memory Subsystem High Bandwidth Low Latency Graphics Processor ARM Mali-400MP2 Application Processor 64-bit Dual/Quad-Core High Speed Peripherals Key Interfaces Fabric Acceleration Customizable Engines High Speed Connectivity Video Codec 8K4K (15fps) 4K2K (60fps) Platform & Power Management Granular Power Control Functional Safety Configuration & Security Unit Anti-Tamper & Trust Industry Standards Page 4

Zynq UltraScale+ MPSoC Application Processor 64-bit Dual/Quad-Core 64-bit increases compute capability with 32-bit compatibility 27X performance/watt (DMIPS) vs predecessor SIMD engine accelerates multimedia, signal & image processing Real-Time Processors 32-bit Dual-Core Deterministic processing for critical real-time operation Split-Mode, and Lock-Step Mode for fault tolerance & detection 256KB TCM for deterministic and low-latency response Memory Subsystem High Bandwidth Low Latency 32GB of Addressable Memory 2400Mbps for DDR4 & LPDDR4 6 AXI Ports for high memory bandwidth 256KB low latency OCM w/ecc, no need for external memory Page 5

Zynq UltraScale+ MPSoC Massive Interconnect High Bandwidth ACE bi-directional port for coherent memory access between a coherent master & A53 (CCI) HPC ports for coherent memory access between a DMA and A53 Twelve 128-bit AXI ports, 6,000 interconnects between PS & PL Video Codec 8K4K (15fps) 4K2K (60fps) H264 and H265 standards Up to a 4K x 2K@60/8K x 4K@15 Hz rate,8-bit and 10-bit color depth, YCbCr 4:2:2 and 4:2:0 video formats Low-latency mode Graphics Processor ARM Mali-400MP2 Most power-optimized ARM GPU with Full HD support (1080p) Ideal for 2D vector graphics and 3D graphics Supports open standards, eg, OpenGL ES 11 & 20 Page 6

Zynq UltraScale+ MPSoC High Speed Peripherals Key Interfaces 6G Transceivers supports PCIe, DisplayPort, SGMII, SATA, USB 30 DisplayPort up to 4K x 2K @ 30fps, with alpha blending Gigabit Ethernet, SD/SDIO, Quad-SPI, SPI, NAND, CAN, UART, I2C, USB 20 Configuration & Security Unit Anti-Tamper & Trust Industry Standards Boot from Quad SPI Flash, NAND Flash, SD 30, or emmc Fault tolerant device boot: secure and non-secure Dedicated decryption (AES-256) & authentication (4096-bit RSA key, SHA3 hash functions) engines Platform & Power Management Granular Power Control Functional Safety Granular architecture enabling block-level power management Eliminate static power of unused blocks Enables extensible runtime power management (RAM) Page 7

More than Just Silicon Page 8

Zynq MPSoC Toolflow Hardware Software Configure Processing System Hardware Handoff File (HDF) bitstream HSM Add IP Add custom RTL* HSM imageub u-bootelf Create BSP Implementation Integrate OS Simulation* Extract Platform Add PMU Firmware* Generate Bitstream Configure Kernel/System Add XEN/ATF* Configure the Kernel Generate FSBL Build Kernel/File System Write Applications Generate Linux System Debug, Profiling, Performance Modeling* bootbin Create Boot Image * optional step Page 9

Embedded Software Development Tools Feature Eclipse-Based IDE Linaro GCC Tool Chain Multi-Core Debug Performance Profiling & Analysis Ecosystem Development Tools Benefit Familiar SW development environment Xilinx Software Design Kit (SDK) Industry standard compiler tool chain for Embedded Linux & Bare Metal Debug & cross triggering for Cortex-A53s, Cortex-R5s, and MicroBlaze Processor Analyze interfaces across processing and programmable logic domains Broad support for 3 rd party dev tools & debug, eg, ARM DS-5, Lauterbach Trace-32 Designers use their preferred development & debug environment Xilinx Software Development Kit (SDK) for SW Dev and Project, Build, & Tool Chain Management Page 10

PetaLinux PetaLinux is a build tool that allows end users to quickly bring up embedded Linux systems Why PetaLinux? Simplifies the Linux configuration and build system for Xilinx SoC FPGA Automatically configure Linux kernel, U-Boot, root file system, and application(s) to target a particular Vivado project Four commands to boot up embedded Linux for Xilinx SoC FPGA Page 11

Typical use case: APU SMP & AMP Non-secure State Secure State Non-secure State Secure State App1 App2 App1 App2 App1 App2 App1 App2 SMP Linux SMP Linux FreeRTOS XEN Hypervisor ARM Trusted Firmware (ATF) ARM Trusted Firmware (ATF) SMP Linux on Cortex-A53 No Hypervisor needed Linux Application and interrupt handler can be bound to a specific core AMP on Cortex-A53 Hypervisor is needed Easy legacy system migration Page 12

Typical use case: RPU App1 App2 App1 App2 App1 App2 FreeRTOS Baremetal FreeRTOS Split mode (default) R5-0: FreeRTOS R5-1: BareMetal or other RTOS Lockstep mode R5-0/R5-1: FreeRTOS Continually comparing outputs running the same software Page 13

OpenAMP Framework OpenAMP provides software components to enable development of software applications for APU + RPU systems Remoteproc : controls the Life Cycle Management (LCM) of the remote processors from the master processor RPMsg API : allows Inter Process Communications (IPC) between software running on independent cores in an AMP system Application remoteproc rpmsg Linux Kernel Linux Code/Data Bare Metal Code/Data Ring Buffers Buffer Memory Application XilOpenAMP Library rpmsg remoteproc Bare Metal or RTOS DRAM Page 14

System Software Out-of-the-Box Firmware, Drivers, Frameworks Feature Details First Stage Boot Loader (FSBL) Power Management Framework ARM Trusted Firmware U-Boot Linux Kernel Inter-Processor Framework (OpenAMP) Generated from the Hardware Handoff(HDF) file Standard APIs for power management OpenSource firmware to boot secure OS, leverage ARMv8-A virtualization features Out-of-the-box boot loaders Standard Linux Kernel from mainline Framework for inter-os & inter-processor management & communication (APU & RPU) Processing System ARM Trusted Firmware APU Memory GPU Peripherals Inter-Processor Framework Security Firmware RPU CSU PMU System Control Page 15

SDSoC accelerate SW to HW func1();<-sw func2();<-sw C/C++ func3();<-sw func1();<-sw func2();<-hw C/C++ func3();<-hw func1();<-sw #pragma SDS func2();<-hw C/C++ func3();<-hw SW 1 2 3 1 1 HW ACP, DMA 2 3 HP, SGDMA 2 3 Convert software algorithm to hardware design Automatically generate data mover network Explore different architecture to find the optimal Page 16

Targeted Reference Design

ZCU102 Targeted Reference Design What is TRD Design enables customers to evaluate the Zynq UltraScale+ MPSoC Provides demonstration of Zynq UltraScale+ MPSoC features Provides a starting platform upon which users may implement their own designs Design provides a ready to run demonstration enabling a positive out-of-box experience Design is delivered as a standard TRD with associated documentation Design Details: UG1221: Zynq UltraScale+ MPSoC Base Targeted Reference Design Steps in Wiki http://wwwwikixilinxcom/zynq+ultrascale+mpsoc+base+trd+20172 Design Demonstrates APU Running SMP Linux RPU-1 Running Bare Metal RPU-0 Running FreeRTOS Basic 4K video pipe controlled by the Processing System Multiple choices of video source and sink Reference Design Conception Divide a complex design into multiple design modules (DM) to help to understand each part Each DM can be verified separately Page 18

Hardware Interfaces & IP GPU Video Inputs TPG USB Webcam (optional) Vivid (Virtual video device) HDMI (20171) Video Outputs DisplayPort Tx HDMI (20171) Video Processing 2D Convolution Filter Optical Flow (20171) Auxiliary Peripherals SD I2C GPIO Ethernet UART USB 20 / 30 APM SATA (20171) Page 19

Software Components Operating systems APU: SMP Linux RPU-0: FreeRTOS RPU-1: Bare-metal Linux frameworks/libraries Video: Video4Linux (V4L2), Media Controller Display: DRM/KMS, X-Server (XOrg) Graphics: Qt5, OpenGL ES2 Vision: OpenCV Inter-process communication: OpenAMP User applications APU: Video control application with GUI RPU-0: Multi-threaded heartbeat application FreeRTOS RPU-1: Performance monitoring application Bare-metal Page 20

Block Diagram Page 21

Video Pipeline Block Diagram Page 22

Block Diagram and DM(1) DM1: APU SMP Linux Page 23

Block Diagram and DM(2) DM1: APU SMP Linux DM2: RPU0 FreeRTOS Application Page 24

Block Diagram and DM(3) DM1: APU SMP Linux DM2: RPU0 FreeRTOS Application DM3: RPU1 Bare-metal Application Page 25

Block Diagram and DM(4) DM1: APU SMP Linux DM2: RPU0 FreeRTOS Application DM3: RPU1 Bare-metal Application DM4: APU/RPU1 Inter Process Communication Page 26

Block Diagram and DM(5) Page 27 DM1: APU SMP Linux DM2: RPU0 FreeRTOS Application DM3: RPU1 Bare-metal Application DM4: APU/RPU1 Inter Process Communication DM5: APU Qt Application

Demo GUI

Block Diagram and DM(6) DM6: PL Video Capture Page 29

Block Diagram and DM(7) DM6: PL Video Capture DM7: OpenCV-based Image Processing Page 30

Block Diagram and DM(8) DM6: PL Video Capture DM7: OpenCV-based Image Processing DM8: PL-accelerated Image Processing Page 31

Block Diagram and DM(9) DM6: PL Video Capture DM7: OpenCV-based Image Processing DM8: PL-accelerated Image Processing DM9: Two Image Processing Functions Page 32

Hardware Platform and Connectivity Overview Page 33

Boot Flow Page 34

DM Highlights DM1 APU SMP Linux DM2 RPU0 FreeRTOS Application DM3 RPU1 Bare-metal Application DM4 APU/RPU1 Inter Process Communication DM5 APU Qt Application DM6 PL Video Capture DM7 OpenCV-based Image Processing DM8 PL-accelerated Image Processing DM9 Two Image Processing Functions DM10 Full-fledged Base TRD

DM4: APU/RPU Inter-processor Communication

OpenAMP Framework OpenAMP provides software components to enable development of software applications for APU + RPU systems Remoteproc : controls the Life Cycle Management (LCM) of the remote processors from the master processor RPMsg API : allows Inter Process Communications (IPC) between software running on independent cores in an AMP system Application remoteproc rpmsg Linux Kernel Linux Code/Data Bare Metal Code/Data Ring Buffers Buffer Memory Application XilOpenAMP Library rpmsg remoteproc Bare Metal or RTOS DRAM Page 37

AMP System Terminology A master is defined as the CPU that is booted first A remote is defined as a CPU managed by a master CPU The master CPU brings up and takes down the remote CPU The master communicates with the remote to offload work The master and remote CPUs may be homogeneous or heterogeneous The master/remote roles are typically static at build time messaging Master Remote management Page 38

Master Remoteproc APIs The remoteproc APIs provides four functions for the master CPU 1 Load the code and data of the remote CPU into memory 2 Start the remote CPU with reset and clock control 3 Manage a communication channel with the remote CPU 4 Shut down the remote CPU with reset and clock control Page 39

Remote Remoteproc APIs The remoteproc APIs also support the remote CPU It provides three functions for the remote CPU 1 Initialization of the remoteproc system on the remote CPU 2 Manage a communication channel with the master CPU 3 Shutdown of the remoteproc system on the remote CPU Page 40

Remote Processor (remoteproc) Component The remote firmware includes a statically linked resource table The resource table describes the required system resources The firmware is parsed by the master to get the resource table Examples of resources in the resource table include: memory regions (carve-outs) for code and data sections memory regions to describe interprocessor communication Page 41

Remote Processor Message (rpmsg) Component rpmsg is a messaging bus to allow communication between CPUs Each CPU is device on the messaging bus A channel is a communication link between CPUs on the bus A channel is created when the remote CPU is started A channel is identified by a name together with source and destination addresses Master application Remote application channel channel Page 42

Virtio (virtual I/O) Component The virtio component is used to implement rpmsg It provides virtual I/O services to support communication between the master and remote Application Linux Kernel remoteproc rpmsg A vring is a transport abstraction for I/O operations used by virtio A vring implements a ring buffer virtio vring Page 43

Bare Metal Remote Firmware Build Process The firmware build consists of multiple libraries which are combined with the application The resource table is built into the application The resource table is linked into a section the master knows how to find it in the application Each library is built based on the role (master or remote) resource table application libbaremetal _remotea Resource Table Memory carve-out for firmware code and data rpmsg virtio device, TX and RX vring info remote firmware ELF image libopen_ ampa Page 44

OpenAMP Documents OpenAMP Wiki http://wwwwikixilinxcom/openamp Github Repository https://githubcom/openamp/open-amp OpenAMP Get Start Guide UG1186: http://wwwxilinxcom/support/documentation/sw_manuals/xilinx2017_2/ug1186-zynqopenamp-gsgpdf ZCU102 TRD Document UG1221

Wrap up

Summary Heterogeneous Multi-Processing SoC Hardware Quad core ARM Cortex-A53 Dual core ARM Cortex-R5 Multiple acceleration engines and high speed peripherals Programmable Logic Complete Software Stacks SMP Linux AMP and Inter-Processor Communication SDSoC for accelerator design with C Targeted Reference Design Make good use of MPSoC architecture Good reference for start

Support Known issues and limitations http://wwwwikixilinxcom/zynq+ultrascale+mpsoc+base+trd+20172 Discussion Xilinx Community Forums Please include "ZCU102 Base TRD" and the release version in the topic

Explore More in xilinxcom ZCU102 TRD, example designs, documents and board files https://wwwxilinxcom/zcu102 revision https://wwwxilinxcom/revision http://wwwwikixilinxcom/revision+getting+started+guide+20172 Open source Linux documents and tips at http://wikixilinxcom MPSoC Ubuntu Desktop: http://wwwwikixilinxcom/+zynq+ultrascale%ef%bc%8b+mpsoc+ubuntu+desktop PetaLinux Yocto Tips: http://wwwwikixilinxcom/petalinux+yocto+tips Device Tree Tips: http://wwwwikixilinxcom/device+tree+tips Known Issue Answer Records https://wwwxilinxcom/supporthtml#knowledgebase PetaLinux 20172 - Product Update Release Notes and Known Issues https://wwwxilinxcom/support/answers/69372html

Follow Us on Social Media Chinese English xilinx_inc facebookcom/xilinxinc twittercom/#!/xilinxinc http://iyoukucom/xilinx youtubecom/xilinxinc http://weibocom/xilinxchina linkedincom/company/xilinx plusgooglecom/+xilinx http://forumsxilinxcom/cn xilinxcom/about/app-downloadhtml Page 50