Simplifying DSP Development with C6EZ Tools

Similar documents
ARM+DSP - a winning combination on Qseven

Lab 1. OMAP5912 Starter Kit (OSK5912)

Developing Core Software Technologies for TI s OMAP Platform

PAULA CARRILLO October Processor SDK & PRU-ICSS Industrial software

OMAPL138 Software Developers Guide

LINUX KERNEL UPDATES FOR AUTOMOTIVE: LESSONS LEARNED

Doing more with multicore! Utilizing the power-efficient, high-performance KeyStone multicore DSPs. November 2012

VICP Signal Processing Library. Further extending the performance and ease of use for VICP enabled devices

Profiling and Debugging OpenCL Applications with ARM Development Tools. October 2014

C6000 Compiler Roadmap

Porting BLIS to new architectures Early experiences

Integrating DMA capabilities into BLIS for on-chip data movement. Devangi Parikh Ilya Polkovnichenko Francisco Igual Peña Murtaza Ali

Operating System Services. User Services. System Operation Services. User Operating System Interface - CLI. A View of Operating System Services

CHAPTER 2: SYSTEM STRUCTURES. By I-Chen Lin Textbook: Operating System Concepts 9th Ed.

Implementing Secure Software Systems on ARMv8-M Microcontrollers

Intel C++ Compiler Professional Edition 11.1 for Mac OS* X. In-Depth

TOOLS FOR IMPROVING CROSS-PLATFORM SOFTWARE DEVELOPMENT

NDK OVERVIEW OF THE ANDROID NATIVE DEVELOPMENT KIT

Chapter 8: Memory- Management Strategies. Operating System Concepts 9 th Edition

SDSoC: Session 1

Introduction to gem5. Nizamudheen Ahmed Texas Instruments

Chapter 2: Operating-System Structures. Operating System Concepts 9 th Edit9on

C55x Digital Signal Processors Software Overview

C55x Digital Signal Processors Software Overview

CS533 Concepts of Operating Systems. Jonathan Walpole

Kinetis Software Optimization

MASTER THESIS. TITLE: Migration of a stereoscopic camera system from Matlab to the TMS320DM814x DaVinci platform

PRU Firmware Development. Building Blocks for PRU Development: Module 2

Adding Advanced Shader Features and Handling Fragmentation

Andrew S. Tanenbaum, Operating Systems, Design and Implementation, (Second Edition), Prentice Hall.

XPU A Programmable FPGA Accelerator for Diverse Workloads

Developing and Integrating FPGA Co-processors with the Tic6x Family of DSP Processors

Embedded Systems Programming

Virtual Machines and Dynamic Translation: Implementing ISAs in Software

Conclusions. Introduction. Objectives. Module Topics

A Technical Overview of expressdsp-compliant Algorithms for DSP Software Producers

Chapter 7. Hardware Implementation Tools

Generating Programs and Linking. Professor Rick Han Department of Computer Science University of Colorado at Boulder

Embedded Streaming Media with GStreamer and BeagleBoard. Presented by Todd Fischer todd.fischer (at) ridgerun.com

Operating Systems. 09. Memory Management Part 1. Paul Krzyzanowski. Rutgers University. Spring 2015

ARM Powered SoCs OpenEmbedded: a framework for toolcha. generation and rootfs management

: REAL TIME SYSTEMS LABORATORY DEVELOPMENT: EXPERIMENTS FOCUSING ON A DUAL CORE PROCESSOR

Evaluation of MIPS Prelinking

OpenCL TM & OpenMP Offload on Sitara TM AM57x Processors

Process. Program Vs. process. During execution, the process may be in one of the following states

Here to take you beyond. ECEP Course syllabus. Emertxe Information Technologies ECEP course syllabus

Linux Driver and Embedded Developer

COSC350 System Software

Introduction. CS 2210 Compiler Design Wonsun Ahn

Easy Multicore Programming using MAPS

DevKit8000 Evaluation Kit

Porting Linux to a New Architecture

Chapter 2: System Structures

Multiple file project management & Makefile

Porting Linux to a New Architecture

Four Components of a Computer System

Qt for Device Creation

Chapter 8: Main Memory

Kick Start your Embedded Development with Qt

TUNING CUDA APPLICATIONS FOR MAXWELL

Anand Raghunathan

Chapter 2. Operating-System Structures

Nios II Embedded Design Suite Release Notes

HSA Foundation! Advanced Topics on Heterogeneous System Architectures. Politecnico di Milano! Seminar Room (Bld 20)! 15 December, 2017!

Chapter 8: Memory-Management Strategies

Technology and Design Tools for Multicore Embedded Systems Software Development

Embedded SDR System Design Case Study: An Implementation Perspective

KYC - Know your compiler. Introduction to GCC

Cross Compiling. Real Time Operating Systems and Middleware. Luca Abeni

Distributed Systems 8. Remote Procedure Calls

General Purpose Signal Processors

SmartHeap for Multi-Core

Alexander Warg. Complex Lab Operating System 2007 Winter Term. Introduction

Software Driven Verification at SoC Level. Perspec System Verifier Overview

Chapter 2: Operating-System Structures

CHAPTER 8 - MEMORY MANAGEMENT STRATEGIES

Generic Programming in C

Speeding AM335x Programmable Realtime Unit (PRU) Application Development Through Improved Debug Tools

The DSP/BIOS Bridge. Víctor Manuel Jáquez Leal. 06 February Igalia

Embedded HW/SW Co-Development

DevKit8500D Evaluation Kit

DSP/BIOS LINK OMAP2530 EVM LNK 172 USR. Version 1.64 NOV 13, 2009

Platform-based Design

CS368-LI Digital Media Software Development Kit

Introduction to creating 3D UI with BeagleBoard. ESC-341 Presented by Diego Dompe

CHAPTER 8: MEMORY MANAGEMENT. By I-Chen Lin Textbook: Operating System Concepts 9th Ed.

HETEROGENEOUS SYSTEM ARCHITECTURE: PLATFORM FOR THE FUTURE

TUNING CUDA APPLICATIONS FOR MAXWELL

WS_CCESSH-OUT-v1.00.doc Page 1 of 8

Chapter 8: Main Memory. Operating System Concepts 9 th Edition

Getting Started with FreeRTOS BSP for i.mx 7Dual

Embedded Systems Programming

Multithreaded Coprocessor Interface for Dual-Core Multimedia SoC

Exploring Task Parallelism for Heterogeneous Systems Using Multicore Task Management API

Chapter 2: Operating-System Structures. Operating System Concepts 9 th Edition

SW Tools Solutions for DA8x. Linux by Degrees (updated) Part II

Introduction to CUDA Algoritmi e Calcolo Parallelo. Daniele Loiacono

Computer Systems Organization

Operating-System Structures

Embedded Linux Architecture

Transcription:

Simplifying DSP Development with C6EZ Tools

DSP Development made easier with C6EZ Tools Seamlessly ports ARM code to DSP (ARM Developers) Provides ARM access to ready-to-use DSP kernels (System Developers) Graphical Software Development Tool for quick prototyping (DSP Developers)

C6EzFlo For DSP developers

Target Audience for C6EZFlo Target Developers Signal Processing Experts People familiar with older generations of DSProcessors Challenges Not expert software engineers Unfamiliar with parallel VLIW architectures Unfamiliar in C-level optimization Unfamiliar with complex peripheral drivers Strategy: C6EZFlo Drag-and-drop environment For IO For compute Optimized underlying blocks Well-structured easy to modify C output Easy capability to create new blocks 4

C6EZFlo Overview C6EZFlo s graphical development tool creates a visual signal flow, making DSP prototyping easy, fast and cost-efficient Allows drag and drop functionality to connect i/o blocks to peripherals on the DSP without needing to know DSP code Generates optimized C code that is heavily commented and easily modifiable for further development Features broad and expandable block set to support a wide range of end applications

C6EZFlo V2.0 Window View System Block Diagram Block Palette Properties Pane Output Window 6

Drawing a System in C6EZFlo Blocks are self-contained algorithms Connect blocks to implement a system Configure system with individual block parameters Special framework block controls global parameters Integrated help pages for all blocks 7

Generating the Application Draw a block diagram using the graphical tool Click Generate Code button New source files are automatically added to the project 8

Understanding the Generated Source Supporting Files <app>.h <app>.tcf Library Files includes <app>_main.c <app>_threads.c <app>_blocks.c <block>_create <block>_init main creates <thread>_tsk <block>_proc <block>_ctrl Program entrypoint Processing loop(s) calls Optimized algorithm code

C6EZFlo Processing Blocks Algorithms Now 2Q11 4Q11 Foundation Signal Processing Software Algorithms Digital Signal Processing 12 20 40 Image Processing 7 10 20 Math 49 49 49 Filter Package 7 20 30 System Design Components Signal Generation 6 6 10 System Design 9 9 15 Board I/O Connectivity 18 18 18 Application Specific Software Algorithms ProAudio: Audio Processing Library 2 10 20 Power and Energy 15 20 Vision And Analytics 30 10 Total Supported Functions 110 157 252

C6EZFlo Getting Started Supported Devices Availability C674x, OMAP-L13x, DM643x, DM648, DM647, C6424, C6421, C6452 Downloadable at product page Integrated into CCS v5 (April 2011) Applicable Links C6EZTools Wiki www.ti.com/c6eztoolswiki C6EZFlo Wiki www.ti.com/c6ezflowiki TI Product Page www.ti.com/c6ezflo Technical Support Forums - http://e2e.ti.com/support/default.aspx Feedback/Feature Request Mailing list - C6EZFlo@list.ti.com

C6EzRun For ARM Developers

What can a ARM developer do with the DSP? Run math intensive algorithms Leverage some DSP features, like floating-point computations Free up ARM MIPS for additional system features Save money by not moving to a more powerful & expensive ARM! Get true real-time response via DSP without sacrificing features of a high-level OS like Linux and Android

Total Execution Time (s) For 100 FFT Iterations DSP Performance Complex FFT FFT Size FFT runs ~10x faster on DSP than on ARM. Small FFT size, overhead dominates, running on DSP does not provide advantage. Larger FFT size, overhead absorbed, running on DSP provides advantage. SoC: ARM9 + Floating-Point DSP CPU Frequency: 300MHz Code & Data Location: External DDR2 Memory Instruction and Data Cache: Enabled Single-precision floating-point data buffers.

C6EZRun: Easy DSP Performance for the ARM Developer Target Developers ARM Developers Challenges Unfamiliar with Heterogeneous programming environments Graphical IDEs IPC Strategy: CEZ6run GCC-style build flow Automatically puts Whole app Or Selected functions On DSP 15

C6EZRun for ARM Developers C6EZRun quickly ports ARM code to run on the DSP, making DSP development easy, fast and cost efficient Compile C code for DSP execution right from the Linux shell module1.c module2.c main.c critical.c GCC Tool Chain ARM Executable No modification to ARM application code required Provides a familiar development environment with a GCC-like interface, simplifying the user/development experience module1.c module2.c main.c critical.c ARM-only Development c6runlib tools GCC Tool Chain critical.lib ARM+DSP Development with c6runlib ARM Executable Embedded DSP Executable

C6EZRun Project Open Source Project, hosted on gforge.ti.com Intends to provide an abstracted mechanism for getting code running on the DSP Project Goals Introduce DSP Development to ARM/Linux developer Provide an abstracted mechanism for getting code running on the DSP Start getting Linux and open-source community using the DSP

Leverage C6EZRun in 3 ways Common Back-end Libraries Provide ARM and DSP software infrastructure to load and interact with the DSP C6RunLib Partition an application between the ARM and DSP C6RunApp Run an entire application on the DSP 18

C6RunLib Goal is to automate building an ARM-side library from user s C code of critical functions User can call into that library in their app, and calls will be executed on the DSP ARM ARM Application System On Chip DSP Extract Critical Functions as a library C6RunLib using Framework C6RunLib Critical Stub Library Fxns Consists of automating creation of RPC framework, hiding low-level inter-core communication and other TI specific bits as possible 19

module1.c module2.c main.c critical.c GCC Toolchain ARM Executable ARM-only Development module1.c module2.c main.c GCC Toolchain ARM Executable Embedded DSP Executable critical.c c6runlib tools critcal.lib ARM+DSP Development with c6runlib

C6RunLib Example $ c6runlib-cc -c -O2 o dummy.o dummy.c Above converts C code containing critical functions to C6000 object file Also analyzes global C functions and generates ARM-side remote procedure call stubs $ c6runlib-ar rcs dummy_dsp.lib dummy.o Add object file to library dummy_dsp.lib Underneath, the dummy.o object file is linked to a DSP executable and compiled into the framework Framework object file and stubs object file is archived into the lib ARM-side stubs resolve symbols for ARM application when built against the library 6/1/2011 21

C6RunApp Consists of two parts: Backend library builds and front end user interface Backend libraries collate DSPLink, CMEM, DSP/BIOS, and custom code Front end interface is a command-line cross compiler script (acts like GCC) System On Chip ARM DSP C6RunApp Framework Recompile using DSP C C6RunApp Loader and Application CIO Server Entire application retargets to DSP, but with C I/O access to ARM/Linux 22

module1.c module2.c main.c GCC Toolchain ARM Executable ARM-only Development module1.c module2.c main.c c6runapp tools ARM Executable Embedded DSP Executable ARM+DSP Development with c6runapp

Example C6RunApp Usage $ c6runapp-cc -o hello_world hello_world.c Compiles hello_world.c to C6000 object file, which is then linked into a DSP executable Executable is compiled into the ARM side framework, which is used to build an ARM-side executable called hello_world $ c6runapp-cc -c -o file1.o file1.c $ c6runapp-cc -c -o file2.o file2.c $ c6runapp-cc -o myapp file1.o file2.o Individually compiles file1.c and file2.c to object files Links object files together to create application called myapp, with DSP executable image embedded inside it. 6/1/2011 24

Notes on the C6EZRun tools Front end script wraps the TI C6000 Codegen tools (specifically cl6x) Usage syntax is primarily a GCC clone Supports many GCC options and translates them to appropriate cl6x options Many GCC optimization/warning options are silently ignored They may have no direct analogue in the TI compiler They may not apply Unknown options passed directly to cl6x command-line Allows specific usage of TI C6000 codegen tool Can still perform optimizations and options that only apply to the C6000 codegen tools

C6EZRun Making DSP Development EZ Supported Devices Availability OMAP-L13x, C6A816x, CAA814x, DM814x, DM816x, DM3730, DM6467, OMAP3530, Downloadable on product page Standard part of Software Development Kit Applicable Links C6EZTools Wiki www.ti.com/c6eztoolswiki C6EZRun Wiki www.ti.com/c6ezrunwiki TI Product Page www.ti.com/c6ezrun Technical Support Forums - http://e2e.ti.com/support/default.aspx Feedback/Feature Request Mailing list - C6EZRun@list.ti.com

C6EzAccel for System Developers

What can a System developer do on the DSP? Offload key signal-processing and math intensive algorithms. Leverage some DSP features, like floating-point computations Free up ARM MIPS for additional system features Save money by not moving to a more powerful & expensive ARM! 3 Ways to access the DSP from the ARM Call DSP directly over DSPLINK/SYLINK. Wrap the algorithm with IUniversal interface to Codec Engine Use C6EZRun to cross compile and redirect code for DSP.

C6EZAccel: ARM access to Ready-to-Use DSP functions Target Developers Challenges Strategy: C6EZAccel ARM SoC Developers Unfamiliar with Heterogeneous programming environments Graphical IDEs Inter-processor communication Codec-style build flow Enables Freeing MIPS on ARM Pre/Post Processing Offloading tasks to DSP 29

C6EZAccel APIs C6EZAccel for System Developers C6EZAccel provides ARM application access to production ready, performance tuned functions on the DSP DSP + ARM SOC ARM Application C6EZAccel optimized xdais Algorithms for: Audio Medical Vision Power and Control Image processing Signal processing ARM processor DSP ARM side API with easy access to 100 s of optimized DSP kernels Asynchronous execution allows ARM/DSP parallel processing Includes key signal processing, image processing and math functions

Basic Overview of C6EZAccel with the Software Stack

C6EZAccel application Consists of two parts: DSP server integrating the DSP algorithms and ARM side user interface DSP server collating DSPLink, CMEM, DSP/BIOS, and DSP libraries ARM C Application System On Chip DSP ARM user interface is a static library that abstracts the codec engine setting and ARM side cache management Function is retargeted to DSP, with error handling managed on ARM/Linux using error codes C6EZAccel ARM library DSP server 32

Leveraging C6EZAccel Unit Server Provide ARM and DSP software infrastructure to load and interact with C6EZAccel algorithm on the DSP Add to Codec Server Add custom function on the DSP Provide ARM and DSP software infrastructure to load and interact with C6EZAccel and A/V Multimedia codecs on the DSP Run any algorithm on the DSP 33

C6EZAccel Application Usage User Experience in ARM Application Headers and definitions C6accel_Handle hc6accel = NULL #define ENGINENAME " <platfrom_name>" #define ALGNAME "c6accel " #include " soc/c6accelw/c6accelw " Source code to call fxn1 and fxn2 on DSP CE_Runtime_init(); hc6accel = C6accel_create( );.. Status = C6accel_fxn1(hC6accel,parameters1); Status = C6accel_fxn2(hC6accel,parameters2); C6accel_delete(hC6accel); Creates a C6accel handle. Calls fxn1 and fxn2 on the DSP. Deletes C6Accel handle. Step 1: Analyze source to determine what to move to the DSP Step 2: Allocate all buffers to be passed to DSP using CMEM Step 3: Offload function to the DSP by invoking C6EZAccel APIs or CE APIs Step 4: Integrate DSP codec server with application using XDC configuration Step 5: Compile all ARM code with the linker command file generated from Step 4 34

C6EAccel: Key Features Asynchronous Calling Parallel processing on ARM and DSP Frees ARM MIPS, allows feature addition Maximum system level performance Source to call fxn1 asynchronously on DSP C6Accel_setAsync(hC6); C6accel_fxn1(hC6accel,parameters1); < ARM code to run in parallel> C6accel_waitAsyncCall(hC6accel); Chaining of APIs Averaging of Inter processor overhead Improved Efficiency to make multiple calls

Performance benefits of offloading C6EZAccel Overheads: Cache invalidation on ARM and the DSP Address translation from virtual to physical and vice versa DSP/SYSLINK overhead. Activating, processing, deactivating the C6EZAccel algorithm on the DSP DSP provides higher gains on larger images/data tasks to the DSP Performance of C6EZAccel vs Native ARM floating point FFT function

C6EZAccel Supported Processing Blocks Algorithms Now 2Q11 4Q11 Foundation Signal Processing Software Algorithms Digital Signal Processing 32 45 60 Image Processing 40 50 60 Floating Point Math 30 30 30 Fixed Point Math 32 32 32 Filter Package 128 Application Specific Software Algorithms Vision And Analytics Library (VLIB) 52 52 Open Source Computer Vision (OpenCV) 60 100 ProAudio: Audio Processing Library 20 Power and Energy 15 Total Supported Functions 37 134 269 497

C6EZAccel Making DSP Development EZ Supported Devices Availability OMAP-L13x, C6A816x, CAA814x, DM814x, DM816x, DM3730, DM6467, OMAP3530, Downloadable on product page Standard part of Software Development Kit Applicable Links C6EZTools Wiki www.ti.com/c6eztoolswiki C6EZAccel Wiki www.ti.com/c6ezaccelwiki TI Product Page www.ti.com/c6ezaccel Technical Support Forums - http://e2e.ti.com/support/default.aspx Feedback/Feature Request Mailing list - C6EZAccel@list.ti.com

C6EZRun versus C6EZAccel C6EZRun C6EZAccel For ARM/Linux developer on DSP+ARM devices For users with little to no knowledge of DSP development For users who are bringing their own C source code For users who don t need compatibility with other algorithms, codecs, etc. For users with some knowledge of DSP development For users who want to use TI s prebuilt, optimized DSP libraries For users who want to use other packaged components on the DSP

Thank you!