Overview. Technology Details. D/AVE NX Preliminary Product Brief

Similar documents
Mali-400 MP: A Scalable GPU for Mobile Devices Tom Olson

Dave Shreiner, ARM March 2009

PowerVR Hardware. Architecture Overview for Developers

Mali Developer Resources. Kevin Ho ARM Taiwan FAE

Overview. Think Silicon is a privately held company founded in 2007 by the core team of Atmel MMC IC group

Profiling and Debugging Games on Mobile Platforms

ARM Processors for Embedded Applications

The Bifrost GPU architecture and the ARM Mali-G71 GPU

Copyright Khronos Group Page 1. Vulkan Overview. June 2015

POWERVR MBX & SGX OpenVG Support and Resources

PowerVR Series5. Architecture Guide for Developers

TES Guiliani + BLU on Renesas Rx63N Embedded GUI Solution Kit - Introduction

Coming to a Pixel Near You: Mobile 3D Graphics on the GoForce WMP. Chris Wynn NVIDIA Corporation

Mobile Graphics Ecosystem. Tom Olson OpenGL ES working group chair

Bifrost - The GPU architecture for next five billion

! Readings! ! Room-level, on-chip! vs.!

Bringing it all together: The challenge in delivering a complete graphics system architecture. Chris Porthouse

Cover TBD. intel Quartus prime Design software

Applying the Benefits of Network on a Chip Architecture to FPGA System Design

3D Graphics in Future Mobile Devices. Steve Steele, ARM

Introduction to the Qsys System Integration Tool

Higher Level Programming Abstractions for FPGAs using OpenCL

Falanx Microsystems. Company Overview

Chapter 5. Introduction ARM Cortex series

PowerVR GPU IP from Wearables to Servers. Kristof Beets Director of Business Development May 2015

Next Generation OpenGL Neil Trevett Khronos President NVIDIA VP Mobile Copyright Khronos Group Page 1

Cover TBD. intel Quartus prime Design software

The Nios II Family of Configurable Soft-core Processors

ARM. Mali GPU. OpenGL ES Application Optimization Guide. Version: 3.0. Copyright 2011, 2013 ARM. All rights reserved. ARM DUI 0555C (ID102813)

Parallel Computing: Parallel Architectures Jin, Hai

CS427 Multicore Architecture and Parallel Computing

Graphics, Mobile Computing, APIs and Life

Designing with ALTERA SoC Hardware

Hardware Accelerated Graphics for High Performance JavaFX Mobile Applications

Antonio R. Miele Marco D. Santambrogio

Developing the Bifrost GPU architecture for mainstream graphics

Free Downloads OpenGL ES 3.0 Programming Guide

From Concept to Silicon

Computer and Hardware Architecture I. Benny Thörnberg Associate Professor in Electronics

Introduction to CUDA Algoritmi e Calcolo Parallelo. Daniele Loiacono

Introduction to CUDA Algoritmi e Calcolo Parallelo. Daniele Loiacono

YOUR PARTNER IN INNOVATION

Spring 2010 Prof. Hyesoon Kim. AMD presentations from Richard Huddy and Michael Doggett

Analyzing and Debugging Performance Issues with Advanced ARM CoreLink System IP Components

Multi-core microcontroller design with Cortex-M processors and CoreSight SoC

Multimedia in Mobile Phones. Architectures and Trends Lund

Graphics Hardware, Graphics APIs, and Computation on GPUs. Mark Segal

The NVIDIA GeForce 8800 GPU

Introduction to Multicore architecture. Tao Zhang Oct. 21, 2010

Working with Metal Overview

PowerVR: Getting Great Graphics Performance with the PowerVR Insider SDK. PowerVR Developer Technology

Hot Chips Bringing Workstation Graphics Performance to a Desktop Near You. S3 Incorporated August 18-20, 1996

The Challenges of System Design. Raising Performance and Reducing Power Consumption

Portland State University ECE 588/688. Graphics Processors

OpenGL on Android. Lecture 7. Android and Low-level Optimizations Summer School. 27 July 2015

ARM. Mali GPU. OpenGL ES Application Optimization Guide. Version: 2.0. Copyright 2011, 2013 ARM. All rights reserved. ARM DUI 0555B (ID051413)

ASYNCHRONOUS SHADERS WHITE PAPER 0

Graphics Architectures and OpenCL. Michael Doggett Department of Computer Science Lund university

Adding Advanced Shader Features and Handling Fragmentation

Bringing display and 3D to the C.H.I.P computer

Copyright Khronos Group Page 1

Whiz-Bang Graphics and Media Performance for Java Platform, Micro Edition (JavaME)

Graphics Hardware. Graphics Processing Unit (GPU) is a Subsidiary hardware. With massively multi-threaded many-core. Dedicated to 2D and 3D graphics

Practical Hardware Debugging: Quick Notes On How to Simulate Altera s Nios II Multiprocessor Systems Using Mentor Graphics ModelSim

Vulkan: Architecture positive How Vulkan maps to PowerVR GPUs Kevin sun Lead Developer Support Engineer, APAC PowerVR Graphics.

MICROTRONIX AVALON MULTI-PORT FRONT END IP CORE

Bringing display and 3D to the C.H.I.P computer

Motivation Hardware Overview Programming model. GPU computing. Part 1: General introduction. Ch. Hoelbling. Wuppertal University

Strato and Strato OS. Justin Zhang Senior Applications Engineering Manager. Your new weapon for verification challenge. Nov 2017

ARM Multimedia IP: working together to drive down system power and bandwidth

GoForce 3D: Coming to a Pixel Near You

Spring 2009 Prof. Hyesoon Kim

Modern Processor Architectures (A compiler writer s perspective) L25: Modern Compiler Design

Threading Hardware in G80

Lecture 26: Parallel Processing. Spring 2018 Jason Tang

ARM Cortex core microcontrollers 3. Cortex-M0, M4, M7

CSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller

CONTACT: ,

Mobile HW and Bandwidth

Real-Time Buffer Compression. Michael Doggett Department of Computer Science Lund university

Graphics Processing Unit Architecture (GPU Arch)

Achieving Console Quality Games on Mobile

Lecture 25: Board Notes: Threads and GPUs

SIGGRAPH Briefing August 2014

A Low Cost Tile-based 3D Graphics Full Pipeline with Real-time Performance Monitoring Support for OpenGL ES in Consumer Electronics

Spring 2009 Prof. Hyesoon Kim

Performance Optimization for an ARM Cortex-A53 System Using Software Workloads and Cycle Accurate Models. Jason Andrews

ST 软件 软件平台 2. TouchGFX

Lecture 6: Texture. Kayvon Fatahalian CMU : Graphics and Imaging Architectures (Fall 2011)

This Unit: Putting It All Together. CIS 371 Computer Organization and Design. Sources. What is Computer Architecture?

Inside VR on Mobile. Sam Martin Graphics Architect GDC 2016

Real-Time Rendering (Echtzeitgraphik) Michael Wimmer

Optimizing and Profiling Unity Games for Mobile Platforms. Angelo Theodorou Senior Software Engineer, MPG Gamelab 2014, 25 th -27 th June

ENHANCED TOOLS FOR RISC-V PROCESSOR DEVELOPMENT

Moving Mobile Graphics Advanced Real-time Shadowing. Marius Bjørge ARM

EE382N (20): Computer Architecture - Parallelism and Locality Spring 2015 Lecture 09 GPUs (II) Mattan Erez. The University of Texas at Austin

Press Briefing SIGGRAPH 2015 Neil Trevett Khronos President NVIDIA Vice President Mobile Ecosystem. Copyright Khronos Group Page 1

Tools To Get Great Graphics Performance

EECS 487: Interactive Computer Graphics

Google Workloads for Consumer Devices: Mitigating Data Movement Bottlenecks Amirali Boroumand

Transcription:

Overview D/AVE NX is the latest and most powerful addition to the D/AVE family of rendering cores. It is the first IP to bring full OpenGL ES 2.0/3.1 rendering to the FPGA and SoC world. Targeted for graphics user interfaces on displays up to 4K x 4K resolution in the Industrial, Medical, Military, Avionics, Automotive and Consumer markets D/AVE NX is designed to meet the sweet spot of performance and footprint. By enabling the use of state of the art shader based APIs even on small devices high quality 2D and full 3D applications can be satisfied using the D/AVE NX core. The possibility to work with a modern language like OpenGL ES 2.0/3.1 enables the user to rapidly build high-end user interfaces (e.g. based on QT or Android) and makes new, future proof implementations possible. D/AVE NX can scale easily to fit exactly into the resource/performance sweet spot for a particular application. Entire device families can be equipped with differently scaled variants of the core, making all of them fully software compatible. A single unified software stack and the guarantee to produce the exactly same visual result (at different speeds) allows saving significant development resources. D/AVE NX is highly efficient as the internal multi-level scheduler can maximize the utilization of every HW element even better than the fixed function pipeline of the successful D/AVE cores could. Scheduling also does not have to be pre-computed in the compiler simplifying the compiler and driver architecture considerably. Technology Details System Features Scalability throughout the entire design o Scaling from tiny footprint up to high end performance o Scalable unified shader architectures are a perfect fit for FPGAs o Exact same driver / software stack can be used on all versions o Enables the same output at different speeds Unified Shader Architecture o Dynamic, fully reconfigurable shaders o Efficient support for branches / divergent control flow o Fully IEEE compatible floating point ALUs (incl. rounding, denormals etc.) o Compressed shader binaries for higher code density o Non-constant varying indexing o True integer arithmetic (8bit, 16bit, 32bit) o Multi level caches for shader memory Massively parallel execution with fine grained Multithreading o Thread context switch near instantaneous (~2 cycles) TES Electronic Solutions GmbH 1 Copyright 2017

o Advanced task scheduler supporting both long and short term stalls o Architecture is able to eliminate hundreds of cycles of latency o Concurrent use from multiple applications o Job preemption possible on fragment level Bandwidth reduction techniques o On the fly lossless data compression/decompression o Sophisticated caching mechanisms Application optimization and debugging support o Multiple hardware performance counters o Detailed analysis of shader stalls and scheduling Pipelined architecture for high clock frequencies System security features o Stop on bus error for integration with memory protection units o Hardware out-of-framebuffer memory access protection Rendering Full support of all OpenGL ES 2.0/3.1 and VULKAN rendering features High render quality o Highly accurate sub pixel positioning, interpolation and filtering o Multiple anti-aliasing techniques (including MSAA) Effective texture compression o Multiple high quality standard codecs ETC2/EAC, ASTC o Maximizing the visual quality with a given bandwith o Saving storage and transfer resources Hardware supported blending o Normal alpha blending o Linear colorspace blending o Porter-Duff blending (alpha premultiply / postdivide) Various texture and framebuffer formats o 8 bit alpha/luminance, ARGB4444, ARGB1555, RGB565, ARGB8888 etc. o Wide formats with up to 128bit per pixel o Floating point texture support o 3d Texture and texture array support o Easily extendable High resolutions o Frame buffers and textures up to 4k x 4k pixels Support for Image Transformation & Warping Composition Engine Power Management Memory blocks controlled by Chip Select port Prepared for efficient automatic clock gating Global clock gating as option TES Electronic Solutions GmbH 2 Copyright 2017

Integration Low resource consumption (starting at 31K LE) Single clock domain architecture o Bus interface clock frequency may differ from core frequency High latency capable Optional internal arbitration to work with a single bus master Adaptors for common bus protocols o ARM AMBA: APB for register access, AHB or AXI (preferred) for memory bus master access o Intel PSG Avalon as bus adaptors for both register and bus master access o Other bus protocols can be easily adapted Resource Usage The actual resource usage of D/AVE NX depends mainly on the number of Shader Units (SUs), the number of Arithmetic Logic Units (ALUs) per shader unit and partly on the bus and cache configuration. Verification Concept A 100% algorithmic equivalent C model is used as reference for the verification of the RTL code. This realtime capable reference model is called Soft D/AVE which acts as a pixel accurate emulator on Windows PC. The emulator is also used for driver and application development. The following test metrics are used: Functional Coverage Code Coverage using Cadence IUS Static Code Analysis Timing Checks FPGA prototyping TES Electronic Solutions GmbH 3 Copyright 2017

Software Drivers TES provides Khronos conform OpenGL ES 2.0/3.1 and EGL drivers. Both drivers rely on a low level D/AVE NX driver layer abstracting hardware details like the register access and making porting to different CPUs / Operating systems a lot easier. All drivers have the following features: Fully reentrant & thread-safe Minimal OS dependency (HAL part separated) No inline assembler required Support for multiple D/AVE NX instances Multi-threading support, i.e. multiple applications can use D/AVE NX concurrently Small memory footprint Application Application Application EGL OpenGL/ES 2.0 OpenGL/ES 3.1 OpenGL backend Low level driver HAL part D/AVE NX D/AVE NX TES Electronic Solutions GmbH 4 Copyright 2017

D/AVE NX reference Qt system solution on Yocto-Linux TES delivers a complete reference system solution for Intel PSG SoCs supporting selected reference boards (e.g. D10-Nano SoC board featuring Cyclone 5 SoC) including: Qt 5.x Yocto Linux OS OpenGL ES 2.0 and EGL drivers for D/AVE NX and CDC-200NX as source code Qt and native OpenGL ES 2.0 example applications as source code Build scripts to check out required repositories (Yocto-Linux, Qt, meta layers, drivers, ) and build the D/AVE NX demo SD card image as well as the complete SDK. D/AVE NX as Megacore IP block (QSys component) CDC-200NX Display Controller as Megacore IP block (QSys component) The delivered package includes everything to evaluate the IP and start an own integration and application project. Sales & Marketing Contact TES Electronic Solutions GmbH Frankenstrasse 3 20097 Hamburg Germany graphics@tes-dst.com www.tes-dst.com TES Electronic Solutions GmbH 5 Copyright 2017