Isn't It Time You Got Faster, Quicker?

AltiVec Technology At-a-Glance

OVERVIEW

Motorola's advanced AltiVec technology is designed to enable host processors compatible with the PowerPC instruction-set architecture (ISA) to perform with significantly more general-purpose processing power. At the same time, this leading-edge technology is engineered to support high-bandwidth data processing and algorithmic-intensive computations, all in a single-chip solution.

With its ease-of-use software environment, AltiVec technology is engineered to bring exceptional power to applications such as telecom switches, IP telephony gateways, speech processing systems, image and video processing systems, virtual private network servers, high-resolution 3-D graphics and more.

AltiVec technology has proven itself to be a leader in enabling high performance: Motorola's MPC7455 was named 2001 Embedded Processor of the Year by In-Stat MDR, and according to the EEMBC, a consortium of semiconductor, compiler and RTOS vendors, the MPC7455 has the highest certified performance rating of any microprocessor in production the consortium has ever published.

AltiVec technology offers a programmable solution designed to easily migrate via software upgrades to follow changing standards and customer requirements. The bottom line? With AltiVec technology and host processors compatible with the PowerPC ISA, your technology investment is protected well into the future.

This at-a-glance guide to AltiVec technology is designed to give you the information you need to make the right choices about processors and performance. This guide includes a roadmap, features and benefits, benchmarks and URLs to help you find more information.

THE SOLUTION FOR EMBEDDED COMPUTING CHALLENGES

With its high performance and ease-of-use software environment, AltiVec technology offers a single-chip solution to many common embedded computing challenges. AltiVec technology enables:

High-bandwidth data communications
Packet data processing
Image and video processing
Access concentrators/DSLAMs
  - ADSL and digital data concentrators
Speech recognition
Voice/sound processing
Array numeric processing
Basestation processing
Real-time continuous speech I/O
  - HMM, Viterbi acceleration, neural algorithms
3-D graphics
  - Games, entertainment
  - High-precision CAD
Virtual reality
Motion video
  - MPEG2, MPEG4
  - H.234
High-fidelity audio
  - 3-D audio, AC-3, MP3
Machine intelligence

[Figure: block diagram. The instruction stream is dispatched to the integer unit (IU), the FPU and the vector unit, which use the GPRs (32 bits), the FPRs (64 bits) and the vector register file (128 bits) and connect to cache/memory.]

ALTIVEC EXECUTION OF MULTIPLY-ACCUMULATE

[Figure: multiply-accumulate data flow with source vectors va, vb and vc, an intermediate product (Prod) and destination vector vd.]

ALTIVEC TECHNOLOGY'S VECTOR EXECUTION UNIT

Vector execution unit is concurrent with integer and floating point units (FPUs)
32 separate, dedicated 128-bit vector registers
  - Large namespace for low register pressure/spillage
  - Separate files are accessible by execution units in parallel
  - Deep register files allow for sophisticated software optimizations
  - Long vector length enables more data-level parallelism
No penalty for mingling integer, FPU and AltiVec technology operations
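One plausible reading of the multiply-accumulate figure is vd = va * vb + vc, applied to every element of the vectors at once. The sketch below emulates that in portable scalar C, assuming four 32-bit float lanes per 128-bit register. The type `vec4f` and the function `vec_madd_sketch` are hypothetical names chosen for illustration; they are not AltiVec intrinsics, and real hardware would perform all four lanes in a single instruction rather than in a loop.

```c
#include <stddef.h>

#define LANES 4  /* assumption: four 32-bit floats fill one 128-bit register */

/* Hypothetical stand-in for a 128-bit vector register. */
typedef struct { float lane[LANES]; } vec4f;

/* Element-wise multiply-accumulate: vd = va * vb + vc. */
static vec4f vec_madd_sketch(vec4f va, vec4f vb, vec4f vc)
{
    vec4f vd;
    for (size_t i = 0; i < LANES; i++) {
        float prod = va.lane[i] * vb.lane[i];  /* the "Prod" stage */
        vd.lane[i] = prod + vc.lane[i];        /* accumulate into vd */
    }
    return vd;
}
```

Because the destination is separate from the three sources, none of the inputs is overwritten, which matches the non-destructive instruction format the guide describes.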

ALTIVEC TECHNOLOGY FEATURES AND BENEFITS

ALTIVEC TECHNOLOGY BENEFITS

Designed to provide a single, high-performance RISC microprocessor with DSP-like computing power for controller and signal processing functions
  - Supplements performance-leading host processors compatible with the PowerPC ISA with an advanced, best-in-class execution unit
  - Vector processing engine designed to provide for highly parallel operations, which can allow for the simultaneous execution of up to 16 operations in a single clock cycle
  - Designed to accelerate many traditional computing and embedded processing operations with its wide data paths and field operations
Designed to provide product designers and customers with an innovative one part/one code integrated approach engineered to converge system control functionality with specialized functionality typically resident on off-chip devices
Offers a programmable solution designed to easily migrate via software upgrades to follow changing standards and customer requirements
  - Simplifies design and support: programmable in flexible extensions to C language
  - Designed to allow customers to leverage PowerPC compatibility and legacy code, and add AltiVec performance as they need it

ALTIVEC TECHNOLOGY FEATURES

SIMD functionality for embedded applications with massive data processing needs
Key features:
  - 128-bit vector execution unit with 32-entry, 128-bit register file
  - Parallel processing with vector permute unit and vector arithmetic logical unit
  - 162 additional instructions
  - Advanced data types such as packed byte, halfword and word integers, and packed IEEE single-precision floats
  - Saturation arithmetic
Simplified architecture
  - Virtually no interrupts other than data storage interrupt on loads and stores
  - Allows hardware unaligned access support
  - Virtually no penalty for running AltiVec and standard PowerPC instructions simultaneously
  - Streamlined architecture to facilitate efficient implementation
Maintains PowerPC ISA's RISC register-to-register programming model
Supports parallel operation on byte, halfword, word and 128-bit operands
  - Intra and inter-element arithmetic instructions
  - Intra and inter-element conditional instructions
  - Powerful permute, shift and rotate instructions
Vector integer and floating-point arithmetic
  - Data types
      - 8-, 16- and 32-bit signed and unsigned integer data types
      - 32-bit IEEE single-precision floating-point data type
      - 8-, 16- and 32-bit Boolean data types (e.g., 0xFFFF = 16-bit TRUE)
  - Modulo and saturation integer arithmetic
  - 32-bit IEEE-default single-precision floating point arithmetic
      - IEEE-default exception handling
      - IEEE-default round-to-nearest
      - Fast non-IEEE mode (e.g., denorms flushed to zero)
Control flow with highly flexible bit manipulation engine
  - Compare creates field mask used by select function
  - Compare RC bit enables setting Condition Register
      - Trivial accept/reject in 3-D graphics
      - Exception detection via software polling
Available library
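The difference between modulo and saturation arithmetic, and the way a compare result feeds a select, may be easier to see in code. The sketch below works on a single lane in portable scalar C; a vector instruction would apply the same rule to every lane of a 128-bit register at once. All function names here are hypothetical and chosen for illustration only; they are not AltiVec intrinsics.

```c
#include <stdint.h>

/* Modulo (wrap-around) add on one unsigned 8-bit lane. */
static uint8_t add_modulo_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);  /* e.g. 250 + 10 wraps around to 4 */
}

/* Saturating add on one unsigned 8-bit lane: clamps at 255. */
static uint8_t add_saturate_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + b;
    return sum > UINT8_MAX ? UINT8_MAX : (uint8_t)sum;
}

/* Saturating add on one signed 16-bit lane: clamps to [-32768, 32767]. */
static int16_t add_saturate_s16(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + b;
    if (sum > INT16_MAX) return INT16_MAX;
    if (sum < INT16_MIN) return INT16_MIN;
    return (int16_t)sum;
}

/* Compare: produce a 16-bit Boolean, 0xFFFF for TRUE or 0x0000 for FALSE. */
static uint16_t cmp_gt_mask_s16(int16_t a, int16_t b)
{
    return a > b ? 0xFFFF : 0x0000;
}

/* Select: take bits from b where the mask is 1 and from a where it is 0. */
static uint16_t select_u16(uint16_t a, uint16_t b, uint16_t mask)
{
    return (uint16_t)((a & ~mask) | (b & mask));
}

/* Branch-free maximum of one lane, built from compare plus select. */
static int16_t max_s16(int16_t a, int16_t b)
{
    uint16_t mask = cmp_gt_mask_s16(b, a);  /* TRUE where b > a */
    return (int16_t)select_u16((uint16_t)a, (uint16_t)b, mask);
}
```

Saturation suits audio and pixel data, where clamping at the maximum is less objectionable than wrapping around to a small value. The compare-then-select pattern replaces a per-element branch with a mask, which is what lets one instruction make a different choice in each lane.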

ABOUT 128-BIT SIMD VECTOR ARCHITECTURE

128-BIT VECTOR ARCHITECTURE FEATURES

128-bit wide data paths between L1 cache, L2 cache, load/store units and registers
  - Wider data paths speed save and restore operations
Offers SIMD processing support for the following:
  - 16-way parallelism for 8-bit signed and unsigned bytes and characters
  - 8-way parallelism for 16-bit signed and unsigned halfwords
  - 4-way parallelism for 32-bit signed and unsigned integers and IEEE floating point numbers
Four fully pipelined independent execution units
  - Vector permute unit is a highly flexible byte manipulation engine
  - Vector simple fixed-point, vector complex fixed-point, and vector floating-point execution engines
  - Dual AltiVec instruction issue

Without the power of AltiVec technology, the code may have to call a routine six times to perform the same operation on multiple pieces of data. With AltiVec technology, the routine may be run only once, on all six sections of data simultaneously.

SAMPLE-BASED PROCESSING

SISD (Single Instruction, Single Data) AC3 - Audio Decode

    do {
        decode (channel 1)
        decode (channel 2)
        decode (channel 3)
        decode (channel 4)
        decode (channel 5)
        decode (channel 6)
    } while (Amplifier is on; step time)

SIMD (Single Instruction, Multiple Data) AC3 - Audio Decode

    do {
        decode (channel 1, channel 2, channel 3, channel 4, channel 5, channel 6)
    } while (Amplifier is on; step time)

Approximately 6x performance improvement

ALTIVEC INSTRUCTION SET

162 instructions added to the PowerPC ISA
4 operand, non-destructive instructions
  - Up to three source operands and a single destination operand
  - Supports advanced multiply-add/sum and permute primitives
Instructions fully pipelined with single-cycle throughput
  - Simple ops: 1 cycle latency
  - Compound ops: 3-4 cycle latency
  - No restriction on issue with scalar instructions
Enhanced cache/memory interface
  - Software hints for data re-use probability
  - Prefetch support (stride-n access)
Simplified load/store architecture
  - Simple byte, halfword, word and quadword loads and stores
  - Virtually no unaligned accesses: software-managed via permute instruction
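The last point, handling unaligned data in software with the permute instruction, can be sketched as two aligned loads followed by one permute that picks out the 16 bytes actually wanted. The code below is a portable scalar emulation under stated assumptions: the buffer is treated as starting on a 16-byte boundary, it is long enough to cover every aligned block touched, and the names `vec16b`, `load_aligned`, `permute` and `load_unaligned` are hypothetical illustrations rather than AltiVec intrinsics.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VEC_BYTES 16  /* one 128-bit register */

/* Hypothetical stand-in for a 128-bit vector register holding 16 bytes. */
typedef struct { uint8_t byte[VEC_BYTES]; } vec16b;

/* Aligned load: the low four bits of the offset are ignored, so the
 * load always starts on a 16-byte boundary of the buffer. */
static vec16b load_aligned(const uint8_t *buf, size_t offset)
{
    vec16b v;
    memcpy(v.byte, buf + (offset & ~(size_t)(VEC_BYTES - 1)), VEC_BYTES);
    return v;
}

/* Permute: each control byte picks one of the 32 bytes in a followed by b. */
static vec16b permute(vec16b a, vec16b b, vec16b control)
{
    vec16b out;
    for (size_t i = 0; i < VEC_BYTES; i++) {
        uint8_t index = control.byte[i] & 0x1F;
        out.byte[i] = index < VEC_BYTES ? a.byte[index]
                                        : b.byte[index - VEC_BYTES];
    }
    return out;
}

/* Unaligned load assembled from two aligned loads and one permute. */
static vec16b load_unaligned(const uint8_t *buf, size_t offset)
{
    vec16b low  = load_aligned(buf, offset);
    /* offset + 15 lands in the next block only when offset is misaligned */
    vec16b high = load_aligned(buf, offset + VEC_BYTES - 1);
    vec16b control;
    for (size_t i = 0; i < VEC_BYTES; i++) {
        control.byte[i] = (uint8_t)((offset & (VEC_BYTES - 1)) + i);
    }
    return permute(low, high, control);
}
```

Using `offset + 15` for the second load means an already aligned offset reads the same block twice instead of touching memory past the data, at the cost of one redundant load in that case.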

ALL ABOUT ALTIVEC TECHNOLOGY

WHAT IS A VECTOR ARCHITECTURE, ANYWAY?

Designed to allow the simultaneous processing of many data items in parallel
Has roots in supercomputing, which attempted to extract large amounts of parallelism from software
Performs operations on multiple data elements by a single instruction, called Single Instruction, Multiple Data (SIMD) parallel processing
AltiVec technology is a short SIMD vector architecture
  - Uses 128-bit wide registers to provide 4-, 8- or 16-way parallelism
  - Supports a wide variety of data types
SIMD extension to host processors compatible with the PowerPC ISA
  - Processes multiple data streams/blocks in a single cycle
  - Common approach to accelerate processing of next-generation data types (audio, video, packet data)

MOTOROLA'S HIGH-PERFORMANCE EMBEDDED MICROPROCESSOR PRODUCTS

[Figure: product roadmap plotting features against time. The left edge of each bar marks the first sample date and the right edge marks product qualification. Process technologies shown: 0.18µ, 0.18µ SOI, 0.13µ SOI.]

MPC7410: L Spec 400-500 MHz, MPX: 133 MHz
MPC7451/41: L Spec 600-667 MHz, MPX: 133 MHz, 256K L2
MPC7455/45: L Spec 600-1000 MHz, MPX: 133 MHz, 256K L2
MPC7457/47: L Spec > 1 GHz, MPX: 166-200 MHz, 512K L2
G4+ MPC7451/55/57 platform (MPC745x/44x): 7-stage pipeline, pin-for-pin compatible, SMP and AltiVec, 484/360 pin, L3 cache interface, 2+ GHz
G4 platform with AltiVec: RapidIO, higher level of integration

Except for historical information, all of the expectations and assumptions contained in the foregoing are forward-looking statements involving risk and uncertainties. Important factors that could cause actual results to differ materially from such forward-looking statements include, but are not limited to, the competitive environment for our products, changes of rates of all related services and legislation that may affect the industry. For additional information regarding these and other risks associated with Company's business, refer to the Company's reports with the SEC.

BENCHMARKING DATA

EEMBC RESULTS: TELECOMMUNICATIONS AND NETWORKING WITH ALTIVEC TECHNOLOGY

All scores are for the MPC7455 @ 1 GHz.

Benchmark                              With AltiVec   Without AltiVec   Improvement
EEMBC Telemark (Telecommunications)    121.6          28.3              4x faster
EEMBC Netmark (Networking)             98.4           29.4              3x faster

The EEMBC Certification Laboratories, LLC (ECL) has certified these scores according to the rules established by the EEMBC Board of Directors and ECL. These scores are repeatable and the disclosure information on the EEMBC Web site has all been verified.

www.eembc.org
www.ebenchmarks.com

FOR MORE INFORMATION

Find more information about AltiVec technology embedded in Motorola's G4 processors at www.motorola.com/altivec

Libraries
  - May be linked via standard third-party compilers
  - Contain elements that have been shown to be effective by EEMBC's networking and telecom benchmark suites
Application notes
  - Software code may be incorporated into customer's specific code, e.g., FFT, DCT, Invert, etc.
Customer code
  - Motorola's software engineers are available to help customers take advantage of the power of AltiVec technology in their code.

For more information about Motorola's products: www.motorola.com/semiconductors
For additional tech questions: www.motorola.com/semiconductors/support

MOTOROLA and the Stylized M Logo are registered in the U.S. Patent and Trademark Office. All other product or service names are the property of their respective owners. © Motorola, Inc. 2002
ALTIVECGLANCE/D REV 1