Processor Applications. The Processor Design Space. World s Cellular Subscribers. Nov. 12, 1997 Bob Brodersen (http://infopad.eecs.berkeley.

Similar documents
ECE 747 Digital Signal Processing Architecture. DSP Implementation Architectures

EEM870 Embedded System and Experiment Lecture 3: ARM Processor Architecture

Introduction to Microprocessor

ARM Processors for Embedded Applications

Microcomputer Architecture and Programming

Memory Systems IRAM. Principle of IRAM

CREATED BY M BILAL & Arslan Ahmad Shaad Visit:

Chapter 5. Introduction ARM Cortex series

Implementation of DSP Algorithms

Independent DSP Benchmarks: Methodologies and Results. Outline

Mobile Processors. Jose R. Ortiz Ubarri

REAL TIME DIGITAL SIGNAL PROCESSING

Embedded Computation

7/28/ Prentice-Hall, Inc Prentice-Hall, Inc Prentice-Hall, Inc Prentice-Hall, Inc Prentice-Hall, Inc.

Vertex Shader Design I

Module 1. Introduction. Version 2 EE IIT, Kharagpur 1

3.1 Description of Microprocessor. 3.2 History of Microprocessor

General Purpose Signal Processors

Classification of Semiconductor LSI

EMBEDDED SYSTEM BASICS AND APPLICATION

Chapter 1 Introduction

Microprocessors and Microcontrollers. Assignment 1:

VIII. DSP Processors. Digital Signal Processing 8 December 24, 2009

CMPE 310: Systems Design and Programming

CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS

Embedded Computing Platform. Architecture and Instruction Set

Lode DSP Core. Features. Overview

Microprocessors vs. DSPs (ESC-223)

Interfacing a High Speed Crypto Accelerator to an Embedded CPU

SYSTEM BUS AND MOCROPROCESSORS HISTORY

CHAPTER 4 MARIE: An Introduction to a Simple Computer

Chapter 15 ARM Architecture, Programming and Development Tools

ARM ARCHITECTURE. Contents at a glance:

EE 354 Fall 2015 Lecture 1 Architecture and Introduction

Basic FPGA Architectures. Actel FPGAs. PLD Technologies: Antifuse. 3 Digital Systems Implementation Programmable Logic Devices

Module 2. Embedded Processors and Memory. Version 2 EE IIT, Kharagpur 1

Embedded Systems. 7. System Components

DSP Platforms Lab (AD-SHARC) Session 05

Universität Dortmund. ARM Architecture

ADSP-2100A DSP microprocessor with off-chip Harvard architecture. ADSP-2101 DSP microcomputer with on-chip program and data memory

2. List the five interrupt pins available in INTR, TRAP, RST 7.5, RST 6.5, RST 5.5.

A 50Mvertices/s Graphics Processor with Fixed-Point Programmable Vertex Shader for Mobile Applications

Digital Semiconductor Alpha Microprocessor Product Brief

COMP2121: Microprocessors and Interfacing. Introduction to Microprocessors

CPE300: Digital System Architecture and Design

SA-1500: A 300 MHz RISC CPU with Attached Media Processor*

The ARM10 Family of Advanced Microprocessor Cores

Embedded Systems: Hardware Components (part I) Todor Stefanov

CHETTINAD COLLEGE OF ENGINEERING AND TECHNOLOGY COMPUTER ARCHITECURE- III YEAR EEE-6 TH SEMESTER 16 MARKS QUESTION BANK UNIT-1

COPROCESSOR APPROACH TO ACCELERATING MULTIMEDIA APPLICATION [CLAUDIO BRUNELLI, JARI NURMI ] Processor Design

Assembly Language for Intel-Based Computers, 4 th Edition. Kip R. Irvine. Chapter 2: IA-32 Processor Architecture

Microprocessors/Microcontrollers

SH-Mobile3: Application Processor for 3G Cellular Phones on a Low-Power SoC Design Platform

Understanding the basic building blocks of a microcontroller device in general. Knows the terminologies like embedded and external memory devices,

Microprocessors I MICROCOMPUTERS AND MICROPROCESSORS

6.111 Final Project Jonathan Downey Lauri Kauppila Brian Myhre

Typical DSP application

Microprocessor and Microcontroller question bank. 1 Distinguish between microprocessor and microcontroller.

Lecture 12. Motivation. Designing for Low Power: Approaches. Architectures for Low Power: Transmeta s Crusoe Processor

Age nda. Intel PXA27x Processor Family: An Applications Processor for Phone and PDA applications

REAL TIME DIGITAL SIGNAL PROCESSING

Lecture1: introduction. Outline: History overview Central processing unite Register set Special purpose address registers Datapath Control unit

Contents of this presentation: Some words about the ARM company

An introduction to Digital Signal Processors (DSP) Using the C55xx family

Computer Architecture s Changing Definition

Lecture 1 Introduction to Microprocessors

1. Microprocessor Architectures. 1.1 Intel 1.2 Motorola

TINY System Ultra-Low Power Sensor Hub for Always-on Context Features

The Nios II Family of Configurable Soft-core Processors

Advance Information 24-BIT GENERAL PURPOSE DIGITAL SIGNAL PROCESSOR

An introduction to DSP s. Examples of DSP applications Why a DSP? Characteristics of a DSP Architectures

Hardware Level Organization

High-performance and Low-power Consumption Vector Processor for LTE Baseband LSI

Digital Signal Processor 2010/1/4

Advanced processor designs

Ali Karimpour Associate Professor Ferdowsi University of Mashhad

ASSEMBLY LANGUAGE MACHINE ORGANIZATION

Ali Karimpour Associate Professor Ferdowsi University of Mashhad

Chapter 9: A Closer Look at System Hardware

CISC / RISC. Complex / Reduced Instruction Set Computers

Assembly Language for Intel-Based Computers, 4 th Edition. Chapter 2: IA-32 Processor Architecture. Chapter Overview.

Chapter 9: A Closer Look at System Hardware 4

Nios Soft Core Embedded Processor

CS Computer Architecture Spring Lecture 01: Introduction

Where Does The Cpu Store The Address Of The

Superscalar Machines. Characteristics of superscalar processors

Chapter 4. Enhancing ARM7 architecture by embedding RTOS

Computer Organization and Microprocessors SYLLABUS CHAPTER - 1 : BASIC STRUCTURE OF COMPUTERS CHAPTER - 3 : THE MEMORY SYSTEM

Segment 1A. Introduction to Microcomputer and Microprocessor

Software Defined Modem A commercial platform for wireless handsets

MIPS Technologies MIPS32 M4K Synthesizable Processor Core By the staff of

,1752'8&7,21. Figure 1-0. Table 1-0. Listing 1-0.

Superscalar Processors

DSP Co-Processing in FPGAs: Embedding High-Performance, Low-Cost DSP Functions

CPE300: Digital System Architecture and Design

The Central Processing Unit

UNIT II PROCESSOR AND MEMORY ORGANIZATION

SAE5C Computer Organization and Architecture. Unit : I - V

instruction fetch memory interface signal unit priority manager instruction decode stack register sets address PC2 PC3 PC4 instructions extern signals

2008/12/23. System Arch 2008 (Fire Tom Wada) 1

Transcription:

Processor Applications CS 152 Computer Architecture and Engineering Introduction to Architectures for Digital Signal Processing Nov. 12, 1997 Bob Brodersen (http://infopad.eecs.berkeley.edu) 1 General Purpose - high performance Pentiums, Alpha s, SPARC Used for general purpose software Heavy weight OS - UNIX, NT Workstations, PC s Embedded processors and processor cores ARM, 486SX, Hitachi SH7000, NEC V800 Single program Lightweight, often realtime OS DSP support Cellular phones, consumer electronics (e.g. CD players) Microcontrollers Extremely cost sensitive Small word size - 8 bit common Highest volume processors by far Automobiles, toasters, thermostats,... Increasing Cost Increasing volume 2 The Processor Design Space World s Cellular Subscribers Performance Application specific architectures for performance Embedded processors Microcontrollers Cost is everything Cost Microprocessors Performance is everything & Software rules 3 Millions 700 600 500 400 300 200 100 Analog Digital 0 1993 1994 1995 1996 1997 1998 1999 2000 2001 Will provide a ubiquitous infrastructure for wireless data as well as voice Year Source: Ericsson Radio Systems, Inc.

Multimedia I/O Architecture Embedded applications Radio Modem Embedded Processor E.g. Multimedia terminal electronics Graphics Out Sched ECC Pact Interface Uplink Radio Video I/O Low Power Bus Downlink Radio Voice I/O Pen In Data Flow SRAM FB Fifo Graphics Pen Fifo Audio Video Decomp Video Future chips will be a mix of processors, memory and dedicated hardware for specific algorithms and I/O µp Video Unit Coms custom DSP 5 6 Requirements of the Embedded Processors Area of processor cores = Cost Optimized for a single program - code often in on-chip ROM or off chip EPROM Minimum code size (one of the motivations initially for Java) Performance obtained by optimizing datapath Low cost Lowest possible area Technology behind the leading edge High level of integration of peripherals (reduces system cost) Fast time to market Compatible architectures (e.g. ARM) allows reuseable code Customizable core Nintendo processor Cellular phones Low power if application requires portability 7 8

Another figure of merit Computation per unit area National Semiconductor - Embedded Processor Family??? Nintendo processor Cellular phones 9 Simple architecture 3 stage pipeline - fetch - decode - execute Minimum power and size Short pipeline avoids branch prediction and bypass Versions range from 8-64 bit - choose minimum that meets requirements 10 Code size Example application (single chip system) If a majority of the chip is the program stored in ROM, then code size is a critical issue The Piranha has 3 sized instructions - basic 2 byte, and 2 byte plus 16 or 32 bit immediate 11 12

The DSP Module (DSPM) The National DSP Module Architecture Vector instructions directly supported Pipelined datapath supprts single cycle: Multiply, Add, Shift, Load/Store and Pointer adjustment Operates in parallel to processor core Saturation, overflow and rounding for ALU operations Automatic support for cyclic buffers (modulo arithmetic) Zero overhead repeat Three simultaneous addresses X Y Z Single cycle MAC support is typical for DSP acceleration 13 14 The 486 Embedded Processor The Embedded Features of the 486 GX Said to be designed for embedded batteryoperated and hand-held applications (???) Fully static design (clock can stop and all state is kept) Auto Clock Freeze stops circuits which are not being used in a given instruction (gated clocks) Stop Clock (60 µw), Stop Grant - clock runs but no program execution (40-85 mw) Split power supply - 2.0-3.3 Volt core, 3.3V. I/O, 15 16

Characterizing programs for their energy consumption Power = C V 2 f clock Power 130 mw 350 mw 190 mw 430 mw 290 mw 540 mw 490 mw 730 mw Process Subframe 330µW ComputeLag 107µW IFilterCodebook 63µW QuantizeGains 46µW CodebookSearch 44µW UpdateFilterState 8µW OrthogonalizeCodebook 6µW ComputeWeightedInput 22µW ThetaToCodeword 8µW ComputeLag(...) { R=dotprod(res,res); for (lag=0..127) { lp=getlt(lt); G = dotprod(lp, lp); } } Note the clock rates 17 mw 20 mw 23 mw 30 mw Top four functions account for 90 % of the power 65% of power dissipation in dot-vector products (data obtained from profiling of C++-code, weighted with estimated instruction energy costs) 17 18 An architecture optimized for multiplyaccumulate AddressGen MAC AddressGen MAC Energy/Flexibility Tradeoff s Arm 6 core (5V, 20 MHz):.02 MIPS/mW ZSP DSP Superscaler (3V, 200 MHz).3 MOPS/mW L G C Control Processor Reconfigurable Dot-Vector Processor (1.5V, 30 MHz) 5.9 MIPS/mW * MOPS = millions of operations/sec = millions of MACS/sec 19