Storage I/O Summary. Lecture 16: Multimedia and DSP Architectures
|
|
- Audra Green
- 6 years ago
- Views:
Transcription
1 Storage I/O Summary Storage devices Storage I/O Performance Measures» Throughput» Response time I/O Benchmarks» Scaling to track technological change» Throughput with restricted response time is normal measure I/O System design» Balanced design» Performance determined by weakest link Buses Processor Interface Issues 1 Lecture 16: Multimedia and DSP Architectures Prof. Kunle Olukotun EE 282h Fall 98/99 2
2 Input/Output (I/O) more than just disks and networks Secondary storage» disks, tapes, CD ROMs Communication» networks Human interface» video, audio, keyboards,» Multimedia Real world interface» temperature, pressure, position, velocity, voltage, current,» Digital Signal Processing (DSPs) 3 New Architecture Direction media processing will become the dominant force in computer arch. & microprocessor design.... new media-rich applications... involve significant real-time processing of continuous media streams, and make heavy use of vectors of packed 8-, 16-, and 32-bit integer and Fl. Pt. Needs include high memory BW, high network BW, continuous media data types, real-time response, fine grain parallelism How Multimedia Workloads Will Change Processor Design, Diefendorff & Dubey, IEEE Computer (9/97) 4
3 Multimedia Workloads Multimedia» Videoconferencing» Video authoring» Animation» Games Algorithms» Image compression (JPEG)» Video compression (MPEG)» 3D- graphics» encryption 5 Multimedia Characteristics Real-time response» Video, audio Continuous media data-types» 8-16 bits sufficient for many applications Data parallelism» E.g. Apply same operation to whole image» Vector or SIMD works well Coarse grained parallelism» E.g. video encoding/decoding, audio encoding /decoding Small loops» most time spent in kernel» Amenable to hand optimization High memory bandwidth» Video, 3D graphics» Caches not large enough 6
4 Multimedia ISA Extensions HP PA-RISC» MAX-2 SUN SPARC» VIS Intel x86» MMX MIPS» MDMX Power PC» Altivec 7 MMX MMX Technology Extension to the Intel Architecture Alex Peleg and Uri Weiser IEEE Micro, August 1996 Goals» Improve performance of multimedia applications Graphics, MPEG video Image processing, speech recognition» Remain completely compatible with the Intel x86 ISA» Minimize cost Approach» Use packed data types» Exploit SIMD parallelism» Make use of existing wide data paths 8
5 Data Types Three fixed-point integers fit into 64 bit quad word» Packed byte, 8 8-bit bytes» Packed word, 4 16-bit words (most common data type)» Packed doubleword, 2 32-bit doubleword User controlled fixed point» Advantage?» Disadvantage? 9 MMX Operands Eight 64-bit GP registers (MM0-MM7) MMX shares FPU» Why?» Can t do FP computation and MMX at the same time» What about graphics? Random access» Learned lesson from FPU design MMX Floating Point 10
6 MMX Operations 57 MMX instructions work on all data types Support for saturation arithmetic» Simplifies handling of underflow and overflow» Matches physical behavior Packed operations» Addition/subtraction» Multiplication» Compares» Shift Conversion operations» Pack» Unpack Performance improvement» Fewer loads and stores» Fewer arithmetic operations, but more conversion operations» Limitations 11 MMX Operations Unpacking byte data. 12
7 MMX ISA Summary 13 Using MMX Assembly language coding Use of libraries» E.g. IDCT, DCT, matrix multiply Use of C macros ( intrinsics )» Generate optimized assembly code» Performs register allocation and instruction scheduling MMX64 t0, t1; t0 = paddd(t0, t1);» Requires intimate knowledge of MMX Could a compiler generate MMX code? 14
8 Chroma Keying weatherman example for (i = 0; i < image_size; i++) new_image = (x[i] == Blue)? y[i] : x[i]; movq mm3, mem1 load 8 pixels from weatherman movq mm4, mem2 load 8 pixels from map pcmpeq mm1, mm3 generate select mask pand mm4, mm1 and map with mask pandn mm1, mm3 and weatherman with inverse mask por mm4, mm1 or masked images together 15 Chroma Keying 16
9 FIR filter on a GP Processor loop: lw r5, 0(r1) lw r6, 0(r2) mul r7, r5,r6 add r8,r8,r7 addi r1,r1,4 addi r2,r2,4 addi r4,r4,-1 bnez r4,loop sw r8,0(r3) 17 DSP vs. General Purpose MPU The MIPS/MFLOPS of DSPs is speed of Multiply-Accumulate (MAC).» DSP are judged by whether they can keep the multipliers busy 100% of the time. The SPECDSP is 4 algorithms:» Infinite Impulse Response (IIR) filters» Finite Impulse Response (FIR) filters» FFT, and» convolvers In DSPs, algorithms are king!» Binary compatibility not an issue Software is not (yet) king in DSPs.» People still write in assembly language for a product to minimize the die area for ROM in the DSP chip.» I predict this will change as algorithms get more complicated 18
10 DSP Instructions May specify multiple operations in a single instruction Must support Multiply-Accumulate (MAC) Need parallel move support Usually have special loop support to reduce branch overhead» Loop an instruction or sequence» 0 value in register usually means loop maximum number of times» Must be sure if calculate loop count that 0 does not mean 0 May have saturating shift left arithmetic May have conditional execution to reduce branches 19 DSP Data Path: Precision DSPs typically work with fixed point fractions Word size affects precision of fixed point numbers DSPs have 16-bit, 20-bit, or 24-bit data words Floating Point DSPs cost 2X - 4X vs. fixed point, slower than fixed point DSP programmers scale values inside code» SW Libraries» Separate explicit exponent Blocked Floating Point single exponent for a group of fractions Floating point support simplifies software development» All DSP benchmarks that we use will be floating point 20
11 DSP Data Path: Multiplier Specialized hardware performs all key arithmetic operations in 1 cycle 50% of instructions can involve multiplier => single cycle latency multiplier Need to perform multiply-accumulate (MAC) n-bit multiplier => 2n-bit product 21 DSP Memory FIR Tap implies multiple memory accesses DSPs want multiple data ports Some DSPs have ad hoc techniques to reduce memory bandwidth demand» Instruction repeat buffer: do 1 instruction 256 times» Often disables interrupts, thereby increasing interrupt response time Some recent DSPs have instruction caches» Even then may allow programmer to lock in instructions into cache» Option to turn cache into fast program memory No DSPs have data caches May have multiple data memories 22
12 DSP Addressing Have standard addressing modes: immediate, displacement, register indirect Want to keep MAC datapath busy Assumption: any extra instructions imply clock cycles of overhead in inner loop => complex addressing is good => don t use datapath to calculate fancy address Autoincrement/Autodecrement register indirect» lw r1,0(r2)+ => r1 <- M[r2]; r2<-r2+1» Option to do it before addressing, positive or negative 23 DSP Addressing: Buffers DSPs dealing with continuous I/O Often interact with an I/O buffer (delay lines) To save memory, buffer often organized as circular buffer What can do to avoid overhead of address checking instructions for circular buffer? Option 1: Keep start register and end register per address register for use with autoincrement addressing, reset to start when reach end of buffer Option 2: Keep a buffer length register, assuming buffers starts on aligned address, reset to start when reach end Every DSP has modulo or circular addressing 24
13 DSP Addressing: FFT FFTs start or end with data in bufferfly order 0 (000) => 0 (000) 1 (001) => 4 (100) 2 (010) => 2 (010) 3 (011) => 6 (110) 4 (100) => 1 (001) 5 (101) => 5 (101) 6 (110) => 3 (011) 7 (111) => 7 (111) What can do to avoid overhead of address checking instructions for FFT? Have an optional bit reverse address addressing mode for use with autoincrement addressing Many DSPs have bit reverse addressing for radix-2 FFT 25 FIR filter GP Processor loop: lw r5, 0(r1) lw r6, 0(r2) mul r7, r5,r6 add r8,r8,r7 addi r1,r1,4 addi r2,r2,4 addi r4,r4,-1 bnez r4,loop sw r8,0(r3) DSP: Bus / memory bandwidth bottleneck, control code repeat r4 mac acc,r5,r6 X:r5, (r1++) Y:r6, (r2++) sw acc,0(r3) overhead 26
14 Summary: How are DSPs different? Single cycle multiply accumulate (multiple busses and array multipliers) Complex instructions for standard DSP functions (IIR and FIR filters, convolvers) Specialized memory addressing» Modular arithmetic for circular buffers (delay lines)» Bit reversal (FFT) Zero overhead loops and repeat instructions I/ O support Serial and parallel ports General-purpose processors will become viable for many DSP applications Recent DSPs have adopted VLIW architectures 27
General Purpose Signal Processors
General Purpose Signal Processors First announced in 1978 (AMD) for peripheral computation such as in printers, matured in early 80 s (TMS320 series). General purpose vs. dedicated architectures: Pros:
More informationIntel s MMX. Why MMX?
Intel s MMX Dr. Richard Enbody CSE 820 Why MMX? Make the Common Case Fast Multimedia and Communication consume significant computing resources. Providing specific hardware support makes sense. 1 Goals
More informationInstruction Set Principles and Examples. Appendix B
Instruction Set Principles and Examples Appendix B Outline What is Instruction Set Architecture? Classifying ISA Elements of ISA Programming Registers Type and Size of Operands Addressing Modes Types of
More informationMedia Instructions, Coprocessors, and Hardware Accelerators. Overview
Media Instructions, Coprocessors, and Hardware Accelerators Steven P. Smith SoC Design EE382V Fall 2009 EE382 System-on-Chip Design Coprocessors, etc. SPS-1 University of Texas at Austin Overview SoCs
More informationEvaluating MMX Technology Using DSP and Multimedia Applications
Evaluating MMX Technology Using DSP and Multimedia Applications Ravi Bhargava * Lizy K. John * Brian L. Evans Ramesh Radhakrishnan * November 22, 1999 The University of Texas at Austin Department of Electrical
More informationAdvance CPU Design. MMX technology. Computer Architectures. Tien-Fu Chen. National Chung Cheng Univ. ! Basic concepts
Computer Architectures Advance CPU Design Tien-Fu Chen National Chung Cheng Univ. Adv CPU-0 MMX technology! Basic concepts " small native data types " compute-intensive operations " a lot of inherent parallelism
More informationEmbedded Computation
Embedded Computation What is an Embedded Processor? Any device that includes a programmable computer, but is not itself a general-purpose computer [W. Wolf, 2000]. Commonly found in cell phones, automobiles,
More informationMicroprocessor Extensions for Wireless Communications
Microprocessor Extensions for Wireless Communications Sridhar Rajagopal and Joseph R. Cavallaro DRAFT REPORT Rice University Center for Multimedia Communication Department of Electrical and Computer Engineering
More informationCS146 Computer Architecture. Fall Midterm Exam
CS146 Computer Architecture Fall 2002 Midterm Exam This exam is worth a total of 100 points. Note the point breakdown below and budget your time wisely. To maximize partial credit, show your work and state
More informationAn introduction to DSP s. Examples of DSP applications Why a DSP? Characteristics of a DSP Architectures
An introduction to DSP s Examples of DSP applications Why a DSP? Characteristics of a DSP Architectures DSP example: mobile phone DSP example: mobile phone with video camera DSP: applications Why a DSP?
More informationCOS 140: Foundations of Computer Science
COS 140: Foundations of Computer Science CPU Organization and Assembly Language Fall 2018 CPU 3 Components of the CPU..................................................... 4 Registers................................................................
More informationFinal Lecture. A few minutes to wrap up and add some perspective
Final Lecture A few minutes to wrap up and add some perspective 1 2 Instant replay The quarter was split into roughly three parts and a coda. The 1st part covered instruction set architectures the connection
More informationLecture 4: Instruction Set Architecture
Lecture 4: Instruction Set Architecture ISA types, register usage, memory addressing, endian and alignment, quantitative evaluation Reading: Textbook (5 th edition) Appendix A Appendix B (4 th edition)
More informationVector IRAM: A Microprocessor Architecture for Media Processing
IRAM: A Microprocessor Architecture for Media Processing Christoforos E. Kozyrakis kozyraki@cs.berkeley.edu CS252 Graduate Computer Architecture February 10, 2000 Outline Motivation for IRAM technology
More informationEJEMPLOS DE ARQUITECTURAS
Maestría en Electrónica Arquitectura de Computadoras Unidad 4 EJEMPLOS DE ARQUITECTURAS M. C. Felipe Santiago Espinosa Marzo/2017 ARM & MIPS Similarities ARM: the most popular embedded core Similar basic
More informationASSEMBLY LANGUAGE MACHINE ORGANIZATION
ASSEMBLY LANGUAGE MACHINE ORGANIZATION CHAPTER 3 1 Sub-topics The topic will cover: Microprocessor architecture CPU processing methods Pipelining Superscalar RISC Multiprocessing Instruction Cycle Instruction
More informationComputer Systems Laboratory Sungkyunkwan University
ARM & IA-32 Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu ARM (1) ARM & MIPS similarities ARM: the most popular embedded core Similar basic set
More informationInstruction Set Architecture. "Speaking with the computer"
Instruction Set Architecture "Speaking with the computer" The Instruction Set Architecture Application Compiler Instr. Set Proc. Operating System I/O system Instruction Set Architecture Digital Design
More informationCharacterization of Native Signal Processing Extensions
Characterization of Native Signal Processing Extensions Jason Law Department of Electrical and Computer Engineering University of Texas at Austin Austin, TX 78712 jlaw@mail.utexas.edu Abstract Soon if
More informationCPE300: Digital System Architecture and Design
CPE300: Digital System Architecture and Design Fall 2011 MW 17:30-18:45 CBC C316 Arithmetic Unit 10032011 http://www.egr.unlv.edu/~b1morris/cpe300/ 2 Outline Recap Chapter 3 Number Systems Fixed Point
More informationCOE608: Computer Organization and Architecture
Add on Instruction Set Architecture COE608: Computer Organization and Architecture Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan Electrical and Computer Engineering Ryerson University Overview More
More informationInstruction Set Architecture (ISA)
Instruction Set Architecture (ISA)... the attributes of a [computing] system as seen by the programmer, i.e. the conceptual structure and functional behavior, as distinct from the organization of the data
More informationLECTURE 10. Pipelining: Advanced ILP
LECTURE 10 Pipelining: Advanced ILP EXCEPTIONS An exception, or interrupt, is an event other than regular transfers of control (branches, jumps, calls, returns) that changes the normal flow of instruction
More informationINSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing
UNIVERSIDADE TÉCNICA DE LISBOA INSTITUTO SUPERIOR TÉCNICO Departamento de Engenharia Informática for Embedded Computing MEIC-A, MEIC-T, MERC Lecture Slides Version 3.0 - English Lecture 22 Title: and Extended
More informationLecture 3 Machine Language. Instructions: Instruction Execution cycle. Speaking computer before voice recognition interfaces
Lecture 3 Machine Language Speaking computer before voice recognition interfaces 1 Instructions: Language of the Machine More primitive than higher level languages e.g., no sophisticated control flow Very
More informationUsing Intel Streaming SIMD Extensions for 3D Geometry Processing
Using Intel Streaming SIMD Extensions for 3D Geometry Processing Wan-Chun Ma, Chia-Lin Yang Dept. of Computer Science and Information Engineering National Taiwan University firebird@cmlab.csie.ntu.edu.tw,
More informationCOSC 6385 Computer Architecture. Instruction Set Architectures
COSC 6385 Computer Architecture Instruction Set Architectures Spring 2012 Instruction Set Architecture (ISA) Definition on Wikipedia: Part of the Computer Architecture related to programming Defines set
More informationVLSI Signal Processing
VLSI Signal Processing Programmable DSP Architectures Chih-Wei Liu VLSI Signal Processing Lab Department of Electronics Engineering National Chiao Tung University Outline DSP Arithmetic Stream Interface
More informationCOMPUTER ORGANIZATION & ARCHITECTURE
COMPUTER ORGANIZATION & ARCHITECTURE Instructions Sets Architecture Lesson 5a 1 What are Instruction Sets The complete collection of instructions that are understood by a CPU Can be considered as a functional
More information55:132/22C:160, HPCA Spring 2011
55:132/22C:160, HPCA Spring 2011 Second Lecture Slide Set Instruction Set Architecture Instruction Set Architecture ISA, the boundary between software and hardware Specifies the logical machine that is
More informationEITF20: Computer Architecture Part2.1.1: Instruction Set Architecture
EITF20: Computer Architecture Part2.1.1: Instruction Set Architecture Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Instruction Set Principles The Role of Compilers MIPS 2 Main Content Computer
More informationChapter 13 Reduced Instruction Set Computers
Chapter 13 Reduced Instruction Set Computers Contents Instruction execution characteristics Use of a large register file Compiler-based register optimization Reduced instruction set architecture RISC pipelining
More informationECE 486/586. Computer Architecture. Lecture # 7
ECE 486/586 Computer Architecture Lecture # 7 Spring 2015 Portland State University Lecture Topics Instruction Set Principles Instruction Encoding Role of Compilers The MIPS Architecture Reference: Appendix
More informationHardware/Compiler Codevelopment for an Embedded Media Processor
Hardware/Compiler Codevelopment for an Embedded Media Processor CHRISTOFOROS KOZYRAKIS, STUDENT MEMBER, IEEE, DAVID JUDD, JOSEPH GEBIS, STUDENT MEMBER, IEEE, SAMUEL WILLIAMS, DAVID PATTERSON, FELLOW, IEEE,
More informationLecture 4: RISC Computers
Lecture 4: RISC Computers Introduction Program execution features RISC characteristics RISC vs. CICS Zebo Peng, IDA, LiTH 1 Introduction Reduced Instruction Set Computer (RISC) represents an important
More informationInstruction Set Architecture ISA ISA
Instruction Set Architecture ISA Today s topics: Note: desperate attempt to get back on schedule we won t cover all of these slides use for reference Risk vs. CISC x86 does both ISA influence on performance
More informationEC 413 Computer Organization
EC 413 Computer Organization Review I Prof. Michel A. Kinsy Computing: The Art of Abstraction Application Algorithm Programming Language Operating System/Virtual Machine Instruction Set Architecture (ISA)
More informationOriginal PlayStation: no vector processing or floating point support. Photorealism at the core of design strategy
Competitors using generic parts Performance benefits to be had for custom design Original PlayStation: no vector processing or floating point support Geometry issues Photorealism at the core of design
More informationMicroprocessors vs. DSPs (ESC-223)
Insight, Analysis, and Advice on Signal Processing Technology Microprocessors vs. DSPs (ESC-223) Kenton Williston Berkeley Design Technology, Inc. Berkeley, California USA +1 (510) 665-1600 info@bdti.com
More informationPage 1. Structure of von Nuemann machine. Instruction Set - the type of Instructions
Structure of von Nuemann machine Arithmetic and Logic Unit Input Output Equipment Main Memory Program Control Unit 1 1 Instruction Set - the type of Instructions Arithmetic + Logical (ADD, SUB, MULT, DIV,
More informationCS 352H Computer Systems Architecture Exam #1 - Prof. Keckler October 11, 2007
CS 352H Computer Systems Architecture Exam #1 - Prof. Keckler October 11, 2007 Name: Solutions (please print) 1-3. 11 points 4. 7 points 5. 7 points 6. 20 points 7. 30 points 8. 25 points Total (105 pts):
More informationImplementation of DSP Algorithms
Implementation of DSP Algorithms Main frame computers Dedicated (application specific) architectures Programmable digital signal processors voice band data modem speech codec 1 PDSP and General-Purpose
More informationLecture 4: RISC Computers
Lecture 4: RISC Computers Introduction Program execution features RISC characteristics RISC vs. CICS Zebo Peng, IDA, LiTH 1 Introduction Reduced Instruction Set Computer (RISC) is an important innovation
More informationCS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS
CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS UNIT-I OVERVIEW & INSTRUCTIONS 1. What are the eight great ideas in computer architecture? The eight
More informationUniversität Dortmund. ARM Architecture
ARM Architecture The RISC Philosophy Original RISC design (e.g. MIPS) aims for high performance through o reduced number of instruction classes o large general-purpose register set o load-store architecture
More informationECE 154A Introduction to. Fall 2012
ECE 154A Introduction to Computer Architecture Fall 2012 Dmitri Strukov Lecture 10 Floating point review Pipelined design IEEE Floating Point Format single: 8 bits double: 11 bits single: 23 bits double:
More informationCOPROCESSOR APPROACH TO ACCELERATING MULTIMEDIA APPLICATION [CLAUDIO BRUNELLI, JARI NURMI ] Processor Design
COPROCESSOR APPROACH TO ACCELERATING MULTIMEDIA APPLICATION [CLAUDIO BRUNELLI, JARI NURMI ] Processor Design Lecture Objectives Background Need for Accelerator Accelerators and different type of parallelizm
More informationREAL TIME DIGITAL SIGNAL PROCESSING
REAL TIME DIGITAL SIGNAL PROCESSING UTN - FRBA 2011 www.electron.frba.utn.edu.ar/dplab Introduction Why Digital? A brief comparison with analog. Advantages Flexibility. Easily modifiable and upgradeable.
More informationThe Nios II Family of Configurable Soft-core Processors
The Nios II Family of Configurable Soft-core Processors James Ball August 16, 2005 2005 Altera Corporation Agenda Nios II Introduction Configuring your CPU FPGA vs. ASIC CPU Design Instruction Set Architecture
More informationInstruction Set Architecture
Instruction Set Architecture Instructor: Preetam Ghosh Preetam.ghosh@usm.edu CSC 626/726 Preetam Ghosh Language HLL : High Level Language Program written by Programming language like C, C++, Java. Sentence
More informationVertex Shader Design I
The following content is extracted from the paper shown in next page. If any wrong citation or reference missing, please contact ldvan@cs.nctu.edu.tw. I will correct the error asap. This course used only
More informationECE232: Hardware Organization and Design
ECE232: Hardware Organization and Design Lecture 2: Hardware/Software Interface Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Overview Basic computer components How does a microprocessor
More informationTypical Processor Execution Cycle
Typical Processor Execution Cycle Instruction Fetch Obtain instruction from program storage Instruction Decode Determine required actions and instruction size Operand Fetch Locate and obtain operand data
More informationAdvanced d Instruction Level Parallelism. Computer Systems Laboratory Sungkyunkwan University
Advanced d Instruction ti Level Parallelism Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu ILP Instruction-Level Parallelism (ILP) Pipelining:
More informationInstruction Sets Ch 9-10
Instruction Sets Ch 9-10 Characteristics Operands Operations Addressing Instruction Formats 1 Instruction Set (käskykanta) Collection of instructions that CPU understands Only interface to CPU from outside
More informationCMSC 411 Computer Systems Architecture Lecture 13 Instruction Level Parallelism 6 (Limits to ILP & Threading)
CMSC 411 Computer Systems Architecture Lecture 13 Instruction Level Parallelism 6 (Limits to ILP & Threading) Limits to ILP Conflicting studies of amount of ILP Benchmarks» vectorized Fortran FP vs. integer
More informationInstruction Sets Ch 9-10
Instruction Sets Ch 9-10 Characteristics Operands Operations Addressing Instruction Formats 1 Instruction Set (käskykanta) Collection of instructions that CPU understands Only interface to CPU from outside
More informationVector Architectures Vs. Superscalar and VLIW for Embedded Media Benchmarks
Vector Architectures Vs. Superscalar and VLIW for Embedded Media Benchmarks Christos Kozyrakis Stanford University David Patterson U.C. Berkeley http://csl.stanford.edu/~christos Motivation Ideal processor
More informationAdvanced Computer Architecture
Advanced Computer Architecture Chapter 1 Introduction into the Sequential and Pipeline Instruction Execution Martin Milata What is a Processors Architecture Instruction Set Architecture (ISA) Describes
More informationUniprocessors. HPC Fall 2012 Prof. Robert van Engelen
Uniprocessors HPC Fall 2012 Prof. Robert van Engelen Overview PART I: Uniprocessors and Compiler Optimizations PART II: Multiprocessors and Parallel Programming Models Uniprocessors Processor architectures
More informationComputer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: MIPS Instruction Set Architecture
Computer Science 324 Computer Architecture Mount Holyoke College Fall 2007 Topic Notes: MIPS Instruction Set Architecture vonneumann Architecture Modern computers use the vonneumann architecture. Idea:
More informationInstruction Set Design
Instruction Set Design software instruction set hardware CPE442 Lec 3 ISA.1 Instruction Set Architecture Programmer's View ADD SUBTRACT AND OR COMPARE... 01010 01110 10011 10001 11010... CPU Memory I/O
More informationThe Processor: Instruction-Level Parallelism
The Processor: Instruction-Level Parallelism Computer Organization Architectures for Embedded Computing Tuesday 21 October 14 Many slides adapted from: Computer Organization and Design, Patterson & Hennessy
More informationEN2910A: Advanced Computer Architecture Topic 02: Review of classical concepts
EN2910A: Advanced Computer Architecture Topic 02: Review of classical concepts Prof. Sherief Reda School of Engineering Brown University S. Reda EN2910A FALL'15 1 Classical concepts (prerequisite) 1. Instruction
More informationFAST FIR FILTERS FOR SIMD PROCESSORS WITH LIMITED MEMORY BANDWIDTH
Key words: Digital Signal Processing, FIR filters, SIMD processors, AltiVec. Grzegorz KRASZEWSKI Białystok Technical University Department of Electrical Engineering Wiejska
More informationDSP Platforms Lab (AD-SHARC) Session 05
University of Miami - Frost School of Music DSP Platforms Lab (AD-SHARC) Session 05 Description This session will be dedicated to give an introduction to the hardware architecture and assembly programming
More informationInstruction Set (ISA) Design and Addressing Modes
3-1 3-1 Instruction Set (ISA) Design and Addressing Modes Material from sections 2.1, 2.2, and 2.3. Outline ISA Design Choices It s more than just picking instructions. ISA Design Choice Details Screw
More informationProcessor Architecture. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University
Processor Architecture Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Moore s Law Gordon Moore @ Intel (1965) 2 Computer Architecture Trends (1)
More informationBuses. Disks PCI RDRAM RDRAM LAN. Some slides adapted from lecture by David Culler. Pentium 4 Processor. Memory Controller Hub.
es > 100 MB/sec Pentium 4 Processor L1 and L2 caches Some slides adapted from lecture by David Culler 3.2 GB/sec Display Memory Controller Hub RDRAM RDRAM Dual Ultra ATA/100 24 Mbit/sec Disks LAN I/O Controller
More informationComputer Architecture. Chapter 2-2. Instructions: Language of the Computer
Computer Architecture Chapter 2-2 Instructions: Language of the Computer 1 Procedures A major program structuring mechanism Calling & returning from a procedure requires a protocol. The protocol is a sequence
More informationTypical DSP application
DSP markets DSP markets Typical DSP application TI DSP History: Modem applications 1982 TMS32010, TI introduces its first programmable general-purpose DSP to market Operating at 5 MIPS. It was ideal for
More informationCMSC Computer Architecture Lecture 2: ISA. Prof. Yanjing Li Department of Computer Science University of Chicago
CMSC 22200 Computer Architecture Lecture 2: ISA Prof. Yanjing Li Department of Computer Science University of Chicago Administrative Stuff! Lab1 is out! " Due next Thursday (10/6)! Lab2 " Out next Thursday
More informationFundamentals of Computer Design
CS359: Computer Architecture Fundamentals of Computer Design Yanyan Shen Department of Computer Science and Engineering 1 Defining Computer Architecture Agenda Introduction Classes of Computers 1.3 Defining
More informationMulti-Level Cache Hierarchy Evaluation for Programmable Media Processors. Overview
Multi-Level Cache Hierarchy Evaluation for Programmable Media Processors Jason Fritts Assistant Professor Department of Computer Science Co-Author: Prof. Wayne Wolf Overview Why Programmable Media Processors?
More informationEE382 Processor Design. Concurrent Processors
EE382 Processor Design Winter 1998-99 Chapter 7 and Green Book Lectures Concurrent Processors, including SIMD and Vector Processors Slide 1 Concurrent Processors Vector processors SIMD and small clustered
More information02 - Numerical Representation and Introduction to Junior
02 - Numerical Representation and Introduction to Junior September 10, 2013 Todays lecture Finite length effects, continued from Lecture 1 How to handle overflow Introduction to the Junior processor Demonstration
More information1. Microprocessor Architectures. 1.1 Intel 1.2 Motorola
1. Microprocessor Architectures 1.1 Intel 1.2 Motorola 1.1 Intel The Early Intel Microprocessors The first microprocessor to appear in the market was the Intel 4004, a 4-bit data bus device. This device
More informationEmbedded Systems. 7. System Components
Embedded Systems 7. System Components Lothar Thiele 7-1 Contents of Course 1. Embedded Systems Introduction 2. Software Introduction 7. System Components 10. Models 3. Real-Time Models 4. Periodic/Aperiodic
More informationCSE 141 Computer Architecture Spring Lecture 3 Instruction Set Architecute. Course Schedule. Announcements
CSE141: Introduction to Computer Architecture CSE 141 Computer Architecture Spring 2005 Lecture 3 Instruction Set Architecute Pramod V. Argade April 4, 2005 Instructor: TAs: Pramod V. Argade (p2argade@cs.ucsd.edu)
More informationChapter 2A Instructions: Language of the Computer
Chapter 2A Instructions: Language of the Computer Copyright 2009 Elsevier, Inc. All rights reserved. Instruction Set The repertoire of instructions of a computer Different computers have different instruction
More informationLatches. IT 3123 Hardware and Software Concepts. Registers. The Little Man has Registers. Data Registers. Program Counter
IT 3123 Hardware and Software Concepts Notice: This session is being recorded. CPU and Memory June 11 Copyright 2005 by Bob Brown Latches Can store one bit of data Can be ganged together to store more
More informationSWAR: MMX, SSE, SSE 2 Multiplatform Programming
SWAR: MMX, SSE, SSE 2 Multiplatform Programming Relatore: dott. Matteo Roffilli roffilli@csr.unibo.it 1 What s SWAR? SWAR = SIMD Within A Register SIMD = Single Instruction Multiple Data MMX,SSE,SSE2,Power3DNow
More informationInstruction Set. Instruction Sets Ch Instruction Representation. Machine Instruction. Instruction Set Design (5) Operation types
Instruction Sets Ch 10-11 Characteristics Operands Operations Addressing Instruction Formats Instruction Set Collection of instructions that CPU understands Only interface to CPU from outside CPU executes
More informationComputer Architecture
Computer Architecture Lecture 3: ISA Tradeoffs Dr. Ahmed Sallam Suez Canal University Based on original slides by Prof. Onur Mutlu Application Space Dream, and they will appear 2 Design Point A set of
More informationComputer System Architecture
CSC 203 1.5 Computer System Architecture Budditha Hettige Department of Statistics and Computer Science University of Sri Jayewardenepura Microprocessors 2011 Budditha Hettige 2 Processor Instructions
More informationEITF20: Computer Architecture Part2.1.1: Instruction Set Architecture
EITF20: Computer Architecture Part2.1.1: Instruction Set Architecture Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Instruction Set Principles The Role of Compilers MIPS 2 Main Content Computer
More informationProcessor Architecture
Processor Architecture Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu SSE2030: Introduction to Computer Systems, Spring 2018, Jinkyu Jeong (jinkyu@skku.edu)
More informationMemory Systems IRAM. Principle of IRAM
Memory Systems 165 other devices of the module will be in the Standby state (which is the primary state of all RDRAM devices) or another state with low-power consumption. The RDRAM devices provide several
More informationCISC 360. Computer Architecture. Seth Morecraft Course Web Site:
CISC 360 Computer Architecture Seth Morecraft (morecraf@udel.edu) Course Web Site: http://www.eecis.udel.edu/~morecraf/cisc360 Overview Intro to Computer Architecture About the Course Organization
More informationENGN1640: Design of Computing Systems Topic 06: Advanced Processor Design
ENGN1640: Design of Computing Systems Topic 06: Advanced Processor Design Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown University
More informationDHANALAKSHMI SRINIVASAN INSTITUTE OF RESEARCH AND TECHNOLOGY. Department of Computer science and engineering
DHANALAKSHMI SRINIVASAN INSTITUTE OF RESEARCH AND TECHNOLOGY Department of Computer science and engineering Year :II year CS6303 COMPUTER ARCHITECTURE Question Bank UNIT-1OVERVIEW AND INSTRUCTIONS PART-B
More informationComputer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: MIPS Instruction Set Architecture
Computer Science 324 Computer Architecture Mount Holyoke College Fall 2009 Topic Notes: MIPS Instruction Set Architecture vonneumann Architecture Modern computers use the vonneumann architecture. Idea:
More informationLecture 4: Instruction Set Design/Pipelining
Lecture 4: Instruction Set Design/Pipelining Instruction set design (Sections 2.9-2.12) control instructions instruction encoding Basic pipelining implementation (Section A.1) 1 Control Transfer Instructions
More informationChapter 1. Instructor: Josep Torrellas CS433. Copyright Josep Torrellas 1999, 2001, 2002,
Chapter 1 Instructor: Josep Torrellas CS433 Copyright Josep Torrellas 1999, 2001, 2002, 2013 1 Course Goals Introduce you to design principles, analysis techniques and design options in computer architecture
More informationProcessing Unit CS206T
Processing Unit CS206T Microprocessors The density of elements on processor chips continued to rise More and more elements were placed on each chip so that fewer and fewer chips were needed to construct
More informationSA-1500: A 300 MHz RISC CPU with Attached Media Processor*
and Bridges Division SA-1500: A 300 MHz RISC CPU with Attached Media Processor* Prashant P. Gandhi, Ph.D. and Bridges Division Computing Enhancement Group Intel Corporation Santa Clara, CA 95052 Prashant.Gandhi@intel.com
More informationCSIS1120A. 10. Instruction Set & Addressing Mode. CSIS1120A 10. Instruction Set & Addressing Mode 1
CSIS1120A 10. Instruction Set & Addressing Mode CSIS1120A 10. Instruction Set & Addressing Mode 1 Elements of a Machine Instruction Operation Code specifies the operation to be performed, e.g. ADD, SUB
More informationInstruction Set extensions to X86. Floating Point SIMD instructions
Instruction Set extensions to X86 Some extensions to x86 instruction set intended to accelerate 3D graphics AMD 3D-Now! Instructions simply accelerate floating point arithmetic. Accelerate object transformations
More informationAn Ultra High Performance Scalable DSP Family for Multimedia. Hot Chips 17 August 2005 Stanford, CA Erik Machnicki
An Ultra High Performance Scalable DSP Family for Multimedia Hot Chips 17 August 2005 Stanford, CA Erik Machnicki Media Processing Challenges Increasing performance requirements Need for flexibility &
More informationInstruction Set Architecture
C Fortran Ada etc. Basic Java Instruction Set Architecture Compiler Assembly Language Compiler Byte Code Nizamettin AYDIN naydin@yildiz.edu.tr http://www.yildiz.edu.tr/~naydin http://akademik.bahcesehir.edu.tr/~naydin
More information