Media Instructions, Coprocessors, and Hardware Accelerators
Steven P. Smith
SoC Design, EE382V, Fall 2009
EE382 System-on-Chip Design: Coprocessors, etc., University of Texas at Austin

Overview
- SoCs offer tremendous potential for targeting applications to particular demands, such as performance, power, and cost.
- How do you take advantage of all those available transistors?
  - Multiple general-purpose processor cores: most flexible, but typically sub-optimal for specific applications
  - Application-specific instruction set processors (ASIPs): MMX media instruction extensions to PCs extend this concept
  - Coprocessors: use a well-defined control interface to the processor
  - Hardware accelerators: typically a custom, memory-mapped interface

Media Instructions: MMX
- Multimedia applications tend to perform repetitive operations on large quantities of 8- and 16-bit data:
  - Filtering
  - Compression
  - Rendering
- Intel's MMX™ technology is designed to speed up multimedia and communications applications. The technology includes special instructions and data types that allow such applications to achieve a new level of performance.

MMX Introduction
- Processors enabled with MMX technology deliver enough performance to execute compute-intensive communications and multimedia tasks on the standard PC platform.
- Commonly accelerated applications include graphics, image processing, MPEG video, music synthesis, speech compression, speech recognition, games, video conferencing, and more.

Key Attributes of MMX Target Applications
- Small integer data types
- Small, highly repetitive loops
- Frequent multiplies and accumulates
- Compute-intensive algorithms
- Highly parallel operations

MMX Highlights
- Single Instruction, Multiple Data (SIMD) technique
- 57 instructions beyond the base x86 instruction set
- Eight 64-bit-wide MMX registers
- Four new data types

MMX SIMD
- Single Instruction, Multiple Data (SIMD) allows many pieces of information to be processed with a single instruction, providing parallelism that greatly increases performance.
- Up to 8-way parallelism

MMX Data Types
- Packed Byte: eight bytes packed into one 64-bit quantity
- Packed Word: four 16-bit words packed into one 64-bit quantity
- Packed Doubleword: two 32-bit doublewords packed into one 64-bit quantity
- Quadword: one 64-bit quantity

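The four data types are different views of the same 64 bits. The sketch below is a portable C model, not MMX code: the helper names `lane_byte`, `lane_word`, and `lane_dword` are hypothetical, and lane 0 is assumed to be the least significant part of the value, as on x86.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: view one 64-bit quantity as packed lanes.
   Lane 0 is assumed to be the least significant part of the value. */

/* Packed byte view: eight 8-bit lanes. */
uint8_t lane_byte(uint64_t q, int i) {
    assert(i >= 0 && i < 8);
    return (uint8_t)(q >> (8 * i));
}

/* Packed word view: four 16-bit lanes. */
uint16_t lane_word(uint64_t q, int i) {
    assert(i >= 0 && i < 4);
    return (uint16_t)(q >> (16 * i));
}

/* Packed doubleword view: two 32-bit lanes. */
uint32_t lane_dword(uint64_t q, int i) {
    assert(i >= 0 && i < 2);
    return (uint32_t)(q >> (32 * i));
}
```

The 8-way parallelism mentioned above corresponds to the packed byte view, where one instruction operates on all eight lanes at once.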
MMX Data Types in 64-bit Registers
[Figure: MMX data types in 64-bit registers]

MMX Instructions
The MMX instructions cover several functional areas:
- Basic arithmetic operations such as add, subtract, multiply, arithmetic shift, and multiply-add
- Comparison operations
- Conversion instructions to convert between the new data types: pack data together, and unpack from smaller to larger data types
- Logical operations such as AND, AND NOT, OR, and XOR
- Shift operations
- Data transfer (MOV) instructions for MMX register-to-register transfers, or 64-bit and 32-bit load/store operations to memory

MMX Instruction Set Summary
[Figure: MMX instruction set summary]

MMX Instruction Set Summary (2)
[Figure: MMX instruction set summary, continued]

PADDW Instruction
[Figure: PADDW instruction]

PADDUSW Instruction
[Figure: PADDUSW instruction]

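The difference between a wraparound packed add (PADDW) and an unsigned saturating packed add (PADDUSW) can be sketched in portable C. This is a model of the lane-wise behavior under the assumption that lane 0 is the least significant word. It is not MMX code, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Extract 16-bit lane i (0 = least significant) from a 64-bit value. */
static uint16_t get_word(uint64_t q, int i) {
    assert(i >= 0 && i < 4);
    return (uint16_t)(q >> (16 * i));
}

/* Place a 16-bit value into lane i of an otherwise zero 64-bit value. */
static uint64_t put_word(uint16_t v, int i) {
    assert(i >= 0 && i < 4);
    return (uint64_t)v << (16 * i);
}

/* PADDW-style add: each lane wraps around modulo 2^16. */
uint64_t paddw_model(uint64_t a, uint64_t b) {
    uint64_t r = 0;
    for (int i = 0; i < 4; i++) {
        uint16_t sum = (uint16_t)(get_word(a, i) + get_word(b, i));
        r |= put_word(sum, i);
    }
    return r;
}

/* PADDUSW-style add: each lane clamps at 0xFFFF instead of wrapping. */
uint64_t paddusw_model(uint64_t a, uint64_t b) {
    uint64_t r = 0;
    for (int i = 0; i < 4; i++) {
        uint32_t sum = (uint32_t)get_word(a, i) + get_word(b, i);
        r |= put_word(sum > 0xFFFFu ? 0xFFFFu : (uint16_t)sum, i);
    }
    return r;
}
```

Saturation matters for media data: a pixel or sample that overflows should stay at full scale rather than wrap to a small value.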
PMADDWD Instruction
[Figure: PMADDWD instruction]

PCMPGTW Instruction
[Figure: PCMPGTW instruction]

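As a rough illustration of these two instructions, the following portable C model multiplies signed 16-bit lanes and adds adjacent products into 32-bit lanes (PMADDWD-style), and builds an all-ones or all-zeros mask per lane from a signed greater-than compare (PCMPGTW-style). The function names are hypothetical, and lane 0 is assumed to be the least significant word.

```c
#include <assert.h>
#include <stdint.h>

/* Read 16-bit lane i (0 = least significant) as a signed value. */
static int16_t get_sword(uint64_t q, int i) {
    assert(i >= 0 && i < 4);
    uint16_t u = (uint16_t)(q >> (16 * i));
    return (int16_t)(u >= 0x8000u ? (int)u - 0x10000 : (int)u);
}

/* PMADDWD-style: four signed 16x16 multiplies, adjacent products summed
   into two 32-bit lanes. The sum is formed in unsigned arithmetic so the
   one overflowing case (all inputs 0x8000) wraps instead of being undefined. */
uint64_t pmaddwd_model(uint64_t a, uint64_t b) {
    uint64_t r = 0;
    for (int i = 0; i < 2; i++) {
        int32_t lo = (int32_t)get_sword(a, 2 * i) * get_sword(b, 2 * i);
        int32_t hi = (int32_t)get_sword(a, 2 * i + 1) * get_sword(b, 2 * i + 1);
        uint32_t sum = (uint32_t)lo + (uint32_t)hi;
        r |= (uint64_t)sum << (32 * i);
    }
    return r;
}

/* PCMPGTW-style: each lane becomes 0xFFFF if a > b (signed), else 0x0000. */
uint64_t pcmpgtw_model(uint64_t a, uint64_t b) {
    uint64_t r = 0;
    for (int i = 0; i < 4; i++) {
        if (get_sword(a, i) > get_sword(b, i)) {
            r |= (uint64_t)0xFFFFu << (16 * i);
        }
    }
    return r;
}
```

The compare result is a mask rather than a flag, which lets later logical instructions select data without branching.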
PACKSS[DW] Instruction
[Figure: PACKSS[DW] instruction]

MMX Applications: Chroma Keying
[Figure: chroma keying]

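A pack-with-signed-saturation step narrows 32-bit lanes to 16-bit lanes, clamping values that do not fit. The sketch below models PACKSSDW-style behavior in portable C, assuming the two destination doublewords fill the low two result words and the two source doublewords fill the high two. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Read 32-bit lane i (0 = least significant) as a signed value. */
static int32_t get_sdword(uint64_t q, int i) {
    assert(i >= 0 && i < 2);
    int64_t v = (int64_t)(uint32_t)(q >> (32 * i));
    if (v >= 0x80000000LL) v -= 0x100000000LL;
    return (int32_t)v;
}

/* Clamp a signed 32-bit value to the signed 16-bit range. */
static uint16_t sat16(int32_t v) {
    if (v > 32767) return 0x7FFFu;
    if (v < -32768) return 0x8000u;
    return (uint16_t)v;
}

/* PACKSSDW-style: four 32-bit inputs become four saturated 16-bit lanes. */
uint64_t packssdw_model(uint64_t dest, uint64_t src) {
    return (uint64_t)sat16(get_sdword(dest, 0))
         | (uint64_t)sat16(get_sdword(dest, 1)) << 16
         | (uint64_t)sat16(get_sdword(src, 0)) << 32
         | (uint64_t)sat16(get_sdword(src, 1)) << 48;
}
```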
MMX Applications: Chroma Keying (pcmpeqw)
[Figure: chroma keying, pcmpeqw step]

MMX Applications: Chroma Keying (pandn)
[Figure: chroma keying, pandn step]

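The chroma-keying steps named on these slides can be sketched as a branch-free select: compare each foreground pixel with the key color to build a mask (pcmpeqw), then use the mask to keep the foreground (pandn) or take the background. This is a portable C model for four 16-bit pixels. The function names and the 16-bit pixel format are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* PCMPEQW-style: each 16-bit lane becomes 0xFFFF where a == b, else 0. */
uint64_t pcmpeqw_model(uint64_t a, uint64_t b) {
    uint64_t r = 0;
    for (int i = 0; i < 4; i++) {
        uint16_t x = (uint16_t)(a >> (16 * i));
        uint16_t y = (uint16_t)(b >> (16 * i));
        if (x == y) r |= (uint64_t)0xFFFFu << (16 * i);
    }
    return r;
}

/* Composite four pixels: where the foreground equals the key color,
   show the background; elsewhere keep the foreground. */
uint64_t chroma_key4(uint64_t fg, uint64_t bg, uint16_t key) {
    assert(sizeof(uint64_t) == 4 * sizeof(uint16_t));
    uint64_t key4 = (uint64_t)key * 0x0001000100010001ULL; /* replicate key */
    uint64_t mask = pcmpeqw_model(fg, key4);
    uint64_t keep_fg = ~mask & fg;  /* PANDN-style: (NOT mask) AND fg */
    uint64_t take_bg = mask & bg;   /* PAND-style */
    return keep_fg | take_bg;       /* POR-style merge */
}
```

Because the selection is done with masks, all four pixels follow the same instruction sequence regardless of which ones match the key.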
MMX Applications: Vector Dot Product
[Figure: vector dot product]

MMX Applications: Vector Dot Product (continued)
[Figure: vector dot product, continued]

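A dot product maps naturally onto a multiply-add instruction: each step multiplies four 16-bit pairs and accumulates two 32-bit partial sums, which are combined at the end. The following portable C sketch mirrors that structure without using MMX itself. The name `dot16` is hypothetical, the length is assumed to be a multiple of four, and the caller is assumed to keep the sums within 32 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Dot product of two int16 vectors, four elements per iteration.
   acc_lo and acc_hi play the role of the two 32-bit lanes that a
   PMADDWD-style multiply-add followed by a packed 32-bit add would update. */
int32_t dot16(const int16_t *a, const int16_t *b, size_t n) {
    assert(n % 4 == 0);
    int32_t acc_lo = 0;
    int32_t acc_hi = 0;
    for (size_t i = 0; i < n; i += 4) {
        acc_lo += (int32_t)a[i] * b[i] + (int32_t)a[i + 1] * b[i + 1];
        acc_hi += (int32_t)a[i + 2] * b[i + 2] + (int32_t)a[i + 3] * b[i + 3];
    }
    return acc_lo + acc_hi; /* final horizontal add of the two lanes */
}
```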
MMX Applications: Matrix Multiplication
[Figure: matrix multiplication]

MMX Applications: Matrix Multiplication, Instruction Counts
[Figure: matrix multiplication instruction counts]

Coprocessors: Tightly-Coupled Coprocessors
- Integrated with processor control logic
- Task typically completes in a few cycles
- Small amounts of data
- Processor stalls waiting for the coprocessor
- Communication with the coprocessor is typically via registers and dedicated control signals
  - Coprocessor ports
- Examples: ARM (ARM7TDMI); Texas Instruments TMS320C55x processors

Tightly-Coupled Coprocessors
[Figure: tightly-coupled coprocessor (TCC) attached to a TMS320C55x, showing the memory system, instruction decode, register file, TCC interface, and TCC instructions]

Coprocessors: Loosely-Coupled Coprocessors
- Used for larger tasks than is the case for tightly-coupled coprocessors
- Task runs in parallel with the main processor
- May take many cycles per task
- Large amounts of data that the coprocessor may access independently of the main processor
- Still uses a standard coprocessor interface

Loosely-Coupled Coprocessors
[Figure: correlator coprocessor attached to a DSP, showing chips arriving from the analog front end (AFE), input buffer, datapath, address generation, PN generation, controller and counters, instruction buffer, DSP/coprocessor interface, output buffer, and SRAM]

Hardware Accelerators
- Similar to loosely-coupled coprocessors, but with an ad hoc interface to the controlling processor
  - Usually memory-mapped
  - Bus-based, FIFO, or register data interfaces
- Typically, the processor transfers data to the accelerator, issues a "go" command, and then collects result data later.
  - Polled or interrupt-based interface
- Accelerator may have its own path to/from memory
- Often fixed function

Common Hardware Accelerator Applications
- Graphics
- Compression
- Audio/video decoding
- Encryption: RSA, DES, AES
- Router frame queueing, port selection

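The transfer, go, collect sequence described for hardware accelerators can be sketched against a hypothetical memory-mapped register block. Everything here is an assumption for illustration: the register layout, the bit assignments, and the `accel_sim_step` function, which stands in for the hardware (it squares its input) so that the example runs on an ordinary host.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register map; a real accelerator defines its own layout. */
typedef struct {
    volatile uint32_t data_in;   /* input operand */
    volatile uint32_t control;   /* bit 0 = GO */
    volatile uint32_t status;    /* bit 0 = DONE */
    volatile uint32_t data_out;  /* result */
} accel_regs;

#define ACCEL_GO   0x1u
#define ACCEL_DONE 0x1u

/* Software stand-in for the device: when GO is set, square the input,
   clear GO, and raise DONE. */
void accel_sim_step(accel_regs *r) {
    assert(r);
    if (r->control & ACCEL_GO) {
        r->data_out = r->data_in * r->data_in;
        r->control &= ~ACCEL_GO;
        r->status |= ACCEL_DONE;
    }
}

/* Driver side: transfer data, issue "go", poll for completion, collect. */
uint32_t accel_run_polled(accel_regs *r, uint32_t input) {
    assert(r);
    r->status = 0;
    r->data_in = input;
    r->control = ACCEL_GO;
    while (!(r->status & ACCEL_DONE)) {
        accel_sim_step(r); /* on real hardware the device advances by itself */
    }
    return r->data_out;
}
```

On real hardware the struct would be a pointer to a fixed physical or mapped address rather than an ordinary variable, and the simulation call inside the loop would disappear.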
Hardware Accelerator Interface: Interrupts or Polling?
- Polling interfaces usually require the processor to read a memory-mapped register to determine the state of the accelerator.
  - Can the accelerator accept new input data?
  - Is the accelerator done with its current task?
  - Has the accelerator generated an error condition?
- Polling interfaces offer minimal latency between the setting of a condition on the accelerator and its discovery by the controlling processor.
  - But the processor isn't doing other work while it polls.

Hardware Accelerator Interface: Interrupts or Polling? (continued)
- Interrupt-based interfaces allow the accelerator to signal conditions to the controlling processor.
  - Interrupt latency is longer than is achievable via the polling method.
  - But the processor can more easily proceed with other work while the accelerator is busy with a task.
- Interrupts are more efficient for coarse-grained parallelism (i.e., larger tasks with looser and less frequent synchronization requirements).
- Interrupts may not work for real-time control tasks with tight schedules.

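The trade-off between polling and interrupts can be illustrated with a small host-side simulation. This is only a sketch under stated assumptions: `sim_accel`, `sim_tick`, and the callback standing in for an interrupt handler are all hypothetical, and one loop iteration stands in for one unit of processor time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical simulated accelerator. */
typedef struct {
    int cycles_left;             /* remaining busy time */
    void (*on_done)(void *ctx);  /* stand-in for an interrupt handler */
    void *ctx;
} sim_accel;

/* Advance the device one cycle; invoke the handler when it finishes. */
void sim_tick(sim_accel *a) {
    assert(a);
    if (a->cycles_left > 0 && --a->cycles_left == 0 && a->on_done) {
        a->on_done(a->ctx);
    }
}

/* Polling: the processor spins on the device state and does nothing else.
   Returns the number of polls spent waiting. */
int wait_polled(sim_accel *a) {
    assert(a);
    int polls = 0;
    while (a->cycles_left > 0) {
        sim_tick(a);
        polls++;
    }
    return polls;
}

/* Handler: record completion in a flag owned by the waiting code. */
static void set_flag(void *ctx) {
    *(volatile int *)ctx = 1;
}

/* Interrupt style: the processor performs other work until the handler
   sets the flag. Returns the units of other work completed meanwhile. */
int wait_interrupt(sim_accel *a) {
    assert(a);
    if (a->cycles_left <= 0) return 0;
    volatile int done = 0;
    int work = 0;
    a->on_done = set_flag;
    a->ctx = (void *)&done;
    while (!done) {
        work++;      /* stand-in for useful work */
        sim_tick(a); /* the device runs concurrently in real hardware */
    }
    return work;
}
```

In both cases the wait lasts the same number of cycles in this model. The difference is what the processor does with them: discarded polls in one case, other work in the other. The model does not capture interrupt latency, which the slides note is longer than polling latency.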
So, When to Use Media Instructions, Coprocessors, and Accelerators?
- Media instructions are ideal so long as they meet performance goals.
  - Highly flexible
  - Parallelism from SIMD structures only
- Coprocessors are preferred when well-defined interfaces are available.
  - Relatively easy to program
  - May or may not stall the processor
- Accelerators use ad hoc, application-specific interfaces to achieve high levels of performance and parallelism.
  - Least flexible

Conclusions
- Media instructions, coprocessors, and hardware accelerators each provide a means of increasing system performance.
- Decision factors for which to use include:
  - Performance requirements
  - Ease of programming
  - Hardware design effort required
  - Flexibility required
  - Nature of parallelism achievable in the target application
    - Coarse or fine grained
    - Small scale, localized or broad
The S6000 Family of Processors Today s Design Challenges The advent of software configurable processors In recent years, the widespread adoption of digital technologies has revolutionized the way in which
More informationMemory Systems IRAM. Principle of IRAM
Memory Systems 165 other devices of the module will be in the Standby state (which is the primary state of all RDRAM devices) or another state with low-power consumption. The RDRAM devices provide several
More informationChapter 06: Instruction Pipelining and Parallel Processing. Lesson 14: Example of the Pipelined CISC and RISC Processors
Chapter 06: Instruction Pipelining and Parallel Processing Lesson 14: Example of the Pipelined CISC and RISC Processors 1 Objective To understand pipelines and parallel pipelines in CISC and RISC Processors
More informationLOW-COST SIMD. Considerations For Selecting a DSP Processor Why Buy The ADSP-21161?
LOW-COST SIMD Considerations For Selecting a DSP Processor Why Buy The ADSP-21161? The Analog Devices ADSP-21161 SIMD SHARC vs. Texas Instruments TMS320C6711 and TMS320C6712 Author : K. Srinivas Introduction
More informationCache Justification for Digital Signal Processors
Cache Justification for Digital Signal Processors by Michael J. Lee December 3, 1999 Cache Justification for Digital Signal Processors By Michael J. Lee Abstract Caches are commonly used on general-purpose
More informationThe MorphoSys Parallel Reconfigurable System
The MorphoSys Parallel Reconfigurable System Guangming Lu 1, Hartej Singh 1,Ming-hauLee 1, Nader Bagherzadeh 1, Fadi Kurdahi 1, and Eliseu M.C. Filho 2 1 Department of Electrical and Computer Engineering
More informationsystems such as Linux (real time application interface Linux included). The unified 32-
1.0 INTRODUCTION The TC1130 is a highly integrated controller combining a Memory Management Unit (MMU) and a Floating Point Unit (FPU) on one chip. Thanks to the MMU, this member of the 32-bit TriCoreTM
More informationDeveloping Core Software Technologies for TI s OMAP Platform
SWPY006 - August 2002 White Paper By Justin Helmig Texas Instruments Senior Technical Staff, Wireless Software Applications Texas Instruments OMAP platform for wireless handsets offers a powerful hardware
More informationMicroprocessors and Microcontrollers (EE-231)
Microprocessors and Microcontrollers (EE-231) Main Objectives 8088 and 80188 8-bit Memory Interface 8086 t0 80386SX 16-bit Memory Interface I/O Interfacing I/O Address Decoding More on Address Decoding
More informationManaging Dynamic Reconfiguration Overhead in Systems-on-a-Chip Design Using Reconfigurable Datapaths and Optimized Interconnection Networks
Managing Dynamic Reconfiguration Overhead in Systems-on-a-Chip Design Using Reconfigurable Datapaths and Optimized Interconnection Networks Zhining Huang, Sharad Malik Electrical Engineering Department
More informationMIPS Technologies MIPS32 M4K Synthesizable Processor Core By the staff of
An Independent Analysis of the: MIPS Technologies MIPS32 M4K Synthesizable Processor Core By the staff of Berkeley Design Technology, Inc. OVERVIEW MIPS Technologies, Inc. is an Intellectual Property (IP)
More informationObjectives. Connecting with Computer Science 2
Objectives Learn why numbering systems are important to understand Refresh your knowledge of powers of numbers Learn how numbering systems are used to count Understand the significance of positional value
More informationHistory of the Intel 80x86
Intel s IA-32 Architecture Cptr280 Dr Curtis Nelson History of the Intel 80x86 1971 - Intel invents the microprocessor, the 4004 1975-8080 introduced 8-bit microprocessor 1978-8086 introduced 16 bit microprocessor
More informationAccelerating 3D Geometry Transformation with Intel MMX TM Technology
Accelerating 3D Geometry Transformation with Intel MMX TM Technology ECE 734 Project Report by Pei Qi Yang Wang - 1 - Content 1. Abstract 2. Introduction 2.1 3 -Dimensional Object Geometry Transformation
More informationCNAPS. +Adaptive Solutions. The CNAPS Architecture is. (Connected Network of Adaptive ProcessorS) Engine for. Dan Hammerstrom.
:--' 0/ CNAPS (Connected Network of Adaptive ProcessorS) Dan Hammerstrom Gary Tahara, Inc. + The CNAPS Architecture is highly parallel, highly integrated, relatively low cost, Engine for the emulation
More information