ISSCC 2003 / SESSION 2 / MULTIMEDIA SIGNAL PROCESSING / PAPER 2.6
2.6 A 51.2GOPS Scalable Video Recognition Processor for Intelligent Cruise Control Based on a Linear Array of 128 4-Way VLIW Processing Elements

Shorin Kyo, Takuya Koga, Shin'ichiro Okazaki, Ryoichi Uchida, Satoshi Yoshimoto, Ichiro Kuroda

NEC, Kawasaki, Japan

Video recognition for intelligent cruise control (ICC) and other intelligent transport system (ITS) applications requires not only high computational performance but also high programmability, because various recognition algorithms are needed to cope with the changing situations, sizes, and appearances of target objects such as vehicles, lane marks, and obstacles. Existing microprocessors and DSPs provide high programmability but fall short on performance because they lack sufficient architectural support for the required pixel parallelism and non-contiguous memory access. Dedicated ASICs [1] and configurable devices [2] have to devote die area to each recognition algorithm, even though the situations those algorithms serve are mutually exclusive, such as driving on highways versus local streets, or stop-and-go versus normal cruise.

An integrated memory array processor for car electronics (IMAP-CE), which provides both high performance and programmability for ICC applications, is characterized by the following:

1) Parallel execution by a multiple of 128 SIMD 4-way VLIW processing elements (PEs).
2) Parallel algorithm design is facilitated by a linear PE connection and by the ability of each PE to simultaneously access data located at a different memory address (indexed addressing).
3) Automatic background video data mapping to each PE is achieved by an asynchronous shift register (SR) structure.
4) An optimizing extended C compiler generates efficient code.

Figure 2.6.1 shows the IMAP-CE, which contains a 16b RISC control processor (CP), a linear processor array (LPA) of 128 RISC-type 8b PEs, and an interface unit.
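To make the linear PE connection in item 2 concrete, here is a minimal Python sketch, assuming one image column per PE and two hypothetical helpers, `from_left` and `from_right`, that stand in for the inter-PE data path. It illustrates the programming model only; it is not the IMAP-CE instruction set or 1DC syntax, and the border handling (replicating the edge pixel) is an assumption.

```python
NUM_PES = 128  # PEs per chip, as stated in the text


def from_left(values):
    """Each PE receives the value held by its left neighbour (border replicated)."""
    return [values[0]] + values[:-1]


def from_right(values):
    """Each PE receives the value held by its right neighbour (border replicated)."""
    return values[1:] + [values[-1]]


def smooth_row(row):
    """3-tap horizontal filter (1, 2, 1)/4; every PE performs the same steps in lockstep."""
    left = from_left(row)
    right = from_right(row)
    return [(l + 2 * c + r) // 4 for l, c, r in zip(left, row, right)]


if __name__ == "__main__":
    row = [0] * NUM_PES   # one pixel of the current image line per PE
    row[64] = 100         # a single bright pixel
    out = smooth_row(row)
    print(out[63], out[64], out[65])  # prints "25 50 25"
```

Because each PE only ever talks to its immediate neighbours, the same code shape would apply unchanged to a longer array, which is the property the cascade connection relies on.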
Each PE executes up to 4 instructions per clock cycle, which at 100MHz gives a peak performance of 51.2 GOPS in a single chip, or up to 819.2 GOPS in a cascade-connected 16-chip configuration. The CP issues up to 4 instructions per clock cycle, within which one is decoded for the CP, and as many as 4 are decoded and broadcast to the LPA. To facilitate instruction/data broadcast from the CP, the LPA is organized hierarchically into 16 PE groups (PEGs), each consisting of 8 linearly connected PEs. The 64b line buffers of the PEGs are connected to one another in an SR configuration for burst transfer of data between the off-chip SDRAM and the PE RAM blocks. Figure 2.6.2 shows the CP (6-stage) and PE (3-stage) pipelines, bridged by the instruction broadcast (BC1), CP data broadcast (BC2), and PE data reduction (RDU) pipes. The RDU pipe either extracts one 8b datum out of 8b×128 or produces a 1b status by wired-ORing 1b×128.

The cascade connection of multiple chips to form a larger LPA is supported for applications that require extra performance. In Fig. 2.6.1, the inter-PE data path is brought out to pins, and an inter-PE data selector chooses between the looped-back PE data of the chip and PE data from adjacent chips. In Fig. 2.6.2, additional selector logic in the RDU pipe selects PE status/data from other chips, supplied over a bus (CPDATA) that is configurable as 1b×16 or 16b×1. Figure 2.6.3 shows that, in the 1b×16 mode, each chip sends its own 1b status to all the other chips on a separate bit of CPDATA and uses the remaining 15b to receive data from the other chips. In the 16b×1 mode, the particular chip possessing the requested PE data drives all bits of CPDATA to broadcast the data to all other chips.

The combination of the LPA configuration with the indexed addressing capability, which is enabled by assigning each PE a separate RAM block, has been shown to provide a basis for parallel implementation of various image processing tasks [3].
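The difference between indexed addressing and a single shared address can be sketched as follows. This is a minimal Python model, assuming each PE's RAM block is a small list; the function names `load_uniform` and `load_indexed` and the RAM depth are hypothetical and chosen only for illustration.

```python
NUM_PES = 128    # PEs per chip, as stated in the text
RAM_WORDS = 16   # illustrative RAM depth per PE (an assumption, not from the paper)


def make_rams():
    """One private RAM block per PE; PE p stores the value p*100 + address."""
    return [[p * 100 + a for a in range(RAM_WORDS)] for p in range(NUM_PES)]


def load_uniform(rams, address):
    """Every PE reads the same address of its own RAM block."""
    return [ram[address] for ram in rams]


def load_indexed(rams, addresses):
    """Indexed addressing: PE p reads its own RAM at its own address addresses[p]."""
    return [ram[a] for ram, a in zip(rams, addresses)]


if __name__ == "__main__":
    rams = make_rams()
    per_pe_address = [p % RAM_WORDS for p in range(NUM_PES)]  # data-dependent in practice
    print(load_uniform(rams, 3)[:3])
    print(load_indexed(rams, per_pe_address)[:3])
```

In the indexed case the address list would typically be computed from pixel data, as in a table lookup, so neighbouring PEs end up reading unrelated locations within the same instruction.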
Such per-PE indexed access is unachievable with a conventional SIMD stream instruction design such as the Intel MMX/SSE instruction set, whose parallelism is restricted to contiguously located memory data.

Automatic background video data mapping is achieved by assigning to each PE an SR comprising 4 sub-SRs, with the SRs of all PEs mutually connected. The interconnection between the 4×8b sub-SRs is configurable into any of the 4 modes shown in Fig. 2.6.4. This enables mapping of up to 4 columns of video data to each PE when the number of horizontal pixels in the video data exceeds the number of PEs. Whenever a line of video data has been shifted in completely, an interrupt signal is generated by the CP's video interrupt generator. The interrupt signal in turn invokes a software interrupt routine that runs in the horizontal or vertical blanking period of the video signal. During this time the video clock that drives the sub-SR flip-flops (FFs) is shut off, which ensures collision-free asynchronous access to the sub-SRs by the PEs within the interrupt routine.

A parallel data structure extended C language called one-dimensional C (1DC) has been developed [3] to provide high programmability. Because each PE is designed as a RISC processor with 24 general-purpose registers, conventional compiler techniques are fully adopted to create an efficient optimizing 1DC compiler. A significant feature of the compiler is the virtualization of the number of PEs and of the video data mapping, so that no source code modification is required even if the number of chips or the video data mapping of the target system changes.

The IMAP-CE is fabricated in a 0.18µm 7M CMOS process, and the 11×11mm² die contains 32.7M transistors. Figure 2.6.6 summarizes the LSI features, and Fig. 2.6.7 shows the die micrograph. The evaluation of image filters in Fig. 2.6.5 shows that the IMAP-CE runs 3 to 50 times faster than a 2.8GHz general-purpose processor.
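As a cross-check on the headline throughput, the peak figure follows directly from numbers given in the text: 128 PEs, up to 4 instructions per PE per cycle, and a 100MHz clock. The short Python sketch below does that arithmetic. Counting one operation per issued instruction is an assumption about how the GOPS figure is defined, and the 16-chip value is simply the single-chip peak multiplied by 16.

```python
def peak_gops(num_pes, issue_width, clock_hz, num_chips=1):
    """Peak giga-operations per second, assuming one operation per issued instruction."""
    return num_pes * issue_width * clock_hz * num_chips / 1e9


single = peak_gops(128, 4, 100e6)                  # one chip
cascade = peak_gops(128, 4, 100e6, num_chips=16)   # 16 cascaded chips
print(f"{single:.1f} GOPS single chip, {cascade:.1f} GOPS for 16 chips")
```

These are upper bounds; sustained throughput depends on how many of the four instruction slots an application actually fills.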
Figure 2.6.5 also shows a performance of 33ms/frame for a single-chip IMAP-CE running an ICC application, written in 1DC, that performs lane marking, road area, and vehicle detection, in which various complicated algorithms are combined to obtain robust recognition results under bad weather conditions [4]. The results demonstrate the potential of the IMAP-CE as a device solution for various timing-critical ICC applications.

References

[1] M. Hariyama, et al., "VLSI Processor for Reliable Stereo Matching Based on Adaptive Window-Size Selection," IEEE Int. Conf. on Robotics and Automation, vol. 2, May.
[2] Y. Kondo, et al., "4 GOPS 3-way VLIW image recognition processor based on a configurable media-processor," ISSCC Digest of Technical Papers.
[3] S. Kyo, et al., "Efficient Implementation of Image Processing Algorithms on Linear Processor Arrays using the Data Parallel Language 1DC," IAPR Workshop on Machine Vision and Applications.
[4] S. Kyo, et al., "A Robust Vehicle Detecting and Tracking System for Wet Weather Conditions using the IMAP-VISION Image Processing Board," IEEE/IEEJ/JSAI Int. Conf. on Intelligent Transportation System, 1999.
ISSCC 2003 / February 19, 2003 / Salon 1-6 / 4:15 PM

Figure 2.6.1: Processor block diagram.
Figure 2.6.2: CP and LPA pipeline structure.
Figure 2.6.3: CPDATA bus configuration.
Figure 2.6.4: The four possible sub-SR interconnections.
Figure 2.6.5: Performance comparison results.
Figure 2.6.6: Chip specifications.
Figure 2.6.7: Die micrograph.
More informationAccelerated Computing Jun Makino Interactive Research Center of Science Tokyo Institute of Technology
Accelerated Computing Jun Makino Interactive Research Center of Science Tokyo Institute of Technology IEEE Cluster 2011 Sept 28, 2011 Satoshi s three questions 1. What does accelerated computing solve
More informationChapter 5B. Large and Fast: Exploiting Memory Hierarchy
Chapter 5B Large and Fast: Exploiting Memory Hierarchy One Transistor Dynamic RAM 1-T DRAM Cell word access transistor V REF TiN top electrode (V REF ) Ta 2 O 5 dielectric bit Storage capacitor (FET gate,
More information1. INTRODUCTION TO MICROPROCESSOR AND MICROCOMPUTER ARCHITECTURE:
1. INTRODUCTION TO MICROPROCESSOR AND MICROCOMPUTER ARCHITECTURE: A microprocessor is a programmable electronics chip that has computing and decision making capabilities similar to central processing unit
More informationCISC Attributes. E.g. Pentium is considered a modern CISC processor
What is CISC? CISC means Complex Instruction Set Computer chips that are easy to program and which make efficient use of memory. Since the earliest machines were programmed in assembly language and memory
More informationMemory & Simple I/O Interfacing
Chapter 10 Memory & Simple I/O Interfacing Expected Outcomes Explain the importance of tri-state devices in microprocessor system Distinguish basic type of semiconductor memory and their applications Relate
More informationThe Earth Simulator System
Architecture and Hardware for HPC Special Issue on High Performance Computing The Earth Simulator System - - - & - - - & - By Shinichi HABATA,* Mitsuo YOKOKAWA and Shigemune KITAWAKI The Earth Simulator,
More informationHigh Performance Interconnect and NoC Router Design
High Performance Interconnect and NoC Router Design Brinda M M.E Student, Dept. of ECE (VLSI Design) K.Ramakrishnan College of Technology Samayapuram, Trichy 621 112 brinda18th@gmail.com Devipoonguzhali
More informationSH-Mobile LSIs for Cell Phones
Hitachi Review 1 SH-Mobile LSIs for Cell Phones Toshinobu Kanai Hiroshi Yagi Junichi Nishimoto Ikuya Kawasaki OVERVIEW: With the increasing number of functions performed by cell phones, processors are
More information