ISSCC 2003 / SESSION 2 / MULTIMEDIA SIGNAL PROCESSING / PAPER 2.6

Size: px
Start display at page:

Download "ISSCC 2003 / SESSION 2 / MULTIMEDIA SIGNAL PROCESSING / PAPER 2.6"

Transcription

1 ISSCC 2003 / SESSION 2 / MULTIMEDIA SIGNAL PROCESSING / PAPER A 51.2GOPS Scalable Video Recognition Processor for Intelligent Cruise Control Based on a Linear Array of Way VLIW Processing Elements Shorin Kyo, Takuya Koga, Shin ichiro Okazaki, Ryoichi Uchida, Satoshi Yoshimoto, Ichiro Kuroda NEC, Kawasaki, Japan Video recognition for intelligent cruise control (ICC) and other intelligent transport system (ITS) applications requires not only high computation performance but also high programmability. This is due to the necessity of using various recognition algorithms to cope with changing situations, sizes, and appearances of target objects such as vehicles, lane marks and obstacles. Existing microprocessors and DSPs provide high programmability but fail to offer enough performance because insufficient architectural support exists for the required pixel parallelism and non-continuous memory access. Dedicated ASICs[1] and configurable devices[2] have to devote die spaces to each recognition algorithm for each exclusively existing situation such as driving on highways versus local streets, or stop-and-go versus normal cruise. An integrated memory array processor for car electronics (IMAP-CE) provides both high performance and programmability for ICC applications is characterized by the following. 1) Parallel execution by a multiple of 128 SIMD 4-way VLIW processing elements (PEs). 2) Parallel algorithm design is facilitated by a linear PE connection and the ability of simultaneous access of data located in different memory address by each PE (indexed addressing). 3) Automatic background video data mapping for each PE is acheived by an asynchronous shift register (SR) structure. 4) An optimizing extended C compiler generates fully efficient codes. Figure shows the IMAP-CE containing a 16b RISC control processor (CP), a linear processor array (LPA) of 128 RISC type 8b PEs, and an interface unit. Each PE executes up to 4 instructions per clock cycle to provide a peak performance of 51.2 GOPS in a single chip, or up to GOPS in a cascade connected 16-chip configuration at 100MHz. CP issues up to 4 instructions per clock cycle, within which one is decoded for the CP, and as many as 4 are decoded and broadcast to the LPA. To facilitate instruction/data broadcast from the CP, the LPA is hierarchically consisted of 16 PE groups (PEG), where each PEG further consists of 8 linearly connected PEs. A 64b line buffer of each PEG is mutually connected in a SR configuration for burst transfer of data between the off-chip SDRAM and PE RAM blocks. Figure shows the CP (6-stage) and PE (3-stage) pipelines bridged by the instruction broadcast (BC1), CP data broadcast (BC2), and PE data reduction (RDU: extract an 8b data out of 8bx128, or produce an 1b status by wired ORing 1bx128) pipes. The cascade connection of multiple chips to form a larger LPA is supported for extra performance requirements. In Fig , the inter-pe data path is pulled out as pins, and an inter-pe data selector is used for selecting between the looped back PE data of the chip, and PE data from adjacent chips. In Fig , additional selector logics of the RDU pipe are used for selecting PE status/data from other chips supplied by a 1bx16/16bx1 configurable bus (CPDATA). Figure shows that, in the 1bx16 mode, each chip sends its own 1b status to all the other chips by using a separate 1b of CPDATA and uses the remaining 15b to receive data from the other chips. In the 16bx1 mode the particular chip possessing the requested PE data drives all bits of CPDATA to broadcast the data to all other chips. A combination of the LPA configuration with the indexed addressing capability, which is enabled by assigning each PE a separate RAM block, are shown to provide a base for parallel implementation of various image processing tasks [3]. This is unachievable by a conventional SIMD stream instruction design such as the Intel MMX/SSE instruction set, due to the parallelism being restricted only to continuously located memory data. Automatic background video data mapping is achieved by assigning a mutually connected SR, comprised of 4-sub-SRs, to each PE. The inter-connection between the 4x8b sub-srs is configurable into any of the 4 modes shown in Fig This enables mapping of up to 4 columns of video data to each PE, in case the horizontal pixel number of the video data exceeds the PE number. Whenever a video data line is shifted in completely, an interrupt signal is generated by the CP s video-interrupt-generator. The interrupt signal in turn invokes a software interrupt routine to run in the horizontal or vertical blank period of the video signal. During this time the video clock used for driving the sub-sr flip-flops (FFs) is shut off and ensures collision free asynchronous access to the sub- SRs by the PEs within the interrupt routine. A parallel data structure extended C language, called one dimensional C (1DC) is developed [3] to provide high programmability. As each PE is designed as a RISC processor with 24 general purpose registers, conventional compile techniques are fully adopted for creating an efficient optimizing 1DC compiler. A significant feature of the compiler is the virtualization of number of PE and video data mapping, by which source code modification is not required even if the number of chips or the video data mapping of a target system changes. The IMAP-CE is fabricated using 0.18µm 7M CMOS process and the 11x11mm 2 die contains 32.7M transistors. Figure summarizes the LSI features, and Fig shows the die micrograph. The evaluation of the image filters.in Fig shows that the IMAP-CE runs 3 to 50 times faster than a 2.8 GHz general-purpose processor. Figure also shows a 33ms/frame performance using a single chip IMAP-CE running the 1DC written ICC application of lane marking, road area, and vehicle detection, in which various complicated algorithms are combined for obtaining robust recognition results under bad weather conditions[4]. The results demonstrate the potential of IMAP-CE as a device solution for various timing-crucial ICC applications. References [1] M. Hariyama, et al., "VLSI Processor for Reliable Stereo Matching Based on Adaptive Window-Size Selection, IEEE Int. Conf. on Robotics and Automation, vol. 2, pp , May, [2] Y. Kondo, et al., "4 GOPS 3 way-vliw image recognition processor based on a configurable media-processor, ISSCC Digest of Technical Papers, pp , [3] S. Kyo, et al., "Efficient Implementation of Image Processing Algorithms on Linear Processor Arrays using the Data Parallel Language 1DC, IAPR Workshop on Machine Vision and Applications, pp , [4] S. Kyo, et al., "A Robust Vehicle Detecting and Tracking System for Wet Weather Conditions using the IMAP-VISION Image Processing Board, IEEE/IEEJ/JSAI Int. Conf. on Intelligent Transportation System, pp , 1999.

2 ISSCC 2003 / February 19, 2003 / Salon 1-6 / 4:15 PM 2 Figure 2.6.1: Processor block diagram. Figure 2.6.2: CP and LPA pipeline structure. Figure 2.6.3: CPDATA bus configuration. Figure 2.6.4: The four possible sub-sr inter-connections. Figure 2.6.5: Performance comparison results. Figure 2.6.6: Chip specifications.

3 2 Figure 2.6.7: Die micrograph.

4 Figure 2.6.1: Processor block diagram.

5 Figure 2.6.2: CP and LPA pipeline structure.

6 Figure 2.6.3: CPDATA bus configuration.

7 Figure 2.6.4: The four possible sub-sr inter-connections.

8 Figure 2.6.5: Performance comparison results.

9 Figure 2.6.6: Chip specifications.

10 Figure 2.6.7: Die micrograph.

ISSCC 2003 / SESSION 2 / MULTIMEDIA SIGNAL PROCESSING / PAPER 2.6

ISSCC 2003 / SESSION 2 / MULTIMEDIA SIGNAL PROCESSING / PAPER 2.6 ISSCC 2003 / SESSION 2 / MULTIMEDIA SIGNAL PROCESSING / PAPER 2.6 2.6 A 51.2GOPS Scalable Video Recognition Processor for Intelligent Cruise Control Based on a Linear Array of 128 4-Way VLIW Processing

More information

In-Vehicle Vision Processors for Driver Assistance Systems

In-Vehicle Vision Processors for Driver Assistance Systems In-Vehicle Vision Processors for Driver Assistance Systems Shorin Kyo System IP Core Research Laboratory NEC Corporation Kawasaki City, 211-8666 Tel : +81-044-431-7453 Fax : +81-044-431-7489 e-mail: s-kyo@cq.jp.nec.com

More information

ISSCC 2001 / SESSION 9 / INTEGRATED MULTIMEDIA PROCESSORS / 9.2

ISSCC 2001 / SESSION 9 / INTEGRATED MULTIMEDIA PROCESSORS / 9.2 ISSCC 2001 / SESSION 9 / INTEGRATED MULTIMEDIA PROCESSORS / 9.2 9.2 A 80/20MHz 160mW Multimedia Processor integrated with Embedded DRAM MPEG-4 Accelerator and 3D Rendering Engine for Mobile Applications

More information

ISSCC 2003 / SESSION 8 / COMMUNICATIONS SIGNAL PROCESSING / PAPER 8.7

ISSCC 2003 / SESSION 8 / COMMUNICATIONS SIGNAL PROCESSING / PAPER 8.7 ISSCC 2003 / SESSION 8 / COMMUNICATIONS SIGNAL PROCESSING / PAPER 8.7 8.7 A Programmable Turbo Decoder for Multiple 3G Wireless Standards Myoung-Cheol Shin, In-Cheol Park KAIST, Daejeon, Republic of Korea

More information

ISSCC 2003 / SESSION 14 / MICROPROCESSORS / PAPER 14.5

ISSCC 2003 / SESSION 14 / MICROPROCESSORS / PAPER 14.5 ISSCC 2003 / SESSION 14 / MICROPROCESSORS / PAPER 14.5 14.5 A 600MHz Single-Chip Multiprocessor with 4.8GB/s Internal Shared Pipelined Bus and 512kB Internal Memory Satoshi Kaneko, Katsunori Sawai, Norio

More information

Multicore SoC is coming. Scalable and Reconfigurable Stream Processor for Mobile Multimedia Systems. Source: 2007 ISSCC and IDF.

Multicore SoC is coming. Scalable and Reconfigurable Stream Processor for Mobile Multimedia Systems. Source: 2007 ISSCC and IDF. Scalable and Reconfigurable Stream Processor for Mobile Multimedia Systems Liang-Gee Chen Distinguished Professor General Director, SOC Center National Taiwan University DSP/IC Design Lab, GIEE, NTU 1

More information

Memory Systems IRAM. Principle of IRAM

Memory Systems IRAM. Principle of IRAM Memory Systems 165 other devices of the module will be in the Standby state (which is the primary state of all RDRAM devices) or another state with low-power consumption. The RDRAM devices provide several

More information

FABRICATION TECHNOLOGIES

FABRICATION TECHNOLOGIES FABRICATION TECHNOLOGIES DSP Processor Design Approaches Full custom Standard cell** higher performance lower energy (power) lower per-part cost Gate array* FPGA* Programmable DSP Programmable general

More information

3.1 Description of Microprocessor. 3.2 History of Microprocessor

3.1 Description of Microprocessor. 3.2 History of Microprocessor 3.0 MAIN CONTENT 3.1 Description of Microprocessor The brain or engine of the PC is the processor (sometimes called microprocessor), or central processing unit (CPU). The CPU performs the system s calculating

More information

Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP

Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP Presenter: Course: EEC 289Q: Reconfigurable Computing Course Instructor: Professor Soheil Ghiasi Outline Overview of M.I.T. Raw processor

More information

Chapter 2 Logic Gates and Introduction to Computer Architecture

Chapter 2 Logic Gates and Introduction to Computer Architecture Chapter 2 Logic Gates and Introduction to Computer Architecture 2.1 Introduction The basic components of an Integrated Circuit (IC) is logic gates which made of transistors, in digital system there are

More information

VLSI Design Automation

VLSI Design Automation VLSI Design Automation IC Products Processors CPU, DSP, Controllers Memory chips RAM, ROM, EEPROM Analog Mobile communication, audio/video processing Programmable PLA, FPGA Embedded systems Used in cars,

More information

MPSOC Architectures for Computing for Imaging. Thierry Collette, Ph.D. CEA LIST Head of Architecture and Design Department

MPSOC Architectures for Computing for Imaging. Thierry Collette, Ph.D. CEA LIST Head of Architecture and Design Department MPSOC Architectures for Computing for Imaging Thierry Collette, Ph.D. CEA LIST Head of Architecture and Design Department thierry.collette@cea.fr Performance Embedded Computing: a New Area www.tilera.com

More information

VLSI Design Automation. Calcolatori Elettronici Ing. Informatica

VLSI Design Automation. Calcolatori Elettronici Ing. Informatica VLSI Design Automation 1 Outline Technology trends VLSI Design flow (an overview) 2 IC Products Processors CPU, DSP, Controllers Memory chips RAM, ROM, EEPROM Analog Mobile communication, audio/video processing

More information

SA-1500: A 300 MHz RISC CPU with Attached Media Processor*

SA-1500: A 300 MHz RISC CPU with Attached Media Processor* and Bridges Division SA-1500: A 300 MHz RISC CPU with Attached Media Processor* Prashant P. Gandhi, Ph.D. and Bridges Division Computing Enhancement Group Intel Corporation Santa Clara, CA 95052 Prashant.Gandhi@intel.com

More information

Towards a Dynamically Reconfigurable System-on-Chip Platform for Video Signal Processing

Towards a Dynamically Reconfigurable System-on-Chip Platform for Video Signal Processing Towards a Dynamically Reconfigurable System-on-Chip Platform for Video Signal Processing Walter Stechele, Stephan Herrmann, Andreas Herkersdorf Technische Universität München 80290 München Germany Walter.Stechele@ei.tum.de

More information

Introduction to Microprocessor

Introduction to Microprocessor Introduction to Microprocessor Slide 1 Microprocessor A microprocessor is a multipurpose, programmable, clock-driven, register-based electronic device That reads binary instructions from a storage device

More information

Gated-Demultiplexer Tree Buffer for Low Power Using Clock Tree Based Gated Driver

Gated-Demultiplexer Tree Buffer for Low Power Using Clock Tree Based Gated Driver Gated-Demultiplexer Tree Buffer for Low Power Using Clock Tree Based Gated Driver E.Kanniga 1, N. Imocha Singh 2,K.Selva Rama Rathnam 3 Professor Department of Electronics and Telecommunication, Bharath

More information

The S6000 Family of Processors

The S6000 Family of Processors The S6000 Family of Processors Today s Design Challenges The advent of software configurable processors In recent years, the widespread adoption of digital technologies has revolutionized the way in which

More information

An Infrastructural IP for Interactive MPEG-4 SoC Functional Verification

An Infrastructural IP for Interactive MPEG-4 SoC Functional Verification International Journal on Electrical Engineering and Informatics - Volume 1, Number 2, 2009 An Infrastructural IP for Interactive MPEG-4 SoC Functional Verification Trio Adiono 1, Hans G. Kerkhoff 2 & Hiroaki

More information

An Evaluation of an Energy Efficient Many-Core SoC with Parallelized Face Detection

An Evaluation of an Energy Efficient Many-Core SoC with Parallelized Face Detection An Evaluation of an Energy Efficient Many-Core SoC with Parallelized Face Detection Hiroyuki Usui, Jun Tanabe, Toru Sano, Hui Xu, and Takashi Miyamori Toshiba Corporation, Kawasaki, Japan Copyright 2013,

More information

DUE to the high computational complexity and real-time

DUE to the high computational complexity and real-time IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 15, NO. 3, MARCH 2005 445 A Memory-Efficient Realization of Cyclic Convolution and Its Application to Discrete Cosine Transform Hun-Chen

More information

Multi-Core SoCs for ADAS and Image Recognition Applications

Multi-Core SoCs for ADAS and Image Recognition Applications Multi-Core SoCs for ADAS and Image Recognition Applications Takashi Miyamori, Senior Manager Embedded Core Technology Development Department Center for Semiconductor Research & Development Storage Device

More information

Real Time Image Processing Architecture for Robot Vision

Real Time Image Processing Architecture for Robot Vision Header for SPIE use Real Time Image Processing Architecture for Robot Vision Stelian Persa *, Pieter Jonker Pattern Recognition Group, Technical University Delft Lorentzweg 1, Delft, 2628 CJ,The Netherlands

More information

MEMORIES. Memories. EEC 116, B. Baas 3

MEMORIES. Memories. EEC 116, B. Baas 3 MEMORIES Memories VLSI memories can be classified as belonging to one of two major categories: Individual registers, single bit, or foreground memories Clocked: Transparent latches and Flip-flops Unclocked:

More information

Memory in Digital Systems

Memory in Digital Systems MEMORIES Memory in Digital Systems Three primary components of digital systems Datapath (does the work) Control (manager) Memory (storage) Single bit ( foround ) Clockless latches e.g., SR latch Clocked

More information

THE latest generation of microprocessors uses a combination

THE latest generation of microprocessors uses a combination 1254 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 30, NO. 11, NOVEMBER 1995 A 14-Port 3.8-ns 116-Word 64-b Read-Renaming Register File Creigton Asato Abstract A 116-word by 64-b register file for a 154 MHz

More information

! Program logic functions, interconnect using SRAM. ! Advantages: ! Re-programmable; ! dynamically reconfigurable; ! uses standard processes.

! Program logic functions, interconnect using SRAM. ! Advantages: ! Re-programmable; ! dynamically reconfigurable; ! uses standard processes. Topics! SRAM-based FPGA fabrics:! Xilinx.! Altera. SRAM-based FPGAs! Program logic functions, using SRAM.! Advantages:! Re-programmable;! dynamically reconfigurable;! uses standard processes.! isadvantages:!

More information

AC : INFRARED COMMUNICATIONS FOR CONTROLLING A ROBOT

AC : INFRARED COMMUNICATIONS FOR CONTROLLING A ROBOT AC 2007-1527: INFRARED COMMUNICATIONS FOR CONTROLLING A ROBOT Ahad Nasab, Middle Tennessee State University SANTOSH KAPARTHI, Middle Tennessee State University American Society for Engineering Education,

More information

An Infrastructural IP for Interactive MPEG-4 SoC Functional Verification

An Infrastructural IP for Interactive MPEG-4 SoC Functional Verification ITB J. ICT Vol. 3, No. 1, 2009, 51-66 51 An Infrastructural IP for Interactive MPEG-4 SoC Functional Verification 1 Trio Adiono, 2 Hans G. Kerkhoff & 3 Hiroaki Kunieda 1 Institut Teknologi Bandung, Bandung,

More information

ISSCC 2006 / SESSION 22 / LOW POWER MULTIMEDIA / 22.1

ISSCC 2006 / SESSION 22 / LOW POWER MULTIMEDIA / 22.1 ISSCC 26 / SESSION 22 / LOW POWER MULTIMEDIA / 22.1 22.1 A 125µW, Fully Scalable MPEG-2 and H.264/AVC Video Decoder for Mobile Applications Tsu-Ming Liu 1, Ting-An Lin 2, Sheng-Zen Wang 2, Wen-Ping Lee

More information

ISSN: [Garade* et al., 6(1): January, 2017] Impact Factor: 4.116

ISSN: [Garade* et al., 6(1): January, 2017] Impact Factor: 4.116 IJESRT INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY FULLY REUSED VLSI ARCHITECTURE OF DSRC ENCODERS USING SOLS TECHNIQUE Supriya Shivaji Garade*, Prof. P. R. Badadapure * Department

More information

EECS Dept., University of California at Berkeley. Berkeley Wireless Research Center Tel: (510)

EECS Dept., University of California at Berkeley. Berkeley Wireless Research Center Tel: (510) A V Heterogeneous Reconfigurable Processor IC for Baseband Wireless Applications Hui Zhang, Vandana Prabhu, Varghese George, Marlene Wan, Martin Benes, Arthur Abnous, and Jan M. Rabaey EECS Dept., University

More information

Design guidelines for embedded real time face detection application

Design guidelines for embedded real time face detection application Design guidelines for embedded real time face detection application White paper for Embedded Vision Alliance By Eldad Melamed Much like the human visual system, embedded computer vision systems perform

More information

Organization. 5.1 Semiconductor Main Memory. William Stallings Computer Organization and Architecture 6th Edition

Organization. 5.1 Semiconductor Main Memory. William Stallings Computer Organization and Architecture 6th Edition William Stallings Computer Organization and Architecture 6th Edition Chapter 5 Internal Memory 5.1 Semiconductor Main Memory 5.2 Error Correction 5.3 Advanced DRAM Organization 5.1 Semiconductor Main Memory

More information

William Stallings Computer Organization and Architecture 6th Edition. Chapter 5 Internal Memory

William Stallings Computer Organization and Architecture 6th Edition. Chapter 5 Internal Memory William Stallings Computer Organization and Architecture 6th Edition Chapter 5 Internal Memory Semiconductor Memory Types Semiconductor Memory RAM Misnamed as all semiconductor memory is random access

More information

SIDDHARTH GROUP OF INSTITUTIONS :: PUTTUR Siddharth Nagar, Narayanavanam Road QUESTION BANK (DESCRIPTIVE) UNIT-I

SIDDHARTH GROUP OF INSTITUTIONS :: PUTTUR Siddharth Nagar, Narayanavanam Road QUESTION BANK (DESCRIPTIVE) UNIT-I SIDDHARTH GROUP OF INSTITUTIONS :: PUTTUR Siddharth Nagar, Narayanavanam Road 517583 QUESTION BANK (DESCRIPTIVE) Subject with Code : CO (16MC802) Year & Sem: I-MCA & I-Sem Course & Branch: MCA Regulation:

More information

A 167-processor Computational Array for Highly-Efficient DSP and Embedded Application Processing

A 167-processor Computational Array for Highly-Efficient DSP and Embedded Application Processing A 167-processor Computational Array for Highly-Efficient DSP and Embedded Application Processing Dean Truong, Wayne Cheng, Tinoosh Mohsenin, Zhiyi Yu, Toney Jacobson, Gouri Landge, Michael Meeuwsen, Christine

More information

A 50Mvertices/s Graphics Processor with Fixed-Point Programmable Vertex Shader for Mobile Applications

A 50Mvertices/s Graphics Processor with Fixed-Point Programmable Vertex Shader for Mobile Applications A 50Mvertices/s Graphics Processor with Fixed-Point Programmable Vertex Shader for Mobile Applications Ju-Ho Sohn, Jeong-Ho Woo, Min-Wuk Lee, Hye-Jung Kim, Ramchan Woo, Hoi-Jun Yoo Semiconductor System

More information

ESE 570 Cadence Lab Assignment 2: Introduction to Spectre, Manual Layout Drawing and Post Layout Simulation (PLS)

ESE 570 Cadence Lab Assignment 2: Introduction to Spectre, Manual Layout Drawing and Post Layout Simulation (PLS) ESE 570 Cadence Lab Assignment 2: Introduction to Spectre, Manual Layout Drawing and Post Layout Simulation (PLS) Objective Part A: To become acquainted with Spectre (or HSpice) by simulating an inverter,

More information

FPGA for Complex System Implementation. National Chiao Tung University Chun-Jen Tsai 04/14/2011

FPGA for Complex System Implementation. National Chiao Tung University Chun-Jen Tsai 04/14/2011 FPGA for Complex System Implementation National Chiao Tung University Chun-Jen Tsai 04/14/2011 About FPGA FPGA was invented by Ross Freeman in 1989 SRAM-based FPGA properties Standard parts Allowing multi-level

More information

Vector Architectures Vs. Superscalar and VLIW for Embedded Media Benchmarks

Vector Architectures Vs. Superscalar and VLIW for Embedded Media Benchmarks Vector Architectures Vs. Superscalar and VLIW for Embedded Media Benchmarks Christos Kozyrakis Stanford University David Patterson U.C. Berkeley http://csl.stanford.edu/~christos Motivation Ideal processor

More information

1. Microprocessor Architectures. 1.1 Intel 1.2 Motorola

1. Microprocessor Architectures. 1.1 Intel 1.2 Motorola 1. Microprocessor Architectures 1.1 Intel 1.2 Motorola 1.1 Intel The Early Intel Microprocessors The first microprocessor to appear in the market was the Intel 4004, a 4-bit data bus device. This device

More information

VLSI Design Automation. Maurizio Palesi

VLSI Design Automation. Maurizio Palesi VLSI Design Automation 1 Outline Technology trends VLSI Design flow (an overview) 2 Outline Technology trends VLSI Design flow (an overview) 3 IC Products Processors CPU, DSP, Controllers Memory chips

More information

POWER consumption has become one of the most important

POWER consumption has become one of the most important 704 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 39, NO. 4, APRIL 2004 Brief Papers High-Throughput Asynchronous Datapath With Software-Controlled Voltage Scaling Yee William Li, Student Member, IEEE, George

More information

Design Methodologies

Design Methodologies Design Methodologies 1981 1983 1985 1987 1989 1991 1993 1995 1997 1999 2001 2003 2005 2007 2009 Complexity Productivity (K) Trans./Staff - Mo. Productivity Trends Logic Transistor per Chip (M) 10,000 0.1

More information

A 1-GHz Configurable Processor Core MeP-h1

A 1-GHz Configurable Processor Core MeP-h1 A 1-GHz Configurable Processor Core MeP-h1 Takashi Miyamori, Takanori Tamai, and Masato Uchiyama SoC Research & Development Center, TOSHIBA Corporation Outline Background Pipeline Structure Bus Interface

More information

IMAGINE: Signal and Image Processing Using Streams

IMAGINE: Signal and Image Processing Using Streams IMAGINE: Signal and Image Processing Using Streams Brucek Khailany William J. Dally, Scott Rixner, Ujval J. Kapasi, Peter Mattson, Jinyung Namkoong, John D. Owens, Brian Towles Concurrent VLSI Architecture

More information

HP PA-8000 RISC CPU. A High Performance Out-of-Order Processor

HP PA-8000 RISC CPU. A High Performance Out-of-Order Processor The A High Performance Out-of-Order Processor Hot Chips VIII IEEE Computer Society Stanford University August 19, 1996 Hewlett-Packard Company Engineering Systems Lab - Fort Collins, CO - Cupertino, CA

More information

A HIGHLY PROGRAMMABLE SENSOR NETWORK INTERFACE WITH MULTIPLE SENSOR READOUT CIRCUITS

A HIGHLY PROGRAMMABLE SENSOR NETWORK INTERFACE WITH MULTIPLE SENSOR READOUT CIRCUITS A HIGHLY PROGRAMMABLE SENSOR NETWORK INTERFACE WITH MULTIPLE SENSOR READOUT CIRCUITS Jichun Zhang, Junwei Zhou, Prasanna Balasundaram and Andrew Mason ECE Department, Michigan State University, East Lansing,

More information

ARM Processors for Embedded Applications

ARM Processors for Embedded Applications ARM Processors for Embedded Applications Roadmap for ARM Processors ARM Architecture Basics ARM Families AMBA Architecture 1 Current ARM Core Families ARM7: Hard cores and Soft cores Cache with MPU or

More information

Parallel Extraction Architecture for Information of Numerous Particles in Real-Time Image Measurement

Parallel Extraction Architecture for Information of Numerous Particles in Real-Time Image Measurement Parallel Extraction Architecture for Information of Numerous Paper: Rb17-4-2346; May 19, 2005 Parallel Extraction Architecture for Information of Numerous Yoshihiro Watanabe, Takashi Komuro, Shingo Kagami,

More information

EE5780 Advanced VLSI CAD

EE5780 Advanced VLSI CAD EE5780 Advanced VLSI CAD Lecture 1 Introduction Zhuo Feng 1.1 Prof. Zhuo Feng Office: EERC 513 Phone: 487-3116 Email: zhuofeng@mtu.edu Class Website http://www.ece.mtu.edu/~zhuofeng/ee5780fall2013.html

More information

VLSI Design Automation

VLSI Design Automation VLSI Design Automation IC Products Processors CPU, DSP, Controllers Memory chips RAM, ROM, EEPROM Analog Mobile communication, audio/video processing Programmable PLA, FPGA Embedded systems Used in cars,

More information

CAD for VLSI. Debdeep Mukhopadhyay IIT Madras

CAD for VLSI. Debdeep Mukhopadhyay IIT Madras CAD for VLSI Debdeep Mukhopadhyay IIT Madras Tentative Syllabus Overall perspective of VLSI Design MOS switch and CMOS, MOS based logic design, the CMOS logic styles, Pass Transistors Introduction to Verilog

More information

Emerging DRAM Technologies

Emerging DRAM Technologies 1 Emerging DRAM Technologies Michael Thiems amt051@email.mot.com DigitalDNA Systems Architecture Laboratory Motorola Labs 2 Motivation DRAM and the memory subsystem significantly impacts the performance

More information

An Asynchronous Array of Simple Processors for DSP Applications

An Asynchronous Array of Simple Processors for DSP Applications An Asynchronous Array of Simple Processors for DSP Applications Zhiyi Yu, Michael Meeuwsen, Ryan Apperson, Omar Sattari, Michael Lai, Jeremy Webb, Eric Work, Tinoosh Mohsenin, Mandeep Singh, Bevan Baas

More information

Unit 9 : Fundamentals of Parallel Processing

Unit 9 : Fundamentals of Parallel Processing Unit 9 : Fundamentals of Parallel Processing Lesson 1 : Types of Parallel Processing 1.1. Learning Objectives On completion of this lesson you will be able to : classify different types of parallel processing

More information

Chapter 3 : Control Unit

Chapter 3 : Control Unit 3.1 Control Memory Chapter 3 Control Unit The function of the control unit in a digital computer is to initiate sequences of microoperations. When the control signals are generated by hardware using conventional

More information

2D/3D Graphics Accelerator for Mobile Multimedia Applications. Ramchan Woo, Sohn, Seong-Jun Song, Young-Don

2D/3D Graphics Accelerator for Mobile Multimedia Applications. Ramchan Woo, Sohn, Seong-Jun Song, Young-Don RAMP-IV: A Low-Power and High-Performance 2D/3D Graphics Accelerator for Mobile Multimedia Applications Woo, Sungdae Choi, Ju-Ho Sohn, Seong-Jun Song, Young-Don Bae,, and Hoi-Jun Yoo oratory Dept. of EECS,

More information

A 167-processor 65 nm Computational Platform with Per-Processor Dynamic Supply Voltage and Dynamic Clock Frequency Scaling

A 167-processor 65 nm Computational Platform with Per-Processor Dynamic Supply Voltage and Dynamic Clock Frequency Scaling A 167-processor 65 nm Computational Platform with Per-Processor Dynamic Supply Voltage and Dynamic Clock Frequency Scaling Dean Truong, Wayne Cheng, Tinoosh Mohsenin, Zhiyi Yu, Toney Jacobson, Gouri Landge,

More information

INTRODUCTION TO FPGA ARCHITECTURE

INTRODUCTION TO FPGA ARCHITECTURE 3/3/25 INTRODUCTION TO FPGA ARCHITECTURE DIGITAL LOGIC DESIGN (BASIC TECHNIQUES) a b a y 2input Black Box y b Functional Schematic a b y a b y a b y 2 Truth Table (AND) Truth Table (OR) Truth Table (XOR)

More information

Atmel AT94K FPSLIC Architecture Field Programmable Gate Array

Atmel AT94K FPSLIC Architecture Field Programmable Gate Array Embedded Processor Based Built-In Self-Test and Diagnosis of FPGA Core in FPSLIC John Sunwoo (Logic BIST) Srinivas Garimella (RAM BIST) Sudheer Vemula (I/O Cell BIST) Chuck Stroud (Routing BIST) Jonathan

More information

Axcelerator Family FPGAs

Axcelerator Family FPGAs Product Brief Axcelerator Family FPGAs u e Leading-Edge Performance 350+ MHz System Performance 500+ MHz Internal Performance High-Performance Embedded s 700 Mb/s LVDS Capable I/Os Specifications Up to

More information

Grand Challenge Scaling - Pushing a Fully Programmable TeraOp into Handset Imaging

Grand Challenge Scaling - Pushing a Fully Programmable TeraOp into Handset Imaging Grand Challenge Scaling - Pushing a Fully Programmable TeraOp into Handset Imaging Chris Rowen Cadence Fellow/Tensilica CTO Outline Grand challenge problem for the next decade: video and vision intelligence

More information

PLAs & PALs. Programmable Logic Devices (PLDs) PLAs and PALs

PLAs & PALs. Programmable Logic Devices (PLDs) PLAs and PALs PLAs & PALs Programmable Logic Devices (PLDs) PLAs and PALs PLAs&PALs By the late 1970s, standard logic devices were all the rage, and printed circuit boards were loaded with them. To offer the ultimate

More information

Increasing interconnection network connectivity for reducing operator complexity in asynchronous vision systems

Increasing interconnection network connectivity for reducing operator complexity in asynchronous vision systems Increasing interconnection network connectivity for reducing operator complexity in asynchronous vision systems Valentin Gies and Thierry M. Bernard ENSTA, 32 Bd Victor 75015, Paris, FRANCE, contact@vgies.com,

More information

Chapter 5 Internal Memory

Chapter 5 Internal Memory Chapter 5 Internal Memory Memory Type Category Erasure Write Mechanism Volatility Random-access memory (RAM) Read-write memory Electrically, byte-level Electrically Volatile Read-only memory (ROM) Read-only

More information

4. Hardware Platform: Real-Time Requirements

4. Hardware Platform: Real-Time Requirements 4. Hardware Platform: Real-Time Requirements Contents: 4.1 Evolution of Microprocessor Architecture 4.2 Performance-Increasing Concepts 4.3 Influences on System Architecture 4.4 A Real-Time Hardware Architecture

More information

Hierarchical Multi-Chip Architecture for High Capacity Scalability of Fully Parallel Hamming-Distance Associative Memories

Hierarchical Multi-Chip Architecture for High Capacity Scalability of Fully Parallel Hamming-Distance Associative Memories IEICE TRANS. ELECTRON., VOL.E87 C, NO.11 NOVEMBER 2004 1847 PAPER Special Section on New System Paradigms for Integrated Electronics Hierarchical Multi-Chip Architecture for High Capacity Scalability of

More information

Unit 5 DOS INTERRPUTS

Unit 5 DOS INTERRPUTS Unit 5 DOS INTERRPUTS 5.1 Introduction The DOS (Disk Operating System) provides a large number of procedures to access devices, files and memory. These procedures can be called in any user program using

More information

Computer Organization. 8th Edition. Chapter 5 Internal Memory

Computer Organization. 8th Edition. Chapter 5 Internal Memory William Stallings Computer Organization and Architecture 8th Edition Chapter 5 Internal Memory Semiconductor Memory Types Memory Type Category Erasure Write Mechanism Volatility Random-access memory (RAM)

More information

Computer Architecture

Computer Architecture Computer Architecture Slide Sets WS 2013/2014 Prof. Dr. Uwe Brinkschulte M.Sc. Benjamin Betting Part 10 Thread and Task Level Parallelism Computer Architecture Part 10 page 1 of 36 Prof. Dr. Uwe Brinkschulte,

More information

In this tutorial, we will discuss the architecture, pin diagram and other key concepts of microprocessors.

In this tutorial, we will discuss the architecture, pin diagram and other key concepts of microprocessors. About the Tutorial A microprocessor is a controlling unit of a micro-computer, fabricated on a small chip capable of performing Arithmetic Logical Unit (ALU) operations and communicating with the other

More information

Lab 16: Tri-State Busses and Memory U.C. Davis Physics 116B Note: We may use a more modern RAM chip. Pinouts, etc. will be provided.

Lab 16: Tri-State Busses and Memory U.C. Davis Physics 116B Note: We may use a more modern RAM chip. Pinouts, etc. will be provided. Lab 16: Tri-State Busses and Memory U.C. Davis Physics 116B Note: We may use a more modern RAM chip. Pinouts, etc. will be provided. INTRODUCTION In this lab, you will build a fairly large circuit that

More information

Memory in Digital Systems

Memory in Digital Systems MEMORIES Memory in Digital Systems Three primary components of digital systems Datapath (does the work) Control (manager) Memory (storage) Single bit ( foround ) Clockless latches e.g., SR latch Clocked

More information

The T0 Vector Microprocessor. Talk Outline

The T0 Vector Microprocessor. Talk Outline Slides from presentation at the Hot Chips VII conference, 15 August 1995.. The T0 Vector Microprocessor Krste Asanovic James Beck Bertrand Irissou Brian E. D. Kingsbury Nelson Morgan John Wawrzynek University

More information

Serial versus Parallel Data Transfers

Serial versus Parallel Data Transfers Serial versus Parallel Data Transfers 1 SHIFT REGISTERS: CONVERTING BETWEEN SERIAL AND PARALLEL DATA Serial communications Most communications is carried out over serial links Fewer wires needed Less electronics

More information

Implementing Tile-based Chip Multiprocessors with GALS Clocking Styles

Implementing Tile-based Chip Multiprocessors with GALS Clocking Styles Implementing Tile-based Chip Multiprocessors with GALS Clocking Styles Zhiyi Yu, Bevan Baas VLSI Computation Lab, ECE Department University of California, Davis, USA Outline Introduction Timing issues

More information

Real-time and smooth scalable video streaming system with bitstream extractor intellectual property implementation

Real-time and smooth scalable video streaming system with bitstream extractor intellectual property implementation LETTER IEICE Electronics Express, Vol.11, No.5, 1 6 Real-time and smooth scalable video streaming system with bitstream extractor intellectual property implementation Liang-Hung Wang 1a), Yi-Mao Hsiao

More information

A Low Power Asynchronous FPGA with Autonomous Fine Grain Power Gating and LEDR Encoding

A Low Power Asynchronous FPGA with Autonomous Fine Grain Power Gating and LEDR Encoding A Low Power Asynchronous FPGA with Autonomous Fine Grain Power Gating and LEDR Encoding N.Rajagopala krishnan, k.sivasuparamanyan, G.Ramadoss Abstract Field Programmable Gate Arrays (FPGAs) are widely

More information

EEL 4744C: Microprocessor Applications. Lecture 7. Part 1. Interrupt. Dr. Tao Li 1

EEL 4744C: Microprocessor Applications. Lecture 7. Part 1. Interrupt. Dr. Tao Li 1 EEL 4744C: Microprocessor Applications Lecture 7 Part 1 Interrupt Dr. Tao Li 1 M&M: Chapter 8 Or Reading Assignment Software and Hardware Engineering (new version): Chapter 12 Dr. Tao Li 2 Interrupt An

More information

Reading Assignment. Interrupt. Interrupt. Interrupt. EEL 4744C: Microprocessor Applications. Lecture 7. Part 1

Reading Assignment. Interrupt. Interrupt. Interrupt. EEL 4744C: Microprocessor Applications. Lecture 7. Part 1 Reading Assignment EEL 4744C: Microprocessor Applications Lecture 7 M&M: Chapter 8 Or Software and Hardware Engineering (new version): Chapter 12 Part 1 Interrupt Dr. Tao Li 1 Dr. Tao Li 2 Interrupt An

More information

CAD Technology of the SX-9

CAD Technology of the SX-9 KONNO Yoshihiro, IKAWA Yasuhiro, SAWANO Tomoki KANAMARU Keisuke, ONO Koki, KUMAZAKI Masahito Abstract This paper outlines the design techniques and CAD technology used with the SX-9. The LSI and package

More information

A LOSSLESS INDEX CODING ALGORITHM AND VLSI DESIGN FOR VECTOR QUANTIZATION

A LOSSLESS INDEX CODING ALGORITHM AND VLSI DESIGN FOR VECTOR QUANTIZATION A LOSSLESS INDEX CODING ALGORITHM AND VLSI DESIGN FOR VECTOR QUANTIZATION Ming-Hwa Sheu, Sh-Chi Tsai and Ming-Der Shieh Dept. of Electronic Eng., National Yunlin Univ. of Science and Technology, Yunlin,

More information

Novel Multimedia Instruction Capabilities in VLIW Media Processors. Contents

Novel Multimedia Instruction Capabilities in VLIW Media Processors. Contents Novel Multimedia Instruction Capabilities in VLIW Media Processors J. T. J. van Eijndhoven 1,2 F. W. Sijstermans 1 (1) Philips Research Eindhoven (2) Eindhoven University of Technology The Netherlands

More information

Basic FPGA Architectures. Actel FPGAs. PLD Technologies: Antifuse. 3 Digital Systems Implementation Programmable Logic Devices

Basic FPGA Architectures. Actel FPGAs. PLD Technologies: Antifuse. 3 Digital Systems Implementation Programmable Logic Devices 3 Digital Systems Implementation Programmable Logic Devices Basic FPGA Architectures Why Programmable Logic Devices (PLDs)? Low cost, low risk way of implementing digital circuits as application specific

More information

Configurable Processors for SOC Design. Contents crafted by Technology Evangelist Steve Leibson Tensilica, Inc.

Configurable Processors for SOC Design. Contents crafted by Technology Evangelist Steve Leibson Tensilica, Inc. Configurable s for SOC Design Contents crafted by Technology Evangelist Steve Leibson Tensilica, Inc. Why Listen to This Presentation? Understand how SOC design techniques, now nearly 20 years old, are

More information

Designing High-Speed ATM Switch Fabrics by Using Actel FPGAs

Designing High-Speed ATM Switch Fabrics by Using Actel FPGAs pplication Note C105 esigning High-Speed TM Switch Fabrics by Using ctel FPGs The recent upsurge of interest in synchronous Transfer Mode (TM) is based on the recognition that it represents a new level

More information

Intel released new technology call P6P

Intel released new technology call P6P P6 and IA-64 8086 released on 1978 Pentium release on 1993 8086 has upgrade by Pipeline, Super scalar, Clock frequency, Cache and so on But 8086 has limit, Hard to improve efficiency Intel released new

More information

PACE: Power-Aware Computing Engines

PACE: Power-Aware Computing Engines PACE: Power-Aware Computing Engines Krste Asanovic Saman Amarasinghe Martin Rinard Computer Architecture Group MIT Laboratory for Computer Science http://www.cag.lcs.mit.edu/ PACE Approach Energy- Conscious

More information

Design and Implementation of High Performance Application Specific Memory

Design and Implementation of High Performance Application Specific Memory Design and Implementation of High Performance Application Specific Memory - 고성능 Application Specific Memory 의설계와구현 - M.S. Thesis Sungdae Choi Dec. 20th, 2002 Outline Introduction Memory for Mobile 3D Graphics

More information

Accelerated Computing Jun Makino Interactive Research Center of Science Tokyo Institute of Technology

Accelerated Computing Jun Makino Interactive Research Center of Science Tokyo Institute of Technology Accelerated Computing Jun Makino Interactive Research Center of Science Tokyo Institute of Technology IEEE Cluster 2011 Sept 28, 2011 Satoshi s three questions 1. What does accelerated computing solve

More information

Chapter 5B. Large and Fast: Exploiting Memory Hierarchy

Chapter 5B. Large and Fast: Exploiting Memory Hierarchy Chapter 5B Large and Fast: Exploiting Memory Hierarchy One Transistor Dynamic RAM 1-T DRAM Cell word access transistor V REF TiN top electrode (V REF ) Ta 2 O 5 dielectric bit Storage capacitor (FET gate,

More information

1. INTRODUCTION TO MICROPROCESSOR AND MICROCOMPUTER ARCHITECTURE:

1. INTRODUCTION TO MICROPROCESSOR AND MICROCOMPUTER ARCHITECTURE: 1. INTRODUCTION TO MICROPROCESSOR AND MICROCOMPUTER ARCHITECTURE: A microprocessor is a programmable electronics chip that has computing and decision making capabilities similar to central processing unit

More information

CISC Attributes. E.g. Pentium is considered a modern CISC processor

CISC Attributes. E.g. Pentium is considered a modern CISC processor What is CISC? CISC means Complex Instruction Set Computer chips that are easy to program and which make efficient use of memory. Since the earliest machines were programmed in assembly language and memory

More information

Memory & Simple I/O Interfacing

Memory & Simple I/O Interfacing Chapter 10 Memory & Simple I/O Interfacing Expected Outcomes Explain the importance of tri-state devices in microprocessor system Distinguish basic type of semiconductor memory and their applications Relate

More information

The Earth Simulator System

The Earth Simulator System Architecture and Hardware for HPC Special Issue on High Performance Computing The Earth Simulator System - - - & - - - & - By Shinichi HABATA,* Mitsuo YOKOKAWA and Shigemune KITAWAKI The Earth Simulator,

More information

High Performance Interconnect and NoC Router Design

High Performance Interconnect and NoC Router Design High Performance Interconnect and NoC Router Design Brinda M M.E Student, Dept. of ECE (VLSI Design) K.Ramakrishnan College of Technology Samayapuram, Trichy 621 112 brinda18th@gmail.com Devipoonguzhali

More information

SH-Mobile LSIs for Cell Phones

SH-Mobile LSIs for Cell Phones Hitachi Review 1 SH-Mobile LSIs for Cell Phones Toshinobu Kanai Hiroshi Yagi Junichi Nishimoto Ikuya Kawasaki OVERVIEW: With the increasing number of functions performed by cell phones, processors are

More information