Resource Efficiency of Scalable Processor Architectures for SDR-based Applications


1 Resource Efficiency of Scalable Processor Architectures for SDR-based Applications
Thorsten Jungeblut (1), Johannes Ax (2), Gregor Sievers (2), Boris Hübener (2), Mario Porrmann (2), Ulrich Rückert (1)
(1) Cognitive Interaction Technology Center of Excellence (CITEC), Bielefeld University
(2) Heinz Nixdorf Institute, University of Paderborn

2 Motivation: Resource-efficient processor architectures
- Increasing complexity of mobile applications
  - More functionality
  - New communications standards (LTE Advanced: 1 Gbit/s)
  - Multimedia applications (video, 3-D, ...)
- Static hardware solutions -> flexible software implementations (e.g., software-defined radio, SDR)
- Powerful CPU necessary
- High requirements on resource efficiency!
RADCOM 2011 Slide 2

3 Motivation: VLIW architectures
- RISC architectures allow for higher performance by increasing the clock frequency -> power consumption increases
- Parallel architectures enable high performance at a reasonably low clock frequency -> high resource efficiency
- Compared to superscalar architectures, VLIW (Very Long Instruction Word) architectures leave the scheduling to the compiler -> low resource requirements
[Figure: n parallel pipelines (slots 0 to n-1) with the stages FE (Fetch), DC (Decode), RD (Register Read), EX, ME (LD/ST) and WR (Register Write), attached to instruction memory and data memory]
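The frequency argument on this slide can be made concrete with the standard first-order model of dynamic CMOS power. The derivation below is a sketch under stated assumptions that are not taken from the slides: the workload parallelizes ideally over n slots, the switched capacitance grows linearly with n, and the lower clock frequency permits a lower supply voltage V' < V.

```latex
% First-order dynamic power of a single-issue core at frequency f:
P_{1} = \alpha \, C \, V^{2} \, f
% n parallel slots at frequency f/n deliver the same throughput (ideal case).
% Capacitance scales to nC, and the relaxed timing allows a voltage V' < V:
P_{n} = \alpha \, (nC) \, V'^{2} \, \frac{f}{n} = \alpha \, C \, V'^{2} \, f
% Ratio of the two:
\frac{P_{n}}{P_{1}} = \left( \frac{V'}{V} \right)^{2} < 1
```

In practice the gain is smaller, because real code rarely fills all slots and the extra units add leakage and area, which is why the deck treats the slot count as a design-space parameter.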

4 Design-space Exploration (DSE): Design flow
Goal: Automated design flow!
[Figure: design-flow diagram with the following blocks]
- Specification: definition of the semantic (Vice-UPSLA), reference specification, RTL description (planned)
- Software side: benchmarks, source code, UPSLA (C-)compiler, assembler code, assembler, object files, linker, executable, instruction set simulator, profiling statistics, visualization/profiling
- Hardware side: RTL code, RTL simulation, synthesis, netlist, emulator (prototype), ASIC realization
- Results: functional verification, resource efficiency

5 Modular VLIW-architecture
[Figure: pipeline with the stages FE (Fetch), DC (Decode), RD (Register Read), EX, ME (LD/ST) and WR (Register Write) for slots 0 to n-1, with instruction memory, data memory, bypass, condition register and registers]

6 DSE (core level)
- Trade-off between
  - clock cycles
  - clock frequency
  - area requirements
  - power consumption
[Figure: normalized area/power, normalized latency and normalized number of clock cycles (scale 0 to 1) over the number of VLIW slots (1 to 4)]

7 DSE (core level)
- Trade-off between
  - clock cycles
  - clock frequency
  - area requirements
  - power consumption
The architecture:
- 4x ALUs
- 2x Multiply-accumulate
- 2x Division-step
- 2x Load/Store
- 31 general purpose registers
- 8 condition registers
[Figure: pipeline (FE, DC, RD, EX, ME, WR) with instruction cache, data cache, bypass, condition register and registers; the LD/ST, multiply (*) and divide (/) units are labeled 0/2 and 1/3]

8 VLIW-architecture: Key features
- Harvard architecture (LD/ST architecture)
- Six-staged pipeline (non-interlocked)
- Instruction compression
- 31 general purpose registers, 2x8 bit condition register
- 41 base instructions
- 15 SIMD instructions
- 1-cycle instructions, 1 cycle latency (MLA, BR, LDW: 2 cycles; DIV: 32 cycles)
- Parameterizable instruction alignment buffer (L0-cache)
- ARM-like instruction set (binary compiler)
- Comprehensive pipeline bypass
- 16-bit SIMD mode
[Figure: format of the RISC instructions with the fields st, cond, Opcode, Rn, Rd, Rm or immediate]
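To illustrate what a 16-bit SIMD mode means for a datapath with 32-bit registers, the sketch below treats one 32-bit word as two independent 16-bit lanes. It is a generic illustration, not the processor's actual instruction semantics: the function names and the wrap-around behavior per lane are assumptions made here.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def simd16_add(a: int, b: int) -> int:
    """Add two 32-bit words as two independent 16-bit lanes.

    Each lane wraps around on overflow; no carry crosses from the
    low lane into the high lane (assumed behavior, for illustration).
    """
    lo = ((a & MASK16) + (b & MASK16)) & MASK16
    hi = (((a >> 16) & MASK16) + ((b >> 16) & MASK16)) & MASK16
    return (hi << 16) | lo


def scalar32_add(a: int, b: int) -> int:
    """Ordinary 32-bit addition, for comparison."""
    return (a + b) & MASK32


if __name__ == "__main__":
    a, b = 0x0001FFFF, 0x00010001
    print(hex(simd16_add(a, b)))    # lanes: 1+1 and 0xFFFF+1 (wraps to 0)
    print(hex(scalar32_add(a, b)))  # carry propagates into the upper half
```

The two results differ exactly in the carry out of the low half-word, which is the property a SIMD adder has to suppress.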

9 DSE (system level): Hardware accelerators (HWACC)
- Address decoding: 0: data memory, 1: hardware extension
- Address fields: module number, module address
[Figure: pipeline (FE, DC, RD, EX, ME, WR) with instruction memory, bypass, condition register and registers; an address decoder connects the LD/ST units to the data memory and to HWACC #1, HWACC #2 and HWACC #3]
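The address decoding named on this slide can be sketched as follows. The slide only gives the selector values (0: data memory, 1: hardware extension) and the field names; the 32-bit address width, the use of the top bit as selector and the 7-bit module number are hypothetical choices made for this example.

```python
from typing import NamedTuple

ADDR_BITS = 32        # assumed address width
MODULE_NUM_BITS = 7   # hypothetical width of the module number field
MODULE_ADDR_BITS = ADDR_BITS - 1 - MODULE_NUM_BITS  # remaining offset bits


class Decoded(NamedTuple):
    target: str          # "data_memory" or "hw_extension"
    module_number: int   # which extension module (0 for data memory)
    module_address: int  # offset inside the selected target


def decode(addr: int) -> Decoded:
    """Split a load/store address into target, module number and offset."""
    if not 0 <= addr < (1 << ADDR_BITS):
        raise ValueError("address out of range")
    selector = addr >> (ADDR_BITS - 1)  # top bit: 0 = data memory, 1 = extension
    if selector == 0:
        return Decoded("data_memory", 0, addr)
    module_number = (addr >> MODULE_ADDR_BITS) & ((1 << MODULE_NUM_BITS) - 1)
    module_address = addr & ((1 << MODULE_ADDR_BITS) - 1)
    return Decoded("hw_extension", module_number, module_address)


if __name__ == "__main__":
    print(decode(0x00001000))  # ordinary data-memory access
    print(decode(0x82000010))  # extension module 2, offset 0x10
```

With such a scheme the accelerators need no new instructions: ordinary loads and stores reach them through the same LD/ST units as the data memory.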

10 DSE (system level): Hardware accelerators (HWACC)

Hardware accelerator | Processing time (speedup) | Area requirements | Power consumption | Energy efficiency
CRC                  | -87 % (x 8)               | [...] %           | [...] %           | x 6.8
ECC                  | -93 % (x 14)              | +30 %             | +30 %             | x 11.0
IEEE b               | -88 % (x 8)               | +40 %             | +19 %             | x 6.0
AES                  | [...] % (x 66)            | +39 %             | +54 %             | x 43.0

Additional hardware extensions:
- UART (debugging)
- Ethernet MAC
- Clock counter
- FIFOs
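The energy-efficiency column can be related to the other columns if energy is taken as power multiplied by processing time. That definition is an assumption about how the slide's figures were derived, and the function names below are illustrative. Under this assumption the ECC row is reproduced; rows with missing or rounded inputs may deviate.

```python
def gain_from_percentages(time_reduction_pct: float, power_increase_pct: float) -> float:
    """Energy-efficiency gain, assuming energy = power * time."""
    time_factor = 1.0 - time_reduction_pct / 100.0   # e.g. -93 % -> 0.07
    power_factor = 1.0 + power_increase_pct / 100.0  # e.g. +30 % -> 1.30
    return 1.0 / (time_factor * power_factor)


def gain_from_speedup(speedup: float, power_increase_pct: float) -> float:
    """Same quantity, starting from the speedup instead of the time reduction."""
    return speedup / (1.0 + power_increase_pct / 100.0)


if __name__ == "__main__":
    # ECC row: -93 % processing time, +30 % power
    print(round(gain_from_percentages(93, 30), 1))  # 11.0, as in the table
    # AES row: speedup x 66, +54 % power
    print(round(gain_from_speedup(66, 54), 1))      # 42.9, close to the table's x 43.0
```

The area column does not enter this calculation; it is the price paid in silicon for the energy saving.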

11 DSE (system level): Flexible integration of HWACCs
- L1-cache, external SDRAM
- Instruction set extensions (ISE)
- Hardware accelerators
- Generic interface for external hardware extension
[Figure: CPU with ISEs, instruction cache and data cache on a system bus with arbiter and SDRAM; memory-mapped (MMIO) blocks for CRC, ECC, IEEE b and Ethernet MAC; a local bus interface to external HWACCs on FPGA or ASIC; Ethernet PHYs between the RAPTOR system and the host PC]

12 FPGA prototype based on RAPTOR system
[Figure: four-slot pipeline (FE, DC, RD, EX, ME, WR) with L1 instruction cache, L1 data cache, bypass, condition register and registers, plus an Ethernet MAC with four PHYs]

13 FPGA prototype based on RAPTOR system
[Figure: processor block diagram as on slide 12, next to the block diagram of the DB-USRP module]
- FPGA: Xilinx Spartan-3A DSP XC3SD3400A (XC3SD1800A compatible)
- MxFE processor AD9862
  - D/A transmit signal path (TX): two 14-bit, 128 MSPS
  - A/D receive signal path (RX): two 12-bit, 64 MSPS
  - AUX ADC, AUX DAC
- Connectors TX-DB and RX-DB
- 2 x 51 high-speed differential I/Os each to the previous and the next module
- Local bus (85 I/Os)
- Power subsystem (externally supplied voltage), I2C, configuration and management via JTAG

14 FPGA prototype based on RAPTOR system
- 128 MSPS A/D converter
- Spartan-3A DSP FPGA for data pre-processing
- Modular approach
- 0-5 GHz transceivers
[Figure: same block diagrams as on slide 13]

15 ASIC prototype in 65 nm
- Standard cell implementation
- 65 nm STMicroelectronics
- [...] MHz
- [...] mm² area requirements
- 32 kB L1-cache
- ~100 mW power consumption
[Figure: package, test board, die photo/layout]

16 DSE (NoC level)
GigaNoC based on CPU

17 DSE (NoC level)
GigaNoC based on CPU
- 2D mesh topology
- Wormhole routing
- Highly scalable switching
- 5 I/O ports per SB
- 750 MHz, 0.5 mm²/SB
- 24 Gbit/s link bandwidth
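Two of the figures on this slide can be related by simple arithmetic. The sketch below assumes that a link runs at the stated 750 MHz and transfers one word per clock cycle, which the slide does not say explicitly. It also gives the minimal hop count between two nodes of a 2D mesh, which follows from the topology alone; the function names are illustrative.

```python
def bits_per_cycle(link_bandwidth_bit_s: float, clock_hz: float) -> float:
    """Link width implied by bandwidth and clock, assuming one transfer per cycle."""
    return link_bandwidth_bit_s / clock_hz


def mesh_hops(src: tuple[int, int], dst: tuple[int, int]) -> int:
    """Minimal number of links between two (x, y) nodes in a 2D mesh."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


if __name__ == "__main__":
    print(bits_per_cycle(24e9, 750e6))  # 32.0 bits per clock cycle
    print(mesh_hops((0, 0), (3, 2)))    # 5 links on any minimal route
```

Under the stated assumption, a 24 Gbit/s link at 750 MHz corresponds to a 32-bit data path, which matches the word size of a 32-bit processor core.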

18 DSE (NoC level): IEEE b application
[Figure: mapping of the application onto processing elements connected through NoC ports; blocks: scrambling, differential encoding, symbol mapping, FIR filter (I-part), FIR filter (Q-part), synchronization]
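The first three blocks of this processing chain can be sketched in a few lines. The slides do not give the scrambler polynomial, the initial state or the symbol alphabet, so the 7-bit self-synchronizing scrambler with feedback taps at delays 4 and 7, the seed value and the binary +1/-1 mapping below are assumptions chosen purely for illustration.

```python
SEED = 0b1101100  # hypothetical initial scrambler state


def scramble(bits, state=SEED):
    """Self-synchronizing scrambler: out[k] = in[k] ^ out[k-4] ^ out[k-7]."""
    out = []
    for b in bits:
        feedback = ((state >> 3) & 1) ^ ((state >> 6) & 1)
        s = b ^ feedback
        out.append(s)
        state = ((state << 1) | s) & 0x7F  # shift the new output bit in
    return out


def descramble(bits, state=SEED):
    """Inverse operation: the shift register is fed with the received bits."""
    out = []
    for s in bits:
        feedback = ((state >> 3) & 1) ^ ((state >> 6) & 1)
        out.append(s ^ feedback)
        state = ((state << 1) | s) & 0x7F
    return out


def diff_encode(bits, prev=0):
    """Differential encoding: out[k] = in[k] ^ out[k-1]."""
    out = []
    for b in bits:
        prev ^= b
        out.append(prev)
    return out


def map_symbols(bits):
    """Binary symbol mapping: 0 -> +1, 1 -> -1 (assumed alphabet)."""
    return [1 - 2 * b for b in bits]


if __name__ == "__main__":
    payload = [1, 0, 1, 1, 0, 0, 1, 0]
    symbols = map_symbols(diff_encode(scramble(payload)))
    print(symbols)
```

Each stage only consumes the output of the previous one, which is what makes the chain easy to distribute over several processing elements as a pipeline.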

19 DSE (NoC level): IEEE b application
[Figure: application mapping as on slide 18, with a plot of clock cycles per byte over input data [bytes] for NoC (1 PE), NoC (2 PE) and NoC (4 PE)]

20 Thank you for your attention
Dipl.-Ing. Thorsten Jungeblut
Cognitive Interaction Technology Center of Excellence (CITEC), Bielefeld University
Universitätsstraße, Bielefeld
tj@cit-ec.uni-bielefeld.de

Design Space Exploration for Memory Subsystems of VLIW Architectures

Design Space Exploration for Memory Subsystems of VLIW Architectures E University of Paderborn Dr.-Ing. Mario Porrmann Design Space Exploration for Memory Subsystems of VLIW Architectures Thorsten Jungeblut 1, Gregor Sievers, Mario Porrmann 1, Ulrich Rückert 2 1 System

More information

CoreVA-MPSoC: A Many-core Architecture with Tightly Coupled Shared and Local Data Memories

CoreVA-MPSoC: A Many-core Architecture with Tightly Coupled Shared and Local Data Memories IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, POST-PRINT, DECEMBER 2017 1 CoreVA-MPSoC: A Many-core Architecture with Tightly Coupled and Local Data Memories Johannes Ax, Gregor Sievers, Julian

More information

The Nios II Family of Configurable Soft-core Processors

The Nios II Family of Configurable Soft-core Processors The Nios II Family of Configurable Soft-core Processors James Ball August 16, 2005 2005 Altera Corporation Agenda Nios II Introduction Configuring your CPU FPGA vs. ASIC CPU Design Instruction Set Architecture

More information

VLIW DSP Processor Design for Mobile Communication Applications. Contents crafted by Dr. Christian Panis Catena Radio Design

VLIW DSP Processor Design for Mobile Communication Applications. Contents crafted by Dr. Christian Panis Catena Radio Design VLIW DSP Processor Design for Mobile Communication Applications Contents crafted by Dr. Christian Panis Catena Radio Design Agenda Trends in mobile communication Architectural core features with significant

More information

Embedded Systems. 7. System Components

Embedded Systems. 7. System Components Embedded Systems 7. System Components Lothar Thiele 7-1 Contents of Course 1. Embedded Systems Introduction 2. Software Introduction 7. System Components 10. Models 3. Real-Time Models 4. Periodic/Aperiodic

More information

Intelop. *As new IP blocks become available, please contact the factory for the latest updated info.

Intelop. *As new IP blocks become available, please contact the factory for the latest updated info. A FPGA based development platform as part of an EDK is available to target intelop provided IPs or other standard IPs. The platform with Virtex-4 FX12 Evaluation Kit provides a complete hardware environment

More information

System-Level Analysis of Network Interfaces for Hierarchical MPSoCs

System-Level Analysis of Network Interfaces for Hierarchical MPSoCs System-Level Analysis of Network Interfaces for Hierarchical MPSoCs Johannes Ax *, Gregor Sievers *, Martin Flasskamp *, Wayne Kelly, Thorsten Jungeblut *, and Mario Porrmann * * Cognitronics and Sensor

More information

Towards Optimal Custom Instruction Processors

Towards Optimal Custom Instruction Processors Towards Optimal Custom Instruction Processors Wayne Luk Kubilay Atasu, Rob Dimond and Oskar Mencer Department of Computing Imperial College London HOT CHIPS 18 Overview 1. background: extensible processors

More information

A 1-GHz Configurable Processor Core MeP-h1

A 1-GHz Configurable Processor Core MeP-h1 A 1-GHz Configurable Processor Core MeP-h1 Takashi Miyamori, Takanori Tamai, and Masato Uchiyama SoC Research & Development Center, TOSHIBA Corporation Outline Background Pipeline Structure Bus Interface

More information

Hi Hsiao-Lung Chan, Ph.D. Dept Electrical Engineering Chang Gung University, Taiwan

Hi Hsiao-Lung Chan, Ph.D. Dept Electrical Engineering Chang Gung University, Taiwan Processors Hi Hsiao-Lung Chan, Ph.D. Dept Electrical Engineering Chang Gung University, Taiwan chanhl@maili.cgu.edu.twcgu General-purpose p processor Control unit Controllerr Control/ status Datapath ALU

More information

FiPS and M2DC: Novel Architectures for Reconfigurable Hyperscale Servers

FiPS and M2DC: Novel Architectures for Reconfigurable Hyperscale Servers FiPS and M2DC: Novel Architectures for Reconfigurable Hyperscale Servers Rene Griessl, Meysam Peykanu, Lennart Tigges, Jens Hagemeyer, Mario Porrmann Center of Excellence Cognitive Interaction Technology

More information

RAPTOR A Scalable Platform for Rapid Prototyping and FPGA-based Cluster Computing

RAPTOR A Scalable Platform for Rapid Prototyping and FPGA-based Cluster Computing RAPTOR A Scalable Platform for Rapid Prototyping and FPGA-based Cluster Computing Mario PORRMANN, Jens HAGEMEYER, Johannes ROMOTH, Manuel STRUGHOLTZ, and Christopher POHL 1 Heinz Nixdorf Institute, University

More information

DRPM architecture overview

DRPM architecture overview DRPM architecture overview Jens Hagemeyer, Dirk Jungewelter, Dario Cozzi, Sebastian Korf, Mario Porrmann Center of Excellence Cognitive action Technology, Bielefeld University, Germany Project partners:

More information

ARM Cortex core microcontrollers 3. Cortex-M0, M4, M7

ARM Cortex core microcontrollers 3. Cortex-M0, M4, M7 ARM Cortex core microcontrollers 3. Cortex-M0, M4, M7 Scherer Balázs Budapest University of Technology and Economics Department of Measurement and Information Systems BME-MIT 2018 Trends of 32-bit microcontrollers

More information

General Purpose Signal Processors

General Purpose Signal Processors General Purpose Signal Processors First announced in 1978 (AMD) for peripheral computation such as in printers, matured in early 80 s (TMS320 series). General purpose vs. dedicated architectures: Pros:

More information

Intellectual Property Macrocell for. SpaceWire Interface. Compliant with AMBA-APB Bus

Intellectual Property Macrocell for. SpaceWire Interface. Compliant with AMBA-APB Bus Intellectual Property Macrocell for SpaceWire Interface Compliant with AMBA-APB Bus L. Fanucci, A. Renieri, P. Terreni Tel. +39 050 2217 668, Fax. +39 050 2217522 Email: luca.fanucci@iet.unipi.it - 1 -

More information

Embedded Systems: Hardware Components (part I) Todor Stefanov

Embedded Systems: Hardware Components (part I) Todor Stefanov Embedded Systems: Hardware Components (part I) Todor Stefanov Leiden Embedded Research Center Leiden Institute of Advanced Computer Science Leiden University, The Netherlands Outline Generic Embedded System

More information

Field Programmable Gate Array (FPGA) Devices

Field Programmable Gate Array (FPGA) Devices Field Programmable Gate Array (FPGA) Devices 1 Contents Altera FPGAs and CPLDs CPLDs FPGAs with embedded processors ACEX FPGAs Cyclone I,II FPGAs APEX FPGAs Stratix FPGAs Stratix II,III FPGAs Xilinx FPGAs

More information

Cognitive Radio Platform Research at WINLAB

Cognitive Radio Platform Research at WINLAB Cognitive Radio Platform Research at WINLAB December 2, 2010 Zoran Miljanic and Ivan Seskar WINLAB Rutgers University www.winlab.rutgers.edu 1 WiNC2R objectives Programmable processing of phy and higher

More information

Microprocessors, Lecture 1: Introduction to Microprocessors

Microprocessors, Lecture 1: Introduction to Microprocessors Microprocessors, Lecture 1: Introduction to Microprocessors Computing Systems General-purpose standalone systems (سيستم ھای نھفته ( systems Embedded 2 General-purpose standalone systems Stand-alone computer

More information

Configurable and Extensible Processors Change System Design. Ricardo E. Gonzalez Tensilica, Inc.

Configurable and Extensible Processors Change System Design. Ricardo E. Gonzalez Tensilica, Inc. Configurable and Extensible Processors Change System Design Ricardo E. Gonzalez Tensilica, Inc. Presentation Overview Yet Another Processor? No, a new way of building systems Puts system designers in the

More information

Embedded Systems. 8. Hardware Components. Lothar Thiele. Computer Engineering and Networks Laboratory

Embedded Systems. 8. Hardware Components. Lothar Thiele. Computer Engineering and Networks Laboratory Embedded Systems 8. Hardware Components Lothar Thiele Computer Engineering and Networks Laboratory Do you Remember? 8 2 8 3 High Level Physical View 8 4 High Level Physical View 8 5 Implementation Alternatives

More information

Digital Signal Processor Core Technology

Digital Signal Processor Core Technology The World Leader in High Performance Signal Processing Solutions Digital Signal Processor Core Technology Abhijit Giri Satya Simha November 4th 2009 Outline Introduction to SHARC DSP ADSP21469 ADSP2146x

More information

The WINLAB Cognitive Radio Platform

The WINLAB Cognitive Radio Platform The WINLAB Cognitive Radio Platform IAB Meeting, Fall 2007 Rutgers, The State University of New Jersey Ivan Seskar Software Defined Radio/ Cognitive Radio Terminology Software Defined Radio (SDR) is any

More information

A HT3 Platform for Rapid Prototyping and High Performance Reconfigurable Computing

A HT3 Platform for Rapid Prototyping and High Performance Reconfigurable Computing A HT3 Platform for Rapid Prototyping and High Performance Reconfigurable Computing Second International Workshop on HyperTransport Research and Application (WHTRA 2011) University of Heidelberg Computer

More information

Course Introduction. Purpose: Objectives: Content: Learning Time:

Course Introduction. Purpose: Objectives: Content: Learning Time: Course Introduction Purpose: This course provides an overview of the Renesas SuperH series of 32-bit RISC processors, especially the microcontrollers in the SH-2 and SH-2A series Objectives: Learn the

More information

One instruction specifies multiple operations All scheduling of execution units is static

One instruction specifies multiple operations All scheduling of execution units is static VLIW Architectures Very Long Instruction Word Architecture One instruction specifies multiple operations All scheduling of execution units is static Done by compiler Static scheduling should mean less

More information

An introduction to DSP s. Examples of DSP applications Why a DSP? Characteristics of a DSP Architectures

An introduction to DSP s. Examples of DSP applications Why a DSP? Characteristics of a DSP Architectures An introduction to DSP s Examples of DSP applications Why a DSP? Characteristics of a DSP Architectures DSP example: mobile phone DSP example: mobile phone with video camera DSP: applications Why a DSP?

More information

04 - DSP Architecture and Microarchitecture

04 - DSP Architecture and Microarchitecture September 11, 2015 Memory indirect addressing (continued from last lecture) ; Reality check: Data hazards! ; Assembler code v3: repeat 256,endloop load r0,dm1[dm0[ptr0++]] store DM0[ptr1++],r0 endloop:

More information

A Closer Look at the Epiphany IV 28nm 64 core Coprocessor. Andreas Olofsson PEGPUM 2013

A Closer Look at the Epiphany IV 28nm 64 core Coprocessor. Andreas Olofsson PEGPUM 2013 A Closer Look at the Epiphany IV 28nm 64 core Coprocessor Andreas Olofsson PEGPUM 2013 1 Adapteva Achieves 3 World Firsts 1. First processor company to reach 50 GFLOPS/W 3. First semiconductor company

More information

Basic Computer Architecture

Basic Computer Architecture Basic Computer Architecture CSCE 496/896: Embedded Systems Witawas Srisa-an Review of Computer Architecture Credit: Most of the slides are made by Prof. Wayne Wolf who is the author of the textbook. I

More information

CPU Project in Western Digital: From Embedded Cores for Flash Controllers to Vision of Datacenter Processors with Open Interfaces

CPU Project in Western Digital: From Embedded Cores for Flash Controllers to Vision of Datacenter Processors with Open Interfaces CPU Project in Western Digital: From Embedded Cores for Flash Controllers to Vision of Datacenter Processors with Open Interfaces Zvonimir Z. Bandic, Sr. Director Robert Golla, Sr. Fellow Dejan Vucinic,

More information

ENHANCED TOOLS FOR RISC-V PROCESSOR DEVELOPMENT

ENHANCED TOOLS FOR RISC-V PROCESSOR DEVELOPMENT ENHANCED TOOLS FOR RISC-V PROCESSOR DEVELOPMENT THE FREE AND OPEN RISC INSTRUCTION SET ARCHITECTURE Codasip is the leading provider of RISC-V processor IP Codasip Bk: A portfolio of RISC-V processors Uniquely

More information

Architectures & instruction sets R_B_T_C_. von Neumann architecture. Computer architecture taxonomy. Assembly language.

Architectures & instruction sets R_B_T_C_. von Neumann architecture. Computer architecture taxonomy. Assembly language. Architectures & instruction sets Computer architecture taxonomy. Assembly language. R_B_T_C_ 1. E E C E 2. I E U W 3. I S O O 4. E P O I von Neumann architecture Memory holds data and instructions. Central

More information

A ONE CHIP HARDENED SOLUTION FOR HIGH SPEED SPACEWIRE SYSTEM IMPLEMENTATIONS

A ONE CHIP HARDENED SOLUTION FOR HIGH SPEED SPACEWIRE SYSTEM IMPLEMENTATIONS A ONE CHIP HARDENED SOLUTION FOR HIGH SPEED SPACEWIRE SYSTEM IMPLEMENTATIONS Joseph R. Marshall, Richard W. Berger, Glenn P. Rakow Conference Contents Standards & Topology ASIC Program History ASIC Features

More information

The Benefits of FPGA-Enabled Instruments in RF and Communications Test. Johan Olsson National Instruments Sweden AB

The Benefits of FPGA-Enabled Instruments in RF and Communications Test. Johan Olsson National Instruments Sweden AB The Benefits of FPGA-Enabled Instruments in RF and Communications Test Johan Olsson National Instruments Sweden AB 1 Agenda Introduction to FPGAs in test New FPGA-enabled test applications FPGA for test

More information

ADPCM: Adaptive Differential Pulse Code Modulation

ADPCM: Adaptive Differential Pulse Code Modulation ADPCM: Adaptive Differential Pulse Code Modulation Motivation and introduction This is the final exercise. You have three weeks to complete this exercise, but you will need these three weeks! In this exercise,

More information

Fujitsu System Applications Support. Fujitsu Microelectronics America, Inc. 02/02

Fujitsu System Applications Support. Fujitsu Microelectronics America, Inc. 02/02 Fujitsu System Applications Support 1 Overview System Applications Support SOC Application Development Lab Multimedia VoIP Wireless Bluetooth Processors, DSP and Peripherals ARM Reference Platform 2 SOC

More information

CS 310 Embedded Computer Systems CPUS. Seungryoul Maeng

CS 310 Embedded Computer Systems CPUS. Seungryoul Maeng 1 EMBEDDED SYSTEM HW CPUS Seungryoul Maeng 2 CPUs Types of Processors CPU Performance Instruction Sets Processors used in ES 3 Processors used in ES 4 Processors used in Embedded Systems RISC type ARM

More information

A Synchronization Method for Register Traces of Pipelined Processors

A Synchronization Method for Register Traces of Pipelined Processors Synchronization Method for Register Traces of Pipelined Processors Ralf Dreesen 1, Thorsten Jungeblut 2, Michael Thies 1, Mario Porrmann 2, Uwe Kastens 1, Ulrich Rückert 2 1 University of Paderborn, Department

More information

Ettus Research Update

Ettus Research Update Ettus Research Update Matt Ettus Ettus Research GRCon13 Outline 1 Introduction 2 Recent New Products 3 Third Generation Introduction Who am I? Core GNU Radio contributor since 2001 Designed

More information

Pipeline Hazards. Midterm #2 on 11/29 5th and final problem set on 11/22 9th and final lab on 12/1. https://goo.gl/forms/hkuvwlhvuyzvdat42

Pipeline Hazards. Midterm #2 on 11/29 5th and final problem set on 11/22 9th and final lab on 12/1. https://goo.gl/forms/hkuvwlhvuyzvdat42 Pipeline Hazards https://goo.gl/forms/hkuvwlhvuyzvdat42 Midterm #2 on 11/29 5th and final problem set on 11/22 9th and final lab on 12/1 1 ARM 3-stage pipeline Fetch,, and Execute Stages Instructions are

More information

ARM Processors for Embedded Applications

ARM Processors for Embedded Applications ARM Processors for Embedded Applications Roadmap for ARM Processors ARM Architecture Basics ARM Families AMBA Architecture 1 Current ARM Core Families ARM7: Hard cores and Soft cores Cache with MPU or

More information

Simplifying FPGA Design for SDR with a Network on Chip Architecture

Simplifying FPGA Design for SDR with a Network on Chip Architecture Simplifying FPGA Design for SDR with a Network on Chip Architecture Matt Ettus Ettus Research GRCon13 Outline 1 Introduction 2 RF NoC 3 Status and Conclusions USRP FPGA Capability Gen

More information

Graduate Institute of Electronics Engineering, NTU 9/16/2004

Graduate Institute of Electronics Engineering, NTU 9/16/2004 / 9/16/2004 ACCESS IC LAB Overview of DSP Processor Current Status of NTU DSP Laboratory (E1-304) Course outline of Programmable DSP Lab Lab handout and final project DSP processor is a specially designed

More information

Overview. Think Silicon is a privately held company founded in 2007 by the core team of Atmel MMC IC group

Overview. Think Silicon is a privately held company founded in 2007 by the core team of Atmel MMC IC group Nema An OpenGL & OpenCL Embedded Programmable Engine Georgios Keramidas & Iakovos Stamoulis Think Silicon mobile GRAPHICS Overview Think Silicon is a privately held company founded in 2007 by the core

More information

SoC Platforms and CPU Cores

SoC Platforms and CPU Cores SoC Platforms and CPU Cores COE838: Systems on Chip Design http://www.ee.ryerson.ca/~courses/coe838/ Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan Electrical and Computer Engineering Ryerson University

More information

Processor and Peripheral IP Cores for Microcontrollers in Embedded Space Applications

Processor and Peripheral IP Cores for Microcontrollers in Embedded Space Applications Processor and Peripheral IP Cores for Microcontrollers in Embedded Space Applications Presentation at ADCSS 2010 MESA November 4 th, 2010 www.aeroflex.com/gaisler Presentation outline Microcontroller requirements

More information

Massively Parallel Processor Breadboarding (MPPB)

Massively Parallel Processor Breadboarding (MPPB) Massively Parallel Processor Breadboarding (MPPB) 28 August 2012 Final Presentation TRP study 21986 Gerard Rauwerda CTO, Recore Systems Gerard.Rauwerda@RecoreSystems.com Recore Systems BV P.O. Box 77,

More information

REAL TIME DIGITAL SIGNAL PROCESSING

REAL TIME DIGITAL SIGNAL PROCESSING REAL TIME DIGITAL SIGNAL PROCESSING UTN - FRBA 2011 www.electron.frba.utn.edu.ar/dplab Introduction Why Digital? A brief comparison with analog. Advantages Flexibility. Easily modifiable and upgradeable.

More information

FCUDA-NoC: A Scalable and Efficient Network-on-Chip Implementation for the CUDA-to-FPGA Flow

FCUDA-NoC: A Scalable and Efficient Network-on-Chip Implementation for the CUDA-to-FPGA Flow FCUDA-NoC: A Scalable and Efficient Network-on-Chip Implementation for the CUDA-to-FPGA Flow Abstract: High-level synthesis (HLS) of data-parallel input languages, such as the Compute Unified Device Architecture

More information

FPGA Implementation and Validation of the Asynchronous Array of simple Processors

FPGA Implementation and Validation of the Asynchronous Array of simple Processors FPGA Implementation and Validation of the Asynchronous Array of simple Processors Jeremy W. Webb VLSI Computation Laboratory Department of ECE University of California, Davis One Shields Avenue Davis,

More information

Blackfin Products. ...maximum performance at minimum space

Blackfin Products. ...maximum performance at minimum space Products...maximum performance at minimum space 1 TCM-BF518 CM-BF527 CM-BF533 CM-BF548 e CPU Analog Devices BF518 @ 400MHz Analog Devices BF527 @ 600MHz Analog Devices BF533 @ 600MHz RAM 32MByte SDRAM

More information

SOFTWARE DEFINED RADIO

SOFTWARE DEFINED RADIO SOFTWARE DEFINED RADIO USR SDR WORKSHOP, SEPTEMBER 2017 PROF. MARCELO SEGURA SESSION 1: SDR PLATFORMS 1 PARAMETER TO BE CONSIDER 2 Bandwidth: bigger band better analysis possibilities. Spurious free BW:

More information

UCT Software-Defined Radio Research Group

UCT Software-Defined Radio Research Group UCT Software-Defined Radio Research Group UCT SDRRG Team UCT Faculty: Alan Langman Mike Inggs Simon Winberg PhD Students: Brandon Hamilton MSc Students: Bruce Raw Gordon Inggs Simon Scott Joseph Wamicha

More information

Fujitsu SOC Fujitsu Microelectronics America, Inc.

Fujitsu SOC Fujitsu Microelectronics America, Inc. Fujitsu SOC 1 Overview Fujitsu SOC The Fujitsu Advantage Fujitsu Solution Platform IPWare Library Example of SOC Engagement Model Methodology and Tools 2 SDRAM Raptor AHB IP Controller Flas h DM A Controller

More information

OASIS Network-on-Chip Prototyping on FPGA

OASIS Network-on-Chip Prototyping on FPGA Master thesis of the University of Aizu, Feb. 20, 2012 OASIS Network-on-Chip Prototyping on FPGA m5141120, Kenichi Mori Supervised by Prof. Ben Abdallah Abderazek Adaptive Systems Laboratory, Master of

More information

Introduction to Sitara AM437x Processors

Introduction to Sitara AM437x Processors Introduction to Sitara AM437x Processors AM437x: Highly integrated, scalable platform with enhanced industrial communications and security AM4376 AM4378 Software Key Features AM4372 AM4377 High-performance

More information

MIPS Technologies MIPS32 M4K Synthesizable Processor Core By the staff of

MIPS Technologies MIPS32 M4K Synthesizable Processor Core By the staff of An Independent Analysis of the: MIPS Technologies MIPS32 M4K Synthesizable Processor Core By the staff of Berkeley Design Technology, Inc. OVERVIEW MIPS Technologies, Inc. is an Intellectual Property (IP)

More information

Universität Dortmund. ARM Architecture

Universität Dortmund. ARM Architecture ARM Architecture The RISC Philosophy Original RISC design (e.g. MIPS) aims for high performance through o reduced number of instruction classes o large general-purpose register set o load-store architecture

More information

ECE 353 Lab 4. General MIDI Explorer. Professor Daniel Holcomb Fall 2015

ECE 353 Lab 4. General MIDI Explorer. Professor Daniel Holcomb Fall 2015 ECE 353 Lab 4 General MIDI Explorer Professor Daniel Holcomb Fall 2015 Where are we in Course Lab 0 Cache Simulator in C C programming, data structures Cache architecture and analysis Lab 1 Heat Flow Modeling

More information

CONTACT: ,

CONTACT: , S.N0 Project Title Year of publication of IEEE base paper 1 Design of a high security Sha-3 keccak algorithm 2012 2 Error correcting unordered codes for asynchronous communication 2012 3 Low power multipliers

More information
