Embedded Systems: Projects
1 November 2016 Embedded Systems: Projects Davide Zoni PhD webpage: home.dei.polimi.it/zoni
2 Contacts & Places Prof. William Fornaciari (Professor in charge) webpage: home.dei.polimi.it/fornacia Davide Zoni PhD webpage: home.dei.polimi.it/zoni
3 Research Activities
RTL Design and Verification
- Embedded CPUs
- Cache Coherence Design
- Interconnect design
- Complex multi-core analysis
- Security-aware SoC design
Multi-core Design and Simulation
- Cache hierarchy in multi-cores
- NoC-cache design space exploration
- CPU-GPU architectures
- NoC optimization
4 Types of projects
Bibliographic research (3 points max)
- State of the art on a specific topic
- Material organization and presentation
- Comparing different approaches
Development project (9 points max)
- In-depth understanding of the tools you are working with
- Basic theoretical background for the problem
- SW coding/HW design
5 Projects 1 (Area: HW Design)
Title: RTL Router Design in SystemVerilog
Computer architecture, SystemVerilog
The on-chip router is the key component of a Network-on-Chip (NoC). The project requires designing and implementing a simple NoC router that supports virtual channels. The four-stage architecture is the baseline solution, while the speculative VA-SA implementation is a critical add-on for the project. The final design requires a complete testbench for regression testing.
1. SystemVerilog
2. Designing Network On-Chip Architectures in the Nanoscale Era, J. Flich and D. Bertozzi, 2010. Free download:
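Allocation stages in a router, such as the SA stage named above, need an arbiter to choose one winner among competing requests. The slide does not say which arbitration policy the project expects; the Python behavioural sketch below assumes a plain round-robin policy, and the function and parameter names are hypothetical. It could serve as a golden reference when writing the testbench.

```python
def round_robin_arbiter(requests, last_grant):
    """Return the index of the granted requester, or None if nobody requests.

    requests   -- list of booleans, one per requester (hypothetical request lines)
    last_grant -- index granted most recently; the search starts just after it,
                  so the previous winner gets the lowest priority
    """
    n = len(requests)
    for offset in range(1, n + 1):
        idx = (last_grant + offset) % n
        if requests[idx]:
            return idx
    return None


if __name__ == "__main__":
    # Inputs 0 and 2 request; input 0 won last time, so input 2 wins now.
    print(round_robin_arbiter([True, False, True, False], last_grant=0))
```

Starting the search just after the previous winner is what prevents a single input from starving the others.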
6 Projects 2 (Area: HW Design) [discontinued]
Title: HANDSHAKE Resynchronizer in SystemVerilog
Computer architecture
DVFS is a key hardware mechanism to optimize power and performance in a chip. However, the use of different Voltage and Frequency Islands (VFIs) requires signals to be resynchronized at each VFI boundary. Two families of resynchronization schemes can be used: handshake or FIFO. The project requires implementing a simple handshake resynchronizer starting from the DFS support provided by Xilinx FPGAs.
1. Metastability - ( )
2. Additional material provided by the teaching assistant
7 Projects 3 (Area: HW Design) [discontinued]
Title: FIFO Resynchronizer in SystemVerilog
Computer architecture
DVFS is a key hardware mechanism to optimize power and performance in a chip. However, the use of different Voltage and Frequency Islands (VFIs) requires signals to be resynchronized at each VFI boundary. Two families of resynchronization schemes can be used: handshake or FIFO. The project requires implementing a simple FIFO resynchronizer starting from the DFS support provided by Xilinx FPGAs.
1. Metastability - ( )
2. Additional material provided by the teaching assistant
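A common way to build a FIFO resynchronizer is to pass the read and write pointers across the clock-domain boundary in Gray code, so that only one bit changes per increment and a metastable sample is off by at most one position. The slide does not prescribe this technique; the Python sketch below is an assumption, with hypothetical helper names, meant only to check the encoding before writing the RTL.

```python
def bin_to_gray(value):
    """Convert a non-negative binary integer to its Gray-code equivalent."""
    return value ^ (value >> 1)


def gray_to_bin(gray):
    """Convert a Gray-code integer back to plain binary."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


def hamming_distance(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    depth = 16  # hypothetical FIFO depth; must be a power of two
    for ptr in range(depth):
        nxt = (ptr + 1) % depth
        # Consecutive pointers, including the wrap-around, differ in one bit.
        print(ptr, format(bin_to_gray(ptr), "04b"),
              hamming_distance(bin_to_gray(ptr), bin_to_gray(nxt)))
```

The single-bit property holds across the wrap-around only when the pointer range is a power of two, which is why the depth is restricted here.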
8 Projects 4 (Area: HW Design)
Title: Superscalar Embedded CPU Design
Computer architecture, SystemVerilog
The OpenRisc architecture is the de facto open-hardware architecture and ISA. The mor1kx is an open-source Verilog implementation that is fully OpenRisc compliant. It implements a 6-stage CPU pipeline with split L1 caches. The project requires enhancing the provided architecture with a dual-issue implementation. A complete validation of the final solution is also required.
1. SystemVerilog
2. Computer Architecture: A Quantitative Approach ( >=3rd edition )
9 Projects 5 (Area: HW Design)
Title: Write-back Cache Implementation for an Embedded CPU
Computer architecture, SystemVerilog
The OpenRisc architecture is the de facto open-hardware architecture and ISA. The mor1kx is an open-source Verilog implementation that is fully OpenRisc compliant. It implements a 6-stage CPU pipeline with split, write-through L1 caches. The student is required to modify the cache implementation to support the more aggressive write-back policy.
1. SystemVerilog
2. Computer Architecture: A Quantitative Approach ( >=3rd edition )
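The difference between the two write policies can be illustrated with a toy model that counts the writes reaching main memory. The Python sketch below is not the mor1kx cache: it assumes a direct-mapped, write-allocate cache addressed by block number, and every class and function name is hypothetical.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that only tracks tags and dirty bits."""

    def __init__(self, num_lines, write_back):
        self.num_lines = num_lines
        self.write_back = write_back
        self.tags = [None] * num_lines
        self.dirty = [False] * num_lines
        self.memory_writes = 0

    def write(self, block_addr):
        index = block_addr % self.num_lines
        tag = block_addr // self.num_lines
        if self.tags[index] != tag:
            # Miss: a dirty victim must be written to memory before eviction.
            if self.write_back and self.dirty[index]:
                self.memory_writes += 1
            self.tags[index] = tag
            self.dirty[index] = False
        if self.write_back:
            self.dirty[index] = True      # defer the memory update
        else:
            self.memory_writes += 1       # write-through: update memory now

    def flush(self):
        """Write every dirty line back to memory (no-op for write-through)."""
        for i in range(self.num_lines):
            if self.dirty[i]:
                self.memory_writes += 1
                self.dirty[i] = False


def count_memory_writes(trace, write_back, num_lines=4):
    cache = DirectMappedCache(num_lines, write_back)
    for block_addr in trace:
        cache.write(block_addr)
    cache.flush()
    return cache.memory_writes


if __name__ == "__main__":
    trace = [0] * 10 + [4] * 5   # repeated stores to two conflicting blocks
    print("write-through:", count_memory_writes(trace, write_back=False))
    print("write-back:   ", count_memory_writes(trace, write_back=True))
```

On a trace with repeated stores to the same block, the write-back variant touches memory only on eviction and on the final flush, which is the saving the project is after.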
10 Projects 6 (Area: HW Design)
Title: Performance Counter Support for an Embedded CPU
Computer architecture, SystemVerilog
The OpenRisc architecture is the de facto open-hardware architecture and ISA. The mor1kx is an open-source Verilog implementation that is fully OpenRisc compliant. It implements a 6-stage CPU pipeline with split, write-through L1 caches. Performance counters are a critical resource for analyzing the architecture at run time. The project requires developing the minimal performance counter hardware support, as well as the software-side counterpart to read the counters, for the following metrics: CPU idle, L1 misses, L1 accesses, per-pipeline-stage stalls, branch mispredictions.
1. SystemVerilog
2. Computer Architecture: A Quantitative Approach ( >=3rd edition )
11 Projects 7 (Area: HW Design)
Title: Branch Prediction Schemes for an Embedded CPU
Computer architecture, SystemVerilog
In embedded CPUs, the branch prediction scheme strongly influences the overall system performance, since the CPU is usually a single-issue, in-order architecture. The OpenRisc architecture is the de facto open-hardware architecture and ISA. The mor1kx is an open-source Verilog implementation that is fully OpenRisc compliant. It implements a 6-stage CPU pipeline with split, write-through L1 caches. The project requires exploring the branch prediction algorithms already implemented and implementing a few more to improve the CPU performance. A validation and a design space exploration analysis complete the project.
1. SystemVerilog
2. Computer Architecture: A Quantitative Approach ( >=3rd edition )
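As a reference point for the design space exploration, a textbook 2-bit saturating-counter predictor can be modelled in a few lines. The slide does not list which schemes mor1kx already implements, so the Python sketch below is only an illustrative baseline; the table size, the initial counter state and all names are assumptions.

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters indexed by the branch address.

    Counter values 0-1 predict not taken, 2-3 predict taken.
    """

    def __init__(self, table_size=16):
        self.table_size = table_size
        self.counters = [1] * table_size   # start in "weakly not taken"

    def predict(self, pc):
        return self.counters[pc % self.table_size] >= 2

    def update(self, pc, taken):
        i = pc % self.table_size
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def misprediction_count(predictor, trace):
    """Replay a list of (pc, taken) pairs and count wrong predictions."""
    misses = 0
    for pc, taken in trace:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses


if __name__ == "__main__":
    # A loop branch: taken nine times, then not taken, executed twice.
    loop_trace = ([(0x40, True)] * 9 + [(0x40, False)]) * 2
    print(misprediction_count(TwoBitPredictor(), loop_trace))
```

The two-bit hysteresis means a single loop exit does not flip the prediction, so the second execution of the loop mispredicts only at its exit.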
12 Projects 8 (Area: HW Design)
Title: WISHBONE-compliant Bus Encryption for an Embedded CPU
Computer architecture, SystemVerilog
In embedded SoCs, bus encryption is a valuable feature to prevent information leakage, thus securing the architecture against side-channel attacks. The OpenRisc architecture is the de facto open-hardware architecture and ISA. The mor1kx is an open-source Verilog implementation that is fully OpenRisc compliant. It implements a 6-stage CPU pipeline with split, write-through L1 caches. The project requires implementing a flexible bus encryption scheme for the considered SoC. A trade-off analysis comparing the additional resources required (area and power) against the performance and security metrics complements the project outcome.
1. SystemVerilog
2. Computer Architecture: A Quantitative Approach ( >=3rd edition )
13 Projects 9 (Area: HW Design)
Title: High Level Synthesis for Security
Computer architecture, SystemVerilog
High Level Synthesis (HLS) transforms a software-encoded algorithm into a hardware description language specification, with the final goal of speeding up portions of a complex algorithm in hardware through ad hoc accelerators. However, the automated code transformation process can result in a suboptimal design from the performance, power, area or security viewpoints. The project aims to compare different cryptographic algorithms, encoded in both hardware and software, against the output of the HLS tool integrated in the Xilinx Vivado Software Suite.
1. SystemVerilog
2. Computer Architecture: A Quantitative Approach ( >=3rd edition )
14 Projects 10 (Area: Architecture Simulation)
Title: System Cache and Cache Partitioning in big.little architectures
Computer architecture, C++, Python
Multi-core solutions are embedded in smartphones, tablets and smart devices, with a net impact on our daily life. However, the design of such architectures is strongly constrained by power consumption and limited by traditional bus-based on-chip interconnects and cache hierarchies. The project focuses on cache partitioning schemes for the last-level cache (LLC), to contribute to delivering the next embedded multi-core reference architecture. A full-system, Linux-based, clustered multi-core will be explored, considering different cache hierarchies and partitioning schemes and using the PARSEC benchmark suite on the application side.
1. GEM LLC Partitioning Schemes - RECAP: Region-Aware Cache Partitioning
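One of the simplest LLC partitioning schemes is static way partitioning, where each core may only occupy a fixed number of ways in every set. The Python sketch below models a single set under that scheme. It is not RECAP and not a gem5 API; it assumes LRU replacement inside each partition and cores that do not share data, and all names are hypothetical.

```python
from collections import OrderedDict


class PartitionedSet:
    """One cache set whose ways are split between cores by a fixed quota."""

    def __init__(self, quotas):
        # quotas maps a core name to the number of ways it may occupy.
        self.quotas = dict(quotas)
        self.lines = {core: OrderedDict() for core in self.quotas}

    def access(self, core, tag):
        """Return True on a hit, False on a miss (allocating if allowed)."""
        lines = self.lines[core]
        if tag in lines:
            lines.move_to_end(tag)        # mark as most recently used
            return True
        if self.quotas[core] == 0:
            return False                  # core may not allocate in this set
        if len(lines) >= self.quotas[core]:
            lines.popitem(last=False)     # evict this core's own LRU line
        lines[tag] = True
        return False


if __name__ == "__main__":
    cache_set = PartitionedSet({"big": 2, "little": 2})
    cache_set.access("big", 1)
    cache_set.access("big", 2)
    for tag in range(100, 110):           # streaming core thrashes its own ways
        cache_set.access("little", tag)
    print(cache_set.access("big", 1), cache_set.access("big", 2))
```

Because a core only evicts its own lines, a streaming application cannot push out the working set of its neighbour, at the cost of leaving ways idle when a core needs fewer than its quota.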
15 Projects 11 (Area: Architecture Simulation)
Title: The impact of the prefetcher in big.little architectures
Computer architecture, C++, Python
The prefetcher acts as a greedy master for cache lines and thus contributes greatly to the final performance of the overall system. Prefetching too late cannot hide the memory access time, while prefetching too early wastes cache lines. The scenario is further complicated by running applications that compete for the same shared cache resources. The project aims to implement a simple cache partitioning scheme to evaluate, and eventually constrain, the prefetcher greediness. Different prefetchers coupled with the partitioning scheme will be evaluated. A full-system, Linux-based, clustered multi-core will be explored, considering different cache hierarchies and partitioning schemes and using the PARSEC benchmark suite on the application side.
1. GEM LLC Partitioning Schemes - RECAP: Region-Aware Cache Partitioning
16 Projects 12 (Area: Architecture Simulation)
Title: CPU-GPU multi-core simulators
Computer architecture, C++, Python
Multi-cores are ubiquitous, and the user expects the same performance regardless of the device at hand, i.e. smartphone, tablet, notebook or desktop. In this scenario the multimedia experience is becoming of paramount importance to deliver a successful architecture, so chip manufacturers are providing multi-cores endowed with powerful GPUs. Simulation still represents a critical design stage for early architecture evaluation, and the possibility to simulate CPU and GPU at the same time can be a great advantage for design architects. The project requires a complete exploration of the gem5-gpu simulation toolchain, which allows CUDA kernels to be executed in a full-system, cycle-accurate simulator.
1. GEM GEM5-GPU: gem5-gpu: A Heterogeneous CPU-GPU Simulator
3. No Mali: the ARM solution to mimic the GPU in gem5
17 Projects 13 (Area: HW Design)
Title: OpenSparc T2 onto the Xilinx XUPV5-LX110T FPGA
Students: <= 3
Computer architecture, Verilog, Xilinx Software
The OpenSparc project aims to deliver a high-performance multi-core platform to the academic community. OpenSPARC T2 is derived from the UltraSPARC T2 processor, a 64-bit, eight-core, multi-threaded microprocessor. The students are required to boot the OpenSparc system on the compatible XUPV5 FPGA using the ISE toolchain from Xilinx. A set of experiments with single- and multi-threaded applications complements the project assignment.
1. OpenSparc T2: to-use
2. OpenPiton:
18 Projects 14 (Area: HW Design)
Title: Consistency Memory Models on a real multicore
Verilog, C/C++
The memory consistency model describes the behavior of the shared memory system, in terms of correctness, for programmers and implementors. OpenPiton uses the OpenSparcT1 architecture as its base building block and is publicly available. The project requires changing the CPU-to-memory interface to explore the benefits of the most prominent consistency models: Sequential Consistency (SC), Total Store Order (TSO) and Weak Consistency (WC).
1. A Primer on Memory Consistency and Cache Coherence, Sorin, Hill, Wood
OpenPiton:
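The difference between SC and TSO shows up in the classic store-buffer litmus test: thread 0 executes x = 1; r0 = y and thread 1 executes y = 1; r1 = x, with x and y initially 0. The Python sketch below enumerates the possible outcomes. It is a deliberately simplified model, not the OpenPiton implementation: TSO is approximated by letting a load bypass an earlier store of the same thread to a different address, and all names are hypothetical.

```python
from itertools import permutations

# Each operation is (kind, address, value-or-destination-register).
THREADS = (
    [("store", "x", 1), ("load", "y", "r0")],   # thread 0
    [("store", "y", 1), ("load", "x", "r1")],   # thread 1
)


def outcomes(relax_store_load):
    """Return the set of reachable (r0, r1) pairs for the litmus test.

    relax_store_load -- False models SC (program order fully preserved);
                        True lets a load overtake the earlier store of its
                        own thread, a simplified stand-in for a store buffer.
    """
    ops = [(t, i) for t in range(2) for i in range(2)]
    results = set()
    for order in permutations(ops):
        if not relax_store_load:
            # Keep each thread's store before its load.
            if any(order.index((t, 0)) > order.index((t, 1)) for t in range(2)):
                continue
        memory = {"x": 0, "y": 0}
        regs = {}
        for thread, i in order:
            kind, addr, arg = THREADS[thread][i]
            if kind == "store":
                memory[addr] = arg
            else:
                regs[arg] = memory[addr]
        results.add((regs["r0"], regs["r1"]))
    return results


if __name__ == "__main__":
    print("SC :", sorted(outcomes(relax_store_load=False)))
    print("TSO:", sorted(outcomes(relax_store_load=True)))
```

The outcome r0 = r1 = 0 is reachable only in the relaxed run, which is the behaviour a hardware test on the real multicore would try to observe or rule out.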
19 Projects 15 (Area: Architecture Simulation)
Title: ElasticTrace (ARM)
Computer architecture, C++, Python
Cycle-accurate simulation is a viable means to support Design Space Exploration at early design stages. However, the complexity of multi-cores makes this evaluation technique extremely time consuming, so that only a small subset of the design space can be explored. The Elastic Trace methodology has been developed at ARM (Samos-2016) to relieve the burden of simulating complex out-of-order CPU models. Simulation traces are extracted once and can be replayed on a different architecture to assess the differences in performance and power consumption between the two solutions, thus aggressively trimming down the simulation time. The student is required to evaluate the ARM solution on different multi-core architectures to validate the simulation speed-up.
GEM5
20 Projects 16 (Area: Architecture Simulation)
Title: SynchroTrace
Computer architecture, C++, Python
SynchroTrace has been developed at the Drexel Lab (Philadelphia University) to support fast architectural exploration of multi-cores. The methodology should provide the same benefits as the ARM ElasticTrace solution, while delivering a few additional DSE features. The student is required to evaluate the SynchroTrace solution on different multi-core architectures to validate the simulation speed-up.
1. Synchrotrace tutorial _ GEM5
21 Projects 17 (Area: HW Design)
Title: Rowhammer analysis on FPGAs
Verilog, C/C++
Rowhammer is an attack methodology that exploits an unintended side effect in DRAM: memory cells leak their charge and can alter the content of nearby memory rows that are not involved in the memory access. Many memory vendors are updating their devices to face this threat, while several devices will not be updated due to the high cost of the transition to the new model. FPGAs fall into this category, since the update to a new device version is expensive. The project requires exploring the possibility of attacking an SDRAM-equipped FPGA using the rowhammer methodology.
1. Google Project Zero:
2. Drammer: Deterministic Rowhammer Attacks on Mobile Platforms, Veen et al., CCS-2016
22 Projects 18 (Area: HW Design)
Title: OpenRisc Mor1kx - porting to FPGA
Computer architecture, SystemVerilog
The OpenRisc architecture is the de facto open-hardware architecture and ISA. The mor1kx is an open-source Verilog implementation that is fully OpenRisc compliant. The project requires porting the design to one of the FPGAs available in the laboratory. A complete regression test is part of the project, while porting the Linux OS is considered a plus.
1. SystemVerilog
2. Computer Architecture: A Quantitative Approach ( >=3rd edition )
More informationPACE: Power-Aware Computing Engines
PACE: Power-Aware Computing Engines Krste Asanovic Saman Amarasinghe Martin Rinard Computer Architecture Group MIT Laboratory for Computer Science http://www.cag.lcs.mit.edu/ PACE Approach Energy- Conscious
More informationENHANCED TOOLS FOR RISC-V PROCESSOR DEVELOPMENT
ENHANCED TOOLS FOR RISC-V PROCESSOR DEVELOPMENT THE FREE AND OPEN RISC INSTRUCTION SET ARCHITECTURE Codasip is the leading provider of RISC-V processor IP Codasip Bk: A portfolio of RISC-V processors Uniquely
More informationCopyright 2012, Elsevier Inc. All rights reserved.
Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Introduction Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology
More informationComputer Architecture. A Quantitative Approach, Fifth Edition. Chapter 2. Memory Hierarchy Design. Copyright 2012, Elsevier Inc. All rights reserved.
Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Programmers want unlimited amounts of memory with low latency Fast memory technology is more expensive per
More informationMulticore Hardware and Parallelism
Multicore Hardware and Parallelism Minsoo Ryu Department of Computer Science and Engineering 2 1 Advent of Multicore Hardware 2 Multicore Processors 3 Amdahl s Law 4 Parallelism in Hardware 5 Q & A 2 3
More informationECSE 425 Lecture 1: Course Introduc5on Bre9 H. Meyer
ECSE 425 Lecture 1: Course Introduc5on 2011 Bre9 H. Meyer Staff Instructor: Bre9 H. Meyer, Professor of ECE Email: bre9 dot meyer at mcgill.ca Phone: 514-398- 4210 Office: McConnell 525 OHs: M 14h00-15h00;
More informationOptimizing Emulator Utilization by Russ Klein, Program Director, Mentor Graphics
Optimizing Emulator Utilization by Russ Klein, Program Director, Mentor Graphics INTRODUCTION Emulators, like Mentor Graphics Veloce, are able to run designs in RTL orders of magnitude faster than logic
More informationEmbedded Systems. 7. System Components
Embedded Systems 7. System Components Lothar Thiele 7-1 Contents of Course 1. Embedded Systems Introduction 2. Software Introduction 7. System Components 10. Models 3. Real-Time Models 4. Periodic/Aperiodic
More informationArchitecture of An AHB Compliant SDRAM Memory Controller
Architecture of An AHB Compliant SDRAM Memory Controller S. Lakshma Reddy Metch student, Department of Electronics and Communication Engineering CVSR College of Engineering, Hyderabad, Andhra Pradesh,
More informationLecture: Storage, GPUs. Topics: disks, RAID, reliability, GPUs (Appendix D, Ch 4)
Lecture: Storage, GPUs Topics: disks, RAID, reliability, GPUs (Appendix D, Ch 4) 1 Magnetic Disks A magnetic disk consists of 1-12 platters (metal or glass disk covered with magnetic recording material
More informationIWES st Italian Workshop on Embedded Systems Pisa September 2016
IWES 2016 1st Italian Workshop on Embedded Systems Pisa -- 19 September 2016 Research Group Overview Roberto Giorgi University of Siena, Italy http://www.dii.unisi.it/~giorgi Siena on Earth 2 Engineering
More informationCSCI-GA Multicore Processors: Architecture & Programming Lecture 10: Heterogeneous Multicore
CSCI-GA.3033-012 Multicore Processors: Architecture & Programming Lecture 10: Heterogeneous Multicore Mohamed Zahran (aka Z) mzahran@cs.nyu.edu http://www.mzahran.com Status Quo Previously, CPU vendors
More informationMoore s Law. CS 6534: Tech Trends / Intro. Good Ol Days: Frequency Scaling. The Power Wall. Charles Reiss. 24 August 2016
Moore s Law CS 6534: Tech Trends / Intro Microprocessor Transistor Counts 1971-211 & Moore's Law 2,6,, 1,,, Six-Core Core i7 Six-Core Xeon 74 Dual-Core Itanium 2 AMD K1 Itanium 2 with 9MB cache POWER6
More informationIMPROVING ENERGY EFFICIENCY THROUGH PARALLELIZATION AND VECTORIZATION ON INTEL R CORE TM
IMPROVING ENERGY EFFICIENCY THROUGH PARALLELIZATION AND VECTORIZATION ON INTEL R CORE TM I5 AND I7 PROCESSORS Juan M. Cebrián 1 Lasse Natvig 1 Jan Christian Meyer 2 1 Depart. of Computer and Information
More informationCS 6534: Tech Trends / Intro
1 CS 6534: Tech Trends / Intro Charles Reiss 24 August 2016 Moore s Law Microprocessor Transistor Counts 1971-2011 & Moore's Law 16-Core SPARC T3 2,600,000,000 1,000,000,000 Six-Core Core i7 Six-Core Xeon
More informationPowerAware RTL Verification of USB 3.0 IPs by Gayathri SN and Badrinath Ramachandra, L&T Technology Services Limited
PowerAware RTL Verification of USB 3.0 IPs by Gayathri SN and Badrinath Ramachandra, L&T Technology Services Limited INTRODUCTION Power management is a major concern throughout the chip design flow from
More informationAn Architectural Framework for Accelerating Dynamic Parallel Algorithms on Reconfigurable Hardware
An Architectural Framework for Accelerating Dynamic Parallel Algorithms on Reconfigurable Hardware Tao Chen, Shreesha Srinath Christopher Batten, G. Edward Suh Computer Systems Laboratory School of Electrical
More informationComputer Architecture. Fall Dongkun Shin, SKKU
Computer Architecture Fall 2018 1 Syllabus Instructors: Dongkun Shin Office : Room 85470 E-mail : dongkun@skku.edu Office Hours: Wed. 15:00-17:30 or by appointment Lecture notes nyx.skku.ac.kr Courses
More informationRead this before starting!
Points missed: Student's Name: Total score: /100 points East Tennessee State University Department of Computer and Information Sciences CSCI 4717 Computer Architecture TEST 1 for Fall Semester, 2005 Section
More informationValidation Strategies with pre-silicon platforms
Validation Strategies with pre-silicon platforms Shantanu Ganguly Synopsys Inc April 10 2014 2014 Synopsys. All rights reserved. 1 Agenda Market Trends Emulation HW Considerations Emulation Scenarios Debug
More informationDO-254 Testing of High Speed FPGA Interfaces by Nir Weintroub, CEO, and Sani Jabsheh, Verisense
DO-254 Testing of High Speed FPGA Interfaces by Nir Weintroub, CEO, and Sani Jabsheh, Verisense As the complexity of electronics for airborne applications continues to rise, an increasing number of applications
More informationDesign and Implementation of High Performance DDR3 SDRAM controller
Design and Implementation of High Performance DDR3 SDRAM controller Mrs. Komala M 1 Suvarna D 2 Dr K. R. Nataraj 3 Research Scholar PG Student(M.Tech) HOD, Dept. of ECE Jain University, Bangalore SJBIT,Bangalore
More informationStrober: Fast and Accurate Sample-Based Energy Simulation Framework for Arbitrary RTL
Strober: Fast and Accurate Sample-Based Energy Simulation Framework for Arbitrary RTL Donggyu Kim, Adam Izraelevitz, Christopher Celio, Hokeun Kim, Brian Zimmer, Yunsup Lee, Jonathan Bachrach, Krste Asanović
More informationThe Bifrost GPU architecture and the ARM Mali-G71 GPU
The Bifrost GPU architecture and the ARM Mali-G71 GPU Jem Davies ARM Fellow and VP of Technology Hot Chips 28 Aug 2016 Introduction to ARM Soft IP ARM licenses Soft IP cores (amongst other things) to our
More informationOutline Marquette University
COEN-4710 Computer Hardware Lecture 1 Computer Abstractions and Technology (Ch.1) Cristinel Ababei Department of Electrical and Computer Engineering Credits: Slides adapted primarily from presentations
More informationDeep Learning Accelerators
Deep Learning Accelerators Abhishek Srivastava (as29) Samarth Kulshreshtha (samarth5) University of Illinois, Urbana-Champaign Submitted as a requirement for CS 433 graduate student project Outline Introduction
More informationWorld Class Verilog & SystemVerilog Training
World Class Verilog & SystemVerilog Training Sunburst Design - Expert Verilog-2001 FSM, Multi-Clock Design & Verification Techniques by Recognized Verilog & SystemVerilog Guru, Cliff Cummings of Sunburst
More informationDigital Logic Design Lab
Digital Logic Design Lab DEPARTMENT OF ELECTRICAL ENGINEERING LAB BROCHURE DIGITAL LOGIC DESIGN LABORATORY CONTENTS Lab Venue... 3 Lab Objectives & Courses... 3 Lab Description & Experiments... 4 Hardware
More informationMemory Systems IRAM. Principle of IRAM
Memory Systems 165 other devices of the module will be in the Standby state (which is the primary state of all RDRAM devices) or another state with low-power consumption. The RDRAM devices provide several
More informationControl Hazards. Prediction
Control Hazards The nub of the problem: In what pipeline stage does the processor fetch the next instruction? If that instruction is a conditional branch, when does the processor know whether the conditional
More informationExtending the Power of FPGAs to Software Developers:
Extending the Power of FPGAs to Software Developers: The Journey has Begun Salil Raje Xilinx Corporate Vice President Software and IP Products Group Page 1 Agenda The Evolution of FPGAs and FPGA Programming
More information