Building Whole Systems: an Overview
- Tabitha Hunt
1 1 Building Whole Systems: an Overview
In this lecture:
- System specification, design, and synthesis
- Hardware/software co-design
- Work tips
- Design challenges and trade-offs
3 2 Ideal flow
(Informal) Specification -> "Abracadabra" -> Implementation
The "Abracadabra" step stands for:
- formalize
- choose platform
- decide architecture
- decide Hw/Sw
- synthesize
- optimize
- keep correctness
4 3 Realistic flow
[Figure: refinement flow between the customer (spec) and the designer (implementation). Refine (by tools or by hand): high-level model, model 1, model 2, model 3, full model, with an "imp"/Verify check at each step, plus Redesign and Optimize.]
5 4 Specs: Modeling vs. Synthesis
Modeling:
- simple/focused specifications
- reason about the system
- have a reference behavior
Synthesis:
- complete specifications
- produces the implementation
- complex/designer-controlled refinement steps
6 5 Modeling
Different computational models:
- Finite State Machines, SpecCharts
- Process Networks
- Petri-Nets
- (Control) Data-Flow Graphs
Executable:
- C, Java, Python, ...
- VHDL, Verilog, SystemC, C#, MATLAB, BlueSpec
7 6 Synthesis
Historically: Hw/Sw distinction
- Sw compilation vs. Hw synthesis
- specific computational models, languages, tools
- abstraction-level specific
Vision: no distinction between Hw and Sw
- co-design, co-simulation, co-synthesis
- unique language (SystemC, SystemVerilog?)
- common tools (front-end)
- abstraction-level independent
9 7 Classic Design vs. Co-design Flow
Classic: specification -> partition -> comm. extraction -> { Sw: refine, refine, compile | Hw: refine, refine, synthesize } -> implementation
Co-design: specification -> refine -> refine -> partition -> comm. extraction -> { Sw: compile | Hw: synthesize } -> implementation
11 8 Hw/Sw co-design in a nutshell
Late Hw/Sw partitioning:
- larger design space (fuzzy Hw/Sw border)
- allows for finer-grain partitioning
- flexible (late architecture selection)
- more homogeneous system specification
The problems: requires
- a powerful enough language to express both Hw and Sw at different abstraction levels
- appropriate tool support
12 9 System Synthesis Issues
- neither Hw nor Sw is fully functional
- Sw needs Hw for implementation
Target platform development:
[Timeline figure, labels: Partition; Hw: implement, unit test; Sw: integrate and test; DONE!; axis: Time]
13 10 Improved System Development
Cross-platform development:
- decide interfaces and write stubs
- develop Sw on an available machine + stubs
- develop Hw in parallel + tests in Sw
[Timeline figure, labels: Partition spec; Host: Sw; Target: Hw, Sw; port, integrate, test; DONE!; axis: Time]
18 11 An Example
Spec: load a pair of integers from the serial port and display a pixel on the screen
- load integer and put pixel are the Hw/Sw interfaces
- Hardware: developed in parallel
- Stubs: int LoadInt() {scanf} void PutPixel() {printf} (tested on the host platform)
- Software: PutPixel(LoadInt(),LoadInt()) (tested on the host platform)
- Port (target implementation): re-write LoadInt and PutPixel using XIo_In, XIo_Out
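The host-side stubs from this example could look roughly like the sketch below. Only the names LoadInt and PutPixel come from the slide; the PutPixel(int, int) signature, the LoadAndPlot wrapper, and the stub_in, last_x and last_y test hooks are assumptions added for illustration. The two LoadInt calls are made as separate statements because C does not specify the evaluation order of function arguments, so PutPixel(LoadInt(), LoadInt()) leaves it open which value ends up as x.

```c
#include <stdio.h>

/* Hypothetical test hooks, not part of the slides. */
static FILE *stub_in = NULL;          /* NULL means: read from stdin */
static int last_x = -1, last_y = -1;  /* last pixel "drawn" by the stub */

/* Host stub for the serial interface: reads one integer as text. */
int LoadInt(void)
{
    int v = 0;
    if (fscanf(stub_in ? stub_in : stdin, "%d", &v) != 1)
        return -1;                    /* no integer available */
    return v;
}

/* Host stub for the display interface: prints instead of drawing. */
void PutPixel(int x, int y)
{
    last_x = x;
    last_y = y;
    printf("pixel at (%d, %d)\n", x, y);
}

/* Application code: meant to stay unchanged when moving to the target. */
void LoadAndPlot(void)
{
    int x = LoadInt();                /* first integer is x */
    int y = LoadInt();                /* second integer is y */
    PutPixel(x, y);
}
```

When porting, only the bodies of LoadInt and PutPixel would be replaced by target I/O accesses; LoadAndPlot is compiled as is.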
22 12 Team Work
- make an initial detailed specification
  - reduces useless discussions later
  - reduces free interpretation, blaming others
- distribute work wisely (Hw, Sw, Stubs, Integration, Testing, etc.)
  - minimize interdependency
  - allow for loads of parallel work
- report your progress to each other often
27 13 More Work Tips
- start from working Hw and build around it
  - full re-design takes much more time and effort
- do unit tests, simulate modules
  - painful to detect bugs later
- use debuggers, printouts, LEDs, everything to make sure your system works
- write simple Sw tests for your Hw
- if time is short, go around problems (patch bad Hw with Sw)
28 14 Real-life design trade-offs (choices, optimizations, and fine tuning)
29 15 Real-life restrictions
- Limited type and amount of resources
- Changing specifications
- Limited knowledge
34 16 Limited Resources
- Time
- Tools
  - Build (compilers, synthesis, technology)
  - Test, debug, maintenance
- Target hardware support
  - Type & amount of available IPs/chips
  - Partially fixed architecture
- Target software support
  - operating system, libraries, drivers
38 17 Challenges
- Design using available components
  - Select the most suitable architecture (cores and communication channels)
  - Adapt the custom hardware/software to the available system
  - Allow some flexibility, configurability (shifting specs.)
- Optimize
  - Do not use more (area, memory, ...) than you need
  - Add Hw to speed up/simplify Sw (e.g. DMA ctrl)
- Test & Debug
44 18 Hw/architecture design techniques
- Start with a simple, working design
- Expand gradually by adding tested IPs
- Design custom IPs only when necessary
- Communicating/shuffling data is usually the bottleneck, not computing: choose fast/many memories, buses
- Improve/optimize a working prototype
48 19 Memory Band-width: VGA controller example
640x480 pixels, 9 bits per pixel (rrrbbbggg), 60 Hz
- 640 x 480 x 9 x 60 = 165,888 kb/s
- = 5,184 kw/s = 5.2 Mw/s (32-bit words)
- 100 MHz bus clock: approx. 1 word each 19 cycles (read access!)
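The arithmetic on this slide can be reproduced with a few helper functions. This is a minimal sketch: the function names are made up for illustration, and it assumes 32-bit words with pixels packed back to back, which is what the 5,184 kw/s figure implies.

```c
#include <stdint.h>

/* Bits that must be read per second to refresh the whole screen. */
uint64_t vga_bits_per_second(uint32_t width, uint32_t height,
                             uint32_t bits_per_pixel, uint32_t refresh_hz)
{
    return (uint64_t)width * height * bits_per_pixel * refresh_hz;
}

/* Words per second, assuming pixels are packed with no padding bits. */
uint64_t vga_words_per_second(uint64_t bits_per_second, uint32_t word_bits)
{
    return bits_per_second / word_bits;
}

/* Average number of bus clock cycles available per word read. */
double bus_cycles_per_word(uint64_t bus_hz, uint64_t words_per_second)
{
    return (double)bus_hz / (double)words_per_second;
}
```

For 640x480, 9 bpp, 60 Hz and a 100 MHz bus this gives 165,888,000 bits/s, 5,184,000 words/s and about 19.3 bus cycles per word, matching the slide.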
51 20 Memory Band-width: VGA controller example (II)
LMB_BRAM:
- Single read access: 1 clock cycle/word, no problem. Avg. utilization due to VGA ctrl = 5%
PLB_BRAM (plb_bram_if_ctrl.pdf):
- Single read: 6 cc/w, OK? Avg. utilization due to VGA ctrl = 30%
- Burst reads: ~10 cc/4x2 w = 1.25 cc/w, no problem. Avg. utilization = 7%
54 21 Memory Band-width: VGA controller example (III)
PLB_DDR (plb_ddr.pdf):
- Single read: 14 cc/2 w, OK? Avg. utilization = 35%
- Burst read: 16 cc/2x2 w ~ 4 cc/w, OK. Avg. utilization = 20%
PLB_EMC (SRAM, Flash), 8-bit access:
- Single read: 10 cc/B (40 cc/w), problem. Bus cannot cope: avg. utilization = 200%
- Burst read, etc.
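The utilization percentages for the different memories appear to follow from one relation: cycles spent per word read, multiplied by the words needed per second, divided by the bus clock. The sketch below assumes that relation; the function names are illustrative, and the slides round the results.

```c
/* Fraction of bus time consumed by the VGA controller:
 * (bus cycles per word read) x (words needed per second) / (bus cycles per second). */
double bus_utilization(double access_cycles_per_word,
                       double words_per_second, double bus_hz)
{
    return access_cycles_per_word * words_per_second / bus_hz;
}

/* A bus can only sustain the stream if utilization stays at or below 100%. */
int bus_can_cope(double utilization)
{
    return utilization <= 1.0;
}
```

With 5,184,000 words/s on a 100 MHz bus, 1 cc/w comes out at about 5%, 6 cc/w at about 31%, 4 cc/w at about 21%, and 40 cc/w at about 207%, so the 8-bit single-read case exceeds what the bus can deliver.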
59 22 Variable Band-width: VGA controller example (IV)
640x480x9b at 60 Hz; total pixels (including blanking): 800x525
Pixel clock (pc) = 25 MHz, bus clock (bc) = 100 MHz
- Instant demand during blanking (hsynch): 0 w/bc
- Instant demand while pixels are displayed: 9 b/pc = 0.07 w/bc, i.e. 1 w/14 bc
Band-width demand changes at run-time:
- High band-width may be too high for the chosen bus
- Smoother bus utilization may be required
Solution: BUFFERING!
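The peak figure on this slide can be checked by converting bits per pixel clock into words per bus clock. This is a small sketch with a hypothetical function name, assuming 32-bit words.

```c
/* Peak demand in words per bus cycle while pixels are being displayed:
 * bits consumed per pixel clock, scaled to the bus clock, divided by the word size. */
double peak_words_per_bus_cycle(double bits_per_pixel, double pixel_hz,
                                double bus_hz, double word_bits)
{
    return bits_per_pixel * (pixel_hz / bus_hz) / word_bits;
}
```

For 9 bpp, a 25 MHz pixel clock, and a 100 MHz bus this evaluates to about 0.07 w/bc, or one word roughly every 14 bus cycles. That is noticeably tighter than the 19-cycle average, which is why the demand has to be smoothed.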
67 23 Variable Band-width: VGA controller example (V)
Keep the buffer filled with data
Buffer size?
- Easy way out: full frame - not always possible
- Trial and error: start with a small buffer and increase it if the controller starves.
- Analysis:
  - Compute the avg. rate (19 bc/w ~ 0.052 w/bc)
  - Size = Longest_time_without_using_data x Rate (last pixel to first pixel delay)
  - vhdl: ( )x800x0.052 = 1914 words (~2 kw)
69 24 Fine Tuning: VGA controller example (VI)
Initial assumption: all bits in a word carry information!
- complex decoder and unpacking method
Solutions:
1. Reduce bpp: 8 (4 p/w)
   - Translate 8 to 9 bits: 1. conversion table, 2. default bit
2. Align & discard bits
   - No conversion required
   - Decoder not so simple
Required band-width and buffer size change!
[Figure labels: Bus transfer, Memory organisation]
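To see how the required band-width changes with the packing choice, the word rate can be recomputed from the number of pixels stored per word. This is a sketch with an illustrative function name; it assumes 32-bit words and whole pixels per word (4 at 8 bpp, 3 at aligned 9 bpp).

```c
#include <stdint.h>

/* Words per second when each word holds a whole number of pixels
 * (unused bits in a word are discarded). Rounds up to whole words. */
uint64_t aligned_words_per_second(uint32_t width, uint32_t height,
                                  uint32_t refresh_hz, uint32_t pixels_per_word)
{
    uint64_t pixels = (uint64_t)width * height * refresh_hz;
    return (pixels + pixels_per_word - 1) / pixels_per_word;
}
```

For 640x480 at 60 Hz this gives 4,608,000 w/s at 4 pixels per word and 6,144,000 w/s at 3 pixels per word, compared with 5,184,000 w/s for fully packed 9 bpp. On a 100 MHz bus that is roughly one word every 21.7 and every 16.3 cycles, respectively.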
70 25 IP Configuration
Trade-off area/power for performance:
- Processor
  - Cache type/size
  - Floating point support
  - Pipeline depth (?)
- Memory sizes
- Interconnect type/width (buses)
- Timing/wait states
73 26 Memory Size Issues: SRAM executable example
Problem: the program does not fit in the available on-chip BRAM
SOLUTIONS:
- compile with -Os, remove debug info
- put the stack and heap in off-chip memories
  - need to use available SDRAM, SRAM/Flash
- execute from the off-chip memory
  - need to use proper controllers
  - boot from BRAM, jump to an executable off-chip
  - use caches to speed up
83 27 Memory Size Issues: SRAM executable example
Steps:
1. Link the main application from SRAM_BASEADDR
2. mb-objcopy -O binary main_app.elf main_app.bin
3. Write/compile/link a bootloader from 0x0000:
       typedef int (*maintype)(int, char **);
       maintype maincode = (maintype)sram_baseaddr;
       int main(int argn, char **argv) { return maincode(argn, argv); }
4. Add MDM debug periph., set mblaze DEBUG_ENABLE flag
5. Download configuration, connect in xmd: mbconnect mdm
6. xmd> dow -data main_app.bin SRAM_BASEADDR
7. Run or Download configuration again
84 28 Software fine tuning
To adjust code speed and size:
- Algorithm selection (e.g. bubble vs. quick sort)
- Compiler optimization options
- Linker scripts
  - segment splitting: distribute code, stack, heap, ...
- Driver choices
  - low level, small, little functionality vs. high level, large, loads of functionality
88 29 Drivers - Hw/Sw interface: VGA controller example (VII)
8bpp (4 p/w):
- Easy to modify single pixels (XIo_Out8) by writing single bytes
9bpp (3 p/w):
- Single pixels: read, modify & write
- Exact address/offset computation more complex
Packed 9bpp:
- Even harder to compute the offset/address, build masks, access split pixels, etc.
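The difference in driver complexity between the first two layouts can be sketched on a plain in-memory frame buffer. Everything here is an assumption for illustration: the function names, the 640-pixel line width, the linear pixel index, and the choice to store the first pixel of each word in the lowest 9 bits. A real driver would go through the target's I/O functions and follow the controller's actual bit layout and byte order.

```c
#include <stdint.h>
#include <stddef.h>

enum { WIDTH = 640 };  /* pixels per line (assumed) */

/* 8 bpp: one byte per pixel, so a single byte write is enough. */
void put_pixel_8bpp(uint8_t *fb, unsigned x, unsigned y, uint8_t color)
{
    fb[(size_t)y * WIDTH + x] = color;
}

/* 9 bpp aligned: 3 pixels per 32-bit word, top 5 bits unused.
 * Changing one pixel needs a read-modify-write of the whole word. */
void put_pixel_9bpp_aligned(uint32_t *fb, unsigned x, unsigned y, uint16_t color)
{
    size_t   pixel = (size_t)y * WIDTH + x;       /* linear pixel index */
    size_t   word  = pixel / 3;                   /* word holding the pixel */
    unsigned shift = (unsigned)(pixel % 3) * 9;   /* bit offset in the word */
    uint32_t mask  = (uint32_t)0x1FF << shift;

    fb[word] = (fb[word] & ~mask) | (((uint32_t)color << shift) & mask);
}

/* Read back one 9-bit pixel from the aligned layout. */
uint16_t get_pixel_9bpp_aligned(const uint32_t *fb, unsigned x, unsigned y)
{
    size_t   pixel = (size_t)y * WIDTH + x;
    unsigned shift = (unsigned)(pixel % 3) * 9;

    return (uint16_t)((fb[pixel / 3] >> shift) & 0x1FF);
}
```

The packed 9 bpp case would additionally have to handle pixels that straddle two words, which is where the extra mask and offset work mentioned on the slide comes from.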
92 30 Conclusions
- Trade-offs are very common (e.g. band-width vs. simplicity, Hw vs. Sw)
- Hardware, software, and interfaces must be designed together
- Knowledge about the available components is essential
More informationDepartment of Computer Science, Institute for System Architecture, Operating Systems Group. Real-Time Systems '08 / '09. Hardware.
Department of Computer Science, Institute for System Architecture, Operating Systems Group Real-Time Systems '08 / '09 Hardware Marcus Völp Outlook Hardware is Source of Unpredictability Caches Pipeline
More informationDesign of Transport Triggered Architecture Processor for Discrete Cosine Transform
Design of Transport Triggered Architecture Processor for Discrete Cosine Transform by J. Heikkinen, J. Sertamo, T. Rautiainen,and J. Takala Presented by Aki Happonen Table of Content Introduction Transport
More informationDATA STRUCTURES AND ALGORITHMS
DATA STRUCTURES AND ALGORITHMS Sorting algorithms External sorting, Search Summary of the previous lecture Fast sorting algorithms Quick sort Heap sort Radix sort Running time of these algorithms in average
More informationChapter 11 Introduction to Programming in C
C: A High-Level Language Chapter 11 Introduction to Programming in C Original slides from Gregory Byrd, North Carolina State University Modified slides by Chris Wilcox, Colorado State University! Gives
More informationLecture 1 Introduction to Microprocessors
CPE 390: Microprocessor Systems Spring 2018 Lecture 1 Introduction to Microprocessors Bryan Ackland Department of Electrical and Computer Engineering Stevens Institute of Technology Hoboken, NJ 07030 1
More informationReference System: PLB DDR2 with OPB Central DMA Author: James Lucero
Application Note: Embedded Processing XAPP935 (v1.1) June 7, 2007 R Reference System: PLB DDR2 with OPB Central DMA Author: James Lucero Abstract This reference system demonstrates the functionality of
More informationThis is CS50. Harvard University Fall Quiz 0 Answer Key
Quiz 0 Answer Key Answers other than the below may be possible. Binary Bulbs. 0. Bit- Sized Questions. 1. Because 0 is non- negative, we need to set aside one pattern of bits (000) for it, which leaves
More informationEE382V: System-on-a-Chip (SoC) Design
EE382V: System-on-a-Chip (SoC) Design Lecture 8 HW/SW Co-Design Sources: Prof. Margarida Jacome, UT Austin Andreas Gerstlauer Electrical and Computer Engineering University of Texas at Austin gerstl@ece.utexas.edu
More informationOptimizing ARM SoC s with Carbon Performance Analysis Kits. ARM Technical Symposia, Fall 2014 Andy Ladd
Optimizing ARM SoC s with Carbon Performance Analysis Kits ARM Technical Symposia, Fall 2014 Andy Ladd Evolving System Requirements Processor Advances big.little Multicore Unicore DSP Cortex -R7 Block
More informationPlacement de processus (MPI) sur architecture multi-cœur NUMA
Placement de processus (MPI) sur architecture multi-cœur NUMA Emmanuel Jeannot, Guillaume Mercier LaBRI/INRIA Bordeaux Sud-Ouest/ENSEIRB Runtime Team Lyon, journées groupe de calcul, november 2010 Emmanuel.Jeannot@inria.fr
More informationAS part of my MPhil course Advanced Computer Design, I was required to choose or define a challenging hardware project
OMAR CHOUDARY, ADVANCED COMPUTER DESIGN, APRIL 2010 1 From Verilog to Bluespec: Tales of an AES Implementation for FPGAs Omar Choudary, University of Cambridge Abstract In this paper I present a combined
More informationConcepts from High-Performance Computing
Concepts from High-Performance Computing Lecture A - Overview of HPC paradigms OBJECTIVE: The clock speeds of computer processors are topping out as the limits of traditional computer chip technology are
More informationDigital Blocks Semiconductor IP
Digital Blocks Semiconductor IP TFT Controller General Description The Digital Blocks TFT Controller IP Core interfaces a microprocessor and frame buffer memory via the AMBA 2.0 to a TFT panel. In an FPGA,
More informationMultiple Choice Type Questions
Techno India Batanagar Computer Science and Engineering Model Questions Subject Name: Computer Architecture Subject Code: CS 403 Multiple Choice Type Questions 1. SIMD represents an organization that.
More informationSU 2017 May 18/23 LAB 3 Bitwise operations, Program structures, Functions (pass-by-value), local vs. global variables. Debuggers
SU 2017 May 18/23 LAB 3 Bitwise operations, Program structures, Functions (pass-by-value), local vs. global variables. Debuggers 1. Problem A Pass-by-value, and trace a program with debugger 1.1 Specification
More informationCMSC 411 Computer Systems Architecture Lecture 13 Instruction Level Parallelism 6 (Limits to ILP & Threading)
CMSC 411 Computer Systems Architecture Lecture 13 Instruction Level Parallelism 6 (Limits to ILP & Threading) Limits to ILP Conflicting studies of amount of ILP Benchmarks» vectorized Fortran FP vs. integer
More informationRISC-V CUSTOMIZATION WITH STUDIO 8
RISC-V CUSTOMIZATION WITH STUDIO 8 Zdeněk Přikryl CTO, Codasip GmbH WHO IS CODASIP Leading provider of RISC-V processor IP Introduced its first RISC-V processor in November 2015 Offers its own portfolio
More informationMultimedia Decoder Using the Nios II Processor
Multimedia Decoder Using the Nios II Processor Third Prize Multimedia Decoder Using the Nios II Processor Institution: Participants: Instructor: Indian Institute of Science Mythri Alle, Naresh K. V., Svatantra
More informationChapter 11 Introduction to Programming in C
Chapter 11 Introduction to Programming in C C: A High-Level Language Gives symbolic names to values don t need to know which register or memory location Provides abstraction of underlying hardware operations
More informationFPGA-Accelerated Instrumentation
ROTOFLEX: FGA-Accelerated Instrumentation Michael K. apamichael, Eric S. Chung, James C. Hoe, Babak Falsafi, Ken Mai papamix@cs.cmu.edu, {echung, jhoe, babak, kenmai}@ece.cmu.edu ROTOFLEX Computer Architecture
More informationCh 4. Parameters and Function Overloading
2014-1 Ch 4. Parameters and Function Overloading March 19, 2014 Advanced Networking Technology Lab. (YU-ANTL) Dept. of Information & Comm. Eng, Graduate School, Yeungnam University, KOREA (Tel : +82-53-810-2497;
More informationAppendix SystemC Product Briefs. All product claims contained within are provided by the respective supplying company.
Appendix SystemC Product Briefs All product claims contained within are provided by the respective supplying company. Blue Pacific Computing BlueWave Blue Pacific s BlueWave is a simulation GUI, including
More informationEEM870 Embedded System and Experiment Lecture 4: SoC Design Flow and Tools
EEM870 Embedded System and Experiment Lecture 4: SoC Design Flow and Tools Wen-Yen Lin, Ph.D. Department of Electrical Engineering Chang Gung University Email: wylin@mail.cgu.edu.tw March 2013 Agenda Introduction
More informationSystemC abstractions and design refinement for HW- SW SoC design. Dündar Dumlugöl. Vice President of Engineering, CoWare, Inc.
SystemC abstractions and design refinement for HW- SW SoC design Dündar Dumlugöl Vice President of Engineering, CoWare, Inc. Overview SystemC abstraction levels & design flow Interface Synthesis Analyzing
More informationChapter 11 Introduction to Programming in C
Chapter 11 Introduction to Programming in C Original slides from Gregory Byrd, North Carolina State University Modified slides by Chris Wilcox, Colorado State University C: A High-Level Language! Gives
More informationThe Central Processing Unit
The Central Processing Unit All computers derive from the same basic design, usually referred to as the von Neumann architecture. This concept involves solving a problem by defining a sequence of commands
More informationIntroduction to RISC-V
Introduction to RISC-V Jielun Tan, James Connolly February, 2019 Overview What is RISC-V Why RISC-V ISA overview Software environment Beta testing What is RISC-V RISC-V (pronounced risk-five ) is an open,
More informationESL design with the Agility Compiler for SystemC
ESL design with the Agility Compiler for SystemC SystemC behavioral design & synthesis Steve Chappell & Chris Sullivan Celoxica ESL design portfolio Complete ESL design environment Streaming Video Processing
More informationENCE Computer Organization and Architecture. Chapter 1. Software Perspective
Computer Organization and Architecture Chapter 1 Software Perspective The Lifetime of a Simple Program A Simple Program # include int main() { printf( hello, world\n ); } The goal of this course
More informationDonn Morrison Department of Computer Science. TDT4255 Memory hierarchies
TDT4255 Lecture 10: Memory hierarchies Donn Morrison Department of Computer Science 2 Outline Chapter 5 - Memory hierarchies (5.1-5.5) Temporal and spacial locality Hits and misses Direct-mapped, set associative,
More informationIntegrated Workflow to Implement Embedded Software and FPGA Designs on the Xilinx Zynq Platform Puneet Kumar Senior Team Lead - SPC
Integrated Workflow to Implement Embedded Software and FPGA Designs on the Xilinx Zynq Platform Puneet Kumar Senior Team Lead - SPC 2012 The MathWorks, Inc. 1 Agenda Integrated Hardware / Software Top
More informationFall 2015 COMP Operating Systems. Lab #3
Fall 2015 COMP 3511 Operating Systems Lab #3 Outline n Operating System Debugging, Generation and System Boot n Review Questions n Process Control n UNIX fork() and Examples on fork() n exec family: execute
More informationHardware-Software Co-Design and Prototyping on SoC FPGAs Puneet Kumar Prateek Sikka Application Engineering Team
Hardware-Software Co-Design and Prototyping on SoC FPGAs Puneet Kumar Prateek Sikka Application Engineering Team 2015 The MathWorks, Inc. 1 Agenda Integrated Hardware / Software Top down Workflow for SoC
More informationPreface... (vii) CHAPTER 1 INTRODUCTION TO COMPUTERS
Contents Preface... (vii) CHAPTER 1 INTRODUCTION TO COMPUTERS 1.1. INTRODUCTION TO COMPUTERS... 1 1.2. HISTORY OF C & C++... 3 1.3. DESIGN, DEVELOPMENT AND EXECUTION OF A PROGRAM... 3 1.4 TESTING OF PROGRAMS...
More informationMainstream Computer System Components CPU Core 2 GHz GHz 4-way Superscaler (RISC or RISC-core (x86): Dynamic scheduling, Hardware speculation
Mainstream Computer System Components CPU Core 2 GHz - 3.0 GHz 4-way Superscaler (RISC or RISC-core (x86): Dynamic scheduling, Hardware speculation One core or multi-core (2-4) per chip Multiple FP, integer
More informationDigital Systems Design. System on a Programmable Chip
Digital Systems Design Introduction to System on a Programmable Chip Dr. D. J. Jackson Lecture 11-1 System on a Programmable Chip Generally involves utilization of a large FPGA Large number of logic elements
More informationXilinx Vivado/SDK Tutorial
Xilinx Vivado/SDK Tutorial (Laboratory Session 1, EDAN15) Flavius.Gruian@cs.lth.se March 21, 2017 This tutorial shows you how to create and run a simple MicroBlaze-based system on a Digilent Nexys-4 prototyping
More informationMulti-Level Cache Hierarchy Evaluation for Programmable Media Processors. Overview
Multi-Level Cache Hierarchy Evaluation for Programmable Media Processors Jason Fritts Assistant Professor Department of Computer Science Co-Author: Prof. Wayne Wolf Overview Why Programmable Media Processors?
More informationHardware/Software Partitioning for SoCs. EECE Advanced Topics in VLSI Design Spring 2009 Brad Quinton
Hardware/Software Partitioning for SoCs EECE 579 - Advanced Topics in VLSI Design Spring 2009 Brad Quinton Goals of this Lecture Automatic hardware/software partitioning is big topic... In this lecture,
More informationV8uC: Sparc V8 micro-controller derived from LEON2-FT
V8uC: Sparc V8 micro-controller derived from LEON2-FT ESA Workshop on Avionics Data, Control and Software Systems Noordwijk, 4 November 2010 Walter Errico SITAEL Aerospace phone: +39 0584 388398 e-mail:
More informationEarly Performance-Cost Estimation of Application-Specific Data Path Pipelining
Early Performance-Cost Estimation of Application-Specific Data Path Pipelining Jelena Trajkovic Computer Science Department École Polytechnique de Montréal, Canada Email: jelena.trajkovic@polymtl.ca Daniel
More informationPerformance Tuning on the Blackfin Processor
1 Performance Tuning on the Blackfin Processor Outline Introduction Building a Framework Memory Considerations Benchmarks Managing Shared Resources Interrupt Management An Example Summary 2 Introduction
More informationKampala August, Agner Fog
Advanced microprocessor optimization Kampala August, 2007 Agner Fog www.agner.org Agenda Intel and AMD microprocessors Out Of Order execution Branch prediction Platform, 32 or 64 bits Choice of compiler
More informationPilot: A Platform-based HW/SW Synthesis System
Pilot: A Platform-based HW/SW Synthesis System SOC Group, VLSI CAD Lab, UCLA Led by Jason Cong Zhong Chen, Yiping Fan, Xun Yang, Zhiru Zhang ICSOC Workshop, Beijing August 20, 2002 Outline Overview The
More informationField Programmable Gate Array (FPGA)
Field Programmable Gate Array (FPGA) Lecturer: Krébesz, Tamas 1 FPGA in general Reprogrammable Si chip Invented in 1985 by Ross Freeman (Xilinx inc.) Combines the advantages of ASIC and uc-based systems
More informationHigh Performance Computing Lecture 1. Matthew Jacob Indian Institute of Science
High Performance Computing Lecture 1 Matthew Jacob Indian Institute of Science Agenda 1. Program execution: Compilation, Object files, Function call and return, Address space, Data & its representation
More information