CS 2461: Computer Architecture 1 Program performance and High Performance Processors


Course Objectives: Where are we?

CS 2461: Program Performance and High Performance Processors
Instructor: Prof. Bhagi Narahari

Bits & bytes: logic devices, HW building blocks
Processor: ISA, datapath
  Using building blocks to assemble a processor (LC3)
Programming the processor: assembly
Translating high-level programs to the processor
  Implementing C on LC3

Bits & Bytes to High-Level Programs ... Next

User application written in a high-level language
  Program runs on a processor
How are high-level programs implemented on a processor?
  Run-time stack, allocation of variables, translation of high-level code to machine code
  Map high-level data structures to low-level data structures
    Struct to linear mapping in memory
What else does a software developer want after the program is implemented correctly?
  PERFORMANCE!

Performance of programs

What to measure
  Model?
  Technology trends
Real processors: how to improve performance
  Pipelining, ILP, multi-core
Memory organization basics
  Memory hierarchy: cache, main memory, etc.
How to rewrite your program to make it run faster: code optimization

Performance of Programs

Complexity of algorithms
  How good/efficient is your algorithm?
  Measure using Big-Oh notation: O(N log N)
Next question: how well is the code executing on the machine?
  Actual time to run the program
  What are the factors that come into play?
    Where the program and data are stored
    What the actual machine instructions executed are
  Why is some HW better than others for different programs?
  What factors of system performance are HW related?
    How does the machine instruction set affect performance?
  What are the technology trends and how do they play a role?

Program Performance: The Great Reality (our focus)

There's more to performance than asymptotic complexity
Must optimize at multiple levels: algorithm, data representations, procedures, and loops
Must understand the system to optimize performance
  How programs are compiled and executed
  How data is stored
  What data structures are used
  How to measure program performance and identify bottlenecks
  How to improve performance without destroying code modularity and generality

Technology Trends & Performance

Speed will depend on the clock cycle (frequency) of the circuits
  How fast can we switch the transistors?
    Feed the signal to the gate of a MOS transistor: how long for the transistor to throw the switch?
  How large is the transistor (feature size)?
Moore's Law
  A founder of Intel (Gordon Moore) hypothesized on the rate of increase in performance
  It is not a law in the sense of laws of physics, etc.
  Observation: performance doubles every 18 months
  If you knew this, how would it guide your business decisions?
    Case study: Apple Computers in '85

[Figure: Delay vs. Feature Size. Vertical axis: delay (ps); horizontal axis: feature size (nm). Curves: gate delay; interconnect delay (Cu & low-k); interconnect delay (Al & SiO2).]
Source: Bohr, M. T., Interconnect Scaling - The Real Limiter To High Performance ULSI, Proceedings of the IEEE International Electron Devices, pages

[Figure: Transistors / Chip. MPU transistors per chip (M) by year.]
[Figure: Memory Capacity (Single Chip DRAM). DRAM bits per chip (G) by year, with a table of year, size (Mb), and cycle time (ns).]

The CPU-Memory Gap

The increasing gap between DRAM, disk, and CPU speeds
[Figure: time in ns (log scale) vs. year for disk seek time, DRAM access time, SRAM access time, and CPU cycle time.]

Performance Trends: Summary

Workstation performance (measured in SPEC marks) improves roughly 50% per year (2X every 18 months)
Performance will include not just the processor, but memory and disk I/O
Improvement in cost performance estimated at 70% per year

Performance: What to measure?

Which of these airplanes has the best performance?

  Plane               DC to Paris   Speed      Passengers
  Boeing 747          6.5 hours     610 mph    470
  BAC/Sud Concorde    3 hours       1350 mph   132

The Bottom Line: Performance metric depends on application

  Plane               DC to Paris   Speed      Passengers   Throughput (pmph)
  Boeing 747          6.5 hours     610 mph    470          286,700
  BAC/Sud Concorde    3 hours       1350 mph   132          178,200

Time to run the task (Execution Time / Response Time / Latency)
  Time to travel from DC to Paris
Tasks per unit time (Throughput / Bandwidth)
  Passenger-miles per hour (pmph): how many passengers are transported per unit time

Computer Performance: TIME, TIME, TIME

Response time (latency)
  How long does it take for my job to run?
  How long does it take to execute a job?
  How long must I wait for the database query?
Throughput
  How many jobs can the machine run at once?
  What is the average execution rate?
  How much work is getting done?
Metric chosen usually depends on the user community: sys admin vs. single user?
If we upgrade a machine with a new processor, what do we increase?
If we add a new machine to the lab, what do we increase?
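The airplane comparison can be checked with a few lines of Python. This is only a sketch of the arithmetic behind the two metrics; the function name `throughput_pmph` and the dictionary layout are illustrative choices, not part of the slides.

```python
def throughput_pmph(passengers: int, speed_mph: float) -> float:
    """Passenger-miles per hour: passengers carried times cruising speed."""
    return passengers * speed_mph


# Figures taken from the airplane table in the slides.
planes = {
    "Boeing 747": {"hours": 6.5, "mph": 610, "passengers": 470},
    "Concorde": {"hours": 3.0, "mph": 1350, "passengers": 132},
}

# Latency view: the plane with the shortest trip time wins.
fastest = min(planes, key=lambda name: planes[name]["hours"])

# Throughput view: the plane moving the most passenger-miles per hour wins.
highest_throughput = max(
    planes,
    key=lambda name: throughput_pmph(planes[name]["passengers"], planes[name]["mph"]),
)

print("Best latency:   ", fastest)
print("Best throughput:", highest_throughput)
```

The two metrics pick different winners, which is the point of the slide: the right metric depends on what the application needs.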

Execution Time

Elapsed time
  counts everything (disk and memory accesses, I/O, etc.)
  a useful number, but often not good for comparison purposes
CPU time
  doesn't count I/O or time spent running other programs
  can be broken up into system time and user time
Our focus in this course: user CPU time
  time spent executing the lines of code that are "in" our program

How to Model Performance

The asymptotic complexity: big O
  Time = O(f(n)): function of the size of the input
  Sorting: O(n log n)
This measures the efficiency of your algorithm, i.e., how good the solution technique is
Is this enough when we talk of actual time measured on the processor?
There's more to performance than asymptotic complexity
Must optimize at multiple levels: algorithm, data representations, procedures, and loops
Must understand the system to optimize performance
  How programs are compiled and executed, data storage, data structures, I/O management

Processor time: how to measure?

Number of clock cycles it takes to complete the execution of your program
What is your program?
  A number of instructions
  Different types: load, store, ..., branch
  Stored in memory
  Executed on the CPU

Aspects of CPU Performance

\[ \text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}} \]

CPU time = IC * CPI * Clk
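As a quick illustration of the CPU time equation (instruction count times cycles per instruction times clock cycle time), here is a minimal Python sketch. The function name and the sample workload (one billion instructions, CPI of 1.5, 0.5 ns cycle) are hypothetical values chosen for the example, not numbers from the course.

```python
def cpu_time_seconds(instruction_count: float, cpi: float, cycle_time_s: float) -> float:
    """CPU time = IC * CPI * Clk, where Clk is the clock cycle time in seconds."""
    return instruction_count * cpi * cycle_time_s


# Hypothetical workload: 1e9 instructions, average CPI of 1.5,
# on a 2 GHz processor (cycle time = 1 / 2e9 s = 0.5 ns).
ic = 1_000_000_000
cpi = 1.5
cycle_time = 1 / 2e9

print(f"CPU time: {cpu_time_seconds(ic, cpi, cycle_time):.3f} s")
```

Any of the three factors can be attacked separately: a better algorithm or compiler lowers IC, a better microarchitecture lowers CPI, and faster circuits shorten the cycle time.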

CPI: Average CPI

Cycles per instruction
Different instructions may take different time
  Example in LC3?
  We observed that not every instruction needs to go through all the instruction execution steps
    E.g., no need to calculate an effective address, or fetch from memory or registers
Reality: different times associated with different operations
  Especially true of memory operations
Application has an instruction mix
  Profile of application instruction types: Load/Store (memory), Branch, Jumps, etc.
  x1, x2, x3, ... as fractions (e.g., x1 = 0.4)
Processor has a CPI for each type of instruction
  Part of the ISA of a processor: specifications doc
  Example: ... = 1.0, Load/Store = 2.0, etc.
  t1, t2, t3, ...
What is the effective CPI?
  Weighted average: CPI = x1*t1 + x2*t2 + ...

CPI: Cycles per instruction

Depends on the instruction
Average cycles per instruction
Example:

Principles of Computer Architecture Design: Thumb Rules

Common case fast
  Focus on improving those instructions that are frequently used
  Amdahl's Law: the fraction enhanced/optimized runs faster
Principle of Locality: a program spends 90% of its time in 10% of its code
  E.g., word processing
  Spatial: items near each other tend to be accessed
  Temporal: recently used items tend to be used again
Concurrency/Parallelism
  Overlap the instruction execution steps
    Pipeline processors
    Multi-core processors
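The weighted-average CPI can be sketched in Python as follows. The instruction classes, mix fractions, and per-class cycle counts below are hypothetical values made up for illustration; only the formula (sum of fraction times cycles) comes from the slide.

```python
def effective_cpi(mix: dict, cycles: dict) -> float:
    """Weighted average CPI = sum over instruction types of x_i * t_i."""
    total_fraction = sum(mix.values())
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("instruction mix fractions must sum to 1")
    return sum(fraction * cycles[kind] for kind, fraction in mix.items())


# Hypothetical instruction mix (x_i) and per-type cycle counts (t_i).
mix = {"arithmetic": 0.4, "load_store": 0.3, "branch": 0.3}
cycles = {"arithmetic": 1.0, "load_store": 2.0, "branch": 3.0}

# 0.4*1.0 + 0.3*2.0 + 0.3*3.0 = 1.9
print(f"Effective CPI: {effective_cpi(mix, cycles):.2f}")
```

Because the mix belongs to the application and the cycle counts belong to the processor, the same processor can show a different effective CPI for each program it runs.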

Amdahl's Law: Speedup

Application takes X time
How to run it faster?
  Enhance/optimize a portion of it
    Which portion?
    Can we enhance all of it?
  Note that we are talking of solving the enhanced part in a different way, and possibly using different (more costly) resources
Where to focus our optimizations?
  Look at return on investment
  Code segments that take a long time can give us the best returns
  Profile your code to understand which parts are dominating

Improving Performance of Processors: quick review

Are real processors like LC3?
How can we improve the performance of the processor?
  What design principles?
Quick overview of techniques used in real processor designs
  Pipelining
  Instruction-level parallel (ILP) processors
  Multithreaded processors, multi-core

Real-World Pipelines: Car Washes

[Figure: sequential, parallel, and pipelined car washes.]
Idea
  Divide the process into independent stages
  Move objects through the stages in sequence
  At any given time, multiple objects are being processed

Instruction Pipeline

The instruction execution process lends itself naturally to pipelining
  overlap the subtasks of instruction fetch, decode and execute
Stages: Inst Fetch -> Decode -> Execute -> Mem Access -> Write Back Result
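The slides state Amdahl's Law only in words. The usual textbook form is speedup = 1 / ((1 - f) + f / s), where f is the fraction of the original run time that is enhanced and s is the speedup of that fraction. The sketch below uses that standard form; the function name and the sample fractions are illustrative assumptions.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only part of the run time is made faster."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# Hypothetical case: 40% of the run time is made 2x faster.
print(f"{amdahl_speedup(0.4, 2.0):.2f}x overall")

# Even an enormous speedup of 90% of the program is capped near 10x,
# because the untouched 10% still takes its original time.
print(f"{amdahl_speedup(0.9, 1e9):.2f}x overall")
```

This is why profiling matters: the payoff from an optimization is bounded by how much of the run time the optimized code accounted for in the first place.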

Conventional Pipelined Execution Representation

[Figure: time runs left to right and program flow top to bottom; each successive instruction passes through IFetch, Dcd, Exec, Mem, WB, starting one cycle after the previous instruction.]

Speedup of Pipelines

If we have a k-stage pipeline and n tasks (instructions) to process:
  When does the first instruction complete? k cycles
  When does the 2nd instruction complete? The next cycle: (k+1) cycles to complete 2 tasks
  How long to complete n tasks?
    After the first task, we get one output every cycle for the next (n-1) cycles
    Therefore T_k = k + (n-1) cycles

Speedup of Pipelines

If we have a k-stage pipeline and n tasks (instructions) to process:
  Time to complete n tasks (instructions): T_k = k + (n-1) cycles
  Time for non-pipelined = nk cycles
  Therefore speedup using a k-stage pipeline: S_k = T_1 / T_k = nk / (k + n - 1)
    for large n, this is ~ nk/n = k
Challenges:
  Data hazards: addressed by internal forwarding
  Control hazards (branches): prediction helps, but still a problem

Example

Suppose we execute 100 instructions
Single-cycle machine
  45 ns/cycle x 1 CPI x 100 inst = 4500 ns
Multicycle machine
  10 ns/cycle x 4.6 CPI (due to inst mix) x 100 inst = 4600 ns
Ideal pipelined machine
  10 ns/cycle x (1 CPI x 100 inst + 4 cycle drain) = 1040 ns
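The pipeline timing formulas can be checked numerically. The sketch below assumes a 5-stage pipeline (k = 5), which is what the "4 cycle drain" in the example implies; the function names are illustrative.

```python
def pipelined_cycles(stages: int, tasks: int) -> int:
    """T_k = k + (n - 1): k cycles for the first task, then one per cycle."""
    return stages + (tasks - 1)


def pipeline_speedup(stages: int, tasks: int) -> float:
    """S_k = n*k / (k + n - 1), relative to running each task unpipelined."""
    return (tasks * stages) / pipelined_cycles(stages, tasks)


k, n = 5, 100

# Matches the ideal pipelined machine in the example: 104 cycles at 10 ns each.
print("Pipelined time:", 10 * pipelined_cycles(k, n), "ns")
print(f"Speedup for n = {n}: {pipeline_speedup(k, n):.2f}")

# As n grows, the speedup approaches the number of stages k.
print(f"Speedup for n = 1,000,000: {pipeline_speedup(k, 1_000_000):.4f}")
```

Note that S_k compares against an unpipelined machine with the same cycle time. The 4500 ns single-cycle figure in the example uses a longer 45 ns cycle, so the measured ratio there is 4500 / 1040, not 500 / 104.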

So how hard is it to design a Pipelined Processor?

Go back and examine your datapath and control diagram
  associate resources with states
  ensure that flows do not conflict, or figure out how to resolve
  assert control in the appropriate stage

Sample Datapath

What do we need to do to pipeline the process?
[Figure: sample datapath with five stages: Instruction Fetch; Instr. Decode / Reg. Fetch; Execute / Addr. Calc; Memory Access; Write Back. Components shown: Next PC, adder (+4), instruction memory, register file (RS1, RS2, RD, Imm), sign extend, MUXes, Zero? test, data memory, LMD, WB data.]

5 Steps of Datapath

[Figure: the same datapath with pipeline latches IF/ID, ID/EX, EX/MEM, and MEM/WB between the stages; Next SEQ PC and RD are carried through the latches.]

Visualizing Pipelining

[Figure: instructions in program order (vertical) vs. time in clock cycles 1 to 7 (horizontal).]

Can't be that easy... problems?

Limits to pipelining: hazards prevent the next instruction from executing during its designated clock cycle and introduce stall cycles, which increase CPI
  Structural hazards: HW cannot support this combination of instructions (two dogs fighting for the same bone)
  Data hazards: instruction depends on the result of a prior instruction still in the pipeline
    Data dependencies
  Control hazards: caused by the delay between the fetching of instructions and decisions about changes in control flow (branches and jumps)
    Control dependencies
Can always resolve hazards by stalling
  But more stall cycles = more CPU time = less performance
  Increase performance = decrease stall cycles

Back to our old friend: CPU time equation

Recall the equation for CPU time
So what are we doing by pipelining the instruction execution process?
  Clock?
  Instruction count?
  CPI?
    How is CPI affected by the various hazards?

Speed Up Equation for Pipelining

\[ \text{CPI}_{\text{pipelined}} = \text{Ideal CPI} + \text{Average stall cycles per instruction} \]

\[ \text{Speedup} = \frac{\text{Pipeline depth}}{1 + \text{Pipeline stall CPI}} \times \frac{\text{Cycle Time}_{\text{unpipelined}}}{\text{Cycle Time}_{\text{pipelined}}} \]

More stalls means lower performance!

One Memory Port / Structural Hazards

[Figure: instruction order (Load, Instr 1, Instr 2, Instr 3, Instr 4) vs. time in clock cycles 1 to 7.]
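To see how stall cycles eat into pipeline speedup, the speedup equation can be evaluated directly. The sketch assumes an ideal CPI of 1, so that speedup = pipeline depth / (1 + stall cycles per instruction), scaled by the ratio of unpipelined to pipelined cycle time. The function name and the sample stall rates are hypothetical.

```python
def speedup_with_stalls(
    pipeline_depth: int,
    stall_cpi: float,
    cycle_unpipelined: float = 1.0,
    cycle_pipelined: float = 1.0,
) -> float:
    """Pipeline speedup assuming an ideal CPI of 1 plus average stall cycles."""
    return (pipeline_depth / (1.0 + stall_cpi)) * (cycle_unpipelined / cycle_pipelined)


# Hypothetical 5-stage pipeline with increasing stall cycles per instruction.
for stalls in (0.0, 0.25, 1.0):
    print(f"stall CPI {stalls:.2f} -> speedup {speedup_with_stalls(5, stalls):.2f}")
```

With no stalls the speedup equals the pipeline depth; one stall cycle per instruction on average halves it.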

One Memory Port / Structural Hazards

[Figure: instruction order (Load, Instr 1, Instr 2, Stall, Instr 3) vs. time in clock cycles 1 to 7; the stall inserts a bubble in each pipeline stage.]

Data Dependencies

True dependencies and false dependencies
  false implies we can remove the dependency, i.e., the compiler can remove them
  true implies we are stuck with it!
Three types of data dependencies, defined in terms of how the succeeding instruction depends on the preceding instruction
  RAW: Read after Write, or flow dependency
  WAR: Write after Read, or anti-dependency
  WAW: Write after Write

Data Hazards

Read After Write (RAW): Inst J tries to read an operand before Inst I writes it
  I: add r1,r2,r3
  J: sub r4,r1,r3
Caused by a "dependence" (in compiler nomenclature). This hazard results from an actual need for communication.

False Data Hazards

Write After Read (WAR): Inst J tries to write an operand before Inst I reads it
  I: add r1,r2,r3
  J: mul r2,r5,r6
Caused by a register dependence (in compiler nomenclature); can be removed at compile time by assigning different registers
  Assign a different register to the output of instruction J: mul r7,r5,r6
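The three dependency types can be detected mechanically from the registers each instruction reads and writes. The sketch below models an instruction as an operation, one destination register, and a tuple of source registers, with register names written as r1, r2, and so on; the `Instr` type and the `dependencies` function are illustrative names, not part of the course material.

```python
from typing import NamedTuple, Set, Tuple


class Instr(NamedTuple):
    op: str
    dest: str
    srcs: Tuple[str, ...]


def dependencies(first: Instr, second: Instr) -> Set[str]:
    """Classify how `second` depends on the earlier instruction `first`."""
    found = set()
    if first.dest in second.srcs:
        found.add("RAW")  # second reads what first writes (true dependency)
    if second.dest in first.srcs:
        found.add("WAR")  # second overwrites what first still has to read
    if second.dest == first.dest:
        found.add("WAW")  # both write the same register
    return found


add = Instr("add", "r1", ("r2", "r3"))
sub = Instr("sub", "r4", ("r1", "r3"))
mul = Instr("mul", "r2", ("r5", "r6"))
mul_renamed = Instr("mul", "r7", ("r5", "r6"))

print(dependencies(add, sub))          # RAW on r1: a true dependency
print(dependencies(add, mul))          # WAR on r2: a false dependency
print(dependencies(add, mul_renamed))  # no dependency after renaming
```

Renaming the destination of the `mul` removes the WAR dependency without changing what the program computes, which is why the slides call it a false dependency.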

Internal Forwarding: Getting rid of some hazards

In some cases the data needed by the next instruction at the ALU stage has been computed by the ALU (or some stage defining it) but has not been written back to the registers
Can we forward this result by bypassing stages?

Data Hazard on R1

[Figure: instruction order vs. time (clock cycles) through stages IF, ID/RF, EX, MEM, WB.]
  add r1,r2,r3
  sub r4,r1,r3
  and r6,r1,r7
  or  r8,r1,r9
  xor r10,r1,r11

Forwarding to Avoid Data Hazard

[Figure: the same instruction sequence vs. time (clock cycles), with the result of the add forwarded to the following instructions.]

Control Hazards: Branches

Instruction flow
  Stream of instructions processed by Inst. Fetch
  Speed of input flow puts a bound on the rate of outputs generated
Branch instruction affects instruction flow
  Do not know the next instruction to be executed until the branch outcome is known
When we hit a branch instruction
  Need to compute the target address (where to branch)
  Resolution of the branch condition (true or false)
  Might need to flush the pipeline if other instructions have been fetched for execution

Control Hazard on Branches: Three Stage Stall

  10: beq r1,r3,36
  14: and r2,r3,r5
  18: or  r6,r1,r7
  22: add r8,r1,r9
  36: xor r10,r1,r11

Solution?

Branch prediction algorithms
  Implemented in hardware
  Use history to predict a branch
  Example: for loop
    Branch always taken except for the last iteration

Is this how real processors look?

NO, more stuff...
Pipelining is the first step; next is Instruction Level Parallelism (ILP)
  What if we had many pipeline units?
ILP is transparent to the user
  Multiple operations executed in parallel even though the system is handed a single program written with a sequential processor in mind
Same execution hardware as a normal RISC machine
  May be more than one of any given type of hardware

Architectures for ILP

Scalar pipeline (baseline)
  Instruction parallelism = D
  Operation latency = 1
  Peak IPC = 1 (IPC: Instructions Per Cycle)
[Figure: successive instructions vs. time in cycles (of baseline machine); a pipeline of depth D with stages IF, DE, EX, WB.]
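The slides say only that hardware uses history to predict a branch. As one concrete illustration, the sketch below compares two common textbook schemes on the for-loop example: a 1-bit predictor that repeats the last outcome, and a 2-bit saturating counter. Both schemes, the function names, and the loop sizes are assumptions chosen for illustration, not the specific predictor of any processor discussed in the course.

```python
def one_bit_correct(outcomes, last_taken=True):
    """Predict that the branch does whatever it did last time."""
    correct = 0
    for taken in outcomes:
        if last_taken == taken:
            correct += 1
        last_taken = taken
    return correct


def two_bit_correct(outcomes, state=2):
    """2-bit saturating counter: states 0-1 predict not taken, 2-3 predict taken."""
    correct = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken == taken:
            correct += 1
        # Move one step toward the observed outcome, saturating at 0 and 3.
        state = min(3, state + 1) if taken else max(0, state - 1)
    return correct


# A 10-iteration loop branch (taken 9 times, then not taken), executed twice.
outcomes = ([True] * 9 + [False]) * 2

print("1-bit predictor correct:", one_bit_correct(outcomes), "of", len(outcomes))
print("2-bit predictor correct:", two_bit_correct(outcomes), "of", len(outcomes))
```

The 1-bit scheme mispredicts twice per loop execution after the first (at loop exit and again at re-entry), while the 2-bit counter needs two wrong outcomes in a row before it changes its prediction, so it misses only the loop exit.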

Superscalar Processors

Superscalar (pipelined) execution
  IP = D x N
  OL = 1 baseline cycle
  Peak IPC = N per baseline cycle
[Figure: N instructions issued per cycle, each passing through IF, DE, EX, WB.]

Is it that simple...

Opportunities (to speed things up) are more, but problems become more challenging

So can SW do anything about the problems?

This is where you get to claim SW folks are smarter than HW folks!
Compiler can look at the entire code
  Analyze dependencies at compile time
  Rewrite code
    Rearrange instructions to improve parallelism
    Make better use of registers
These are all things that modern optimizing compilers do!

Example

  1. ADD r1, r2, r3
  2. MUL r4, r1, r2
  3. ADD r2, r4, r3
  4. MUL r10, r11, r12
  5. ADD r14, r10, r11
  6. SUB r15, r14, r12

{1,2,3} are dependent on each other: sequential
{4,5,6} are dependent on each other: sequential
No parallelism in the code when parsing sequentially

Example

  1. ADD r1, r2, r3
  2. MUL r4, r1, r2
  3. ADD r2, r4, r3
  4. MUL r10, r11, r12
  5. ADD r14, r10, r11
  6. SUB r15, r14, r12

{1,2,3} are dependent on each other: sequential
{4,5,6} are dependent on each other: sequential
As a group, {1,2,3} and {4,5,6} are not dependent on each other. Therefore:

  1. ADD r1, r2, r3
  2. MUL r10, r11, r12     {1,2} are independent
  3. MUL r4, r1, r2
  4. ADD r14, r10, r11     {3,4} independent
  5. ADD r2, r4, r3
  6. SUB r15, r14, r12     {5,6} independent

Now we have parallelism in the code

So are ILP processors the real thing?

NO! Even more techniques: Multithreaded Processing
Have you written programs with multiple threads (in Java)?
Question: can we run threads in parallel?
Now we enter the realm of multi-core processors
Problems become even more challenging, but opportunities for performance improvement explode!
[Figure: fine-grain vs. coarse-grain multithreading; legend: Thread 1 to Thread 5, idle slot.]
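The reordering in the example can be reproduced with a small greedy scheduler. This is only a sketch of the idea, not how any particular compiler implements instruction scheduling: it treats every RAW, WAR, or WAW relation as an ordering constraint and, each cycle, issues up to `width` instructions whose predecessors were issued in earlier cycles. The tuple layout and the function names are illustrative.

```python
def depends_on(later, earlier):
    """True if `later` must wait for `earlier` (RAW, WAR, or WAW)."""
    _, later_dest, later_srcs = later
    _, earlier_dest, earlier_srcs = earlier
    return (
        earlier_dest in later_srcs      # RAW
        or later_dest in earlier_srcs   # WAR
        or later_dest == earlier_dest   # WAW
    )


def schedule(program, width=2):
    """Return issue groups as lists of 1-based instruction numbers."""
    n = len(program)
    preds = [
        {j for j in range(i) if depends_on(program[i], program[j])}
        for i in range(n)
    ]
    done, groups = set(), []
    while len(done) < n:
        group = []
        for i in range(n):
            # Ready only if every predecessor issued in an earlier cycle.
            if i not in done and preds[i] <= done and len(group) < width:
                group.append(i)
        done.update(group)
        groups.append([i + 1 for i in group])
    return groups


# The six-instruction example: (operation, destination, sources).
program = [
    ("ADD", "r1", ("r2", "r3")),
    ("MUL", "r4", ("r1", "r2")),
    ("ADD", "r2", ("r4", "r3")),
    ("MUL", "r10", ("r11", "r12")),
    ("ADD", "r14", ("r10", "r11")),
    ("SUB", "r15", ("r14", "r12")),
]

print(schedule(program, width=2))  # pairs one instruction from each chain
print(schedule(program, width=1))  # a scalar machine issues one per cycle
```

With two issue slots the six instructions complete in three issue cycles instead of six, pairing instruction 1 with 4, 2 with 5, and 3 with 6, which is the same grouping as the reordered listing.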

Simultaneous Multi-threading...

[Figure: issue slots per cycle across 8 functional units (M, M, FX, FX, FP, FP, BR, CC), shown for one thread and for two threads. M = Load/Store, FX = Fixed Point, FP = Floating Point, BR = Branch, CC = Condition Codes.]

Now this is some real serious stuff, but NO, this is not yet the kick-a** stuff

[Figure: issue slots over time (processor cycles) for Superscalar, Fine-Grained, Coarse-Grained, and Simultaneous Multithreading; legend: Thread 1 to Thread 5, idle slot.]

And NOW we are talking real stuff...

[Figure: issue slots over time (processor cycles) for Multiprocessing (Processor 1 and Processor 2, each running its own thread/code) and for SMT (Simultaneous Multithreading, 1 processor); legend: Thread 1 to Thread 5, idle slot.]

Multi-Core

Dual-core Intel Xeon processors
  Each core is hyper-threaded
  Private L1 caches
  Shared L2 caches
[Figure: Intel Xeon dual-core: Core 0 and Core 1, each with hyper-threads and its own L1 cache, sharing an L2 cache and memory.]

Next: Did we forget a key part of a computer system?

Key component in a computer? Memory
  How are real memory systems organized?
  How do they affect performance?

COSC 6385 Computer Architecture. - Pipelining

COSC 6385 Computer Architecture. - Pipelining COSC 6385 Compute Achitectue - Pipelining Sping 2012 Some of the slides ae based on a lectue by David Culle, Pipelining Pipelining is an implementation technique wheeby multiple instuctions ae ovelapped

More information

CISC 662 Graduate Computer Architecture Lecture 6 - Hazards

CISC 662 Graduate Computer Architecture Lecture 6 - Hazards CISC 662 Gaduate Compute Achitectue Lectue 6 - Hazads Michela Taufe http://www.cis.udel.edu/~taufe/teaching/cis662f07 Powepoint Lectue Notes fom John Hennessy and David Patteson s: Compute Achitectue,

More information

Introduction To Pipelining. Chapter Pipelining1 1

Introduction To Pipelining. Chapter Pipelining1 1 Intoduction To Pipelining Chapte 6.1 - Pipelining1 1 Mooe s Law Mooe s Law says that the numbe of pocessos on a chip doubles about evey 18 months. Given the data on the following two slides, is this tue?

More information

Lecture 8 Introduction to Pipelines Adapated from slides by David Patterson

Lecture 8 Introduction to Pipelines Adapated from slides by David Patterson Lectue 8 Intoduction to Pipelines Adapated fom slides by David Patteson http://www-inst.eecs.bekeley.edu/~cs61c/ * 1 Review (1/3) Datapath is the hadwae that pefoms opeations necessay to execute pogams.

More information

Computer Science 141 Computing Hardware

Computer Science 141 Computing Hardware Compute Science 141 Computing Hadwae Fall 2006 Havad Univesity Instucto: Pof. David Books dbooks@eecs.havad.edu [MIPS Pipeline Slides adapted fom Dave Patteson s UCB CS152 slides and May Jane Iwin s CSE331/431

More information

The Processor: Improving Performance Data Hazards

The Processor: Improving Performance Data Hazards The Pocesso: Impoving Pefomance Data Hazads Monday 12 Octobe 15 Many slides adapted fom: and Design, Patteson & Hennessy 5th Edition, 2014, MK and fom Pof. May Jane Iwin, PSU Summay Pevious Class Pipeline

More information

COEN-4730 Computer Architecture Lecture 2 Review of Instruction Sets and Pipelines

COEN-4730 Computer Architecture Lecture 2 Review of Instruction Sets and Pipelines 1 COEN-4730 Compute Achitectue Lectue 2 Review of nstuction Sets and Pipelines Cistinel Ababei Dept. of Electical and Compute Engineeing Maquette Univesity Cedits: Slides adapted fom pesentations of Sudeep

More information

CS 61C: Great Ideas in Computer Architecture. Pipelining Hazards. Instructor: Senior Lecturer SOE Dan Garcia

CS 61C: Great Ideas in Computer Architecture. Pipelining Hazards. Instructor: Senior Lecturer SOE Dan Garcia CS 61C: Geat Ideas in Compute Achitectue Pipelining Hazads Instucto: Senio Lectue SOE Dan Gacia 1 Geat Idea #4: Paallelism So9wae Paallel Requests Assigned to compute e.g. seach Gacia Paallel Theads Assigned

More information

ECE331: Hardware Organization and Design

ECE331: Hardware Organization and Design ECE331: Hadwae Oganization and Design Lectue 16: Pipelining Adapted fom Compute Oganization and Design, Patteson & Hennessy, UCB Last time: single cycle data path op System clock affects pimaily the Pogam

More information

UCB CS61C : Machine Structures

UCB CS61C : Machine Structures inst.eecs.bekeley.edu/~cs61c UCB CS61C : Machine Stuctues Lectue SOE Dan Gacia Lectue 28 CPU Design : Pipelining to Impove Pefomance 2010-04-05 Stanfod Reseaches have invented a monitoing technique called

More information

CMCS Mohamed Younis CMCS 611, Advanced Computer Architecture 1

CMCS Mohamed Younis CMCS 611, Advanced Computer Architecture 1 CMCS 611-101 Advanced Compute Achitectue Lectue 6 Intoduction to Pipelining Septembe 23, 2009 www.csee.umbc.edu/~younis/cmsc611/cmsc611.htm Mohamed Younis CMCS 611, Advanced Compute Achitectue 1 Pevious

More information

Lecture #22 Pipelining II, Cache I

Lecture #22 Pipelining II, Cache I inst.eecs.bekeley.edu/~cs61c CS61C : Machine Stuctues Lectue #22 Pipelining II, Cache I Wiewold cicuits 2008-7-29 http://www.maa.og/editoial/mathgames/mathgames_05_24_04.html http://www.quinapalus.com/wi-index.html

More information

Chapter 4 (Part III) The Processor: Datapath and Control (Pipeline Hazards)

Chapter 4 (Part III) The Processor: Datapath and Control (Pipeline Hazards) Chapte 4 (Pat III) The Pocesso: Datapath and Contol (Pipeline Hazads) 陳瑞奇 (J.C. Chen) 亞洲大學資訊工程學系 Adapted fom class notes by Pof. M.J. Iwin, PSU and Pof. D. Patteson, UCB 1 吃感冒藥副作用怎麼辦? http://big5.sznews.com/health/images/attachement/jpg/site3/20120319/001558d90b3310d0c1683e.jpg

More information

Computer Architecture. Pipelining and Instruction Level Parallelism An Introduction. Outline of This Lecture

Computer Architecture. Pipelining and Instruction Level Parallelism An Introduction. Outline of This Lecture Compute Achitectue Pipelining and nstuction Level Paallelism An ntoduction Adapted fom COD2e by Hennessy & Patteson Slide 1 Outline of This Lectue ntoduction to the Concept of Pipelined Pocesso Pipelined

More information

Administrivia. CMSC 411 Computer Systems Architecture Lecture 5. Data Hazard Even with Forwarding Figure A.9, Page A-20

Administrivia. CMSC 411 Computer Systems Architecture Lecture 5. Data Hazard Even with Forwarding Figure A.9, Page A-20 Administivia CMSC 411 Compute Systems Achitectue Lectue 5 Basic Pipelining (cont.) Alan Sussman als@cs.umd.edu as@csu dedu Homewok poblems fo Unit 1 due today Homewok poblems fo Unit 3 posted soon CMSC

More information

CSE4201. Computer Architecture

CSE4201. Computer Architecture CSE 4201 Compute Achitectue Pof. Mokhta Aboelaze Pats of these slides ae taken fom Notes by Pof. David Patteson at UCB Outline MIPS and instuction set Simple pipeline in MIPS Stuctual and data hazads Fowading

More information

Review from last lecture

Review from last lecture CSE820 Gaduate Compute Achitectue Week 3 Pefomance + Pipeline Review Based on slides by David Patteson Review fom last lectue Tacking and extapolating technology pat of achitect s esponsibility Expect

More information

CENG 3420 Computer Organization and Design. Lecture 07: MIPS Processor - II. Bei Yu

CENG 3420 Computer Organization and Design. Lecture 07: MIPS Processor - II. Bei Yu CENG 3420 Compute Oganization and Design Lectue 07: MIPS Pocesso - II Bei Yu CEG3420 L07.1 Sping 2016 Review: Instuction Citical Paths q Calculate cycle time assuming negligible delays (fo muxes, contol

More information

Lecture Topics ECE 341. Lecture # 12. Control Signals. Control Signals for Datapath. Basic Processing Unit. Pipelining

Lecture Topics ECE 341. Lecture # 12. Control Signals. Control Signals for Datapath. Basic Processing Unit. Pipelining EE 341 Lectue # 12 Instucto: Zeshan hishti zeshan@ece.pdx.edu Novembe 10, 2014 Potland State Univesity asic Pocessing Unit ontol Signals Hadwied ontol Datapath contol signals Dealing with memoy delay Pipelining

More information

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Instruc>on Level Parallelism

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Instruc>on Level Parallelism Agenda CS 61C: Geat Ideas in Compute Achitectue (Machine Stuctues) Instuc>on Level Paallelism Instuctos: Randy H. Katz David A. PaJeson hjp://inst.eecs.bekeley.edu/~cs61c/fa10 Review Instuc>on Set Design

More information

CENG 3420 Lecture 07: Pipeline

CENG 3420 Lecture 07: Pipeline CENG 3420 Lectue 07: Pipeline Bei Yu byu@cse.cuhk.edu.hk CENG3420 L07.1 Sping 2017 Outline q Review: Flip-Flop Contol Signals q Pipeline Motivations q Pipeline Hazads q Exceptions CENG3420 L07.2 Sping

More information

User Visible Registers. CPU Structure and Function Ch 11. General CPU Organization (4) Control and Status Registers (5) Register Organisation (4)

User Visible Registers. CPU Structure and Function Ch 11. General CPU Organization (4) Control and Status Registers (5) Register Organisation (4) PU Stuctue and Function h Geneal Oganisation Registes Instuction ycle Pipelining anch Pediction Inteupts Use Visible Registes Vaies fom one achitectue to anothe Geneal pupose egiste (GPR) ata, addess,

More information

You Are Here! Review: Hazards. Agenda. Agenda. Review: Load / Branch Delay Slots 7/28/2011

You Are Here! Review: Hazards. Agenda. Agenda. Review: Load / Branch Delay Slots 7/28/2011 CS 61C: Geat Ideas in Compute Achitectue (Machine Stuctues) Instuction Level Paallelism: Multiple Instuction Issue Guest Lectue: Justin Hsia Softwae Paallel Requests Assigned to compute e.g., Seach Katz

More information

CS 61C: Great Ideas in Computer Architecture Instruc(on Level Parallelism: Mul(ple Instruc(on Issue

CS 61C: Great Ideas in Computer Architecture Instruc(on Level Parallelism: Mul(ple Instruc(on Issue CS 61C: Geat Ideas in Compute Achitectue Instuc(on Level Paallelism: Mul(ple Instuc(on Issue Instuctos: Kste Asanovic, Randy H. Katz hbp://inst.eecs.bekeley.edu/~cs61c/fa12 1 Paallel Requests Assigned

More information

Review: Moore s Law. EECS 252 Graduate Computer Architecture Lecture 2. Review: Joy s Law in ManyCore world. Bell s Law new class per decade

Review: Moore s Law. EECS 252 Graduate Computer Architecture Lecture 2. Review: Joy s Law in ManyCore world. Bell s Law new class per decade EECS 252 Gaduate Compute Achitectue Lectue 2 ℵ 0 Review of Instuction Sets, Pipelines, and Caches Januay 26 th, 2009 Review Mooe s Law John Kubiatowicz Electical Engineeing and Compute Sciences Univesity

More information

COSC 6385 Computer Architecture - Pipelining

COSC 6385 Computer Architecture - Pipelining COSC 6385 Computer Architecture - Pipelining Fall 2006 Some of the slides are based on a lecture by David Culler, Instruction Set Architecture Relevant features for distinguishing ISA s Internal storage

More information

Pre-requisites. This is a textbook-based course. Chapter 1. Pipelines, Performance, Caches, and Virtual Memory. January 2009 Paul H J Kelly

Pre-requisites. This is a textbook-based course. Chapter 1. Pipelines, Performance, Caches, and Virtual Memory. January 2009 Paul H J Kelly 332 Advanced Compute Achitectue Chapte 1 Intoduction and eview of Pipelines, Pefomance, Caches, and Vitual Januay 2009 Paul H J Kelly These lectue notes ae patly based on the couse text, Hennessy and Patteson

More information

Any modern computer system will incorporate (at least) two levels of storage:

Any modern computer system will incorporate (at least) two levels of storage: 1 Any moden compute system will incopoate (at least) two levels of stoage: pimay stoage: andom access memoy (RAM) typical capacity 32MB to 1GB cost pe MB $3. typical access time 5ns to 6ns bust tansfe

More information

A Memory Efficient Array Architecture for Real-Time Motion Estimation

A Memory Efficient Array Architecture for Real-Time Motion Estimation A Memoy Efficient Aay Achitectue fo Real-Time Motion Estimation Vasily G. Moshnyaga and Keikichi Tamau Depatment of Electonics & Communication, Kyoto Univesity Sakyo-ku, Yoshida-Honmachi, Kyoto 66-1, JAPAN

More information

CISC 662 Graduate Computer Architecture Lecture 6 - Hazards

CISC 662 Graduate Computer Architecture Lecture 6 - Hazards CISC 662 Graduate Computer Architecture Lecture 6 - Hazards Michela Taufer http://www.cis.udel.edu/~taufer/teaching/cis662f07 Powerpoint Lecture Notes from John Hennessy and David Patterson s: Computer

More information

Computer Architecture

Computer Architecture Lecture 3: Pipelining Iakovos Mavroidis Computer Science Department University of Crete 1 Previous Lecture Measurements and metrics : Performance, Cost, Dependability, Power Guidelines and principles in

More information

All lengths in meters. E = = 7800 kg/m 3

All lengths in meters. E = = 7800 kg/m 3 Poblem desciption In this poblem, we apply the component mode synthesis (CMS) technique to a simple beam model. 2 0.02 0.02 All lengths in metes. E = 2.07 10 11 N/m 2 = 7800 kg/m 3 The beam is a fee-fee

More information

EI338: Computer Systems and Engineering (Computer Architecture & Operating Systems)

EI338: Computer Systems and Engineering (Computer Architecture & Operating Systems) EI338: Computer Systems and Engineering (Computer Architecture & Operating Systems) Chentao Wu 吴晨涛 Associate Professor Dept. of Computer Science and Engineering Shanghai Jiao Tong University SEIEE Building

More information

Pipeline Overview. Dr. Jiang Li. Adapted from the slides provided by the authors. Jiang Li, Ph.D. Department of Computer Science

Pipeline Overview. Dr. Jiang Li. Adapted from the slides provided by the authors. Jiang Li, Ph.D. Department of Computer Science Pipeline Overview Dr. Jiang Li Adapted from the slides provided by the authors Outline MIPS An ISA for Pipelining 5 stage pipelining Structural and Data Hazards Forwarding Branch Schemes Exceptions and

More information

DYNAMIC STORAGE ALLOCATION. Hanan Samet

DYNAMIC STORAGE ALLOCATION. Hanan Samet ds0 DYNAMIC STORAGE ALLOCATION Hanan Samet Compute Science Depatment and Cente fo Automation Reseach and Institute fo Advanced Compute Studies Univesity of Mayland College Pak, Mayland 07 e-mail: hjs@umiacs.umd.edu
