Barriers. CS252 Graduate Computer Architecture Lecture 22. Synchronization (con't), Memory Technology, Error Correction Codes. April 18th, 2010


CS252 Graduate Computer Architecture
Lecture 22
Synchronization (con't), Memory Technology, Error Correction Codes
April 18th, 2010
John Kubiatowicz
Electrical Engineering and Computer Sciences
University of California, Berkeley

Review: Zoo of hardware primitives
  test&set (&address) {                      /* most architectures */
      result = M[address];
      M[address] = 1;
      return result;
  }
  swap (&address, register) {                /* x86 */
      temp = M[address];
      M[address] = register;
      register = temp;
  }
  compare&swap (&address, reg1, reg2) {      /* */
      if (reg1 == M[address]) {
          M[address] = reg2;
          return success;
      } else {
          return failure;
      }
  }
  load-linked&store-conditional (&address) { /* R4000, alpha */
      loop:
          ll   r1, M[address];
          mov  r2, 1;                        /* Can do arbitrary comp */
          sc   r2, M[address];
          beqz r2, loop;
  }
4/13/2011 cs252-s11, Lecture 21

Barriers
  Software algorithms implemented using locks, flags, counters
  Hardware barriers
    Wired-AND line separate from address/data bus
      » Set input high when arrive, wait for output to be high to leave
    In practice, multiple wires to allow reuse
    Useful when barriers are global and very frequent
    Difficult to support arbitrary subset of processors
      » even harder with multiple processes per processor
    Difficult to dynamically change number and identity of participants
      » e.g. latter due to process migration
    Not common today on bus-based machines

A Simple Centralized Barrier
  Shared counter: number of processes that have arrived
    increment when arrive (lock), check until reaches numprocs
  Problem?

  struct bar_type {
      int counter;
      struct lock_type lock;
      int flag = 0;
  } bar_name;

  BARRIER (bar_name, p) {
      LOCK(bar_name.lock);
      if (bar_name.counter == 0)
          bar_name.flag = 0;              /* reset flag if first to reach */
      mycount = ++bar_name.counter;       /* mycount is private */
      UNLOCK(bar_name.lock);
      if (mycount == p) {                 /* last to arrive */
          bar_name.counter = 0;           /* reset for next barrier */
          bar_name.flag = 1;              /* release waiters */
      } else
          while (bar_name.flag == 0) {};  /* busy wait for release */
  }
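The simple barrier above cannot safely be entered twice in a row: a fast process can re-enter and reset the flag before slower waiters have seen the release. The following is a minimal Python sketch of the sense-reversal fix that the next slide introduces. It is an illustration only: the class name `SenseReversingBarrier`, the helper `run_phases`, and the use of `threading.Lock` in place of the slide's LOCK/UNLOCK are assumptions, not part of the lecture.

```python
import threading
import time


class SenseReversingBarrier:
    """Centralized barrier with sense reversal for a fixed number of threads."""

    def __init__(self, num_threads):
        self.num_threads = num_threads
        self.counter = 0                 # arrivals in the current episode
        self.flag = False                # shared release flag
        self.lock = threading.Lock()     # protects counter
        self.local = threading.local()   # holds each thread's private sense

    def wait(self):
        # Toggle the private sense: the value the flag must take to release us.
        local_sense = not getattr(self.local, "sense", False)
        self.local.sense = local_sense
        with self.lock:
            self.counter += 1
            last = self.counter == self.num_threads
            if last:
                self.counter = 0         # reset for the next episode
        if last:
            self.flag = local_sense      # release the waiters
        else:
            while self.flag != local_sense:
                time.sleep(0)            # busy-wait, yielding to other threads


def run_phases(num_threads=4, num_phases=50):
    """Hypothetical check: returns True if no thread ever ran ahead of a barrier."""
    barrier = SenseReversingBarrier(num_threads)
    phase = [0] * num_threads
    violations = []

    def worker(tid):
        for step in range(num_phases):
            phase[tid] = step
            barrier.wait()
            if min(phase) < step:        # someone has not reached this step
                violations.append((tid, step))
            barrier.wait()               # keep threads from advancing early

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return not violations
```

Because the flag alternates between two values, no thread ever has to reset it, which is what makes back-to-back use of the same barrier safe in this sketch.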

A Working Centralized Barrier
  Consecutively entering the same barrier doesn't work
    Must prevent process from entering until all have left previous instance
    Could use another counter, but increases latency and contention
  Sense reversal: wait for flag to take different value consecutive times
    Toggle this value only when all processes reach

  BARRIER (bar_name, p) {
      local_sense = !(local_sense);       /* toggle private sense variable */
      LOCK(bar_name.lock);
      mycount = bar_name.counter++;       /* mycount is private */
      if (bar_name.counter == p) {        /* last to arrive */
          bar_name.counter = 0;           /* reset for next barrier */
          UNLOCK(bar_name.lock);
          bar_name.flag = local_sense;    /* release waiters */
      } else {
          UNLOCK(bar_name.lock);
          while (bar_name.flag != local_sense) {};
      }
  }

Improved Barrier Algorithms for a Bus
  Software combining tree
    Only k processors access the same location, where k is degree of tree
  [Figure: flat barrier (contention) vs. tree-structured barrier (little contention)]
  Separate arrival and exit trees, and use sense reversal
  Valuable in distributed network: communicate along different paths
  On bus, all traffic goes on same bus, and no less total traffic
  Higher latency (log p steps of work, and O(p) serialized bus transactions)
  Advantage on bus is use of ordinary reads/writes instead of locks

Lock-Free Synchronization
  What happens if process grabs lock, then goes to sleep???
    Page fault
    Processor scheduling
    Etc
  Lock-free synchronization:
    Operations do not require mutual exclusion of multiple insts
    Nonblocking:
      » Some process will complete in a finite amount of time even if other processors halt
    Wait-Free (Herlihy):
      » Every (nonfaulting) process will complete in a finite amount of time
  Systems based on LL&SC can implement these

Transactional Memory
  Transaction-based model of memory
    Interface:
      » start transaction();
      » read/write data
      » commit transaction();
    If conflicts detected, commit will abort and must be retried
    What is a conflict?
      » If values you read are written by others in middle of transaction
      » If values you write are written by others in middle of transaction
  Hardware support for transactions
    Typically uses cache coherence protocol to help process
    How to detect conflict?
      » Set R/W flags on cache line when access
      » Conflicts detected when cache line invalidates (and/or interventions) notice bits set
    Eager conflict detection:
      » Newer transaction is assumed to conflict with older one
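To make the start/read/write/commit interface concrete, here is a small software sketch in Python. It is not the hardware scheme from the slide: conflicts are detected lazily at commit time by comparing version numbers, whereas the slide describes eager detection through the cache coherence protocol. The names `Cell`, `Transaction`, and `atomically` are hypothetical and chosen only for this illustration.

```python
import threading


class Cell:
    """A shared memory word with a version number bumped on every commit."""

    def __init__(self, value):
        self.value = value
        self.version = 0


# Serializes commits; a software stand-in for what coherence hardware would do.
_commit_lock = threading.Lock()


class Transaction:
    def __init__(self):
        self.seen = {}     # cell -> version observed at first access
        self.writes = {}   # cell -> buffered new value

    def read(self, cell):
        self.seen.setdefault(cell, cell.version)
        return self.writes.get(cell, cell.value)

    def write(self, cell, value):
        self.seen.setdefault(cell, cell.version)
        self.writes[cell] = value          # buffered until commit

    def commit(self):
        """Return True on success, False if a conflict forces an abort."""
        with _commit_lock:
            # Conflict: a cell we read or wrote was written by someone else.
            if any(cell.version != v for cell, v in self.seen.items()):
                return False
            for cell, value in self.writes.items():
                cell.value = value
                cell.version += 1
            return True


def atomically(body):
    """Run body(txn) and retry it until its commit succeeds."""
    while True:
        txn = Transaction()
        result = body(txn)
        if txn.commit():
            return result
```

For example, if two transactions both read a cell and then write it, the first commit succeeds and the second returns `False`; wrapping the body in `atomically` retries it against the updated value.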

Brief discussion of Transactional Memory
  LogTM: Log-based Transactional Memory
    Kevin Moore, Jayaram Bobba, Michelle Moravan, Mark Hill & David Wood
    Use of Cache Coherence protocol to detect transaction conflicts
  Transactional Interface:
    begin_transaction(): Requests that subsequent statements form a transaction
    commit_transaction(): Ends successful transaction begun by matching begin_transaction(). Discards any transaction state saved for potential abort
    abort_transaction(): Transfers control to a previously registered conflict handler which should undo and discard work since last begin_transaction()
4/18/2011 cs252-s11, Lecture 22

Specific Logging Mechanism
  [Figure only]

Main Memory Background
  Performance of Main Memory:
    Latency: Cache Miss Penalty
      » Access Time: time between request and word arrives
      » Cycle Time: time between requests
    Bandwidth: I/O & Large Block Miss Penalty (L2)
  Main Memory is DRAM: Dynamic Random Access Memory
    Dynamic since needs to be refreshed periodically (8 ms, 1% time)
    Addresses divided into 2 halves (Memory as a 2D matrix):
      » RAS or Row Address Strobe
      » CAS or Column Address Strobe
  Cache uses SRAM: Static Random Access Memory
    No refresh (6 transistors/bit vs. 1 transistor)
    Size: DRAM/SRAM 4-8, Cost/Cycle time: SRAM/DRAM 8-16

DRAM Architecture
  [Figure: memory cell array. An (N+M)-bit address is split into N row bits for the Row Address Decoder (word lines, Row 1 to Row 2^N) and M column bits for the Column Decoder & Sense Amplifiers (bit lines, Col. 1 to Col. 2^M); each memory cell holds one bit; data appears on D]
  Bits stored in 2-dimensional arrays on chip
  Modern chips have around 4 logical banks on each chip
    each logical bank physically implemented as many smaller arrays

1-T Memory Cell (DRAM)
  [Figure: one-transistor cell with row select line and bit line]
  Write:
    1. Drive bit line
    2. Select row
  Read:
    1. Precharge bit line to Vdd/2
    2. Select row
    3. Cell and bit line share charges
      » Very small voltage changes on the bit line
    4. Sense (fancy sense amp)
      » Can detect changes of ~1 million electrons
    5. Write: restore the value
  Refresh
    1. Just do a dummy read to every cell.

DRAM Capacitors: more capacitance in a small area
  Trench capacitors:
    Logic ABOVE capacitor
    Gain in surface area of capacitor
    Better scaling properties
    Better planarization
  Stacked capacitors:
    Logic BELOW capacitor
    Gain in surface area of capacitor
    2-dim cross-section quite small

DRAM Operation: Three Steps
  Precharge
    charges bit lines to known value, required before next row access
  Row access (RAS)
    decode row address, enable addressed row (often multiple Kb in row)
    bitlines share charge with storage cell
    small change in voltage detected by sense amplifiers which latch whole row of bits
    sense amplifiers drive bitlines full rail to recharge storage cells
  Column access (CAS)
    decode column address to select small number of sense amplifier latches (4, 8, 16, or 32 bits depending on DRAM package)
    on read, send latched bits out to chip pins
    on write, change sense amplifier latches, which then charge storage cells to required value
    can perform multiple column accesses on same row without another row access (burst mode)

DRAM Read Timing (Example)
  [Figure: 256K x 8 DRAM with inputs RAS_L, CAS_L, WE_L, OE_L, a 9-bit address A, and an 8-bit data bus D. Timing diagram over one DRAM Read Cycle Time: A carries Row Address, Col Address, Junk; D goes from High Z through Junk to Data Out and back to High Z; Read Access Time and Output Enable Delay are marked]
  Every DRAM access begins at:
    The assertion of the RAS_L
    2 ways to read: early or late v. CAS
  Early Read Cycle: OE_L asserted before CAS_L
  Late Read Cycle: OE_L asserted after CAS_L

Main Memory Performance
  [Figure: timeline marking Access Time and Cycle Time]
  DRAM (Read/Write) Cycle Time >> DRAM (Read/Write) Access Time
    » about 2:1; why?
  DRAM (Read/Write) Cycle Time:
    » How frequent can you initiate an access?
    » Analogy: A little kid can only ask his father for money on Saturday
  DRAM (Read/Write) Access Time:
    » How quickly will you get what you want once you initiate an access?
    » Analogy: As soon as he asks, his father will give him the money
  DRAM Bandwidth Limitation analogy:
    » What happens if he runs out of money on Wednesday?

Increasing Bandwidth - Interleaving
  Access pattern without interleaving (CPU, Memory):
    Start access for D1, D1 available, start access for D2
  Access pattern with 4-way interleaving (CPU, Memory Banks 0 to 3):
    Access bank 0, access bank 1, access bank 2, access bank 3; then we can access bank 0 again

Main Memory Performance
  Simple: CPU, Cache, Bus, Memory same width (32 bits)
  Wide: CPU/Mux 1 word; Mux/Cache, Bus, Memory N words (Alpha: 64 bits & 256 bits)
  Interleaved: CPU, Cache, Bus 1 word: Memory N Modules (4 Modules); example is word interleaved

Quest for DRAM Performance
  1. Fast Page mode
    Add timing signals that allow repeated accesses to row buffer without another row access time
    Such a buffer comes naturally, as each array will buffer 1024 to 2048 bits for each access
  2. Synchronous DRAM (SDRAM)
    Add a clock signal to DRAM interface, so that the repeated transfers would not bear overhead to synchronize with DRAM controller
  3. Double Data Rate (DDR SDRAM)
    Transfer data on both the rising edge and falling edge of the DRAM clock signal, doubling the peak data rate
    DDR2 lowers power by dropping the voltage from 2.5 to 1.8 volts + offers higher clock rates: up to 400 MHz
    DDR3 drops to 1.5 volts + higher clock rates: up to 800 MHz
  Improved Bandwidth, not Latency
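The effect of transferring on both clock edges is simple arithmetic, sketched below in Python. The function names are hypothetical, and the 64-bit bus width used in the usage note is an assumption for illustration; the clock rates are the ones quoted on the slide. These are peak figures only and say nothing about latency.

```python
def per_pin_rate_mbit_s(clock_mhz, transfers_per_clock):
    """Peak data rate on one data pin, in Mbit/s."""
    return clock_mhz * transfers_per_clock


def peak_bandwidth_mbyte_s(clock_mhz, transfers_per_clock, bus_width_bits):
    """Peak bandwidth of the whole data bus, in MB/s (bus width is an input)."""
    return per_pin_rate_mbit_s(clock_mhz, transfers_per_clock) * bus_width_bits / 8


# Single data rate moves one word per clock; double data rate moves two.
SDR, DDR = 1, 2
```

With an assumed 64-bit bus, `peak_bandwidth_mbyte_s(400, DDR, 64)` gives 6400 MB/s, twice the 3200 MB/s that the same clock would deliver at single data rate.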

Fast Memory Systems: DRAM specific
  Multiple CAS accesses: several names (page mode)
    » Extended Data Out (EDO): 30% faster in page mode
  Newer DRAMs to address gap; what will they cost, will they survive?
    RAMBUS: startup company; reinvented DRAM interface
      » Each chip a module vs. slice of memory
      » Short bus between CPU and chips
      » Does own refresh
      » Variable amount of data returned
      » 1 byte / 2 ns (500 MB/s per chip)
    Synchronous DRAM: 2 banks on chip, a clock signal to DRAM, transfer synchronous to system clock ( MHz)
      » DDR DRAM: Two transfers per clock (on rising and falling edge)
    Intel claims FB-DIMM is the next big thing
      » Stands for Fully-Buffered Dual-Inline RAM
      » Same basic technology as DDR, but utilizes a serial daisy-chain channel between different memory components.

Fast Page Mode Operation
  Regular DRAM Organization:
    N rows x N column x M-bit
    Read & Write M-bit at a time
    Each M-bit access requires a RAS / CAS cycle
  Fast Page Mode DRAM
    N x M "SRAM" to save a row
  After a row is read into the register
    Only CAS is needed to access other M-bit blocks on that row
    RAS_L remains asserted while CAS_L is toggled
  [Figure: DRAM array of N rows by N cols feeding an N x M "SRAM" row register with an M-bit output, addressed by Row Address and Column Address. Timing diagram: RAS_L stays asserted while CAS_L toggles; A carries one Row Address followed by four Col Addresses for the 1st, 2nd, 3rd and 4th M-bit accesses]

SDRAM timing (Single Data Rate)
  [Figure: timing diagram showing Precharge, RAS, CAS, CAS Latency, Burst READ]
  Micron 128M-bit dram (using 2Meg x 16bit x 4bank ver)
    Row (12 bits), bank (2 bits), column (9 bits)

Double-Data Rate (DDR2) DRAM
  [Figure: timing diagram with a 200MHz clock showing Row, Column, Precharge, Row commands and Data at a 400Mb/s data rate. Source: Micron, 256Mb DDR2 SDRAM datasheet]
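The row, bank, and column widths quoted for the Micron 128 Mbit part can be checked with a short Python sketch. The field order used here (row in the high bits, then bank, then column) is an assumption made for illustration, since the slide gives only the widths; the 16-bit word width follows from 128 Mbit divided by 2^(12+2+9) locations. Function names are hypothetical.

```python
ROW_BITS, BANK_BITS, COL_BITS = 12, 2, 9   # widths from the slide
WORD_BITS = 16                             # 128 Mbit / 2**23 locations


def split_address(addr):
    """Split a flat word address into (row, bank, column).

    Assumed layout, high to low bits: row | bank | column.
    """
    col = addr & ((1 << COL_BITS) - 1)
    bank = (addr >> COL_BITS) & ((1 << BANK_BITS) - 1)
    row = (addr >> (COL_BITS + BANK_BITS)) & ((1 << ROW_BITS) - 1)
    return row, bank, col


def join_address(row, bank, col):
    """Inverse of split_address under the same assumed layout."""
    return (row << (COL_BITS + BANK_BITS)) | (bank << COL_BITS) | col


def capacity_bits():
    """Total capacity implied by the field widths and the word width."""
    return (1 << (ROW_BITS + BANK_BITS + COL_BITS)) * WORD_BITS
```

Real controllers often place the bank bits elsewhere in the address so that consecutive accesses fall in different banks; the widths, not the order, are what the capacity check depends on.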

DDR vs DDR2 vs DDR3 vs DDR4
  All about increasing the rate at the pins
  Not an improvement in latency
    In fact, latency can sometimes be worse
  Internal banks often consumed for increased bandwidth
  DDR4 (January 2011)
    Samsung, Currently 2.13Gb/sec
    Target: 4 Gb/sec

DRAM Power: Not always up, but
  [Figure only]

DRAM Packaging
  [Figure: DRAM chip with ~7 clock and control signals, ~12 address lines (multiplexed row/column address), and a data bus (4b, 8b, 16b, 32b)]
  DIMM (Dual Inline Memory Module) contains multiple chips arranged in "ranks"
  Each rank has clock/control/address signals connected in parallel (sometimes need buffers to drive signals to all chips), and data pins work together to return wide word
    e.g., a rank could implement a 64-bit data bus using 16 x4-bit chips, or a 64-bit data bus using 8 x8-bit chips.
  A modern DIMM usually has one or two ranks (occasionally 4 if high capacity)
    A rank will contain the same number of banks as each constituent chip (e.g., 4-8)

DRAM Channel
  [Figure: Memory Controller connected to two Ranks by a 64-bit Data Bus and a Command/Address Bus]

FB-DIMM Memories
  [Figure: Regular DIMM vs. FB-DIMM]
  Uses Commodity DRAMs with special controller on actual DIMM board
  Connection is in a serial form:
  [Figure: Controller connected to a daisy chain of FB-DIMMs]

FLASH Memory
  [Figure: Samsung NAND Flash chip, 2007]
  Like a normal transistor but:
    Has a floating gate that can hold charge
    To write: raise or lower wordline high enough to cause charges to tunnel
    To read: turn on wordline as if normal transistor
      » presence of charge changes threshold and thus measured current
  Two varieties:
    NAND: denser, must be read and written in blocks
    NOR: much less dense, fast to read and write

Tunneling Magnetic Junction (MRAM)
  Tunneling Magnetic Junction RAM (TMJ-RAM)
    Speed of SRAM, density of DRAM, non-volatile (no refresh)
    "Spintronics": combination quantum spin and electronics
    Same technology used in high-density disk-drives

Phase Change memory (IBM, Samsung, Intel)
  Phase Change Memory (called PRAM or PCM)
    Chalcogenide material can change from amorphous to crystalline state with application of heat
    Two states have very different resistive properties
    Similar to material used in CD-RW process
  Exciting alternative to FLASH
    Higher speed
    May be easy to integrate with CMOS processes

Error Correction Codes (ECC)
  Memory systems generate errors (accidentally flipped bits)
    DRAMs store very little charge per bit
    "Soft" errors occur occasionally when cells are struck by alpha particles or other environmental upsets.
    Less frequently, "hard" errors can occur when chips permanently fail.
    Problem gets worse as memories get denser and larger
  Where is "perfect" memory required?
    servers, spacecraft/military computers, ebay, …
  Memories are protected against failures with ECCs
  Extra bits are added to each data-word
    used to detect and/or correct faults in the memory system
    in general, each possible data word value is mapped to a unique "code word". A fault changes a valid code word to an invalid one, which can be detected.

ECC Approach: Redundancy
  Approach: Redundancy
    Add extra information so that we can recover from errors
    Can we do better than just create complete copies?
  Block Codes: Data Coded in blocks
    k data bits coded into n encoded bits
    Measure of overhead: Rate of Code: k/n
    Often called an (n,k) code
    Consider data as vectors in GF(2) [i.e. vectors of bits]
  Code Space is set of all $2^n$ vectors, Data space set of $2^k$ vectors
    Encoding function: $C = f(d)$
    Decoding function: $d = f(C')$
    Not all possible code vectors, C, are valid!
General Idea: Code Vector Space
  [Figure: Code Space containing the code word $C_0 = f(v_0)$ for data value $v_0$, with the Code Distance (Hamming Distance) to its nearest neighbouring code word marked]
  Not every vector in the code space is valid
  Hamming Distance (d):
    Minimum number of bit flips to turn one code word into another
  Number of errors that we can detect: $(d-1)$
  Number of errors that we can fix: $\lfloor (d-1)/2 \rfloor$

Some Code Types
  Linear Codes: $C = G \cdot d$, $S = H \cdot C$
    Code is generated by G and in null-space of H
    (n,k) code: Data space $2^k$, Code space $2^n$
    (n,k,d) code: specify distance d as well
  Random code:
    Need to both identify errors and correct them
    Distance d: correct $\lfloor (d-1)/2 \rfloor$ errors
  Erasure code:
    Can correct errors if we know which bits/symbols are bad
    Example: RAID codes, where "symbols" are blocks of disk
    Distance d: correct $(d-1)$ errors
  Error detection code:
    Distance d: detect $(d-1)$ errors
  Hamming Codes
    d = 3: Columns nonzero, Distinct
    d = 4: Columns nonzero, Distinct, Odd-weight
  Binary Golay code: based on quadratic residues mod 23
    Binary code: [24, 12, 8] and [23, 12, 7].
    Often used in space-based schemes, can correct 3 errors
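The relationship between code distance and error handling can be checked on small codes. The sketch below, in Python, computes the minimum Hamming distance of a codebook and the resulting detection and correction limits. The function names and the two example codebooks are illustrative choices, not taken from the lecture.

```python
from itertools import combinations


def hamming_distance(a, b):
    """Number of bit positions in which two code words (as ints) differ."""
    return bin(a ^ b).count("1")


def code_distance(codewords):
    """Minimum Hamming distance over all pairs of distinct code words."""
    return min(hamming_distance(a, b) for a, b in combinations(codewords, 2))


def detectable_errors(d):
    """A distance-d code detects up to d-1 bit errors."""
    return d - 1


def correctable_errors(d):
    """A distance-d code corrects up to floor((d-1)/2) bit errors."""
    return (d - 1) // 2


# Two small example codebooks (illustrative assumptions):
REPETITION_3 = [0b000, 0b111]                # one data bit sent three times
EVEN_PARITY_3 = [0b000, 0b011, 0b101, 0b110]  # two data bits plus parity
```

The repetition code has distance 3, so it detects two errors or corrects one; the parity code has distance 2, so it detects one error and corrects none.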

Hamming Bound, symbols in GF(2)
  Consider an (n,k) code with distance d
    How do n, k, and d relate to one another?
  First question: How big are spheres?
    For distance d, spheres are of radius $\lfloor (d-1)/2 \rfloor$,
      » i.e. all errors with weight $\lfloor (d-1)/2 \rfloor$ or less must fit within sphere
    Thus, size of sphere is at least:
      1 + Num(1-bit err) + Num(2-bit err) + … + Num($\lfloor (d-1)/2 \rfloor$-bit err)
      $\text{Size} \ge \sum_{e=0}^{\lfloor (d-1)/2 \rfloor} \binom{n}{e}$
  Hamming bound reflects bin-packing of spheres:
    need $2^k$ of these spheres within code space
      $2^k \cdot \sum_{e=0}^{\lfloor (d-1)/2 \rfloor} \binom{n}{e} \le 2^n$
      $2^k \cdot (1 + n) \le 2^n, \quad d = 3$

How to Generate code words?
  Consider a linear code. Need a Generator Matrix.
    Let v be the data value (k bits), C be resulting code (n bits):
      $C = G \cdot v$
    G must be an $n \times k$ matrix
  Are there $2^k$ unique code values?
    Only if the k columns of G are linearly independent!
  Of course, need some way of decoding as well.
      $v = f_d(C')$
    Is this linear??? Why or why not?
  A code is systematic if the data is directly encoded within the code words.
    Means Generator has form:
      $G = \begin{bmatrix} I \\ P \end{bmatrix}$
    Can always turn non-systematic code into a systematic one (row ops)
  But What is distance of code? Not Obvious!

Implicitly Defining Codes by Check Matrix
  Consider a parity-check matrix H ($[n-k] \times n$)
    Define valid code words C as those that give S = 0 (null space of H)
      $S = H \cdot C = 0$
    Size of null space? (null-rank H) = k if (n-k) linearly independent columns in H
  Suppose we transmit code word C with error:
    Model this as vector E which flips selected bits of C to get R (received):
      $R = C \oplus E$
    Consider what happens when we multiply by H:
      $S = H \cdot R = H \cdot (C \oplus E) = H \cdot E$
  What is distance of code?
    Code has distance d if no sum of d-1 or less columns yields 0
    I.e. No error vectors, E, of weight < d have zero syndromes
    So Code design is designing H matrix

How to relate G and H (Binary Codes)
  Defining H makes it easy to understand distance of code, but hard to generate code (H defines code implicitly!)
  However, let H be of following form:
      $H = \begin{bmatrix} P \mid I \end{bmatrix}$
    P is $(n-k) \times k$, I is $(n-k) \times (n-k)$
    Result: H is $(n-k) \times n$
  Then, G can be of following form (maximal code size):
      $G = \begin{bmatrix} I \\ P \end{bmatrix}$
    P is $(n-k) \times k$, I is $k \times k$
    Result: G is $n \times k$
  Notice: G generates values in null-space of H and has k independent columns so generates $2^k$ unique values:
      $S = H \cdot G \cdot v = \begin{bmatrix} P \mid I \end{bmatrix} \begin{bmatrix} I \\ P \end{bmatrix} v = (P \oplus P) \cdot v = 0$
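As a worked instance of the G = [I; P], H = [P | I] construction, the following Python sketch builds a (7,4) Hamming code with d = 3 and corrects single-bit errors by matching the syndrome to a column of H. The particular matrix P is an assumption chosen for this example (any P whose columns are distinct and have weight at least 2 would do), and the function names are hypothetical.

```python
K, N = 4, 7

# P is (n-k) x k. Its columns are the four 3-bit vectors of weight >= 2,
# so all seven columns of H = [P | I] are nonzero and distinct (d = 3).
P = [[1, 1, 0, 1],
     [1, 0, 1, 1],
     [0, 1, 1, 1]]

# Columns of H: the columns of P followed by the columns of the identity.
H_COLUMNS = ([[row[j] for row in P] for j in range(K)] +
             [[int(i == j) for i in range(N - K)] for j in range(N - K)])


def encode(v):
    """C = G.v with G = [I; P]: the data bits followed by the parity bits P.v."""
    parity = [sum(p * x for p, x in zip(row, v)) % 2 for row in P]
    return list(v) + parity


def syndrome(r):
    """S = H.r with H = [P | I], computed over GF(2)."""
    return [(sum(p * x for p, x in zip(row, r[:K])) + r[K + i]) % 2
            for i, row in enumerate(P)]


def decode(r):
    """Correct at most one flipped bit, then return the k data bits."""
    s = syndrome(r)
    fixed = list(r)
    if any(s):
        # A single-bit error at position j gives syndrome = column j of H.
        fixed[H_COLUMNS.index(s)] ^= 1
    return fixed[:K]


def meets_hamming_bound_exactly(n, k):
    """For d = 3: does 2^k * (1 + n) equal 2^n (spheres fill the space)?"""
    return 2 ** k * (1 + n) == 2 ** n
```

Every valid code word has a zero syndrome because the parity bits cancel P.v in GF(2), which is the S = H.G.v = 0 identity above. For n = 7 and k = 4 the Hamming bound holds with equality, so the radius-1 spheres tile the whole code space.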

Conclusion
  Main memory is Dense, Slow
    Cycle time > Access time!
  Techniques to optimize memory
    Wider Memory
    Interleaved Memory: for sequential or independent accesses
    Avoiding bank conflicts: SW & HW
    DRAM specific optimizations: page mode & Specialty DRAM
  ECC: add redundancy to correct for errors
    (n,k,d): n code bits, k data bits, distance d
    Linear codes: code vectors computed by linear transformation
  Erasure code: after identifying "erasures", can correct

CpE 442. Memory System

CpE 442. Memory System CpE 442 Memory System CPE 442 memory.1 Outline of Today s Lecture Recap and Introduction (5 minutes) Memory System: the BIG Picture? (15 minutes) Memory Technology: SRAM and Register File (25 minutes)

More information

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory Background EECS. Operatng System Fundamentals No. Vrtual Memory Prof. Hu Jang Department of Electrcal Engneerng and Computer Scence, York Unversty Memory-management methods normally requres the entre process

More information

Memories: Memory Technology

Memories: Memory Technology Memories: Memory Technology Z. Jerry Shi Assistant Professor of Computer Science and Engineering University of Connecticut * Slides adapted from Blumrich&Gschwind/ELE475 03, Peh/ELE475 * Memory Hierarchy

More information

CS152 Computer Architecture and Engineering Lecture 16: Memory System

CS152 Computer Architecture and Engineering Lecture 16: Memory System CS152 Computer Architecture and Engineering Lecture 16: System March 15, 1995 Dave Patterson (patterson@cs) and Shing Kong (shing.kong@eng.sun.com) Slides available on http://http.cs.berkeley.edu/~patterson

More information

Topic 21: Memory Technology

Topic 21: Memory Technology Topic 21: Memory Technology COS / ELE 375 Computer Architecture and Organization Princeton University Fall 2015 Prof. David August 1 Old Stuff Revisited Mercury Delay Line Memory Maurice Wilkes, in 1947,

More information

Topic 21: Memory Technology

Topic 21: Memory Technology Topic 21: Memory Technology COS / ELE 375 Computer Architecture and Organization Princeton University Fall 2015 Prof. David August 1 Old Stuff Revisited Mercury Delay Line Memory Maurice Wilkes, in 1947,

More information

Role of Synchronization. CS 258 Parallel Computer Architecture Lecture 23. Hardware-Software Trade-offs in Synchronization and Data Layout

Role of Synchronization. CS 258 Parallel Computer Architecture Lecture 23. Hardware-Software Trade-offs in Synchronization and Data Layout CS 28 Parallel Computer Architecture Lecture 23 Hardware-Software Trade-offs in Synchronization and Data Layout April 21, 2008 Prof John D. Kubiatowicz http://www.cs.berkeley.edu/~kubitron/cs28 Role of

More information

Outline. Digital Systems. C.2: Gates, Truth Tables and Logic Equations. Truth Tables. Logic Gates 9/8/2011

Outline. Digital Systems. C.2: Gates, Truth Tables and Logic Equations. Truth Tables. Logic Gates 9/8/2011 9/8/2 2 Outlne Appendx C: The Bascs of Logc Desgn TDT4255 Computer Desgn Case Study: TDT4255 Communcaton Module Lecture 2 Magnus Jahre 3 4 Dgtal Systems C.2: Gates, Truth Tables and Logc Equatons All sgnals

More information

Cache Performance 3/28/17. Agenda. Cache Abstraction and Metrics. Direct-Mapped Cache: Placement and Access

Cache Performance 3/28/17. Agenda. Cache Abstraction and Metrics. Direct-Mapped Cache: Placement and Access Agenda Cache Performance Samra Khan March 28, 217 Revew from last lecture Cache access Assocatvty Replacement Cache Performance Cache Abstracton and Metrcs Address Tag Store (s the address n the cache?

More information

ELEC 377 Operating Systems. Week 6 Class 3

ELEC 377 Operating Systems. Week 6 Class 3 ELEC 377 Operatng Systems Week 6 Class 3 Last Class Memory Management Memory Pagng Pagng Structure ELEC 377 Operatng Systems Today Pagng Szes Vrtual Memory Concept Demand Pagng ELEC 377 Operatng Systems

More information

Efficient Distributed File System (EDFS)

Efficient Distributed File System (EDFS) Effcent Dstrbuted Fle System (EDFS) (Sem-Centralzed) Debessay(Debsh) Fesehaye, Rahul Malk & Klara Naherstedt Unversty of Illnos-Urbana Champagn Contents Problem Statement, Related Work, EDFS Desgn Rate

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

Assembler. Building a Modern Computer From First Principles.

Assembler. Building a Modern Computer From First Principles. Assembler Buldng a Modern Computer From Frst Prncples www.nand2tetrs.org Elements of Computng Systems, Nsan & Schocken, MIT Press, www.nand2tetrs.org, Chapter 6: Assembler slde Where we are at: Human Thought

More information

High level vs Low Level. What is a Computer Program? What does gcc do for you? Program = Instructions + Data. Basic Computer Organization

High level vs Low Level. What is a Computer Program? What does gcc do for you? Program = Instructions + Data. Basic Computer Organization What s a Computer Program? Descrpton of algorthms and data structures to acheve a specfc ojectve Could e done n any language, even a natural language lke Englsh Programmng language: A Standard notaton

More information

Computer Architecture ELEC3441

Computer Architecture ELEC3441 Causes of Cache Msses: The 3 C s Computer Archtecture ELEC3441 Lecture 9 Cache (2) Dr. Hayden Kwo-Hay So Department of Electrcal and Electronc Engneerng Compulsory: frst reference to a lne (a..a. cold

More information

CS 268: Lecture 8 Router Support for Congestion Control

CS 268: Lecture 8 Router Support for Congestion Control CS 268: Lecture 8 Router Support for Congeston Control Ion Stoca Computer Scence Dvson Department of Electrcal Engneerng and Computer Scences Unversty of Calforna, Berkeley Berkeley, CA 9472-1776 Router

More information

Parallel matrix-vector multiplication

Parallel matrix-vector multiplication Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more

More information

Multilevel Memories. Joel Emer Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology

Multilevel Memories. Joel Emer Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology 1 Multilevel Memories Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology Based on the material prepared by Krste Asanovic and Arvind CPU-Memory Bottleneck 6.823

More information

Lecture 5: Multilayer Perceptrons

Lecture 5: Multilayer Perceptrons Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented

More information

Real-Time Guarantees. Traffic Characteristics. Flow Control

Real-Time Guarantees. Traffic Characteristics. Flow Control Real-Tme Guarantees Requrements on RT communcaton protocols: delay (response s) small jtter small throughput hgh error detecton at recever (and sender) small error detecton latency no thrashng under peak

More information

CS311 Lecture 21: SRAM/DRAM/FLASH

CS311 Lecture 21: SRAM/DRAM/FLASH S 14 L21-1 2014 CS311 Lecture 21: SRAM/DRAM/FLASH DARM part based on ISCA 2002 tutorial DRAM: Architectures, Interfaces, and Systems by Bruce Jacob and David Wang Jangwoo Kim (POSTECH) Thomas Wenisch (University

More information

Chapter 5B. Large and Fast: Exploiting Memory Hierarchy

Chapter 5B. Large and Fast: Exploiting Memory Hierarchy Chapter 5B Large and Fast: Exploiting Memory Hierarchy One Transistor Dynamic RAM 1-T DRAM Cell word access transistor V REF TiN top electrode (V REF ) Ta 2 O 5 dielectric bit Storage capacitor (FET gate,

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

Memory and I/O Organization

Memory and I/O Organization Memory and I/O Organzaton 8-1 Prncple of Localty Localty small proporton of memory accounts for most run tme Rule of thumb For 9% of run tme next nstructon/data wll come from 1% of program/data closest

More information

ECE 485/585 Microprocessor System Design

ECE 485/585 Microprocessor System Design Microprocessor System Design Lecture 5: Zeshan Chishti DRAM Basics DRAM Evolution SDRAM-based Memory Systems Electrical and Computer Engineering Dept. Maseeh College of Engineering and Computer Science

More information

Memory technology and optimizations ( 2.3) Main Memory

Memory technology and optimizations ( 2.3) Main Memory Memory technology and optimizations ( 2.3) 47 Main Memory Performance of Main Memory: Latency: affects Cache Miss Penalty» Access Time: time between request and word arrival» Cycle Time: minimum time between

More information

Conditional Speculative Decimal Addition*

Conditional Speculative Decimal Addition* Condtonal Speculatve Decmal Addton Alvaro Vazquez and Elsardo Antelo Dep. of Electronc and Computer Engneerng Unv. of Santago de Compostela, Span Ths work was supported n part by Xunta de Galca under grant

More information

CS650 Computer Architecture. Lecture 9 Memory Hierarchy - Main Memory

CS650 Computer Architecture. Lecture 9 Memory Hierarchy - Main Memory CS65 Computer Architecture Lecture 9 Memory Hierarchy - Main Memory Andrew Sohn Computer Science Department New Jersey Institute of Technology Lecture 9: Main Memory 9-/ /6/ A. Sohn Memory Cycle Time 5

More information

Simulation Based Analysis of FAST TCP using OMNET++

Simulation Based Analysis of FAST TCP using OMNET++ Smulaton Based Analyss of FAST TCP usng OMNET++ Umar ul Hassan 04030038@lums.edu.pk Md Term Report CS678 Topcs n Internet Research Sprng, 2006 Introducton Internet traffc s doublng roughly every 3 months

More information

Assembler. Shimon Schocken. Spring Elements of Computing Systems 1 Assembler (Ch. 6) Compiler. abstract interface.

Assembler. Shimon Schocken. Spring Elements of Computing Systems 1 Assembler (Ch. 6) Compiler. abstract interface. IDC Herzlya Shmon Schocken Assembler Shmon Schocken Sprng 2005 Elements of Computng Systems 1 Assembler (Ch. 6) Where we are at: Human Thought Abstract desgn Chapters 9, 12 abstract nterface H.L. Language

More information

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009.

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009. Farrukh Jabeen Algorthms 51 Assgnment #2 Due Date: June 15, 29. Assgnment # 2 Chapter 3 Dscrete Fourer Transforms Implement the FFT for the DFT. Descrbed n sectons 3.1 and 3.2. Delverables: 1. Concse descrpton

More information

Cache Memories. Lecture 14 Cache Memories. Inserting an L1 Cache Between the CPU and Main Memory. General Org of a Cache Memory

Cache Memories. Lecture 14 Cache Memories. Inserting an L1 Cache Between the CPU and Main Memory. General Org of a Cache Memory Topcs Lecture 4 Cache Memores Generc cache memory organzaton Drect mapped caches Set assocate caches Impact of caches on performance Cache Memores Cache memores are small, fast SRAM-based memores managed

More information
