Chapter 3. Arithmetic for Computers

Size: px
Start display at page:

Download "Chapter 3. Arithmetic for Computers"

Transcription

1 COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface ARM Editio Chapter 3 Arithmetic for Computers Arithmetic for Computers Operatios o itegers Additio ad subtractio Multiplicatio ad divisio Dealig with overflow Floatig-poit real umbers Represetatio ad operatios 3.1 Itroductio Chapter 3 Arithmetic for Computers 2

2 Iteger Additio Example: Additio ad Subtractio Overflow if result out of rage Addig +ve ad ve operads, o overflow Addig two +ve operads Overflow if result sig is 1 Addig two ve operads Overflow if result sig is 0 Chapter 3 Arithmetic for Computers 3 Iteger Subtractio Add egatio of secod operad Example: 7 6 = 7 + ( 6) +7: : : Overflow if result out of rage Subtractig two +ve or two ve operads, o overflow Subtractig +ve from ve operad Overflow if result sig is 0 Subtractig ve from +ve operad Overflow if result sig is 1 Chapter 3 Arithmetic for Computers 4

3 Arithmetic for Multimedia Graphics ad media processig operates o vectors of 8-bit ad 16-bit data Use 64-bit adder, with partitioed carry chai Operate o 8 8-bit, 4 16-bit, or 2 32-bit vectors SIMD (sigle-istructio, multiple-data) Saturatig operatios O overflow, result is largest represetable value c.f. 2s-complemet modulo arithmetic E.g., clippig i audio, saturatio i video Chapter 3 Arithmetic for Computers 5 Multiplicatio Start with log-multiplicatio approach 3.3 Multiplicatio multiplicad multiplier product Legth of product is the sum of operad legths Chapter 3 Arithmetic for Computers 6

4 Multiplicatio Hardware Iitially 0 Chapter 3 Arithmetic for Computers 7 Optimized Multiplier Perform steps i parallel: add/shift Oe cycle per partial-product additio That s ok, if frequecy of multiplicatios is low Chapter 3 Arithmetic for Computers 8

5 Faster Multiplier Uses multiple adders Cost/performace tradeoff Ca be pipelied Several multiplicatio performed i parallel Chapter 3 Arithmetic for Computers 9 LEGv8 Multiplicatio Three multiply istructios: MUL: multiply Gives the lower 64 bits of the product SMULH: siged multiply high Gives the upper 64 bits of the product, assumig the operads are siged UMULH: usiged multiply high Gives the lower 64 bits of the product, assumig the operads are usiged Chapter 3 Arithmetic for Computers 10

6 Divisio divisor quotiet divided remaider -bit operads yield -bit quotiet ad remaider Check for 0 divisor Log divisio approach If divisor divided bits 1 bit i quotiet, subtract Otherwise 0 bit i quotiet, brig dow ext divided bit Restorig divisio Do the subtract, ad if remaider goes < 0, add divisor back Siged divisio Divide usig absolute values Adjust sig of quotiet ad remaider as required 3.4 Divisio Chapter 3 Arithmetic for Computers 11 Divisio Hardware Iitially divisor i left half Iitially divided Chapter 3 Arithmetic for Computers 12

7 Optimized Divider Oe cycle per partial-remaider subtractio Looks a lot like a multiplier! Same hardware ca be used for both Chapter 3 Arithmetic for Computers 13 Faster Divisio Ca t use parallel hardware as i multiplier Subtractio is coditioal o sig of remaider Faster dividers (e.g. SRT devisio) geerate multiple quotiet bits per step Still require multiple steps Chapter 3 Arithmetic for Computers 14

8 LEGv8 Divisio Two istructios: SDIV (siged) UDIV (usiged) Both istructios igore overflow ad divisio-by-zero Chapter 3 Arithmetic for Computers 15 Floatig Poit Represetatio for o-itegral umbers Icludig very small ad very large umbers Like scietific otatio 3.5 Floatig Poit I biary ormalized ot ormalized ±1.xxxxxxx 2 2 yyyy Types float ad double i C Chapter 3 Arithmetic for Computers 16

9 Floatig Poit Stadard Defied by IEEE Std Developed i respose to divergece of represetatios Portability issues for scietific code Now almost uiversally adopted Two represetatios Sigle precisio (32-bit) Double precisio (64-bit) Chapter 3 Arithmetic for Computers 17 IEEE Floatig-Poit Format sigle: 8 bits double: 11 bits S Expoet sigle: 23 bits double: 52 bits Fractio x = (-1) S (1+ Fractio) 2 (Expoet -Bias) S: sig bit (0 Þ o-egative, 1 Þ egative) Normalize sigificad: 1.0 sigificad < 2.0 Always has a leadig pre-biary-poit 1 bit, so o eed to represet it explicitly (hidde bit) Sigificad is Fractio with the 1. restored Expoet: excess represetatio: actual expoet + Bias Esures expoet is usiged Sigle: Bias = 127; Double: Bias = 1203 Chapter 3 Arithmetic for Computers 18

10 Sigle-Precisio Rage Expoets ad reserved Smallest value Expoet: Þ actual expoet = = 126 Fractio: Þ sigificad = 1.0 ± ± Largest value expoet: Þ actual expoet = = +127 Fractio: Þ sigificad 2.0 ± ± Chapter 3 Arithmetic for Computers 19 Double-Precisio Rage Expoets ad reserved Smallest value Expoet: Þ actual expoet = = 1022 Fractio: Þ sigificad = 1.0 ± ± Largest value Expoet: Þ actual expoet = = Fractio: Þ sigificad 2.0 ± ± Chapter 3 Arithmetic for Computers 20

11 Floatig-Poit Precisio Relative precisio all fractio bits are sigificat Sigle: approx 2 23 Equivalet to 23 log decimal digits of precisio Double: approx 2 52 Equivalet to 52 log decimal digits of precisio Chapter 3 Arithmetic for Computers 21 Floatig-Poit Example Represet = ( 1) S = 1 Fractio = Expoet = 1 + Bias Sigle: = 126 = Double: = 1022 = Sigle: Double: Chapter 3 Arithmetic for Computers 22

12 Floatig-Poit Example What umber is represeted by the sigleprecisio float S = 1 Fractio = Fxpoet = = 129 x = ( 1) 1 ( ) 2 ( ) = ( 1) = 5.0 Chapter 3 Arithmetic for Computers 23 Deormal Numbers Expoet = Þ hidde bit is 0 x = (-1) S (0+ Fractio) 2 Smaller tha ormal umbers allow for gradual uderflow, with dimiishig precisio Deormal with fractio = x = (-1) S (0+ 0) 2 -Bias Two represetatios of 0.0! -Bias = ±0.0 Chapter 3 Arithmetic for Computers 24

13 Ifiities ad NaNs Expoet = , Fractio = ±Ifiity Ca be used i subsequet calculatios, avoidig eed for overflow check Expoet = , Fractio Not-a-Number (NaN) Idicates illegal or udefied result e.g., 0.0 / 0.0 Ca be used i subsequet calculatios Chapter 3 Arithmetic for Computers 25 Floatig-Poit Additio Cosider a 4-digit decimal example Alig decimal poits Shift umber with smaller expoet Add sigificads = Normalize result & check for over/uderflow Roud ad reormalize if ecessary Chapter 3 Arithmetic for Computers 26

14 Floatig-Poit Additio Now cosider a 4-digit biary example ( ) 1. Alig biary poits Shift umber with smaller expoet Add sigificads = Normalize result & check for over/uderflow , with o over/uderflow 4. Roud ad reormalize if ecessary (o chage) = Chapter 3 Arithmetic for Computers 27 FP Adder Hardware Much more complex tha iteger adder Doig it i oe clock cycle would take too log Much loger tha iteger operatios Slower clock would pealize all istructios FP adder usually takes several cycles Ca be pipelied Chapter 3 Arithmetic for Computers 28

15 FP Adder Hardware Step 1 Step 2 Step 3 Step 4 Chapter 3 Arithmetic for Computers 29 Floatig-Poit Multiplicatio Cosider a 4-digit decimal example Add expoets For biased expoets, subtract bias from sum New expoet = = 5 2. Multiply sigificads = Þ Normalize result & check for over/uderflow Roud ad reormalize if ecessary Determie sig of result from sigs of operads Chapter 3 Arithmetic for Computers 30

16 Floatig-Poit Multiplicatio Now cosider a 4-digit biary example ( ) 1. Add expoets Ubiased: = 3 Biased: ( ) + ( ) = = Multiply sigificads = Þ Normalize result & check for over/uderflow (o chage) with o over/uderflow 4. Roud ad reormalize if ecessary (o chage) 5. Determie sig: +ve ve Þ ve = Chapter 3 Arithmetic for Computers 31 FP Arithmetic Hardware FP multiplier is of similar complexity to FP adder But uses a multiplier for sigificads istead of a adder FP arithmetic hardware usually does Additio, subtractio, multiplicatio, divisio, reciprocal, square-root FP «iteger coversio Operatios usually takes several cycles Ca be pipelied Chapter 3 Arithmetic for Computers 32

17 FP Istructios i LEGv8 Separate FP registers 32 sigle-precisio: S0,, S31 32 double-precisio: DS0,, D31 S stored i the lower 32 bits of D FP istructios operate oly o FP registers Programs geerally do t do iteger ops o FP data, or vice versa More registers with miimal code-size impact FP load ad store istructios LDURS, LDURD STURS, STURD Chapter 3 Arithmetic for Computers 33 FP Istructios i LEGv8 Sigle-precisio arithmetic FADDS, FSUBS, FMULS, FDIVS e.g., FADDS S2, S4, S6 Double-precisio arithmetic FADDD, FSUBD, FMULD, FDIVD e.g., FADDD D2, D4, D6 Sigle- ad double-precisio compariso FCMPS, FCMPD Sets or clears FP coditio-code bits Brach o FP coditio code true or false B.cod Chapter 3 Arithmetic for Computers 34

18 FP Example: F to C C code: float f2c (float fahr) { retur ((5.0/9.0)*(fahr )); } fahr i S12, result i S0, literals i global memory space Compiled LEGv8 code: f2c: LDURS S16, [X27,cost5] // S16 = 5.0 (5.0 i memory) LDURS S18, [X27,cost9] // S18 = 9.0 (9.0 i memory) FDIVS S16, S16, S18 // S16 = 5.0 / 9.0 LDURS S18, [X27,cost32] // S18 = 32.0 FSUBS S18, S12, S18 // S18 = fahr 32.0 FMULS S0, S16, S18 // S0 = (5/9)*(fahr 32.0) BR LR // retur Chapter 3 Arithmetic for Computers 35 FP Example: Array Multiplicatio X = X + Y Z All matrices, 64-bit double-precisio elemets C code: void mm (double x[][], double y[][], double z[][]) { it i, j, k; for (i = 0; i! = 32; i = i + 1) for (j = 0; j! = 32; j = j + 1) for (k = 0; k! = 32; k = k + 1) x[i][j] = x[i][j] + y[i][k] * z[k][j]; } Addresses of x, y, z i X0, X1, X2, ad i, j, k i X19, X20, X21 Chapter 3 Arithmetic for Computers 36

19 FP Example: Array Multiplicatio LEGv8 code: mm:... LDI X10, 32 // X10 = 32 (row size/loop ed) LDI X19, 0 // i = 0; iitialize 1st for loop L1: LDI X20, 0 // j = 0; restart 2d for loop L2: LDI X21, 0 // k = 0; restart 3rd for loop LSL X11, X19, 5 // X11 = i * 2 5 (size of row of c) ADD X11, X11, X20 // X11 = i * size(row) + j LSL X11, X11, 3 // X11 = byte offset of [i][j] ADD X11, X0, X11 // X11 = byte address of c[i][j] LDURD D4, [X11,#0] // D4 = 8 bytes of c[i][j] L3: LSL X9, X21, 5 // X9 = k * 2 5 (size of row of b) ADD X9, X9, X20 // X9 = k * size(row) + j LSL X9, X9, 3 // X9 = byte offset of [k][j] ADD X9, X2, X9 // X9 = byte address of b[k][j] LDURD D16, [X9,#0] // D16 = 8 bytes of b[k][j] LSL X9, X19, 5 // X9 = i * 2 5 (size of row of a) Chapter 3 Arithmetic for Computers 37 FP Example: Array Multiplicatio ADD X9, X9, X21 // X9 = i * size(row) + k LSL X9, X9, 3 // X9 = byte offset of [i][k] ADD X9, X1, X9 // X9 = byte address of a[i][k] LDURD D18, [X9,#0] // D18 = 8 bytes of a[i][k] FMULD D16, D18, D16 // D16 = a[i][k] * b[k][j] FADDD D4, D4, D16 // f4 = c[i][j] + a[i][k] * b[k][j] ADDI X21, X21, 1 // $k = k + 1 CMP X21, X10 // test k vs. 32 B.LT L3 // if (k < 32) go to L3 STURD D4, [X11,0] // = D4 ADDI X20, X20, #1 // $j = j + 1 CMP X20, X10 // test j vs. 32 B.LT L2 // if (j < 32) go to L2 ADDI X19, X19, #1 // $i = i + 1 CMP X19, X10 // test i vs. 32 B.LT L1 // if (i < 32) go to L1 Chapter 3 Arithmetic for Computers 38

20 Accurate Arithmetic IEEE Std 754 specifies additioal roudig cotrol Extra bits of precisio (guard, roud, sticky) Choice of roudig modes Allows programmer to fie-tue umerical behavior of a computatio Not all FP uits implemet all optios Most programmig laguages ad FP libraries just use defaults Trade-off betwee hardware complexity, performace, ad market requiremets Chapter 3 Arithmetic for Computers 39 Subword Parallellism Graphics ad audio applicatios ca take advatage of performig simultaeous operatios o short vectors Example: 128-bit adder: Sixtee 8-bit adds Eight 16-bit adds Four 32-bit adds Also called data-level parallelism, vector parallelism, or Sigle Istructio, Multiple Data (SIMD) Chapter 3 Arithmetic for Computers Parallelism ad Computer Arithmetic: Subword Parallelism

21 ARMv8 SIMD bit registers (V0,, V31) Works with iteger ad FP Examples: 16 8-bit iteger adds: ADD V1.16B, V2.16B, V3.16B 4 32-bit FP adds: FADD V1.4S, V2.4S, V3.4S Chapter 3 Arithmetic for Computers 41 Other ARMv8 Features 245 SIMD istructios, icludig: Square root Fused multiply-add, multiply-subtract Covertio ad scalar ad vector roud-toitegral Structured (strided) vector load/stores Saturatig arithmetic Chapter 3 Arithmetic for Computers 42

22 x86 FP Architecture Origially based o 8087 FP coprocessor 8 80-bit exteded-precisio registers Used as a push-dow stack Registers idexed from TOS: ST(0), ST(1), FP values are 32-bit or 64 i memory Coverted o load/store of memory operad Iteger operads ca also be coverted o load/store Very difficult to geerate ad optimize code Result: poor FP performace 3.7 Real Stuff: Streamig SIMD Extesios ad AVX i x86 Chapter 3 Arithmetic for Computers 43 x86 FP Istructios Data trasfer Arithmetic Compare Trascedetal FILD mem/st(i) FISTP mem/st(i) FLDPI FLD1 FLDZ FIADDP mem/st(i) FISUBRP mem/st(i) FIMULP mem/st(i) FIDIVRP mem/st(i) FSQRT FABS FRNDINT FICOMP FIUCOMP FSTSW AX/mem FPATAN F2XMI FCOS FPTAN FPREM FPSIN FYL2X Optioal variatios I: iteger operad P: pop operad from stack R: reverse operad order But ot all combiatios allowed Chapter 3 Arithmetic for Computers 44

23 Streamig SIMD Extesio 2 (SSE2) Adds bit registers Exteded to 8 registers i AMD64/EM64T Ca be used for multiple FP operads 2 64-bit double precisio 4 32-bit double precisio Istructios operate o them simultaeously Sigle-Istructio Multiple-Data Chapter 3 Arithmetic for Computers 45 Matrix Multiply Uoptimized code: 1. void dgemm (it, double* A, double* B, double* C) 2. { 3. for (it i = 0; i < ; ++i) 4. for (it j = 0; j < ; ++j) 5. { 6. double cij = C[i+j*]; /* cij = C[i][j] */ 7. for(it k = 0; k < ; k++ ) 8. cij += A[i+k*] * B[k+j*]; /* cij += A[i][k]*B[k][j] */ 9. C[i+j*] = cij; /* C[i][j] = cij */ 10. } 11. } 3.8 Goig Faster: Subword Parallelism ad Matrix Multiply Chapter 3 Arithmetic for Computers 46

24 Matrix Multiply x86 assembly code: 1. vmovsd (%r10),%xmm0 # Load 1 elemet of C ito %xmm0 2. mov %rsi,%rcx # register %rcx = %rsi 3. xor %eax,%eax # register %eax = 0 4. vmovsd (%rcx),%xmm1 # Load 1 elemet of B ito %xmm1 5. add %r9,%rcx # register %rcx = %rcx + %r9 6. vmulsd (%r8,%rax,8),%xmm1,%xmm1 # Multiply %xmm1, elemet of A 7. add $0x1,%rax # register %rax = %rax cmp %eax,%edi # compare %eax to %edi 9. vaddsd %xmm1,%xmm0,%xmm0 # Add %xmm1, %xmm0 10. jg 30 <dgemm+0x30> # jump if %eax > %edi 11. add $0x1,%r11d # register %r11 = %r vmovsd %xmm0,(%r10) # Store %xmm0 ito C elemet 3.8 Goig Faster: Subword Parallelism ad Matrix Multiply Chapter 3 Arithmetic for Computers 47 Matrix Multiply Optimized C code: 1. #iclude <x86itri.h> 2. void dgemm (it, double* A, double* B, double* C) 3. { 4. for ( it i = 0; i < ; i+=4 ) 5. for ( it j = 0; j < ; j++ ) { 6. m256d c0 = _mm256_load_pd(c+i+j*); /* c0 = C[i][j] */ 7. for( it k = 0; k < ; k++ ) 8. c0 = _mm256_add_pd(c0, /* c0 += A[i][k]*B[k][j] */ 9. _mm256_mul_pd(_mm256_load_pd(a+i+k*), 10. _mm256_broadcast_sd(b+k+j*))); 11. _mm256_store_pd(c+i+j*, c0); /* C[i][j] = c0 */ 12. } 13. } 3.8 Goig Faster: Subword Parallelism ad Matrix Multiply Chapter 3 Arithmetic for Computers 48

25 Matrix Multiply Optimized x86 assembly code: 1. vmovapd (%r11),%ymm0 # Load 4 elemets of C ito %ymm0 2. mov %rbx,%rcx # register %rcx = %rbx 3. xor %eax,%eax # register %eax = 0 4. vbroadcastsd (%rax,%r8,1),%ymm1 # Make 4 copies of B elemet 5. add $0x8,%rax # register %rax = %rax vmulpd (%rcx),%ymm1,%ymm1 # Parallel mul %ymm1,4 A elemets 7. add %r9,%rcx # register %rcx = %rcx + %r9 8. cmp %r10,%rax # compare %r10 to %rax 9. vaddpd %ymm1,%ymm0,%ymm0 # Parallel add %ymm1, %ymm0 10. je 50 <dgemm+0x50> # jump if ot %r10!= %rax 11. add $0x1,%esi # register % esi = % esi vmovapd %ymm0,(%r11) # Store %ymm0 ito 4 C elemets 3.8 Goig Faster: Subword Parallelism ad Matrix Multiply Chapter 3 Arithmetic for Computers 49 Right Shift ad Divisio Left shift by i places multiplies a iteger by 2 i Right shift divides by 2 i? Oly for usiged itegers For siged itegers Arithmetic right shift: replicate the sig bit e.g., 5 / >> 2 = = 2 Rouds toward c.f >>> 2 = = Fallacies ad Pitfalls Chapter 3 Arithmetic for Computers 50

26 Associativity Parallel programs may iterleave operatios i uexpected orders Assumptios of associativity may fail (x+y)+z x -1.50E+38 y 1.50E E+00 z E+00 x+(y+z) -1.50E E E+00 Need to validate parallel programs uder varyig degrees of parallelism Chapter 3 Arithmetic for Computers 51 Who Cares About FP Accuracy? Importat for scietific code But for everyday cosumer use? My bak balace is out by ! L The Itel Petium FDIV bug The market expects accuracy See Colwell, The Petium Chroicles Chapter 3 Arithmetic for Computers 52

27 Cocludig Remarks Bits have o iheret meaig Iterpretatio depeds o the istructios applied 3.9 Cocludig Remarks Computer represetatios of umbers Fiite rage ad precisio Need to accout for this i programs Chapter 3 Arithmetic for Computers 53 Cocludig Remarks ISAs support arithmetic Siged ad usiged itegers Floatig-poit approximatio to reals Bouded rage ad precisio Operatios ca overflow ad uderflow Chapter 3 Arithmetic for Computers 54

Chapter 3. Floating Point Arithmetic

Chapter 3. Floating Point Arithmetic COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 3 Floatig Poit Arithmetic Review - Multiplicatio 0 1 1 0 = 6 multiplicad 32-bit ALU shift product right multiplier add

More information

Chapter 3. Floating Point. Floating Point Standard. Morgan Kaufmann Publishers 13 November, Chapter 3 Arithmetic for Computers 1

Chapter 3. Floating Point. Floating Point Standard. Morgan Kaufmann Publishers 13 November, Chapter 3 Arithmetic for Computers 1 COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface RISC-V Edition Chapter 3 Arithmetic for Computers Floating Point Numbers Floating Point Representation for non-integral numbers Including

More information

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 3. Arithmetic for Computers

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 3. Arithmetic for Computers COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 3 Arithmetic for Computers Arithmetic for Computers Operations on integers Addition and subtraction Multiplication

More information

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 3. Arithmetic for Computers

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 3. Arithmetic for Computers COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 3 Arithmetic for Computers Arithmetic for Computers Operations on integers Addition and subtraction Multiplication

More information

Arithmetic for Computers. Integer Addition. Chapter 3

Arithmetic for Computers. Integer Addition. Chapter 3 COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 3 Arithmetic for Computers Arithmetic for Computers Operations on integers Addition and subtraction Multiplication

More information

Integer Subtraction. Chapter 3. Overflow conditions. Arithmetic for Computers. Signed Addition. Integer Addition. Arithmetic for Computers

Integer Subtraction. Chapter 3. Overflow conditions. Arithmetic for Computers. Signed Addition. Integer Addition. Arithmetic for Computers COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Integer Subtraction Chapter 3 Arithmetic for Computers Add negation of second operand Example: 7 6 = 7 + ( 6) +7: 0000 0000

More information

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 3. Arithmetic for Computers

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 3. Arithmetic for Computers COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 3 Arithmetic for Computers Arithmetic for Computers Operations on integers Addition and subtraction Multiplication

More information

Chapter 3 Arithmetic for Computers

Chapter 3 Arithmetic for Computers Chapter 3 Arithmetic for Computers 1 Arithmetic for Computers Operations on integers Addition and subtraction Multiplication and division Dealing with overflow Floating-point real numbers Representation

More information

Chapter 3. Arithmetic for Computers

Chapter 3. Arithmetic for Computers Chapter 3 Arithmetic for Computers Arithmetic for Computers Operations on integers Addition and subtraction Multiplication and division Dealing with overflow Floating-point real numbers Representation

More information

Chapter 3. Arithmetic for Computers

Chapter 3. Arithmetic for Computers Chapter 3 Arithmetic for Computers Arithmetic for Computers Operations on integers Addition and subtraction Multiplication and division Dealing with overflow Floating-point real numbers Representation

More information

Chapter 3. Arithmetic for Computers

Chapter 3. Arithmetic for Computers Chapter 3 Arithmetic for Computers Arithmetic for Computers Operations on integers Addition and subtraction Multiplication and division Dealing with overflow Floating-point real numbers Representation

More information

TDT4255 Computer Design. Lecture 4. Magnus Jahre

TDT4255 Computer Design. Lecture 4. Magnus Jahre 1 TDT4255 Computer Design Lecture 4 Magnus Jahre 2 Chapter 3 Computer Arithmetic ti Acknowledgement: Slides are adapted from Morgan Kaufmann companion material 3 Arithmetic for Computers Operations on

More information

Chapter 3. Arithmetic for Computers

Chapter 3. Arithmetic for Computers Chapter 3 Arithmetic for Computers Arithmetic for Computers Unsigned vs. Signed Integers Operations on integers Addition and subtraction Multiplication and division Dealing with overflow Floating-point

More information

COMPUTER ORGANIZATION AND. Edition. The Hardware/Software Interface. Chapter 3. Arithmetic for Computers

COMPUTER ORGANIZATION AND. Edition. The Hardware/Software Interface. Chapter 3. Arithmetic for Computers ARM D COMPUTER ORGANIZATION AND Edition The Hardware/Software Interface Chapter 3 Arithmetic for Computers Modified and extended by R.J. Leduc - 2016 In this chapter, we will investigate: How integer arithmetic

More information

Chapter 3. Arithmetic Text: P&H rev

Chapter 3. Arithmetic Text: P&H rev Chapter 3 Arithmetic Text: P&H rev3.29.16 Arithmetic for Computers Operations on integers Addition and subtraction Multiplication and division Dealing with overflow Floating-point real numbers Representation

More information

Computer Systems - HS

Computer Systems - HS What have we leared so far? Computer Systems High Level ENGG1203 2d Semester, 2017-18 Applicatios Sigals Systems & Cotrol Systems Computer & Embedded Systems Digital Logic Combiatioal Logic Sequetial Logic

More information

CMSC Computer Architecture Lecture 3: ISA and Introduction to Microarchitecture. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 3: ISA and Introduction to Microarchitecture. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 3: ISA ad Itroductio to Microarchitecture Prof. Yajig Li Uiversity of Chicago Lecture Outlie ISA uarch (hardware implemetatio of a ISA) Logic desig basics Sigle-cycle

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Part A Datapath Design

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Part A Datapath Design COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter The Processor Part A path Desig Itroductio CPU performace factors Istructio cout Determied by ISA ad compiler. CPI ad

More information

Appendix D. Controller Implementation

Appendix D. Controller Implementation COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Appedix D Cotroller Implemetatio Cotroller Implemetatios Combiatioal logic (sigle-cycle); Fiite state machie (multi-cycle, pipelied);

More information

Instruction and Data Streams

Instruction and Data Streams Advaced Architectures Master Iformatics Eg. 2017/18 A.J.Proeça Data Parallelism 1 (vector & SIMD extesios) (most slides are borrowed) AJProeça, Advaced Architectures, MiEI, UMiho, 2017/18 1 Istructio ad

More information

Introduction CHAPTER Computers

Introduction CHAPTER Computers Iside a Computer CHAPTER Itroductio. Computers A computer is a electroic device that accepts iput, stores data, processes data accordig to a set of istructios (called program), ad produces output i desired

More information

EE260: Digital Design, Spring /16/18. n Example: m 0 (=x 1 x 2 ) is adjacent to m 1 (=x 1 x 2 ) and m 2 (=x 1 x 2 ) but NOT m 3 (=x 1 x 2 )

EE260: Digital Design, Spring /16/18. n Example: m 0 (=x 1 x 2 ) is adjacent to m 1 (=x 1 x 2 ) and m 2 (=x 1 x 2 ) but NOT m 3 (=x 1 x 2 ) EE26: Digital Desig, Sprig 28 3/6/8 EE 26: Itroductio to Digital Desig Combiatioal Datapath Yao Zheg Departmet of Electrical Egieerig Uiversity of Hawaiʻi at Māoa Combiatioal Logic Blocks Multiplexer Ecoders/Decoders

More information

Floating Point Arithmetic

Floating Point Arithmetic Floating Point Arithmetic Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu EEE3050: Theory on Computer Architectures, Spring 2017, Jinkyu Jeong (jinkyu@skku.edu)

More information

End Semester Examination CSE, III Yr. (I Sem), 30002: Computer Organization

End Semester Examination CSE, III Yr. (I Sem), 30002: Computer Organization Ed Semester Examiatio 2013-14 CSE, III Yr. (I Sem), 30002: Computer Orgaizatio Istructios: GROUP -A 1. Write the questio paper group (A, B, C, D), o frot page top of aswer book, as per what is metioed

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Single-Cycle Disadvantages & Advantages

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Single-Cycle Disadvantages & Advantages COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 4 The Processor Pipeliig Sigle-Cycle Disadvatages & Advatages Clk Uses the clock cycle iefficietly the clock cycle must

More information

Elementary Educational Computer

Elementary Educational Computer Chapter 5 Elemetary Educatioal Computer. Geeral structure of the Elemetary Educatioal Computer (EEC) The EEC coforms to the 5 uits structure defied by vo Neuma's model (.) All uits are preseted i a simplified

More information

CSC 220: Computer Organization Unit 11 Basic Computer Organization and Design

CSC 220: Computer Organization Unit 11 Basic Computer Organization and Design College of Computer ad Iformatio Scieces Departmet of Computer Sciece CSC 220: Computer Orgaizatio Uit 11 Basic Computer Orgaizatio ad Desig 1 For the rest of the semester, we ll focus o computer architecture:

More information

UNIVERSITY OF MORATUWA

UNIVERSITY OF MORATUWA UNIVERSITY OF MORATUWA FACULTY OF ENGINEERING DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING B.Sc. Egieerig 2014 Itake Semester 2 Examiatio CS2052 COMPUTER ARCHITECTURE Time allowed: 2 Hours Jauary 2016

More information

Lesson 1. The datapath

Lesson 1. The datapath Lesso. Computers Structure ad Orgaizatio Graduate i Computer Scieces / Graduate i Computers Egieerig Computers Structure ad Orgaizatio. Graduate i Computer Scieces / Graduate i Computer Egieerig Automatic

More information

CMSC Computer Architecture Lecture 2: ISA. Prof. Yanjing Li Department of Computer Science University of Chicago

CMSC Computer Architecture Lecture 2: ISA. Prof. Yanjing Li Department of Computer Science University of Chicago CMSC 22200 Computer Architecture Lecture 2: ISA Prof. Yajig Li Departmet of Computer Sciece Uiversity of Chicago Admiistrative Stuff Lab1 out toight Due Thursday (10/18) Lab1 review sessio Tomorrow, 10/05,

More information

EE University of Minnesota. Midterm Exam #1. Prof. Matthew O'Keefe TA: Eric Seppanen. Department of Electrical and Computer Engineering

EE University of Minnesota. Midterm Exam #1. Prof. Matthew O'Keefe TA: Eric Seppanen. Department of Electrical and Computer Engineering EE 4363 1 Uiversity of Miesota Midterm Exam #1 Prof. Matthew O'Keefe TA: Eric Seppae Departmet of Electrical ad Computer Egieerig Uiversity of Miesota Twi Cities Campus EE 4363 Itroductio to Microprocessors

More information

BACHMANN-LANDAU NOTATIONS. Lecturer: Dr. Jomar F. Rabajante IMSP, UPLB MATH 174: Numerical Analysis I 1 st Sem AY

BACHMANN-LANDAU NOTATIONS. Lecturer: Dr. Jomar F. Rabajante IMSP, UPLB MATH 174: Numerical Analysis I 1 st Sem AY BACHMANN-LANDAU NOTATIONS Lecturer: Dr. Jomar F. Rabajate IMSP, UPLB MATH 174: Numerical Aalysis I 1 st Sem AY 018-019 RANKING OF FUNCTIONS Name Big-Oh Eamples Costat O(1 10 Logarithmic O(log log, log(

More information

Lecture 1: Introduction and Strassen s Algorithm

Lecture 1: Introduction and Strassen s Algorithm 5-750: Graduate Algorithms Jauary 7, 08 Lecture : Itroductio ad Strasse s Algorithm Lecturer: Gary Miller Scribe: Robert Parker Itroductio Machie models I this class, we will primarily use the Radom Access

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Determined by ISA and compiler. Determined by CPU hardware

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Determined by ISA and compiler. Determined by CPU hardware COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface ARM Editio Chapter 4 The Processor Itroductio CPU performace factors Istructio cout Determied by ISA ad compiler CPI ad Cycle time Determied

More information

A Comparison of Floating Point and Logarithmic Number Systems for FPGAs

A Comparison of Floating Point and Logarithmic Number Systems for FPGAs Compariso of Floatig Poit ad Logarithmic Number Systems for FPGs Michael Haselma, Michael Beauchamp, aro Wood, Scott Hauck Dept. of Electrical Egieerig Uiversity of Washigto Seattle, W {haselma, mjb7,

More information

Thomas Polzer Institut für Technische Informatik

Thomas Polzer Institut für Technische Informatik Thomas Polzer tpolzer@ecs.tuwien.ac.at Institut für Technische Informatik Operations on integers Addition and subtraction Multiplication and division Dealing with overflow Floating-point real numbers VO

More information

September, 2003 Saeid Nooshabadi

September, 2003 Saeid Nooshabadi COMP3211 lec21-fp-iii.1 COMP 3221 Microprocessors and Embedded Systems Lectures 21 : Floating Point Number Representation III http://www.cse.unsw.edu.au/~cs3221 September, 2003 Saeid@unsw.edu.au Overview

More information

COMPUTER ORGANIZATION AND DESIGN. The Hardware/Software Interface. Chapter 4. The Processor: C Multiple Issue Based on P&H

COMPUTER ORGANIZATION AND DESIGN. The Hardware/Software Interface. Chapter 4. The Processor: C Multiple Issue Based on P&H COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface Chapter 4 The Processor: C Multiple Issue Based on P&H Instruction-Level Parallelism (ILP) Pipelining: executing multiple instructions in

More information

APPLICATION NOTE PACE1750AE BUILT-IN FUNCTIONS

APPLICATION NOTE PACE1750AE BUILT-IN FUNCTIONS APPLICATION NOTE PACE175AE BUILT-IN UNCTIONS About This Note This applicatio brief is iteded to explai ad demostrate the use of the special fuctios that are built ito the PACE175AE processor. These powerful

More information

Chapter 2. C++ Basics. Copyright 2015 Pearson Education, Ltd.. All rights reserved.

Chapter 2. C++ Basics. Copyright 2015 Pearson Education, Ltd.. All rights reserved. Chapter 2 C++ Basics Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Overview 2.1 Variables ad Assigmets 2.2 Iput ad Output 2.3 Data Types ad Expressios 2.4 Simple Flow of Cotrol 2.5 Program

More information

. Written in factored form it is easy to see that the roots are 2, 2, i,

. Written in factored form it is easy to see that the roots are 2, 2, i, CMPS A Itroductio to Programmig Programmig Assigmet 4 I this assigmet you will write a java program that determies the real roots of a polyomial that lie withi a specified rage. Recall that the roots (or

More information

Floating Point Arithmetic. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Floating Point Arithmetic. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University Floating Point Arithmetic Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Floating Point (1) Representation for non-integral numbers Including very

More information

top() Applications of Stacks

top() Applications of Stacks CS22 Algorithms ad Data Structures MW :00 am - 2: pm, MSEC 0 Istructor: Xiao Qi Lecture 6: Stacks ad Queues Aoucemets Quiz results Homework 2 is available Due o September 29 th, 2004 www.cs.mt.edu~xqicoursescs22

More information

Lecture 5. Counting Sort / Radix Sort

Lecture 5. Counting Sort / Radix Sort Lecture 5. Coutig Sort / Radix Sort T. H. Corme, C. E. Leiserso ad R. L. Rivest Itroductio to Algorithms, 3rd Editio, MIT Press, 2009 Sugkyukwa Uiversity Hyuseug Choo choo@skku.edu Copyright 2000-2018

More information

Forms of Information Representation in Digital Computers.

Forms of Information Representation in Digital Computers. Chapter Forms of Iformatio Represetatio i Digital Computers.. Geeral Cosideratios There are distiguished the followig two fudametal modes of iformatio represetatio: (.) a) iteral; b) exteral; Iteral represetatios

More information

Isn t It Time You Got Faster, Quicker?

Isn t It Time You Got Faster, Quicker? Is t It Time You Got Faster, Quicker? AltiVec Techology At-a-Glace OVERVIEW Motorola s advaced AltiVec techology is desiged to eable host processors compatible with the PowerPC istructio-set architecture

More information

EE 459/500 HDL Based Digital Design with Programmable Logic. Lecture 13 Control and Sequencing: Hardwired and Microprogrammed Control

EE 459/500 HDL Based Digital Design with Programmable Logic. Lecture 13 Control and Sequencing: Hardwired and Microprogrammed Control EE 459/500 HDL Based Digital Desig with Programmable Logic Lecture 13 Cotrol ad Sequecig: Hardwired ad Microprogrammed Cotrol Refereces: Chapter s 4,5 from textbook Chapter 7 of M.M. Mao ad C.R. Kime,

More information

COMPUTER ORGANIZATION AND DESIGN. ARM Edition. The Hardware/Software Interface. Chapter 2. Instructions: Language of the Computer

COMPUTER ORGANIZATION AND DESIGN. ARM Edition. The Hardware/Software Interface. Chapter 2. Instructions: Language of the Computer COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface ARM Editio Chapter 2 Istructios: Laguage of the Computer Istructio Set The repertoire of istructios of a computer Differet computers have

More information

CS475 Parallel Programming

CS475 Parallel Programming CS475 Parallel Programmig Dese Matrix Multiply Wim Bohm, Colorado State Uiversity Except as otherwise oted, the cotet of this presetatio is licesed uder the Creative Commos Attributio 2.5 licese. Block

More information

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5 Morga Kaufma Publishers 26 February, 28 COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 5 Set-Associative Cache Architecture Performace Summary Whe CPU performace icreases:

More information

CSC165H1 Worksheet: Tutorial 8 Algorithm analysis (SOLUTIONS)

CSC165H1 Worksheet: Tutorial 8 Algorithm analysis (SOLUTIONS) CSC165H1, Witer 018 Learig Objectives By the ed of this worksheet, you will: Aalyse the ruig time of fuctios cotaiig ested loops. 1. Nested loop variatios. Each of the followig fuctios takes as iput a

More information

Arithmetic for Computers. Hwansoo Han

Arithmetic for Computers. Hwansoo Han Arithmetic for Computers Hwansoo Han Arithmetic for Computers Operations on integers Addition and subtraction Multiplication and division Dealing with overflow Floating-point real numbers Representation

More information

CSCI 402: Computer Architectures. Arithmetic for Computers (4) Fengguang Song Department of Computer & Information Science IUPUI.

CSCI 402: Computer Architectures. Arithmetic for Computers (4) Fengguang Song Department of Computer & Information Science IUPUI. CSCI 402: Computer Architectures Arithmetic for Computers (4) Fengguang Song Department of Computer & Information Science IUPUI Homework 4 Assigned on Feb 22, Thursday Due Time: 11:59pm, March 5 on Monday

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Review Istructio Set Architecture Istructio Set The repertoire of istructios of a computer Differet computers have differet istructio

More information

Chapter 3 Classification of FFT Processor Algorithms

Chapter 3 Classification of FFT Processor Algorithms Chapter Classificatio of FFT Processor Algorithms The computatioal complexity of the Discrete Fourier trasform (DFT) is very high. It requires () 2 complex multiplicatios ad () complex additios [5]. As

More information

EE123 Digital Signal Processing

EE123 Digital Signal Processing Last Time EE Digital Sigal Processig Lecture 7 Block Covolutio, Overlap ad Add, FFT Discrete Fourier Trasform Properties of the Liear covolutio through circular Today Liear covolutio with Overlap ad add

More information

Outline. Applications of FFT in Communications. Fundamental FFT Algorithms. FFT Circuit Design Architectures. Conclusions

Outline. Applications of FFT in Communications. Fundamental FFT Algorithms. FFT Circuit Design Architectures. Conclusions FFT Circuit Desig Outlie Applicatios of FFT i Commuicatios Fudametal FFT Algorithms FFT Circuit Desig Architectures Coclusios DAB Receiver Tuer OFDM Demodulator Chael Decoder Mpeg Audio Decoder 56/5/ 4/48

More information

Computer Architecture and IC Design Lab. Chapter 3 Part 2 Arithmetic for Computers Floating Point

Computer Architecture and IC Design Lab. Chapter 3 Part 2 Arithmetic for Computers Floating Point Chapter 3 Part 2 Arithmetic for Computers Floating Point Floating Point Representation for non integral numbers Including very small and very large numbers 4,600,000,000 or 4.6 x 10 9 0.0000000000000000000000000166

More information

A graphical view of big-o notation. c*g(n) f(n) f(n) = O(g(n))

A graphical view of big-o notation. c*g(n) f(n) f(n) = O(g(n)) ca see that time required to search/sort grows with size of We How do space/time eeds of program grow with iput size? iput. time: cout umber of operatios as fuctio of iput Executio size operatio Assigmet:

More information

Lecture Notes 6 Introduction to algorithm analysis CSS 501 Data Structures and Object-Oriented Programming

Lecture Notes 6 Introduction to algorithm analysis CSS 501 Data Structures and Object-Oriented Programming Lecture Notes 6 Itroductio to algorithm aalysis CSS 501 Data Structures ad Object-Orieted Programmig Readig for this lecture: Carrao, Chapter 10 To be covered i this lecture: Itroductio to algorithm aalysis

More information

FPGA IMPLEMENTATION OF BASE-N LOGARITHM. Salvador E. Tropea

FPGA IMPLEMENTATION OF BASE-N LOGARITHM. Salvador E. Tropea FPGA IMPLEMENTATION OF BASE-N LOGARITHM Salvador E. Tropea Electróica e Iformática Istituto Nacioal de Tecología Idustrial Bueos Aires, Argetia email: salvador@iti.gov.ar ABSTRACT I this work, we preset

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 3 Arithmetic for Computers Arithmetic for Computers Operations on integers Addition and subtraction Multiplication

More information

BOOLEAN MATHEMATICS: GENERAL THEORY

BOOLEAN MATHEMATICS: GENERAL THEORY CHAPTER 3 BOOLEAN MATHEMATICS: GENERAL THEORY 3.1 ISOMORPHIC PROPERTIES The ame Boolea Arithmetic was chose because it was discovered that literal Boolea Algebra could have a isomorphic umerical aspect.

More information

Computer Architecture

Computer Architecture Computer Architecture Overview Prof. Tie-Fu Che Dept. of Computer Sciece Natioal Chug Cheg Uiv Sprig 2002 Overview- Computer Architecture Course Focus Uderstadig the desig techiques, machie structures,

More information

Chapter 11. Friends, Overloaded Operators, and Arrays in Classes. Copyright 2014 Pearson Addison-Wesley. All rights reserved.

Chapter 11. Friends, Overloaded Operators, and Arrays in Classes. Copyright 2014 Pearson Addison-Wesley. All rights reserved. Chapter 11 Frieds, Overloaded Operators, ad Arrays i Classes Copyright 2014 Pearso Addiso-Wesley. All rights reserved. Overview 11.1 Fried Fuctios 11.2 Overloadig Operators 11.3 Arrays ad Classes 11.4

More information

Basic allocator mechanisms The course that gives CMU its Zip! Memory Management II: Dynamic Storage Allocation Mar 6, 2000.

Basic allocator mechanisms The course that gives CMU its Zip! Memory Management II: Dynamic Storage Allocation Mar 6, 2000. 5-23 The course that gives CM its Zip Memory Maagemet II: Dyamic Storage Allocatio Mar 6, 2000 Topics Segregated lists Buddy system Garbage collectio Mark ad Sweep Copyig eferece coutig Basic allocator

More information

Ones Assignment Method for Solving Traveling Salesman Problem

Ones Assignment Method for Solving Traveling Salesman Problem Joural of mathematics ad computer sciece 0 (0), 58-65 Oes Assigmet Method for Solvig Travelig Salesma Problem Hadi Basirzadeh Departmet of Mathematics, Shahid Chamra Uiversity, Ahvaz, Ira Article history:

More information

GPUMP: a Multiple-Precision Integer Library for GPUs

GPUMP: a Multiple-Precision Integer Library for GPUs GPUMP: a Multiple-Precisio Iteger Library for GPUs Kaiyog Zhao ad Xiaowe Chu Departmet of Computer Sciece, Hog Kog Baptist Uiversity Hog Kog, P. R. Chia Email: {kyzhao, chxw}@comp.hkbu.edu.hk Abstract

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor Advanced Issues

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor Advanced Issues COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 4 The Processor Advaced Issues Review: Pipelie Hazards Structural hazards Desig pipelie to elimiate structural hazards.

More information

Data diverse software fault tolerance techniques

Data diverse software fault tolerance techniques Data diverse software fault tolerace techiques Complemets desig diversity by compesatig for desig diversity s s limitatios Ivolves obtaiig a related set of poits i the program data space, executig the

More information

condition w i B i S maximum u i

condition w i B i S maximum u i ecture 10 Dyamic Programmig 10.1 Kapsack Problem November 1, 2004 ecturer: Kamal Jai Notes: Tobias Holgers We are give a set of items U = {a 1, a 2,..., a }. Each item has a weight w i Z + ad a utility

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface ARM Editio Chapter 6 Parallel Processors from Cliet to Cloud Itroductio Goal: coectig multiple computers to get higher performace Multiprocessors

More information

Lecturers: Sanjam Garg and Prasad Raghavendra Feb 21, Midterm 1 Solutions

Lecturers: Sanjam Garg and Prasad Raghavendra Feb 21, Midterm 1 Solutions U.C. Berkeley CS170 : Algorithms Midterm 1 Solutios Lecturers: Sajam Garg ad Prasad Raghavedra Feb 1, 017 Midterm 1 Solutios 1. (4 poits) For the directed graph below, fid all the strogly coected compoets

More information

CMSC22200 Computer Architecture Lecture 9: Out-of-Order, SIMD, VLIW. Prof. Yanjing Li University of Chicago

CMSC22200 Computer Architecture Lecture 9: Out-of-Order, SIMD, VLIW. Prof. Yanjing Li University of Chicago CMSC22200 Computer Architecture Lecture 9: Out-of-Order, SIMD, VLIW Prof. Yajig Li Uiversity of Chicago Admiistrative Stuff Lab2 due toight Exam I: covers lectures 1-9 Ope book, ope otes, close device

More information

CSE 305. Computer Architecture

CSE 305. Computer Architecture CSE 305 Computer Architecture Computer Architecture Course Teachers Rifat Shahriyar (rifat1816@gmail.com) Johra Muhammad Moosa Textbook Computer Orgaizatio ad Desig (The Hardware/Software Iterface) David

More information

Lecture 2. RTL Design Methodology. Transition from Pseudocode & Interface to a Corresponding Block Diagram

Lecture 2. RTL Design Methodology. Transition from Pseudocode & Interface to a Corresponding Block Diagram Lecture 2 RTL Desig Methodology Trasitio from Pseudocode & Iterface to a Correspodig Block Diagram Structure of a Typical Digital Data Iputs Datapath (Executio Uit) Data Outputs System Cotrol Sigals Status

More information

CMSC Computer Architecture Lecture 5: Pipelining. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 5: Pipelining. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 5: Pipeliig Prof. Yajig Li Uiversity of Chicago Admiistrative Stuff Lab1 Due toight Lab2: out later today; due 2 weeks from ow Review sessio this Friday Turig award

More information

WYSE Academic Challenge Sectional Computer Science 2005 SOLUTION SET

WYSE Academic Challenge Sectional Computer Science 2005 SOLUTION SET WYSE Academic Challege Sectioal Computer Sciece 2005 SOLUTION SET 1. Correct aswer: a. Hz = cycle / secod. CPI = 2, therefore, CPI*I = 2 * 28 X 10 8 istructios = 56 X 10 8 cycles. The clock rate is 56

More information

Lecture 1: Introduction and Fundamental Concepts 1

Lecture 1: Introduction and Fundamental Concepts 1 Uderstadig Performace Lecture : Fudametal Cocepts ad Performace Aalysis CENG 332 Algorithm Determies umber of operatios executed Programmig laguage, compiler, architecture Determie umber of machie istructios

More information

The Magma Database file formats

The Magma Database file formats The Magma Database file formats Adrew Gaylard, Bret Pikey, ad Mart-Mari Breedt Johaesburg, South Africa 15th May 2006 1 Summary Magma is a ope-source object database created by Chris Muller, of Kasas City,

More information

Multiprocessors. HPC Prof. Robert van Engelen

Multiprocessors. HPC Prof. Robert van Engelen Multiprocessors Prof. Robert va Egele Overview The PMS model Shared memory multiprocessors Basic shared memory systems SMP, Multicore, ad COMA Distributed memory multicomputers MPP systems Network topologies

More information

Solution printed. Do not start the test until instructed to do so! CS 2604 Data Structures Midterm Spring, Instructions:

Solution printed. Do not start the test until instructed to do so! CS 2604 Data Structures Midterm Spring, Instructions: CS 604 Data Structures Midterm Sprig, 00 VIRG INIA POLYTECHNIC INSTITUTE AND STATE U T PROSI M UNI VERSI TY Istructios: Prit your ame i the space provided below. This examiatio is closed book ad closed

More information

Polynomial Functions and Models. Learning Objectives. Polynomials. P (x) = a n x n + a n 1 x n a 1 x + a 0, a n 0

Polynomial Functions and Models. Learning Objectives. Polynomials. P (x) = a n x n + a n 1 x n a 1 x + a 0, a n 0 Polyomial Fuctios ad Models 1 Learig Objectives 1. Idetify polyomial fuctios ad their degree 2. Graph polyomial fuctios usig trasformatios 3. Idetify the real zeros of a polyomial fuctio ad their multiplicity

More information

Instruction Set extensions to X86. Floating Point SIMD instructions

Instruction Set extensions to X86. Floating Point SIMD instructions Instruction Set extensions to X86 Some extensions to x86 instruction set intended to accelerate 3D graphics AMD 3D-Now! Instructions simply accelerate floating point arithmetic. Accelerate object transformations

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 19 Query Optimizatio Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio Query optimizatio Coducted by a query optimizer i a DBMS Goal:

More information

COSC 1P03. Ch 7 Recursion. Introduction to Data Structures 8.1

COSC 1P03. Ch 7 Recursion. Introduction to Data Structures 8.1 COSC 1P03 Ch 7 Recursio Itroductio to Data Structures 8.1 COSC 1P03 Recursio Recursio I Mathematics factorial Fiboacci umbers defie ifiite set with fiite defiitio I Computer Sciece sytax rules fiite defiitio,

More information

CSCI 402: Computer Architectures

CSCI 402: Computer Architectures CSCI 402: Computer Architectures Arithmetic for Computers (5) Fengguang Song Department of Computer & Information Science IUPUI What happens when the exact result is not any floating point number, too

More information

Fast Fourier Transform (FFT) Algorithms

Fast Fourier Transform (FFT) Algorithms Fast Fourier Trasform FFT Algorithms Relatio to the z-trasform elsewhere, ozero, z x z X x [ ] 2 ~ elsewhere,, ~ e j x X x x π j e z z X X π 2 ~ The DFS X represets evely spaced samples of the z- trasform

More information

9.1. Sequences and Series. Sequences. What you should learn. Why you should learn it. Definition of Sequence

9.1. Sequences and Series. Sequences. What you should learn. Why you should learn it. Definition of Sequence _9.qxd // : AM Page Chapter 9 Sequeces, Series, ad Probability 9. Sequeces ad Series What you should lear Use sequece otatio to write the terms of sequeces. Use factorial otatio. Use summatio otatio to

More information

CMSC Computer Architecture Lecture 11: More Caches. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 11: More Caches. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 11: More Caches Prof. Yajig Li Uiversity of Chicago Lecture Outlie Caches 2 Review Memory hierarchy Cache basics Locality priciples Spatial ad temporal How to access

More information

CMSC Computer Architecture Lecture 10: Caches. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 10: Caches. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 10: Caches Prof. Yajig Li Uiversity of Chicago Midterm Recap Overview ad fudametal cocepts ISA Uarch Datapath, cotrol Sigle cycle, multi cycle Pipeliig Basic idea,

More information

How do we evaluate algorithms?

How do we evaluate algorithms? F2 Readig referece: chapter 2 + slides Algorithm complexity Big O ad big Ω To calculate ruig time Aalysis of recursive Algorithms Next time: Litterature: slides mostly The first Algorithm desig methods:

More information

COMP Parallel Computing. PRAM (1): The PRAM model and complexity measures

COMP Parallel Computing. PRAM (1): The PRAM model and complexity measures COMP 633 - Parallel Computig Lecture 2 August 24, 2017 : The PRAM model ad complexity measures 1 First class summary This course is about parallel computig to achieve high-er performace o idividual problems

More information

Java Expressions & Flow Control

Java Expressions & Flow Control Java Expressios & Flow Cotrol Rui Moreira Expressio Separators:. [ ] ( ), ; Dot used as decimal separator or to access attributes ad methods double d = 2.6; Poto poto = ew Poto(2, 3); it i = poto.x; it

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 20 Itroductio to Trasactio Processig Cocepts ad Theory Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio Trasactio Describes local

More information

Abstract. Chapter 4 Computation. Overview 8/13/18. Bjarne Stroustrup Note:

Abstract. Chapter 4 Computation. Overview 8/13/18. Bjarne Stroustrup   Note: Chapter 4 Computatio Bjare Stroustrup www.stroustrup.com/programmig Abstract Today, I ll preset the basics of computatio. I particular, we ll discuss expressios, how to iterate over a series of values

More information

5 th Edition. The Processor We will examine two MIPS implementations A simplified version A more realistic pipelined version

5 th Edition. The Processor We will examine two MIPS implementations A simplified version A more realistic pipelined version COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface Chapter 4 5 th Edition Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined

More information

Chapter 4. Procedural Abstraction and Functions That Return a Value. Copyright 2015 Pearson Education, Ltd.. All rights reserved.

Chapter 4. Procedural Abstraction and Functions That Return a Value. Copyright 2015 Pearson Education, Ltd.. All rights reserved. Chapter 4 Procedural Abstractio ad Fuctios That Retur a Value Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Overview 4.1 Top-Dow Desig 4.2 Predefied Fuctios 4.3 Programmer-Defied Fuctios 4.4

More information

IMP: Superposer Integrated Morphometrics Package Superposition Tool

IMP: Superposer Integrated Morphometrics Package Superposition Tool IMP: Superposer Itegrated Morphometrics Package Superpositio Tool Programmig by: David Lieber ( 03) Caisius College 200 Mai St. Buffalo, NY 4208 Cocept by: H. David Sheets, Dept. of Physics, Caisius College

More information

ANN WHICH COVERS MLP AND RBF

ANN WHICH COVERS MLP AND RBF ANN WHICH COVERS MLP AND RBF Josef Boští, Jaromír Kual Faculty of Nuclear Scieces ad Physical Egieerig, CTU i Prague Departmet of Software Egieerig Abstract Two basic types of artificial eural etwors Multi

More information