Organizational issues (I)


COSC 6385 Computer Architecture
Introduction and Organizational Issues
Fall 2007

Organizational issues (I)
  Classes:
    Monday, 1:00pm - 2:30pm, PGH 232
    Wednesday, 1:00pm - 2:30pm, PGH 232
  Evaluation:
    25% homework
    75% three quizzes (25% each)
  In case of questions:
    gabriel@cs.uh.edu
    Tel: (713)
    Office hours: PGH 524, Mo, Wed, 3pm-4pm or by appointment
  All slides available on the website:
  And as an online course (to be established soon)

Organizational Issues (II)
  TA of the course: Spoorthy Mareddy, PGH 526
  Tentative dates for the quizzes:
    Monday, September 24th
    Monday, October 22nd
    Wednesday, November 28th

Contents
  Textbook: John L. Hennessy, David A. Patterson, Computer Architecture: A Quantitative Approach, 4th Edition, Morgan Kaufmann Publishers

Contents (II)
  Most of chapters 1 to 5
  Appendix A, B, C
  Selected sections regarding
    Storage systems
    Vector Processors

Why an advanced lecture on Computer Architecture?

    for (i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }

  Every loop iteration requires 3 memory operations:
    2 loads
    1 store
  For a micro-processor having a frequency of 2 GHz (4-byte operands, one loop iteration per cycle) this would require
    $3 \times 4\,\text{Bytes} \times 2 \times 10^{9}\,\text{s}^{-1} = 24\,\text{GBytes/s}$
  to satisfy one Floating Point Unit (FPU)
  Most modern processors have 2 FPUs and 2 IUs which can work in parallel

Memory technology
  DDR: Double Data Rate SDRAM
  Bandwidth of a memory module:
    $SB_{max} = SB_{Bus} \cdot f_{Bus} \cdot \text{Op/Cycle}$
  with
    $SB_{max}$: max. memory bandwidth
    $SB_{Bus}$: Bandwidth of the memory bus (64 Bit = 8 Bytes)
    $f_{Bus}$: Frequency of the memory bus

Memory bandwidth

  Name           Frequency of memory bus (MHz)   max. bandwidth
  PC100 SDRAM                                    800 MB/s
  PC133 SDRAM                                    1.1 GB/s
  PC1600 DDR                                     1.6 GB/s
  PC2100 DDR                                     2.1 GB/s
  PC2700 DDR                                     2.7 GB/s
  PC3200 DDR                                     3.2 GB/s
  PC3700 DDR                                     3.7 GB/s
  PC4200 DDR                                     4.2 GB/s

Memory modules (cont.)
  Dual Channel Memory: 2 I/O channels between memory controller and memory module
  DDR2: further evolution of the DDR technology
    uses 1.8 Volts vs. 2.5 Volts technology
    larger capacity of the chips
    higher frequency

  Name   Frequency of memory bus   Bandwidth of a module   Dual Channel DDR2 bandwidth
  PC     400 MHz                   3.2 GB/s                6.4 GB/s
  PC     533 MHz                   4.2 GB/s                8.4 GB/s
  PC     667 MHz                   5.3 GB/s                10.6 GB/s
  PC     800 MHz                   6.4 GB/s                12.8 GB/s

Memory hierarchies

                                Size          Access time [cycles]
  Backup (tape)                 TB, PB        >
  Primary data storage (disk)   ~ 100 GB
  main memory                   ~ 1 GB
  Caches                        ~ 1 MB
  Register                      < 256 Words

Memory hierarchies
  Do I have to care about memory hierarchies?
  Example: Matrix-multiply of two dense matrices
  Trivial code:

    for (i = 0; i < dim; i++) {
        for (j = 0; j < dim; j++) {
            for (k = 0; k < dim; k++) {
                c[i][j] += a[i][k] * b[k][j];
            }
        }
    }

Matrix-multiply
  Performance of the trivial implementation on a 2.2 GHz AMD Opteron with
    2 GB main memory
    1 MB 2nd level cache

  Matrix dimension   Execution time [sec]   Performance [MFLOPS]
  256x256
  x

Matrix-multiply (II)
  Peak performance of the processor:
    $2 \times (2.2 \times 10^{9})$ floating point operations/sec $= 4.4 \times 10^{9}$ floating point operations/sec $= 4.4$ GFLOPS
  where
    $2$: number of floating point units
    $2.2 \times 10^{9}$: frequency of the processor, assuming that each FPU can finish an operation per cycle
    $4.4$ GFLOPS: theoretical floating point peak performance of the processor
  Where are the missing FLOPS between theoretical peak and achieved performance?
    Memory wait time

Blocked code

    for (i = 0; i < dim; i += block) {
        for (j = 0; j < dim; j += block) {
            for (k = 0; k < dim; k += block) {
                for (ii = i; ii < (i + block); ii++) {
                    for (jj = j; jj < (j + block); jj++) {
                        for (kk = k; kk < (k + block); kk++) {
                            c[ii][jj] += a[ii][kk] * b[kk][jj];
                        }
                    }
                }
            }
        }
    }

Performance of the blocked code

  Matrix dimension   block   Execution time [sec]   Performance [MFLOPS]   trivial [MFLOPS]
  256x
  x


Top 500 List

Earth Simulator
  Target: Achievement of high-speed numerical simulations with a processing speed 1000 times higher than that of the most frequently used supercomputers in 1996. (www.es.
  640 nodes
  8 processors/node
  40 TFLOPS peak
  10 TByte memory
