Determining the Relevancy of Moore's Law Through the Comparison of Ten Distinct Processor Systems


Nickolas DeVito
Department of Electrical Engineering and Computer Science
University of Central Florida
Orlando, FL

Abstract

The main purpose of this project is to compare and contrast the fundamental metrics and specifications of ten example processor architectures, dating from the 1990s to the present day. CPU systems are analyzed based on their clock rates, memory capacity, number of processors or cores, and data bus word width. The most popular designs studied were parallel, multi-core microarchitectures for general or special purposes. Baseline designs, such as Intel Pentium, AMD Athlon, and Xilinx FPGA-based architectures, were examined in this project, as were less historical designs, such as Xeon EX and OpenSPARC T1. This research aims to give a better understanding of how different CPU systems operate, and of how Moore's Law still defines processor improvement rates in the computer industry. It is important to acknowledge these systems' similarities and differences, as CPU architecture design is a massive field worldwide, and future innovations will determine how these systems improve.

Keywords: Amdahl's Law, Architecture, Baseline Computer Systems, Clock rate, Core, Data bus, Memory capacity, Moore's Law, Parallelism, Specifications, Speedup, Transistors.

I. OVERVIEW OF PROCESSOR ARCHITECTURE

Computer hardware involves a carefully implemented system of instruction sets operating under defined specifications. This project studies processor architecture by looking at ten different CPU systems, including five baseline computer systems. The components and principles used to explore processor architecture are defined below. Performance data for the ten systems is also shown.

1) Classic Components: A computer system is typically made up of five classic components: datapath, control, memory, input, and output [10].
The datapath manipulates the data used within the computer. The control determines how that data is operated on, effectively giving directions to the datapath. Together, the datapath and control form the computer's processor. Memory sits outside the processor, storing data that can be retrieved or rewritten under direction from the control. The input and output both connect to external devices that the user can observe or manipulate. Input takes actions from the user and translates them into directions for the processor, while output takes data and presents it to the user to observe or download. Examples of input devices include keyboards, microphones, and mice, while output devices include monitors and printers. In addition, certain devices act as both input and output, such as external hard drives and memory sticks.

2) Processor Buses and Bit Width: Interaction between a computer's processor and memory is necessary to complete basic functions. Information is transferred between the two components over two buses, the address bus and the data bus. The processor sends bits along the address bus to the memory to select the location where data is stored or will be stored. The address bus carries information in one direction only, while the data bus can transfer data both to and from the processor. This is because the information sent along the address bus is simply the memory address where the data resides. The actual data at that address is sent to the processor along the data bus, so that it can be "crunched," and is then sent back to the memory once execution has completed. The bit width refers to the number of wires in each bus. Since each wire carries one binary digit at a time, n wires can represent 2^n distinct values.

3) Metrics Studied: This project refers to several metrics that are used to study and evaluate the processor architectures.
CPU clock rate (measured in MHz) is the number of clock cycles a processor completes per second, and generally indicates how fast the processor operates. Instruction set size (measured in bits) refers to the length of the instructions a processor can hold. Having a larger data bus width (and effectively a greater instruction set size) means more machine instructions can be encoded, so a greater variety of CPU tasks can be executed. Memory capacity (measured in MB) refers to how large the memory of a computer system is, and therefore how much data can be stored. These metrics, and their respective units, can be seen in the graphs in Section 3 of this project.

4) Significance: It is important to look at these metrics when comparing computer architectures, because they can be used to evaluate performance and to determine the rate at which it has improved over a period of time. Comparing clock rates between different CPUs is a good way to determine time enhancement and the overall speedup of a system, in line with Amdahl's Law. Memory capacity can be used to illustrate Moore's Law.

Since having larger memory means more transistors are being used, systems of increasing memory capacity can be compared to evaluate the rate at which memory capacity doubles.

5) Processor Performance Equation: An equation exists that can be used to evaluate processor performance for different systems. Performance is inversely proportional to execution time, and execution time is the product of instruction count, CPI, and clock cycle time. Instruction count refers to the number of instructions that need to be executed. CPI stands for clock cycles per instruction, the average number of clock cycles needed to complete a single instruction. Cycle time is the duration of one clock cycle, which is the reciprocal of the clock rate. Overall, the equation states:

\[ \text{Performance} = \frac{1}{\text{CPU Time}} = \frac{1}{\text{Instruction Count} \times \text{CPI} \times \text{Cycle Time}} \]

6) Parallelism: Parallelism means that multiple processor systems can be used in unison to form one large computer architecture, effectively decreasing execution time while increasing memory capacity and overall performance [2]. Parallelism is a practice often used in large server farms and supercomputers, accomplishing more than a single desktop could achieve. While connecting hundreds of CPU units isn't cost or space efficient for consumer-based computers, parallelism can still be used on a smaller scale by installing multiple cores. Under the per-core memory accounting used in this project, a dual core CPU can store twice as much memory as a single core, a quad core CPU four times as much, and so on. Having more cores also lowers execution time. The speedup factor compares the execution time of two different systems. If a single core system is turned into a dual core system, then a fully parallelizable program could be executed up to twice as fast. In general, the gain is limited by the fraction of the program that benefits from the enhancement, as expressed by the following equation:

\[ \text{Speedup} = \frac{1}{(1 - \text{Fraction}_{\text{Enhanced}}) + \dfrac{\text{Fraction}_{\text{Enhanced}}}{\text{Speedup}_{\text{Enhanced}}}} \]

The next sections in this report take a more direct look at how these concepts can be used to study processor architectures, by looking at ten specific examples.
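As a minimal sketch of the processor performance equation from subsection 5, the Python snippet below computes CPU time as instruction count times CPI times cycle time, and performance as its reciprocal. The function names and the workload (one billion instructions at a CPI of 2) are hypothetical values chosen only for illustration; they are not measurements from any of the ten systems.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds = instruction count x CPI x cycle time."""
    cycle_time = 1.0 / clock_rate_hz  # seconds per clock cycle
    return instruction_count * cpi * cycle_time


def performance(instruction_count, cpi, clock_rate_hz):
    """Performance is the reciprocal of CPU time."""
    return 1.0 / cpu_time(instruction_count, cpi, clock_rate_hz)


# Hypothetical workload: 1 billion instructions, average CPI of 2.
INSTRUCTIONS = 1_000_000_000
CPI = 2.0

slow = cpu_time(INSTRUCTIONS, CPI, 25e6)  # assumed 25 MHz clock
fast = cpu_time(INSTRUCTIONS, CPI, 1e9)   # assumed 1 GHz clock
print(f"25 MHz: {slow:.1f} s, 1 GHz: {fast:.1f} s, ratio: {slow / fast:.0f}x")
```

With instruction count and CPI held fixed, the ratio of the two execution times equals the ratio of the clock rates, which is why clock rate alone is only a fair comparison between systems that run the same instructions at a similar CPI.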
In Section 2, the ten processor architectures, spanning from the 1990s until today, will be introduced in detail. Section 3 will evaluate the data obtained by studying these architectures, and Section 4 will conclude with what can be learned from the charted metrics.

II. LITERATURE REVIEW

All ten processor systems were researched from various reports and sources. The findings for each system can be found below, starting with the five baseline computer systems and following with five non-baseline designs used as comparisons. The metrics can also be seen in Table 1.

1990-1999: The MasPar MP-1 Model 128 Backend was developed in 1992, as a special purpose highly parallel architecture [2]. The SIMD architecture includes 8192 PEs, each containing a 4-bit processor with 16 KB of RAM. A 32-bit, 128 MHz processor controls data flow in the PE array.

The SNAP-1 Parallel AI Prototype was first developed in 1992, as a special purpose architecture for natural language understanding [3]. Known as the SNAP project, the University of Southern California processor system included 144 32-bit DSP chips (distributed over 8 different boards), each operating at 25 MHz. The local memory for each chip in the system was capable of storing 256 KB.

The Intel Pentium III processor system was first designed in 1999, as a general purpose desktop architecture [1]. It utilized 18 32-bit CPUs, each running at 1 GHz. Each CPU unit could store 256 KB of memory, making the computer system an ideal option for conducting simulation tests and achieving benchmark results.

2000-2009: The AMD Athlon 2400+ Processor System was developed in 2002, as a general purpose desktop architecture [1]. The computer system contains 256 parallelized CPUs, each running at 2400 MHz (as the name suggests). All CPUs in the system have a data length of 32 bits, and can store 256 KB of memory.
Like the Intel Pentium III, the AMD Athlon 2400+ is also ideal for simulation tests and benchmark results, but its larger number of CPUs and later release date mean it has better performance overall.

In 2006, the Xilinx Virtex-4 SX35 FPGA-Based Scalable Architecture was released as a special purpose architecture for DCT computation [4]. It achieved FPGA dynamic partial reconfiguration with 8 distinct core hardware arrangements (73577 gates overall), each operating at 1 MHz. The 32-bit single core CPUs could each store 2.3 MB of memory.

The OpenSPARC T1 CPU Microarchitecture was also released in 2006 (originally named the UltraSPARC T1) [8]. It was developed by Sun Microsystems (now part of Oracle) with eight 4-way multithreaded 32-bit cores, using an RTL model for simulation. Each core could store up to 3.19 MB of memory, and could run at 8 MHz.

2008 saw the release of an Experimental Intel Core 2 Quad Machine, which was designed as a special purpose architecture for workload synthesis and statistical modeling (although a commercial version for general purposes came shortly after) [5]. It included 4 homogeneous 32-bit dual cores (8 in total), each operating at 24 MHz. The cores could each store 4.1 GB of memory, which is a huge amount compared to most other architectures, but the smaller number of cores balanced out the total memory capabilities of the system.

The Xeon EX Processor System was designed in 2009 as a general purpose processor system with eight dual-threaded cores [6]. Each core had a data length of 64 bits, and overall, the system was made up of 2.3 billion transistors (fabricated in a 45 nm process). The CPU ran at MHz, and each core could contain 24 MB of system memory.

2010-present: In 2010, the IA 2D-Mesh Network-On-Chip Architecture was constructed as a special purpose message passing computer system [7]. Having 48 integrated cores (with 1.3 billion transistors) enabled the processor to accomplish rapid calculations at a pace previous CPU systems could not match.

The processor had a word width of 32 bits, and each core could operate at 2 MHz, storing 256 KB of memory.

The LTE MIMO Processor SIMT Architecture was first released in 2013, as a special purpose RSIMP approach system for parallel instruction sequences [9]. The system relied on 16 unique quad-core CPUs (4core-4wide), for a total of 64 32-bit cores. Each core could store KB of memory, while operating at an 8 MHz clock rate.

III. DATA ANALYSIS

The data extracted for all ten computer architectures (detailed in Table 1) were used to chart four different graphs evaluating relevant metrics over time. The four key metrics used to determine performance were clock rate, memory capacity, the number of total cores, and the data bus word width (see Figure 1 for a more detailed explanation).

Figure 2 shows the clock rates of each of the ten processor architectures over time. As can be seen in the graph, there is visible growth in the number of clock cycles a processor can complete per second. Naturally, the older architectures operate more slowly. Since having more transistors enables a higher clock rate, this graph shows a clear representation of Moore's Law acting on these systems.

The rate of changing memory capacity for all systems over time can be seen in Figure 3. Although the graph shows a slight decline over time, this is due in part to the fact that some architectures studied were special purpose systems with multiple parallel cores, and others were simple desktop systems. Looking at the potential for systems to store more memory, those that appear later in time (such as the Xilinx Virtex-4 SX35 and the Xeon EX) have the largest capacity.

Figure 4 displays the number of cores present in architectures over time. This graph also shows a decline, but this is due to the 1992 MasPar MP-1 Model 128 Back End being an outlier with 8192 combined cores. Ignoring this outlier shows a more reasonable range of cores being present in the architectures.
Moreover, many newer systems are able to achieve greater or similar results with fewer cores because they have been better optimized for performance.

Lastly, Figure 5 shows the varying lengths of each processor system's word width over time. Although the graph shows a visible increase, it's important to note that all but one of the systems studied supported a data bus with 32 bits. The Xeon EX system's data bus was 64 bits long, despite having fewer cores than most architectures.

Fig. 1. Metrics analyzed in this paper: CPU Clock Rate (MHz) vs. Year; Memory Capacity (MB) vs. Year; Number of Processors or Cores vs. Year; Data Bus Word Width (bits) vs. Year. These metrics were used to study each of the ten processor systems by evaluating varying performance over time.

Fig. 2. Plot shows exponential growth of processor clock rate over time.

Fig. 3. Memory Capacity (MB) vs. Time (Year). Plot shows total memory capacity of architectures over time.

Fig. 4. Number of Cores vs. Time (Year). Plot shows number of cores present in compared CPU designs.

Fig. 5. Data Bus Word Width (bits) vs. Time (Year). Plot shows exponential growth of CPU data bus length over time.

IV. CONCLUSION

This project set out to display the validity of Moore's Law using ten processor architectures (five baseline systems and five non-baseline). Overall, the findings supported Moore's Law, as observed especially in the clock rates of each system, seen in Figure 2. Other metrics weren't as reliable in expressing Moore's Law, but this was largely due to the types of example architecture systems used. Rather than selecting ten general purpose desktop CPUs to compare in succession, the project used a wide range of architectures designed for different purposes. Some systems didn't chart well in the metric graphs because they were lower profile than systems before them, and had fewer operating cores with lower memory capacity. This isn't to say that Moore's Law can't be applied, but that certain architecture systems only require a certain level of specifications to complete their purposes. If more closely similar systems were studied, perhaps the findings would have been more noticeable.

V. REFERENCES

[1] H. A. Bahr and R. F. DeMara, "OTBSAF Scalability on Pentium III/4 and Athlon 64/XP3 Architectures," MSIAC Modeling and Simulation Journal, February 9, 2005, Vol. 6, No. 3, March 2005.
[2] H. Bahr, R. F. DeMara, and M. Georgiopoulos, "Integer-Encoded Massively Parallel Processing of Fast-Learning ARTMAP Networks," in Proceedings of the 1997 SPIE AeroSense Symposium (AeroSense-97), Orlando, Florida, U.S.A., April 21-24, 1997.
[3] R. F. DeMara and D. I. Moldovan, "The SNAP-1 Parallel AI Prototype," IEEE Transactions on Parallel and Distributed Systems, Vol. 4, No. 8, August 1993.
[4] J. Huang, M. Parris, J. Lee, and R. F. DeMara, "Scalable FPGA-based Architecture for DCT Computation Using Dynamic Partial Reconfiguration," ACM Transactions on Embedded Computing Systems, Vol. 9, No. 1, Art. 9, October 2009.
[5] C. Hughes and T. Li, "Accelerating multi-core processor design space evaluation using automatic multi-threaded workload synthesis," IEEE International Symposium on Workload Characterization (IISWC 2008), pp. 163-172, Sept. 2008.
[6] S. Rusu, S. Tam, H. Muljono, J. Stinson, D. Ayers, J. Chang, R. Varada, M. Ratta, S. Kottapalli, and S. Vora, "A 45 nm 8-Core Enterprise Xeon Processor," IEEE Journal of Solid-State Circuits, Vol. 45, No. 1, pp. 7-14, Jan. 2010.
[7] J. Howard, S. Dighe, S. R. Vangal, G. Ruhl, N. Borkar, S. Jain, V. Erraguntla, M. Konow, M. Riepen, M. Gries, G. Droege, T. Lund-Larsen, S. Steibl, S. Borkar, V. K. De, and R. Van Der Wijngaart, "A 48-Core IA-32 Processor in 45 nm CMOS Using On-Die Message-Passing and DVFS for Performance and Power Scaling," IEEE Journal of Solid-State Circuits, Vol. 46, No. 1, pp. 173-183, Jan. 2011.
[8] A. B. Kahng, S. Kang, R. Kumar, and J. Sartori, "Designing a processor from the ground up to allow voltage/reliability tradeoffs," 2010 IEEE 16th International Symposium on High Performance Computer Architecture (HPCA), pp. 1-11, 9-14 Jan. 2010.
[9] Zheng Yu, Zhiyi Yu, Xueqiu Yu, Ningxi Liu, and Xiaoyang Zeng, "Low-Power Multicore Processor Design With Reconfigurable Same-Instruction Multiple Process," IEEE Transactions on Circuits and Systems II: Express Briefs, Vol. 61, No. 6, pp. 423-427, June 2014.
[10] G. Shute, "Components of a Computer," Computer Components, June 2014, University of Minnesota Duluth, August 2015.

TABLE I.
SPECIFICATIONS FOR TEN CPU ARCHITECTURES STUDIED IN PROJECT

Columns: Name of Architecture [reference]; Application-Specific or General-Purpose Computation; Die Area, Number of Transistors, or Number of Chips/Boards/etc.; CPU Clock Rate (MHz); Memory Capacity (MB); Data Bus Word Width (bits); Number of Cores or CPUs; Ideal Speedup for 99% parallel code (ignoring overheads).

SNAP-1 Parallel AI Prototype [3]
  Computation: NLU: Special
  Hardware: 144 DSP chips on 8 large circuit boards
  Number of Cores or CPUs: 144 single core CPUs = 144 cores
  Ideal Speedup: 144 cores, so Told/Tnew = 1/(0.01 + (0.99/144))

Intel Pentium III Processor System [1]
  Hardware: Computer system utilizing 18 CPUs
  Number of Cores or CPUs: 18 single core CPUs = 18 cores
  Ideal Speedup: 1/(0.01 + (0.99/18))

AMD Athlon 2400+ Processor System [1]
  Hardware: Computer system utilizing 256 CPUs
  Number of Cores or CPUs: 256 single core CPUs = 256 cores
  Ideal Speedup: 256 cores, so Told/Tnew = 1/(0.01 + (0.99/256))

Xilinx Virtex-4 SX35 FPGA-Based Scalable Architecture [4]
  Computation: DCT Computation: Special
  Hardware: Distinct HW arrangements
  Memory Capacity: 2.3 MB/CPU * 8 CPUs = 18.4 MB
  Number of Cores or CPUs: 8 single core CPUs = 8 cores
  Ideal Speedup: 1/(0.01 + (0.99/8)) = 7.48-fold

MasPar MP-1 Model 128 Back End [2]
  Computation: Highly Parallel Architecture: Special
  Hardware: Clusters of PEs
  Memory Capacity: 16 KB/CPU * 8192 CPUs
  Number of Cores or CPUs: 8192 single core CPUs = 8192 cores
  Ideal Speedup: 8192 cores, so Told/Tnew = 1/(0.01 + (0.99/8192))

Xeon EX Processor System [6]
  Hardware: 8 dual-threaded 64-bit cores; 2.3 B transistors
  Memory Capacity: 24 MB/CPU * 8 = 192 MB
  Data Bus Word Width: 64
  Number of Cores or CPUs: 8 dual-threaded CPUs = 8 cores
  Ideal Speedup: 1/(0.01 + (0.99/8)) = 7.48-fold

LTE MIMO Processor SIMT Architecture [9]
  Computation: RSIMP Approach: Special
  Hardware: 16-core processor, each 4core-4wide
  Number of Cores or CPUs: 16 quad core CPUs = 64 cores
  Ideal Speedup: 64 cores, so Told/Tnew = 1/(0.01 + (0.99/64))

IA 2D-Mesh Network-On-Chip Architecture [7]
  Computation: Message Passing: Special
  Hardware: 48 integrated CPUs, with 1.3 B total transistors
  Number of Cores or CPUs: 48 single core CPUs = 48 cores
  Ideal Speedup: 1/(0.01 + (0.99/48)) = 32.65-fold

Experimental Intel Core 2 Quad Machine [5]
  Computation: Workload Synthesis and Statistical Modeling: Special
  Hardware: 4 homogeneous dual cores, with shared L2 caches
  Memory Capacity: 4.1 GB/CPU * 8
  Number of Cores or CPUs: 4 dual core CPUs = 8 cores
  Ideal Speedup: 1/(0.01 + (0.99/8)) = 7.48-fold

OpenSPARC T1 CPU Microarchitecture [8]
  Hardware: 8 CPU cores, with each being 4-way multithreaded
  Memory Capacity: 3.19 MB/CPU
  Number of Cores or CPUs: 8 single core CPUs = 8 cores
  Ideal Speedup: 1/(0.01 + (0.99/8)) = 7.48-fold
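The ideal speedup column of Table I applies Amdahl's Law with 99% parallel code, so the serial fraction is 0.01 and the enhanced portion is sped up by the number of cores. The Python sketch below evaluates that expression for several of the core counts listed in the table. The function name amdahl_speedup and the choice of core counts are illustrative assumptions, and overheads are ignored, as in the table.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Amdahl's Law: overall speedup when only a fraction of the work is sped up."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# 99% parallel code, as assumed in Table I.
PARALLEL_FRACTION = 0.99

# Core counts that appear in Table I.
CORE_COUNTS = [8, 48, 64, 144, 256, 8192]

for cores in CORE_COUNTS:
    speedup = amdahl_speedup(PARALLEL_FRACTION, cores)
    print(f"{cores:>5} cores: {speedup:.2f}-fold")
```

For 8 and 48 cores this gives the 7.48-fold and 32.65-fold figures that survive in the table. However many cores are added, the result stays below 1/0.01 = 100, because the 1% serial portion is never sped up.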


More information

Computer and Information Sciences College / Computer Science Department CS 207 D. Computer Architecture

Computer and Information Sciences College / Computer Science Department CS 207 D. Computer Architecture Computer and Information Sciences College / Computer Science Department CS 207 D Computer Architecture The Computer Revolution Progress in computer technology Underpinned by Moore s Law Makes novel applications

More information

Performance of computer systems

Performance of computer systems Performance of computer systems Many different factors among which: Technology Raw speed of the circuits (clock, switching time) Process technology (how many transistors on a chip) Organization What type

More information

Parallelized Progressive Network Coding with Hardware Acceleration

Parallelized Progressive Network Coding with Hardware Acceleration Parallelized Progressive Network Coding with Hardware Acceleration Hassan Shojania, Baochun Li Department of Electrical and Computer Engineering University of Toronto Network coding Information is coded

More information

EE282 Computer Architecture. Lecture 1: What is Computer Architecture?

EE282 Computer Architecture. Lecture 1: What is Computer Architecture? EE282 Computer Architecture Lecture : What is Computer Architecture? September 27, 200 Marc Tremblay Computer Systems Laboratory Stanford University marctrem@csl.stanford.edu Goals Understand how computer

More information

Intel Enterprise Processors Technology

Intel Enterprise Processors Technology Enterprise Processors Technology Kosuke Hirano Enterprise Platforms Group March 20, 2002 1 Agenda Architecture in Enterprise Xeon Processor MP Next Generation Itanium Processor Interconnect Technology

More information

High Capacity and High Performance 20nm FPGAs. Steve Young, Dinesh Gaitonde August Copyright 2014 Xilinx

High Capacity and High Performance 20nm FPGAs. Steve Young, Dinesh Gaitonde August Copyright 2014 Xilinx High Capacity and High Performance 20nm FPGAs Steve Young, Dinesh Gaitonde August 2014 Not a Complete Product Overview Page 2 Outline Page 3 Petabytes per month Increasing Bandwidth Global IP Traffic Growth

More information

Multi-Core Microprocessor Chips: Motivation & Challenges

Multi-Core Microprocessor Chips: Motivation & Challenges Multi-Core Microprocessor Chips: Motivation & Challenges Dileep Bhandarkar, Ph. D. Architect at Large DEG Architecture & Planning Digital Enterprise Group Intel Corporation October 2005 Copyright 2005

More information

Architecture without explicit locks for logic simulation on SIMD machines

Architecture without explicit locks for logic simulation on SIMD machines Architecture without explicit locks for logic on machines M. Chimeh Department of Computer Science University of Glasgow UKMAC, 2016 Contents 1 2 3 4 5 6 The Using models to replicate the behaviour of

More information

CS Computer Architecture Spring Lecture 01: Introduction

CS Computer Architecture Spring Lecture 01: Introduction CS 35101 Computer Architecture Spring 2008 Lecture 01: Introduction Created by Shannon Steinfadt Indicates slide was adapted from :Kevin Schaffer*, Mary Jane Irwinº, and from Computer Organization and

More information

Lecture 1: Introduction

Lecture 1: Introduction Contemporary Computer Architecture Instruction set architecture Lecture 1: Introduction CprE 581 Computer Systems Architecture, Fall 2016 Reading: Textbook, Ch. 1.1-1.7 Microarchitecture; examples: Pipeline

More information

How What When Why CSC3501 FALL07 CSC3501 FALL07. Louisiana State University 1- Introduction - 1. Louisiana State University 1- Introduction - 2

How What When Why CSC3501 FALL07 CSC3501 FALL07. Louisiana State University 1- Introduction - 1. Louisiana State University 1- Introduction - 2 Computer Organization and Design Dr. Arjan Durresi Louisiana State University Baton Rouge, LA 70803 durresi@csc.lsu.edu d These slides are available at: http://www.csc.lsu.edu/~durresi/csc3501_07/ Louisiana

More information

Background Heterogeneous Architectures Performance Modeling Single Core Performance Profiling Multicore Performance Estimation Test Cases Multicore

Background Heterogeneous Architectures Performance Modeling Single Core Performance Profiling Multicore Performance Estimation Test Cases Multicore By Dan Stafford Background Heterogeneous Architectures Performance Modeling Single Core Performance Profiling Multicore Performance Estimation Test Cases Multicore Design Space Results & Observations General

More information

MSc-IT 1st Semester Fall 2016, Course Instructor M. Imran khalil 1

MSc-IT 1st Semester Fall 2016, Course Instructor M. Imran khalil 1 Objectives Overview Differentiate among various styles of system units on desktop computers, notebook computers, and mobile devices Identify chips, adapter cards, and other components of a motherboard

More information

2009 International Solid-State Circuits Conference Intel Paper Highlights

2009 International Solid-State Circuits Conference Intel Paper Highlights 2009 International Solid-State Circuits Conference Intel Paper Highlights Mark Bohr Intel Senior Fellow Soumyanath Krishnamurthy Intel Fellow 1 2009 ISSCC Intel Paper Summary Under embargo until February,

More information

Parallelism and Concurrency. COS 326 David Walker Princeton University

Parallelism and Concurrency. COS 326 David Walker Princeton University Parallelism and Concurrency COS 326 David Walker Princeton University Parallelism What is it? Today's technology trends. How can we take advantage of it? Why is it so much harder to program? Some preliminary

More information

FABRICATION TECHNOLOGIES

FABRICATION TECHNOLOGIES FABRICATION TECHNOLOGIES DSP Processor Design Approaches Full custom Standard cell** higher performance lower energy (power) lower per-part cost Gate array* FPGA* Programmable DSP Programmable general

More information

Fundamentals of Computer Design

Fundamentals of Computer Design Fundamentals of Computer Design Computer Architecture J. Daniel García Sánchez (coordinator) David Expósito Singh Francisco Javier García Blas ARCOS Group Computer Science and Engineering Department University

More information

Computer Architecture

Computer Architecture Computer Architecture Chapter 7 Parallel Processing 1 Parallelism Instruction-level parallelism (Ch.6) pipeline superscalar latency issues hazards Processor-level parallelism (Ch.7) array/vector of processors

More information

(ii) Why are we going to multi-core chips to find performance? Because we have to.

(ii) Why are we going to multi-core chips to find performance? Because we have to. CSE 30321 Computer Architecture I Fall 2009 Lab 06 Introduction to Multi-core Processors and Parallel Programming Assigned: November 3, 2009 Due: November 17, 2009 1. Introduction: This lab will introduce

More information

Technology in Action

Technology in Action Technology in Action Chapter 9 Behind the Scenes: A Closer Look at System Hardware 1 Binary Language Computers work in binary language. Consists of two numbers: 0 and 1 Everything a computer does is broken

More information

Computer Architecture!

Computer Architecture! Informatics 3 Computer Architecture! Dr. Boris Grot and Dr. Vijay Nagarajan!! Institute for Computing Systems Architecture, School of Informatics! University of Edinburgh! General Information! Instructors:!

More information

Introduction to Microprocessor

Introduction to Microprocessor Introduction to Microprocessor Slide 1 Microprocessor A microprocessor is a multipurpose, programmable, clock-driven, register-based electronic device That reads binary instructions from a storage device

More information

Microarchitecture Overview. Performance

Microarchitecture Overview. Performance Microarchitecture Overview Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu January 15, 2007 Performance 4 Make operations faster Process improvements Circuit improvements Use more transistors to make

More information

Calendar Description

Calendar Description ECE212 B1: Introduction to Microprocessors Lecture 1 Calendar Description Microcomputer architecture, assembly language programming, memory and input/output system, interrupts All the instructions are

More information

Computer Architecture A Quantitative Approach, Fifth Edition. Chapter 1. Copyright 2012, Elsevier Inc. All rights reserved. Computer Technology

Computer Architecture A Quantitative Approach, Fifth Edition. Chapter 1. Copyright 2012, Elsevier Inc. All rights reserved. Computer Technology Computer Architecture A Quantitative Approach, Fifth Edition Chapter 1 Fundamentals of Quantitative Design and Analysis 1 Computer Technology Performance improvements: Improvements in semiconductor technology

More information

HISTORY OF MICROPROCESSORS

HISTORY OF MICROPROCESSORS HISTORY OF MICROPROCESSORS CONTENTS Introduction 4-Bit Microprocessors 8-Bit Microprocessors 16-Bit Microprocessors 1 32-Bit Microprocessors 64-Bit Microprocessors 2 INTRODUCTION Fairchild Semiconductors

More information

Parallelism in Hardware

Parallelism in Hardware Parallelism in Hardware Minsoo Ryu Department of Computer Science and Engineering 2 1 Advent of Multicore Hardware 2 Multicore Processors 3 Amdahl s Law 4 Parallelism in Hardware 5 Q & A 2 3 Moore s Law

More information

Online Course Evaluation. What we will do in the last week?

Online Course Evaluation. What we will do in the last week? Online Course Evaluation Please fill in the online form The link will expire on April 30 (next Monday) So far 10 students have filled in the online form Thank you if you completed it. 1 What we will do

More information

Computer Architecture. Fall Dongkun Shin, SKKU

Computer Architecture. Fall Dongkun Shin, SKKU Computer Architecture Fall 2018 1 Syllabus Instructors: Dongkun Shin Office : Room 85470 E-mail : dongkun@skku.edu Office Hours: Wed. 15:00-17:30 or by appointment Lecture notes nyx.skku.ac.kr Courses

More information

Multicore Hardware and Parallelism

Multicore Hardware and Parallelism Multicore Hardware and Parallelism Minsoo Ryu Department of Computer Science and Engineering 2 1 Advent of Multicore Hardware 2 Multicore Processors 3 Amdahl s Law 4 Parallelism in Hardware 5 Q & A 2 3

More information

VLSI Design Automation

VLSI Design Automation VLSI Design Automation IC Products Processors CPU, DSP, Controllers Memory chips RAM, ROM, EEPROM Analog Mobile communication, audio/video processing Programmable PLA, FPGA Embedded systems Used in cars,

More information

CSE502: Computer Architecture CSE 502: Computer Architecture

CSE502: Computer Architecture CSE 502: Computer Architecture CSE 502: Computer Architecture Multi-{Socket,,Thread} Getting More Performance Keep pushing IPC and/or frequenecy Design complexity (time to market) Cooling (cost) Power delivery (cost) Possible, but too

More information

Comparison of Parallel Processing Systems. Motivation

Comparison of Parallel Processing Systems. Motivation Comparison of Parallel Processing Systems Ash Dean Katie Willis CS 67 George Mason University Motivation Increasingly, corporate and academic projects require more computing power than a typical PC can

More information

Computer Architecture Spring 2016

Computer Architecture Spring 2016 Computer Architecture Spring 2016 Lecture 19: Multiprocessing Shuai Wang Department of Computer Science and Technology Nanjing University [Slides adapted from CSE 502 Stony Brook University] Getting More

More information

EECS4201 Computer Architecture

EECS4201 Computer Architecture Computer Architecture A Quantitative Approach, Fifth Edition Chapter 1 Fundamentals of Quantitative Design and Analysis These slides are based on the slides provided by the publisher. The slides will be

More information

Performance Characterization, Prediction, and Optimization for Heterogeneous Systems with Multi-Level Memory Interference

Performance Characterization, Prediction, and Optimization for Heterogeneous Systems with Multi-Level Memory Interference The 2017 IEEE International Symposium on Workload Characterization Performance Characterization, Prediction, and Optimization for Heterogeneous Systems with Multi-Level Memory Interference Shin-Ying Lee

More information

VLSI Design Automation. Calcolatori Elettronici Ing. Informatica

VLSI Design Automation. Calcolatori Elettronici Ing. Informatica VLSI Design Automation 1 Outline Technology trends VLSI Design flow (an overview) 2 IC Products Processors CPU, DSP, Controllers Memory chips RAM, ROM, EEPROM Analog Mobile communication, audio/video processing

More information

THERMAL BENCHMARK AND POWER BENCHMARK SOFTWARE

THERMAL BENCHMARK AND POWER BENCHMARK SOFTWARE Nice, Côte d Azur, France, 27-29 September 26 THERMAL BENCHMARK AND POWER BENCHMARK SOFTWARE Marius Marcu, Mircea Vladutiu, Horatiu Moldovan and Mircea Popa Department of Computer Science, Politehnica

More information

NVIDIA GTX200: TeraFLOPS Visual Computing. August 26, 2008 John Tynefield

NVIDIA GTX200: TeraFLOPS Visual Computing. August 26, 2008 John Tynefield NVIDIA GTX200: TeraFLOPS Visual Computing August 26, 2008 John Tynefield 2 Outline Execution Model Architecture Demo 3 Execution Model 4 Software Architecture Applications DX10 OpenGL OpenCL CUDA C Host

More information

HPC Technology Trends

HPC Technology Trends HPC Technology Trends High Performance Embedded Computing Conference September 18, 2007 David S Scott, Ph.D. Petascale Product Line Architect Digital Enterprise Group Risk Factors Today s s presentations

More information

Measurement-based Analysis of TCP/IP Processing Requirements

Measurement-based Analysis of TCP/IP Processing Requirements Measurement-based Analysis of TCP/IP Processing Requirements Srihari Makineni Ravi Iyer Communications Technology Lab Intel Corporation {srihari.makineni, ravishankar.iyer}@intel.com Abstract With the

More information

Developing a Data Driven System for Computational Neuroscience

Developing a Data Driven System for Computational Neuroscience Developing a Data Driven System for Computational Neuroscience Ross Snider and Yongming Zhu Montana State University, Bozeman MT 59717, USA Abstract. A data driven system implies the need to integrate

More information

Understanding Dual-processors, Hyper-Threading Technology, and Multicore Systems

Understanding Dual-processors, Hyper-Threading Technology, and Multicore Systems Understanding Dual-processors, Hyper-Threading Technology, and Multicore Systems This paper will provide you with a basic understanding of the differences among several computer system architectures dual-processor

More information

Computers: Inside and Out

Computers: Inside and Out Computers: Inside and Out Computer Components To store binary information the most basic components of a computer must exist in two states State # 1 = 1 State # 2 = 0 1 Transistors Computers use transistors

More information

Fundamentals of Computers Design

Fundamentals of Computers Design Computer Architecture J. Daniel Garcia Computer Architecture Group. Universidad Carlos III de Madrid Last update: September 8, 2014 Computer Architecture ARCOS Group. 1/45 Introduction 1 Introduction 2

More information

A TALENTED CPU-TO-GPU MEMORY MAPPING TECHNIQUE

A TALENTED CPU-TO-GPU MEMORY MAPPING TECHNIQUE A TALENTED CPU-TO-GPU MEMORY MAPPING TECHNIQUE Abu Asaduzzaman, Deepthi Gummadi, and Chok M. Yip Department of Electrical Engineering and Computer Science Wichita State University Wichita, Kansas, USA

More information

SYSTEM BUS AND MOCROPROCESSORS HISTORY

SYSTEM BUS AND MOCROPROCESSORS HISTORY SYSTEM BUS AND MOCROPROCESSORS HISTORY Dr. M. Hebaishy momara@su.edu.sa http://colleges.su.edu.sa/dawadmi/fos/pages/hebaishy.aspx Digital Logic Design Ch1-1 SYSTEM BUS The CPU sends various data values,

More information

About the Presentations

About the Presentations About the Presentations The presentations cover the objectives found in the opening of each chapter. All chapter objectives are listed in the beginning of each presentation. You may customize the presentations

More information

Computer Architecture s Changing Definition

Computer Architecture s Changing Definition Computer Architecture s Changing Definition 1950s Computer Architecture Computer Arithmetic 1960s Operating system support, especially memory management 1970s to mid 1980s Computer Architecture Instruction

More information

LECTURE 1. Introduction

LECTURE 1. Introduction LECTURE 1 Introduction CLASSES OF COMPUTERS When we think of a computer, most of us might first think of our laptop or maybe one of the desktop machines frequently used in the Majors Lab. Computers, however,

More information

UMBC. Rubini and Corbet, Linux Device Drivers, 2nd Edition, O Reilly. Systems Design and Programming

UMBC. Rubini and Corbet, Linux Device Drivers, 2nd Edition, O Reilly. Systems Design and Programming Systems Design and Programming Instructor: Professor Jim Plusquellic Text: Barry B. Brey, The Intel Microprocessors, 8086/8088, 80186/80188, 80286, 80386, 80486, Pentium and Pentium Pro Processor Architecture,

More information

CS 590: High Performance Computing. Parallel Computer Architectures. Lab 1 Starts Today. Already posted on Canvas (under Assignment) Let s look at it

CS 590: High Performance Computing. Parallel Computer Architectures. Lab 1 Starts Today. Already posted on Canvas (under Assignment) Let s look at it Lab 1 Starts Today Already posted on Canvas (under Assignment) Let s look at it CS 590: High Performance Computing Parallel Computer Architectures Fengguang Song Department of Computer Science IUPUI 1

More information

The Computer Revolution. Classes of Computers. Chapter 1

The Computer Revolution. Classes of Computers. Chapter 1 COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition 1 Chapter 1 Computer Abstractions and Technology 1 The Computer Revolution Progress in computer technology Underpinned by Moore

More information

A+ Guide to Hardware: Managing, Maintaining, and Troubleshooting, 5e. Chapter 1 Introducing Hardware

A+ Guide to Hardware: Managing, Maintaining, and Troubleshooting, 5e. Chapter 1 Introducing Hardware : Managing, Maintaining, and Troubleshooting, 5e Chapter 1 Introducing Hardware Objectives Learn that a computer requires both hardware and software to work Learn about the many different hardware components

More information

ELE 455/555 Computer System Engineering. Section 1 Review and Foundations Class 5 Computer System Performance

ELE 455/555 Computer System Engineering. Section 1 Review and Foundations Class 5 Computer System Performance ELE 455/555 Computer System Engineering Section 1 Review and Foundations Class 5 Computer System Overview Eight Great Ideas in Computer Architecture Design for Moore s Law Integrated Circuit resources

More information

Auto-Tuning Multi-Programmed Workload on the SCC

Auto-Tuning Multi-Programmed Workload on the SCC Auto-Tuning Multi-Programmed Workload on the SCC Brian Roscoe, Mathias Herlev, Chen Liu Department of Electrical and Computer Engineering Clarkson University Potsdam, NY 13699, USA {roscoebj,herlevm,cliu}@clarkson.edu

More information

Part 1 of 3 -Understand the hardware components of computer systems

Part 1 of 3 -Understand the hardware components of computer systems Part 1 of 3 -Understand the hardware components of computer systems The main circuit board, the motherboard provides the base to which a number of other hardware devices are connected. Devices that connect

More information

Parallel Programming Multicore systems

Parallel Programming Multicore systems FYS3240 PC-based instrumentation and microcontrollers Parallel Programming Multicore systems Spring 2011 Lecture #9 Bekkeng, 4.4.2011 Introduction Until recently, innovations in processor technology have

More information

White Paper Assessing FPGA DSP Benchmarks at 40 nm

White Paper Assessing FPGA DSP Benchmarks at 40 nm White Paper Assessing FPGA DSP Benchmarks at 40 nm Introduction Benchmarking the performance of algorithms, devices, and programming methodologies is a well-worn topic among developers and research of

More information

Parallel graph traversal for FPGA

Parallel graph traversal for FPGA LETTER IEICE Electronics Express, Vol.11, No.7, 1 6 Parallel graph traversal for FPGA Shice Ni a), Yong Dou, Dan Zou, Rongchun Li, and Qiang Wang National Laboratory for Parallel and Distributed Processing,

More information

Lecture 2: Performance

Lecture 2: Performance Lecture 2: Performance Today s topics: Technology wrap-up Performance trends and equations Reminders: YouTube videos, canvas, and class webpage: http://www.cs.utah.edu/~rajeev/cs3810/ 1 Important Trends

More information

Systems Design and Programming. Instructor: Chintan Patel

Systems Design and Programming. Instructor: Chintan Patel Systems Design and Programming Instructor: Chintan Patel Text: Barry B. Brey, 'The Intel Microprocessors, 8086/8088, 80186/80188, 80286, 80386, 80486, Pentium and Pentium Pro Processor, Pentium II, Pentium

More information

CIT 668: System Architecture

CIT 668: System Architecture CIT 668: System Architecture Computer Systems Architecture I 1. System Components 2. Processor 3. Memory 4. Storage 5. Network 6. Operating System Topics Images courtesy of Majd F. Sakr or from Wikipedia

More information