Determining the Relevancy of Moore's Law Through the Comparison of Ten Distinct Processor Systems
Nickolas DeVito
Department of Electrical Engineering and Computer Science
University of Central Florida
Orlando, FL

Abstract
The main purpose of this project is to compare and contrast the fundamental metrics and specifications of ten example processor architectures, dating from the 1990s to the present day. CPU systems are analyzed based on their clock rates, memory capacity, number of processors or cores, and data bus word width. The most popular designs studied were parallel, multi-core microarchitectures for general or special purposes. Baseline designs, such as Intel Pentium, AMD Athlon, and Xilinx FPGA-based architectures, were examined in this project, as were less historical designs, such as Xeon EX and OpenSPARC T1. This research aims to give a better understanding of how different CPU systems operate, and of how Moore's Law still defines processor improvement rates in the computer industry. It is important to acknowledge these systems' similarities and differences, as CPU architecture design is a massive field worldwide, and future innovations will determine how they improve.

Keywords: Amdahl's Law, Architecture, Baseline Computer Systems, Clock rate, Core, Data bus, Memory capacity, Moore's Law, Parallelism, Specifications, Speedup, Transistors.

I. OVERVIEW OF PROCESSOR ARCHITECTURE

Computer hardware involves a carefully implemented system of instruction sets operating under defined specifications. This project studies processor architecture by looking at ten different CPU systems, including five baseline computer systems. The components and principles used to explore processor architecture are defined below. Performance data for the ten systems is also shown.

1) Classic Components: A computer system is typically made up of five classic components: datapath, control, memory, input, and output [10].
The datapath manipulates the data used within the computer. The control determines how that data is operated on, effectively giving directions to the datapath. Together, the datapath and control form the computer's processor. Memory sits outside the processor and stores data, which can be retrieved or rewritten under direction from the control. The input and output both connect to external devices that the user can observe or manipulate. Input takes actions from the user and translates them into directions for the processor, while output presents data to the user to observe or download. Examples of input devices include keyboards, microphones, and mice, while output devices include monitors and printers. In addition, certain devices act as both input and output, such as external hard drives and memory sticks.

2) Processor Buses and Bit Width: Interaction between a computer's processor and memory is necessary to complete basic functions. Information is transferred between the two components over two buses, the address bus and the data bus. The processor sends bits along the address bus to the memory to select the location that will be read or written. The address bus carries information in one direction only, while the data bus can transfer data both to and from the processor. This is because the information sent along the address bus is simply the memory address where data is stored. The actual data at that address can be sent to the processor along the data bus, so that it can be "crunched," and then sent back to the memory once execution has completed. The bit width refers to the number of wires in each bus. Since each wire can transmit one binary digit at a time, n wires allow 2^n possible values.

3) Metrics Studied: This project refers to several metrics that are used to study and evaluate the processor architectures.
CPU clock rate (measured in MHz) is the number of clock cycles a processor executes per second, and serves as a general indicator of how fast the processor performs. Instruction set size (measured in bits) refers to the length of the instructions a processor handles. A larger data bus width (and effectively a greater instruction size) means more machine instructions can be encoded, so a greater variety of CPU tasks can be executed. Memory capacity (measured in MB) refers to how large the memory of a computer system is, and therefore how much data can be stored. These metrics, and their respective units, can be seen in the graphs in Section III of this project.

4) Significance: It is important to look at these metrics when comparing computer architectures, because they can be used to evaluate performance and to determine the rate at which it has improved over a period of time. Comparing clock rates between different systems is a good way to determine time enhancement, and the overall speedup of a system, in line with Amdahl's Law. Memory capacity can be used to illustrate Moore's Law.
Since larger memory means more transistors are being used, systems of increasing memory capacity can be compared to evaluate the rate at which memory capacity doubles.

5) Processor Performance Equation: An equation exists that can be used to evaluate processor performance for different systems. Performance is inversely proportional to execution time, and execution time is the product of instruction count, CPI, and clock cycle time. Instruction count refers to the number of instructions that need to be executed. CPI stands for clock cycles per instruction, and means how many clock cycles are needed on average to complete a single instruction. Cycle time is the duration of one clock cycle, which is the reciprocal of the clock rate. Overall, the equation states:

\[ \text{Performance} = \frac{1}{\text{CPU Time}} = \frac{1}{\text{Instruction Count} \times \text{CPI} \times \text{Cycle Time}} \]

6) Parallelism: Parallelism means that multiple processors can be used in unison to form one large computer architecture, reducing execution time and increasing memory capacity and overall performance [2]. Parallelism is a practice often used in large server farms and supercomputers, accomplishing more than a single desktop could achieve. While connecting hundreds of CPUs isn't cost or space efficient for consumer-based computers, parallelism can still be used on a smaller scale by installing multiple cores. When each core has its own local memory, a dual core CPU can store twice as much as a single core, a quad core CPU four times as much, and so on. Having more cores also lowers execution time. Speedup factor compares the execution time of two different systems. If a single core system is turned into a dual core system, then a fully parallel program could be executed up to twice as fast. In general, the speedup is expressed by the following equation:

\[ \text{Speedup} = \frac{1}{(1 - \text{Fraction}_{\text{Enhanced}}) + \dfrac{\text{Fraction}_{\text{Enhanced}}}{\text{Speedup}_{\text{Enhanced}}}} \]

The next sections of this report take a more direct look at how these concepts can be used to study processor architectures, by looking at ten specific examples.
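As a rough illustration of the processor performance equation in item 5 and the bit-width rule in item 2, the sketch below evaluates both for a made-up workload. The function names and the workload numbers (one billion instructions, a CPI of 2, clock rates of 500 MHz and 1 GHz) are assumptions chosen for the example, not measurements from any of the ten systems.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds.

    Cycle time is 1 / clock rate, so
    instruction count * CPI * cycle time == instruction count * CPI / clock rate.
    """
    return instruction_count * cpi / clock_rate_hz


def performance(instruction_count, cpi, clock_rate_hz):
    """Performance is the reciprocal of CPU time."""
    return 1.0 / cpu_time(instruction_count, cpi, clock_rate_hz)


def distinct_values(bus_width_bits):
    """Number of distinct values that n one-bit wires can carry: 2^n."""
    return 2 ** bus_width_bits


# Hypothetical workload: same program and CPI, two different clock rates.
t_slow = cpu_time(1_000_000_000, 2.0, 500e6)  # 500 MHz
t_fast = cpu_time(1_000_000_000, 2.0, 1e9)    # 1 GHz
print(f"slow: {t_slow} s, fast: {t_fast} s, ratio: {t_slow / t_fast}")
print(f"a 16-wire bus can carry {distinct_values(16)} distinct values")
```

With instruction count and CPI held fixed, doubling the clock rate halves the CPU time, which is why clock rate alone is only a fair comparison between systems that run the same instructions at a similar CPI.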
Section II introduces the ten processor architectures, spanning from the 1990s until today, in detail. Section III evaluates the data gathered by studying these architectures, and Section IV concludes with what can be learned from the charted metrics.

II. LITERATURE REVIEW

All ten processor systems were researched from various reports and sources. The findings for each system can be found below, starting with the five baseline computer systems and following with five non-baseline designs used as comparisons. The metrics can also be seen in Table I.

The MasPar MP-1 Model 128 Backend was developed in 1992, as a special purpose highly parallel architecture [2]. The SIMD architecture includes 8192 PEs, each containing a 4-bit processor with 16 KB of RAM. A -bit, 128 MHz processor controls data flow in the PE array.

The SNAP-1 Parallel AI Prototype was first developed in 1992, as a special purpose architecture for natural language understanding [3]. Known as the SNAP project, the University of Southern California processor system included 144 -bit DSP chips (distributed over 8 different boards), each operating at 25 MHz. The local memory for each chip in the system was capable of storing 256 KB.

The Intel Pentium III processor system was first designed in 1999, as a general purpose desktop architecture [1]. It utilized 18 -bit CPUs, each running at 1 GHz. Each CPU could store 256 KB of memory, making the computer system an ideal option for conducting simulation tests and achieving benchmark results.

2000-2009:

The AMD Athlon 24+ Processor System was developed in 2002, as a general purpose desktop architecture [1]. The computer system contains 256 parallelized CPUs, each running at 24 MHz (as the name suggests). All CPUs in the system have a data length of bits, and can store 256 KB of memory.
Like the Intel Pentium III, the AMD Athlon 24+ is also ideal for simulation tests and benchmark results, but its larger number of CPUs and later release date mean it has better performance overall.

In 2006, the Xilinx Virtex-4 SX35 FPGA-Based Scalable Architecture was released as a special purpose architecture for DCT computation [4]. It achieved FPGA dynamic partial reconfiguration with 8 distinct core hardware arrangements (73577 gates overall), each operating at 1 MHz. The -bit single core CPUs could each store 2.3 MB of memory.

The OpenSPARC T1 CPU Microarchitecture was also released in 2006 (originally named the UltraSPARC T1) [8]. It was developed by Sun Microsystems (since acquired by Oracle) with 8 4-way multithreaded -bit cores, using an RTL model for simulation. Each core could store up to 3.19 MB of memory, and could run at 8 MHz.

2008 saw the release of an Experimental Intel Core 2 Quad Machine, which was designed as a special purpose architecture for workload synthesis and statistical modeling (although a commercial version for general purposes came shortly after) [5]. It included 4 homogeneous -bit dual cores (8 in total), each operating at 24 MHz. The cores could each store 4.1 GB of memory, which is a huge amount compared to most other architectures, but the smaller number of cores balanced out the total memory capabilities of the system.

The Xeon EX Processor System was designed in 2009 as a general purpose processor system with eight dual-threaded cores [6]. Each core held a data length of 64 bits, and overall, the system was made up of 2.3 billion transistors (on a 45 nm process). The CPU ran at MHz, and each core could contain 24 MB of system memory.

2010-present:

In 2010, the IA 2D-Mesh Network-On-Chip Architecture was constructed as a special purpose message passing computer system [7]. Having 48 integrated cores (with 1.3 billion transistors) enabled the processor to accomplish rapid calculations at a pace previous CPU systems could not match.
The processor had a word width of bits, and each core could operate at 2 MHz, storing 256 KB of memory.

The LTE MIMO Processor SIMT Architecture was first released in 2013, as a special purpose RSIMP approach system for parallel instruction sequences [9]. The system relied on 16 unique quad-core CPUs (4core-4wide), totaling 64 -bit cores. Each core could store KB of memory, while operating at an 8 MHz clock rate.

Clock Rate (MHz) vs. Time (Year)

III. DATA ANALYSIS

The data extracted for all ten computer architectures (detailed in Table I) were used to chart four different graphs evaluating relevant metrics over time. The four key metrics used to determine performance were clock rate, memory capacity, the total number of cores, and the data bus word width (see Figure 1 for a more detailed explanation).

Figure 2 shows the clock rates of each of the ten processor architectures over time. As can be seen in the graph, there is visible growth in the number of clock cycles a processor can execute per second. Naturally, the older architectures operate more slowly. Since smaller and more numerous transistors have historically enabled higher clock rates, this graph is a clear representation of Moore's Law acting on these systems.

The change in memory capacity for all systems over time can be seen in Figure 3. Although the graph shows a slight decline over time, this is due in part to the fact that some architectures studied were special purpose systems with multiple parallel cores, while others were simple desktop systems. In terms of the potential to store more memory per core, the systems that appear later (such as the Xilinx Virtex-4 SX35 and the Xeon EX) have the largest capacity.

Figure 4 displays the number of cores present in architectures over time. This graph also shows a decline, but this is due to the 1992 MasPar MP-1 Model 128 Back End being an outlier with 8192 combined cores. Ignoring this shows a more reasonable range of cores being present in the architectures.
Moreover, many newer systems are able to achieve greater or similar results with fewer cores because they have been better optimized for performance.

Lastly, Figure 5 shows each processor system's word width over time. Although the graph shows a visible increase, it's important to note that all but one of the systems studied supported a data bus with bits. The Xeon EX system's data bus was 64 bits wide, despite the system having fewer cores than most architectures.

Fig. 2. Plot shows exponential growth of processor clock rate over time.

Memory Capacity (MB) vs. Time (Year)
Fig. 3. Plot shows total memory capacity of architectures over time.

Number of Cores vs. Time (Year)
Fig. 4. Plot shows number of cores present in compared CPU designs.

Metrics analyzed in this paper:
CPU clock rate (MHz) vs. Year
Memory Capacity (MB) vs. Year
Number of Processors or Cores vs. Year
Data bus Word Width (bits) vs. Year
Fig. 1. These metrics were used to study each of the ten processor systems by evaluating varying performance over time.
Data Bus Word Width (bits) vs. Time (Year)
Fig. 5. Plot shows exponential growth of CPU data bus length over time.

IV. CONCLUSION

This project set out to display the validity of Moore's Law using ten processor architectures (five baseline systems and five non-baseline). Overall, the findings agreed with Moore's Law, as observed especially in the clock rates of each system, seen in Figure 2. Other metrics weren't as reliable in expressing Moore's Law, but this was largely due to the types of example architectures used. Rather than selecting ten general purpose desktop systems to compare in succession, this project used a wide range of architectures designed for different purposes. Some systems didn't chart well in the metric graphs because they were lower profile than systems before them, and had fewer operating cores with lower memory capacity. This isn't to say that Moore's Law can't be applied, but that certain architectures only require a certain level of specifications to complete their purposes. If more closely similar systems were studied, perhaps the findings would have been more noticeable.

V. REFERENCES

[1] H. A. Bahr and R. F. DeMara, "OTBSAF Scalability on Pentium III/4 and Athlon 64/XP3 Architectures," in MSIAC Modeling and Simulation Journal, on February 9, 2005, Vol. 6, No. 3, March, 2005.
[2] H. Bahr, R. F. DeMara, and M. Georgiopoulos, "Integer-Encoded Massively Parallel Processing of Fast-Learning ARTMAP Networks," in Proceedings of the 1997 SPIE AeroSense Symposium (AeroSense-97), Orlando, Florida, U.S.A., April 21-24, 1997.
[3] R. F. DeMara and D. I. Moldovan, "The SNAP-1 Parallel AI Prototype," IEEE Transactions on Parallel and Distributed Systems, Vol. 4, No. 8, August, 1993.
[4] J. Huang, M. Parris, J. Lee, and R. F. DeMara, "Scalable FPGA-based Architecture for DCT Computation Using Dynamic Partial Reconfiguration," ACM Transactions on Embedded Computing Systems, Vol. 9, No. 1, Art. 9, October, 2009.
[5] Hughes, C.; Tao Li, "Accelerating multi-core processor design space evaluation using automatic multi-threaded workload synthesis," Workload Characterization, 2008. IISWC 2008. IEEE International Symposium on, pp. 163-172, Sept. 2008.
[6] Rusu, S.; Simon Tam; Muljono, H.; Stinson, J.; Ayers, D.; Chang, Jonathan; Varada, R.; Ratta, M.; Kottapalli, S.; Vora, S., "A 45 nm 8-Core Enterprise Xeon Processor," Solid-State Circuits, IEEE Journal of, vol. 45, no. 1, pp. 7-14, Jan. 2010.
[7] Howard, J.; Dighe, S.; Vangal, S.R.; Ruhl, G.; Borkar, N.; Jain, S.; Erraguntla, V.; Konow, M.; Riepen, M.; Gries, M.; Droege, G.; Lund-Larsen, T.; Steibl, S.; Borkar, S.; De, V.K.; Van Der Wijngaart, R., "A 48-Core IA- Processor in 45 nm CMOS Using On-Die Message-Passing and DVFS for Performance and Power Scaling," Solid-State Circuits, IEEE Journal of, vol. 46, no. 1, pp. 173-183, Jan.
[8] Kahng, A.B.; Seokhyeong Kang; Kumar, R.; Sartori, J., "Designing a processor from the ground up to allow voltage/reliability tradeoffs," High Performance Computer Architecture (HPCA), 2010 IEEE 16th International Symposium on, pp. 1-11, 9-14 Jan. 2010.
[9] Zheng Yu; Zhiyi Yu; Xueqiu Yu; Ningxi Liu; Xiaoyang Zeng, "Low-Power Multicore Processor Design With Reconfigurable Same-Instruction Multiple Process," Circuits and Systems II: Express Briefs, IEEE Transactions on, vol. 61, no. 6, pp. 423-427, June 2014.
[10] Shute, G., "Components of a Computer," Computer Components, June 2014, University of Minnesota Duluth, August 2015.

TABLE I.
SPECIFICATIONS FOR TEN CPU ARCHITECTURES STUDIED IN PROJECT

Columns: Name of Architecture [reference]; Application-Specific or General-purpose Computation; Die Area, Number of Transistors, or Number of Chips/Boards/etc.; CPU Clock Rate (MHz); Memory Capacity (MB); Data Bus Word Width (bits); Number of Cores or CPUs; Ideal Speedup for 99% parallel code (ignoring overheads).

SNAP-1 Parallel AI Prototype [3]
  Computation: NLU: Special
  Chips/Boards: 144 DSP chips on 8 large circuit boards
  Number of Cores or CPUs: 144 single core CPUs = 144 cores
  Ideal Speedup: 144 cores so Told/Tnew = 1/(0.01 + (0.99/144))

Intel Pentium III Processor System [1]
  Chips/Boards: Computer system utilizing 18 CPUs
  Number of Cores or CPUs: 18 single core CPUs = 18 cores
  Ideal Speedup: 1/(0.01 + (0.99/18))

AMD Athlon 24+ Processor System [1]
  Number of Cores or CPUs: 256 single core CPUs = 256 cores
  Ideal Speedup: 256 cores so Told/Tnew = 1/(0.01 + (0.99/256))

Xilinx Virtex-4 SX35 FPGA-Based Scalable Architecture [4]
  Computation: DCT Computation: Special
  Memory Capacity: 2.3 MB/CPU * 8 CPU = 18.4 MB
  Number of Cores or CPUs: 8 single core CPUs = 8 cores
  Ideal Speedup: 1/(0.01 + (0.99/8)) = 7.48-fold

MasPar MP-1 Model 128 Back End [2]
  Computation: Highly Parallel Architecture: Special
  Memory Capacity: 16 KB/CPU * 8192 CPU = 128 MB
  Number of Cores or CPUs: 8192 single core CPUs = 8192 cores
  Ideal Speedup: 8192 cores so Told/Tnew = 1/(0.01 + (0.99/8192))

Xeon EX Processor System [6]
  Transistors: 8 dual-threaded 64-bit cores, 2.3 B transistors
  Memory Capacity: 24 MB/CPU * 8 = 192 MB
  Data Bus Word Width: 64
  Number of Cores or CPUs: 8 dual-threaded CPUs = 8 cores
  Ideal Speedup: 1/(0.01 + (0.99/8)) = 7.48-fold

LTE MIMO Processor SIMT Architecture [9]
  Computation: RSIMP Approach: Special
  Chips/Boards: 16-core processor, each 4core-4wide
  Number of Cores or CPUs: 16 quad core CPUs = 64 cores
  Ideal Speedup: 64 cores so Told/Tnew = 1/(0.01 + (0.99/64))

IA 2D-Mesh Network-On-Chip Architecture [7]
  Computation: Message Passing: Special
  Transistors: 48 integrated CPUs, with 1.3 B total transistors
  Number of Cores or CPUs: 48 single core CPUs = 48 cores
  Ideal Speedup: 1/(0.01 + (0.99/48))

Experimental Intel Core 2 Quad Machine [5]
  Computation: Workload Synthesis and Statistical Modeling: Special
  Chips/Boards: 4 homogeneous dual cores, with shared L2 caches
  Memory Capacity: 4.1 GB/CPU
  Number of Cores or CPUs: 4 dual core CPUs = 8 cores
  Ideal Speedup: 1/(0.01 + (0.99/8)) = 7.48-fold

OpenSPARC T1 CPU Microarchitecture [8]
  Chips/Boards: 8 CPU cores, with each being 4-way multithreaded
  Memory Capacity: 3.19 MB/CPU
  Number of Cores or CPUs: 8 single core CPUs = 8 cores
  Ideal Speedup: 1/(0.01 + (0.99/8)) = 7.48-fold
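The ideal speedup column of Table I applies Amdahl's Law with 99% of the code assumed parallel and the number of cores used as the enhancement factor. The short sketch below recomputes that column from the core counts listed in the table; the function name and the dictionary layout are illustrative choices, and overheads are ignored just as in the table.

```python
def amdahl_speedup(parallel_fraction, n_cores):
    """Ideal speedup T_old / T_new when `parallel_fraction` of the work
    is spread evenly over `n_cores` and the rest stays serial."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)


# Core counts as listed in the "Number of Cores or CPUs" column of Table I.
CORE_COUNTS = {
    "SNAP-1 Parallel AI Prototype": 144,
    "Intel Pentium III Processor System": 18,
    "AMD Athlon Processor System": 256,
    "Xilinx Virtex-4 SX35": 8,
    "MasPar MP-1 Back End": 8192,
    "Xeon EX Processor System": 8,
    "LTE MIMO Processor SIMT Architecture": 64,
    "IA 2D-Mesh Network-On-Chip": 48,
    "Experimental Intel Core 2 Quad Machine": 8,
    "OpenSPARC T1": 8,
}

for name, cores in CORE_COUNTS.items():
    print(f"{name}: {cores} cores -> {amdahl_speedup(0.99, cores):.2f}-fold")
```

For the 8-core systems this reproduces the 7.48-fold figure given in the table. Because 1% of the work remains serial, the ideal speedup stays below 100-fold no matter how many cores are added, which is why the 8192-core MasPar system gains far less than its core count suggests.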
Comparison of Processor Architectures and Metrics from 1992 to 2011
Comparison of Processor Architectures and Metrics from 1992 to 211 Michael Colucciello Department of Electrical Engineering and Computer Science University of Central Florida Orlando, FL 32816-2362 Abstract
More informationADVANCES IN PROCESSOR DESIGN AND THE EFFECTS OF MOORES LAW AND AMDAHLS LAW IN RELATION TO THROUGHPUT MEMORY CAPACITY AND PARALLEL PROCESSING
ADVANCES IN PROCESSOR DESIGN AND THE EFFECTS OF MOORES LAW AND AMDAHLS LAW IN RELATION TO THROUGHPUT MEMORY CAPACITY AND PARALLEL PROCESSING Evan Baytan Department of Electrical Engineering and Computer
More informationAn Analysis of machine processors and their evolution of performance, metrics and intended uses with respect to time.
An Analysis of machine processors and their evolution of performance, metrics and intended uses with respect to time. Ryan Hromada Department of Electrical Engineering and Computer Science University of
More informationThe Time is Moving and The Processor Technology also changing through time
The Time is Moving and The Processor Technology also changing through time 1993 214 Hai Nguyen Le Thanh Department of Electrical Engineering and Computer Science University of Central Florida Orlando,
More informationReview and Analysis of Select Performance Metrics for Processor Architecture Designs: a chronology from the 1990s to the Present
Review and Analysis of Select Performance Metrics for Processor Architecture Designs: a chronology from the 199s to the Present William J. Santos Department of Electrical Engineering and Computer Science
More informationSurvey of Processor Architectures and Applied Metrics
Survey of Processor Architectures and Applied Metrics Luis Gamarra Jimenez Department of Electrical Engineering and Computer Science University of Central Florida Orlando, FL 816-2362 Abstract Processor
More informationA Retrospective Examination of Moore s Law Relative to Multiple CPU Designs from the 1990 s to Present Day
A Retrospective Examination of Moore s Law Relative to Multiple CPU Designs from the 199 s to Present Day Steven Nguyen Department of Electrical Engineering and Computer Science University of Central Florida
More information1.3 Data processing; data storage; data movement; and control.
CHAPTER 1 OVERVIEW ANSWERS TO QUESTIONS 1.1 Computer architecture refers to those attributes of a system visible to a programmer or, put another way, those attributes that have a direct impact on the logical
More informationComputer Architecture. Introduction. Lynn Choi Korea University
Computer Architecture Introduction Lynn Choi Korea University Class Information Lecturer Prof. Lynn Choi, School of Electrical Eng. Phone: 3290-3249, 공학관 411, lchoi@korea.ac.kr, TA: 윤창현 / 신동욱, 3290-3896,
More informationHomeschool Enrichment. The System Unit: Processing & Memory
Homeschool Enrichment The System Unit: Processing & Memory Overview This chapter covers: How computers represent data and programs How the CPU, memory, and other components are arranged inside the system
More informationFundamentals of Quantitative Design and Analysis
Fundamentals of Quantitative Design and Analysis Dr. Jiang Li Adapted from the slides provided by the authors Computer Technology Performance improvements: Improvements in semiconductor technology Feature
More informationLow-Power Interconnection Networks
Low-Power Interconnection Networks Li-Shiuan Peh Associate Professor EECS, CSAIL & MTL MIT 1 Moore s Law: Double the number of transistors on chip every 2 years 1970: Clock speed: 108kHz No. transistors:
More informationMicroprocessor Trends and Implications for the Future
Microprocessor Trends and Implications for the Future John Mellor-Crummey Department of Computer Science Rice University johnmc@rice.edu COMP 522 Lecture 4 1 September 2016 Context Last two classes: from
More informationA Simple Model for Estimating Power Consumption of a Multicore Server System
, pp.153-160 http://dx.doi.org/10.14257/ijmue.2014.9.2.15 A Simple Model for Estimating Power Consumption of a Multicore Server System Minjoong Kim, Yoondeok Ju, Jinseok Chae and Moonju Park School of
More informationAdvanced Computer Architecture (CS620)
Advanced Computer Architecture (CS620) Background: Good understanding of computer organization (eg.cs220), basic computer architecture (eg.cs221) and knowledge of probability, statistics and modeling (eg.cs433).
More informationPerformance. February 12, Howard Huang 1
Performance Today we ll try to answer several questions about performance. Why is performance important? How can you define performance more precisely? How do hardware and software design affect performance?
More informationChapter 1: Introduction to the Microprocessor and Computer 1 1 A HISTORICAL BACKGROUND
Chapter 1: Introduction to the Microprocessor and Computer 1 1 A HISTORICAL BACKGROUND The Microprocessor Called the CPU (central processing unit). The controlling element in a computer system. Controls
More informationCSCI 402: Computer Architectures. Computer Abstractions and Technology (4) Fengguang Song Department of Computer & Information Science IUPUI.
CSCI 402: Computer Architectures Computer Abstractions and Technology (4) Fengguang Song Department of Computer & Information Science IUPUI Contents 1.7 - End of Chapter 1 Power wall The multicore era
More informationChapter 1: Fundamentals of Quantitative Design and Analysis
1 / 12 Chapter 1: Fundamentals of Quantitative Design and Analysis Be careful in this chapter. It contains a tremendous amount of information and data about the changes in computer architecture since the
More informationPerformance, Power, Die Yield. CS301 Prof Szajda
Performance, Power, Die Yield CS301 Prof Szajda Administrative HW #1 assigned w Due Wednesday, 9/3 at 5:00 pm Performance Metrics (How do we compare two machines?) What to Measure? Which airplane has the
More informationSerial. Parallel. CIT 668: System Architecture 2/14/2011. Topics. Serial and Parallel Computation. Parallel Computing
CIT 668: System Architecture Parallel Computing Topics 1. What is Parallel Computing? 2. Why use Parallel Computing? 3. Types of Parallelism 4. Amdahl s Law 5. Flynn s Taxonomy of Parallel Computers 6.
More information45-year CPU Evolution: 1 Law -2 Equations
4004 8086 PowerPC 601 Pentium 4 Prescott 1971 1978 1992 45-year CPU Evolution: 1 Law -2 Equations Daniel Etiemble LRI Université Paris Sud 2004 Xeon X7560 Power9 Nvidia Pascal 2010 2017 2016 Are there
More informationMulticore computer: Combines two or more processors (cores) on a single die. Also called a chip-multiprocessor.
CS 320 Ch. 18 Multicore Computers Multicore computer: Combines two or more processors (cores) on a single die. Also called a chip-multiprocessor. Definitions: Hyper-threading Intel's proprietary simultaneous
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 1. Computer Abstractions and Technology
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 1 Computer Abstractions and Technology Classes of Computers Personal computers General purpose, variety of software
More informationEnhancing Analysis-Based Design with Quad-Core Intel Xeon Processor-Based Workstations
Performance Brief Quad-Core Workstation Enhancing Analysis-Based Design with Quad-Core Intel Xeon Processor-Based Workstations With eight cores and up to 80 GFLOPS of peak performance at your fingertips,
More informationCopyright 2012, Elsevier Inc. All rights reserved.
Computer Architecture A Quantitative Approach, Fifth Edition Chapter 1 Fundamentals of Quantitative Design and Analysis 1 Computer Technology Performance improvements: Improvements in semiconductor technology
More informationHardware-Software Codesign. 1. Introduction
Hardware-Software Codesign 1. Introduction Lothar Thiele 1-1 Contents What is an Embedded System? Levels of Abstraction in Electronic System Design Typical Design Flow of Hardware-Software Systems 1-2
More informationTR An Overview of NVIDIA Tegra K1 Architecture. Ang Li, Radu Serban, Dan Negrut
TR-2014-17 An Overview of NVIDIA Tegra K1 Architecture Ang Li, Radu Serban, Dan Negrut November 20, 2014 Abstract This paperwork gives an overview of NVIDIA s Jetson TK1 Development Kit and its Tegra K1
More informationPerformance COE 403. Computer Architecture Prof. Muhamed Mudawar. Computer Engineering Department King Fahd University of Petroleum and Minerals
Performance COE 403 Computer Architecture Prof. Muhamed Mudawar Computer Engineering Department King Fahd University of Petroleum and Minerals What is Performance? How do we measure the performance of
More informationCOMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 1. Computer Abstractions and Technology
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 1 Computer Abstractions and Technology The Computer Revolution Progress in computer technology Underpinned by Moore
More informationChapter 1. Introduction: Part I. Jens Saak Scientific Computing II 7/348
Chapter 1 Introduction: Part I Jens Saak Scientific Computing II 7/348 Why Parallel Computing? 1. Problem size exceeds desktop capabilities. Jens Saak Scientific Computing II 8/348 Why Parallel Computing?
More informationThe Art of Parallel Processing
The Art of Parallel Processing Ahmad Siavashi April 2017 The Software Crisis As long as there were no machines, programming was no problem at all; when we had a few weak computers, programming became a
More informationEE5780 Advanced VLSI CAD
EE5780 Advanced VLSI CAD Lecture 1 Introduction Zhuo Feng 1.1 Prof. Zhuo Feng Office: EERC 513 Phone: 487-3116 Email: zhuofeng@mtu.edu Class Website http://www.ece.mtu.edu/~zhuofeng/ee5780fall2013.html
More informationVLSI Design Automation
VLSI Design Automation IC Products Processors CPU, DSP, Controllers Memory chips RAM, ROM, EEPROM Analog Mobile communication, audio/video processing Programmable PLA, FPGA Embedded systems Used in cars,
More informationPC I/O. May 7, Howard Huang 1
PC I/O Today wraps up the I/O material with a little bit about PC I/O systems. Internal buses like PCI and ISA are critical. External buses like USB and Firewire are becoming more important. Today also