Computer Architecture, Fall 2018
Syllabus
  Instructor: Dongkun Shin
  Office: Room 85470
  E-mail: dongkun@skku.edu
  Office hours: Wed. 15:00-17:30 or by appointment
  Lecture notes: nyx.skku.ac.kr > Courses > Computer Architecture (2018 Fall)
    http://nyx.skku.ac.kr/?page_id=1267
  Lecture notes and talks will be given in English.
Syllabus (cont'd)
  Main text
    D. A. Patterson and J. L. Hennessy, Computer Organization and Design MIPS Edition: The Hardware/Software Interface, 5th Edition, Elsevier, 2013.
    [Figure: book covers, 4th edition and 5th edition]
  Grading policy (subject to change)
    Attendance: 5%
    Midterm exam: 30%
    Final exam: 50%
    Assignment: 15%
  If you cheat on tests or other assignments, you will fail the class.
Syllabus (cont'd)
  Course Outline
    1. Introduction, Motivation, Computer Abstraction & Technology (Chapter 1)
    2. Performance Measurements & Evaluation (Chapter 1)
    3. Instruction Set Architecture, MIPS (Chapter 2)
    4. Arithmetic: Addition, Subtraction, Multiplication, Division and Floating Point (Chapter 3)
    5. Processor Implementation (Chapter 4)
    6. Memory Hierarchy: Cache, Virtual Memory (Chapter 5)
    7. Parallel Processors (Chapter 6)
Syllabus (cont'd)
  Assignments
    Programming assignments
      Pipeline simulator (C)
      Cache simulator (C or C++)
    If you don't complete the programming assignments, you cannot get an A+ grade, irrespective of your exam scores.
  Prerequisites
    Digital systems
    C programming
    System programming
If you have any questions, please feel free to interrupt me in English or Korean.
What You Will Learn
  The hardware/software interface
    How programs are translated into the machine language
    How the hardware executes the machine code
    Hardware components and their organization
  What determines program performance, and how it can be improved
  How hardware designers improve performance/energy
  What parallel processing is
  Why learn this stuff?
    You want to call yourself a computer scientist
    You want to build software people use (need performance)
    You need to make a purchasing decision or offer expert advice
Understanding Performance
  Algorithm
    Determines number of operations executed
  Programming language, compiler, architecture
    Determine number of machine instructions executed per operation
  Processor and memory system
    Determine how fast instructions are executed
  I/O system (including OS)
    Determines how fast I/O operations are executed
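The factors above are commonly combined into the classic relation CPU time = instruction count x CPI / clock rate. The slide does not state this formula; the sketch below is a minimal illustration of it, and the function name `cpu_time_seconds` is a hypothetical helper, not part of the course material.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the slides): estimates CPU time in seconds
 * using time = instruction_count * CPI / clock_rate.
 *   instruction_count: set by algorithm, language, compiler, and ISA
 *   cpi:               average clock cycles per instruction (processor, memory)
 *   clock_hz:          clock rate in cycles per second */
double cpu_time_seconds(uint64_t instruction_count, double cpi, double clock_hz)
{
    assert(clock_hz > 0.0);
    return (double)instruction_count * cpi / clock_hz;
}
```

For example, 2 billion instructions at an average CPI of 1.5 on a 1 GHz clock would take about 3 seconds under this model. It ignores I/O time, which the last bullet above treats separately.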
Components of a Computer
  The BIG Picture
  Same components for all kinds of computer
    Desktop, server, embedded
  Input/output includes
    User-interface devices: display, keyboard, mouse
    Storage devices: hard disk, CD/DVD, flash
    Network adapters: for communicating with other computers
  Our primary focus: Processor
Apple iPad 2
  Capacitive multitouch LCD screen
  3.8 V, 25 watt-hour battery
  Computer board
    Touchscreen line driver
    Apple A5 dual-core processor, 1 GHz (512 MB RAM)
    Power management chip
    Toshiba 16 GB NAND flash
    Power management IC
Inside the Processor
  Apple A6
A Safe Place for Data
  Volatile main memory
    Loses instructions and data when powered off
  Non-volatile secondary memory
    Magnetic disk
    Flash memory
    Optical disk (CD-ROM, DVD)
Decimal vs. Binary Notation
  Decimal: 4 TB = 4 x 10^12 bytes
  Binary: 4 TiB = 4 x 2^40 bytes
  [Figure labels: 4TiB (flash memory), 4TB SSD]
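The gap between the two notations is easy to compute. The sketch below is only an illustration; the macro and function names (`TB`, `TIB`, `decimal_tb_bytes`, `binary_tib_bytes`) are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define TB  UINT64_C(1000000000000)  /* decimal terabyte: 10^12 bytes */
#define TIB (UINT64_C(1) << 40)      /* binary tebibyte:  2^40 bytes  */

/* Hypothetical helpers: convert a capacity to bytes in each notation. */
uint64_t decimal_tb_bytes(uint64_t n)
{
    assert(n <= UINT64_MAX / TB);    /* guard against overflow */
    return n * TB;
}

uint64_t binary_tib_bytes(uint64_t n)
{
    assert(n <= UINT64_MAX / TIB);   /* guard against overflow */
    return n * TIB;
}
```

With these definitions, 4 TiB is 4,398,046,511,104 bytes and 4 TB is 4,000,000,000,000 bytes, so the binary figure is roughly 10% larger.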
Classes of Computers
  Personal computers
    General purpose, variety of software
    Subject to cost/performance tradeoff
  Server computers
    Network based
    High capacity, performance, reliability
    Range from small servers to building sized
    Cloud computing: Google, Amazon, MS
  Supercomputers
    High-end scientific and engineering calculations
    Highest capability but represent a small fraction of the overall computer market
    [Figure: Oak Ridge National Laboratory]
Cloud Computing
Classes of Computers (cont'd)
  Embedded computers
    Hidden as components of systems
    Most prevalent type
    Stringent power/performance/cost constraints
    Designed to run one application
    Annual growth rate of 40% vs. 9% for desktops and servers
    SW is integrated with H/W and delivered as a single system
    Unique application requirements (performance, cost, and power)
    Low tolerance for failure
The Computer Revolution
  Progress in computer technology
    Underpinned by Moore's Law
  Makes novel applications feasible
    Smartphones
    Computers in automobiles
    AI, machine learning, big data
    Human genome project
  Computers are pervasive
Key Driving Factors of Machine Learning
  Algorithm
  Computing power
Neural Processing Unit
  Google TPU
Uniprocessor Performance
Contributor 1: Technology
  Processor
    Logic capacity: about 30% per year
    Clock rate: about 20% per year
  Memory
    DRAM capacity: about 60% per year (4x every 3 years)
    Memory speed: about 10% per year
    Cost per bit: improves about 25% per year
  Disk
    Capacity: about 60% per year
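The "4x every 3 years" figure follows from compounding 60% annual growth. A minimal sketch of that arithmetic, with `compound_growth` as a hypothetical helper name:

```c
#include <assert.h>

/* Hypothetical helper: total growth factor after `years` years at a fixed
 * annual improvement rate (0.60 means 60% per year). */
double compound_growth(double annual_rate, int years)
{
    assert(years >= 0);
    double factor = 1.0;
    for (int i = 0; i < years; i++) {
        factor *= 1.0 + annual_rate;  /* one more year of compounding */
    }
    return factor;
}
```

At 60% per year, three years give 1.6^3 = 4.096, which matches the slide's 4x. At 10% per year, memory speed improves by only about 1.33x over the same period, which illustrates the widening gap between capacity and speed.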
Technology Improvement
  Moore's law: the prediction that the number of transistors per integrated circuit would double about every 18 months
Contributor 2: Computer Architecture
  Exploiting parallelism
    (Single processor) Pipelining, superscalar, VLIW
    Multiprocessor
    Media instructions (SIMD)
  Cache memory
Contributor 2: Computer Architecture (cont'd)
  [Figure labels: Technology, Contribution]
Superscalar Processors
  Multiple functional units
  Examples: Alpha, Pentium
Multicore Processor
  Intel Core i7-970 architecture
Eight Great Ideas
  Design for Moore's Law
  Use abstraction to simplify design
  Make the common case fast
  Performance via parallelism
  Performance via pipelining
  Performance via prediction
  Hierarchy of memories
  Dependability via redundancy
Below Your Program
  Application software
    Written in high-level language
  System software
    Operating system: I/O operations, memory and storage allocation, scheduling tasks and sharing resources
    Compiler: translates a high-level language program into the hardware instructions
  Hardware
    Processor, memory, I/O controllers
  [Figure: layers from the user through application software (programs the user writes and runs) and systems software (operating system, compiler, assembler) down to hardware]
Levels of Program Code
  High-level language
    Level of abstraction closer to problem domain
    Provides for productivity and portability
  Assembly language
    Textual representation of instructions
  Hardware representation
    Binary digits (bits)
    Encoded instructions and data
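To make the step from assembly language to the hardware representation concrete, the sketch below packs a MIPS R-type instruction into its 32-bit encoding (opcode, rs, rt, rd, shamt, funct fields). The function name `encode_rtype` is a hypothetical helper; the field layout and register numbers follow the standard MIPS conventions, where $t0 is register 8, $s1 is 17, $s2 is 18, and `add` uses opcode 0 with funct 0x20.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: encode a MIPS R-type instruction (opcode = 0).
 * Bit layout: opcode[31:26] rs[25:21] rt[20:16] rd[15:11] shamt[10:6] funct[5:0] */
uint32_t encode_rtype(uint32_t rs, uint32_t rt, uint32_t rd,
                      uint32_t shamt, uint32_t funct)
{
    assert(rs < 32 && rt < 32 && rd < 32 && shamt < 32 && funct < 64);
    return (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct;
}
```

For the assembly instruction `add $t0, $s1, $s2`, calling `encode_rtype(17, 18, 8, 0, 0x20)` yields 0x02324020, the bit pattern the processor actually fetches.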
Computer System Organization
  Central Processing Unit (CPU) or processor: control and datapath
  Memory, input, output
  Datapath performs the arithmetic operations
  Control tells the datapath, memory, and I/O devices what to do
Input Device Inputs Object Code
Programs (as Machine Code) Are Stored in Memory
How Does a Machine Code Program Execute?
  Sequential execution
Processor Fetches an Instruction
  Processor fetches an instruction from memory
Instruction Decode
  Control decodes the instruction to determine what to execute
Instruction Execution
  Datapath executes the instruction as directed by control
Completion
  At program completion the data to be output resides in memory
Output Device Outputs Data
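The fetch, decode, execute sequence described on the preceding slides can be sketched as a small interpreter loop. Everything here is an assumption made for illustration: the three-operation toy instruction set (`OP_HALT`, `OP_LOADI`, `OP_ADD`), the `Instr` layout, and the `run` function are invented and are not MIPS or any real ISA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy instruction set, invented for this sketch. */
enum { OP_HALT = 0, OP_LOADI = 1, OP_ADD = 2 };

typedef struct {
    uint8_t op;  /* operation code */
    uint8_t a;   /* destination register */
    uint8_t b;   /* source register, or the immediate value for OP_LOADI */
    uint8_t c;   /* second source register */
} Instr;

#define NUM_REGS 4

/* Executes instructions sequentially from mem[0] until OP_HALT or the end of
 * memory. Returns the number of instructions fetched, including the halt. */
size_t run(const Instr *mem, size_t len, int32_t regs[NUM_REGS])
{
    assert(mem != NULL && regs != NULL);
    size_t pc = 0;        /* program counter */
    size_t executed = 0;
    while (pc < len) {
        Instr ir = mem[pc++];      /* fetch, then advance to the next instruction */
        executed++;
        switch (ir.op) {           /* decode: control selects what to execute */
        case OP_LOADI:             /* execute: write an immediate to a register */
            regs[ir.a % NUM_REGS] = ir.b;
            break;
        case OP_ADD:               /* execute: datapath performs the addition */
            regs[ir.a % NUM_REGS] =
                regs[ir.b % NUM_REGS] + regs[ir.c % NUM_REGS];
            break;
        case OP_HALT:
        default:                   /* unknown opcodes also stop the machine */
            return executed;
        }
    }
    return executed;
}
```

A four-instruction program that loads 2 and 3 into registers 0 and 1, adds them into register 2, and halts leaves the value 5 in register 2. In this sketch the register array stands in for the result that an output device would later read.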
The Hardware/Software Interface
  Why is the textbook's subtitle "The Hardware/Software Interface"?
    Hardware needs software to operate.
    The instruction set architecture includes everything programmers need to know to make a binary program work (instructions, arithmetic and logic unit (ALU), registers available, register size, etc.).
    The interface is important because it influences the performance of the computer. It also allows a given instruction set to work on different machines.
    A given instruction set architecture may have different implementations in hardware.
  Computer Architecture vs. Computer Organization
Instruction Set Architecture (ISA)
Instruction Set Architecture (ISA) (cont'd)
  The instruction set, a set of assembly language instructions, provides a link between software and hardware.
  Given an instruction set, software programmers and hardware engineers can work more or less independently (abstraction).
  The ISA is designed to extract the most performance out of the available hardware technology.
  [Figure: the instruction set sits between software and hardware]
Sales of Microprocessors
  [Figure: sales by instruction set architecture]
Computer Architecture
  Architecture: system attributes that have a direct impact on the logical execution of a program
  Architecture is visible to a programmer:
    Instruction set
    Data representation
    I/O mechanisms
    Memory addressing
  Types of ISA: RISC, CISC, VLIW, Superscalar, CRISC
  Examples:
    IBM370/x86/Pentium/K6 (CISC)
    PowerPC (Superscalar)
    Alpha (Superscalar)
    MIPS (RISC and Superscalar)
    Sparc (RISC), UltraSparc (Superscalar)
    ARM (RISC), Cortex (CRISC)
    Intel Core (CRISC)
  The CISC versus RISC distinction is no longer of much interest in today's computer technology. Computers today are truly hybrid systems: RISC architects have adopted a larger set of instructions, and CISC architects have realized the benefits of implementing a core set of instructions that can execute in a single CPU cycle.
Computer Organization
  Organization: physical details that are transparent to a programmer, such as
    Hardware implementation of an instruction
    Control signals
    Memory technology used
  Example: the System/370 architecture has been used in many IBM computers, which differ widely in their organization.
ARM Architecture Versions