HW Trends and Architectures
prof. Ing. Pavel Tvrdík, CSc., Ing. Jiří Kašpar
Department of Computer Systems, Faculty of Information Technology, Czech Technical University in Prague (ČVUT FIT)
Pavel Tvrdík, Jiří Kašpar, 2011
Advanced Computer System Architectures, MI-POA, 02/2011, Lecture 1
https://edux.fit.cvut.cz/courses/mi-poa
European Social Fund. Prague & EU: We invest in your future.
What is Computer Architecture?
Computer Architecture
In its modern sense, the term computer architecture covers three aspects of computer design: instruction set architecture, computer organization, and computer hardware.
Instruction Set Architecture (ISA)
The ISA defines the machine interface: the instruction set, the register sets, the memory organization, and interrupt and exception handling.
The ISA determines the hardware functionality of a given computer.
Computer Organization and Hardware
Computer organization describes the high-level aspects of a design: the memory system, the bus structure, and the design of the CPU internals (i.e., how arithmetic, logic, branching, and data transfers are implemented).
Computer hardware refers to the specifics of a machine, including the detailed logic design and the packaging technology.
Computer Organization and Hardware
However, the architecture of computer systems, like other engineering disciplines, depends heavily on a compromise between:
- the cost of contemporary technology,
- the limits of contemporary technology,
- time to market.
So how does it depend on the evolution of technology?
Technology Evolution
[Figure: Computer Performance per $1000]
Technology Improvements
Vacuum tube → transistor → integrated circuit → VLSI
Processor
- Transistor count: increases about 30% to 40% per year
Memory
- DRAM capacity: increases about 60% per year (4x every 3 years)
- Cost per bit: decreases about 25% per year
Disk
- Capacity: increases about 60% per year
New technological principles (e.g., flash memory)
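The "4x every 3 years" figure is what a 60% annual rate gives once it is compounded. A minimal sketch of that arithmetic follows; the helper name `compound_growth` is illustrative and not part of the lecture.

```python
def compound_growth(rate_per_year: float, years: float) -> float:
    """Return the total growth factor after compounding an annual rate."""
    return (1.0 + rate_per_year) ** years


# Rates quoted on the slide.
dram_capacity_3y = compound_growth(0.60, 3)   # capacity multiplier after 3 years
cost_per_bit_3y = compound_growth(-0.25, 3)   # fraction of the original cost left

print(f"DRAM capacity after 3 years: {dram_capacity_3y:.2f}x")
print(f"Cost per bit after 3 years: {cost_per_bit_3y:.2f} of the original")
```

The same helper can be applied to the 30% to 40% transistor-count rates to compare them over a chosen horizon.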
Moore's Law
In 1965, Gordon Moore predicted that the number of transistors that can be integrated on a die would double every 18 to 24 months (i.e., grow exponentially with time).
Amazingly visionary: the million-transistors-per-chip barrier was crossed in the 1980s.
- 2,300 transistors, 1 MHz clock (Intel 4004), 1971
- 16 million transistors (UltraSPARC III), 1997
- 140 million transistors (HP PA-8500), 1998
- 42 million transistors, 2 GHz clock (Intel Xeon), 2001
- 55 million transistors, 3 GHz, 130 nm technology, 250 mm² die (Intel Pentium 4), 2004
- 2 billion transistors (Intel Itanium Tukwila), 2010
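As a rough consistency check, the first and last data points on the slide can be turned into an average doubling period. The sketch below assumes pure exponential growth between those two points; the function name is illustrative.

```python
import math


def doubling_time_years(n_start, year_start, n_end, year_end):
    """Average doubling period implied by two (transistor count, year) points."""
    doublings = math.log2(n_end / n_start)
    return (year_end - year_start) / doublings


# Intel 4004 (1971) and Itanium Tukwila (2010), as listed on the slide.
period_years = doubling_time_years(2_300, 1971, 2_000_000_000, 2010)
print(f"Implied doubling period: {period_years * 12:.1f} months")
```

With these two points the result lands a little under 24 months, at the upper end of the 18-to-24-month range quoted above.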
[Figure: Moore's Law. Intel Processors: Transistors per Chip]
Intel x86 Processors
[Figure: Cost of Circuit and its Speed]
The CPU Power Wall
Processor-Memory Performance Gap
[Figure: performance (log scale, 1 to 1000) versus year, 1980 to 2000. CPU ("Moore's Law"): 55% per year. DRAM: 7% per year. Processor-memory performance gap: grows 50% per year.]
- 1980: no caches in microprocessors
- 1995: two-level cache on a microprocessor
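The gap growth can be estimated directly from the two annual rates in the figure. This is a back-of-the-envelope sketch that assumes both rates stay constant over the whole period; the variable names are illustrative.

```python
cpu_rate = 0.55    # CPU performance growth per year (from the figure)
dram_rate = 0.07   # DRAM performance growth per year (from the figure)

# Ratio of CPU to DRAM performance grows by this factor every year.
gap_per_year = (1 + cpu_rate) / (1 + dram_rate)

# Accumulated gap over the 20 years covered by the figure (1980 to 2000).
gap_1980_2000 = gap_per_year ** 20

print(f"Gap grows by {100 * (gap_per_year - 1):.0f}% per year")
print(f"Accumulated gap over 20 years: about {gap_1980_2000:.0f}x")
```

The computed annual growth is close to the roughly 50% per year quoted in the figure, and compounding it over two decades gives a gap of about three orders of magnitude.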
Typical Single Core Memory Hierarchy
- Registers are at the top of the hierarchy: typical size < 1 KB, access time < 0.5 ns
- Level 1 cache (8 to 64 KB): access time 0.5 to 1 ns
- L2 cache (512 KB to 8 MB): access time 2 to 10 ns
- Main memory (1 to 2 GB): access time 50 to 70 ns
- Disk storage (> 200 GB): access time in milliseconds
[Figure: memory hierarchy, faster toward the top and bigger toward the bottom: Microprocessor (Registers, L1 Cache, L2 Cache), Memory Bus, Memory, I/O Bus, Disk, Tape, etc.]
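One common way to see why such a hierarchy pays off is the textbook average memory access time (AMAT) model, which is not on the slide itself. The sketch below uses latencies from the upper ends of the ranges above; the hit rates are assumed values chosen only for illustration.

```python
def amat(levels, memory_ns):
    """Average memory access time for a chain of caches.

    levels: list of (access_time_ns, hit_rate), ordered from L1 outward.
    memory_ns: main-memory access time, paid when every cache misses.
    """
    time_ns = memory_ns
    # Fold from the last cache level back to L1:
    # AMAT_i = access_i + miss_rate_i * AMAT_(i+1)
    for access_ns, hit_rate in reversed(levels):
        time_ns = access_ns + (1.0 - hit_rate) * time_ns
    return time_ns


# Latencies from the slide's ranges; the hit rates (95%, 90%) are ASSUMED.
average_ns = amat([(1.0, 0.95), (10.0, 0.90)], memory_ns=70.0)
print(f"Average access time: {average_ns:.2f} ns")
```

Under these assumptions the average access time stays within a few nanoseconds even though main memory is 70 ns away, which is the effect the hierarchy is designed to achieve.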
Common Features of Contemporary Server CPUs
- High-speed CPU-to-CPU interconnect with mesh/torus topology.
- Packet routing.
- Integrated memory controller.
- Multiple cores.
- Multiple execution units per core.
- Execution of multiple threads in a core.
- Global address space.
- Memory cache hierarchy: one or two cache levels in a core; the next cache level (private or shared) on the chip.
Overview of contemporary CPUs
Similar features and functionality appear in the designs of all major architectures (in alphabetical order):
- AMD x64
- IBM Power
- Intel Itanium
- Intel x64
- Sun Niagara
IBM Processor Technology Roadmap
- POWER4 (2001, 180 nm): Dual Core, Chip Multi Processing, Distributed Switch, Shared L2, Dynamic LPARs (32)
- POWER5 (2004, 130 nm): Dual Core, Enhanced Scaling, SMT, Distributed Switch +, Core Parallelism +, FP Performance +, Memory bandwidth +, Virtualization
- POWER6 (2007, 65 nm): Dual Core, High Frequencies, Virtualization +, Memory Subsystem +, AltiVec, Instruction Retry, Dyn Energy Mgmt, SMT +, Protection Keys
- POWER7 (2010, 45 nm): Multi Core, On-Chip eDRAM, Power Optimized Cores, Mem Subsystem ++, SMT ++, Reliability +, VSM & VSX (AltiVec), Protection Keys +
- POWER8: Concept Phase
Processor Designs

              POWER5          POWER5+         POWER6        POWER7
Technology    130 nm          90 nm           65 nm         45 nm
Size          389 mm²         245 mm²         341 mm²       567 mm²
Transistors   276 M           276 M           790 M         1.2 B
Cores         2               2               2             4 / 6 / 8
Frequencies   1.65 GHz        1.9 GHz         3-5 GHz       3-4 GHz
L2 Cache      1.9 MB Shared   1.9 MB Shared   4 MB / Core   256 KB / Core
L3 Cache      36 MB           36 MB           32 MB         4 MB / Core
Memory Cntrl  1               1               2 / 1         2
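The transistor counts and die sizes on this slide can be combined into a rough transistor density per generation. This is only an illustrative calculation on the slide's own numbers; the dictionary layout and names are assumptions.

```python
# (transistors in millions, die area in mm^2), copied from the slide.
chips = {
    "POWER5": (276, 389),
    "POWER5+": (276, 245),
    "POWER6": (790, 341),
    "POWER7": (1200, 567),
}

# Millions of transistors per square millimetre for each design.
density = {name: count / area for name, (count, area) in chips.items()}

for name, value in density.items():
    print(f"{name}: {value:.2f} M transistors/mm^2")
```

Density here mixes logic and cache, so it is a coarse indicator and should not be read as a direct measure of the process node alone.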
IBM POWER7 Processor Chip
- Cores: 8 (4 / 6 core options)
- Die size: 567 mm²
- Technology: 45 nm lithography, Cu, SOI, eDRAM
- Transistors: 1.2 B, with the equivalent function of 2.7 B thanks to eDRAM efficiency
- Eight processor cores
  - 12 execution units per core
  - 4-way SMT per core, up to 4 threads per core
  - 32 threads per chip
  - L1: 32 KB I Cache / 32 KB D Cache
  - L2: 256 KB per core
  - L3: shared 32 MB on-chip eDRAM
- Dual DDR3 memory controllers
  - 90 GB/s memory bandwidth per chip
- Scalability up to 32 sockets
  - 360 GB/s SMP bandwidth per chip
  - 20,000 coherent operations in flight
[Figure: die layout with eight POWER7 cores, each with its L2 cache, around the L3 cache and chip interconnect (fast L3 regions); memory controllers MC0 and MC1; local SMP links; remote SMP & I/O links]
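The per-chip and per-system totals follow from multiplying out the figures on this slide. A short sketch of that arithmetic, with illustrative variable names:

```python
cores_per_chip = 8       # from the slide
threads_per_core = 4     # 4-way SMT, from the slide
max_sockets = 32         # maximum scalability, from the slide
l3_total_mb = 32         # shared on-chip eDRAM L3, from the slide

threads_per_chip = cores_per_chip * threads_per_core
threads_per_system = threads_per_chip * max_sockets
l3_per_core_mb = l3_total_mb / cores_per_chip

print(f"Threads per chip: {threads_per_chip}")
print(f"Threads in a {max_sockets}-socket system: {threads_per_system}")
print(f"L3 per core: {l3_per_core_mb:.0f} MB")
```

The per-chip thread count agrees with the "32 threads per chip" bullet, and the L3 share per core agrees with the 4 MB per core listed for POWER7 elsewhere in the lecture.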
Intel Itanium Processor 9300 Series (Tukwila)
- Up to 4 cores / 8 threads
- 6 MB private L3 cache per core
- Four Intel QPI links
- Up to 8 sockets glueless, max. 64 with node controllers
- Two DDR3 memory controllers
- Intel Hyper-Threading Technology
- First 2-billion-transistor processor
[Figure: die layout with four cores, caches, and the system interface]
Intel Xeon 7500 Series
- 8 cores
- 2 threads per core
- 24 MB shared L3 cache
- Four Intel QPI links
- Up to 8 sockets glueless
- 4-channel DDR3 memory controller
- Intel Hyper-Threading Technology
AMD Opteron 6000 Series
- 12 cores
- 1 thread per core (no simultaneous multithreading)
- 12 MB shared L3 cache
- 4 HyperTransport links (up to 4-socket system)
- 4-channel DDR3 memory
Sun Niagara 2 Server on a Chip (SoC)
- 8 cores
- 8 threads per core
- Single-socket system
- 4 MB shared L2 cache
- 4 dual-channel DDR3 controllers
- Two 1G/10G Ethernet ports
- One 2.5 Gb/s PCIe port
- 503 million transistors
Sun Niagara 3 (not released)
- 16 cores
- 8 threads per core
- 6 MB shared L2 cache
- 6 coherence links (up to 4-socket system)
- Two DDR3 controllers, four channels of DDR3
- Two 1G/10G Ethernet ports
- Two 5 Gb/s PCIe ports
- 1 billion transistors
Perspective: 100+ Cores?
- Multicore: 2X / 2 years, i.e. 64 cores in 8 years
- Manycore: 8X multicore
[Figure: cores per chip (log scale, 1 to 1000) versus year, 2003 to 2015. Multicore series: 1, 2, 4, 8, 16, 32, 64. Manycore series: 64, 128, 256, 512. Annotations: "80x86 uniprocessors no longer sold"; "16-way MP laptops for sale in future".]
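The "2X every 2 years" rule can be written as a small projection. The sketch below assumes a 4-core starting point, which is what makes "64 cores in 8 years" work out; the function name and the starting value are illustrative assumptions.

```python
def cores_after(start_cores: int, years: int, doubling_period_years: int = 2) -> int:
    """Project the core count when it doubles every `doubling_period_years`."""
    return start_cores * 2 ** (years // doubling_period_years)


multicore = cores_after(start_cores=4, years=8)   # ASSUMED 4-core starting point
manycore = 8 * multicore                          # "Manycore: 8X multicore"

print(f"Multicore after 8 years: {multicore} cores")
print(f"Manycore at the same point: {manycore} cores")
```

Both results match the last points of the two series in the figure (64 and 512).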
A New Era of Parallel Computing is Coming
[Figure: performance in TF (log scale: 0.001 TF, 1 TF, 1,000 TF, 1,000,000 TF, 1,000,000,000 TF) versus year, 1970 to 2030. Successive eras: Vector Era, Massively Parallel Era, Multi-core Era ("A new paradigm in computing").]
Sources
- John Hennessy: Technology Trends: The Datacenter is the Computer, The Cellphone/Laptop is the Computer, October 2007. http://www.hpts.ws/papers/2007/techtrendshptspatterson2007.ppt
- David Patterson and John Hennessy: Computer Organization & Design: The Hardware/Software Interface.
- Muhamed Mudawar: Introduction to Computer Architecture. http://opencourseware.kfupm.edu.sa/colleges/ccse/ics/ics233/files/2_unit1.ppt
- Mary Jane Irwin: CSE 431 Computer Architecture, Lecture 01. http://www.cse.psu.edu/~cg431
- Sudip S. Dosanjh: HPC User Forum, Norfolk, 2008. http://www.hpcuserforum.com/presentations/norfolk/sandia%2520iaa.hpcuser.ppt
- Ben Schrooten, Shawn Borchardt, Eddie Willett, Vandana Chopra: Computer Architecture & Related Topics. http://www.mgnet.org/~douglas/classes/cs521-s02/arch/arch.ppt
- Edward L. Bosworth: The Power Wall, January 2010. http://edwardbosworth.com/my5155_slides/chapter01/thepowerwall.doc
- Ray Kurzweil: The Web Within Us: When Minds and Machines Become One, 2009. http://www.slideshare.net/serge111/singularity-presentation-ray-kurzweil-at-google