What Transitioning from 32-bit to 64-bit x86 Computing Means Today


Chris Wanner, Senior Architect, Industry Standard Servers, Hewlett-Packard
© 2004 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

Agenda
- What and why of 64-bit computing
- Intel EM64T vs. AMD64
- x86 64-bit extensions vs. Itanium 2
- Transition to 64-bit computing

64-bit processors
x86-64 extensions bring 64-bit computing to the volume/mainstream industry-standard market.
[Timeline figure, 1990 to 2000, showing 64-bit processor families:
- x86 64-bit extensions: Opteron, Xeon
- IPF: Merced, McKinley, Madison
- POWER: Power 3, Power 4, Power 5
- PA-RISC: 8000, 8500, 8700, 8800
- SPARC: UltraSPARC, UltraSPARC II, UltraSPARC III, UltraSPARC IV
- Alpha: EV4, EV5, EV6, EV7
- MIPS: R4K, R8K, R10K, R12K, R14K, R16K]

What and why of 64-bit computing?
It's about:
- Data handling
- Memory addressability

Data handling
- Registers
- Datapaths
- Arithmetic units
What size chunks can we use to move and manipulate data?
What is the benefit of being able to use larger chunks of data?
- Higher performance
- Greater accuracy
- 64-bit arithmetic vs. 32-bit
- 64-bit logical operations vs. 32-bit
- 64-bit floating-point operations vs. 32-bit
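To make the "larger chunks" point concrete, here is a minimal Python sketch of what a 32-bit machine has to do to add two 64-bit integers: split each value into 32-bit halves, add the low halves, and propagate the carry into the high halves. A 64-bit machine does the same work in one add. The function names are illustrative only and do not come from the presentation.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def add64_with_32bit_ops(a: int, b: int) -> int:
    """Add two unsigned 64-bit values using only 32-bit-wide additions."""
    # Split each operand into a low and a high 32-bit half.
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    # First add: low halves. Anything above bit 31 is the carry.
    lo_sum = a_lo + b_lo
    carry = lo_sum >> 32
    lo = lo_sum & MASK32

    # Second add: high halves plus the carry (wraps at 64 bits overall).
    hi = (a_hi + b_hi + carry) & MASK32
    return (hi << 32) | lo


def add64_native(a: int, b: int) -> int:
    """The same addition as a single 64-bit-wide operation."""
    return (a + b) & MASK64
```

Both functions return the same result; the difference is that the first needs two dependent additions plus carry handling, which is the kind of overhead wider registers remove.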

Data handling: register size
[Chart: register width (4, 8, 16, 32, 64 bits) over time, 1970 to 2000, with the intervals between steps labeled 3 yrs, 4 yrs, 7 yrs, and 18+ yrs]
- 32-bit computing fueled the growth of the Industry Standard Server market
- 64-bit computing will continue to feed the need for higher levels of performance

Data handling: register size
But this is tempered by the reality that 32-bit processors in Industry Standard Servers can already move and compute data in chunks larger than 32 bits:
- Cache line size is 512 bits
- 64-bit front-side bus
- 64-bit, 128-bit, and even 256-bit internal datapaths
- 80-bit FPUs, 64-bit MMX, and 128-bit XMM SIMD floating-point and integer operations (SSE2)
There would be little need for a true 64-bit processor if data size were the only reason.

Data handling: register quantity
It's not just about the width of registers; it's also about the quantity of registers:
- 64-bit processors typically have more registers than 32-bit processors
- More registers can equal more performance
- Registers are faster than cache or memory
- More registers = more data can be held close to the CPU core and used without incurring CPU idle cycles
- Example: IPF has 128 general-purpose registers (GPRs) vs. 8 GPRs for IA-32

Data handling: register quantity
But even though the basic IA-32 ISA specifies only 8 GPRs, additional special-purpose registers are available with the x87, MMX, and SSE extensions.
So there must be more to it still.

What and why of 64-bit computing?
It's about:
- Data handling
- Memory addressability

Memory addressability
How much memory a CPU can access depends on the bit-ness of the CPU:
Address range = 2^(bit-ness)
Thus:
- 2^16 = 64 KB
- 2^32 = 4 GB (32-bit processors, 32-bit address range)
- 2^64 = 16 exabytes (64-bit processors, 64-bit address range)
The 64-bit address range is roughly 4,000,000,000 times the 32-bit address range.
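The arithmetic on this slide can be checked with a short Python sketch. It assumes a flat, byte-addressable address space and binary units (1 KiB = 1024 bytes); the helper names are hypothetical and chosen only for this illustration.

```python
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def address_range_bytes(bits: int) -> int:
    """Number of distinct byte addresses reachable with `bits` address bits."""
    return 2 ** bits


def human_readable(n: int) -> str:
    """Format a byte count in binary units, using whole-number division."""
    idx = 0
    while n >= 1024 and idx < len(UNITS) - 1:
        n //= 1024
        idx += 1
    return f"{n} {UNITS[idx]}"


if __name__ == "__main__":
    for bits in (16, 32, 64):
        print(bits, human_readable(address_range_bytes(bits)))
    # Ratio between the 64-bit and 32-bit ranges is itself 2**32.
    print(address_range_bytes(64) // address_range_bytes(32))
```

The ratio between the two ranges is 2^32 = 4,294,967,296, which is where the "4,000,000,000 times" figure comes from.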

How important is a larger address space?
"No one will need more than 64K of memory" (urban-legend quote attributed to Bill Gates)

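Addressability over time
[Chart: addressable memory over time, 1970 to 2000, rising through 1K, 64K, 1 MB, 1 GB, 4 GB, and 1 TB, with the intervals between steps labeled 3 yrs, 4 yrs, 7 yrs, and 18+ yrs]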

Who needs more than 4 GB of memory?
A: A growing number of applications require more than 4 GB of memory.

Memory addressability
Consider:
- Currently the 4 GB address space is shared between the OS kernel, library routines, and applications
  - Applications get only 2 GB to 3 GB of space
- Server consolidation solutions where a number of applications share the available memory space
  - Consolidation solutions are becoming more prevalent across the industry
  - Greater CPU power
  - Need to reduce TCO
  - Virtual address space may be even more important than physical
- Database applications that can store more data in memory rather than on disk decrease database delays by orders of magnitude
- Email applications where each supported user requires memory resources
  - More memory = more supported users

Memory addressability
These and many other solutions can benefit from a larger address space, and thus:
- More memory = more performance
- More memory = more capabilities
- More memory = more reliability and availability
These are not new concepts in computing, but x86 64-bit extensions move new capabilities into the volume industry-standard computing space.

What took so long?

Memory capacity and pricing trends
- >4 GB capacities in a typical Industry Standard Server have not been practical during the past 10 years
- The expense of >4 GB has not been economical until recently
[Charts, 1994 to 2002: memory capacity (scale 0 to 18) moving from "not practical" to "practical", and cost (scale $100 to $100,000) moving from "not economical" to "economical"]

Memory barriers removed
It is now both practical and economical to have large memory capacities in volume servers, which makes 64-bit computing à la x86 64-bit extensions viable and important.

x86 64-bit extensions: Questions?

64-bit extensions architectures: what?
- Intel: EM64T (Extended Memory 64-bit Technology)
- AMD: AMD64 (AMD's x86 64-bit technology)
- Microsoft: x64 (Microsoft's term for x86 64-bit extensions)

64-bit extensions: registers & instructions

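x86 to x86 64-bit extensions: registers
[Register diagram:
- GPR: EAX, EBX, ECX, EDX, ESP, EBP, ESI, EDI (bits 31-0), each with a 16-bit sub-register (AX, BX, CX, DX, SP, BP, SI, DI; bits 15-0) and, for AX, the 8-bit halves AH and AL; extended to 64 bits (RAX, bits 63-0) and joined by the new registers R8 to R15
- SSE & SSE2: XMM0 to XMM7 (bits 127-0), joined by the new registers XMM8 to XMM15
- x87/MMX: MMX0/FPR0 to MMX7/FPR7 (bits 79-0)
- Program counter: IP (bits 15-0) and EIP (bits 31-0), extended to 64 bits (bits 63-0)]
The 64-bit extensions are the latest in a series of changes to the x86 architecture that have been occurring over the last 20+ years.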
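The nesting of RAX, EAX, AX, AH, and AL in the register diagram can be illustrated with plain bit masks. The following Python sketch only models how the narrower names alias the low bits of the 64-bit register; the function name `register_views` is an assumption for this example, not anything from the slides.

```python
def register_views(rax: int) -> dict:
    """Return the sub-register views that alias a 64-bit RAX value."""
    rax &= 0xFFFFFFFFFFFFFFFF          # keep to 64 bits
    return {
        "rax": rax,                    # bits 63-0
        "eax": rax & 0xFFFFFFFF,       # bits 31-0
        "ax": rax & 0xFFFF,            # bits 15-0
        "ah": (rax >> 8) & 0xFF,       # bits 15-8
        "al": rax & 0xFF,              # bits 7-0
    }


if __name__ == "__main__":
    for name, value in register_views(0x1122334455667788).items():
        print(f"{name}: {value:#x}")
```

Because the narrower names are views onto the same storage, existing 16-bit and 32-bit code keeps working against the low part of each widened register.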

x86 extensions: 10 new instructions

Instruction   AMD                      Intel              Notes
CDQE          Supported                Supported          New mnemonic for existing opcode
CMPSQ         Supported                Supported          New mnemonic for existing opcode
LODSQ         Supported                Supported          New mnemonic for existing opcode
MOVSQ         Supported                Supported          New mnemonic for existing opcode
STOSQ         Supported                Supported          New mnemonic for existing opcode
MOVZX         Supported                Supported          64-bit version of existing instruction
SYSCALL       Supported in all modes   64-bit mode only   New for Intel in 64-bit mode only
SYSRET        Supported in all modes   64-bit mode only   New for Intel in 64-bit mode only
CMPXCHG16B    Not supported            Supported          8-byte-only version in AMD64
SWAPGS        Supported                Supported          New

Minor differences between the implementations of the 64-bit extensions are expected to be handled by compilers and OSs, transparently to the end user.
Different platforms, but a single binary.
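As an illustration of what one of these instructions does, the sketch below models the architectural effect of CDQE, which sign-extends the 32-bit value in EAX into the 64-bit RAX, and contrasts it with zero extension. This is a Python model of the result only, not of how a processor implements it, and the function names are hypothetical.

```python
def cdqe(eax: int) -> int:
    """Model CDQE: sign-extend a 32-bit EAX value to a 64-bit RAX value."""
    eax &= 0xFFFFFFFF
    if eax & 0x80000000:               # sign bit set: fill upper half with ones
        return eax | 0xFFFFFFFF00000000
    return eax                         # sign bit clear: upper half is zero


def zero_extend32(eax: int) -> int:
    """Zero-extend a 32-bit value to 64 bits (upper half always zero)."""
    return eax & 0xFFFFFFFF
```

For a value such as 0xFFFFFFFF (the 32-bit encoding of -1), sign extension yields 0xFFFFFFFFFFFFFFFF, while zero extension yields 0x00000000FFFFFFFF. Which one is correct depends on whether the program treats the value as signed.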

32-bit and 64-bit modes

Layer                      Legacy mode   Long mode: compatibility   Long mode: native 64-bit
User: application          32-bit        32-bit (via thunking*)     64-bit
Kernel: operating system   32-bit        64-bit                     64-bit
Kernel: drivers            32-bit        64-bit                     64-bit

* Windows: thunking/DLL; Linux: system call emulation

- Legacy mode: existing SW infrastructure
- Compatibility mode: allows users to move to 64-bit without giving up 32-bit compatibility or performance
- Native 64-bit mode: full 64-bit environment

Ecosystem support for x86 64-bit extensions: OS & applications

OS and applications
- Transition from x86 16-bit to 32-bit: more than 8 years from the 80386 release to Windows NT 3.1 and Windows 95
- Transition from x86 32-bit to 64-bit: less than 1 year from Opteron/AMD64 to SuSE SLES8 and Red Hat EL3; 2 years to a Microsoft x86 64-bit OS
OS support is arriving significantly faster than in the last major transition.

OS support
[Matrix of the OS products below against 32-bit x86, 64-bit IPF, and x86-64, with cells marked "available now" or "expected release 1H05"]
Linux products:
- Red Hat Enterprise Linux 3
- SuSE Linux Enterprise Server 9
Microsoft products:
- Windows XP 64-bit Edition
- Windows Server 2003 Web Edition
- Windows Server 2003 Standard Edition
- Windows Server 2003 Enterprise Edition
- Windows Server 2003 Datacenter Edition

Application support
[Chart, Q1'03 to Q3'04, scale 0 to 350; legend: AMD64, EM64T, Linux OSs, shipped, in development]
- Development tools, e.g. GNU and C++ compilers, debuggers, profilers, libraries
- Database engines, e.g. SQL, Oracle 8i/9i, MySQL
- Infrastructure applications, e.g. VMware, Zeus web server, .NET environment
- Vertical applications, e.g. Synopsys, Cadence, Fluent, Matlab

x86 64-bit extensions vs. Itanium 2
- Architecture: significant differences
- Instruction set: significant differences
- Positioning: significant differences

Xeon/Opteron compared to Itanium 2

Attribute              Xeon / Opteron                         Itanium 2 processor
Memory addressing      1 TB                                   1024 TB
System bus bandwidth   6.4 GB/s, 20 GB/s                      6.4 GB/s
On-die cache           1 MB, 4 MB                             6 MB
Pipeline stages        12, 31                                 8
Issue ports            6                                      11
On-die registers       40 registers                           264 application registers + 64 predicate registers*
Execution units        3 integer; Fmisc, Fmul, Fadd;          6 integer, 3 branch, 2 FP (FMAC),
                       1 for SIMD; 2 load or 2 store          1 SIMD, 2 load and 2 store
Core frequency         2.2 GHz, 3.2+ GHz                      1.5 GHz
Instructions / clock   3 instructions / cycle                 6 instructions / cycle

Positioning: x86 64-bit extensions vs. IPF
[Figure: workloads positioned across 1-4, 4-8, and 8-64+ processor systems, served by ProLiant, Integrity, and NonStop servers. Workloads shown: web; mail; infrastructure services, caching, proxy; messaging; directory, DNS, firewall, security; HPC; work-group BI; business intelligence / SCM planning; app tier; OLTP medium and large; ERP medium and large; business intelligence on very large data sets; HPC on large SMP with large memory]
For customers who need the highest levels of performance and scalability for the most demanding applications and enterprise environments, Itanium architecture and HP Integrity servers are the solutions of choice.

Positioning, continued
[Figure: breadth of applications versus scalability for 32-bit x86, x86-64, and 64-bit IPF]

Transitioning to 64 bits

32-bit to 64-bit transitioning
Lessons learned with Itanium:
- Some applications port extremely well
- Others are a huge burden, especially:
  - 16-bit code
  - assembly code
- Be judicious about what to port and what not to port
- Some applications benefit from 64-bit
- Others run slower in 64-bit mode
- 64-bit extensions give you the flexibility to port only those applications that make sense to port; the rest can stay 32-bit!

What applications should port to x86-64?
- Database:
  - Many database apps are memory bound within a 32-bit environment and benefit greatly from a larger physical address space
  - Possibly even run the entire database out of memory rather than from disk
- Email:
  - A larger address space allows the server to support a much larger number of users per server
  - Fewer servers / lower TCO
- Terminal Server:
  - Avoids kernel address space limitations when hosting multiple applications
  - Example: Microsoft Office hosting on Terminal Server in a 64-bit environment can support 50% more users vs. a 32-bit environment

What applications should port to x86-64?
- Business apps:
  - Apps that have high memory requirements
  - Apps that have high computational requirements
- Technical / scientific computing:
  - Need for a large virtual and physical address space
  - Complex computations
These requirements are also valid reasons for porting to 64-bit IPF; it's a matter of degree:
- Low/medium requirements = x86 64-bit extensions
- High requirements = Itanium 2 processor


Backup: Opteron ecosystem support