Moving from 32 to 64 bits while maintaining compatibility. Orlando Ricardo Nunes Rocha


Informatics Department, University of Minho
4710 Braga, Portugal
orocha@deb.uminho.pt

Abstract. EM64T is a recent technology adopted by Intel that allows its new processors (dual core, Xeon) to run 64-bit software. To compete with its rival AMD, Intel had to adapt its own x86 architecture, increasing the memory addressing capabilities of its 32-bit processors. The advantage of 64-bit computing is the ability to work with a larger range of integer values and to support more memory. To give the x86 architecture 64-bit capabilities, Intel added eight 64-bit general-purpose registers and eight additional 128-bit XMM registers for the streaming SIMD extensions. A new 64-bit instruction pointer called RIP was also included, together with other new features such as a fast interrupt prioritization mechanism, uniform byte register addressing and a new instruction-pointer-relative addressing mode. Arithmetic and logical operations are now directly supported for 64-bit integers, and pushes and pops on the stack are executed with eight-byte strides. This new technology allows both 32-bit and 64-bit applications to run simultaneously on a system with a 64-bit operating system. The problems related to 32/64-bit compatibility can be reduced or avoided, allowing software producers a gradual transition from 32-bit to 64-bit software.

Introduction

The concept of supporting 32-bit and 64-bit software on the same processor has emerged in the last few years. AMD was the first company to explore it. It was followed by Intel, which named its own technology EM64T (Extended Memory 64 Technology). The principal idea behind these new features is to allow applications to address larger amounts of memory and to support the coexistence of 32-bit and 64-bit applications on the same processor [7].
EM64T extends virtual and physical memory addressing beyond the 4 GB limit of current 32-bit processors [14]. Some modifications had to be made to the instruction set to support these new features. The system register that manages the new 64-bit extensions, and at the same time the old 32-bit instructions, had to be altered. New registers were added to support 64-bit integers and to increase CPU performance by reducing the number of times the CPU has to access memory to load instructions or to load and save data. Intel also had to widen the memory access bus to 64 bits, because a larger amount of data has to be transported.

A new generation of Intel processors includes the EM64T extensions, such as newer versions of the Pentium 4, Pentium D, Pentium Extreme Edition, Celeron D, Xeon, and Core 2 processors [4]. At present, the most widely used processor in servers and workstations is the Xeon, which offers customers reliable 64-bit support. It allows all existing 32-bit applications, for which 32-bit execution remains critical, to be installed and run with excellent performance. The 64-bit scientific, engineering, and design applications that need larger memory support can be used without code recompilation [9].

The SEARCH cluster of the Department of Informatics of the University of Minho has 96 recent Xeon processors, so it is possible to take advantage of all the EM64T characteristics mentioned in this paper, maintaining compatibility between 32-bit and 64-bit applications with low-cost 64-bit support.

64-bit computing

Three areas can be classified as 64-bit: data, addressing, and the software environment. 64-bit integer registers, 64-bit floating-point registers, and 64-bit data paths between processor, memory, cache memory and registers have to be present in the processor architecture [12].

A 64-bit architecture has two main advantages: a larger range of integer values and support for more memory. The wider integer representation allows scientific and simulation applications to perform fewer calculations to produce the same result as a 32-bit application. This matters most for applications that do a large number of integer calculations, because with 32-bit registers twice as many registers were needed to represent a 64-bit number, causing many more memory accesses. For applications that use floating-point math the speed increase is less relevant, because the floating-point registers keep the same length of 80 bits.

64-bit applications can address up to 16 exabytes of RAM, but nowadays most PCs have an artificial limit on the amount of memory they can recognize, due to physical constraints [3]. Table 1 shows how the number of memory addresses grows from 4-bit to 64-bit computing.
Table 1: Differences in scale between architectures; source [13]

Bits   Binary   Number of memory addresses
4      2^4      16
8      2^8      256
10     2^10     1,024
16     2^16     65,536
32     2^32     4,294,967,296
64     2^64     18,446,744,073,709,551,616

EM64T technology

EM64T is an extension of the Intel IA-32 architecture that provides for the coexistence of 32-bit and 64-bit computing in a single processor. This is the idea behind the EM64T concept. To take advantage of this technology, the motherboard chipset and the BIOS must support 64 bits, and a 64-bit operating system is required as well [14].
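The address counts in Table 1 are plain powers of two, so they can be checked with a few lines of code. The following is a minimal Python sketch, not taken from the cited sources; the function names address_count and human_readable are hypothetical and chosen only for this illustration.

```python
def address_count(bits: int) -> int:
    """Number of distinct addresses representable with `bits` address bits."""
    if bits < 0:
        raise ValueError("bits must be non-negative")
    return 1 << bits  # 2 ** bits


def human_readable(n_bytes: int) -> str:
    """Express a byte count in binary units (1 KiB = 1024 B), truncating."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    value = n_bytes
    for unit in units:
        # Stop at the first unit where the value is small enough,
        # or at the largest unit we know about.
        if value < 1024 or unit == units[-1]:
            return f"{value} {unit}"
        value //= 1024


if __name__ == "__main__":
    # The bit widths listed in Table 1, assuming one byte per address.
    for bits in (4, 8, 10, 16, 32, 64):
        count = address_count(bits)
        print(f"{bits:>2} bits: {count:,} addresses ({human_readable(count)})")
```

With byte-addressable memory, 32 address bits give the 4 GB limit mentioned in the introduction and 64 address bits give the 16 exabytes quoted for 64-bit applications.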

Enhancement of the Intel IA-32

Eight new 64-bit general-purpose registers were added, named R8 through R15. These new registers are true general-purpose registers, because they have no specialized task like the eight old 32-bit registers EAX, EBX, etc. The EAX, EBX, ECX and EDX registers were extended to 64 bits and take an R prefix in their 64-bit form (RAX, RBX, RCX, RDX). The remaining registers, such as the index registers RSI and RDI and the stack pointers RBP and RSP, were extended too [11].

All 64-bit registers keep the division scheme of the old 32-bit registers, which allows them to be used for 32-bit, 16-bit and 8-bit operations. As in the 32-bit scheme, the least significant bits of a register such as RAX are used for 8-bit and 16-bit operations; addressing the low byte of every general-purpose register in the same way is called uniform byte register addressing [5]. The old EIP was extended to a new 64-bit instruction pointer (RIP), so that it can hold an 8-byte address. Figure 1 shows the new structure of the registers; alterations are displayed in purple.

Figure 1: New structure of the general-purpose registers and XMM registers; source [1]

Eight new XMM registers for SIMD instructions (MMX, SSE, SSE2 and SSE3) were also added, for a total of 16 XMM registers, which remain 128 bits wide. They can store two 64-bit floating-point numbers in the same register, and they are used most of the time by multimedia instructions that perform many calculations with real numbers.

The control registers were extended so that the EM64T features can be enabled and disabled, by adding a model-specific register (MSR) that Intel calls the extended feature enable MSR, or IA32_EFER. This register contains the relevant control bits (bits 8 and 10). Figure 2 shows the function of each bit in the 64-bit register. The mechanism for enabling and disabling IA-32e mode is explained in the section on controlling IA-32e mode.

Figure 2: Extended feature enable MSR; source [8]
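The register division scheme described in this section can be illustrated by treating a 64-bit register as an integer and masking off its low bits. This is a minimal Python sketch of the idea only; the function name sub_registers and the example value are hypothetical, and the code models the read views of RAX rather than the behaviour of any real instruction.

```python
MASK64 = (1 << 64) - 1  # a 64-bit register holds values 0 .. 2**64 - 1


def sub_registers(rax: int) -> dict:
    """Return the 64-, 32-, 16- and 8-bit views of a value held in RAX."""
    rax &= MASK64  # keep only the 64 bits a register can hold
    return {
        "RAX": rax,               # full 64 bits
        "EAX": rax & 0xFFFFFFFF,  # low 32 bits
        "AX": rax & 0xFFFF,       # low 16 bits
        "AL": rax & 0xFF,         # low 8 bits
    }


if __name__ == "__main__":
    views = sub_registers(0x1122334455667788)
    for name, value in views.items():
        print(f"{name:>3} = {value:#x}")
```

The same masks apply to the other general-purpose registers; only the register names differ.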

In 64-bit mode the default address size is 64 bits and the default operand size is 32 bits. The default values are overridden only when necessary, for instance when an instruction needs a different operand size. The new instructions carry a new prefix called REX that manages the operand size and the address size. Not all instructions need the REX prefix; it is used only when an instruction uses a 64-bit operand or one of the new registers.

RIP-relative addressing was also added. It provides a new addressing mode relative to the position of the instruction pointer: an address can be formed by adding a value to the instruction pointer address. This addressing mode uses a signed 32-bit displacement, which allows an offset range of ±2 GB from the instruction pointer address [8].

Operating Modes

EM64T has two distinct operating modes, legacy mode and IA-32e mode. The latter includes two sub-modes, 64-bit mode and compatibility mode.

Legacy mode: in this mode the processor works like a normal IA-32 processor, running only 32-bit applications. The processor may operate in three different sub-modes: protected mode, real-address mode and system management mode [10].

IA-32e mode: this is the new mode added by Intel. It makes it possible to run a 64-bit operating system while still running unmodified 32-bit applications. The 32-bit applications continue to use an address space of at most 4 GB.

Compatibility mode: maintains binary compatibility with 16-bit and 32-bit applications and allows them to coexist under a 64-bit operating system. To execute existing 16-bit and 32-bit applications, the operating system sets the CS.L bit of their code segment descriptor to 0 [8].

64-bit mode: the default address size is 64 bits and the default operand size is 32 bits. This mode offers the full memory advantages of a 64-bit solution, but only 64-bit applications will work. The mode is enabled by the operating system, which switches all registers and the instruction pointer to 64 bits, and applications have access to the full physical memory range. Arithmetic and logical operations, memory-to-register and register-to-memory operations are directly supported for 64-bit integers. A larger virtual address space and a larger physical address space can be used.

In total, 2^64 bytes, or 16 exabytes, can be addressed. Current EM64T processors have some limitations, because they have only 36 address lines, which means that 2^36 bytes, or 64 GB, of RAM can be addressed. The Xeon DP is slightly different because it has 40 address lines, allowing 2^40 bytes (1 TB) of address space [5-7; 9].
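The operating modes above are selected by three control bits described in the guide [8]: IA32_EFER.LMA chooses between legacy mode and IA-32e mode, and the code segment descriptor bits CS.L and CS.D choose the sub-mode. The following Python sketch shows that decision as a small function. The function name operating_mode and the returned strings are hypothetical. The combination CS.L = 1 with CS.D = 1 is not discussed in this paper; it is labeled as reserved here on the assumption that Intel's documentation treats it that way.

```python
def operating_mode(lma: int, cs_l: int, cs_d: int) -> str:
    """Decode the processor mode from IA32_EFER.LMA, CS.L and CS.D."""
    if not lma:
        # LMA = 0: the processor behaves as a plain IA-32 processor.
        return "legacy mode"
    if cs_l:
        if cs_d:
            # Assumption: this combination is reserved, not a usable mode.
            return "reserved"
        # CS.L = 1, CS.D = 0: 64-bit addresses, 32-bit default operands.
        return "64-bit mode"
    # CS.L = 0: compatibility mode; CS.D selects 32-bit or 16-bit defaults.
    if cs_d:
        return "compatibility mode (32-bit defaults)"
    return "compatibility mode (16-bit defaults)"


if __name__ == "__main__":
    for bits in [(0, 0, 1), (1, 1, 0), (1, 0, 1), (1, 0, 0)]:
        print(bits, "->", operating_mode(*bits))
```

In legacy mode the function ignores CS.L and CS.D, which reflects only the level of detail given in this paper.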

Controlling IA-32e mode

As mentioned above, the operation of 64-bit mode and compatibility mode is governed by various control bits in the Extended Feature Enable Register (IA32_EFER) MSR and in the CS descriptor [8]. IA32_EFER.LMA controls whether legacy mode or IA-32e mode is active, and the code segment descriptor bits (CS.L and CS.D) select the sub-mode, 64-bit mode or compatibility mode. Figure 3 shows the different combinations of CS.L and CS.D that control the IA-32e modes. If CS.L = 1 and CS.D = 0, the processor is running in 64-bit mode, the default operand size is 32 bits and the address size is 64 bits. Compatibility mode is active when CS.L = 0, and CS.D then sets the operand and address sizes to 32 bits or 16 bits [8]. The LMA bit switches between legacy mode and IA-32e mode.

Figure 3: EM64T processor modes; source [8]

Benchmarks

Figure 4 presents a test of a Pentium 4 processor with and without active 64-bit extensions. The benchmark is published on the Hardware.fr website [2]. It covers encoding, decoding, visual effects and calculation tasks in order to compare the EM64T technology with the old IA-32 legacy mode. Performance is measured as the time, in seconds, needed to complete a task (lower values correspond to better performance).

Figure 4: A benchmark test on a Pentium 4 660 with the EM64T extensions activated and deactivated; source [2]

As can be seen in the figure, the performance gain is not significant, and in some functions IA-32 legacy mode outperforms EM64T. The performance increase with EM64T is higher in image and video applications, which perform a larger number of floating-point and integer calculations, so the larger number of general-purpose registers and of XMM registers helps multimedia performance.

Conclusions

EM64T seems to be the ideal technology for a progressive transition from 32-bit to 64-bit applications. Since the majority of applications are still 32-bit, it gives software developers more time to port their applications to 64 bits. 64-bit applications can run simultaneously with 32-bit applications without recompilation.

The benchmark tests do not show a significant performance increase with the EM64T extensions active, but other factors also influence the performance of a processor, such as the front side bus, the pipeline depth, the cache, etc. As the tests were run on the same processor, these factors should be taken into account when interpreting the small performance increase of EM64T. Perhaps this technology is more effective when used with database systems, because the larger address space allows a larger amount of data to be managed.

My personal opinion is that the main advantage of a processor with the EM64T extensions is the possibility of running 32-bit and 64-bit applications at the same time, avoiding the need to acquire two different processors, one supporting 32 bits and one supporting 64 bits. EM64T allows Intel to take advantage of the existing IA-32 architecture and to avoid the cost of developing a new architecture that supports both 32-bit and 64-bit applications.
As the SEARCH cluster has only Xeon processors, most of the problems associated with 32-bit/64-bit compatibility can be solved. Moreover, as the cluster is accessible to the scientific community of the University of Minho, it is valuable to have a system that supports both 32-bit and 64-bit applications.

References

[1] http://www.xbitlabs.com/articles/cpu/print/core2duo-64bit.html. 24-7-2006.
[2] http://www.hardware.fr/news/lire/25-02-2005/#7320. 2007.
[3] http://en.wikipedia.org/wiki/64-bit. 2007.
[4] http://en.wikipedia.org/wiki/em64t. 2007.
[5] http://www.hardwaresecrets.com/article/262. 2007.
[6] David Watts and Robert Moon. (2005) IBM Eserver xseries 366 Technical. IBM Redbooks Paper.
[7] Garima Kochhar, Kalyana Chadalavada, Amina Saify and Rizwan Ali. (2004) BLAST on Intel EM64T Architecture. Dell.
[8] Intel. (2007) Intel Extended Memory 64 Technology Software Developer's Guide Volume 1.
[9] Intel. (2004) The 64-bit Tipping Point. Intel Solutions.
[10] Intel Corporation. (2006) Intel 64 and IA-32 Architectures Software Developer's Manual.
[11] James Leiterman. (2005) 32/64-BIT 80x86 Assembly Language Architecture.
[12] Jerry Haigh. (1996) 64-Bit Computing Solves the World's Most Complex Problems.
[13] John Coombs and John Fruehe. (2004) Planning Considerations for Intel Extended Memory 64 Technology on Servers and Workstations. Dell.
[14] Ramesh Radhakrishnan, Jimmy Pike and Skipper Smith. (2004) Intel Extended Memory 64 Technology (EM64T). Dell.