CS Bootcamp x86-64 Autumn 2015

Similar documents
Registers. Ray Seyfarth. September 8, Bit Intel Assembly Language c 2011 Ray Seyfarth

Credits and Disclaimers

Assembly Language Each statement in an assembly language program consists of four parts or fields.

Computer Processors. Part 2. Components of a Processor. Execution Unit The ALU. Execution Unit. The Brains of the Box. Processors. Execution Unit (EU)

6/20/2011. Introduction. Chapter Objectives Upon completion of this chapter, you will be able to:

EEM336 Microprocessors I. Addressing Modes

Instruction Set Architectures

Credits and Disclaimers

Binghamton University. CS-220 Spring x86 Assembler. Computer Systems: Sections

How Software Executes

Instruction Set Architectures

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2018 Lecture 4

How Software Executes

MACHINE-LEVEL PROGRAMMING I: BASICS COMPUTER ARCHITECTURE AND ORGANIZATION

CS 261 Fall Machine and Assembly Code. Data Movement and Arithmetic. Mike Lam, Professor

C to Assembly SPEED LIMIT LECTURE Performance Engineering of Software Systems. I-Ting Angelina Lee. September 13, 2012

CSE351 Spring 2018, Midterm Exam April 27, 2018

RISC I from Berkeley. 44k Transistors 1Mhz 77mm^2

Instruction Set Architecture (ISA) Data Types

Lecture 15 Intel Manual, Vol. 1, Chapter 3. Fri, Mar 6, Hampden-Sydney College. The x86 Architecture. Robb T. Koether. Overview of the x86

Introduction to Machine/Assembler Language

x86 Programming I CSE 351 Winter

The von Neumann Machine

The x86 Architecture

x64 Cheat Sheet Fall 2014

Meet & Greet! Come hang out with your TAs and Fellow Students (& eat free insomnia cookies) When : TODAY!! 5-6 pm Where : 3rd Floor Atrium, CIT

Assembly I: Basic Operations. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

The von Neumann Machine

Machine/Assembler Language Putting It All Together

Introduction to 8086 Assembly

Assembly Language Programming 64-bit environments

CS241 Computer Organization Spring 2015 IA

CS165 Computer Security. Understanding low-level program execution Oct 1 st, 2015

X86 Addressing Modes Chapter 3" Review: Instructions to Recognize"

CS412/CS413. Introduction to Compilers Tim Teitelbaum. Lecture 21: Generating Pentium Code 10 March 08

Machine Programming 3: Procedures

x86 64 Programming I CSE 351 Autumn 2018 Instructor: Justin Hsia

Today: Machine Programming I: Basics

Today: Machine Programming I: Basics. Machine Level Programming I: Basics. Intel x86 Processors. Intel x86 Evolution: Milestones

How Software Executes

Von Neumann Architecture. Machine Languages. Context of this Lecture. High-Level Languages. Assembly Language: Part 1. Princeton University.

Function Calls and Stack

W4118: PC Hardware and x86. Junfeng Yang

Putting the pieces together

CSE 351 Section 4 GDB and x86-64 Assembly Hi there! Welcome back to section, we re happy that you re here

Do not turn the page until 5:10.

CSC 2400: Computer Systems. Towards the Hardware: Machine-Level Representation of Programs

Program Exploitation Intro

Machine Language CS 3330 Samira Khan

Machine Program: Procedure. Zhaoguo Wang

CSC 8400: Computer Systems. Machine-Level Representation of Programs

Process Layout and Function Calls

Function Call Convention

x86 assembly CS449 Fall 2017

Machine Level Programming I: Basics

Where We Are. Optimizations. Assembly code. generation. Lexical, Syntax, and Semantic Analysis IR Generation. Low-level IR code.

x86 Programming I CSE 351 Autumn 2016 Instructor: Justin Hsia

Lecture 4 CIS 341: COMPILERS

Princeton University Computer Science 217: Introduction to Programming Systems. Assembly Language: Part 1

A4 Sample Solution Ch3

+ Machine Level Programming: x86-64 History

MACHINE-LEVEL PROGRAMMING I: BASICS

CSE 351 Midterm Exam

Instruction Set Architectures

Carnegie Mellon. 5 th Lecture, Jan. 31, Instructors: Todd C. Mowry & Anthony Rowe

x86 assembly CS449 Spring 2016

CS 107 Lecture 10: Assembly Part I

CS 31: Intro to Systems ISAs and Assembly. Kevin Webb Swarthmore College February 9, 2016

System calls and assembler

Turning C into Object Code Code in files p1.c p2.c Compile with command: gcc -O p1.c p2.c -o p Use optimizations (-O) Put resulting binary in file p

Chapter 3: Addressing Modes

CS 31: Intro to Systems ISAs and Assembly. Kevin Webb Swarthmore College September 25, 2018

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2017 Lecture 5

Computer Architecture and System Programming Laboratory. TA Session 3

CS 16: Assembly Language Programming for the IBM PC and Compatibles

administrivia today start assembly probably won t finish all these slides Assignment 4 due tomorrow any questions?

Low-Level Essentials for Understanding Security Problems Aurélien Francillon

The Hardware/Software Interface CSE351 Spring 2013

Lab 3. The Art of Assembly Language (II)

Assembly Language Lab # 9

Reverse Engineering II: The Basics

Compiler construction. x86 architecture. This lecture. Lecture 6: Code generation for x86. x86: assembly for a real machine.

Machine Level Programming I: Basics. Credits to Randy Bryant & Dave O Hallaron

CSE 351 Midterm Exam Spring 2016 May 2, 2015

EECE.3170: Microprocessor Systems Design I Summer 2017 Homework 4 Solution

Today: Machine Programming I: Basics. Machine-Level Programming I: Basics. Intel x86 Processors. Intel x86 Evolution: Milestones

Module 3 Instruction Set Architecture (ISA)

Reverse Engineering II: Basics. Gergely Erdélyi Senior Antivirus Researcher

Do not turn the page until 11:30.

Lecture 3 CIS 341: COMPILERS

Inside VMProtect. Introduction. Internal. Analysis. VM Logic. Inside VMProtect. Conclusion. Samuel Chevet. 16 January 2015.

Islamic University Gaza Engineering Faculty Department of Computer Engineering ECOM 2125: Assembly Language LAB. Lab # 7. Procedures and the Stack

Chapter 11. Addressing Modes

Data Representa/ons: IA32 + x86-64

IA-32 & AMD64. Crash Dump Analysis 2015/2016. CHARLES UNIVERSITY IN PRAGUE faculty of mathematics and physics.

Branching and Looping

x86 architecture et similia

Functions. Ray Seyfarth. August 4, Bit Intel Assembly Language c 2011 Ray Seyfarth

Intel x86-64 and Y86-64 Instruction Set Architecture

UMBC. A register, an immediate or a memory address holding the values on. Stores a symbolic name for the memory location that it represents.

Transcription:

The x86-64 instruction set architecture (ISA) is used by most laptop and desktop processors. We will be embedding assembly into some of our C++ code to explore programming in assembly language. Depending on the source, the name for this instruction set is generally known as x86-64, x86 64, AMD64, and x64. Some of the material here has been drawn from the GCC Inline Assembly HOWTO https://www.ibiblio.org/gferg/ldp/gcc-inline-assembly-howto.html. 1 Registers We will focus only on the 16 general purpose (integer) registers storing 64-bit values. Other registers do exist such as those for SIMD and floating point operations, but they are beyond the scope of this class. The original 8086 (1976) had the eight 16 bit registers ax, bx, cx, dx, si, di, bp, sp. IA-32 (1986) extended these eight registers to 32 bit, and gave the 32 bit sizes names eax, ebx, ecx, edx, esi, edi, ebp, and esp. It also preserved accessing the lower 16 bits using the names devised by the 8086. After Intel went off in its own direction for a 64 bit processor with the Itanium, AMD created the x86-64 ISA when it introduced the Opteron in 2003. The Itanium ended up being a commercial failure, and ultimately Intel also adopted the x64-64 ISA. It extends the registers of the 8086 and IA-32 to 64 bits under the names rax, rbx, rcx, rdx, rsi, rdi, rbp, and rsp. It also introduced eight additional registers r8, r9, r10, r11, r12, r13, r14, and r15. Addressing the smaller forms were also preserved, and can be accessed. For the original eight registers, the lowest byte can be accessed by replacing the letter x with the letter l in their register name, e.g. al. For the eight new registers, they can be accessed by appending the letter b to the register name, e.g. r8b. The lowest 16 bits, known as a word in this context, can be accessed for the original eight registers by their 8086 names, e.g. ax. For the new registers, they are accessed by appending the letter w to their name, e.g. r8w. The lowest 32 bits, known as a double word in this context, can be accessed for the original eight registers by their IA-32 name which prepends the letter e to their 8086 names, e.g. eax. For the new registers, they are accessed by appending the letter d to their names, e.g. r8d. The full 64 bits, known as a quad word in this context, can be accessed for the original eight registers by prepending the letter r to their 8086 names, e.g. rax. For the new registers, they are accessed by just their name with no suffixes, e.g. r8. Many of the operations allow for specifying which registers to use, but some require their operands in specific registers. 2 Embedding Assembly into GCC Generally speaking, C++ compilers allow for a programmer to directly embed assembly into the middle of a function. It is not often used modernly because in most situations a compiler will write more performant code and assembly is not portable between processors. The rare cases that I still see it are to access specific hardware features or new operations that a compiler does not support. These happen in things like an operating system kernel, and 1

very specialized code where efficiency is paramount like video decoders. Although we will not discuss these in this class, if you find yourself in a situation where you think you need straight assembly, first look into the compiler intrinsics that are supported by your compiler. Often those will allow you to accomplish what you need in a simpler manner. The method of embedding assembly is not part of ISO C++ (or ISO C), so we will focus on how GCC extends the language to allow for direct assembly. Since a design goal for Clang is to be a drop-in replacement for GCC, this syntax is also supported there. The general structure is as follows asm ( assembler code : output operands / o p t i o n a l / : input operands / o p t i o n a l / : l i s t o f clobbered r e g i s t e r s / o p t i o n a l / The assembler code is in AT&T syntax by default. This is in contrast to Intel syntax which is seen more often. Translating between the two syntaxes is generally not too difficult. The main differences are 1. Source Destination Ordering. In Intel syntax an operation is normally ordered destination source, whereas AT&T uses source destination. 2. Register Naming. In AT&T syntax, register names are prefixed by a %, such as %eax. 3. Immediate Operands. In AT&T syntax, immediate operands (constants) are prefixed by a $. 4. Operand Size. In AT&T syntax the size of memory operands is determined from the last character of the op-code name. Op-code suffixes of b, w, and l specify byte(8-bit), word(16-bit), and long(32-bit) memory references. Intel syntax accomplishes this by prefixing memory operands (not the op-codes) with byte ptr, word ptr, and dword ptr. Thus, Intel mov al, byte ptr foo is movb foo, %al in AT&T syntax. Other differences do exist. To learn about them consult the manual for GNU gas https: //www.gnu.org/software/binutils/manual/. The code itself can be specified in several ways, but the simplest to use is to put each op-code into its own double quoted string with semicolons at the end. For example int negateintasm ( int in ) { asm( xor $ 0 x f f f f f f f f, %0; add $0x1, %0; : =r ( in ) // output : 0 ( in ) // input : // c l o b b e r e d r e g i s t e r s is a simple function that corresponds to the following C++ function. 2

3 Determining the op-codes to use There is some inherit difficulty in translating code to assembly. You have to have a large knowledge of the specific ISA to select the right op-codes. So we will see how to get the compiler to tell us. First we create a simple source code file that has just the function we wish to translate, say We then compile this with no optimizations and the debugging symbols. On a Linux system this would be g++ -O0 -g -std=c++11 -c -o file.o file.cpp. This will create a file named file.o which is the compiled form of file.cpp. Next we run objdump on it as objdump -S file.o > file.s. This will create a file named file.s which will contain the annotated assembly. t e s t. o : f i l e format e l f 6 4 x86 64 Disassembly o f s e c t i o n. t e x t : 0000000000000000 < Z 9negateInti >: 0 : 55 push %rbp 1 : 48 89 e5 mov %rsp,% rbp 4 : 89 7d f c mov %edi, 0 x4(%rbp ) 7 : f 7 5d f c negl 0x4(%rbp ) a : 8b 45 f c mov 0x4(%rbp ),%eax d : 5d pop %rbp e : c3 r e t q We see the lines of assembly interspersed with the original C++ code. Note the negl operation. This is the GCC performing an optimization on our code even though we requested 3

that it not do so. Unfortunately there is not much we can do with this, GCC will optimize it regardless. However, if we use Clang to compile instead we get a difference assembly listing. t e s t. o : f i l e format e l f 6 4 x86 64 Disassembly o f s e c t i o n. t e x t : 0000000000000000 < Z 9negateInti >: 0 : 55 push %rbp 1 : 48 89 e5 mov %rsp,% rbp 4 : 89 7d f c mov %edi, 0 x4(%rbp ) 7 : 8b 7d f c mov 0x4(%rbp ),% edi a : 83 f 7 f f xor $ 0 x f f f f f f f f,%edi d : 83 c7 01 add $0x1,% edi 1 0 : 89 7d f c mov %edi, 0x4(%rbp ) 1 3 : 8b 45 f c mov 0x4(%rbp ),%eax 1 6 : 5d pop %rbp 1 7 : c3 r e t q In this listing we can see the explicit xor and add instructions. It is from this Clang listing we can see the translation of the function to embedded assembly. int negateintasm ( int in ) { asm( xor $ 0 x f f f f f f f f, %0; add $0x1, %0; : =r ( in ) // output : 0 ( in ) // input : // c l o b b e r e d r e g i s t e r s Since writing assembly from scratch is tedious, we will instead use this approach of compiling the code, looking at the result, and translating it back to embedded assembly. 4 Common Structures and How They Translate to Assembly Since we are cheating by using the compiler, we really only need to recognize how some structures will translate to a simplified C++ syntax. From there we can guess how our statement translated by comparing the original to the generated assembly. 4

To get precise definitions of what each operation does, consult the Intel Software Developer Manuals: https://www-ssl.intel.com/content/www/us/en/processors/ architectures-software-developer-manuals.html 4.1 Conditional Statements Given a conditional statement i f ( t e s t expr ) then statement else else statement this translates to a psuedo C++ syntax as t = t e s t expr ; i f (! t ) goto f a l s e ; then statement goto done ; f a l s e : else statement done : Note that this is still mostly valid C++, but it is closer to how the assembly will be. Then the actual assembly is easy to see with gotos changed to various jump statements (jmp, je, jne, etc). 4.2 Loops Given a basic loop do body statement while ( t e s t expr this translates to a pseudo C++ syntax as loop : body statement t = t e s t expr ; i f ( t ) goto loop ; Again, the translation from here to assembly should be straightforward. 5