Reverse Engineering II: The Basics

Gergely Erdélyi, Senior Manager, Anti-malware Research, F-Secure (f-secure.com)

Binary Numbers
1011                = 0xB    (Nibble)
1011 1101           = 0xBD   (Byte)
1011 1101 0011 1001 = 0xBD39 (Word)

Byte Order a.k.a. Endianness
Offset:  00 01
Bytes:   12 34    = 0x3412 (Little Endian) = 0x1234 (Big Endian)
Bytes:   34 12    = 0x1234 (Little Endian) = 0x3412 (Big Endian)

Little Endian Dword
Offset:  00 01 02 03
Bytes:   12 34 56 78    = 0x78563412
Bytes:   78 56 34 12    = 0x12345678

Endianness Matters
Data exchange between computers
Networking protocols
File formats for disk storage
Mixing endianness

System Endianness
Little Endian            Big Endian           Switchable Endianness
Intel x86                PowerPC (exc. G5)    ARM
Intel 8051               Sparc (exc. v9)      Alpha
Most microcontrollers    System/370           Intel IA64

ASCII Code
Range        Contents                          Examples
0x00-0x1F    Control Characters                Backspace, Line feed
0x20-0x3F    Digits and Punctuation            0-9 <> =.,: *-()!
0x40-0x5F    Upper-case Letters and Special    ABCD... @[]\^_
0x60-0x7E    Lower-case Letters and Special    abcd... `{|}~

ASCII Example
H  e  l  l  o     1  2  3  4
48 65 6C 6C 6F 20 31 32 33 34
http://en.wikipedia.org/wiki/ascii

Unicode Strings
BOM    H     e     l     l     o
ff fe  48 00 65 00 6c 00 6c 00 6f 00
UTF-16 / UCS-2
http://en.wikipedia.org/wiki/utf-16/ucs-2
http://en.wikipedia.org/wiki/category:unicode

String Storage
ASCIIZ: Zero-terminated ASCII
Pascal: Size byte + ASCII string
Delphi: Size Dword + ASCII or Unicode string

         H  e  l  l  o
ASCIIZ:  48 65 6C 6C 6F 00
Pascal:  05 48 65 6C 6C 6F

Intel x86 Architecture
(Image copyright 2004 GNU)

Introduction to Intel x86
Started with 8086 in 1978
Continued with 8088, 80186, 80286, 386, 486, Pentium, 686...
CISC architecture
32-bit is called x86-32 or IA-32
64-bit is called x86-64, AMD64, EM64T
80386 introduced in 1985
    Has a 32-bit word length
    Has eight general-purpose registers
    Supports paging and virtual memory
    Addresses up to 4 GiB of memory

Data Register Layout
(Figure not reproduced; image copyright 1997-2008 Intel Corporation)

Data Registers
8/16-bit        32-bit    Name             Use
AL / AH / AX    EAX       Accumulator      Arithmetic operations
BL / BH / BX    EBX       Data index       General data storage, index
CL / CH / CX    ECX       Loop counter     Loop constructs
DL / DH / DX    EDX       Data register    Arithmetic

Address Registers
16/32-bit    Name                   Use
IP / EIP     Instruction Pointer    Program execution
SP / ESP     Stack Pointer          Stack operation
BP / EBP     Base Pointer           Stack frame
SI / ESI     Source Index           String operation
DI / EDI     Destination Index      String operation

Segment Registers
Register        Name              Use
CS              Code Segment      Program code
DS              Data Segment      Program data
ES / FS / GS    Other Segments    Other uses

EFLAGS Register
(Figure not reproduced; image copyright 1997-2008 Intel Corporation)

Mnemonic Examples
MOV EAX, 1    Move 1 to EAX
ADD EDX, 5    Add 5 to EDX
SUB EBX, 2    Subtract 2 from EBX
AND ECX, 0    Bit-wise AND 0 to ECX
XOR EDX, 4    Bit-wise exclusive OR 4 to EDX
SHL ECX, 6    Shift ECX left by six
ROR EBX, 3    Bit-wise rotate EBX right by 3
INC ECX       Increment ECX

More Mnemonics
JNZ label     Jump if not zero (same as JNE, jump if not equal)
JMP label     Unconditional jump to label
CALL func     Call function
RET           Return from function
LOOP label    ECX--, jump to label if ECX is not zero
PUSH EAX      Push EAX to stack
POP EDI       Pop EDI from stack
LODSB         Load byte from DS:ESI to AL

Reversing C Code
(Image copyright 1988, 1978 by Bell Telephone Laboratories, Incorporated)

Basic Data Types
char   - 1 byte
short  - 2 bytes
int    - 4 bytes (platform word)
long   - 4 bytes
float  - 4 bytes floating point
double - 8 bytes floating point

Arrays and Pointers
Pointers can point to any memory location
One-dimensional arrays are flat memory
Multi-dimensional arrays use pointers

    char a[4];
    char *b = a, c;    /* b must point at the array before it is dereferenced */
    c = a[2];
    c = *(b+2);

Structures and Unions

Structure:
    struct {
        unsigned int id;
        unsigned short age;
        char name[16];
    } record;
Memory is allocated for all members combined.
sizeof(record) = 24

Union:
    union foo {
        int one;
        char two;
    };
Memory is allocated for the largest member only.
sizeof(union foo) = 4

Structure Alignment
Data structures are aligned to word size by default
#pragma pack(n) directive can change it
#pragma pack(1) removes alignment
Important when reconstructing structures

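Structure Storage

Aligned:
    DWORD id
    WORD age
    16 BYTES name
    2-byte padding
    sizeof(record) = 24

Packed:
    DWORD id
    WORD age
    16 BYTES name
    sizeof(record) = 22

With typical x86 compilers the padding sits at the end of the aligned structure, because a char array needs no alignment of its own. The total size is rounded up to a multiple of 4 either way.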

Simple C Program

    int foobar(int x, int y)
    {
        int z = x+y;
        return z;
    }

    int main(void)
    {
        int z = foobar(1, 2);
    }

Function Calls
Calling conventions are important to know
Mixing them will crash the program
stdcall  - Standard calls on Windows
cdecl    - Most common C calling convention
fastcall - Uses registers for arguments
thiscall - Pass this pointer in ECX in C++

cdecl Calls

Caller:
    PUSH arg2
    PUSH arg1
    CALL function
    ADD ESP,8

Callee:
    PUSH EBP
    MOV EBP, ESP
    SUB ESP, 4
    MOV EAX, [EBP+8]
    MOV ESP, EBP
    POP EBP
    RET

Stack:
    ARG2
    ARG1
    RET Addr.
    Saved EBP
    LOC1

arg1: EBP+8
arg2: EBP+12
loc1: EBP-4

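stdcall Calls

Caller:
    PUSH arg2
    PUSH arg1
    CALL function

Callee:
    PUSH EBP
    MOV EBP, ESP
    SUB ESP, 4
    MOV EAX, [EBP+8]
    MOV ESP, EBP
    POP EBP
    RETN 8

Stack:
    ARG2
    ARG1
    RET Addr.
    Saved EBP
    LOC1

arg1: EBP+8
arg2: EBP+12
loc1: EBP-4

Arguments are pushed right to left, as in cdecl. The difference is that the callee removes them with RETN 8, so the caller does not adjust ESP after the call.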

Further Reading
Intel Processor Documentation
    http://www.intel.com/products/processor/manuals/index.htm
Netwide Assembler Mnemonic Documentation
    http://sourceforge.net/docman/display_doc.php?docid=47259&group_id=6208
The Art of Assembly Language Programming, Windows 32-bit Edition
    http://webster.cs.ucr.edu/aoa/index.html