Reverse Engineering II: The Basics


This document is only to be distributed to teachers and students of the Malware Analysis and Antivirus Technologies course and should only be used in accordance with the course guidelines.
Protecting the irreplaceable | f-secure.com

Agenda
  Very basics
  Intel x86 crash course
  Basics of C reversing

Binary Numbers
  Nibble (4 bits):   1011                  = 0xB
  Byte (8 bits):     1011 1101             = 0xBD
  Word (16 bits):    1011 1101 0011 1001   = 0xBD39

Byte Order a.k.a. Endianness
  Offset:  00 01
  Bytes:   12 34   = 0x3412 (Little Endian) = 0x1234 (Big Endian)
  Bytes:   34 12   = 0x1234 (Little Endian) = 0x3412 (Big Endian)

Little Endian Dword
  Offset:  00 01 02 03
  Bytes:   12 34 56 78   = 0x78563412
  Bytes:   78 56 34 12   = 0x12345678

Endianness Matters
  Data exchange between computers
  Networking protocols
  File formats for disk storage

System Endianness
  Little Endian:          Intel x86, Intel 8051, most microcontrollers
  Big Endian:             PowerPC (except G5), Sparc (except v9), System/370
  Switchable Endianness:  ARM, Alpha, Intel IA64

ASCII Code
  Range       Contents                         Examples
  0x00-0x1F   Control Characters               Backspace, Line feed
  0x20-0x3F   Digits and Punctuation           0-9 < > = . , : * - ( ) !
  0x40-0x5F   Upper-case Letters and Special   ABCD... @ [ ] \ ^ _
  0x60-0x7E   Lower-case Letters and Special   abcd... ` { | } ~

ASCII Example
  Text:  H  e  l  l  o     1  2  3  4
  Hex:   48 65 6C 6C 6F 20 31 32 33 34
  http://en.wikipedia.org/wiki/ascii

Unicode Strings
  Encoding: UTF-16 / UCS-2
  Text:  BOM    H     e     l     l     o
  Hex:   ff fe  48 00 65 00 6c 00 6c 00 6f 00
  http://en.wikipedia.org/wiki/utf-16/ucs-2
  http://en.wikipedia.org/wiki/category:unicode

String Storage
  ASCIIZ: Zero-terminated ASCII
  Pascal: Size byte + ASCII string
  Delphi: Size Dword + ASCII or Unicode string

  Text:    H  e  l  l  o
  ASCIIZ:  48 65 6C 6C 6F 00
  Pascal:  05 48 65 6C 6C 6F

Intel x86 Architecture
  [Image: copyright 2004 GNU]

Introduction to Intel x86
  Started with the 8086 in 1978
  Continued with 8088, 80186, 80286, 386, 486, Pentium, 686...
  CISC architecture
  32-bit is called x86-32 or IA-32
  64-bit is called x86-64, AMD64, EM64T
  80386, introduced in 1985:
    Has a 32-bit word length
    Has eight general-purpose registers
    Supports paging and virtual memory
    Addresses up to 4 GiB of memory

Data Register Layout
  [Image: copyright 1997-2008 Intel Corporation]

Data Registers
  Registers              Name                Typical use
  AL / AH / AX / EAX     Accumulator         Arithmetic operations
  BL / BH / BX / EBX     Base (data index)   General data storage, index
  CL / CH / CX / ECX     Loop counter        Loop constructs
  DL / DH / DX / EDX     Data register       Arithmetic

Address Registers
  Registers    Name                  Typical use
  IP / EIP     Instruction Pointer   Program execution
  SP / ESP     Stack Pointer         Stack operation
  BP / EBP     Base Pointer          Stack frame
  SI / ESI     Source Index          String operation
  DI / EDI     Destination Index     String operation

Segment Registers
  Registers       Name             Typical use
  CS              Code Segment     Program code
  DS              Data Segment     Program data
  ES / FS / GS    Other Segments   Other uses

EFLAGS Register
  [Image: copyright 1997-2008 Intel Corporation]

Mnemonic Examples
  MOV EAX, 1    Move 1 to EAX
  ADD EDX, 5    Add 5 to EDX
  SUB EBX, 2    Subtract 2 from EBX
  AND ECX, 0    Bit-wise AND ECX with 0
  XOR EDX, 4    Bit-wise exclusive OR EDX with 4
  SHL ECX, 6    Shift ECX left by 6
  ROR EBX, 3    Bit-wise rotate EBX right by 3
  INC ECX       Increment ECX

More Mnemonics
  JNZ label     Jump if not zero (not equal)
  JMP label     Unconditional jump to label
  CALL func     Call function
  RET           Return from function
  LOOP label    ECX--, jump to label if ECX is not zero
  PUSH EAX      Push EAX to stack
  POP EDI       Pop EDI from stack
  LODSB         Load byte from DS:ESI to AL

Reversing C code

Basic Data Types
  char   - 1 byte
  short  - 2 bytes
  int    - 4 bytes (platform word)
  long   - 4 bytes
  float  - 4 bytes floating point
  double - 8 bytes floating point

Pointers and Arrays
  Pointers can point to any memory location
  One-dimensional arrays are flat memory
  Multi-dimensional arrays use pointers

  Memory layout: a[0] a[1] a[2] a[3]

    char a[4];
    char *b, c;
    c = a[2];
    b = a;
    c = *(b+2);

Composite Types: Structure
  Memory is allocated for all members
  Members are accessible separately

    struct {
        unsigned int id;
        unsigned short age;
        char name[16];
    } record;

Alignment
  Data structures are aligned to word size
  #pragma pack(n) directive can change it
  #pragma pack(1) removes alignment
  Important when reconstructing structures

Structure Storage
  Aligned:
    long id;
    short age;
    2-byte padding
    char name[16];
    sizeof(record) = 24
  Packed:
    long id;
    short age;
    char name[16];
    sizeof(record) = 22

Composite Types: Union
  Memory is allocated for the largest member
  Holds only one member at a time

    union foo {
        int one;
        char two;
    };

Control Structures
  Conditional Branch
  Iteration
  Switch-Case
  Goto label

Conditional Branch: if
  C source:
    int example_if()
    {
        int foo = 0;
        if (foo) {
            do_one_thing();
        } else {
            do_another();
        }
    }
  Disassembly:
    var_c = dword ptr -0Ch
        push ebp
        mov ebp, esp
        sub esp, 18h
        mov [ebp+var_c], 0
        cmp [ebp+var_c], 0
        jz short loc_1f27
        call _do_one_thing
        jmp short locret_1f2c
    loc_1f27:
        call _do_another
    locret_1f2c:
        leave
        retn

Iteration: for
  C source:
    int example_for()
    {
        int i;
        for (i=0; i<10; i++) {
            if (check_something(i))
                break;
        }
    }
  Disassembly:
        push ebp
        mov ebp, esp
        sub esp, 28h
        mov [ebp+var_c], 0
        jmp short loc_1f51
    loc_1f3d:
        mov eax, [ebp+var_c]
        mov [esp], eax
        call _check_something
        test eax, eax
        jnz short locret_1f57
        lea eax, [ebp+var_c]
        inc dword ptr [eax]
    loc_1f51:
        cmp [ebp+var_c], 9
        jle short loc_1f3d
    locret_1f57:
        leave
        retn

Iteration: while
  C source:
    int example_while()
    {
        int i = 0;
        while (i < 100) {
            if (check_something(i))
                break;
        }
    }
  Disassembly:
        push ebp
        mov ebp, esp
        sub esp, 28h
        mov [ebp+var_c], 0
        jmp short loc_1f77
    loc_1f68:
        mov eax, [ebp+var_c]
        mov [esp], eax
        call _check_something
        test eax, eax
        jnz short locret_1f7d
    loc_1f77:
        cmp [ebp+var_c], 64h
        jl short loc_1f68
    locret_1f7d:
        leave
        retn

Branching: Switch-Case
  C source:
    int example_switch()
    {
        int i = 1;
        switch (i) {
        case 0:
            do_one_thing();
            break;
        case 1:
            do_another();
            break;
        default:
            check_something(i);
        }
    }
  Disassembly:
        push ebp
        mov ebp, esp
        sub esp, 38h
        mov [ebp+var_c], 1
        mov eax, [ebp+var_c]
        mov [ebp+var_1c], eax
        cmp [ebp+var_1c], 0
        jz short loc_1fab
        cmp [ebp+var_1c], 1
        jz short loc_1fb2
        mov eax, [ebp+var_c]
        mov [esp], eax
        call _check_something
        jmp short locret_1fb9
    loc_1fab:
        call _do_one_thing
        jmp short locret_1fb9
    loc_1fb2:
        call _do_another
        jmp short $+2
    locret_1fb9:
        leave
        retn

Branching: Goto
  C source:
    int example_goto(void)
    {
        open_files();
        if (do_one_thing())
            goto error;
        if (do_another())
            goto error;
        close_files();
        return 1;
    error:
        close_files();
        return 0;
    }
  Disassembly:
        push ebp
        mov ebp, esp
        sub esp, 18h
        call _open_files
        call _do_one_thing
        test eax, eax
        jnz short loc_1fe6
        call _do_another
        test eax, eax
        jnz short loc_1fe6
        call _close_files
        mov [ebp+var_c], 1
        jmp short loc_1ff2
    loc_1fe6:
        call _close_files
        mov [ebp+var_c], 0
    loc_1ff2:
        mov eax, [ebp+var_c]
        leave
        retn

Function Calling Conventions
  Common calling conventions:
    stdcall  - Standard calls on Windows
    cdecl    - Most common C calling convention
    fastcall - Uses registers for arguments
    thiscall - Pass this pointer in ECX in C++
  Most important: Who is going to clean the stack?
  Mixing them will crash the program

Simple C Program
    int foobar(int x, int y)
    {
        int z;
        return x;
    }

    int main(void)
    {
        int z = foobar(1, 2);
    }

cdecl Calls
  Caller:
        PUSH arg2
        PUSH arg1
        CALL function
        ADD ESP, 8
  Callee:
        PUSH EBP
        MOV EBP, ESP
        SUB ESP, 4
        MOV EAX, [EBP+8]
        MOV ESP, EBP
        POP EBP
        RET
  Stack (highest address first):
        ARG2         arg2: EBP+12
        ARG1         arg1: EBP+8
        RET Addr.
        Saved EBP
        LOC1         loc1: EBP-4

stdcall Calls
  Caller:
        PUSH arg2
        PUSH arg1
        CALL function
  Callee:
        PUSH EBP
        MOV EBP, ESP
        SUB ESP, 4
        MOV EAX, [EBP+8]
        MOV ESP, EBP
        POP EBP
        RETN 8
  Stack (highest address first):
        ARG2         arg2: EBP+12
        ARG1         arg1: EBP+8
        RET Addr.
        Saved EBP
        LOC1         loc1: EBP-4

Reading
  Intel x86 Function-call Conventions: http://www.unixwiz.net/techtips/win32-callconv-asm.html
  C Programming Information: http://www.cprogramming.com/