Practical Malware Analysis

Practical Malware Analysis Ch 4: A Crash Course in x86 Disassembly Revised 1-16-7

Basic Techniques
- Basic static analysis: looks at malware from the outside
- Basic dynamic analysis: only shows you how the malware operates in one case
- Disassembly: view the code of the malware and figure out what it does

Levels of Abstraction

Six Levels of Abstraction
- Hardware
- Microcode
- Machine code
- Low-level languages
- High-level languages
- Interpreted languages

Hardware
- Digital circuits
- XOR, AND, OR, NOT gates
- Cannot be easily manipulated by software

Microcode
- Also called firmware
- Only operates on the specific hardware it was designed for
- Not usually important for malware analysis

Machine code
- Opcodes: tell the processor to do something
- Created when a program written in a high-level language is compiled

Low-level languages
- Human-readable version of a processor's instruction set
- Assembly language: PUSH, POP, NOP, MOV, JMP, ...
- A disassembler generates assembly language
- This is the highest-level language that can be reliably recovered from malware when source code is unavailable

High-level languages
- Most programmers use these
- C, C++, etc.
- Converted to machine code by a compiler

Interpreted languages
- Highest level
- Java, C#, Perl, .NET, Python
- Code is not compiled into machine code
- It is translated into bytecode
  - An intermediate representation
  - Independent of hardware and OS
- Bytecode executes in an interpreter, which translates bytecode into machine language on the fly at runtime
- Ex: Java Virtual Machine

Reverse Engineering

Disassembly
- Malware on a disk is in binary form at the machine code level
- Disassembly converts the binary form to assembly language
- IDA Pro is the most popular disassembler

Assembly Language
- Different versions for each type of processor
  - x86: 32-bit Intel (most common)
  - x64: 64-bit Intel
  - SPARC, PowerPC, MIPS, ARM: others
- Windows runs on x86 or x64
- x64 machines can run x86 programs
- Most malware is designed for x86

The x86 Architecture

- CPU (Central Processing Unit) executes code
- RAM stores all data and code
- I/O system interfaces with hard disk, keyboard, monitor, etc.

CPU Components
- Control unit: fetches instructions from RAM using a register named the instruction pointer
- Registers: data storage within the CPU, faster than RAM
- ALU (Arithmetic Logic Unit): executes an instruction and places results in registers or RAM

Main Memory (RAM)

Data
- Values placed in RAM when a program loads
- Sometimes these values are called static: they may not change while the program is running
- Sometimes these values are called global: available to any part of the program

Code
- Instructions for the CPU
- Controls what the program does

Heap
- Dynamic memory
- Changes frequently during program execution
- Program creates (allocates) new values, and eliminates (frees) them when they are no longer needed

Stack
- Local variables and parameters for functions
- Helps programs flow

Instructions
- Mnemonic followed by operands
- mov ecx, 0x42: move the value 42 (hex) into the Extended C register
- mov ecx is 0xB9 in hexadecimal
- The value 0x42, stored as a 32-bit little-endian operand, is 0x42000000
- In binary this instruction is 0xB942000000
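The encoding on this slide can be reproduced with a short Python sketch. The function name `encode_mov_ecx` is made up for illustration; the only facts taken from the slide are the opcode byte 0xB9 and the little-endian 32-bit immediate that follows it.

```python
import struct

MOV_ECX_IMM32 = 0xB9  # opcode byte for "mov ecx, imm32" (from the slide)

def encode_mov_ecx(value: int) -> bytes:
    """Return the 5 machine-code bytes for `mov ecx, value` (hypothetical helper)."""
    # "<I" packs an unsigned 32-bit integer in little-endian byte order.
    return bytes([MOV_ECX_IMM32]) + struct.pack("<I", value)

print(encode_mov_ecx(0x42).hex())  # b942000000
```

One opcode byte plus four operand bytes gives the five-byte instruction shown above.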

Assembly Language Instructions
- We're using the Intel format
- AT&T format reverses the source and destination positions

Endianness
- Big-endian: most significant byte first
  - 0x42 as a 32-bit value would be 0x00000042
- Little-endian: least significant byte first
  - 0x42 as a 32-bit value would be 0x42000000
- Network data uses big-endian
- x86 programs use little-endian

IP Addresses
- 127.0.0.1, or in hex, 7F 00 00 01
- Sent over the network as 0x7F000001
- Stored in RAM as 0x0100007F
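Both byte orders can be checked with Python's standard `struct` and `socket` modules. This is only a sketch of the byte layouts described above, and the variable names are illustrative.

```python
import socket
import struct

value = 0x42
big = struct.pack(">I", value)     # big-endian: most significant byte first
little = struct.pack("<I", value)  # little-endian: least significant byte first
print(big.hex(), little.hex())     # 00000042 42000000

# inet_aton returns the address in network (big-endian) byte order.
ip_wire = socket.inet_aton("127.0.0.1")
print(ip_wire.hex())               # 7f000001

# Reading those same four bytes as a little-endian dword, as x86 would,
# gives the "reversed" value from the slide.
as_x86_int = struct.unpack("<I", ip_wire)[0]
print(hex(as_x86_int))             # 0x100007f
```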

Operands
- Immediate: fixed values like 0x42
- Register: eax, ebx, ecx, and so on
- Memory address: denoted with brackets, like [eax]

Registers

Registers
- General registers: used by the CPU during execution
- Segment registers: used to track sections of memory
- Status flags: used to make decisions
- Instruction pointer: address of the next instruction to execute

Size of Registers
- General registers are all 32 bits in size
- Can be referenced as either 32 bits (edx) or 16 bits (dx)
- Four registers (eax, ebx, ecx, edx) can also be referenced as 8-bit values
  - AL is the lowest 8 bits
  - AH is the next 8 bits (the high byte of AX)
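The overlapping register names are just different windows onto the same 32 bits. A minimal Python sketch using bit masks (the helper name `sub_registers` is hypothetical):

```python
def sub_registers(eax: int) -> dict:
    """Split a 32-bit EAX value into its AX, AL and AH views."""
    eax &= 0xFFFFFFFF          # keep only 32 bits
    ax = eax & 0xFFFF          # low 16 bits
    return {
        "eax": eax,
        "ax": ax,
        "al": ax & 0xFF,         # bits 0-7
        "ah": (ax >> 8) & 0xFF,  # bits 8-15
    }

views = sub_registers(0x12345678)
print({name: hex(v) for name, v in views.items()})
# {'eax': '0x12345678', 'ax': '0x5678', 'al': '0x78', 'ah': '0x56'}
```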

General Registers
- Typically store data or memory addresses
- Normally interchangeable
- Some instructions reference specific registers
  - Multiplication and division use EAX and EDX
- Conventions
  - Compilers use registers in consistent ways
  - EAX contains the return value for function calls

Flags
- EFLAGS is a status register
- 32 bits in size
- Each bit is a flag: set (1) or cleared (0)

Important Flags
- ZF (Zero Flag): set when the result of an operation is zero
- CF (Carry Flag): set when the result is too large or too small for the destination
- SF (Sign Flag): set when the result is negative, or when the most significant bit is set after arithmetic
- TF (Trap Flag): used for debugging; if set, the processor executes only one instruction at a time
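As a rough illustration of the first three flags, the sketch below models a 32-bit unsigned add in Python and derives ZF, CF and SF from the result. The function `add32` is a made-up helper, and it models only this one instruction, not the full EFLAGS behavior.

```python
MASK32 = 0xFFFFFFFF

def add32(a: int, b: int):
    """Model a 32-bit `add` and return (result, flags)."""
    full = (a & MASK32) + (b & MASK32)  # exact sum, may need 33 bits
    result = full & MASK32              # what fits in the destination
    flags = {
        "ZF": int(result == 0),         # result is zero
        "CF": int(full > MASK32),       # sum did not fit in 32 bits
        "SF": result >> 31,             # most significant bit of the result
    }
    return result, flags

print(add32(0xFFFFFFFF, 1))  # (0, {'ZF': 1, 'CF': 1, 'SF': 0})
print(add32(0x7FFFFFFF, 1))  # (2147483648, {'ZF': 0, 'CF': 0, 'SF': 1})
```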

EIP (Extended Instruction Pointer)
- Contains the memory address of the next instruction to be executed
- If EIP contains wrong data, the CPU will fetch non-legitimate instructions and crash
- Buffer overflows target EIP
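Why a buffer overflow can reach EIP is easier to see with a toy stack frame. The layout below (an 8-byte buffer, then a saved EBP, then the return address that `ret` later loads into EIP) is an assumed, simplified layout for illustration; real frames depend on the compiler and its options, and all names and addresses here are hypothetical.

```python
import struct

BUF_SIZE = 8  # assumed size of the local buffer

def build_frame(return_address: int) -> bytearray:
    """Toy frame, low addresses first: [buffer][saved EBP][return address]."""
    saved_ebp = 0xBFFFF000  # arbitrary placeholder value
    return (bytearray(BUF_SIZE)
            + struct.pack("<I", saved_ebp)
            + struct.pack("<I", return_address))

def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Copy data into the buffer without checking its length."""
    frame[0:len(data)] = data

def saved_return_address(frame: bytearray) -> int:
    """Value that `ret` would load into EIP."""
    return struct.unpack_from("<I", frame, BUF_SIZE + 4)[0]

frame = build_frame(0x08048123)
# 12 filler bytes cover the buffer and saved EBP; the last 4 replace the return address.
unchecked_copy(frame, b"A" * 12 + struct.pack("<I", 0x41424344))
print(hex(saved_return_address(frame)))  # 0x41424344
```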

Simple Instructions

Simple Instructions
- mov destination, source
  - Moves data from one location to another
- We use Intel format throughout the book, with destination first
- Remember indirect addressing: [ebx] means the memory location pointed to by EBX

lea (Load Effective Address)
- lea destination, source
- lea eax, [ebx+8]: puts ebx + 8 into eax
- Compare mov eax, [ebx+8]: moves the data at location ebx+8 into eax
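The difference between lea and mov with the same bracketed operand can be sketched in Python with a small byte array standing in for RAM. The helpers `lea` and `mov_load` and the sample values are assumptions for illustration.

```python
import struct

memory = bytearray(64)                           # toy RAM
struct.pack_into("<I", memory, 0x18, 0xCAFEBABE)  # dword stored at address 0x18
ebx = 0x10

def lea(base: int, disp: int) -> int:
    """lea eax, [base+disp]: compute the address only, no memory access."""
    return (base + disp) & 0xFFFFFFFF

def mov_load(mem: bytearray, base: int, disp: int) -> int:
    """mov eax, [base+disp]: read the 32-bit value stored at that address."""
    return struct.unpack_from("<I", mem, base + disp)[0]

print(hex(lea(ebx, 8)))               # 0x18
print(hex(mov_load(memory, ebx, 8)))  # 0xcafebabe
```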

Arithmetic
- sub: subtracts
- add: adds
- inc: increments
- dec: decrements
- mul: multiplies
- div: divides

NOP
- Does nothing
- Opcode 0x90
- Commonly used as a NOP sled
- Allows attackers to run code even if they are imprecise about jumping to it

The Stack
- Memory for functions, local variables, and flow control
- Last in, first out
- ESP (Extended Stack Pointer): top of stack
- EBP (Extended Base Pointer): bottom of stack (the base of the current stack frame)
- PUSH puts data on the stack
- POP takes data off the stack
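A toy model of PUSH and POP makes the last-in, first-out behavior concrete. On x86 the stack grows toward lower addresses, so PUSH decrements ESP and POP increments it. The class below is only a sketch, with a made-up name and a tiny 64-byte memory.

```python
import struct

class ToyStack:
    """Minimal model of a 32-bit x86 stack (hypothetical helper)."""

    def __init__(self, size: int = 64):
        self.mem = bytearray(size)
        self.esp = size  # empty stack: ESP starts at the highest address

    def push(self, value: int) -> None:
        self.esp -= 4  # the stack grows toward lower addresses
        struct.pack_into("<I", self.mem, self.esp, value & 0xFFFFFFFF)

    def pop(self) -> int:
        value = struct.unpack_from("<I", self.mem, self.esp)[0]
        self.esp += 4
        return value

stack = ToyStack()
stack.push(1)
stack.push(2)
print(stack.esp)    # 56
print(stack.pop())  # 2 (last in, first out)
print(stack.pop())  # 1
```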

Other Stack Instructions
- To enter a function: call or enter
- To exit a function: leave or ret

Function Calls
- Functions: small programs that do one thing and return, like printf()
- Prologue: instructions at the start of a function that prepare the stack and registers for the function to use
- Epilogue: instructions at the end of a function that restore the stack and registers to their state before the function was called

Stack Frames

Conditionals
- test
  - Compares two values the way AND does, but does not alter them
  - test eax, eax: sets the Zero Flag if eax is zero
- cmp eax, ebx: sets the Zero Flag if the arguments are equal

Branching
- jz loc: jump to loc if the Zero Flag is set
- jnz loc: jump to loc if the Zero Flag is cleared

C Main Method
- Every C program has a main() function
- int main(int argc, char** argv)
- argc contains the number of arguments on the command line
- argv is a pointer to an array of strings containing the arguments

Example
- cp foo bar
- argc = 3
- argv[0] = cp
- argv[1] = foo
- argv[2] = bar
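The same split can be mimicked in Python. This sketch uses `shlex.split` to approximate how a shell breaks a command line into words; the helper name `parse_command_line` is made up, and in a real C program the operating system and C runtime supply argc and argv.

```python
import shlex

def parse_command_line(line: str):
    """Return (argc, argv) for a command line, splitting the way a POSIX shell would."""
    argv = shlex.split(line)
    return len(argv), argv

argc, argv = parse_command_line("cp foo bar")
print(argc)  # 3
print(argv)  # ['cp', 'foo', 'bar']
```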