Assembly Language Programming Introduction

Assembly Language Programming Introduction October 10, 2017

Motto: R7 is used by the processor as its program counter (PC). It is recommended that R7 not be used as a stack pointer. Source: PDP-11 04/34/45/55 processor handbook, Digital Equipment Corporation, 1976.

Why go down to machine language?
- Access to hardware registers of the processor and I/O cards.
- Access to instructions not known to compilers.
- Precise control of code execution in places liable to deadlocks or races at the hardware level.
- Atomic operations such as test-and-set.
- Violating compiler conventions for additional optimization (parameter passing, memory allocation, tail calls, i.e. tail recursion).
- Access to rarely used processor modes, running code stored in ROM, etc.
- Hardware-restricted resources, e.g. embedded systems.

How do we pay for it?
- The coding process is laborious and boring (especially initially).
- It is fantastically easy to make errors.
- Very hard to debug.
- Difficult maintenance.
- Basically unportable (but see "compatibility").
- For typical programs, compiler-generated code is usually better than hand-written code.

Easing the pain
- Only the necessary parts should be written in assembly language.
- Assembly code should be encapsulated behind well-defined interfaces (procedures/functions).
- If possible, generate the assembly code automatically: macros, rewriting rules, patterns, etc.

Viewing generated code
- Use the -S option of the GCC compiler; adding -fverbose-asm does not hurt either.
- Look for places that obviously could be improved.
- Better yet, run a profiler first, to avoid improving rarely executed code.

Computer architecture: preliminary definition
An abstract description of the computer structure that a programmer needs in order to code in machine language (or a similar one).
Note: such a structure can have different hardware implementations, e.g. hard-wired control or microprogramming.

Levels of virtual machine interface
- ISA (Instruction Set Architecture): machine language
- ABI (Application Binary Interface): with operating system services
- API (Application Programming Interface): with libraries

Important processor properties
Basic properties of the computer system architecture that matter to the programmer:
- basic word size,
- memory address space,
- addressing modes,
- instruction set,
- execution time (may depend on the form of the arguments),
- stack organization,
- interrupt system (number of levels).

Organization of a simple computer
- Classical von Neumann model.
- Components: processor, memory, external devices.
- Buses, DMA channels.
- Programs are typically stored in main memory; looking at the bits alone, you cannot tell whether they are program or data.
- Instruction = operation code + arguments.
- Sources of arguments: processor registers, program code, other memory cells.
- Coding format, fields.
- Memory cells. Addressing.
- Bit, byte, word.
- Memory size. Memory cycle.

Simplified processor schema
- General-purpose and special registers
- Arithmetic-logic unit (ALU)
- Instruction decoder
- Instruction counter (program counter)
- Inter-register transfers

General-purpose registers
On the Pentium (32-bit architecture):
- EAX (AX, AH, AL)
- EBX (BX, BH, BL)
- ECX (CX, CH, CL)
- EDX (DX, DH, DL)
- ESI (SI)
- EDI (DI)
- EBP (BP)
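The names in parentheses are narrower views of the same register: AX is the low 16 bits of EAX, AL its low byte and AH the byte above it. A minimal sketch of that aliasing, using a hypothetical `Register32` class (the class and its method names are illustrative, not part of any real toolchain):

```python
class Register32:
    """Toy model of a 32-bit register with 16-bit and 8-bit views (like EAX/AX/AH/AL)."""

    def __init__(self, value=0):
        self.value = value & 0xFFFFFFFF   # keep only 32 bits

    @property
    def low16(self):
        """AX-like view: bits 0..15."""
        return self.value & 0xFFFF

    @property
    def high8(self):
        """AH-like view: bits 8..15."""
        return (self.value >> 8) & 0xFF

    @property
    def low8(self):
        """AL-like view: bits 0..7."""
        return self.value & 0xFF

    def set_low8(self, byte):
        """Write the AL-like view; the upper 24 bits stay unchanged."""
        self.value = (self.value & 0xFFFFFF00) | (byte & 0xFF)


eax = Register32(0x12345678)
print(hex(eax.low16), hex(eax.high8), hex(eax.low8))  # 0x5678 0x56 0x78
eax.set_low8(0xFF)
print(hex(eax.value))                                 # 0x123456ff
```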

Typical special registers
- instruction counter (EIP, not directly accessible),
- instruction register (IR, not accessible),
- processor status/control word (EFLAGS),
- stack pointer (ESP),
- memory address and buffer registers (not accessible),
- segment registers (CS, DS, ES, FS, GS, SS).
Additionally, the general-purpose EBP register is often used as a frame pointer on the stack.

Processor cycle
Typical processor cycle (instruction cycle) = phases of instruction execution:
1. fetch: fetching from memory the instruction pointed to by the instruction counter
2. decode: analyzing the instruction format, finding argument modes
3. read: fetching the argument(s) from memory
4. execute: just that
5. write-back: storing the result in a register or memory
6. interrupt: checking for interrupts
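The phases can be made concrete with a toy simulator. This is only a sketch: the single-accumulator machine, its four opcodes (`LOAD`, `ADD`, `STORE`, `HALT`) and the tuple encoding are all assumptions made for illustration, and the interrupt phase is omitted.

```python
def run(program, memory):
    """Run a hypothetical accumulator machine; return the accumulator at HALT."""
    pc = 0    # instruction counter
    acc = 0   # the only data register
    while True:
        instruction = program[pc]            # 1. fetch
        pc += 1
        opcode, address = instruction        # 2. decode
        if opcode == "HALT":
            return acc
        operand = None
        if opcode in ("LOAD", "ADD"):        # 3. read the memory argument
            operand = memory[address]
        if opcode == "LOAD":                 # 4. execute
            result = operand
        elif opcode == "ADD":
            result = acc + operand
        elif opcode == "STORE":
            result = acc
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        if opcode == "STORE":                # 5. write-back
            memory[address] = result
        else:
            acc = result
        # 6. a real processor would check for pending interrupts here


memory = [2, 3, 0]
program = [("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT", 0)]
print(run(program, memory), memory)  # 5 [2, 3, 5]
```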

Bus
- Maximum frequency is restricted by the so-called bus skew, which results from unequal signal propagation speed on parallel lines.
- Multiplexing addresses and data means bus sharing: the same bus lines are used (in different cycles) for sending addresses and data.
- Additional control lines, e.g. for wait states, compensate for the speed mismatch between processor and memory.

Binary arithmetic and data representation
- Unsigned integers (natural numbers) are represented directly in binary.
- Arithmetic operations on them work as in base 10.
- Carry and borrow.
- Multiple-precision arithmetic.
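Multiple-precision addition works like schoolbook addition, with machine words in place of decimal digits and the carry passed from one word to the next. A minimal sketch, assuming 8-bit words stored least significant first (the function name and the word size are illustrative choices):

```python
def add_multiword(a, b, word_bits=8):
    """Add two equal-length little-endian word lists; return (words, carry_out)."""
    mask = (1 << word_bits) - 1
    result = []
    carry = 0
    for x, y in zip(a, b):
        total = x + y + carry
        result.append(total & mask)   # keep the low word
        carry = total >> word_bits    # carry into the next word (0 or 1)
    return result, carry


# 0x01FF + 0x0001 = 0x0200, stored low word first
print(add_multiword([0xFF, 0x01], [0x01, 0x00]))  # ([0, 2], 0)
```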

Representing signed integer numbers
Variants:
- sign-magnitude,
- one's complement (complement to the decremented base),
- two's complement (complement to the base): the highest bit has negative weight,
- shifted (biased) representation.
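The four variants can be compared by encoding the same value in each. The sketch below assumes 8-bit patterns, inputs that fit the chosen width, and a bias of 2^(bits-1) for the shifted form; the bias in particular is an assumption, since other choices exist.

```python
def sign_magnitude(n, bits=8):
    """Top bit is the sign, the remaining bits hold abs(n)."""
    sign = (1 << (bits - 1)) if n < 0 else 0
    return sign | abs(n)


def ones_complement(n, bits=8):
    """Negative numbers are the bitwise inverse of their magnitude."""
    mask = (1 << bits) - 1
    return n if n >= 0 else ~(-n) & mask


def twos_complement(n, bits=8):
    """Negative numbers are 2**bits - abs(n)."""
    return n & ((1 << bits) - 1)


def shifted(n, bits=8):
    """Biased form: store n + bias (bias assumed to be 2**(bits-1))."""
    return n + (1 << (bits - 1))


def decode_twos_complement(pattern, bits=8):
    """Decode by giving the highest bit the weight -2**(bits-1)."""
    top = (pattern >> (bits - 1)) & 1
    rest = pattern & ((1 << (bits - 1)) - 1)
    return rest - top * (1 << (bits - 1))


for encode in (sign_magnitude, ones_complement, twos_complement, shifted):
    print(f"{encode.__name__:16s} {encode(-5):08b}")
# sign_magnitude   10000101
# ones_complement  11111010
# twos_complement  11111011
# shifted          01111011
```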

Arithmetic operations
- For signed numbers, overflow is used instead of carry.
- BCD representation, with correction codes.
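Both points can be sketched on 8-bit values. `add8` reports carry (meaningful for unsigned operands) and overflow (meaningful for two's complement operands) separately. `bcd_add` adds packed BCD bytes and applies the usual +6 correction whenever a digit exceeds 9. The function names and the 8-bit width are assumptions for illustration.

```python
def add8(a, b):
    """Add two 8-bit patterns; return (result, carry, overflow)."""
    total = a + b
    result = total & 0xFF
    carry = total > 0xFF
    # Signed overflow: the operands share a sign bit that the result does not have.
    overflow = ((a ^ result) & (b ^ result) & 0x80) != 0
    return result, carry, overflow


def bcd_add(a, b):
    """Add two packed-BCD bytes (two decimal digits each); return (result, carry)."""
    low = (a & 0x0F) + (b & 0x0F)
    carry_low = 0
    if low > 9:
        low = (low + 6) & 0x0F    # correction: skip the six unused codes A..F
        carry_low = 1
    high = (a >> 4) + (b >> 4) + carry_low
    carry = 0
    if high > 9:
        high = (high + 6) & 0x0F
        carry = 1
    return (high << 4) | low, carry


print(add8(0x7F, 0x01))            # (128, False, True): 127 + 1 overflows as signed
print(add8(0xFF, 0x01))            # (0, True, False): 255 + 1 carries as unsigned
print(hex(bcd_add(0x38, 0x45)[0])) # 0x83, i.e. 38 + 45 = 83
```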

Real number representation
Floating-point numbers have the form $\mathrm{sign} \cdot 2^{c} \cdot f$, where
- sign is 1 or -1,
- $c$ is an integer,
- $f$ is a fraction.
Normalization: the additional condition $1 > f \ge 1/2$ ensures unique values of $c$ and $f$. For zero, $f = 0$.
- Maximum precision, easy comparisons.
- Arithmetic operations include (temporary) denormalization.
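Python's standard `math.frexp` happens to use the same normalization (a mantissa with magnitude in [1/2, 1)), so it can be used to illustrate the decomposition. The helper name `decompose` is hypothetical. Hardware formats such as IEEE 754 normalize differently, so this only illustrates the form described here.

```python
import math


def decompose(x):
    """Return (sign, c, f) with x == sign * 2**c * f and 1 > f >= 1/2 (f == 0 for zero)."""
    m, c = math.frexp(x)   # x == m * 2**c, with 0.5 <= abs(m) < 1 for nonzero x
    sign = -1 if math.copysign(1.0, x) < 0 else 1
    return sign, c, abs(m)


print(decompose(6.0))        # (1, 3, 0.75)     since 6 = 2**3 * 0.75
print(decompose(-0.15625))   # (-1, -2, 0.625)  since 0.15625 = 2**-2 * 0.625
print(decompose(0.0))        # (1, 0, 0.0)
```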

Processor optimization: pipelined processing
- In pipelined processors, different phases of consecutive instructions are processed in parallel.
- The best speedup is achieved for straight-line sequences of instructions; changes of control flow and interrupts cause complications. In such situations the pipeline has to be emptied and refilled starting from a different address.
- Funny trick on some RISC processors: delayed branch (a.k.a. delay slot). The jump is performed only after executing the next instruction. Of course the compiler should generate appropriate code by reordering instructions (empirically: possible in about 90% of cases).
- Speculative execution for conditional branches.
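A rough cycle count shows why emptying the pipeline hurts. The model below is an idealized assumption, not a description of any real processor: every stage takes one cycle, an uninterrupted run of n instructions on a k-stage pipeline takes k + n - 1 cycles, and each flush costs k - 1 extra cycles unless another penalty is given.

```python
def pipeline_cycles(n_instructions, n_stages, n_flushes=0, flush_penalty=None):
    """Cycles needed in an idealized pipeline (one cycle per stage, no other stalls)."""
    if n_instructions == 0:
        return 0
    if flush_penalty is None:
        flush_penalty = n_stages - 1   # assumed cost of refilling the pipeline
    return n_stages + n_instructions - 1 + n_flushes * flush_penalty


print(100 * 5)                                # 500 cycles without pipelining
print(pipeline_cycles(100, 5))                # 104 cycles, ideal pipeline
print(pipeline_cycles(100, 5, n_flushes=10))  # 144 cycles with ten flushes
```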

Superscalar architecture
- Modern processors contain more than one pipeline, with independent execution units. As a result, instructions can be executed concurrently. Such an architecture is called superscalar.
- This has interesting consequences for optimization: often (e.g. on the Pentium) it is better to replace complex instructions with sequences of simple instructions, because they can be executed in parallel.

Efficiency
Perverse example: we have a program with an execution time of 200 seconds, of which 160 seconds are spent in multiplication. How much faster should the multiplication unit work to achieve a 5-fold speedup of the whole program? Let's call this increase in speed $w$:

$$\frac{200\ \mathrm{s}}{5} = \frac{160\ \mathrm{s}}{w} + (200 - 160)\ \mathrm{s}$$

that is

$$40\ \mathrm{s} = \frac{160\ \mathrm{s}}{w} + 40\ \mathrm{s}$$

No finite $w$ satisfies this equation, because $160\ \mathrm{s}/w$ would have to be zero.
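The same calculation can be done for any target speedup. In this sketch, `required_speedup` is a hypothetical helper: it solves total/target = part/w + (total - part) for w, and returns `None` when the part that is not accelerated already uses up the whole time budget.

```python
def required_speedup(total, part, target):
    """Return the factor w by which `part` must speed up, or None if impossible."""
    budget = total / target - (total - part)   # time left for the accelerated part
    if budget <= 0:
        return None
    return part / budget


print(required_speedup(200, 160, 5))   # None: the other 40 s already fill the 40 s budget
print(required_speedup(200, 160, 4))   # 16.0: multiplication must become 16 times faster
```

With these numbers the overall speedup can only approach 200/40 = 5 and never reach it, however fast the multiplication unit becomes.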

Literature
- B.S. Chalk, Computer Organisation and Architecture: An Introduction
- A.S. Tanenbaum, Structured Computer Organization
- D.A. Patterson, J.L. Hennessy, Computer Organization and Design: The Hardware/Software Interface
Advanced:
- M.L. Schmit, Pentium Processor Optimization Tools