0. The first step of this tutorial is to read the following documents (either in the tutorial or at home):

    - ia-32-architectures-software-developer-system-programming-manual pdf

It is important to understand how the performance mechanism works before basing your measurements on it.

1. Execute the following command to make sure that your chipset is compatible with RDTSC:

    $ cat /proc/cpuinfo

Your CPU chipset should be Intel, and both tsc and rdtscp should appear among the flags.

Intel CPUs have a timestamp counter to keep track of every cycle that occurs on the CPU. Starting with the Intel Pentium processor, the devices have included a per-core timestamp register that stores the value of the timestamp counter and that can be accessed by the RDTSC and RDTSCP assembly instructions. When running a Linux OS, you can check whether your CPU supports the RDTSCP instruction by looking at the flags field of /proc/cpuinfo; if rdtscp is one of the flags, then it is supported.

How it works: the RDTSC instruction loads the high-order 32 bits of the timestamp register into EDX and the low-order 32 bits into EAX.

2. As we did in the last tutorial, write a 1 MB file into /dev/tutorial:

    $ dd if=/dev/random of=/dev/tutorial bs= count=1

3. As noted above, RDTSC returns the timestamp in two halves: the high-order 32 bits in EDX and the low-order 32 bits in EAX. The following is used to obtain the current timestamp in C programs:

    asm volatile ("RDTSC\n\t"
                  "mov %%edx, %0\n\t"
                  "mov %%eax, %1\n\t"
                  : "=r" (cycles_high), "=r" (cycles_low)
                  :: "%eax", "%edx");

To access the complete value, we use:

    uint64_t num_of_cycles = (((uint64_t)cycles_high << 32) | cycles_low);

4. Modify mycat.c (from the previous tutorial) as follows:

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <sys/time.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <inttypes.h>

    int main()
    {
        int fd, ret, mainloop = 0;
        char f[] = "/dev/tutorial";
        char buf[1024];
        struct timeval startread, endread, startwrite, endwrite;
        uint64_t start, end, read_cycles = 0, write_cycles = 0, minread, minwrite;
        unsigned cycles_low, cycles_high, cycles_low1, cycles_high1;

        while (mainloop < 1000) { // Number of trials
            fd = open(f, O_RDONLY);
            if (fd < 0) {
                printf("unable to open file\n");
                return 1;
            }
            ret = read(fd, buf, 1024);
            while (ret > 0) {
                // Timestamp just before write()
                asm volatile ("RDTSC\n\t"
                              "mov %%edx, %0\n\t"
                              "mov %%eax, %1\n\t"
                              : "=r" (cycles_high), "=r" (cycles_low)
                              :: "%eax", "%ebx", "%ecx", "%edx");
                write(1, buf, ret); // write only the bytes actually read
                // Timestamp just after write()
                asm volatile ("RDTSC\n\t"
                              "mov %%edx, %0\n\t"
                              "mov %%eax, %1\n\t"
                              : "=r" (cycles_high1), "=r" (cycles_low1)
                              :: "%eax", "%ebx", "%ecx", "%edx");
                start = ( ((uint64_t)cycles_high << 32) | cycles_low );
                end = ( ((uint64_t)cycles_high1 << 32) | cycles_low1 );
                write_cycles += (end - start);

                // Timestamp just before read()
                asm volatile ("RDTSC\n\t"
                              "mov %%edx, %0\n\t"
                              "mov %%eax, %1\n\t"
                              : "=r" (cycles_high), "=r" (cycles_low)
                              :: "%eax", "%ebx", "%ecx", "%edx");
                ret = read(fd, buf, 1024);
                // Timestamp just after read()
                asm volatile ("RDTSC\n\t"
                              "mov %%edx, %0\n\t"
                              "mov %%eax, %1\n\t"
                              : "=r" (cycles_high1), "=r" (cycles_low1)
                              :: "%eax", "%ebx", "%ecx", "%edx");
                start = ( ((uint64_t)cycles_high << 32) | cycles_low );
                end = ( ((uint64_t)cycles_high1 << 32) | cycles_low1 );
                read_cycles += (end - start);
            }
            close(fd); // release the descriptor before the next trial
            mainloop = mainloop + 1;
        }
        sleep(1);
        printf("\nread cycles: %" PRIu64 "\n", read_cycles);
        printf("write cycles: %" PRIu64 "\n", write_cycles);
        return 0;
    }

5. What is this program doing?

This program measures the total number of cycles it takes to read from the /dev/tutorial device and to write the data to stdout, accumulated over all trials.

6. How would you calculate the average time for both read and write?

You can divide the total number of cycles by the number of iterations of the loop.

NOTE: Remember that even a measurement operation increases the number of CPU cycles and the real execution time of the program! In the next tutorial we will measure both RDTSC and gettimeofday() to see what this overhead is.

7. Finally, let's consider all of these points for future measurements:

Fixing CPUID issues:

- This is a good start: cpuid followed by rdtsc, since the variance in cpuid does not affect the start of the measurement.
- This is a bad end: a single rdtsc; it might be reordered (by the compiler or by the CPU's out-of-order execution) to somewhere that we do not want.

- This is a bad fix for this problem: cpuid then rdtsc; same issue, rdtsc might still move to other positions.
- This is a better fix for the end: cpuid followed by rdtsc, followed by cpuid.
    - The problem is that we are now counting cpuid's time "inside" our measurement.
    - Since cpuid has a high variance, the result would not be reliable.

The solution:

- The preferred way is to use RDTSCP if the CPU supports it. RDTSCP waits until all the instructions that precede it in the program have executed before it reads the counter.

    asm volatile ("CPUID\n\t" // this is outside
                  "RDTSC\n\t"
                  "mov %%edx, %0\n\t"
                  "mov %%eax, %1\n\t"
                  : "=r" (cycles_high), "=r" (cycles_low)
                  :: "%rax", "%rbx", "%rcx", "%rdx");

    /* call the function to measure here */

    asm volatile ("RDTSCP\n\t" // notice that CPUID is no longer inside the measuring range
                  "mov %%edx, %0\n\t"
                  "mov %%eax, %1\n\t"
                  "CPUID\n\t"
                  : "=r" (cycles_high1), "=r" (cycles_low1)
                  :: "%rax", "%rbx", "%rcx", "%rdx");

The not-so-optimal way:

    asm volatile ("CPUID\n\t" ::: "%rax", "%rbx", "%rcx", "%rdx"); // this is outside
    asm volatile ("RDTSC\n\t"
                  "mov %%edx, %0\n\t"
                  "mov %%eax, %1\n\t"
                  : "=r" (cycles_high), "=r" (cycles_low)
                  :: "%rax", "%rdx");

    /* call the function to measure here */

    // CR0 access is privileged, so this variant only runs in kernel mode (ring 0).
    // In a 64-bit build the control register must be moved through a 64-bit register.
    asm volatile ("mov %%cr0, %%rax\n\t"
                  "mov %%rax, %%cr0\n\t" // this causes the serialization
                  "RDTSC\n\t"
                  "mov %%edx, %0\n\t"
                  "mov %%eax, %1\n\t"
                  : "=r" (cycles_high1), "=r" (cycles_low1)
                  :: "%rax", "%rdx");


More information

15-213/ Final Exam Notes Sheet Spring 2013!

15-213/ Final Exam Notes Sheet Spring 2013! Jumps 15-213/18-213 Final Exam Notes Sheet Spring 2013 Arithmetic Operations Jump Condi+on jmp 1 je ZF jne ~ZF js SF jns ~SF jg ~(SF^OF)&~ZF jge ~(SF^OF) jl (SF^OF) jle (SF^OF) ZF ja ~CF&~ZF jb CF Format

More information

last time out-of-order execution and instruction queues the data flow model idea

last time out-of-order execution and instruction queues the data flow model idea 1 last time 2 out-of-order execution and instruction queues the data flow model idea graph of operations linked by depedencies latency bound need to finish longest dependency chain multiple accumulators

More information

Soumava Ghosh The University of Texas at Austin

Soumava Ghosh The University of Texas at Austin Soumava Ghosh The University of Texas at Austin Agenda Overview of programs that perform I/O Linking, loading and the x86 model Modifying programs to perform I/O on the x86 model Interpreting and loading

More information

last time SIMD (single instruction multiple data) hardware idea: wider ALUs and registers Intel s interface _mm

last time SIMD (single instruction multiple data) hardware idea: wider ALUs and registers Intel s interface _mm 1 last time 2 SIMD (single instruction multiple data) hardware idea: wider ALUs and registers Intel s interface _mm sharing the CPU: context switching context = visible CPU state (registers, condition

More information

Protection and System Calls. Otto J. Anshus

Protection and System Calls. Otto J. Anshus Protection and System Calls Otto J. Anshus Protection Issues CPU protection Prevent a user from using the CPU for too long Throughput of jobs, and response time to events (incl. user interactive response

More information

Introduction to 8086 Assembly

Introduction to 8086 Assembly Introduction to 8086 Assembly Lecture 7 Multiplication and Division Multiplication commands: mul and imul mul source (source: register/memory) Unsigned Integer Multiplication (mul) mul src (src: register/memory)

More information

Principles. Performance Tuning. Examples. Amdahl s Law: Only Bottlenecks Matter. Original Enhanced = Speedup. Original Enhanced.

Principles. Performance Tuning. Examples. Amdahl s Law: Only Bottlenecks Matter. Original Enhanced = Speedup. Original Enhanced. Principles Performance Tuning CS 27 Don t optimize your code o Your program might be fast enough already o Machines are getting faster and cheaper every year o Memory is getting denser and cheaper every

More information

Instruction Set Architectures

Instruction Set Architectures Instruction Set Architectures ISAs Brief history of processors and architectures C, assembly, machine code Assembly basics: registers, operands, move instructions 1 What should the HW/SW interface contain?

More information

Practical Malware Analysis

Practical Malware Analysis Practical Malware Analysis Ch 4: A Crash Course in x86 Disassembly Revised 1-16-7 Basic Techniques Basic static analysis Looks at malware from the outside Basic dynamic analysis Only shows you how the

More information

Machine and Assembly Language Principles

Machine and Assembly Language Principles Machine and Assembly Language Principles Assembly language instruction is synonymous with a machine instruction. Therefore, need to understand machine instructions and on what they operate - the architecture.

More information

Formal Verification of x86 Machine-Code Programs

Formal Verification of x86 Machine-Code Programs Formal Verification of x86 Machine-Code Programs Computer Architecture and Program Analysis Shilpi Goel shigoel@cs.utexas.edu Department of Computer Science The University of Texas at Austin Software and

More information

CSE351 Spring 2018, Midterm Exam April 27, 2018

CSE351 Spring 2018, Midterm Exam April 27, 2018 CSE351 Spring 2018, Midterm Exam April 27, 2018 Please do not turn the page until 11:30. Last Name: First Name: Student ID Number: Name of person to your left: Name of person to your right: Signature indicating:

More information

Processes (Intro) Yannis Smaragdakis, U. Athens

Processes (Intro) Yannis Smaragdakis, U. Athens Processes (Intro) Yannis Smaragdakis, U. Athens Process: CPU Virtualization Process = Program, instantiated has memory, code, current state What kind of memory do we have? registers + address space Let's

More information

MACHINE-LEVEL PROGRAMMING I: BASICS

MACHINE-LEVEL PROGRAMMING I: BASICS MACHINE-LEVEL PROGRAMMING I: BASICS CS 429H: SYSTEMS I Instructor: Emmett Witchel Today: Machine Programming I: Basics History of Intel processors and architectures C, assembly, machine code Assembly Basics:

More information

Where We Are. Optimizations. Assembly code. generation. Lexical, Syntax, and Semantic Analysis IR Generation. Low-level IR code.

Where We Are. Optimizations. Assembly code. generation. Lexical, Syntax, and Semantic Analysis IR Generation. Low-level IR code. Where We Are Source code if (b == 0) a = b; Low-level IR code Optimized Low-level IR code Assembly code cmp $0,%rcx cmovz %rax,%rdx Lexical, Syntax, and Semantic Analysis IR Generation Optimizations Assembly

More information

How Software Executes

How Software Executes How Software Executes CS-576 Systems Security Instructor: Georgios Portokalidis Overview Introduction Anatomy of a program Basic assembly Anatomy of function calls (and returns) Memory Safety Programming

More information

4) C = 96 * B 5) 1 and 3 only 6) 2 and 4 only

4) C = 96 * B 5) 1 and 3 only 6) 2 and 4 only Instructions: The following questions use the AT&T (GNU) syntax for x86-32 assembly code, as in the course notes. Submit your answers to these questions to the Curator as OQ05 by the posted due date and

More information