Lab 03 - x86-64: atoi

Similar documents
CSCI0330 Intro Computer Systems Doeppner. Lab 02 - Tools Lab. Due: Sunday, September 23, 2018 at 6:00 PM. 1 Introduction 0.

18-600: Recitation #3

Project Data: Manipulating Bits

Exercise Session 6 Computer Architecture and Systems Programming

ECE 375: Computer Organization and Assembly Language Programming

a translator to convert your AST representation to a TAC intermediate representation; and

CSE 410: Systems Programming

Using the KD30 Debugger

CS & IT Conversions. Magnitude 10,000 1,

Bits, Words, and Integers

ENGR 3950U / CSCI 3020U (Operating Systems) Simulated UNIX File System Project Instructor: Dr. Kamran Sartipi


COP Programming Assignment #7

printf( Please enter another number: ); scanf( %d, &num2);

We first learn one useful option of gcc. Copy the following C source file to your

Course Schedule. CS 221 Computer Architecture. Week 3: Plan. I. Hexadecimals and Character Representations. Hexadecimal Representation

CSE 351 Section 4 GDB and x86-64 Assembly Hi there! Welcome back to section, we re happy that you re here

Level ISA3: Information Representation

CSCI341. Lecture 22, MIPS Programming: Directives, Linkers, Loaders, Memory

CS 211 Programming Practicum Fall 2018

Source level debugging. October 18, 2016

Final Programming Project

CSE 351. GDB Introduction

Representation of Information

Recitation: Bomb Lab. September 17 th 2018

Jackson State University Department of Computer Science CSC / Advanced Information Security Spring 2013 Lab Project # 5

,879 B FAT #1 FAT #2 root directory data. Figure 1: Disk layout for a 1.44 Mb DOS diskette. B is the boot sector.

CSci 4061 Introduction to Operating Systems. Programs in C/Unix

Question Points Score Total: 100

GDB cheatsheet - page 1

CSE 361 Fall 2017 Lab Assignment L2: Defusing a Binary Bomb Assigned: Wednesday Sept. 20 Due: Wednesday Oct. 04 at 11:59 pm

CS 429H, Spring 2012 Optimizing the Performance of a Pipelined Processor Assigned: March 26, Due: April 19, 11:59PM

2. MACHINE REPRESENTATION OF TYPICAL ARITHMETIC DATA FORMATS (NATURAL AND INTEGER NUMBERS).

Part III Appendices 165

CS Programming In C

CS 2150 Exam 2, Spring 2018 Page 1 of 6 UVa userid:

CS367 Test 1 Review Guide

CSE 361S Intro to Systems Software Lab #4

CS/ECE 252: INTRODUCTION TO COMPUTER ENGINEERING UNIVERSITY OF WISCONSIN MADISON

LAB A Translating Data to Binary

LAB WORK NO. 3 TURBO DEBUGGER ENVIRONMENT

Adding Binary Integers. Part 5. Adding Base 10 Numbers. Adding 2's Complement. Adding Binary Example = 10. Arithmetic Logic Unit

ECOM 2325 Computer Organization and Assembly Language. Instructor: Ruba A.Salamah INTRODUCTION

Programming Assignment IV Due Thursday, June 1, 2017 at 11:59pm

Lab 1 Introduction to UNIX and C

Programming Assignment IV Due Thursday, November 18th, 2010 at 11:59 PM

Tutorial 1 C Tutorial: Pointers, Strings, Exec

Midterm 1 topics (in one slide) Bits and bitwise operations. Outline. Unsigned and signed integers. Floating point numbers. Number representation

Ceng Recitation 1 - Defusing a Binary Bomb

World Inside a Computer is Binary

CS61C : Machine Structures

CS107, Lecture 3 Bits and Bytes; Bitwise Operators

Computer Architecture and System Software Lecture 02: Overview of Computer Systems & Start of Chapter 2

Homework 1 CS 642: Information Security

CS143 Handout 05 Summer 2011 June 22, 2011 Programming Project 1: Lexical Analysis

Number System. Introduction. Decimal Numbers

Number representations

1 The Simpletron Assembly Language

UNIVERSITY OF NEBRASKA AT OMAHA Computer Science 4500/8506 Operating Systems Summer 2016 Programming Assignment 1 Introduction The purpose of this

CS 361S - Network Security and Privacy Spring Project #2

Short Notes of CS201

CS354 gdb Tutorial Written by Chris Feilbach

Number Systems Using and Converting Between Decimal, Binary, Octal and Hexadecimal Number Systems

Agenda. Peer Instruction Question 1. Peer Instruction Answer 1. Peer Instruction Question 2 6/22/2011

Lab 09 - Virtual Memory

Decimal & Binary Representation Systems. Decimal & Binary Representation Systems

University of California, Berkeley College of Engineering

An Introduction to Komodo

GDB Tutorial. A Walkthrough with Examples. CMSC Spring Last modified March 22, GDB Tutorial

CS 4410, Fall 2017 Project 1: My First Shell Assigned: August 27, 2017 Due: Monday, September 11:59PM

CS155: Computer Security Spring Project #1. Due: Part 1: Thursday, April pm, Part 2: Monday, April pm.

CS201 - Introduction to Programming Glossary By

Learning Objectives. A Meta Comment. Exercise 1. Contents. From CS61Wiki

EE 355 Lab 3 - Algorithms & Control Structures

Lab 4. Out: Friday, February 25th, 2005

Final Exam. 11 May 2018, 120 minutes, 26 questions, 100 points

Intro to Programming. Unit 7. What is Programming? What is Programming? Intro to Programming

CNIT 127: Exploit Development. Ch 3: Shellcode. Updated

Hexadecimal Numbers. Journal: If you were to extend our numbering system to more digits, what digits would you use? Why those?

EECE.2160: ECE Application Programming

Project #1 Exceptions and Simple System Calls

SU 2017 May 11/16 LAB 2: Character and integer literals, number systems, character arrays manipulation, relational operator

The C Programming Language Guide for the Robot Course work Module

EXPERIMENT 1. FAMILIARITY WITH DEBUG, x86 REGISTERS and MACHINE INSTRUCTIONS

Programming Project 4: COOL Code Generation

CSC201, SECTION 002, Fall 2000: Homework Assignment #2

Variables and Data Representation

Memory Addressing, Binary, and Hexadecimal Review

CSCI0330 Intro Computer Systems Doeppner. Project Shell 2. Due: November 8, 2017 at 11:59pm. 1 Introduction 2

CS 107 Lecture 2: Bits and Bytes (continued)

CSCI 1100L: Topics in Computing Lab Lab 11: Programming with Scratch

Introduction to MiniSim A Simple von Neumann Machine

CS 105 Malloc Lab: Writing a Dynamic Storage Allocator See Web page for due date

CS61C Machine Structures. Lecture 4 C Pointers and Arrays. 1/25/2006 John Wawrzynek. www-inst.eecs.berkeley.edu/~cs61c/

Recitation: Attack Lab

Embedded Systems - FS 2018

Part 1 (70% of grade for homework 2): MIPS Programming: Simulating a simple computer

CS 211 Programming Practicum Spring 2018

CSCI-1200 Data Structures Spring 2014 Lecture 5 Pointers, Arrays, Pointer Arithmetic

Introduction to C. Why C? Difference between Python and C C compiler stages Basic syntax in C

Transcription:

CSCI0330 Intro Computer Systems Doeppner Lab 03 - x86-64: atoi Due: October 1, 2017 at 4pm 1 Introduction 1 2 Assignment 1 2.1 Algorithm 2 3 Assembling and Testing 3 3.1 A Text Editor, Makefile, and gdb 3 4 Useful Hints 4 5 Getting Checked Off 5 1 Introduction Many programming languages, such as Java, provide a convenient mechanism for converting between integers and strings. For example, Java allows integers to be concatenated with an empty string for conversion in one direction, and provides the static method Integer.parseInt() for conversion in the other direction. C provides a function atoi() (ASCII-to-integer), which 1 converts a null-terminated ASCII character string into an integer, in its standard library <stdlib.h>, and provides sprintf() to convert integers and other data types to character strings. Such language features abstract the underlying representation of these data types from the programmer, making them easier to use. x86-64 assembly, however, is a language without such features. All data are treated the same by the language instructions it is the responsibility of the programmer to interpret each datum and to utilize it appropriately. 2 Assignment This handout contains the following files: 1 Rather than store metadata about the size of a string, strings in C are represented simply as arrays of characters. Consequently, a different method of determining the end of a string is needed. C does this by using the null-terminating character \0' as the last character in the array.

atoi.c : a C file which calls your x86-64 atoi() function. my atoi.h : a header file declaring your x86-64 atoi() function. my atoi.s : the assembly file in which you will be writing atoi(). Makefile : a makefile. lab03.pdf : this document. In this lab, you will be implementing the atoi() function in x86-64. Converting between these data types will require you to deal directly with the byte-level representation of strings. You will be writing your code in the provided file my_atoi.s. The code for dealing with function calls is already provided in this file; you will only be writing the function body of atoi(). Also provided is a short C program which calls your function; use it to test your assembly program. To get started, run the command cs0330_install lab03. There are several ways to implement atoi(). The easiest algorithm is provided below. 2.1 Algorithm The simplest approach to implementing atoi() is to iterate through the characters of the string from left to right, updating an accumulator with the growing integer value. Initially, the accumulator is set to 0. At each character c, you multiply the accumulator by 10, and then add the value of character c (keeping in mind that the character's ASCII code value is not the same as its integer value). When you reach the end of the array or a non-integer character, return the number in the accumulator: atoi(char[] array): int accumulator = 0 for each character c in array until first non-digit character: accumulator = accumulator * 10 accumulator = accumulator + (integer represented by c) return accumulator Some advice: Each ASCII character is exactly one byte long. Make sure that each x86-64 instruction you use to retrieve and manipulate each character operates on a single byte. Each subsequent character is therefore offset exactly one byte from the previous character. To retrieve the integer value of a digit character, you can subtract 48. Why? (hint: consider the ASCII values for each digit character). The result for atoi of any non-digit value, like hi, should be 0. Non-digit inputs are illegal inputs for atoi()! So atoi should stop trying to convert to an int at the first non-digit character. For example, 55g should return 55.

3 Assembling and Testing For this lab, you will use a text editor with the provided Makefile and gdb to write and run x86-64. 3.1 A Text Editor, Makefile, and gdb You are provided with a makefile, which will build the atoi program for you. To build atoi, run make atoi, or just make. To clean up, run make clean. After building the testing program, you can run it by typing./atoi or on the command line in your lab handout directory. This program will print the output generated by your program for all input you enter. Use it to verify the correctness of your output. Typing CTRL-D on a blank line (which closes the input stream) will terminate the test program. In the previous lab you learned how to use gdb to debug C programs. gdb can also be used to debug x86-64 programs; several additional instructions, as well as the executable objdump, will make debugging x86-64 code a much less challenging task than it otherwise would be. Run your x86-64 program in gdb the same way you would run a C program: gdb <executable>. Several new gdb instructions are very useful for debugging x86-64: break *<address> sets a breakpoint at the given address; stepi steps through a single x86-64 instruction; print can print values stored in registers and at memory addresses. To print the value contained in a register, use print followed by the register name with the % replaced by a $. For example, print $rbx prints the value stored in %rbx. You can use the values in registers as variables as well. For example, the following are valid gdb commands: print $rbx * 7 + 3 print *(void**)($rsp + $rax) You can access the value of the program counter with the variable $rip. Lastly, you can format numbers using different bases with various print arguments: /x formats data as a hexadecimal number, which is extremely useful for printing pointers; /d formats data as a decimal number; and /t formats data as a binary number. info registers prints the value contained in each register and condition code.

disas is used to disassemble the program, producing the x86-64 instructions corresponding to the raw bytes of the program. With no argument, this command disassembles the current function. The command can also take arguments such as a function name or an address range. A handy GDB cheatsheet can be found on the class website, or on page 255 of the textbook. In addition to using GDB, you may want to see the assembly code of your program in its entirety. The executable objdump provides a convenient way to do this. When run with the -d flag, objdump disassembles an executable file into x86-64, providing the address of each instruction next to the instruction itself in its output. Run objdump -d <executable> to see the disassembled x86-64. You may find it helpful to pipe the output of objdump to a file, which you can then view in the text editor of your choice. 4 Useful Hints On the website, you can find the x86-64 cheatsheet which contains a lot of useful information about x86-64 registers, conventions, and instructions. Use this document as a quick reference as you work with x86-64. All the information it provides except for the information on procedure calls should be useful for this lab. If your atoi() function is given a string that is not a valid integer, it should not crash. However, like the atoi() function provided in C, it does not need to recognize this fact or display an error message. For the purposes of this assignment, your function should return the integer corresponding to all digits up to the first non-digit character in the string. If the first character is a non-digit character, your function should return 0. You do not need to handle negative integers as input. You should consider your integers unsigned for the purpose of this assignment the TA-provided test programs will cast negative inputs to their unsigned equivalents. Similarly, there is no need to handle overflow. It is useful to write out your program in C before you attempt to write it in x86-64. You should always provide a C or C-like high-level description of what your assembly code is doing in the comments to any code you write in assembly language it makes your code much easier to follow. A common bug is to use the incorrect addressing mode, particularly with absolute addresses. Consult the x86-64 Cheat Sheet section on addressing modes if your code seems to be producing a lot of segmentation faults. Remember that strings in C are null-terminated when your loop reaches the null byte, it has reached the end of the string.

The purpose of this lab is to become comfortable coding in x86-64. With that in mind, the TAs will be willing to answer any conceptual questions that you may have. Assembly language can be very challenging at first don t hesitate to ask for help if you need it. 5 Getting Checked Off Once you ve completed your atoi() program, you ve finished the lab. Submit your work using 33lab_checkoff lab03 [--verbose]. You can run this script as many times as you wish to check what your grade was - it will never lower your grade. Remember to read the course missive for information about course requirements and policies regarding labs and assignments.