Page 1. Today. Important From Last Time. Is the assembly code right? Is the assembly code right? Which compiler is right?

Similar documents
Important From Last Time

Important From Last Time

Page 1. Today. Last Time. Is the assembly code right? Is the assembly code right? Which compiler is right? Compiler requirements CPP Volatile

Programming in C and C++

Undefined Behaviour in C

Page 1. Stuff. Last Time. Today. Safety-Critical Systems MISRA-C. Terminology. Interrupts Inline assembly Intrinsics

Areas for growth: I love feedback

Process Layout and Function Calls

C++ Undefined Behavior

GCC and Assembly language. GCC and Assembly language. Consider an example (dangeous) foo.s

Important From Last Time

Deep C (and C++) by Olve Maudal

Section Notes - Week 1 (9/17)

QUIZ. What is wrong with this code that uses default arguments?

Hardening LLVM with Random Testing

CPEG421/621 Tutorial

C++ Undefined Behavior What is it, and why should I care?

CSC C69: OPERATING SYSTEMS

Arithmetic and Bitwise Operations on Binary Data

CS 2505 Computer Organization I Test 2. Do not start the test until instructed to do so! printed

Machine-Level Programming V: Unions and Memory layout

Announcements. assign0 due tonight. Labs start this week. No late submissions. Very helpful for assign1

last time Assembly part 2 / C part 1 condition codes reminder: quiz

CSE 374 Programming Concepts & Tools. Hal Perkins Spring 2010

x86 architecture et similia

MISRA-C. Subset of the C language for critical systems

General Syntax. Operators. Variables. Arithmetic. Comparison. Assignment. Boolean. Types. Syntax int i; float j = 1.35; int k = (int) j;

CSE351 Spring 2018, Midterm Exam April 27, 2018

Question Points Score Total: 100

But first, encode deck of cards. Integer Representation. Two possible representations. Two better representations WELLESLEY CS 240 9/8/15

Hacking in C. The C programming language. Radboud University, Nijmegen, The Netherlands. Spring 2018

2/5/2018. Expressions are Used to Perform Calculations. ECE 220: Computer Systems & Programming. Our Class Focuses on Four Types of Operator in C

377 Student Guide to C++

Memory Corruption 101 From Primitives to Exploit

OBJECT ORIENTED PROGRAMMING

CSci 4061 Introduction to Operating Systems. Programs in C/Unix

GDB Tutorial. A Walkthrough with Examples. CMSC Spring Last modified March 22, GDB Tutorial

CS 31: Intro to Systems Pointers and Memory. Kevin Webb Swarthmore College October 2, 2018

CSCI-243 Exam 1 Review February 22, 2015 Presented by the RIT Computer Science Community

Running a C program Compilation Python and C Variables and types Data and addresses Functions Performance. John Edgar 2

18-642: Code Style for Compilers

QUIZ How do we implement run-time constants and. compile-time constants inside classes?

Credits and Disclaimers

Agenda. Peer Instruction Question 1. Peer Instruction Answer 1. Peer Instruction Question 2 6/22/2011

Introduction to C. Robert Escriva. Cornell CS 4411, August 30, Geared toward programmers

Assembly Language II: Addressing Modes & Control Flow

Lecture 05 Integer overflow. Stephen Checkoway University of Illinois at Chicago

edunepal_info

Optimization part 1 1

EL2310 Scientific Programming

Pointers (continued), arrays and strings

Computer Systems Lecture 9

Representing and Manipulating Integers. Jo, Heeseung

Deep C by Olve Maudal

Computer Security. Robust and secure programming in C. Marius Minea. 12 October 2017

THE EVALUATION OF OPERANDS AND ITS PROBLEMS IN C++

HW1 due Monday by 9:30am Assignment online, submission details to come

CS 61C: Great Ideas in Computer Architecture. C Arrays, Strings, More Pointers

AS08-C++ and Assembly Calling and Returning. CS220 Logic Design AS08-C++ and Assembly. AS08-C++ and Assembly Calling Conventions

CIS 2107 Computer Systems and Low-Level Programming Spring 2012 Final

Draft. Chapter 1 Program Structure. 1.1 Introduction. 1.2 The 0s and the 1s. 1.3 Bits and Bytes. 1.4 Representation of Numbers in Memory

CSE 351 Section 4 GDB and x86-64 Assembly Hi there! Welcome back to section, we re happy that you re here

A heap, a stack, a bottle and a rack. Johan Montelius HT2017

What is concurrency? Concurrency. What is parallelism? concurrency vs parallelism. Concurrency: (the illusion of) happening at the same time.

COMP322 - Introduction to C++ Lecture 02 - Basics of C++

Pointers (continued), arrays and strings

High Performance Computing in C and C++

C++ Tutorial AM 225. Dan Fortunato

Lecture 2: C Programm

Changelog. Assembly (part 1) logistics note: lab due times. last time: C hodgepodge

Hacking in C. Pointers. Radboud University, Nijmegen, The Netherlands. Spring 2019

Hacking in C. The C programming language. Radboud University, Nijmegen, The Netherlands. Spring 2018

Corrections made in this version not seen in first lecture:

Concurrency. Johan Montelius KTH

File Handling in C. EECS 2031 Fall October 27, 2014

Question Points Score Total: 100

CS61 Section Solutions 3

CS61, Fall 2012 Midterm Review Section

CSE 303: Concepts and Tools for Software Development

CprE 288 Introduction to Embedded Systems Exam 1 Review. 1

CSE 374 Programming Concepts & Tools. Hal Perkins Spring 2010

CS Bootcamp x86-64 Autumn 2015

CSE351 Autumn 2012 Midterm Exam (5 Nov 2012)

AGENDA Binary Operations CS 3330 Samira Khan

CSE 374 Programming Concepts & Tools

x86 assembly CS449 Spring 2016

Handling of Interrupts & Exceptions

C Language Part 1 Digital Computer Concept and Practice Copyright 2012 by Jaejin Lee

Bits, Bytes, and Integers Part 2

Machine Programming 1: Introduction

Agenda. CS 61C: Great Ideas in Computer Architecture. Lecture 2: Numbers & C Language 8/29/17. Recap: Binary Number Conversion

DEPARTMENT OF MATHS, MJ COLLEGE

A Fast Review of C Essentials Part I

CS 61C: Great Ideas in Computer Architecture. Lecture 2: Numbers & C Language. Krste Asanović & Randy Katz

15-213/18-243, Fall 2010 Exam 1 - Version A

What the CPU Sees Basic Flow Control Conditional Flow Control Structured Flow Control Functions and Scope. C Flow Control.

x86 assembly CS449 Fall 2017

CSCE 548 Building Secure Software Integers & Integer-related Attacks & Format String Attacks. Professor Lisa Luo Spring 2018

Introduction to C. Zhiyuan Teo. Cornell CS 4411, August 26, Geared toward programmers

x86 Assembly Crash Course Don Porter

Transcription:

Important From Last Time Today Embedded C Pros and cons Macros and how to avoid them Intrinsics Interrupt syntax Inline assembly Advanced C What C programs mean How to create C programs that mean nothing The point: Embedded systems need to work all the time You cannot create systems that really work unless you understand your programming language and your tools This is a major theme for the rest of this class Is the assembly code right? int my_loop (int base) int index, count = 0; for (index = base; index < (base+10); index++) count++; urn count; my_loop: movl $10, %eax Is the assembly code right? int my_compare (void) signed char a = 1; unsigned char b = -1; urn (a > b); my_compare: movl $1, %eax Which compiler is right? int another_compare (void) urn -1 < (unsigned short)1; gcc 4.3.2 for x86: gcc 3.2.3 for msp430: another_compare: movl $1, %eax another_compare: mov #llo(0), r15 $ gcc foo.c o foo $./foo Segmentation fault (core dumped) $ gcc O2 foo.c o foo $./foo Hello, world. Is it OK for the optimizer to turn a broken program into a working one? What about the other way around is it OK for the optimizer to break a working program? Page 1

What mathematical function is equivalent to this C function? unsigned foo (unsigned a, unsigned b) urn a+b; How about this function? int foo (int a, int b) urn a+b; foo ( a, b) = ( a + b) mod 2 CHAR_BIT *sizeof(unsigned) a + b foo( a, b) = if INT_MIN (a + b) INT_MAX undefined otherwise What mathematical function is equivalent to Internet Explorer 7? What does each of these mean? x = y = z; v++ v + v++ *p + (i=1) (x=0) + (x=0) i = i + 1; i = (i = i + 1); Point of all this? Arithmetic, logical, and comparison operators are not equivalent to their mathematical counterparts Expression evaluation is nontrivial Other parts of the C language have similar counterintuitive We need a way to figure out what programs mean The C standard is an English language description of this It is a free download How To Think About C The C standard describes an abstract machine Think of it as a simple interper for C For everything your program does, the C abstract machine tells us the result C implementation has to act as if it implements the computation described by the abstract machine But actually, it may do things very differently Page 2

int my_loop (int base) int index, count = 0; for (index = base; index < (base+10); index++) count++; urn count; However If you program breaks certain rules, the C implementation can do anything it wants It s very hard to create C programs that provably don t break the rules Let s look in more detail about things you can do in C 4 basic categories my_loop: movl $10, %eax Some operations are defined to behave in a certain way for all C implementations (1+1) a[5]=3; where a[5] is in-bounds *p where p is int *p and p points to an int if (z) where z is initialized As a programmer your goal is to execute mostly operations with well-defined Some operations have implementationdefined The C implementation chooses how to implement the The choice must be consistent and documented Examples Sizes of various integers (long, short, etc.) Integer representation Two s complement? Ones complement? Sign magnitude? Effect of bitwise operations on signed values Floating-point rounding Use of implementation-defined constructs in unavoidable in real C programs This can limit portability of code Some operations have unspecified Implementation has freedom of choice Can make a different choice each time Examples Value of padding bytes in structures Order of evaluation of subexpressions Order of evaluation of function arguments Total of 53 kinds of unspecified mentioned in the C standard Code that may depend on unspecified : foo (x(),y()); Code that definitely has unspecified : printf ( a ) + printf ( b ) + printf ( c ) Your program must never rely on something that is unspecified Try this code at different optimization levels Page 3

Some operations have undefined Consequences are arbitrary Undefined is always a serious bug Examples Null pointer dereference Improper type cast Out of bounds array access Divide by zero Signed integer overflow Shift by negative or past bitwidth Read uninitialized value Access to dead stack variable Double-free, use-after-free Total of about 190 kinds undefined in the C standard However, some can be reliably detected at compile time In practice, what happens when your problem executes an operation with undefined? Maybe the program does just what you expected Maybe it crashes Maybe nothing obvious program appears to continue normally but it is corrupted somehow The vast majority of security holes in C applications are the result of undefined Type 1 Functions Well-defined for all inputs int32_t safe_div_int32_t (int32_t a, int32_t b) if ((b == 0) ((a == INT32_MIN) && (b == -1))) report_integer_math_error(); urn 0; else urn a / b; Type 3 Functions Function always has undefined Never write a function like this! In practice they happen by accident Compiler will often silently eliminate some or all code in a function like this int bad (void) int x; urn x; Another Type 3 Function void str2 (void) char *s = "hello"; printf("%s\n", s); s[0] = 'H'; printf("%s\n", s); Why is it type 3? What are the compiler s obligations? str2: subq $8, %rsp movl $.LC1, %edx movl $.LC0, %esi movl $1, %edi xorl %eax, %eax call printf_chk movl $.LC1, %edx movl $.LC0, %esi movl $1, %edi xorl %eax, %eax addq $8, %rsp jmp printf_chk Page 4

Type 2 Functions Compiling Type 2 Funcs Has undefined for some inputs int32_t div_int32_t (int32_t a, int32_t b) urn a / b; int stupid (int a) urn (a+1) > a; What is this function s precondition? When is it OK to call this function? When is it OK to write this function? Compiling Type 2 Funcs Another Type 2 Case 1: a!= INT_MAX Behavior of + is defined Computer is obligated to urn 1 Case 2: a == INT_MAX Behavior of + is undefined Compiler has no particular obligations Generated code by gcc O2 : stupid: movl $1, %eax void devexit agnx_pci_remove (struct pci_dev *pdev) struct ieee80211_hw *dev = pci_get_drvdata(pdev); struct agnx_priv *priv = dev->priv; if (!dev) urn;... do stuff using dev... Case Analysis Case 1: dev == NULL dev->priv has undefined Compiler has no particular obligations Case 2: dev!= NULL Null pointer check won t fail Null pointer check is dead code and may be deleted This is real Linux kernel code! Since 2009 the Linux kernel us compiled using a special GCC flag that say never to delete null pointer checks Why not just fix the code? Why not require the C implementation to emit a compile-time warning when a program might contain undefined? Why not require that the C implementation throw an exception in order to avoid undefined? How should you deal with undefined? Page 5

Signed/Unsigned in C Operators like +, -, <, <= in C have signed and unsigned versions The version that gets chosen depends on the signs of the operands Rule: If at least one operand is unsigned, the operator is unsigned int a,b; unsigned c,d; (a < b) (c < d) (a < c) Integer Promotion Operators like +, -, <, <= in C have different versions for different types float, double int, long, long long Rule: Both operands are promoted to int before the operator executes char c1, c2; c1 = c1 + c2; Tricky: If an int can hold all of the values in the original type, a value is promoted to int; if not, it is promoted to unsigned int So, integer promotions always preserve value Side Effects If one of the operands is larger than an int, the other argument is promoted (if necessary) to that size The type of the result of an arithmetic operator is the promoted type of the operands The type of the result of a comparison operator is int, regardless of the types of the operands Integer promotion is performed before the operator is chosen to be signed vs. unsigned A C program interacts with the world using side effects Side effects are Accessing a volatile object Calling a function that is side-effecting Side effects do not occur immediately, but may be kept pending Why would this seem like a good idea? Sequence Points A sequence point in C is a barrier that side effects cannot pass When a sequence point is reached All previous side effects must have taken effect No subsequent side effects can have taken effect Between a pair of sequence points, side effects can occur in any order It s your problem to ensure that your code contains enough sequence points to make it correct Finding Sequence Points Point of calling a function, after all arguments are evaluated End of evaluating the first operand to && or End of evaluating the first operand to? : End of each operand to the comma operator Completing the evaluation of a full expression, defined as: Evaluating an initializer Expression in a regular statement terminated by a ; Controlling expressions in do, while, switch, for The other two expressions in a for Expression in a urn Page 6

More Sequence Points Important C standard tells us that Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored. Violating this rule leads to undefined So don t both read and write any single variable in between a pair of sequence points However, ++ and are OK But just once per variable Sequence points are about the abstract machine They have nothing to do with the generated code E.g. a++; b++; Can be translated to: incl b; incl a; Why? Is the assembly code right? int my_loop (int base) int index, count = 0; for (index = base; index < (base+10); index++) count++; urn count; my_loop: movl $10, %eax Is the assembly code right? int my_compare (void) signed char a = 1; unsigned char b = -1; urn (a > b); my_compare: movl $1, %eax Which compiler is right? int another_compare (void) urn -1 < (unsigned short)1; gcc 4.3.2 for x86: gcc 3.2.3 for msp430: another_compare: movl $1, %eax another_compare: mov #llo(0), r15 $ gcc foo.c o foo $./foo Segmentation fault (core dumped) $ gcc O2 foo.c o foo $./foo Hello, world. Is it OK for the optimizer to turn a broken program into a working one? What about the other way around is it OK for the optimizer to break a working program? Page 7

What does each of these mean? x = y = z; v++ v + v++ *p + (i=1) (x=0) + (x=0) i = i + 1; i = (i = i + 1); Summary To write effective C code you need to understand and follow a lot of rules Your code must never rely on unspecified or execute an operation with undefined Sequence points are your friend Mixing signed and unsigned values usually leads to trouble Page 8