Software Testing CS 408. Lecture 10: Compiler Testing 2/15/18

1 Software Testing CS 408 Lecture 10: Compiler Testing 2/15/18

2 Compilers
A compiler is clearly a critical part of any software system, and it is itself a complex piece of software.
- How should compilers be tested?
- Random testing?
- Black-box methods?
- White-box methods?
Challenges
- Reasoning about the input space
- Understanding program transformations

3 Finding and Understanding Bugs in C Compilers
Xuejun Yang, Yang Chen, Eric Eide, John Regehr
University of Utah, School of Computing
{jxyang, chenyang, eeide, regehr}@cs.utah.edu

Compiler Validation via Equivalence Modulo Inputs
Vu Le, Mehrdad Afshari, Zhendong Su
Department of Computer Science, University of California, Davis, USA
{vmle, mafshari, su}@ucdavis.edu

Abstract (Finding and Understanding Bugs in C Compilers)
Compilers should be correct. To improve the quality of C compilers, we created Csmith, a randomized test-case generation tool, and spent three years using it to find compiler bugs. During this period we reported more than 325 previously unknown bugs to compiler developers. Every compiler we tested was found to crash and also to silently generate wrong code when presented with valid input. In this paper we present our compiler-testing tool and the results of our bug-hunting study. Our first contribution is to advance the state of the art in compiler testing. Unlike previous tools, Csmith generates programs that cover a large subset of C while avoiding the undefined and unspecified behaviors that would destroy its ability to automatically find wrong-code bugs. Our second contribution is a collection of qualitative and quantitative results about the bugs we have found in open-source C compilers.

Categories and Subject Descriptors: D.2.5 [Software Engineering]: Testing and Debugging - testing tools; D.3.2 [Programming Languages]: Language Classifications - C; D.3.4 [Programming Languages]: Processors - compilers
General Terms: Languages, Reliability
Keywords: compiler testing, compiler defect, automated testing, random testing, random program generation

1. Introduction
The theory of compilation is well developed, and there are compiler frameworks in which many optimizations have been proved correct. Nevertheless, the practical art of compiler construction involves a morass of trade-offs between compilation speed, code quality, code debuggability, compiler modularity, compiler retargetability, and other goals.
It should be no surprise that optimizing compilers, like all complex software systems, contain bugs. Miscompilations often happen because optimization safety checks are inadequate, static analyses are unsound, or transformations are flawed. These bugs are out of reach for current and future automated program-verification tools because the specifications that need to be checked were never written down in a precise way, if they were written down at all. Where verification is impractical, however, other methods for improving compiler quality can succeed. This paper reports our experience in using testing to make C compilers better.

© ACM. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the 2011 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), San Jose, CA, Jun. 2011.

    int foo (void) {
      signed char x = 1;
      unsigned char y = 255;
      return x > y;
    }

Figure 1. We found a bug in the version of GCC that shipped with Ubuntu Linux for x86. At all optimization levels it compiles this function to return 1; the correct result is 0. The Ubuntu compiler was heavily patched; the base version of GCC did not have this bug.

We created Csmith, a randomized test-case generator that supports compiler bug-hunting using differential testing. Csmith generates a C program; a test harness then compiles the program using several compilers, runs the executables, and compares the outputs. Although this compiler-testing approach has been used before [6, 16, 23], Csmith's test-generation techniques substantially advance the state of the art by generating random programs that are expressive (containing complex code using many C language features) while also ensuring that every generated program has a single interpretation.
To have a unique interpretation, a program must not execute any of the 191 kinds of undefined behavior, nor depend on any of the 52 kinds of unspecified behavior, that are described in the C99 standard.

For the past three years, we have used Csmith to discover bugs in C compilers. Our results are perhaps surprising in their extent: to date, we have found and reported more than 325 bugs in mainstream C compilers including GCC, LLVM, and commercial tools. Figure 1 shows a representative example. Every compiler that we have tested, including several that are routinely used to compile safety-critical embedded systems, has been crashed and also shown to silently miscompile valid inputs. As measured by the responses to our bug reports, the defects discovered by Csmith are important. Most of the bugs we have reported against GCC and LLVM have been fixed. Twenty-five of our reported GCC bugs have been classified as P1, the maximum, release-blocking priority for GCC defects. Our results suggest that fixed test suites, the main way that compilers are tested, are an inadequate mechanism for quality control.

We claim that Csmith is an effective bug-finding tool in part because it generates tests that explore atypical combinations of C language features. Atypical code is not unimportant code, however; it is simply underrepresented in fixed compiler test suites. Developers who stray outside the well-tested paths that represent a compiler's "comfort zone" (for example by writing kernel code or embedded systems code, using esoteric compiler options, or automatically generating code) can encounter bugs quite frequently. This is a significant problem for complex systems. Wolfe [30], talking about independent software vendors (ISVs), says: "An ISV with a complex code can work around correctness, turn off the optimizer in one or two files, and usually they have to do that for any of the compilers they use" (emphasis ours).
As another example, the front

Abstract (Compiler Validation via Equivalence Modulo Inputs)
We introduce equivalence modulo inputs (EMI), a simple, widely applicable methodology for validating optimizing compilers. Our key insight is to exploit the close interplay between (1) dynamically executing a program on some test inputs and (2) statically compiling the program to work on all possible inputs. Indeed, the test inputs induce a natural collection of the original program's EMI variants, which can help differentially test any compiler and specifically target the difficult-to-find miscompilations. To create a practical implementation of EMI for validating C compilers, we profile a program's test executions and stochastically prune its unexecuted code. Our extensive testing in eleven months has led to 147 confirmed, unique bug reports for GCC and LLVM alone. The majority of those bugs are miscompilations, and more than 100 have already been fixed. Beyond testing compilers, EMI can be adapted to validate program transformation and analysis systems in general. This work opens up this exciting, new direction.

Categories and Subject Descriptors: D.2.5 [Software Engineering]: Testing and Debugging - testing tools; D.3.2 [Programming Languages]: Language Classifications - C; H.3.4 [Programming Languages]: Processors - compilers
General Terms: Algorithms, Languages, Reliability, Verification
Keywords: Compiler testing, miscompilation, equivalent program variants, automated testing

1. Introduction
Compilers are among the most important, widely-used and complex software ever written. Decades of extensive research and development have led to much increased compiler performance and reliability. Perhaps less known to application programmers is that production compilers do also contain bugs, and in fact quite a few. However, compiler bugs are hard to recognize from the much more frequent bugs in applications because often they manifest only indirectly as application failures.
Thus, when compiler bugs occur, they frustrate programmers and may lead to unintended application behavior and disasters, especially in safety-critical domains. Compiler verification has been an important and fruitful area for the verification grand challenge in computing research [9].

Besides traditional manual code review and testing, the main compiler validation techniques include testing against popular validation suites (such as Plum Hall [21] and SuperTest [1]), verification [12, 13], translation validation [20, 22], and random testing [28]. These approaches have complementary benefits. For example, CompCert [12, 13] is a formally verified optimizing compiler for a subset of C, targeting the embedded software domain. It is an ambitious project, but much work remains to have a fully verified production compiler that is correct end-to-end. Another good example is Csmith [28], a recent work that generates random C programs to stress-test compilers. To date, it has found a few hundred bugs in GCC and LLVM, and helped improve the quality of the most widely-used C compilers. Despite this incredible success, the majority of the reported bugs were compiler crashes, as it is difficult to steer its random program generation to specifically exercise a compiler's most critical components: its optimization phases.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions@acm.org. PLDI '14, June 9-11, 2014, Edinburgh, United Kingdom. Copyright 2014 ACM.
We defer to Section 5 for a detailed survey of related work.

Equivalence Modulo Inputs (EMI)
This paper introduces a simple, broadly applicable concept for validating compilers. Our vision is to take existing real-world code and transform it in a novel, systematic way to produce different, but equivalent variants of the original code. To this end, we introduce equivalence modulo inputs (EMI) for a practical, concrete realization of the vision. The key insight behind EMI is to exploit the interplay between dynamically executing a program P on a subset of inputs and statically compiling P to work on all inputs. More concretely, given a program P and a set of input values I from its domain, the input set I induces a natural collection of programs C such that every program $Q \in C$ is equivalent to P modulo I: $\forall i \in I,\ Q(i) = P(i)$. The collection C can then be used to perform differential testing [16] of any compiler Comp: if $Comp(P)(i) \neq Comp(Q)(i)$ for some $i \in I$ and $Q \in C$, Comp has a miscompilation.

Next we provide some high-level intuition behind EMI's effectiveness (Section 2 illustrates this insight with two concrete, real examples for Clang and GCC respectively). The EMI variants can specifically target a compiler's analysis and optimization phases, and stress-test them to reveal latent compiler bugs. Indeed, although an EMI variant Q is only equivalent to P modulo the input set I, the compiler has to perform all its (static) analysis and optimizations to produce correct code for Q over all inputs. In addition, P's EMI variants, while semantically equivalent w.r.t. I, can have quite different static data- and control-flow. Since data- and control-flow information critically affects which optimizations are enabled and how they are applied, the EMI variants not only help exercise the optimizer differently, but also demand the exact same output on I from the code generated by these different optimization strategies. This is the very fact that we crucially leverage.
EMI has several unique advantages: It is general and easily applicable to finding bugs in compilers, analysis and transformation tools for any language.

4 Example LLVM bug
    $ clang -m32 -O0 test.c; ./a.out
    $ clang -m32 -O1 test.c; ./a.out
    Aborted (core dumped)

5 Example
    int foo (void) {
      signed char x = 1;
      unsigned char y = 255;
      return x > y;
    }
Bug in the GCC shipped with Ubuntu Linux for x86: at all optimization levels this function is compiled to return 1, but the correct result is 0.

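6 Csmith
[Diagram: the random generator (Csmith) emits a C program; the program is compiled with gcc -O0, gcc -O2, and clang -Os; the results go to a vote that separates the majority from the minority.]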

7 Requirements
Unambiguous: avoid undefined or unspecified behaviors that create ambiguous meanings of a program
- Integer undefined behavior
- Use without initialization
- Unspecified evaluation order
- Use of dangling pointer
- Null pointer dereference
- OOB array access
Expressiveness: support most commonly used C features
- Integer operations
- Loops (with break/continue)
- Conditionals
- Function calls
- Const and volatile
- Structs and bitfields
- Pointers and arrays
- Goto

8 Avoiding Undefined/Unspecified Behaviors
Problem | Generation Time Solution | Run Time Solution
Integer undefined behaviors | Constant folding/propagation; algebraic simplification | Safe math wrappers
Use without initialization | Explicit initializers |
OOB array access | Force index within range | Take modulus
Null pointer dereference | Inter-procedural points-to analysis |
Use of dangling pointers | Inter-procedural points-to analysis |
Unspecified evaluation order | Inter-procedural effect analysis |

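9 [Diagram: Code Generator and Generation Time Analyzer. The generator proposes an assignment with LHS *q and RHS a call to func_2; the analyzer validates it. ok? no]

10 [Diagram: Code Generator and Generation Time Analyzer. The assignment has an open LHS and RHS a call to func_2.]

11 [Diagram: Code Generator and Generation Time Analyzer. The generator proposes an assignment with LHS *p and RHS a call to func_2; the analyzer validates it. ok? yes; the analyzer updates its facts.]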

12 From March 2008 to June 2011:
Compiler | Bugs reported (fixed)
GCC | 104 (86)
LLVM | 228 (221)
Others (CompCert, icc, armcc, tcc, cil, suncc, open64, etc.) | 50
Total | 382
- Accounts for 1% of the total valid GCC bugs reported in the same period
- Accounts for 3.5% of the total valid LLVM bugs reported in the same period
Do they matter?
- 25 priority 1 bugs for GCC
- 8 of our bugs were re-reported by others

13 Equivalence Modulo Inputs
- Relax equivalence with respect to a given input
  - Variants must satisfy $P(i) = P_k(i)$ on input $i$
  - But may differ on another input $j$: $P(j) \neq P_k(j)$
- Exploit the close interplay between
  - Dynamic program execution on some input
  - Static compilation for all inputs

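14 Equivalence Modulo Inputs
[Diagram: profile program P on input I; its code is marked as executed or unexecuted.]

15 Equivalence Modulo Inputs
[Diagram: mutate the program to produce variants; each one maps input I to output O.]

16 Equivalence Modulo Inputs
[Diagram: mutate the program to produce variants; each one maps input I to output O, so the variants are equivalent with respect to I.]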

17 Example revisited
Test c in GCC test suite
[Source listing with a region marked "unexecuted"]
    $ clang -m32 -O0 test.c; ./a.out
    $ clang -m32 -O1 test.c; ./a.out

18 Example revisited
Reduced version
    $ clang -m32 -O0 test.c; ./a.out
    $ clang -m32 -O1 test.c; ./a.out
    Aborted (core dumped)
    $ clang -m32 -O0 test.c; ./a.out
    $ clang -m32 -O1 test.c; ./a.out
    Aborted (core dumped)

19 Autopsy
- GVN: load struct using 32-bit load
- SRoA: read past the struct's end, which is undefined behavior
    $ clang -m32 -O0 test.c; ./a.out
    $ clang -m32 -O1 test.c; ./a.out
    Aborted (core dumped)

20 Effectiveness
Bug counts | GCC | LLVM | Total
Reported | 111 | 84 | 195
Marked duplicate | 28 | 7 | 35
Confirmed | 79 | 68 | 147
Fixed | 56 | 54 | 110

Bug types | GCC | LLVM | Total
Wrong code | 46 | 49 | 95
Crash | 23 | 10 | 33
Performance | 10 | 9 | 19


Splint Pre-History. Security Flaws. (A Somewhat Self-Indulgent) Splint Retrospective. (Almost) Everyone Hates Specifications. (A Somewhat Self-Indulgent) Splint Retrospective Splint Pre-History Pre-history 1973: Steve Ziles algebraic specification of set 1975: John Guttag s PhD thesis: algebraic specifications for abstract datatypes

More information

Acknowledgement. CS Compiler Design. Intermediate representations. Intermediate representations. Semantic Analysis - IR Generation

Acknowledgement. CS Compiler Design. Intermediate representations. Intermediate representations. Semantic Analysis - IR Generation Acknowledgement CS3300 - Compiler Design Semantic Analysis - IR Generation V. Krishna Nandivada IIT Madras Copyright c 2000 by Antony L. Hosking. Permission to make digital or hard copies of part or all

More information

Formal C semantics: CompCert and the C standard

Formal C semantics: CompCert and the C standard Formal C semantics: CompCert and the C standard Robbert Krebbers 1, Xavier Leroy 2, and Freek Wiedijk 1 1 ICIS, Radboud University Nijmegen, The Netherlands 2 Inria Paris-Rocquencourt, France Abstract.

More information

Tokens, Expressions and Control Structures

Tokens, Expressions and Control Structures 3 Tokens, Expressions and Control Structures Tokens Keywords Identifiers Data types User-defined types Derived types Symbolic constants Declaration of variables Initialization Reference variables Type

More information

Topics in Software Testing

Topics in Software Testing Dependable Software Systems Topics in Software Testing Material drawn from [Beizer, Sommerville] Software Testing Software testing is a critical element of software quality assurance and represents the

More information

The compilation process is driven by the syntactic structure of the program as discovered by the parser

The compilation process is driven by the syntactic structure of the program as discovered by the parser Semantic Analysis The compilation process is driven by the syntactic structure of the program as discovered by the parser Semantic routines: interpret meaning of the program based on its syntactic structure

More information

Programming and Data Structures in C Instruction for students

Programming and Data Structures in C Instruction for students Programming and Data Structures in C Instruction for students Adam Piotrowski Dariusz Makowski Wojciech Sankowski 11 kwietnia 2016 General rules When writing programs please note that: Program must be

More information

QUIZ. What is wrong with this code that uses default arguments?

QUIZ. What is wrong with this code that uses default arguments? QUIZ What is wrong with this code that uses default arguments? Solution The value of the default argument should be placed in either declaration or definition, not both! QUIZ What is wrong with this code

More information

CSE 403: Software Engineering, Fall courses.cs.washington.edu/courses/cse403/16au/ Static Analysis. Emina Torlak

CSE 403: Software Engineering, Fall courses.cs.washington.edu/courses/cse403/16au/ Static Analysis. Emina Torlak CSE 403: Software Engineering, Fall 2016 courses.cs.washington.edu/courses/cse403/16au/ Static Analysis Emina Torlak emina@cs.washington.edu Outline What is static analysis? How does it work? Free and

More information

Test-Case Reduction for C Compiler Bugs

Test-Case Reduction for C Compiler Bugs Test-Case Reduction for C Compiler Bugs John Regehr University of Utah regehr@cs.utah.edu Yang Chen University of Utah chenyang@cs.utah.edu Pascal Cuoq CEA LIST pascal.cuoq@cea.fr Eric Eide University

More information

In Java we have the keyword null, which is the value of an uninitialized reference type

In Java we have the keyword null, which is the value of an uninitialized reference type + More on Pointers + Null pointers In Java we have the keyword null, which is the value of an uninitialized reference type In C we sometimes use NULL, but its just a macro for the integer 0 Pointers are

More information

Machine-checked proofs of program correctness

Machine-checked proofs of program correctness Machine-checked proofs of program correctness COS 326 Andrew W. Appel Princeton University slides copyright 2013-2015 David Walker and Andrew W. Appel In this course, you saw how to prove that functional

More information

Formal Verification Techniques for GPU Kernels Lecture 1

Formal Verification Techniques for GPU Kernels Lecture 1 École de Recherche: Semantics and Tools for Low-Level Concurrent Programming ENS Lyon Formal Verification Techniques for GPU Kernels Lecture 1 Alastair Donaldson Imperial College London www.doc.ic.ac.uk/~afd

More information

Undefined Behaviour in C

Undefined Behaviour in C Undefined Behaviour in C Report Field of work: Scientific Computing Field: Computer Science Faculty for Mathematics, Computer Science and Natural Sciences University of Hamburg Presented by: Dennis Sobczak

More information

Programming Lecture 3

Programming Lecture 3 Programming Lecture 3 Expressions (Chapter 3) Primitive types Aside: Context Free Grammars Constants, variables Identifiers Variable declarations Arithmetic expressions Operator precedence Assignment statements

More information

Be Conservative: Enhancing Failure Diagnosis with Proactive Logging

Be Conservative: Enhancing Failure Diagnosis with Proactive Logging Be Conservative: Enhancing Failure Diagnosis with Proactive Logging Ding Yuan, Soyeon Park, Peng Huang, Yang Liu, Michael Lee, Xiaoming Tang, Yuanyuan Zhou, Stefan Savage University of California, San

More information

Winter School in Software Engineering 2017

Winter School in Software Engineering 2017 Winter School in Software Engineering 2017 Monday December 11, 2017 Day 1 08:00-08:30 Registration 08:30-10:00 Programming by Examples: Applications, Algorithms and Ambiguity Resolution - Session I Sumit

More information

CS558 Programming Languages

CS558 Programming Languages CS558 Programming Languages Fall 2016 Lecture 3a Andrew Tolmach Portland State University 1994-2016 Formal Semantics Goal: rigorous and unambiguous definition in terms of a wellunderstood formalism (e.g.

More information

Overview AEG Conclusion CS 6V Automatic Exploit Generation (AEG) Matthew Stephen. Department of Computer Science University of Texas at Dallas

Overview AEG Conclusion CS 6V Automatic Exploit Generation (AEG) Matthew Stephen. Department of Computer Science University of Texas at Dallas CS 6V81.005 Automatic Exploit Generation (AEG) Matthew Stephen Department of Computer Science University of Texas at Dallas February 20 th, 2012 Outline 1 Overview Introduction Considerations 2 AEG Challenges

More information

Reinforcing Random Testing of Arithmetic Optimization of C Compilers by Scaling up Size and Number of Expressions

Reinforcing Random Testing of Arithmetic Optimization of C Compilers by Scaling up Size and Number of Expressions Regular Paper Reinforcing Random Testing of Arithmetic Optimization of C Compilers by Scaling up Size and Number of Expressions Eriko Nagai 1, 1 Atsushi Hashimoto 1 Nagisa Ishiura 1,a) Received: December

More information

Lecture Notes on Intermediate Representation

Lecture Notes on Intermediate Representation Lecture Notes on Intermediate Representation 15-411: Compiler Design Frank Pfenning Lecture 10 September 26, 2013 1 Introduction In this lecture we discuss the middle end of the compiler. After the source

More information

Milind Kulkarni Research Statement

Milind Kulkarni Research Statement Milind Kulkarni Research Statement With the increasing ubiquity of multicore processors, interest in parallel programming is again on the upswing. Over the past three decades, languages and compilers researchers

More information

Random Testing of Interrupt-Driven Software. John Regehr University of Utah

Random Testing of Interrupt-Driven Software. John Regehr University of Utah Random Testing of Interrupt-Driven Software John Regehr University of Utah Integrated stress testing and debugging Random interrupt testing Source-source transformation Static stack analysis Semantics

More information

General Purpose GPU Programming. Advanced Operating Systems Tutorial 7

General Purpose GPU Programming. Advanced Operating Systems Tutorial 7 General Purpose GPU Programming Advanced Operating Systems Tutorial 7 Tutorial Outline Review of lectured material Key points Discussion OpenCL Future directions 2 Review of Lectured Material Heterogeneous

More information

Lecture Notes on Intermediate Representation

Lecture Notes on Intermediate Representation Lecture Notes on Intermediate Representation 15-411: Compiler Design Frank Pfenning Lecture 9 September 24, 2009 1 Introduction In this lecture we discuss the middle end of the compiler. After the source

More information

AD HOC VS. PLANNED SOFTWARE MAINTENANCE

AD HOC VS. PLANNED SOFTWARE MAINTENANCE AD HOC VS. PLANNED SOFTWARE MAINTENANCE INTRODUCTION Warren Harrison Portland State University Portland, OR 97207-0751 warren@cs.pdx.edu In a series of papers, Belady and Lehman [Belady & Lehman, 1976]

More information

A brief introduction to C programming for Java programmers

A brief introduction to C programming for Java programmers A brief introduction to C programming for Java programmers Sven Gestegård Robertz September 2017 There are many similarities between Java and C. The syntax in Java is basically

More information

Static Analysis of Embedded C

Static Analysis of Embedded C Static Analysis of Embedded C John Regehr University of Utah Joint work with Nathan Cooprider Motivating Platform: TinyOS Embedded software for wireless sensor network nodes Has lots of SW components for

More information

THE EVALUATION OF OPERANDS AND ITS PROBLEMS IN C++

THE EVALUATION OF OPERANDS AND ITS PROBLEMS IN C++ Proceedings of the South Dakota Academy of Science, Vol. 85 (2006) 107 THE EVALUATION OF OPERANDS AND ITS PROBLEMS IN C++ Dan Day and Steve Shum Computer Science Department Augustana College Sioux Falls,

More information

Lecture Notes on Compiler Design: Overview

Lecture Notes on Compiler Design: Overview Lecture Notes on Compiler Design: Overview 15-411: Compiler Design Frank Pfenning Lecture 1 August 26, 2014 1 Introduction This course is a thorough introduction to compiler design, focusing on more lowlevel

More information

The Correctness-Security Gap in Compiler Optimization

The Correctness-Security Gap in Compiler Optimization The Correctness-Security Gap in Compiler Optimization Vijay D Silva, Mathias Payer, Dawn Song LangSec 2015 1 Compilers and Trust 2 Correctness vs. Security by Example 3 Correctness vs. Security, Formally

More information

A Formal C Memory Model Supporting Integer-Pointer Casts

A Formal C Memory Model Supporting Integer-Pointer Casts A Formal C Memory Model Supporting Integer-Pointer Casts Abstract The ISO C standard does not specify the semantics of many valid programs that use non-portable idioms such as integerpointer casts. Recent

More information

Program Partitioning - A Framework for Combining Static and Dynamic Analysis

Program Partitioning - A Framework for Combining Static and Dynamic Analysis Program Partitioning - A Framework for Combining Static and Dynamic Analysis Pankaj Jalote, Vipindeep V, Taranbir Singh, Prateek Jain Department of Computer Science and Engineering Indian Institute of

More information

1. Describe History of C++? 2. What is Dev. C++? 3. Why Use Dev. C++ instead of C++ DOS IDE?

1. Describe History of C++? 2. What is Dev. C++? 3. Why Use Dev. C++ instead of C++ DOS IDE? 1. Describe History of C++? The C++ programming language has a history going back to 1979, when Bjarne Stroustrup was doing work for his Ph.D. thesis. One of the languages Stroustrup had the opportunity

More information

Program Analysis And Its Support in Software Development

Program Analysis And Its Support in Software Development Program Analysis And Its Support in Software Development Qing Yi class web site: www.cs.utsa.edu/~qingyi/cs6463 cs6463 1 A little about myself Qing Yi B.S. Shandong University, China. Ph.D. Rice University,

More information

2014, IJARCSSE All Rights Reserved Page 303

2014, IJARCSSE All Rights Reserved Page 303 Volume 4, Issue 1, January 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com A Novel Software

More information

Blanket Execution: Dynamic Similarity Testing for Program Binaries and Components

Blanket Execution: Dynamic Similarity Testing for Program Binaries and Components Blanket Execution: Dynamic Similarity Testing for Program Binaries and Components Manuel Egele, Maverick Woo, Peter Chapman, and David Brumley Carnegie Mellon University 1 Picture Yourself as an Analyst

More information

A Formal C Memory Model Supporting Integer-Pointer Casts

A Formal C Memory Model Supporting Integer-Pointer Casts A Formal C Memory Model Supporting Integer-Pointer Casts Jeehoon Kang Seoul National University, South Korea jeehoon.kang@sf.snu.ac.kr Chung-Kil Hur Seoul National University, South Korea gil.hur@sf.snu.ac.kr

More information

Unit-II Programming and Problem Solving (BE1/4 CSE-2)

Unit-II Programming and Problem Solving (BE1/4 CSE-2) Unit-II Programming and Problem Solving (BE1/4 CSE-2) Problem Solving: Algorithm: It is a part of the plan for the computer program. An algorithm is an effective procedure for solving a problem in a finite

More information

Formal proofs of code generation and verification tools

Formal proofs of code generation and verification tools Formal proofs of code generation and verification tools Xavier Leroy To cite this version: Xavier Leroy. Formal proofs of code generation and verification tools. Dimitra Giannakopoulou and Gwen Salaün.

More information

Modern Buffer Overflow Prevention Techniques: How they work and why they don t

Modern Buffer Overflow Prevention Techniques: How they work and why they don t Modern Buffer Overflow Prevention Techniques: How they work and why they don t Russ Osborn CS182 JT 4/13/2006 1 In the past 10 years, computer viruses have been a growing problem. In 1995, there were approximately

More information

Binghamton University. CS-211 Fall Syntax. What the Compiler needs to understand your program

Binghamton University. CS-211 Fall Syntax. What the Compiler needs to understand your program Syntax What the Compiler needs to understand your program 1 Pre-Processing Any line that starts with # is a pre-processor directive Pre-processor consumes that entire line Possibly replacing it with other

More information

Data-Flow Analysis Foundations

Data-Flow Analysis Foundations CS 301 Spring 2016 Meetings April 11 Data-Flow Foundations Plan Source Program Lexical Syntax Semantic Intermediate Code Generation Machine- Independent Optimization Code Generation Target Program This

More information

BLM2031 Structured Programming. Zeyneb KURT

BLM2031 Structured Programming. Zeyneb KURT BLM2031 Structured Programming Zeyneb KURT 1 Contact Contact info office : D-219 e-mail zeynebkurt@gmail.com, zeyneb@ce.yildiz.edu.tr When to contact e-mail first, take an appointment What to expect help

More information

Static Analysis methods and tools An industrial study. Pär Emanuelsson Ericsson AB and LiU Prof Ulf Nilsson LiU

Static Analysis methods and tools An industrial study. Pär Emanuelsson Ericsson AB and LiU Prof Ulf Nilsson LiU Static Analysis methods and tools An industrial study Pär Emanuelsson Ericsson AB and LiU Prof Ulf Nilsson LiU Outline Why static analysis What is it Underlying technology Some tools (Coverity, KlocWork,

More information

Intro to semantics; Small-step semantics Lecture 1 Tuesday, January 29, 2013

Intro to semantics; Small-step semantics Lecture 1 Tuesday, January 29, 2013 Harvard School of Engineering and Applied Sciences CS 152: Programming Languages Lecture 1 Tuesday, January 29, 2013 1 Intro to semantics What is the meaning of a program? When we write a program, we use

More information

Lecture 1 Contracts : Principles of Imperative Computation (Fall 2018) Frank Pfenning

Lecture 1 Contracts : Principles of Imperative Computation (Fall 2018) Frank Pfenning Lecture 1 Contracts 15-122: Principles of Imperative Computation (Fall 2018) Frank Pfenning In these notes we review contracts, which we use to collectively denote function contracts, loop invariants,

More information

Lecture 12: Abstraction Functions

Lecture 12: Abstraction Functions Lecture 12: Abstraction Functions 12.1 Context What you ll learn: How to read, write, and use abstraction functions to describe the relationship between the abstract values of a type and its representation.

More information

CS2 Algorithms and Data Structures Note 10. Depth-First Search and Topological Sorting

CS2 Algorithms and Data Structures Note 10. Depth-First Search and Topological Sorting CS2 Algorithms and Data Structures Note 10 Depth-First Search and Topological Sorting In this lecture, we will analyse the running time of DFS and discuss a few applications. 10.1 A recursive implementation

More information