Test Automation 20 December 2017

The problem of test automation
- Testing has repetitive components, so automation is justified
- The real problem is the cost-benefit evaluation of automation [Kaner]
- Time is needed for: creating tests, checking that they work, documenting them
- Is the automation reusable (if the program evolves)?
- Is maintenance needed (GUI changes, internationalization)?
- Test automation must be treated like any other software development
- Does it delay finding bugs (fewer resources left for running tests)?
- Does it find enough bugs, or are most still found by manual testing?
- Is it powerful enough, or does it automate only the easy tests?

Example: Capture-replay
1) Record user actions (mouse/keyboard) and the resulting screen (bitmap)
   - the most primitive level
   - other checks require tester effort (interrupting the recording to insert them)
   - fragile: susceptible to any product change
   - comparing the resulting images can produce spurious mismatches
2) Script with high-level actions (select menu/button)
   - more flexible, but does not check the graphic layout (low-level issues: font, text size, overwritten text, etc.)
3) Scripting language used to generate new tests automatically (a sketch of such a script follows below)
Disadvantages of capture-replay:
- Cannot continue past errors: errors are found manually during the recording process; only rerunning a known-good test is automated (regression)
- Does not capture the checks a human performs implicitly ("all the rest is OK"): it cannot detect unspecified errors and is inflexible (e.g. bitmap comparison)
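To make level 2 concrete, here is a minimal sketch of what a high-level replay script can look like, written in C. The GUI-driver functions (select_menu, type_text, click_button, expect_text) are hypothetical stand-ins for whatever API a capture-replay tool exposes; they are stubbed out with logging so the sketch compiles and runs.

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical GUI-driver API, stubbed with logging so the sketch runs.
     * A real capture-replay tool would provide calls like these. */
    static void select_menu(const char *path)   { printf("menu:  %s\n", path); }
    static void type_text(const char *text)     { printf("type:  %s\n", text); }
    static void click_button(const char *label) { printf("click: %s\n", label); }

    /* Level-2 check: compare a widget's text, not a screen bitmap. */
    static int expect_text(const char *widget, const char *expected) {
        const char *actual = expected;  /* stub: a real driver queries the widget */
        if (strcmp(actual, expected) != 0) {
            printf("FAIL: %s shows \"%s\", expected \"%s\"\n",
                   widget, actual, expected);
            return 0;
        }
        return 1;
    }

    int main(void) {
        /* Replayed high-level actions: independent of how the user originally
         * invoked them (menu, toolbar, keyboard shortcut). */
        select_menu("File/Open");
        type_text("report.txt");
        click_button("OK");
        return expect_text("status_bar", "report.txt loaded") ? 0 : 1;
    }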

Example: Test monkeys
Automated tools that execute random tests (with no knowledge of the product's functionality).
- Dumb monkeys: completely ignore the application's purpose (they know only mouse/keyboard), though they may have basic notions of windows/menus/buttons (a minimal dumb-monkey loop is sketched below)
- Smart monkeys: have a state model of the application and explore transitions between its states
++ can sometimes find 10-20% of the errors [Nyman, Microsoft, 2000]
++ good preliminary coverage (e.g. 65% in 15 minutes for a text editor)
++ completely automated, no human effort for test capture
++ run independently, unsupervised, with minimal resources (low cost)
-- "dumb": the only bug a monkey recognizes is a system crash
-- errors are hard to record and reproduce
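A minimal dumb-monkey loop, as a POSIX C sketch. Driving a real GUI event queue needs a toolkit-specific API, so this sketch feeds random printable "keystrokes" to the program's stdin instead; "./app" is a placeholder for the program under test, and the only oracle is a crash.

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    /* Run the target once on the given input file; return the crash signal, or 0. */
    static int run_once(const char *infile) {
        pid_t pid = fork();
        if (pid == 0) {                      /* child: stdin <- random input */
            int fd = open(infile, O_RDONLY);
            if (fd >= 0) dup2(fd, 0);
            execl("./app", "./app", (char *)NULL);
            _exit(127);                      /* exec failed */
        }
        int status = 0;
        waitpid(pid, &status, 0);
        return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
    }

    int main(void) {
        srand(12345);                        /* fixed seed keeps crashes reproducible */
        for (int run = 0; run < 1000; run++) {
            FILE *f = fopen("monkey_input.txt", "w");
            if (!f) return 1;
            int len = rand() % 256;
            for (int i = 0; i < len; i++)
                fputc(' ' + rand() % 95, f); /* random printable "keystrokes" */
            fclose(f);
            int sig = run_once("monkey_input.txt");
            if (sig)                         /* the monkey's only notion of a bug */
                printf("run %d: crash (signal %d), input kept for replay\n", run, sig);
        }
        return 0;
    }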

What can we automate in testing?
- Test execution: e.g. any unit-testing framework (a minimal harness is sketched below)
- Test evaluation = the problem of the test oracle: did the test pass?
  - nontrivial, often needs manual inspection
  - risks: undetected errors (imprecision), false warnings, cost of manual checking
  - examples: comparing continuous signals (automotive industry), image comparison (screen/printer output)
- Test generation
  - relatively easy: generating test skeletons (declarations + calls)
  - more difficult: intelligent generation of relevant data (for coverage)
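A minimal sketch of the first two points in C: automated execution with a trivial oracle. Each test case pairs an input with an expected result, so evaluation is a mechanical comparison. The function under test, word_count, is just an illustrative stand-in.

    #include <stdio.h>
    #include <ctype.h>

    /* Illustrative function under test: counts whitespace-separated words. */
    static int word_count(const char *s) {
        int n = 0, in_word = 0;
        for (; *s; s++) {
            if (!isspace((unsigned char)*s)) { if (!in_word) n++; in_word = 1; }
            else in_word = 0;
        }
        return n;
    }

    /* Expected results are stored with the inputs: the oracle is a comparison. */
    struct test { const char *input; int expected; };

    int main(void) {
        struct test tests[] = {
            { "",           0 },
            { "one",        1 },
            { "two words",  2 },
            { "  padded  ", 1 },
        };
        int failed = 0;
        for (unsigned i = 0; i < sizeof tests / sizeof tests[0]; i++) {
            int got = word_count(tests[i].input);
            if (got != tests[i].expected) {
                printf("FAIL \"%s\": got %d, expected %d\n",
                       tests[i].input, got, tests[i].expected);
                failed++;
            }
        }
        printf("%d failure(s)\n", failed);
        return failed != 0;
    }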

Choosing a test architecture [Kaner]
1) Data-driven architecture
   - separates the data from the test structure (as in programs)
   - example: a table where each row = one test and the columns = the test parameters; a script generates a test case for every table row
   - minimal reasonable coverage: every pair of parameter values, since covering every combination of values is exponential in the number of parameters (see the sketch below)
2) Framework-based architecture
   - a library of functions separates the tests from the UI, e.g. open(file) independent of how opening is performed (menu, button click, keyboard, etc.)
   ++ reuse for frequent actions
   ++ the indirection insulates the tests from the testing tool
   -- costly, amortized only over future releases
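A sketch of the data-driven idea with pairwise coverage, in C. The parameters (format, locale, size) are invented for illustration; the 9 rows form an orthogonal array, so every pair of values from any two columns occurs in some row, whereas covering all combinations would need 3 x 3 x 3 = 27 tests.

    #include <stdio.h>

    /* One table row = one test; the columns are the test parameters. */
    struct row { const char *format; const char *locale; const char *size; };

    /* 9 rows chosen as an orthogonal array: all-pairs coverage of 3 parameters
     * with 3 values each, instead of 27 rows for the full cross product. */
    static const struct row table[] = {
        {"txt", "en", "small"},  {"txt", "de", "medium"}, {"txt", "ja", "large"},
        {"csv", "en", "medium"}, {"csv", "de", "large"},  {"csv", "ja", "small"},
        {"xml", "en", "large"},  {"xml", "de", "small"},  {"xml", "ja", "medium"},
    };

    /* Stub driver: a real script would build and run a test from each row. */
    static int run_test(const struct row *r) {
        printf("test: format=%s locale=%s size=%s\n", r->format, r->locale, r->size);
        return 0;  /* pass */
    }

    int main(void) {
        int failures = 0;
        for (unsigned i = 0; i < sizeof table / sizeof table[0]; i++)
            failures += run_test(&table[i]);
        return failures != 0;
    }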

Specification-based testing
- Automatable (keyword testing) for specs written in a well-defined language
- Starting from documentation: a tabular spec, e.g. [Pettichord]:

    Test ID    Operation   Table   Name          Type        Nulls
    dtbed101   Add Col     TB03    NEW INT COL   CHAR(100)   Y

- Important: choose a format easily understandable by the user
- A translator / test interpreter generates the test driver from the table, or interfaces with the (commercial) testing tool used (a keyword interpreter is sketched below)
++ requirement-driven (what, not how), independent of the implementation and of the testing tool, self-documenting
- More advanced: automated test generation from specs in a formal language, e.g. decision tables in RSML for the TCAS-II aircraft collision-avoidance system, or test generation from timing diagrams in embedded systems
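A sketch of the translator/interpreter idea in C: each row of the spec is parsed into a keyword plus arguments and dispatched to a driver function. The comma-separated row format and the add_col stub are invented; a real interpreter would call into the product or into the testing tool.

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical driver action behind the "AddCol" keyword, stubbed with logging. */
    static int add_col(const char *tbl, const char *name, const char *type,
                       const char *nulls) {
        printf("ADD COL %s.%s %s nulls=%s\n", tbl, name, type, nulls);
        return 0;
    }

    /* Interpret one row of the tabular spec:
     * fields = { test id, operation, table, name, type, nulls } */
    static int run_row(char *line) {
        char *f[6];
        int n = 0;
        for (char *tok = strtok(line, ","); tok && n < 6; tok = strtok(NULL, ","))
            f[n++] = tok;
        if (n < 6) { printf("bad row\n"); return 1; }
        printf("[%s] ", f[0]);
        if (strcmp(f[1], "AddCol") == 0)
            return add_col(f[2], f[3], f[4], f[5]);
        printf("unknown operation: %s\n", f[1]);
        return 1;
    }

    int main(void) {
        char row[] = "dtbed101,AddCol,TB03,NEWINTCOL,CHAR(100),Y";
        return run_row(row);
    }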

Model-based testing
- Models: finite automata, UML, Statecharts (hierarchical automata), Message Sequence Charts, timed automata, Petri nets, Markov chains, ...
- Test generation criterion: satisfactory model coverage, e.g. all states / all transitions, or all combinations of k consecutive transitions (k-switch coverage); a transition-coverage sketch follows below
++ facilitates the generation of relevant tests
-- requires an investment in building and maintaining the model
- Testing based on model checking, by state-space exploration starting from the specification:
  1) question: can the model reach a given state?
  2) if so, the model checker generates an example trace = a test case
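A sketch of all-transitions coverage for a small finite automaton, in C. The three-state model (a hypothetical media player) is invented; for each transition, a breadth-first search finds the shortest input word reaching its source state, yielding one test per transition.

    #include <stdio.h>
    #include <string.h>

    enum { STOPPED, PLAYING, PAUSED, NSTATES };   /* toy model states */

    struct trans { int from; const char *input; int to; };

    static const struct trans model[] = {
        { STOPPED, "play",  PLAYING }, { PLAYING, "pause", PAUSED },
        { PLAYING, "stop",  STOPPED }, { PAUSED,  "play",  PLAYING },
        { PAUSED,  "stop",  STOPPED },
    };
    #define NT (sizeof model / sizeof model[0])

    /* BFS from the initial state; rebuild the shortest input word to `target`. */
    static void shortest_prefix(int target, char *out) {
        int prev[NSTATES], via[NSTATES], seen[NSTATES] = {0};
        int queue[NSTATES], head = 0, tail = 0;
        seen[STOPPED] = 1; queue[tail++] = STOPPED;   /* initial state */
        while (head < tail) {
            int s = queue[head++];
            for (unsigned t = 0; t < NT; t++)
                if (model[t].from == s && !seen[model[t].to]) {
                    seen[model[t].to] = 1; prev[model[t].to] = s;
                    via[model[t].to] = (int)t; queue[tail++] = model[t].to;
                }
        }
        out[0] = '\0';
        if (target != STOPPED) {              /* recurse, then append last input */
            shortest_prefix(prev[target], out);
            strcat(out, model[via[target]].input); strcat(out, " ");
        }
    }

    int main(void) {
        /* One test case per transition: a prefix reaching t.from, then t.input. */
        for (unsigned t = 0; t < NT; t++) {
            char word[256];
            shortest_prefix(model[t].from, word);
            printf("test %u: %s%s\n", t, word, model[t].input);
        }
        return 0;
    }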

Implementation-based testing: symbolic execution
- Goal: exercise the program so as to satisfy a coverage criterion; this needs instrumentation to measure test coverage
- How: build a set of paths by random choice + directed search (to reach branches not yet covered)
- Symbolic execution: execute the program on expressions over symbolic variables, rather than on concrete (numeric) values
- Symbolic execution gathers the path condition of the branches followed
- Satisfiability of the path condition is checked with specialized tools (satisfiability checkers, constraint solvers):
  - satisfiable: generate input data that will exercise that path
  - unsatisfiable: the path is infeasible, so stop exploring it
A worked example follows below.
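A worked illustration in C (the function and the solved values are invented): for each path, symbolic execution collects a path condition (pc) over the symbolic input X, and a constraint solver produces a concrete input that follows that path.

    #include <stdio.h>

    /* Symbolic input: x = X (a symbol, not a value). */
    int classify(int x) {
        int y = 2 * x + 6;     /* symbolically: y = 2X + 6                  */
        if (y > 0) {           /* true branch:  pc = 2X + 6 > 0             */
            if (x < 0)         /* true branch:  pc = 2X + 6 > 0 AND X < 0   */
                return -1;     /* solver: X = -2 satisfies the pc           */
            return 1;          /* pc = 2X + 6 > 0 AND X >= 0, e.g. X = 0    */
        }
        return 0;              /* pc = 2X + 6 <= 0, e.g. X = -5             */
    }

    int main(void) {
        /* Run the three solver-generated inputs, one per feasible path. */
        printf("%d %d %d\n", classify(-2), classify(0), classify(-5));
        return 0;
    }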

Symbolic execution for program testing
- Described as early as 1976 (James C. King)
- The program is executed by a special interpreter, using symbolic inputs
- The result is a symbolic execution tree; tree traversal stops when the path condition becomes unsatisfiable
- Test generation purpose: attaining high coverage, or sometimes reaching a specific branch
- A successful, mature technique: hundreds of papers and many tools (Java PathFinder, (j)CUTE, CREST, KLEE, Pex, SAGE, ...) for C/C++, C#, Java and, more recently, JavaScript

Variants of symbolic execution
- Classic, fully symbolic execution explores each execution path independently
  [execution-tree figure: branching on conditions c1, c2, c3 yields the path conditions c1 ∧ c2, c1 ∧ ¬c2, ¬c1 ∧ c3, ¬c1 ∧ ¬c3]
- Problem: all program/language semantics must be expressed as formulas
  - solving arbitrary formulas is impossible (solvers are limited to simple arithmetic theories)
  - reality: complex math, library functions, the environment
  - solution: model the libraries and the environment; e.g. the KLEE tool has models for some 40 syscalls (2.5 kLOC)

Dynamic (concolic) symbolic execution
- Symbolic execution is directed by a concrete run (hence "concolic": concrete + symbolic)
- When symbolic execution is infeasible (e.g. nonlinear arithmetic, library/system functions), perform that execution step concretely

    function explore(pathcond = [c1, c2, ..., cn])
        for k = n downto 1 do
            inputs = solve(c1 ∧ ... ∧ c(k-1) ∧ ¬ck)    // flip ck
            rerun with the new inputs; capture the new pathcond'
            explore(pathcond')

- Problem: by using concrete values, the new run might not reach the desired path

Concretization as a potential obstacle

    y = hash(x);        // can't solve the hash formula => y is concretized
    if (x + y > 0)      // path 1
        ...
    else                // path 2
        ...

- Assume x = 20 and y = hash(20) = 13: execution follows path 1
- To reach path 2, negate x + y > 0 with y concrete (the constant 13)
- The solver might return, e.g., x = -15; but we might have hash(-15) = 27 (hash cannot be predicted), and then x + y > 0 still holds: execution again follows path 1
- Remedy: retry; in the worst case this degrades to random testing

When is test automation efficient?
- In regression testing: we need only store the tests and their expected results (plus a means to automate the comparison)
- In testing user interfaces (discussed earlier)
- In testing compilers / translators: automated test generation starting from the input grammar, exploring random/statistical combinations of grammar rules (see the sketch below)
- In load/stress testing: random; quantity rather than content is what matters
- In fuzz testing: generate large quantities of random, possibly hostile input to detect input-validation errors or security vulnerabilities
  - e.g. Randoop [Microsoft]: 4 million tests in 150 CPU-hours / 15 person-hours found 30 bugs in code already tested for 200 person-years, versus about 20 errors/year found manually
  - see also http://research.microsoft.com/en-us/projects/pex/
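A sketch of grammar-based generation for compiler testing, in C: random recursive expansion of a tiny expression grammar (the grammar and the depth limit are invented). Each generated line becomes one test input for the parser or compiler under test.

    #include <stdio.h>
    #include <stdlib.h>

    /* Tiny illustrative grammar:
     *   expr -> NUM | VAR | ( expr op expr )
     * Random expansion of the rules; the depth bound forces termination. */
    static void gen_expr(int depth) {
        const char ops[] = "+-*/";
        switch (depth <= 0 ? rand() % 2 : rand() % 3) {
        case 0: printf("%d", rand() % 100); break;      /* expr -> NUM */
        case 1: printf("%c", 'a' + rand() % 3); break;  /* expr -> VAR */
        default:                                        /* expr -> (e op e) */
            putchar('(');
            gen_expr(depth - 1);
            printf(" %c ", ops[rand() % 4]);
            gen_expr(depth - 1);
            putchar(')');
        }
    }

    int main(void) {
        srand(42);                     /* fixed seed keeps the tests reproducible */
        for (int i = 0; i < 5; i++) {  /* each line = one compiler test input */
            gen_expr(4);
            putchar('\n');
        }
        return 0;
    }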

Basic workings of a fuzzer (e.g. American Fuzzy Lop, http://lcamtuf.coredump.cx/afl/)
- maintains a queue of test inputs
- mutates inputs using several strategies (an illustrative mutator is sketched below)
- if a mutant achieves new coverage, it is added to the input queue
- minimizes each test input (while keeping its coverage) and minimizes the input corpus (to avoid overlap)
- records transition coverage between the program's basic blocks
- classifies runs as crash / hang / normal exit
- highly successful: has found many security vulnerabilities
- mutating inputs can synthesize interesting formats (e.g. images) and can identify format fields with different meanings (length, checksum, payload, control opcode, etc.)
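An illustrative byte-level mutator in the spirit of such strategies (bit flips, "interesting" boundary values, block copies); this is a simplification written for exposition, not AFL's actual code.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* One random mutation per call: flip a bit, plant a boundary value,
     * or overwrite one block of the input with a copy of another. */
    static void mutate(unsigned char *buf, size_t len) {
        if (len == 0) return;
        switch (rand() % 3) {
        case 0:   /* bit flip */
            buf[rand() % len] ^= 1u << (rand() % 8);
            break;
        case 1: { /* values that often trigger edge cases */
            static const unsigned char interesting[] = { 0, 1, 0x7f, 0x80, 0xff };
            buf[rand() % len] = interesting[rand() % 5];
            break;
        }
        default:  /* copy one block over another */
            if (len >= 8) {
                size_t n = 1 + rand() % (len / 4);
                size_t src = rand() % (len - n + 1), dst = rand() % (len - n + 1);
                memmove(buf + dst, buf + src, n);
            }
        }
    }

    int main(void) {
        unsigned char input[] = "GIF89a....";   /* seed input from the queue */
        srand(1);
        for (int i = 0; i < 4; i++) {
            mutate(input, sizeof input - 1);
            /* here a fuzzer would run the target and check for new coverage */
            printf("mutant %d: ", i);
            for (size_t j = 0; j < sizeof input - 1; j++)
                printf("%02x", input[j]);
            putchar('\n');
        }
        return 0;
    }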

Automated debugging
- After automating detection: help with localizing the fault
- Minimizing test inputs: binary search to find which half of a (file) input caused the error (a simplified minimizer is sketched below)
- Minimizing the differences between a correct and an erroneous run: also binary search, converging on two close inputs
- Fault localization in space: in a debugger, compare the execution state of the correct and the buggy run; detect (precisely or statistically) invariants/patterns violated by the erroneous run
- Fault localization in time: compare erroneous runs and find the points where infected variables start affecting the output
- Delta debugging [Zeller]: partial automation of these techniques
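A simplified input minimizer along these lines, in C (a coarse cousin of Zeller's ddmin, written for illustration): repeatedly try dropping half of the input and keep any half that still triggers the failure. The fails() predicate is a stand-in for rerunning the program under test on the candidate input.

    #include <stdio.h>
    #include <string.h>

    /* Stand-in oracle: "the program fails iff the input contains XY".
     * In reality this would rerun the program on the candidate input. */
    static int fails(const char *input) {
        return strstr(input, "XY") != NULL;
    }

    /* Coarse minimization: keep whichever half still fails; stop when
     * neither half fails alone (ddmin would continue with finer subsets). */
    static void minimize(char *input) {
        size_t len = strlen(input);
        while (len > 1) {
            size_t half = len / 2;
            char left[256], right[256];
            memcpy(left, input, half); left[half] = '\0';
            strcpy(right, input + half);
            if (fails(left))       { strcpy(input, left);  len = half; }
            else if (fails(right)) { strcpy(input, right); len -= half; }
            else break;
        }
    }

    int main(void) {
        char input[256] = "aaaaXYbbbbbbbb";
        minimize(input);
        printf("minimized failing input: %s\n", input);  /* here: "aXYb" */
        return 0;
    }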