Cleanroom Software Engineering

Similar documents
The Cleanroom Method

Key Features. Defect Rates. Traditional Unit testing: 25 faults / KLOC System testing: 25 / KLOC Inspections: / KLOC

Cleanroom Software Engineering

Cleanroom Software Engineering

Gradational conception in Cleanroom Software Development

Computer Science and Software Engineering University of Wisconsin - Platteville 9-Software Testing, Verification and Validation

A TSP SM Roadmap to CMMI SM Implementation

Programming Embedded Systems

Software Verification and Validation (VIMMD052) Introduction. Istvan Majzik Budapest University of Technology and Economics

Statistical Testing of Software Based on a Usage Model

Quality Assurance in Software Development

Cleanroom Software Engineering

Verification and Validation

Feasibility of Testing to Code. Feasibility of Testing to Code. Feasibility of Testing to Code. Feasibility of Testing to Code (contd)

III. CONCEPTS OF MODELLING II.

Cleanroom Software Engineering Reference

Verification and Validation

Verification and Validation

Basic Concepts of Reliability

Course "Empirical Evaluation in Informatics" Quasi-Experiments

Introduction To Software Testing. Brian Nielsen. Center of Embedded Software Systems Aalborg University, Denmark CSS

Course "Empirical Evaluation in Informatics" Quasi-Experiments

Part 5. Verification and Validation

Safety and Reliability Engineering Part 5: Redundancy / Software Reliability

Software Quality Assurance. David Janzen

Formal Methods. CITS5501 Software Testing and Quality Assurance

Lecture Notes: Hoare Logic

Test requirements in networked systems

Standard CMMI SM Appraisal Method for Process Improvement. (SCAMPI Version 1.1)

CITS5501 Software Testing and Quality Assurance Formal methods

An Automated Testing Environment to support Operational Profiles of Software Intensive Systems

Framework for Improvement in Cleanroom Software Engineering Thesis Submitted in the partial fulfillment of requirements for the award of the degree

Topics in Software Testing

Formal Methods and their role in Software and System Development. Riccardo Sisto, Politecnico di Torino

Objectives. Chapter 19. Verification vs. validation. Topics covered. Static and dynamic verification. The V&V process

How invariants help writing loops Author: Sander Kooijmans Document version: 1.0

Metrics and OO. SE 3S03 - Tutorial 12. Alicia Marinache. Week of Apr 04, Department of Computer Science McMaster University

Softwaretechnik. Lecture 08: Testing and Debugging Overview. Peter Thiemann SS University of Freiburg, Germany

CS264: Homework #4. Due by midnight on Wednesday, October 22, 2014

Verification Overview Testing Theory and Principles Testing in Practice. Verification. Miaoqing Huang University of Arkansas 1 / 80

White-Box Testing Techniques

Computational problems. Lecture 2: Combinatorial search and optimisation problems. Computational problems. Examples. Example

Softwaretechnik. Lecture 08: Testing and Debugging Overview. Peter Thiemann SS University of Freiburg, Germany

Verification and Validation. Ian Sommerville 2004 Software Engineering, 7th edition. Chapter 22 Slide 1

Programming Embedded Systems

Black-box Testing Techniques

Generic Requirements Management and Verification Process for Ground Segment and Mission Operations Preparation

Lecture: Simulation. of Manufacturing Systems. Sivakumar AI. Simulation. SMA6304 M2 ---Factory Planning and scheduling. Simulation - A Predictive Tool

Discrete Optimization. Lecture Notes 2

Formalization of Incremental Simplex Algorithm by Stepwise Refinement

Advanced Software Engineering: Software Testing

Symbolic Execution and Proof of Properties

Administrivia. ECE/CS 5780/6780: Embedded System Design. Acknowledgements. What is verification?

Aerospace Software Engineering

SFWR ENG 3S03: Software Testing

Topic 01. Software Engineering, Web Engineering, agile methodologies.

Scheme: Expressions & Procedures

Improving the Effectiveness of Software Testing through Test Case Reduction

Static Analysis Techniques

Lecture Notes on Contracts

(Objective-CS605 Software Engeenring-II)

ImplementingPSP/TSP Massivelyin Mexico

1. true / false By a compiler we mean a program that translates to code that will run natively on some machine.

Tool Support for Tradespace Exploration and Analysis

A Project Network Protocol Based on Reliability Theory

Testing! Prof. Leon Osterweil! CS 520/620! Spring 2013!

TEACHING SPECIFICATION AND VERIFICATION OF EVENT- DRIVEN PROGRAMS USING CLEANROOM SOFTWARE ENGINEERING

FUZZY SPECIFICATION IN SOFTWARE ENGINEERING

Embedding Formulations, Complexity and Representability for Unions of Convex Sets

Behavioral Equivalence

Analysis of the Test Driven Development by Example

CTAL. ISTQB Advanced Level.

An Annotated Language

Question 1: What is a code walk-through, and how is it performed?

Formal Specification. Objectives

Microscopic Traffic Simulation

Ian Sommerville 2006 Software Engineering, 8th edition. Chapter 22 Slide 1

Software Engineering Testing and Debugging Testing

1 More on the Bellman-Ford Algorithm

Pet: An Interactive Software Testing Tool

Lecture 8 Requirements Engineering

MONIKA HEINER.

CS/ECE 5780/6780: Embedded System Design

(From Glenford Myers: The Art of Software Testing)

Reasoning About Imperative Programs. COS 441 Slides 10

14.1 Encoding for different models of computation

Requirements Validation and Negotiation

Equivalence Class Partitioning. Equivalence Partitioning. Definition and Example. Example set of classes

Combining UML and Z in a Software Process

Department of Electrical & Computer Engineering, University of Calgary. B.H. Far

Introduction to Software Engineering p. 1 The Scope of Software Engineering p. 3 Historical Aspects p. 4 Economic Aspects p. 7 Maintenance Aspects p.

Software Testing CS 408. Lecture 11: Review 2/20/18

Impact of Dependency Graph in Software Testing

Certified Automotive Software Tester Sample Exam Paper Syllabus Version 2.0

Testing Theory. Agenda - What will you learn today? A Software Life-cycle Model Which part will we talk about today? Theory Lecture Plan

Software Quality. Chapter What is Quality?

Software Engineering Economics

10. Software Testing Fundamental Concepts

Hardware versus software

Lecture 1 Contracts : Principles of Imperative Computation (Fall 2018) Frank Pfenning

Transcription:

Course "Softwareprozesse" Cleanroom Software Engineering Lutz Prechelt Freie Universität Berlin, Institut für Informatik Principles Empirical results Typical practices Stepwise refinement box structures, verification Statistical testing Usage modeling Hints for practice Cleanroom and CMMI 1 / 31

Cleanroom classification and goals Proposed by Harlan D. Mills, IBM, since 1980 'Cleanroom' stands for defect prevention instead of defect elimination Goal: High, quantified reliability at low cost Classification: Cleanroom is a development approach and a management approach Harlan Mills Context: Whenever precise specifications can be written early For new development, maintenance, and reengineering Independent of language and technology Requires approximately CMMI Level 3 2 / 31

Cleanroom principles Cleanroom development principle: Development teams strive to produce products without any defects by careful design and development by verification and review but not by testing Cleanroom testing principle: The purpose of testing is measuring the reliability of the product not improving the reliability Cleanroom management principle: Team-based practices limit the scope of human fallability and allow for continuous improvement 3 / 31

Empirical results (1): IBM Cobol SF R.C. Linger, H.D. Mills: "A Case Study in Cleanroom Software Engineering: The IBM COBOL Structuring Facility", 12th Intl. Computer Science and Applications Conf., Oct. 1988. Project developing "Cobol Structuring Facility" COBOL/SF A program analyzer/translator (written in PL/1) for converting Cobol code with GOTOs into structured Cobol code 52 KLOC modified/added to existing 40 KLOC base product Overall productivity: +400% Overall defect density: 3.4 defects/kloc Field-testing defects: 10 (only 1 of them major) The defect reduction is the main reason for the huge improvement in productivity Testing such a system is very laborious 4 / 31

Empirical results (2): Ellemtel/Ericsson OS32 L.G. Tann: "OS32 and Cleanroom" 1st Annual European Industrial Symposium on Cleanroom Software Engineering, Copenhagen, Denmark, 1993, pp. 1-40. Project developing an operating system for telephone switching systems 73 people staff, 33 months duration 350 KLOC resulting software size (14 LOC/PM) Development productivity: +70% Testing productivity: +100% (tests per hour) Testing defect density: 1 defect/kloc These are very big improvements, considering this was a mature development organization already. 5 / 31

Empirical results (3): Controlled experiment R. Selby, V. Basili, F. Baker: "Cleanroom Software Development: An Empirical Evaluation" IEEE Transactions on Software Engineering, 13(9), Sept. 1987 A controlled experiment: 15 teams (10 Cleanroom, 5 conventional) of 3 student developers (w. prof. experience). Each develops the same SW electronic messaging system: duration 6 weeks, 4 milestones, resulting size 800 to 2300 LOC of Simple-T code Results: The Cleanroom teams developed more functionality All Cleanroom teams kept all milestones, only 2 of the 5 others did The Cleanroom programs were less complex (control flow) and had better annotation The Cleanroom programs had significantly fewer test failures 86% of the developers missed testing (quality was not affected) 6 / 31

Typical Cleanroom techniques Small teams High motivation, close cooperation, efficiency "Defects are not acceptable!" Parallel development Strict modularization has to be done at specification time Exact specification All partial specifications are precise and self-contained Physical cleanroom Strict separation of development and testing Development teams Development teams are strictly forbidden to perform any testing Test teams Test teams never modify programs 7 / 31

Typical Cleanroom techniques (2) Exact specification Defect prevention Precise specifications help avoid ambiguity defects Verification During development, defects are continually searched for by comparing with specification Specif. languages: Z, VDM, box method, special grammars Stepwise refinement with the box method 1. Specification (black box) Describes WHAT without HOW 2. State description (state box) Specification as a state machine (not always useful) 3. Process description (clear box) Partial HOW: "Implementation", but may use further black boxes 8 / 31

Typical Cleanroom techniques (3) Review/verification Performed for each refinement State box and clear box Grounded in mathematics, performed as team discussion Convincing argumentation, rarely formal mathematical proof Argument is formulated and verified during an inspection Incremental development Initially, only basic functionality is developed The key point! Statistical testing Usage modelling Test cases are a random sample according to usage model Quantitative statement on reliability (certification) 9 / 31

Typical Cleanroom techniques: Note First and foremost, Cleanroom development is an attitude So none of the above techniques is absolutely mandatory: They can be driven to extremes for instance developers may be prohibited to even compile their code They can be relaxed for instance by performing defect testing before statistical testing They can be exchanged for others for instance by driving development in some other way than by box refinement M. Deck: Cleanroom Software Engineering Myths and Realities, 1997 10 / 31

Cleanroom process flow overview Requirements analysis Development team Specification Design: next refinement Verification, correction Definition of next increment Test team Usage modeling Test case generation Statistical testing Reliability certification 11 / 31

Problems and Obstacles Cleanroom is not suited if formal specification is difficult which is commonly the case for interactive systems determining the correctness of test outputs is costly but this is a problem for conventional development as well. One could still do Cleanroom without reliability certification by leaving out statistical testing Necessary preconditions: Highly trained software engineers Others cannot create reliable verification arguments Defined software process (CMMI Level ~3) Immature processes will lack the necessary discipline and control 12 / 31

Specification and design with box structure Define black box: define output based on input history Define state box [perhaps]: define states for the representation of input history reformulate black box (may introduce several new black boxes) verify reformulation: state box must be equivalent to black box Define clear box: define data abstraction for state data reformulate state box (may introduce several new black boxes) verify reformulation: clear box must be equivalent to state box Continue with black boxes of the refinement 13 / 31

Trivial refinement example black box 1: triangletype(a, b, c) precondition: a, b, c are positive, real numbers postcondition: return EQUILATERAL / ISOSCELES / OTHER / NO_TRIANGLE the triple (a, b, c) is side lengths of an equilateral / non equilateral isosceles / non isosceles triangle / cannot be side lengths of a triangle (only the lengths of the sides are relevant here!) 14 / 31

Refinement example (2) clear box 1: triangletype(a, b, c) IF allsidessatisfytriangleinequation(a, b, c) THEN return truetriangletype(a, b, c) ELSE return NO_TRIANGLE black box 2: allsidessatisfytriangleinequation(a, b, c) precondition: a, b, c positive, real numbers postcondition: True if each side is shorter than the sum of the other two; else False black box 3: truetriangletype(a, b, c) precondition: (a, b, c) are the side lengths of a triangle postcondition: 15 / 31

Refinement example (3) verification clear box 1: (a, b, c) can form triangle the two shorter sides x, y together are longer than the longest, z. Hence, z < x + y (i.e., "side z satisfies triangle inequation") is sufficient for diagnosing a triangle. "All sides satisfy triangle inequation" is a stronger condition, hence also sufficient. Is "All sides " also necessary? Yes: If z < x + y holds, x < z + y and y < x + z will hold even more strongly Hence, clear box 1 is correct. 16 / 31

Refinement example (4) clear box 2: allsidessatisfytriangleinequation(a, b, c) return (a < b + c AND b < a + c AND c < a + b) verification clear box 2: 3 different side lengths a, b, c are tested ( "triangle"), tests are connected by 'AND' ( "all sides"), each test compares one side to the sum of the two others, each comparison is by 'less than' ( correct inequation). Hence, the implementation appears to be correct 17 / 31

Refinement example (5) clear box 3: truetriangletype(a, b, c) IF a = b= c THEN return EQUILATERAL ELSE IF a = b OR a = c OR b = c THEN return ISOSCELES ELSE return OTHER verification clear box 3: 'Equilateral' is a special case of 'isoceles' and must therefore be tested first, this is done here. The test for 'equilateral' is correct. The test for 'isosceles' must check 3 different pairs (correct), only one needs to be equal (connection with 'OR', correct) 'Other' is the only remaining case, must be 'ELSE' part. Correct. Therefore clear box 3 is correct. 18 / 31

Statistical Testing and Certification Most software processes use defect testing Goal: Find as many defects as possible, with as few test cases as possible Testing concentrates on 'difficult' cases. Defect testing makes almost no statement about reliability. In contrast, Cleanroom uses statistical testing Goal: Quantify reliability; attitude like acceptance testing Does not specifically aim to find defects Testing reflects the frequency of 'typical' cases Basis: Usage modelling Based on description of the usage profile (from requirements) Mathematical description with Markov-chains (finite state space, discrete events) 19 / 31

Example: Excerpt from the usage model for a text editor Probabilistic state machine: States are actions, stochastic sequencing 0.1 0.02 0.38 place block marker 0.05 copy block 0.22 0.05 0.9 0.08 0.1 0.2 delete block 0.08 0.07 0.06 0.34 enter symbol 0.04 move cursor 0.8 0.13 0.66 0.02 0.7 20 / 31

Testing process Any number of test inputs can be generated automatically based on the usage model Output correctness predicate? Depends on application Often only plausibility checking is possible Measure the intervals between failures Terminate when sufficient reliability can be certified Stop when insufficient reliability has been determined 21 / 31

Reliability certification The goal is a statement such as "MTTF(program) m with confidence K" e.g. "With confidence 95% we can say that this program fails at most once every 2 000 000 steps" MTTF: mean time-to-failure ("time" being the number of steps) Computed with statistical methods (binomial distribution) Problem: When I find and correct a defect, may I still use the data from the previous test runs? Defect models and reliability growth models may allow this, but then need to rely on assumptions (in particular the nonintroduction of new failures). This is beyond the scope of this lecture. 22 / 31

Certification testing: basic idea Schematic view! Details follow number of failures number of test cases Note that the up-steps are not vertical; they go 1 to the right as well. 23 / 31

Details: Binomial distribution Given an event (here: failure) with probability p (here: 0.001) i.e. we want to certify 99.9% reliability (= 1-p) A binomial distribution describes the number F of failures to be expected during N runs (here: N=3000) for 95% confidence: acceptance region rejection region http://mathworld.wolfram.com/binomialdistribution.html 24 / 31

Certification testing Limit lines for binomial distribution (N trials, p=0.001) y P N,p (F < y) 0.95 continue testing P N,p (F < y) 0.05 25 / 31

Cleanroom testing in practice M. Deck, J.A. Whittaker: "Lessons Learned from Fifteen Years of Cleanroom Testing", 1997 One should integrate development and testing Split has too much negative side-effects adversarial thinking is bad, because collaboration helps Cooperation adds value e.g. operational profile helps SW design wrt realtime behavior There will be some defects to be found and removed Statistical testing is very difficult huge input spaces, so non-trivial usage models become very complicated One should adapt the techniques to the context e.g. prototyping may be useful e.g. coverage testing may be useful/required regression testing is useful 26 / 31

Cleanroom and CMMI CMMI process areas covered by typical Cleanroom practices Level 2: Managed Measurement and Analysis MA with respect to reliability only Process and Product Quality Assurance PPQA verification discipline Level 3: Defined Technical Solution TS SP 2.3 Design Interfaces Using Criteria: formal specification Verification VER The heart of Cleanroom! Level 4: Quantitatively Manag'd Quantitative Project Mgmt QPM Statistical testing Level 5: Optimizing Causal Analysis and Resolution CAR Continuous team improvement (defect-based) 27 / 31

Cleanroom and CMMI (2) Cleanroom alone does not get you anywhere wrt. CMMI not even to Level 2 But the quality culture inspired by Cleanroom is a useful driver for many improvements up to Level 5: Level 2: PPQA (developers become aware of process quality); Level 3: VER (reviews become standard practice) Level 4: OPP (defect densities become a natural process benchmark) Level 4: QPM (quantitative quality management is established) Level 5: OPM (developers start continuous improvements wrt. defect avoidance, thus opening the organization for process improvement) 28 / 31

Literature Richard Linger, Carmen Trammell: "Cleanroom Software Engineering Reference Model", Software Engineering Institute, Technical Report CMU/SEI-96-TR-022 detailled definition of the Cleanroom process 29 / 31

Summary: Cleanroom Software Engineering We studied Cleanroom for its ideas and basic attitude: "Do not accept defects, favor defect prevention over defect detection" not as a software process to be used exactly as a whole; useful where reliability matters a lot and specs are available Key properties: Exact specification (important) Stepwise refinement with box-specification (replacable) Verification during inspection (important, done by a team) Statistical testing based on usage model (ideally ) Reliability certification (ideally ) Result: very low defect rate, high productivity 30 / 31

Thank you! 31 / 31