Random input testing with R

Similar documents
1.7 Limit of a Function

Section 0.3 The Order of Operations

Intro. Scheme Basics. scm> 5 5. scm>

(Refer Slide Time: 1:27)

Data Structures and Algorithms Dr. Naveen Garg Department of Computer Science and Engineering Indian Institute of Technology, Delhi.

15-451/651: Design & Analysis of Algorithms November 4, 2015 Lecture #18 last changed: November 22, 2015

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Priority Queues / Heaps Date: 9/27/17

The name of our class will be Yo. Type that in where it says Class Name. Don t hit the OK button yet.

Fractions and their Equivalent Forms

Robert Ragan s TOP 3

Divisibility Rules and Their Explanations

SPRITES Moving Two At the Same Using Game State

CDs & DVDs: Different Types of Disk Explained

A quick guide to... Split-Testing

Search Lesson Outline

Intro to Algorithms. Professor Kevin Gold

Reliable programming

Introduction to Algorithms / Algorithms I Lecturer: Michael Dinitz Topic: Dynamic Programming I Date: 10/6/16

Variables and Data Representation

Introduction to Programming

Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.

6 Stephanie Well. It s six, because there s six towers.

Table of Laplace Transforms

How to Improve Your Campaign Conversion Rates

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Sorting lower bound and Linear-time sorting Date: 9/19/17

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Approximation algorithms Date: 11/27/18

Lutheran High North Technology The Finder

Sum or difference of Cubes

MergeSort, Recurrences, Asymptotic Analysis Scribe: Michael P. Kim Date: April 1, 2015

Web Design and Databases WD: Class 3: Usability. Dr Helen Hastie Dept of Computer Science Heriot-Watt University

(Refer Slide Time: 05:25)

Unit 9 Tech savvy? Tech support. 1 I have no idea why... Lesson A. A Unscramble the questions. Do you know which battery I should buy?

CPSC 320: Intermediate Algorithm Design and Analysis. Tutorial: Week 3

Bryan Kreuzberger, Creator of The Breakthrough System Presents. Breakthrough BLUEPRINT

CS125 : Introduction to Computer Science. Lecture Notes #4 Type Checking, Input/Output, and Programming Style

Chapter01.fm Page 1 Monday, August 23, :52 PM. Part I of Change. The Mechanics. of Change

Cold, Hard Cache KV? On the implementation and maintenance of caches. who is

Meet the Cast. The Cosmic Defenders: Gobo, Fabu, and Pele The Cosmic Defenders are transdimensional

Beyond Counting. Owen Kaser. September 17, 2014

Taskbar: Working with Several Windows at Once

Errors. And How to Handle Them

Business Hacks to grow your list with Social Media Marketing

FIGURING OUT WHAT MATTERS, WHAT DOESN T, AND WHY YOU SHOULD CARE

CS125 : Introduction to Computer Science. Lecture Notes #38 and #39 Quicksort. c 2005, 2003, 2002, 2000 Jason Zych

MergeSort, Recurrences, Asymptotic Analysis Scribe: Michael P. Kim Date: September 28, 2016 Edited by Ofir Geri

Sucuri Webinar Q&A HOW TO IDENTIFY AND FIX A HACKED WORDPRESS WEBSITE. Ben Martin - Remediation Team Lead

How to Read AWStats. Why it s important to know your stats

Sets. Sets. Examples. 5 2 {2, 3, 5} 2, 3 2 {2, 3, 5} 1 /2 {2, 3, 5}

XP: Backup Your Important Files for Safety

What Every Programmer Should Know About Floating-Point Arithmetic

Hello World! Computer Programming for Kids and Other Beginners. Chapter 1. by Warren Sande and Carter Sande. Copyright 2009 Manning Publications

Advanced Operations Research Prof. G. Srinivasan Department of Management Studies Indian Institute of Technology, Madras

Difference Between Dates Case Study 2002 M. J. Clancy and M. C. Linn

1.2 Adding Integers. Contents: Numbers on the Number Lines Adding Signed Numbers on the Number Line

Module 6. Campaign Layering

Rescuing Lost Files from CDs and DVDs

PowerPoint Basics: Create a Photo Slide Show

6.001 Notes: Section 6.1


14.1 Encoding for different models of computation

1 Algorithm and Proof for Minimum Vertex Coloring

Notebook Assignments

Transcriber(s): Aboelnaga, Eman Verifier(s): Yedman, Madeline Date Transcribed: Fall 2010 Page: 1 of 9

TestComplete 3.0 Overview for Non-developers

Proofwriting Checklist

Getting started with simulating data in R: some helpful functions and how to use them Ariel Muldoon August 28, 2018

CS261: A Second Course in Algorithms Lecture #14: Online Bipartite Matching

3 Nonlocal Exit. Quiz Program Revisited

If Statements, For Loops, Functions

Close Your File Template

Programming in the Real World. Dr. Baldassano Yu s Elite Education

6.001 Notes: Section 7.1

Clean & Speed Up Windows with AWO

A Beginner s Guide to Successful Marketing

Have the students look at the editor on their computers. Refer to overhead projector as necessary.

Lecture Notes on Contracts

Algorithm Design and Recursion. Search and Sort Algorithms

1 GSW Bridging and Switching

These are notes for the third lecture; if statements and loops.

Discover How to Watch the Mass Ascension of the Albuquerque International Balloon Fiesta Even if You Can t Be There

5 R1 The one green in the same place so either of these could be green.

COMP 161 Lecture Notes 16 Analyzing Search and Sort

Burning CDs in Windows XP

Introduction to Algorithms / Algorithms I Lecturer: Michael Dinitz Topic: Approximation algorithms Date: 11/18/14

It s possible to get your inbox to zero and keep it there, even if you get hundreds of s a day.

Excel Basics Rice Digital Media Commons Guide Written for Microsoft Excel 2010 Windows Edition by Eric Miller

Recitation 4: Elimination algorithm, reconstituted graph, triangulation

UKNova s Getting Connectable Guide

Computational Complexity and Implications for Security DRAFT Notes on Infeasible Computation for MA/CS 109 Leo Reyzin with the help of Nick Benes

Repetition Through Recursion

Trombone players produce different pitches partly by varying the length of a tube.

Chapter 5 Errors. Bjarne Stroustrup

Detailed instructions for adding (or changing) your Avatar (profile picture next to your

Week 5, continued. This is CS50. Harvard University. Fall Cheng Gong

Regularization and model selection

CS 370 The Pseudocode Programming Process D R. M I C H A E L J. R E A L E F A L L

9 R1 Get another piece of paper. We re going to have fun keeping track of (inaudible). Um How much time do you have? Are you getting tired?

CS 4349 Lecture August 21st, 2017

CMPSCI 187: Programming With Data Structures. Lecture 5: Analysis of Algorithms Overview 16 September 2011

RACKET BASICS, ORDER OF EVALUATION, RECURSION 1

Transcription:

Random input testing with R Patrick Burns http://www.burns-stat.com stat.com 2011 August Given at user!2011 at the University of Warwick on 2011 August 17 in the Programming session, Uwe Ligges presiding. 1

The goal is for us to have fewer bugs in our code. Photo from istockphoto.com 2

We ll start with a bit about the application that drove me in the direction that I ve come. The application is portfolio optimization in finance. The textbook version of this is quadratic programming that s an easy problem. The reality is that it is a species of the knapsack problem: what is the best bundle of stuff that you can carry with you? That s a hard problem. How hard? Let s think about doing it via brute force. Photo by David Weekly via everystockphoto.com 3

We ll take a small problem. Photo via istockphoto.com 4

We ll make some blue sky, absolutely optimistic assumptions. Photo via istockphoto.com 5

Unit of time Now we need a unit of time to state how long this process will take. I have one in mind, what is it? 6

The first guess from the audience was right: the age of the universe. There is a combinatorial explosion that makes this a fantastically hard problem. Wilkinson Microwave Anisotropy Probe image from NASA 7

Not everyone is keen on the wait. Photo from istockphoto.com 8

So in my code I have a number of shortcuts to speed up the process. Nothing could possible go wrong here. Photo by jurvetson via everystockphoto.com (I think) 9

Utilities Maximise information ratio Minimum variance Maximum return Mean-variance utility Mean-volatility utility Minimize distance to a portfolio With or without transaction costs Linear costs Polynomial costs Multiple terms with arbitrary exponents Separate costs for long-buy, long-sell, short-buy, short-sell With or without multiple expected return vectors With or without multiple variance matrices With or without multiple target portfolios Choice of quantile in multiple utility cases Can use weights in multiple utility cases When doing optimization, we need a utility. So pick one. 10

Constraints Long-only (or not) Value of gross, net, long, short Turnover Maximum weight (per asset) Number of assets traded Number of assets in portfolio Threshold (portfolio and/or trade) Linear On portfolio or trade On net or gross On long-side, on short-side, or both Count constraints Forced trades Assets to trade Number of positions closed Round lotting (automatic) Risk fractions Expected return Variance Distance Sum of n largest weights Cost There will also be a set of constraints which is a subset of all possible constraints. So there is a combinatorial explosion of inputs that is similar to the combinatorial explosion that makes it a hard problem in the first place. And we want to test to see if the code works. 11

Fixed problem Right result? The standard approach to software testing is to produce a fixed problem and ask if we get the correct answer. Perhaps you can spot that already I have a problem. 12

Fixed problem Rightish result? If it takes billions of years to get one right answer, perhaps my test suite will be a bit thin for a while. I can t ask if I have the right answer. I can only ask if the answer is sort of right. 13

Rightish answers Sanity check shortcuts Sanity check result Close to best observed? To see if an answer appears right, there is a large number of checks in test mode that squawk if something doesn t look right. In particular almost all shortcuts are also computed the long way in test mode and the two results compared. There are checks to see if the result makes sense. Since my optimization algorithm is random, it will potentially get different answers when using different seeds. Hence we can see if the current answer is close to the best answer that has been observed. This gives us a sense of how well the algorithm works on the particular problem. 14

testthat RUnit svunit If you are creating a standard type test suite for R code, then there are a few packages that can help you. 15

Fixed problem Right result? I have a standard test suite. But no matter how expansive my suite becomes, I will only ever test a tiny fraction of the possible combinations of inputs. And writing a standard test suite is quite labor-intensive. Plus there can be bias in the test. As someone said to me after the talk, we may have an unconscious desire NOT to find bugs. Hence looking for alternatives to standard testing seems like a worthwhile endeavor. 16

Random problem Any explosions? One alternative type of test is to create random inputs and then smell for smoke. 17

Create inputs Call code Look at output There are three steps in doing this. If the code you are testing is not R code, then R is still a great place to do the first step of creating the random inputs. If you are testing R code, then there are two key functions in performing the second step of calling the code. 18

do.call If you don t know this function, you should. It allows you to call a function with an actual list of arguments. So in the present case, we create a list containing the random inputs that were created in step 1, and then call the function being tested. 19

try or trycatch These are slightly different ways of doing the same thing. They catch the errors from the test cases so that the errors don t leak out into the actual test suite. 20

When doing random testing, there are several things that can happen. Here is the tree of those things. 21

Get answer We can get an answer as opposed to an error. 22

Get answer okay Boring. 23

Get answer wrong Not boring. 24

Get answer wrong Bad, bad, bad computer! We don t like to be here. But it does make the effort of testing seem worthwhile. If we are here, then we want to fix the problem. There are two more things we should do as well: 1. Create a case to put in the standard test suite for this problem. Random testing is an adjunct to standard testing, not a replacement. 2. Look through the code for other occurrences of this same sort of problem. 25

Get answer inefficient We can get the right answer via an inefficient (in time or memory) route. In my current application I m not going to know this, but in other applications it is feasible to spot this. 26

Get error We can get an error rather than an answer. This is a wonderful reason to do random testing. Standard test suites seldom to never exercise the code that throws errors. Random testing can exercise it a lot. You could create a test where you don t even expect to get an answer you re just testing the error throwing. This, of course, is entirely divorced from my original motivation for this sort of testing. 27

Get error Caught When we get an error, it can be caught by the code we ve written. 28

Get error Caught Not clear If the error message is not clear, then we should change the error message. 29

Get error Caught Clear If the error message is clear, then we should question that assumption. When we are coding, we have the curse of knowledge: we know everything that is going on. So any error message that we write is going to make sense to us. We should try to put ourselves in the frame of mind of a user who is not especially clued in. 30

Get error Not caught If the error message is not caught by our code, 31

Get error Not caught Not clear 32

Get error Not caught Clear we should consider catching it because it is even harder in that case for the message to be clear to our not-so-clued-in user. 33

I find writing test suites to be as exciting as digging ditches. Of all the tasks involved in creating software, this is the lowest of the low for me. I had expected the same to be true for random testing. Photo by KayPat via stock.xchng 34

But I found random testing to have a quite different ambience to it. It is much more like playing. Standard tests are basically static write them once, periodically add bits to them, use them many times. Random testing is more like an interactive game: do a bit, see how it goes, do some more, see how it goes, Keeping track of the different versions of the random tests is a good idea. You can go back and use a primitive test on a new version of the software. You can also use a primitive version as the basis for going off in a different testing direction. I hope to see you on the playground. Photo by forst747 via stock.xchng 35

One of the good things about giving a talk is that the audience tells you the name for what you are doing. Fuzz testing is a common word for random input testing. One of the key questions is what distribution to use. The distribution will depend on your goal. It seems like there are three basic goals: 1. Test the correctness of answers 2. Test error throwing code 3. Test for security issues The last of these is most closely associated with fuzz testing. 36

If your goal is one of the latter two, then the distributions should be very wide. That is, lots of stuff that need not even make any sense at all. If you are trying to test the error throwing, then you probably want a more refined test as well that has arguments closer to what is expected. If your goal is to test answers, then it seems to me that there are two possibilities: A distribution aiming at what is likely to be done in practice A distribution aiming at most likely to generate bugs The reason that random testing is more fun is because it is really programming writing the code that creates the distributions. 37

In my tests I have them start by doing two things: Print the set-seed number so that results are reproducible Print the version of the test being used My tests so far have used the iteration variable as the set-seed number. Martin Maechler had a better idea: generate a random set-seed number. That way there is no need to keep track of what iterations have already been done with each test. 38

The business end of my random test function looks like: ans <- try(do.call( trade.optimizer, c(thisdata, extra.args))) if(inherits(ans, try-error )) { cat( \n error with seed, the.seed, \n\n ) } else { pprobe.check(ans) } 39