Section 6.5: Random Sampling and Shuffling

Discrete-Event Simulation: A First Course, © 2006 Pearson Ed., Inc.

- Exactly m! different permutations (a_0, a_1, ..., a_{m-1}) can be formed from a finite set A with m = |A| distinct elements
- A random shuffle generator is an algorithm that will produce any one of these m! permutations in such a way that all are equally likely

Example: if A = {0, 1, 2}, the 3! = 6 different possible permutations of A are

    (0,1,2)  (0,2,1)  (1,0,2)  (1,2,0)  (2,0,1)  (2,1,0)

A random shuffle generator can produce any of these six permutations, each with equal probability 1/6.

Algorithm (random shuffle)

The intuitive way to generate a random shuffle of A is:
- draw the first element a_0 at random from A
- draw the second element a_1 at random from A - {a_0}
- draw the third element a_2 at random from A - {a_0, a_1}, etc.

An in-place algorithm:

    for (i = 0; i < m - 1; i++) {
        j = Equilikely(i, m - 1);
        hold = a[j];                /* swap a[i] and a[j] */
        a[j] = a[i];
        a[i] = hold;
    }

- The algorithm assumes A is stored in the array a
- The algorithm is an excellent example of the elegance of simplicity
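The in-place shuffle above can be sketched as a compilable C function. This is a minimal sketch, not the book's library code: `Random` and `Equilikely` are stand-ins built on the standard `rand()` purely for illustration, and the function name `shuffle` is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in (assumption) for a uniform generator: a value strictly
   between 0 and 1, built on rand() for illustration only. */
double Random(void)
{
    return (rand() + 0.5) / ((double) RAND_MAX + 1.0);
}

/* An integer equally likely to be any of a, a+1, ..., b. */
long Equilikely(long a, long b)
{
    assert(a <= b);
    return a + (long) ((b - a + 1) * Random());
}

/* Shuffle a[0..m-1] in place: position i receives an element drawn
   at random from the not-yet-placed elements a[i..m-1]. */
void shuffle(long a[], long m)
{
    long i, j, hold;

    for (i = 0; i < m - 1; i++) {
        j = Equilikely(i, m - 1);
        hold = a[j];                /* swap a[i] and a[j] */
        a[j] = a[i];
        a[i] = hold;
    }
}
```

The loop stops at i = m - 2 because the last remaining element has only one place left to go.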

Algorithm (random shuffle), continued

[Figure: the contents of a[ ] for m = 9, with i and j marked, prior to the a[0], a[3] swap, prior to the a[1], a[6] swap, and prior to the a[2], a[4] swap.]

- The figure (corresponding to m = 9) illustrates i, j, and the state of a[ ] for the first three loop iterations
- The algorithm is ideal for shuffling a deck of cards, a useful simulation skill
- Later in the section, a minor modification to this algorithm makes it suitable for random sampling without replacement

Random Sampling

- A population P of m = |P| elements a_0, a_1, ..., a_{m-1}
- If m is not large, the population could be stored in primary memory
    - as an array a[0], a[1], ..., a[m-1]
    - as a linked list
- If m is large, the population may be stored in secondary memory
- In any case, the population is (conceptually) a list
- We want a random sample S of n = |S| elements x_0, x_1, ..., x_{n-1} from the list P
- If n is small, the sample can be stored in primary memory as x[0], x[1], ..., x[n-1]
- Sometimes a linked list could be used

Random Sampling: Notation and Terminology (1)

- Generic algorithms use the functions Get(&z, L, j) and Put(z, L, j)
    - L is either P or S
    - Get(&z, L, j) returns the value of the jth element in the list L as z
    - Put(z, L, j) assigns the value of z to the jth element in the list L
- Order preservation: sample order is consistent with population order
    - Important for some applications
- Sequential sampling algorithms:
    - Traverse P once, in order
    - Access to S may or may not be sequential

Random Sampling: Notation and Terminology (2)

- Sampling with replacement:
    - S may contain multiple instances of one element of P
    - n can be larger than m
- Sampling without replacement:
    - S contains at most one instance of each element of P
    - n <= m
- The phrase "at random" can be interpreted in two ways:
    1. Each element of P is equally likely to be an element of S
    2. Each possible sample of size n is equally likely to be selected

Algorithm (sampling with replacement)

    for (i = 0; i < n; i++) {
        j = Equilikely(0, m - 1);
        Get(&z, P, j);              /* random access */
        Put(z, S, i);
    }

- Complexity is O(n)
- The algorithm is non-sequential and m must be known
- Sampling is with replacement: n can be larger than m
- Order is not preserved
- The number of possible samples is m^n (if the elements of P are distinct)
- All samples are equally likely to be generated

Example

Suppose
- the population is stored as the array a[0], a[1], ..., a[m-1]
- the sample is stored as the array x[0], x[1], ..., x[n-1]

Then the sampling-with-replacement algorithm is equivalent to

    for (i = 0; i < n; i++) {
        j = Equilikely(0, m - 1);
        x[i] = a[j];
    }

Algorithm (sampling without replacement)

    for (i = 0; i < n; i++) {
        j = Equilikely(i, m - 1);   /* lower bound is i, not 0 */
        Get(&z, P, j);              /* random access */
        Put(z, S, i);
        Get(&x, P, i);              /* sequential access */
        Put(z, P, i);               /* sequential access */
        Put(x, P, j);               /* random access */
    }

- O(n) complexity; non-sequential; m must be known
- Sampling is without replacement: n <= m
- Order is not preserved in S; order is destroyed in P by shuffling
- Number of possible samples: m(m - 1)...(m - n + 1) = m!/(m - n)!
- All samples are equally likely to be generated

Example

Suppose the population and sample are stored as the arrays a[0], a[1], ..., a[m-1] and x[0], x[1], ..., x[n-1] respectively. The sampling-without-replacement algorithm is then equivalent to

    for (i = 0; i < n; i++) {
        j = Equilikely(i, m - 1);
        x[i] = a[j];
        a[j] = a[i];
        a[i] = x[i];
    }

This algorithm is a simple extension of the random shuffle algorithm: the shuffle loop is stopped after n iterations instead of m - 1.
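As a hedged illustration of the array form above, the following C sketch wraps the loop in a function. The names `Random`, `Equilikely`, and `sample_without_replacement` are assumptions made for this example (the first two are stand-ins built on `rand()`); the loop body is the one from the slide.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in (assumption): uniform value strictly between 0 and 1. */
double Random(void)
{
    return (rand() + 0.5) / ((double) RAND_MAX + 1.0);
}

/* An integer equally likely to be any of a, a+1, ..., b. */
long Equilikely(long a, long b)
{
    assert(a <= b);
    return a + (long) ((b - a + 1) * Random());
}

/* Draw n of the m population elements without replacement.
   Side effect: the first n positions of a[] end up holding the sample,
   so the original order of the population is destroyed. */
void sample_without_replacement(long a[], long m, long x[], long n)
{
    long i, j;

    assert(0 <= n && n <= m);
    for (i = 0; i < n; i++) {
        j = Equilikely(i, m - 1);   /* choose among the unsampled a[i..m-1] */
        x[i] = a[j];
        a[j] = a[i];                /* move the displaced element to slot j */
        a[i] = x[i];
    }
}
```

Because each chosen element is swapped into the prefix a[0..i], it cannot be drawn again, which is what makes the sampling "without replacement".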

Sequential Sampling (without replacement)

- Basic idea: traverse P once, in order, and select elements to put in S
- The algorithms are O(m)
- The selection or non-selection of each element is random, with probability p, as illustrated:

    Get(&z, P, j);
    if (Bernoulli(p))
        Put(z, S, i);

- For Algorithm 6.5.4, p is independent of i and j
- For the later sequential algorithms, including Algorithm 6.5.6, the probability of selection changes with the values of i and j and the number of elements selected so far

Algorithm (sequential sampling with fixed probability p)

    i = 0;
    j = 0;
    while (more data in P) {
        Get(&z, P, j);              /* sequential access */
        j++;
        if (Bernoulli(p)) {
            Put(z, S, i);
            i++;
        }
    }

- Order is preserved
- Each element of P is selected, independently, with probability p
- The algorithm does not make use of either m or n explicitly
    - m may be unknown
    - the sample size n is a Binomial(m, p) random variate

Example

The population is stored in secondary memory as a sequential file and the sample is stored as the array x[0], x[1], ..., x[n-1]. The fixed-probability sequential algorithm is then equivalent to

    i = 0;
    while (more data in P) {
        z = GetData();
        if (Bernoulli(p)) {
            x[i] = z;
            i++;
        }
    }
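A minimal C sketch of this fixed-probability pass follows, with an in-memory array standing in for the sequential file. The helper names `Random` and `Bernoulli` are stand-ins built on `rand()`, and `bernoulli_sample` is a hypothetical name; m is used only to detect the end of the data, never in the selection rule.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in (assumption): uniform value strictly between 0 and 1. */
double Random(void)
{
    return (rand() + 0.5) / ((double) RAND_MAX + 1.0);
}

/* Returns 1 with probability p and 0 otherwise. */
int Bernoulli(double p)
{
    return Random() < p;
}

/* One sequential pass over a[0..m-1]; each element is kept independently
   with probability p. x[] must have room for m elements, because the
   realized sample size is random. Returns that sample size. */
long bernoulli_sample(const long a[], long m, double p, long x[])
{
    long i = 0, j;

    assert(0.0 <= p && p <= 1.0);
    for (j = 0; j < m; j++)         /* stands in for "while more data in P" */
        if (Bernoulli(p))
            x[i++] = a[j];
    return i;
}
```

The returned count varies from run to run; this is the sense in which the sample size cannot be fixed in advance with this approach.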

Applications

- Discrete-event simulation
- Real-time data acquisition

The objective is to sample, at random, some percentage (say, 1%) of the population.
- The fixed-probability sequential algorithm with p = 0.01 provides this ability, but the sample size n cannot be specified exactly
- The sequential algorithm that selects a_j with probability (n - i)/(m - j) provides this ability and can specify n exactly, provided m is known
- The sequential algorithm that overwrites sample elements at random provides this ability and can specify n exactly, even if m is unknown

Algorithm (sequential sampling, known m)

    i = 0;
    j = 0;
    while (i < n) {
        Get(&z, P, j);                      /* sequential access */
        p = (double) (n - i) / (m - j);     /* real-valued division */
        j++;
        if (Bernoulli(p)) {
            Put(z, S, i);
            i++;
        }
    }

- m = |P| is known
- Order is preserved
- The key is that a_j is selected with probability (n - i)/(m - j)
- The number of possible samples is m!/((m - n)! n!)
- All samples are equally likely to be generated

Example

The population is stored in secondary memory as a sequential file and the sample is stored as the array x[0], x[1], ..., x[n-1]. The known-m sequential algorithm is then equivalent to

    i = 0;
    j = 0;
    while (i < n) {
        z = GetData();
        p = (double) (n - i) / (m - j);     /* real-valued division */
        j++;
        if (Bernoulli(p)) {
            x[i] = z;
            i++;
        }
    }
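The loop above can be sketched in compilable C as follows, again with an array standing in for the sequential file. `Random`, `Bernoulli`, and `sequential_sample` are names assumed for this illustration. The point worth noticing is that when the number of elements still needed equals the number still unread, p is exactly 1, so the loop never reads past the end of the population.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in (assumption): uniform value strictly between 0 and 1. */
double Random(void)
{
    return (rand() + 0.5) / ((double) RAND_MAX + 1.0);
}

/* Returns 1 with probability p and 0 otherwise. */
int Bernoulli(double p)
{
    return Random() < p;
}

/* Select exactly n of the m elements in one ordered pass.
   Element a[j] is kept with probability (n - i)/(m - j), where i is the
   number already selected; the sample preserves population order. */
void sequential_sample(const long a[], long m, long x[], long n)
{
    long i = 0, j = 0;
    double p;

    assert(0 <= n && n <= m);
    while (i < n) {
        p = (double) (n - i) / (double) (m - j);
        if (Bernoulli(p))
            x[i++] = a[j];
        j++;
    }
}
```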

Algorithm (sequential random sampling without replacement, unknown m)

    for (i = 0; i < n; i++) {
        Get(&z, P, i);              /* sequential access */
        Put(z, S, i);
    }
    j = n;
    while (more data in P) {
        Get(&z, P, j);              /* sequential access */
        j++;
        p = (double) n / j;         /* real-valued division */
        if (Bernoulli(p)) {
            i = Equilikely(0, n - 1);
            Put(z, S, i);
        }
    }

Explanation of the Algorithm

- The sample is initialized with the first n elements of the population
- For j >= n, population element a_j overwrites an existing sample element with probability n/(j + 1)
- Access to P is sequential
- Access to S is not sequential
    - thus, the algorithm is not order preserving
- The number of possible samples is m!/((m - n)! n!)
- All samples are equally likely to be generated (with a caveat...)

Example

The population is stored in secondary memory as a sequential file and the sample is stored as the array x[0], x[1], ..., x[n-1]. The unknown-m sequential algorithm is then equivalent to

    for (i = 0; i < n; i++) {
        z = GetData();
        x[i] = z;
    }
    j = n;
    while (more data in P) {
        z = GetData();
        j++;
        p = (double) n / j;         /* real-valued division */
        if (Bernoulli(p)) {
            i = Equilikely(0, n - 1);
            x[i] = z;
        }
    }
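A compilable sketch of the overwrite-based sampler is given below, under the assumption that an array stands in for the sequential file. The names `Random`, `Equilikely`, `Bernoulli`, and `overwrite_sample` are chosen for this example only. The argument m serves solely as the end-of-data test; the selection rule itself depends only on n and the running count j.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in (assumption): uniform value strictly between 0 and 1. */
double Random(void)
{
    return (rand() + 0.5) / ((double) RAND_MAX + 1.0);
}

/* An integer equally likely to be any of a, a+1, ..., b. */
long Equilikely(long a, long b)
{
    assert(a <= b);
    return a + (long) ((b - a + 1) * Random());
}

/* Returns 1 with probability p and 0 otherwise. */
int Bernoulli(double p)
{
    return Random() < p;
}

/* Keep a sample of size n while reading a[0..m-1] once, in order.
   After the first n elements, each new element replaces a randomly
   chosen sample slot with probability n / (number read so far). */
void overwrite_sample(const long a[], long m, long x[], long n)
{
    long i, j, z;

    assert(1 <= n && n <= m);
    for (i = 0; i < n; i++)
        x[i] = a[i];                /* initialize with the first n elements */
    j = n;
    while (j < m) {                 /* stands in for "while more data in P" */
        z = a[j];
        j++;
        if (Bernoulli((double) n / (double) j)) {
            i = Equilikely(0, n - 1);
            x[i] = z;
        }
    }
}
```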

Algorithm Differences

Sequential sampling with selection probability (n - i)/(m - j):
- Requires knowledge of m = |P|
- Preserves order

Sequential sampling with random overwriting of the sample:
- Does not require knowledge of m = |P|
- Does not preserve order

Example

- P = (0, 1, 2, 3, 4); m = 5
- Use the order-preserving sequential algorithm (m known) to select samples of size n = 3
- Sampling is sequential and order is preserved
- There are 5!/(2! 3!) = 10 possible (ordered) samples
- The algorithm generates the possible samples

    (0,1,2) (0,1,3) (0,1,4) (0,2,3) (0,2,4) (0,3,4) (1,2,3) (1,2,4) (1,3,4) (2,3,4)

  each with equal probability 0.10

Example

- If the sequential algorithm for unknown m is used instead, 13 samples are possible:

    (0,1,2) (0,1,3) (0,1,4) (0,3,2) (0,4,2) (3,1,2) (4,1,2)
    (0,3,4) (0,4,3) (3,1,4) (4,1,3) (3,4,2) (4,3,2)

- Note that order is not preserved
- These samples are not all equally likely
- The algorithm produces the correct number of equally likely samples only if all alike-but-for-permutation samples are combined
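The claim about combining alike-but-for-permutation samples can be checked exactly for small cases. The sketch below is an illustration added here, not part of the original slides: it follows every path of the overwrite-based sampler on the population 0, 1, ..., m-1, multiplies the branch probabilities, and accumulates the total for each final sample viewed as a set. The function names are hypothetical.

```c
#include <assert.h>

/* Follow every path of the sampler. x[] is the current sample, j is the
   index of the next population element, pr is the probability of the
   path so far. prob[] is indexed by a bitmask of the sample's elements. */
void enumerate_paths(int x[], int n, int j, int m, double pr, double prob[])
{
    int k;

    if (j == m) {                           /* population exhausted */
        unsigned mask = 0;
        for (k = 0; k < n; k++)
            mask |= 1u << x[k];             /* forget order, keep the set */
        prob[mask] += pr;
        return;
    }
    {
        double p = (double) n / (double) (j + 1);   /* overwrite probability */
        enumerate_paths(x, n, j + 1, m, pr * (1.0 - p), prob);
        for (k = 0; k < n; k++) {           /* each slot has probability 1/n */
            int old = x[k];
            x[k] = j;
            enumerate_paths(x, n, j + 1, m, pr * p / n, prob);
            x[k] = old;
        }
    }
}

/* Fill prob[0 .. 2^m - 1] with the exact probability of each final
   sample set. Assumes 1 <= n <= m <= 16. */
void sample_set_probabilities(int m, int n, double prob[])
{
    int x[16], k;

    assert(1 <= n && n <= m && m <= 16);
    for (k = 0; k < (1 << m); k++)
        prob[k] = 0.0;
    for (k = 0; k < n; k++)
        x[k] = k;                           /* sample starts as the first n */
    enumerate_paths(x, n, n, m, 1.0, prob);
}
```

For m = 5 and n = 3, an ordered outcome such as (3,4,2) has probability (3/4)(1/3)(3/5)(1/3) = 0.05, and so does (4,3,2); together they give the set {2,3,4} a probability of 0.10, the same as every other 3-element set.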

Urn Models

Four examples with models:
- Binomial(n, p)
- Hypergeometric(n, a, b)
- Geometric(p)
- Pascal(n, p)

All of these models can be motivated by drawing, at random, from a conceptual urn initially filled with a amber balls and b black balls.

Binomial

- n balls are drawn from the urn, with replacement
- Let x be the number of amber balls drawn
- A straightforward Monte Carlo simulation of this random experiment:
    - use the sampling-with-replacement algorithm
    - the population is the urn; the sample is the drawn balls
    - x is the number of amber balls in the sample
- More elegant: let p = a/(a + b) and use the O(n) algorithm

    x = 0;
    for (i = 0; i < n; i++)
        x += Bernoulli(p);
    return x;

- The random variate x is Binomial(n, p)
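The "straightforward" Monte Carlo version is described above only in words; one way it might look in C is sketched here. The urn is modeled by numbering the balls 0, ..., a + b - 1 and treating the first a of them as amber, which is an assumption of this example. `Random`, `Equilikely`, and `binomial_urn_by_sampling` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in (assumption): uniform value strictly between 0 and 1. */
double Random(void)
{
    return (rand() + 0.5) / ((double) RAND_MAX + 1.0);
}

/* An integer equally likely to be any of a, a+1, ..., b. */
long Equilikely(long a, long b)
{
    assert(a <= b);
    return a + (long) ((b - a + 1) * Random());
}

/* Draw n balls with replacement from an urn of a amber and b black balls
   and count the amber ones. Balls 0..a-1 are amber, a..a+b-1 are black. */
long binomial_urn_by_sampling(long n, long a, long b)
{
    long m = a + b, i, j, x = 0;

    assert(n >= 0 && a >= 0 && b >= 0 && m > 0);
    for (i = 0; i < n; i++) {
        j = Equilikely(0, m - 1);   /* the ball goes back after each draw */
        if (j < a)
            x++;
    }
    return x;
}
```

Each draw is amber with probability a/(a + b), so the count follows the same Binomial(n, p) model as the Bernoulli-based loop.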

Hypergeometric

- n balls are drawn from the urn, without replacement
- Use a modified version of the previous algorithm

    m = a + b;
    x = 0;
    for (i = 0; i < n; i++) {
        p = (double) (a - x) / (m - i);     /* real-valued division */
        x += Bernoulli(p);
    }
    return x;

- The random variate x is Hypergeometric(n, a, b)
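A compilable sketch of the hypergeometric urn loop follows. `Random`, `Bernoulli`, and `hypergeometric_urn` are names assumed for illustration. At draw i there are a - x amber balls among the m - i balls left, which is where the selection probability comes from; it becomes exactly 0 once the amber balls run out and exactly 1 once the black balls run out.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in (assumption): uniform value strictly between 0 and 1. */
double Random(void)
{
    return (rand() + 0.5) / ((double) RAND_MAX + 1.0);
}

/* Returns 1 with probability p and 0 otherwise. */
int Bernoulli(double p)
{
    return Random() < p;
}

/* Number of amber balls among n drawn without replacement from an urn
   holding a amber and b black balls. */
long hypergeometric_urn(long n, long a, long b)
{
    long m = a + b, i, x = 0;
    double p;

    assert(a >= 0 && b >= 0 && 0 <= n && n <= m);
    for (i = 0; i < n; i++) {
        p = (double) (a - x) / (double) (m - i);    /* amber left / balls left */
        x += Bernoulli(p);
    }
    return x;
}
```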

More Hypergeometric

The Hypergeometric(n, a, b) pdf is

    $$ f(x) = \frac{\binom{a}{x}\binom{b}{n-x}}{\binom{a+b}{n}}, \qquad x = \max(0, n-b), \ldots, \min(a, n) $$

- Lower limit of x: if n > b, at least n - b amber balls will be drawn
- Upper limit of x: if n > a, then at most a amber balls will be drawn
- In applications where n is smaller than both a and b, the range of possible values is x = 0, 1, 2, ..., n
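To make the pdf and its limits concrete, here is a small C sketch that evaluates it directly from binomial coefficients. The function names `choose` and `hypergeometric_pdf` are hypothetical, and the coefficients are computed in floating point, which is adequate only for modest arguments.

```c
#include <assert.h>

/* Binomial coefficient C(n, r) as a double; 0 when r is out of range. */
double choose(long n, long r)
{
    double c = 1.0;
    long k;

    if (r < 0 || r > n)
        return 0.0;
    for (k = 1; k <= r; k++)
        c = c * (double) (n - r + k) / (double) k;
    return c;
}

/* Hypergeometric(n, a, b) pdf at x: the probability of x amber balls
   among n drawn without replacement from a amber and b black balls. */
double hypergeometric_pdf(long x, long n, long a, long b)
{
    assert(a >= 0 && b >= 0 && 0 <= n && n <= a + b);
    return choose(a, x) * choose(b, n - x) / choose(a + b, n);
}
```

For example, with a = 3, b = 2, and n = 4, the possible values are x = max(0, 4 - 2) = 2 through min(3, 4) = 3, with f(2) = 3 * 1 / 5 = 0.6 and f(3) = 1 * 2 / 5 = 0.4.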

Geometric

- Draw with replacement until the first black ball is obtained
- Let x be the number of amber balls drawn
- With p = a/(a + b), the following algorithm simulates this random experiment

    x = 0;
    while (Bernoulli(p))
        x++;
    return x;

- The random variate x is Geometric(p)
- This stochastic model is commonly used in reliability studies: p is usually close to 1.0 and x counts the number of successes before the first failure

Pascal

- Draw with replacement until the nth black ball is obtained
- Let x be the number of amber balls drawn
- With p = a/(a + b), the following algorithm simulates this random experiment

    x = 0;
    for (i = 0; i < n; i++) {
        while (Bernoulli(p))
            x++;
    }
    return x;

- The random variate x is Pascal(n, p)
- This stochastic model is commonly used in reliability studies: p is usually close to 1.0 and x counts the number of successes before the nth failure
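The Pascal urn loop can be sketched as a compilable C function as follows. `Random`, `Bernoulli`, and `pascal_urn` are names assumed for this example. Calling it with n = 1 reproduces the geometric urn experiment, since it then counts amber balls before the first black ball.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in (assumption): uniform value strictly between 0 and 1. */
double Random(void)
{
    return (rand() + 0.5) / ((double) RAND_MAX + 1.0);
}

/* Returns 1 with probability p and 0 otherwise. */
int Bernoulli(double p)
{
    return Random() < p;
}

/* Number of amber balls drawn, with replacement, before the nth black
   ball appears, from an urn of a amber and b black balls. */
long pascal_urn(long n, long a, long b)
{
    double p;
    long i, x = 0;

    assert(n >= 0 && a >= 0 && b > 0);  /* b > 0 so a black ball can occur */
    p = (double) a / (double) (a + b);
    for (i = 0; i < n; i++)             /* one inner loop per black ball */
        while (Bernoulli(p))
            x++;
    return x;
}
```

The requirement b > 0 matters: with no black balls p would be 1 and the inner loop would never end.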


A.1 Numbers, Sets and Arithmetic 522 APPENDIX A. MATHEMATICS FOUNDATIONS A.1 Numbers, Sets and Arithmetic Numbers started as a conceptual way to quantify count objects. Later, numbers were used to measure quantities that were extensive,

More information

Catalan Numbers. Table 1: Balanced Parentheses

Catalan Numbers. Table 1: Balanced Parentheses Catalan Numbers Tom Davis tomrdavis@earthlink.net http://www.geometer.org/mathcircles November, 00 We begin with a set of problems that will be shown to be completely equivalent. The solution to each problem

More information

Graphs That Are Randomly Traceable from a Vertex

Graphs That Are Randomly Traceable from a Vertex Graphs That Are Randomly Traceable from a Vertex Daniel C. Isaksen 27 July 1993 Abstract A graph G is randomly traceable from one of its vertices v if every path in G starting at v can be extended to a

More information

What is a Graphon? Daniel Glasscock, June 2013

What is a Graphon? Daniel Glasscock, June 2013 What is a Graphon? Daniel Glasscock, June 2013 These notes complement a talk given for the What is...? seminar at the Ohio State University. The block images in this PDF should be sharp; if they appear

More information

Analysis of Algorithms. Unit 4 - Analysis of well known Algorithms

Analysis of Algorithms. Unit 4 - Analysis of well known Algorithms Analysis of Algorithms Unit 4 - Analysis of well known Algorithms 1 Analysis of well known Algorithms Brute Force Algorithms Greedy Algorithms Divide and Conquer Algorithms Decrease and Conquer Algorithms

More information

CHAPTER 2 Modeling Distributions of Data

CHAPTER 2 Modeling Distributions of Data CHAPTER 2 Modeling Distributions of Data 2.2 Density Curves and Normal Distributions The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers Density Curves

More information

Concept of Curve Fitting Difference with Interpolation

Concept of Curve Fitting Difference with Interpolation Curve Fitting Content Concept of Curve Fitting Difference with Interpolation Estimation of Linear Parameters by Least Squares Curve Fitting by Polynomial Least Squares Estimation of Non-linear Parameters

More information

Quantitative Biology II!

Quantitative Biology II! Quantitative Biology II! Lecture 3: Markov Chain Monte Carlo! March 9, 2015! 2! Plan for Today!! Introduction to Sampling!! Introduction to MCMC!! Metropolis Algorithm!! Metropolis-Hastings Algorithm!!

More information

Lecture 7. Transform-and-Conquer

Lecture 7. Transform-and-Conquer Lecture 7 Transform-and-Conquer 6-1 Transform and Conquer This group of techniques solves a problem by a transformation to a simpler/more convenient instance of the same problem (instance simplification)

More information

Chapter 3: The Efficiency of Algorithms

Chapter 3: The Efficiency of Algorithms Chapter 3: The Efficiency of Algorithms Invitation to Computer Science, Java Version, Third Edition Objectives In this chapter, you will learn about Attributes of algorithms Measuring efficiency Analysis

More information

Notes on CSP. Will Guaraldi, et al. version /13/2006

Notes on CSP. Will Guaraldi, et al. version /13/2006 Notes on CSP Will Guaraldi, et al version 1.5 10/13/2006 Abstract This document is a survey of the fundamentals of what we ve covered in the course up to this point. The information in this document was

More information

PRAM ALGORITHMS: BRENT S LAW

PRAM ALGORITHMS: BRENT S LAW PARALLEL AND DISTRIBUTED ALGORITHMS BY DEBDEEP MUKHOPADHYAY AND ABHISHEK SOMANI http://cse.iitkgp.ac.in/~debdeep/courses_iitkgp/palgo/index.htm PRAM ALGORITHMS: BRENT S LAW 2 1 MERGING TWO SORTED ARRAYS

More information

CSE 332: Data Structures & Parallelism Lecture 12: Comparison Sorting. Ruth Anderson Winter 2019

CSE 332: Data Structures & Parallelism Lecture 12: Comparison Sorting. Ruth Anderson Winter 2019 CSE 332: Data Structures & Parallelism Lecture 12: Comparison Sorting Ruth Anderson Winter 2019 Today Sorting Comparison sorting 2/08/2019 2 Introduction to sorting Stacks, queues, priority queues, and

More information

Outline. Computer Science 331. Heap Shape. Binary Heaps. Heap Sort. Insertion Deletion. Mike Jacobson. HeapSort Priority Queues.

Outline. Computer Science 331. Heap Shape. Binary Heaps. Heap Sort. Insertion Deletion. Mike Jacobson. HeapSort Priority Queues. Outline Computer Science 33 Heap Sort Mike Jacobson Department of Computer Science University of Calgary Lectures #5- Definition Representation 3 5 References Mike Jacobson (University of Calgary) Computer

More information

Statistical model for the reproducibility in ranking based feature selection

Statistical model for the reproducibility in ranking based feature selection Technical Report hdl.handle.net/181/91 UNIVERSITY OF THE BASQUE COUNTRY Department of Computer Science and Artificial Intelligence Statistical model for the reproducibility in ranking based feature selection

More information

Rendering. Mike Bailey. Rendering.pptx. The Rendering Equation

Rendering. Mike Bailey. Rendering.pptx. The Rendering Equation 1 Rendering This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License Mike Bailey mjb@cs.oregonstate.edu Rendering.pptx d i d 0 P P d i The Rendering

More information

Exploring Fractals through Geometry and Algebra. Kelly Deckelman Ben Eggleston Laura Mckenzie Patricia Parker-Davis Deanna Voss

Exploring Fractals through Geometry and Algebra. Kelly Deckelman Ben Eggleston Laura Mckenzie Patricia Parker-Davis Deanna Voss Exploring Fractals through Geometry and Algebra Kelly Deckelman Ben Eggleston Laura Mckenzie Patricia Parker-Davis Deanna Voss Learning Objective and skills practiced Students will: Learn the three criteria

More information

CSCD01 Engineering Large Software Systems. Design Patterns. Joe Bettridge. Winter With thanks to Anya Tafliovich

CSCD01 Engineering Large Software Systems. Design Patterns. Joe Bettridge. Winter With thanks to Anya Tafliovich CSCD01 Engineering Large Software Systems Design Patterns Joe Bettridge Winter 2018 With thanks to Anya Tafliovich Design Patterns Design patterns take the problems consistently found in software, and

More information

Efficient Storage and Processing of Adaptive Triangular Grids using Sierpinski Curves

Efficient Storage and Processing of Adaptive Triangular Grids using Sierpinski Curves Efficient Storage and Processing of Adaptive Triangular Grids using Sierpinski Curves Csaba Attila Vigh, Dr. Michael Bader Department of Informatics, TU München JASS 2006, course 2: Numerical Simulation:

More information

COS 226 Fall 2015 Midterm Exam pts.; 60 minutes; 8 Qs; 15 pgs :00 p.m. Name:

COS 226 Fall 2015 Midterm Exam pts.; 60 minutes; 8 Qs; 15 pgs :00 p.m. Name: COS 226 Fall 2015 Midterm Exam 1 60 + 10 pts.; 60 minutes; 8 Qs; 15 pgs. 2015-10-08 2:00 p.m. c 2015 Sudarshan S. Chawathe Name: 1. (1 pt.) Read all material carefully. If in doubt whether something is

More information

Höllische Programmiersprachen Hauptseminar im Wintersemester 2014/2015 Determinism and reliability in the context of parallel programming

Höllische Programmiersprachen Hauptseminar im Wintersemester 2014/2015 Determinism and reliability in the context of parallel programming Höllische Programmiersprachen Hauptseminar im Wintersemester 2014/2015 Determinism and reliability in the context of parallel programming Raphael Arias Technische Universität München 19.1.2015 Abstract

More information

March 19, Heuristics for Optimization. Outline. Problem formulation. Genetic algorithms

March 19, Heuristics for Optimization. Outline. Problem formulation. Genetic algorithms Olga Galinina olga.galinina@tut.fi ELT-53656 Network Analysis and Dimensioning II Department of Electronics and Communications Engineering Tampere University of Technology, Tampere, Finland March 19, 2014

More information

Rollout Algorithms for Stochastic Scheduling Problems

Rollout Algorithms for Stochastic Scheduling Problems Journal of Heuristics, 5, 89 108 (1999) c 1999 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands. Rollout Algorithms for Stochastic Scheduling Problems DIMITRI P. BERTSEKAS* Department

More information

Lecture: Simulation. of Manufacturing Systems. Sivakumar AI. Simulation. SMA6304 M2 ---Factory Planning and scheduling. Simulation - A Predictive Tool

Lecture: Simulation. of Manufacturing Systems. Sivakumar AI. Simulation. SMA6304 M2 ---Factory Planning and scheduling. Simulation - A Predictive Tool SMA6304 M2 ---Factory Planning and scheduling Lecture Discrete Event of Manufacturing Systems Simulation Sivakumar AI Lecture: 12 copyright 2002 Sivakumar 1 Simulation Simulation - A Predictive Tool Next

More information

You ve already read basics of simulation now I will be taking up method of simulation, that is Random Number Generation

You ve already read basics of simulation now I will be taking up method of simulation, that is Random Number Generation Unit 5 SIMULATION THEORY Lesson 39 Learning objective: To learn random number generation. Methods of simulation. Monte Carlo method of simulation You ve already read basics of simulation now I will be

More information

Computer Experiments. Designs

Computer Experiments. Designs Computer Experiments Designs Differences between physical and computer Recall experiments 1. The code is deterministic. There is no random error (measurement error). As a result, no replication is needed.

More information

Curriculum Connections (Fractions): K-8 found at under Planning Supports

Curriculum Connections (Fractions): K-8 found at   under Planning Supports Curriculum Connections (Fractions): K-8 found at http://www.edugains.ca/newsite/digitalpapers/fractions/resources.html under Planning Supports Kindergarten Grade 1 Grade 2 Grade 3 Grade 4 Grade 5 Grade

More information

Sorting. There exist sorting algorithms which have shown to be more efficient in practice.

Sorting. There exist sorting algorithms which have shown to be more efficient in practice. Sorting Next to storing and retrieving data, sorting of data is one of the more common algorithmic tasks, with many different ways to perform it. Whenever we perform a web search and/or view statistics

More information

Trees. 3. (Minimally Connected) G is connected and deleting any of its edges gives rise to a disconnected graph.

Trees. 3. (Minimally Connected) G is connected and deleting any of its edges gives rise to a disconnected graph. Trees 1 Introduction Trees are very special kind of (undirected) graphs. Formally speaking, a tree is a connected graph that is acyclic. 1 This definition has some drawbacks: given a graph it is not trivial

More information

Scientific Computing with Case Studies SIAM Press, Lecture Notes for Unit IV Monte Carlo

Scientific Computing with Case Studies SIAM Press, Lecture Notes for Unit IV Monte Carlo Scientific Computing with Case Studies SIAM Press, 2009 http://www.cs.umd.edu/users/oleary/sccswebpage Lecture Notes for Unit IV Monte Carlo Computations Dianne P. O Leary c 2008 1 What is a Monte-Carlo

More information

CSCI2100B Data Structures Trees

CSCI2100B Data Structures Trees CSCI2100B Data Structures Trees Irwin King king@cse.cuhk.edu.hk http://www.cse.cuhk.edu.hk/~king Department of Computer Science & Engineering The Chinese University of Hong Kong Introduction General Tree

More information

Chapter 3: The Efficiency of Algorithms Invitation to Computer Science,

Chapter 3: The Efficiency of Algorithms Invitation to Computer Science, Chapter 3: The Efficiency of Algorithms Invitation to Computer Science, Objectives In this chapter, you will learn about Attributes of algorithms Measuring efficiency Analysis of algorithms When things

More information

Algorithmica Springer-Verlag New York Inc.

Algorithmica Springer-Verlag New York Inc. Algorithmica (1987) 2:91-95 Algorithmica 9 1987 Springer-Verlag New York Inc. The Set LCS Problem D. S. Hirschberg 1 and L. L. Larmore ~ Abstract. An efficient algorithm is presented that solves a generalization

More information

Chung-Feller Theorems

Chung-Feller Theorems Chung-Feller Theorems Ira M. Gessel Department of Mathematics Brandeis University KMS-AMS Meeting, Seoul December 18, 2009 In their 1949 paper On fluctuations in coin-tossing", Chung and Feller proved

More information

Sensor Scheduling and Energy Allocation For Lifetime Maximization in User-Centric Visual Sensor Networks

Sensor Scheduling and Energy Allocation For Lifetime Maximization in User-Centric Visual Sensor Networks 1 Sensor Scheduling and Energy Allocation For Lifetime Maximization in User-Centric Visual Sensor Networks Chao Yu IP Lab, University of Rochester December 4, 2008 2 1 Introduction User-Centric VSN Camera

More information

Scan and its Uses. 1 Scan. 1.1 Contraction CSE341T/CSE549T 09/17/2014. Lecture 8

Scan and its Uses. 1 Scan. 1.1 Contraction CSE341T/CSE549T 09/17/2014. Lecture 8 CSE341T/CSE549T 09/17/2014 Lecture 8 Scan and its Uses 1 Scan Today, we start by learning a very useful primitive. First, lets start by thinking about what other primitives we have learned so far? The

More information

Preliminaries: Size Measures and Shape Coordinates

Preliminaries: Size Measures and Shape Coordinates 2 Preliminaries: Size Measures and Shape Coordinates 2.1 Configuration Space Definition 2.1 The configuration is the set of landmarks on a particular object. The configuration matrix X is the k m matrix

More information

Sorting & Searching. Hours: 10. Marks: 16

Sorting & Searching. Hours: 10. Marks: 16 Sorting & Searching CONTENTS 2.1 Sorting Techniques 1. Introduction 2. Selection sort 3. Insertion sort 4. Bubble sort 5. Merge sort 6. Radix sort ( Only algorithm ) 7. Shell sort ( Only algorithm ) 8.

More information

Outline. Computer Science 331. Three Classical Algorithms. The Sorting Problem. Classical Sorting Algorithms. Mike Jacobson. Description Analysis

Outline. Computer Science 331. Three Classical Algorithms. The Sorting Problem. Classical Sorting Algorithms. Mike Jacobson. Description Analysis Outline Computer Science 331 Classical Sorting Algorithms Mike Jacobson Department of Computer Science University of Calgary Lecture #22 1 Introduction 2 3 4 5 Comparisons Mike Jacobson (University of

More information

Notes on CSP. Will Guaraldi, et al. version 1.7 4/18/2007

Notes on CSP. Will Guaraldi, et al. version 1.7 4/18/2007 Notes on CSP Will Guaraldi, et al version 1.7 4/18/2007 Abstract Original abstract This document is a survey of the fundamentals of what we ve covered in the course up to this point. The information in

More information