Exact Calculation of Beta Inequalities

Similar documents
TTEDesigner User s Manual

Bayes Factor Single Arm Binary User s Guide (Version 1.0)

On the computation of the n'th decimal digit of various transcendental numbers. Abstract

Supplemental Material for Spatially-Balanced Designs that Incorporate Legacy Sites

If your problem requires a series of related binomial coefficients, a good idea is to use recurrence relations, for example. = n k

Section 1.8. Simplifying Expressions

Numerical Aspects of Special Functions

Big Mathematical Ideas and Understandings

Mathematics Scope & Sequence Grade 8 Revised: June 2015

A.1 Numbers, Sets and Arithmetic

COMMUNITY UNIT SCHOOL DISTRICT 200

An Interesting Way to Combine Numbers

Mathematics 12 Pre-Calculus WNCP

Non-Holonomic Sequences and Functions

Three Different Algorithms for Generating Uniformly Distributed Random Points on the N-Sphere

Mathematics TEKS Grade Level Changes/Additions Summary

General Terms: Algorithms, exponential integral function, multiple precision

Edge-exchangeable graphs and sparsity

A Multiple -Precision Division Algorithm. By David M. Smith

MATH 115: Review for Chapter 1

A Multiple-Precision Division Algorithm

Design of the Power Standard Document

Integrated Algebra 2 and Trigonometry. Quarter 1

GREENWOOD PUBLIC SCHOOL DISTRICT Algebra III Pacing Guide FIRST NINE WEEKS

1. Answer: x or x. Explanation Set up the two equations, then solve each equation. x. Check

Unit 2: Accentuate the Negative Name:

EE292: Fundamentals of ECE

Modeling with Uncertainty Interval Computations Using Fuzzy Sets

VARIANCE REDUCTION TECHNIQUES IN MONTE CARLO SIMULATIONS K. Ming Leung

NAME DATE PER. GEOMETRY FALL SEMESTER REVIEW FIRST SIX WEEKS PART 1. A REVIEW OF ALGEBRA Find the correct answer for each of the following.

Integrated Mathematics I Performance Level Descriptors

Symbolic Evaluation of Sums for Parallelising Compilers

CGF Lecture 2 Numbers

Odd-Numbered Answers to Exercise Set 1.1: Numbers

Math 7 Glossary Terms

Calculus I Review Handout 1.3 Introduction to Calculus - Limits. by Kevin M. Chevalier

BMO Round 1 Problem 6 Solutions

+ bx + c = 0, you can solve for x by using The Quadratic Formula. x

Extension of the Josephus Problem with Varying Elimination Steps

Objectives and Homework List

An Improved Upper Bound for the Sum-free Subset Constant

Power Series and Summation

General properties of staircase and convex dual feasible functions

Bawar Abid Abdalla. Assistant Lecturer Software Engineering Department Koya University

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Twelve Simple Algorithms to Compute Fibonacci Numbers

SCHOOL OF ENGINEERING & BUILT ENVIRONMENT. Mathematics. Numbers & Number Systems

Chapter 1: Number and Operations

Extremal Graph Theory: Turán s Theorem

Themes in the Texas CCRS - Mathematics

A New Algorithm for Multiple Key Interpolation Search in Uniform List of Numbers

The absolute value of the real number y xis smaller than y x is smaller than or equal to y x is not equal to y

On Asymptotic Cost of Triangle Listing in Random Graphs

SYNERGY INSTITUTE OF ENGINEERING & TECHNOLOGY,DHENKANAL LECTURE NOTES ON DIGITAL ELECTRONICS CIRCUIT(SUBJECT CODE:PCEC4202)

[Y2] Counting and understanding number. [Y2] Counting and understanding number. [Y2] Counting and understanding number

d(γ(a i 1 ), γ(a i )) i=1

Cambridge International Examinations Cambridge Ordinary Level

Decimal Binary Conversion Decimal Binary Place Value = 13 (Base 10) becomes = 1101 (Base 2).

8 th Grade Pre Algebra Pacing Guide 1 st Nine Weeks

IT 201 Digital System Design Module II Notes

Mathematics Background

TABLE OF CONTENTS CHAPTER 1 LIMIT AND CONTINUITY... 26

Using Cantor s Diagonal Method to Show ζ(2) is Irrational

Lecture (04) Boolean Algebra and Logic Gates

Lecture (04) Boolean Algebra and Logic Gates By: Dr. Ahmed ElShafee

THE SEARCH FOR SYMMETRIC VENN DIAGRAMS

THE UNIFORMIZATION THEOREM AND UNIVERSAL COVERS

Limits. f(x) and lim. g(x) g(x)

Crown-free highly arc-transitive digraphs

1a. Define, classify, and order rational and irrational numbers and their subsets. (DOK 1)

Seventh Grade Mathematics Content Standards and Objectives

Test Bank Ver. 5.0: Data Abstraction and Problem Solving with C++: Walls and Mirrors, 5 th edition, Frank M. Carrano

Portraits of Groups on Bordered Surfaces

You used set notation to denote elements, subsets, and complements. (Lesson 0-1)

Unit 7 Number System and Bases. 7.1 Number System. 7.2 Binary Numbers. 7.3 Adding and Subtracting Binary Numbers. 7.4 Multiplying Binary Numbers

Trig Functions, Equations & Identities May a. [2 marks] Let. For what values of x does Markscheme (M1)

Carnegie Learning Math Series Course 2, A Florida Standards Program

An Approach to Identify the Number of Clusters

Systems of Inequalities and Linear Programming 5.7 Properties of Matrices 5.8 Matrix Inverses

Computer Algebra Algorithms for Orthogonal Polynomials and Special Functions

Montgomery County Community College MAT 170 College Algebra and Trigonometry 4-4-0

Catholic Central High School

arxiv: v1 [math.co] 25 Sep 2015

correlated to the Michigan High School Mathematics Content Expectations

Houghton Mifflin MATHSTEPS Level 7 correlated to Chicago Academic Standards and Framework Grade 7

CSI33 Data Structures

Central Valley School District Math Curriculum Map Grade 8. August - September

Maths Assessment Framework Year 10 Foundation

Infinite Geometry supports the teaching of the Common Core State Standards listed below.

A proposal to add special mathematical functions according to the ISO/IEC :2009 standard

Introduction to Complex Analysis

Math for Geometric Optics

Voluntary State Curriculum Algebra II

Prentice Hall Mathematics: Course Correlated to: Colorado Model Content Standards and Grade Level Expectations (Grade 6)

Computational Methods for Finding Probabilities of Intervals

Montana City School GRADE 5

ON THE CONSTRUCTION OF ORTHOGONAL ARRAYS

4. Specifications and Additional Information

Objective 1 : The student will demonstrate an understanding of numbers, operations, and quantitative reasoning.

Grade 7 Unit 1 at a Glance: Working With Rational Numbers

Transcription:

Exact Calculation of Beta Inequalities John Cook Department of Biostatistics, Box 447 The University of Texas, M. D. Anderson Cancer Center 1515 Holcombe Blvd., Houston, Texas 77030, USA cook@mdanderson.org November 2, 2005 Abstract This paper addresses the problem of evaluating P (X > Y ) where X and Y are independent beta random variables. We cast the problem in terms of a hypergeometric function and use hypergeometric identities to calculate the probability in closed form for certain values of the distribution parameters. Keywords: beta distribution, stochastic inequalities, hypergeometric functions, closed form 1 Introduction Define g(a, b, c, d) to be the probability of a sample from a beta(a, b) random variable being larger than an independent sample from a beta(c, d) random variable. Thus for positive parameters a, b, c, and d, g(a, b, c, d) 1 0 x a 1 (1 x) b 1 I x (c, d) dx (1) B(a, b) where I x (c, d) is the incomplete beta function, the CDF of a beta(c, d) random variable. Revised February 21, 2006 1

The function g plays a central role in the calculation of randomization probabilities for many outcome-adaptive clinical trials with binary end points. Suppose the prior probabilities of response on each of two arms in a randomized trial have beta priors. Because the beta prior is conjugate for binomial distributions, the posterior distributions on the probability of response are also beta distributions. A simple adaptive randomization scheme is to assign patients to an arm with probability equal to the probability that that arm is superior. If the posterior distribution on one arm is beta(a, b) and the other is beta(c, d), then the probability of assigning the first arm is g(a, b, c, d). More generally, one often assigns the first arm with probability g(a, b, c, d) λ g(a, b, c, d) λ + g(c, d, a, b) λ for some λ > 0. Values of λ < 1 dampen the the effect of the data on the randomization probabilities, and values λ > 1 amplify the effect of the data. See [3], [4], and [5] for theoretical background. See [11] for software to simulate adaptively randomized trials. The function g is also also at the heart of the safety monitoring methods of Thall, Simon, and Estey [9]. As above, the probabilities of response on each of two arms (either two active control arms or an active arm and a historical standard) are distributed as beta random variables. Accrual to an arm stops if the probability of that arm being superior falls below some threshold. That is, if g(a, b, c, d) is too small, the first arm is closed. See [8] for software implementing this method. In general, the function g cannot be evaluated in closed form and must be approximated numerically, as for example in [10]. In [6], we gave general methods for computing g for several distribution families, including beta. Here we focus on special cases that can be evaluated in terms of gamma functions. These special cases may be directly useful in applications. Also, the special cases given here provide test cases for numerical software used to evaluate g for general arguments. The symmetries g(a, b, c, d) g(d, c, b, a) g(d, b, c, a) 1 g(c, d, a, b) are developed in [6]. For the 24 permutations of the 4 arguments, there are at most six different values of g. These are g(a, b, c, d), g(a, b, d, c), g(a, c, d, b) and their complementary probabilities. 2

If we define h(a, b, c, d) B(a + c, b + d) B(a, b)b(c, d) (2) Γ(a + c)γ(b + d)γ(a + b)γ(c + d) Γ(a)Γ(b)Γ(c)Γ(d)Γ(a + b + c + d). (3) then [6] shows that the following recurrence relations hold: g(a + 1, b, c, d) g(a, b, c, d) + h(a, b, c, d)/a g(a, b + 1, c, d) g(a, b, c, d) h(a, b, c, d)/b g(a, b, c + 1, d) g(a, b, c, d) h(a, b, c, d)/c g(a, b, c, d + 1) g(a, b, c, d) + h(a, b, c, d)/d 2 Case: one integer argument First we consider the special case d 1. We have g(a, b, c, 1) 1 0 1 x a 1 (1 x) b 1 B(a, b) x x a+c 1 (1 x) b 1 0 B(a, b) B(a + c, b) B(a, b) Γ(a + b) Γ(a + c) Γ(a + b + c) Γ(a). 0 ct c 1 dt dx Using the recurrence relationship for the fourth argument, we can reduce the case of d being any positive integer to the case d 1. Using symmetries of g we can permute any parameter into the third argument, and so we can compute g(a, b, c, d) exactly if any one of the arguments is an integer. 3

3 Derivation of Hypergeometric Form In [6] we showed that g(a, b, c, d) k0 h(a, b, c + k, d). c + k If we define t k then a little algebra shows that h(a, b, c + k, d) c + k t k+1 t k (a + c + k)(c + d + k) (c + k + 1)(a + b + c + d + k). Since t k+1 /t k is a rational function of k, it follows that the infinite sum for g is a constant times a hypergeometric function. (See [2] or [7] for background on hypergeometric functions.) Furthermore, one can read the hypergeometric parameters from the factorization of t k+1 /t k. It follows that g(a, b, c, d) h(a, b, c, d) c 3F 2 ( a + c, c + d, 1 c + 1, a + b + c + d ) 1 (4) By representing our function g in terms of a hypergeometric function, we open the possibility of taking advantage of cataloged identities known for such functions. See [12] for thousands of hypergeometric identities. While the original definition of g given in (1) requires all parameters to be positive, equation (4) does not. The function h requires that none the beta function arguments in its definition are non-positive integers. The hypergeometric function 3 F 2 requires the real part of s l s u be positive in order for the series to converge, where s l is the sum of the lower parameters and s u is the sum of the upper parameters. In our case s l s u is simply b. By using the parameter symmetries, we can make any parameter the b parameter. Even though only positive parameters in g corresponds inequality probabilities, it will be useful below to use negative arguments along the way to compute g for certain positive arguments. 4

4 Case a + b + c + d 1 In this section, we consider the case a + b + c + d 1. Then the 3 F 2 function has 1 as both an upper and lower parameter and reduces to the simpler hypergeometric function 2 F 1 and so g(a, b, c, d) h(a, b, c, d) c 2F 1 ( a + c, c + d c + 1 ) 1. A theorem of Gauss ([1], equation 15.1.20) allows us to evaluate the 2 F 1 term in closed form: ( ) a + c, c + d 2F 1 c + 1 1 Γ(c + 1) Γ(1 a c d). Γ(1 a) Γ(1 d) Using the reflection formula ([1], equation 6.1.17) we find that for a + b + c + d 1, g(a, b, c, d) Γ(z)Γ(1 z) π csc(πz) sin(πa) sin(πd) sin(π(a + b)) sin(π(b + d)). We can repeatedly apply the recurrence relations given above to increase the parameters by any integer amount. Therefore if we can compute g(a, b, c, d), we can compute g(a+i, b+j, c+k, d+l) for any positive integers i, j, k, and l. Furthermore, we can often compute g(a, b, c, d) when a + b + c + d is any integer. Consider, for example, g(.7,.8,.1,.4). The parameters sum to 2 and given the definition (1) there would be nothing we could do. However, equation (4) allows negative arguments, and so we apply the recurrence relationship for the first argument and evaluate g(.7,.8,.1,.4) g(.3,.8,.1,.4) + h(.3,.8,.1,.4)/.3. This strategy will not work if h(a, b, c, d) is undefined, i.e. we cannot have a + b, c + d, a + c, or b + d equal to zero. For example, we could not use the approach in this section to evaluate g(.7,.3,.4,.6) because decreasing any parameter by 1 would cause h to be undefined. However, we give a method below that can be used for this case. 5

5 Case a + b c + d 1 Very often in practice, statisticians choose prior parameters for the beta distribution that sum to an integer. This is because the sum of the parameters can be interpreted as the number of observations contained in the prior and it seems natural to choose priors that make this value an integer. If a + b c + d 1, we have h(a, b, c, d) B(a + c, b + d) B(a, b)b(c, d) Γ(a + c)γ(b + d) Γ(a)Γ(b)Γ(c)Γ(d) Γ(a + c)γ(b + d) sin(πa) sin(πd)/π 2. Also, 3F 2 ( a + c, c + d, 1 c + 1, a + b + c + d ) 1 3 F 2 ( a + c, 1, 1 c + 1, 2 ) 1 c(ψ(c) ψ(1 a)) a + c 1 by equation 07.27.03.0053.01 of [12], provided a + c 1 and 1 a > 0. Here ψ(z) is the logarithmic of the Γ function, Γ (z)/γ(z). Therefore equation (4) can be evaluated analytically for a+b c+d 1, provided a+c 1 and a > 0. The recurrence relations can be used to parlay these results to the case of a + b and c + d being positive integers. 6 Conclusion In this note we defined the function g(a, b, c, d) and briefly outlined some of its uses in conducting clinical trials. We expressed this function as a hypergeometric function, extending the function s domain and making it possible to apply well known identities. We showed that the function can be evaluated in closed form provided one of the arguments is a positive integer, and can often be evaluated in closed form if the parameters sum to a positive integer. 6

References [1] Milton Abramowitz and Irene Stegun. Handbook of Mathematical Functions, Dover (1972) [2] George Andrews, Richard Askey, and Ranjan Roy. Special Functions, Cambridge. (2000) [3] Donald A. Berry and G. E. Eick. Adaptive assignment versus balanced randomization in clinical trials: a decision analysis. Statistics in Medicine, vol 14, 231-246 (1995). [4] Donald A. Berry. Statistical Innovations in Cancer Research. In Cancer Medicine e.6. Chapter 33. London: BC Decker. (Ed: Holland J, Frei T et al.) (2003) [5] Donald A. Berry. Bayesian statistics and the ethics of clinical trials. Statistical Science Volume 19(1):175-187. (2004) [6] John D. Cook. Numerical Computation of Stochastic Inequality Probabilities. MDACC technical report UTMDABTR-008-03, (2003), revised 2005. Available at http://www.mdanderson.org/pdf/biostats utmdabtr00803.pdf. [7] Ronald Graham, Donald Knuth, and Oren Patashnik Concrete Mathematics, Addison Wesley, 2nd edition. (1994) [8] Hoang Nguyen and John D. Cook. Multc Lean [Computer software]. (2005) Available at http://biostatistics.mdanderson.org/softwaredownload [9] Peter Thall, Richard Simon, and Elihu Estey. Bayesian sequential monitoring designs for single-arm clinical trials with multiple outcomes, Statistics in Medicine, vol 14, 357-379 (1995). [10] J. Kyle Wathen, Hoang Nguyen, and John D. Cook. Inequality Calculator [Computer software] (2005) Available at http://biostatistics.mdanderson.org/softwaredownload [11] J. Kyle Wathen and Odis Wooten. Adaptive Randomization 3.0 [Computer software]. (2005) Available at http://biostatistics.mdanderson.org/softwaredownload 7

[12] http://functions.wolfram.com/hypergeometricfunctions/ 8