Lecture 14. December 19, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University.

Similar documents
MATHEMATICS Key Stage 2 Year 6

K-Means and Gaussian Mixture Models

Lecture 12. August 23, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University.

Unit 8 SUPPLEMENT Normal, T, Chi Square, F, and Sums of Normals

Sequences and Series. Copyright Cengage Learning. All rights reserved.

The first few questions on this worksheet will deal with measures of central tendency. These data types tell us where the center of the data set lies.

CW High School. Algebra I A

Today. Lecture 4: Last time. The EM algorithm. We examine clustering in a little more detail; we went over it a somewhat quickly last time

Introduction to the Practice of Statistics Fifth Edition Moore, McCabe

Unit 5: Estimating with Confidence

Name: Date: Period: Chapter 2. Section 1: Describing Location in a Distribution

Number and Place Value

Chapter 2 Modeling Distributions of Data

A. Incorrect! This would be the negative of the range. B. Correct! The range is the maximum data value minus the minimum data value.

YEAR 7 MATHS SCHEMES OF WORK

Geometry Unit 5 Geometric and Algebraic Connections. Table of Contents

Year 8 Review 1, Set 1 Number confidence (Four operations, place value, common indices and estimation)

To be a grade 1 I need to

The method of rationalizing

9-1 GCSE Maths. GCSE Mathematics has a Foundation tier (Grades 1 5) and a Higher tier (Grades 4 9).

Secure understanding of multiplication of whole numbers by 10, 100 or 1000.

Image restoration. Restoration: Enhancement:

Stat 528 (Autumn 2008) Density Curves and the Normal Distribution. Measures of center and spread. Features of the normal distribution

Lecture 6: Chapter 6 Summary

Inference and Representation

9 abcd = dcba b + 90c = c + 10b b = 10c.

Year 6 Term 1 and

Modular Arithmetic. Marizza Bailey. December 14, 2015

Number and Place Value

INFO Wednesday, September 14, 2016

Chapters 5-6: Statistical Inference Methods

Sections 4.3 and 4.4

MATH 1A MIDTERM 1 (8 AM VERSION) SOLUTION. (Last edited October 18, 2013 at 5:06pm.) lim

Mathematics Year 9-11 Skills and Knowledge Checklist. Name: Class: Set : Premier Date Year 9 MEG :

MATHS. years 4,5,6. malmesbury c of e primary school NAME CLASS

Econ 3790: Business and Economics Statistics. Instructor: Yogesh Uppal

CURRICULUM UNIT MAP 1 ST QUARTER

CSC236 Week 5. Larry Zhang

Unit 7 Number System and Bases. 7.1 Number System. 7.2 Binary Numbers. 7.3 Adding and Subtracting Binary Numbers. 7.4 Multiplying Binary Numbers

STA 570 Spring Lecture 5 Tuesday, Feb 1

Math Interim Mini-Tests. 3rd Grade Mini-Tests

Part I, Chapters 4 & 5. Data Tables and Data Analysis Statistics and Figures

Simpson Elementary School Curriculum Prioritization and Mapping 3rd Grade Math

1.1 calculator viewing window find roots in your calculator 1.2 functions find domain and range (from a graph) may need to review interval notation

Number Sense and Operations Curriculum Framework Learning Standard

Density Curves Sections

Mathematics Year 9-11 Skills and Knowledge Checklist. Name: Class: Set : 1 Date Year 9 MEG :

LEADERS. Long and Medium Term Planning

Visual Basic for Applications

Year 6 programme of study

I. Recursive Descriptions A phrase like to get the next term you add 2, which tells how to obtain

11-2 Probability Distributions

Year 9 Calculator allowed Prepare for NAPLAN with Mangahigh

CAMI Education links: Maths NQF Level 2

Essential Elements Pacing Guide - Mathematics

(Refer Slide Time: 02:59)

Number- Algebra. Problem solving Statistics Investigations

Digital Image Processing Laboratory: Markov Random Fields and MAP Image Segmentation

Read, write compare and order numbers beyond 1000 in numerals and words Read Roman numerals to 100 and understand how they have changed through time

Recap: Gaussian (or Normal) Distribution. Recap: Minimizing the Expected Loss. Topics of This Lecture. Recap: Maximum Likelihood Approach

YEAR 10- Mathematics Term 1 plan

STAT 113: Lab 9. Colin Reimer Dawson. Last revised November 10, 2015

Modular Arithmetic. is just the set of remainders we can get when we divide integers by n

Linear, Quadratic, Exponential, and Absolute Value Functions

ISyE 6416: Computational Statistics Spring Lecture 13: Monte Carlo Methods

round decimals to the nearest decimal place and order negative numbers in context

Information for Parents/Carers. Mathematics Targets - A Year 1 Mathematician

GCSE Higher Revision List

Benjamin Adlard School 2015/16 Maths medium term plan: Autumn term Year 6

Voluntary State Curriculum Algebra II

9 abcd = dcba b + 90c = c + 10b b = 10c.

Chapter 3. Bootstrap. 3.1 Introduction. 3.2 The general idea

Prime Time (Factors and Multiples)

Year 6.1- Number and Place Value 2 weeks- Autumn 1 Read, write, order and compare numbers up to and determine the value of each digit.

Albertson AP Calculus AB AP CALCULUS AB SUMMER PACKET DUE DATE: The beginning of class on the last class day of the first week of school.

SCHEME OF WORK Yr 7 DELTA 1 UNIT / LESSON

Go Math 3 rd Grade Meramec Valley R-3 School District. Yearly Plan. 1 Updated 6/24/16

MATH: HIGH SCHOOL M.EE.A-CED.1

GREENWOOD PUBLIC SCHOOL DISTRICT Algebra III Pacing Guide FIRST NINE WEEKS

The method of rationalizing

Parkfield Community School

Chapter 2: The Normal Distribution

Acquisition Description Exploration Examination Understanding what data is collected. Characterizing properties of data.

Optimal designs for comparing curves

Grade 4 CCSS Pacing Guide: Math Expressions

Year 1 I Can Statements Can you?

DRAFT EAST POINSETT CO. SCHOOL DIST. - GRADE 3 MATH

Students will understand 1. that numerical expressions can be written and evaluated using whole number exponents

Recursive Estimation

Math 1314 Test 3 Review Material covered is from Lessons 9 15

Chapter 2: The Normal Distributions

Stage 7 Checklists Have you reached this Standard?

Minnesota Academic Standards for Mathematics 2007

Third Grade Math: I Can Statements

Network School KS3 Long Term Curriculum Plan

KS3 Mathematics Long Term Curriculum Plan

Year 4 Step 1 Step 2 Step 3 End of Year Expectations Using and Applying I can solve number and practical problems using all of my number skills.

Math 14 Lecture Notes Ch. 6.1

3.NBT.1 Use place value understanding to round whole numbers to the nearest 10 or 100.

Crockerne Church of England Primary Non-Negotiables. Mathematics

Transcription:

geometric Lecture 14 Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University December 19, 2007

geometric 1 2 3 4 geometric 5 6 7

geometric 1 Review about logs 2 Introduce the geometric 3 Interpretations of the geometric 4 Confidence intervals for the geometric 5 Log-normal 6 Log-normal based intervals

geometric Recall that log B (x) is the number y so that B y = x Note that you can not take the log of a negative number; log B (1) is always 0 and log B (0) is When the base is B = e we write log e as just log or ln Other useful bases are 10 (orders of magnitude) or 2 Recall that log(ab) = log(a) + log(b), log(a b ) = b log(a), log(a/b) = log(a) log(b) (log turns multiplication into addition, division into subtraction, powers into multiplication)

Some reasons for logging data geometric To correct for right skewness When considering ratios In settings where errors are feasibly multiplicative, such as when dealing with concentrations or rates To consider orders of magnitude (using log base 10); for example when considering astronomical distances Counts are often logged (though note the problem with zero counts)

geometric geometric (sample) geometric of a data set X 1,..., X n is ( n ) 1/n X i i=1 Note that (provided that the X i are positive) the log of the geometric is 1 n n log(x i ) i=1 As the log of the geometric is an average, the LLN and clt apply (under what assumptions?) geometric is always less than or equal to the sample (arithmetic)

geometric geometric geometric is often used when the X i are all multiplicative Suppose that in a population of interest, the prevalence of a disease rose 2% one year, then fell 1% the next, then rose 2%, then rose 1%; since these factors act multiplicatively it makes sense to consider the geometric (1.02.99 1.02 1.01) 1/4 = 1.01 for a 1% geometric increase in disease prevalence

geometric Notice that multiplying the initial prevalence by 1.01 4 is the same as multiplying by the original four numbers in sequence Hence 1.01 is constant factor by which you would need to multiply the initial prevalence each year to achieve the same overall increase in prevalence over a four year period arithmetic, in contrast, is the constant factor by which your would need to add each year to achieve the same total increase (1.02 +.99 + 1.02 + 1.01) In this case the product and hence the geometric make more sense than the arithmetic

Nifty fact geometric question corner (google) at the University of Toronto s web site (where I got much of this) has a fun interpretation of the geometric If a and b are the lengths of the sides of a rectangle then arithmetic (a + b)/2 is the length of the sides of the square that has the same perimeter geometric (ab) 1/2 is the length of the sides of the square that has the same area So if you re interested in perimeters (adding) use the arithmetic ; if you re interested in areas (multiplying) use the geometric

geometric Asymptotics Note, by the LLN the log of the geometric converges to µ = E[log(X )] refore the geometric converges to exp{e[log(x )]} = e µ, which is not the population on the natural scale; we call this the population geometric (but no one else seems to) To reiterate exp{e[log(x)]} E[exp{log(X )}] = E[X ] Note if the of log(x ) is symmetric then.5 = P(log X µ) = P(X e µ ) refore, for log-symmetric s the geometric is estimating the median

Using the geometric If you use the to create a confidence interval for the log measurements, your interval is estimating µ, the expected value of the log measurements If you exponentiate the endpoints of the interval, you are estimating e µ, the population geometric Recall, e µ is the population median when the of the logged data is symmetric This is especially useful for paired data when their ratio, rather than their difference, is of interest

Example geometric

geometric Consider when you have two independent groups, logging the individual data points and creating a confidence interval for the difference in the log s Prove to yourself that exponentiating the endpoints of this interval is then an interval for the ratio of the population geometric s, eµ 1 e µ 2

geometric A random variable is ly distributed if its log is a normally distributed random variable I am s take logs of me and then I ll then be normal Note random variables are not logs of normal random variables!!!!!! (You can t even take the log of a normal random variable) Formally, X is lognormal(µ, σ 2 ) if log(x ) N(µ, σ 2 ) If Y N(µ, σ 2 ) then X = e Y is

geometric density is 1 exp[ {log(x) µ}2 /(2σ 2 )] 2π x for 0 x Its is e µ+(σ2 /2) and variance is e 2µ+σ2 (e σ2 1) Its median is e µ

geometric Notice that if we assume that X 1,..., X n are (µ, σ 2 ) then Y 1 = log X 1,..., Y n = log X n are normally distributed with µ and variance σ 2 Creating a Gosset s t confidence interval on using the Y i is a confidence interval for µ the log of the median of the X i Exponentiate the endpoints of the interval to obtain a confidence interval for e µ, the median on the original scale Assuming ity, exponentiating t confidence intervals for the difference in two log s again estimates ratios of geometric s

Example geometric