Describing data with graphics and numbers

Types of data
Categorical variables: also known as class variables or nominal variables.
Quantitative variables: also known as numerical variables; either continuous or discrete.

Graphing categorical variables
Bar graphs
Example: the ten most common causes of death in Americans between 15 and 19 years old in 1999.

Graphing numerical variables
Example: heights of BIOL 3 students (cm).
[Figure: stem-and-leaf plot of the heights]
[Table: number of students in each height group]

Histogram
[Figure: histogram of height (in cm) of Bio3 students, and a second histogram with more data]

Cumulative distribution
[Figure: cumulative distribution of height (in cm) of Bio3 students, with the 50th percentile (median) and the 90th percentile marked]

Associations between two categorical variables
Association between reproductive effort and avian malaria

Table 2.3A. Contingency table showing incidence of malaria in female great tits subjected to experimental egg removal.

                Control group   Egg removal group   Row total
Malaria               7                15               22
No malaria           28                15               43
Column total         35                30               65

Mosaic plot
Figure 2.3B. Mosaic plot for reproductive effort and avian malaria in great tits (Table 2.3A). Blue fill indicates diseased birds whereas the white fill indicates birds free of malaria. n = 65 birds. [Axes: relative frequency by treatment (control, egg removal)]

Grouped bar graph
[Figure: grouped bar graph of the number of birds with malaria and with no malaria in the control and egg removal groups]
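The row totals, column totals, and the relative frequencies drawn in a mosaic plot can be recomputed from the four cell counts. The sketch below assumes the cells of Table 2.3A are 7 and 28 (control) and 15 and 15 (egg removal), which agrees with the totals 22, 43, 35, 30, and 65; the function names are illustrative only.

```python
# Cell counts assumed from Table 2.3A: control (7 malaria, 28 no malaria),
# egg removal (15 malaria, 15 no malaria).
counts = {
    "control": {"malaria": 7, "no malaria": 28},
    "egg removal": {"malaria": 15, "no malaria": 15},
}


def column_totals(table):
    """Total number of birds in each treatment group."""
    return {group: sum(cells.values()) for group, cells in table.items()}


def row_totals(table):
    """Total number of birds with each outcome, summed over groups."""
    totals = {}
    for cells in table.values():
        for outcome, count in cells.items():
            totals[outcome] = totals.get(outcome, 0) + count
    return totals


def relative_frequencies(table):
    """Within each group, the fraction of birds with each outcome.

    These fractions are the bar heights of a mosaic plot.
    """
    result = {}
    for group, cells in table.items():
        total = sum(cells.values())
        result[group] = {o: c / total for o, c in cells.items()}
    return result


col = column_totals(counts)
row = row_totals(counts)
rel = relative_frequencies(counts)

print("column totals:", col)
print("row totals:", row)
print("grand total:", sum(col.values()))
for group, freqs in rel.items():
    print(group, "-> fraction with malaria:", round(freqs["malaria"], 2))
```

Comparing the fraction with malaria between the two groups is the comparison that the mosaic plot and the grouped bar graph make visually.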

Associations between categorical and numerical variables
Multiple histograms
[Figure: multiple histograms of protein length; panels: non-conserved, conserved]

Associations between two numerical variables
Scatterplots

Evaluating graphics
Don't mislead with graphics: lie factor, chartjunk.
Better representation of truth.

Lie factor
Lie factor = (size of effect shown in graphic) / (size of effect in data)

Lie factor example
Effect in graphic: 2.33/0.08 = 29.1
Effect in data: 6748/5844 = 1.15
Lie factor = 29.1/1.15 = 25.3
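The lie factor arithmetic can be checked in a few lines. This is a minimal sketch that assumes the graphic values in the example are 2.33 and 0.08 and the data values are 6748 and 5844; `lie_factor` is a hypothetical helper name.

```python
def lie_factor(effect_in_graphic, effect_in_data):
    """Ratio of the effect size drawn in the graphic to the effect size in the data."""
    return effect_in_graphic / effect_in_data


# Values assumed from the slide example.
effect_graphic = 2.33 / 0.08   # ratio of the drawn sizes
effect_data = 6748 / 5844      # ratio of the underlying numbers

print(round(effect_graphic, 1))   # effect in graphic
print(round(effect_data, 2))      # effect in data

# With the rounded intermediate values used on the slide:
print(round(lie_factor(29.1, 1.15), 1))

# Without rounding the intermediates the result is slightly smaller:
print(round(lie_factor(effect_graphic, effect_data), 2))
```

A lie factor near 1 means the graphic shows the effect at about its true size; the example overstates it roughly 25-fold.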

Chartjunk
Needless 3D graphics

[Slide: great book on graphics]

Summary: graphical methods for frequency distributions

Type of data        Method
Categorical data    Bar graph
Numerical data      Histogram; cumulative frequency distribution

Summary: associations between variables

Explanatory variable   Response variable   Method
Categorical            Categorical         Contingency table; grouped bar graph; mosaic plot
Categorical            Numerical           Multiple histograms; cumulative frequency distributions
Numerical              Numerical           Scatter plot

Describing data
Two common descriptions of data:
Location (or central tendency)
Width (or spread)

Measures of location
Mean
Median
Mode

Mean
$\bar{Y} = \frac{1}{n} \sum_{i=1}^{n} Y_i$, where n is the size of the sample.
Example: $Y_1 = 56$, $Y_2 = 72$, $Y_3 = 18$, $Y_4 = 42$, so $\bar{Y} = (56 + 72 + 18 + 42) / 4 = 47$.

Median
The median is the middle measurement in a set of ordered data.
The data 18, 28, 24, 25, 36, 14, 34 can be put in order: 14, 18, 24, 25, 28, 34, 36. The median is 25.

[Figure: frequency distribution with the mean, median, and mode marked]

Mean vs. median in politics
2004 U.S. economy
Republicans: times are good. Mean income increasing ~4% per year.
Democrats: times are bad. Median family income fell.
Why?

[Figure: mouse weight at 5 days old, in a line selected for small size]

[Figure: cumulative distribution of height (in cm) of Bio3 students, with the mean, median, and mode marked]

Measures of width
Range
Standard deviation
Variance
Coefficient of variation
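The mean and median examples can be reproduced directly from their definitions. The sketch below assumes the example values are 56, 72, 18, 42 for the mean and 18, 28, 24, 25, 36, 14, 34 for the median; the helper functions are written out for illustration rather than taken from any library.

```python
def mean(values):
    """Sum of the measurements divided by the sample size n."""
    return sum(values) / len(values)


def median(values):
    """Middle measurement of the ordered data.

    For an even sample size, the average of the two middle measurements.
    """
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


sample_for_mean = [56, 72, 18, 42]
sample_for_median = [18, 28, 24, 25, 36, 14, 34]

print(mean(sample_for_mean))
print(sorted(sample_for_median))
print(median(sample_for_median))

# One large value moves the mean but leaves the median unchanged,
# which is why the two can tell different stories about income.
with_outlier = sample_for_median[:-1] + [340]
print(mean(sample_for_median), mean(with_outlier))
print(median(sample_for_median), median(with_outlier))
```

The last two lines illustrate the politics example: a few very large incomes can raise the mean while the median stays flat or falls.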

Range
For an ordered data set whose smallest value is 14 and whose largest value is 36, the range is 36 - 14 = 22.

Variance
Population variance: $\mathrm{Var}[Y] = \frac{1}{N} \sum_{i=1}^{N} (Y_i - \mu)^2$

Sample variance: $s^2 = \frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}$, where n is the sample size.

Shortcut for calculating the sample variance: $s^2 = \frac{n}{n - 1} \left( \frac{\sum_{i=1}^{n} Y_i^2}{n} - \bar{Y}^2 \right)$

Standard deviation (SD)
Positive square root of the variance.
$\sigma$ is the true standard deviation.
$s$ is the sample standard deviation.

In-class exercise
Calculate the variance and standard deviation of a sample with the following data: 6, 1, 2.
Answer: Variance = 7. Standard deviation = $\sqrt{7}$.

Coefficient of variation (CV)
$\mathrm{CV} = s / \bar{Y}$ (often reported as a percentage)
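The in-class exercise can be worked through in code using both the definition of the sample variance and the shortcut formula. This is a sketch assuming the exercise data are 6, 1, 2; the function names are illustrative.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    y_bar = sum(values) / n
    return sum((y - y_bar) ** 2 for y in values) / (n - 1)


def sample_variance_shortcut(values):
    """Shortcut: n / (n - 1) * (mean of the squares - square of the mean)."""
    n = len(values)
    y_bar = sum(values) / n
    mean_of_squares = sum(y ** 2 for y in values) / n
    return n / (n - 1) * (mean_of_squares - y_bar ** 2)


data = [6, 1, 2]
y_bar = sum(data) / len(data)           # sample mean
s2 = sample_variance(data)              # sample variance
s = math.sqrt(s2)                       # sample standard deviation
cv = s / y_bar                          # coefficient of variation

print("mean:", y_bar)
print("variance:", s2)
print("variance (shortcut):", round(sample_variance_shortcut(data), 10))
print("standard deviation:", round(s, 3))
print("CV:", round(cv, 3), "=", round(100 * cv, 1), "%")
```

The shortcut gives the same answer up to floating-point rounding; with real data the definitional form is usually the safer one numerically.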

Equal means, different variances
[Figure: distributions of Value with equal means and different variances V]

Manipulating means
The mean of the sum of two variables: E[X + Y] = E[X] + E[Y]
The mean of the sum of a variable and a constant: E[X + c] = E[X] + c
The mean of a product of a variable and a constant: E[c X] = c E[X]
The mean of a product of two variables: E[X Y] = E[X] E[Y] if X and Y are independent (more generally, whenever they are uncorrelated).

Manipulating variance
The variance of the sum of two variables: Var[X + Y] = Var[X] + Var[Y] if X and Y are independent (more generally, whenever they are uncorrelated).
The variance of the sum of a variable and a constant: Var[X + c] = Var[X]
The variance of a product of a variable and a constant: Var[c X] = c^2 Var[X]

Parents' heights
Mean father height: 174.3. Mean mother height: 160.4. Mean of father height + mother height: 334.7 (= 174.3 + 160.4).
[Table: variance of father height, mother height, and father height + mother height]
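The rules for manipulating means and variances can be checked numerically on a small made-up data set. The sketch below uses population-style moments (dividing by N) on two short lists chosen only for illustration. It also uses the general identity Var[X + Y] = Var[X] + Var[Y] + 2 Cov[X, Y], which is not stated on the slides but shows why the simple sum rule needs X and Y to be uncorrelated.

```python
import math


def mean(values):
    return sum(values) / len(values)


def var(values):
    """Population variance: mean squared deviation from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def cov(xs, ys):
    """Population covariance of paired values."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


# Illustrative paired data (hypothetical, deliberately correlated).
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0]
c = 5.0

sum_xy = [a + b for a, b in zip(x, y)]
shifted = [a + c for a in x]
scaled = [c * a for a in x]

checks = {
    "E[X + Y] = E[X] + E[Y]": math.isclose(mean(sum_xy), mean(x) + mean(y)),
    "E[X + c] = E[X] + c": math.isclose(mean(shifted), mean(x) + c),
    "E[cX] = c E[X]": math.isclose(mean(scaled), c * mean(x)),
    "Var[X + c] = Var[X]": math.isclose(var(shifted), var(x)),
    "Var[cX] = c^2 Var[X]": math.isclose(var(scaled), c ** 2 * var(x)),
    "Var[X + Y] = Var[X] + Var[Y] + 2 Cov[X, Y]": math.isclose(
        var(sum_xy), var(x) + var(y) + 2 * cov(x, y)
    ),
}

for rule, holds in checks.items():
    print(rule, "->", holds)

# Because x and y are positively correlated here, the variance of the sum
# exceeds the sum of the variances.
print(var(sum_xy), "vs", var(x) + var(y))
```

The same reasoning applies to the parents' heights: if the variance of the summed heights differs from the sum of the two variances, the heights of fathers and mothers are correlated.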