Statistical matching: conditional independence assumption and auxiliary information

1 Statistical matching: conditional independence assumption and auxiliary information
Training Course: Record Linkage and Statistical Matching
Mauro Scanu, Istat, scanu [at] istat.it

2 Outline
The conditional independence model (CIA)
Parametric macro methods: the normal case; maximum likelihood
Parametric micro methods: an overview
Nonparametric macro methods: an overview
Nonparametric micro methods: random hot deck; conditional random hot deck; rank hot deck; distance hot deck; constrained hot deck
Auxiliary information

3 Statistical matching
Let us assume that data are collected in two sample surveys, say A and B, of sizes $n_A$ and $n_B$, from the same population.
Some variables X are observed in both samples.
Variables Y are observed only in survey A.
Variables Z are observed only in survey B.
The goal is inference on (X, Y, Z), or at least on the bivariate (Y, Z).

4 Statistical matching
Goal: estimation of the parameters describing (Y, Z) or (X, Y, Z).

5 A first identifiable model
Let us restrict the class of models F for (X, Y, Z) to the following set:
$$f(x, y, z) = f_{Y|X}(y \mid x)\, f_{Z|X}(z \mid x)\, f_X(x)$$
where $f_{Y|X}$ is the conditional density of Y given X, $f_{Z|X}$ is the conditional density of Z given X, and $f_X$ is the marginal density of X.
Consequence 1: this class of distributions for (X, Y, Z) corresponds to the conditional independence of Y and Z given X (the conditional independence assumption, CIA).
Consequence 2: this model is identifiable for A ∪ B.
Note: this is not the only identifiable model!
Help: this model can be useful in many different cases (use of proxy variables and uncertainty)!

6 The different matching contexts
Matching contexts are classified by output (macro or micro) and by approach (parametric or nonparametric).
Let's tackle this problem in a context familiar from inferential statistics: data are drawn according to a probability law that follows a parametric model, and the objective is macro. In the following we will mainly consider two distributions: the normal and the multinomial.

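7 Parametric macro methods
In a parametric model, each probability law that can generate our sample data is described by a finite number of parameters. Under the CIA, given the sample A ∪ B, the likelihood function becomes:
$$L(\theta \mid A \cup B) = \prod_{a=1}^{n_A} f_{Y|X}(y_a \mid x_a; \theta_{Y|X}) \prod_{b=1}^{n_B} f_{Z|X}(z_b \mid x_b; \theta_{Z|X}) \prod_{a=1}^{n_A} f_X(x_a; \theta_X) \prod_{b=1}^{n_B} f_X(x_b; \theta_X)$$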

8 Parametric macro methods
Parameter estimation becomes straightforward:
Use sample A ∪ B for estimating $\theta_X$.
Use A for estimating $\theta_{Y|X}$.
Use B for estimating $\theta_{Z|X}$.

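9 Parametric macro methods: the normal case
Let (X, Y, Z) be a trivariate normal r.v. with parameters given by the mean vector $\mu = (\mu_X, \mu_Y, \mu_Z)$ and the covariance matrix $\Sigma$ with elements $\sigma_X^2, \sigma_Y^2, \sigma_Z^2, \sigma_{XY}, \sigma_{XZ}, \sigma_{YZ}$.
Under the CIA, the parameter $\sigma_{YZ}$ is superfluous, because it is determined by the others: $\sigma_{YZ} = \sigma_{XY}\,\sigma_{XZ} / \sigma_X^2$.
For the statistical matching problem, it is convenient to consider the equivalent distribution defined by the parameterization $\theta_X$, $\theta_{Y|X}$, $\theta_{Z|X}$.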

10 Parametric macro methods: the normal case Estimates for the re-parameterization

11 Parametric macro methods: the normal case
For the estimates of the parameters of the marginal distribution of X, the whole sample A ∪ B can be used.

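12 Parametric macro methods: the normal case
For the estimates of the parameters of the distribution of Y given X, only sample A can be used. Writing the regression of Y on X as $\alpha_Y + \beta_{YX} X$ with residual variance $\sigma^2_{Y|X}$, and taking $\hat\mu_X$ and $\hat\sigma_X^2$ from A ∪ B, the marginal parameters for Y are:
$$\hat\mu_Y = \hat\alpha_Y + \hat\beta_{YX}\,\hat\mu_X, \qquad \hat\sigma_Y^2 = \hat\sigma^2_{Y|X} + \hat\beta_{YX}^2\,\hat\sigma_X^2, \qquad \hat\sigma_{XY} = \hat\beta_{YX}\,\hat\sigma_X^2$$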

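13 Parametric macro methods: the normal case
For the estimates of the parameters of the distribution of Z given X, only sample B can be used. Writing the regression of Z on X as $\alpha_Z + \beta_{ZX} X$ with residual variance $\sigma^2_{Z|X}$, and taking $\hat\mu_X$ and $\hat\sigma_X^2$ from A ∪ B, the marginal parameters for Z are:
$$\hat\mu_Z = \hat\alpha_Z + \hat\beta_{ZX}\,\hat\mu_X, \qquad \hat\sigma_Z^2 = \hat\sigma^2_{Z|X} + \hat\beta_{ZX}^2\,\hat\sigma_X^2, \qquad \hat\sigma_{XZ} = \hat\beta_{ZX}\,\hat\sigma_X^2$$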
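The three estimation steps of the normal case (marginal of X on the pooled sample, Y given X on A, Z given X on B) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `fit_line` and `cia_normal_mle`, the dictionary keys and the data are all hypothetical, and a single common variable X is assumed.

```python
from statistics import fmean


def fit_line(x, y):
    """ML fit of y = alpha + beta * x; the residual variance divides by n."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = my - beta * mx
    resid_var = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y)) / len(x)
    return alpha, beta, resid_var


def cia_normal_mle(x_a, y_a, x_b, z_b):
    """Hypothetical helper: ML estimates for a trivariate normal under the CIA."""
    # Marginal distribution of X: pooled sample A and B.
    x_all = list(x_a) + list(x_b)
    mu_x = fmean(x_all)
    var_x = sum((xi - mu_x) ** 2 for xi in x_all) / len(x_all)
    # Y given X is estimated on A only, Z given X on B only.
    alpha_y, beta_y, res_y = fit_line(x_a, y_a)
    alpha_z, beta_z, res_z = fit_line(x_b, z_b)
    return {
        "mu_x": mu_x,
        "var_x": var_x,
        "mu_y": alpha_y + beta_y * mu_x,
        "var_y": res_y + beta_y ** 2 * var_x,
        "cov_xy": beta_y * var_x,
        "mu_z": alpha_z + beta_z * mu_x,
        "var_z": res_z + beta_z ** 2 * var_x,
        "cov_xz": beta_z * var_x,
        # Under the CIA the Y-Z covariance is implied by the other parameters.
        "cov_yz": beta_y * beta_z * var_x,
    }


# Made-up data: Y = 2X in A, Z = 10 - X in B.
est = cia_normal_mle([1, 2, 3, 4], [2, 4, 6, 8],
                     [2, 3, 4, 5, 6, 7], [8, 7, 6, 5, 4, 3])
```

Note that the mean of Y is not the plain average of Y in A: it is the regression prediction evaluated at the pooled mean of X.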

14 Comment: why maximum likelihood estimation?
What happens if, instead of the previous maximum likelihood estimation, each parameter is estimated directly on the data set where the corresponding variables are observed? For instance, consider estimating the mean of Y directly on A (i.e. with the sample average of Y in A) instead of using $\hat\mu_Y$ (a kind of regression estimate in double sampling). The comparison of the two variances depends on $\rho_{XY}$, the correlation coefficient between X and Y: the maximum likelihood estimator becomes much more efficient as the size of sample B increases and as X and Y become more highly correlated.
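To get a feeling for the size of the efficiency gain, the sketch below compares the variance of the sample average of Y in A with that of the regression-type estimator. It assumes the usual large-sample approximation for a regression estimator with incomplete data, $\frac{\sigma_Y^2}{n_A}\left[1 - \rho_{XY}^2 \frac{n_B}{n_A + n_B}\right]$; this expression, the function names and the numbers are assumptions introduced for illustration.

```python
def var_sample_mean(sigma2_y, n_a):
    """Variance of the plain average of Y computed on A."""
    return sigma2_y / n_a


def var_ml_mean(sigma2_y, rho_xy, n_a, n_b):
    """Assumed large-sample variance of the ML (regression-type) estimator."""
    return sigma2_y / n_a * (1 - rho_xy ** 2 * n_b / (n_a + n_b))


# Hypothetical setting: B is nine times larger than A and rho_XY = 0.9.
ratio = var_ml_mean(1.0, 0.9, 100, 900) / var_sample_mean(1.0, 100)
```

With these made-up values the ratio is about 0.27. When `rho_xy` is 0 or `n_b` is 0 the ratio is exactly 1, i.e. no gain.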

15 Comment: why maximum likelihood estimation?
When each parameter is estimated separately on the part of A ∪ B that is complete for the corresponding r.v., the estimates may be incoherent. For instance, the estimated variance-covariance matrix of (X, Y) may fail to be positive semidefinite! This does not happen when all the parameters are estimated simultaneously by maximum likelihood.
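A small numerical sketch of this incoherence, on made-up data: the variance of X is estimated on the pooled sample, while the variance of Y and the covariance of X and Y are estimated on A alone. The helper names and the data are hypothetical.

```python
from statistics import fmean


def ml_var(v):
    m = fmean(v)
    return sum((vi - m) ** 2 for vi in v) / len(v)


def ml_cov(u, v):
    mu, mv = fmean(u), fmean(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v)) / len(u)


# Made-up data: X varies in A but is constant in B.
x_a, y_a = [0.0, 1.0, 2.0], [0.0, 12.0, 18.0]
x_b = [1.0] * 6

var_x = ml_var(x_a + x_b)  # pooled estimate on A and B

# Separate estimation: each term from the data where it is observed.
det_separate = var_x * ml_var(y_a) - ml_cov(x_a, y_a) ** 2

# Simultaneous ML estimation through the regression of Y on X in A.
beta = ml_cov(x_a, y_a) / ml_var(x_a)
alpha = fmean(y_a) - beta * fmean(x_a)
resid_var = fmean([(y - alpha - beta * x) ** 2 for x, y in zip(x_a, y_a)])
var_y_ml = resid_var + beta ** 2 * var_x
cov_xy_ml = beta * var_x
det_ml = var_x * var_y_ml - cov_xy_ml ** 2
```

Here `det_separate` is negative, so the separately estimated matrix is not a valid covariance matrix, while `det_ml` equals the pooled variance of X times the residual variance and is therefore never negative.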

16 Example

17 Example
Under the CIA, the maximum likelihood estimates of the parameters are:
From the previous estimates we get:

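18 Parametric macro methods: the multinomial case
Let (X, Y, Z) be a multinomial r.v. with parameters $\theta = \{\theta_{ijk}\}$, $i = 1, \dots, I$, $j = 1, \dots, J$, $k = 1, \dots, K$, where $\theta$ is a vector of parameters with the following characteristics:
$$\theta_{ijk} \ge 0, \qquad \sum_{i,j,k} \theta_{ijk} = 1$$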

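19 Parametric macro methods: the multinomial case
Adopting the same re-parameterization of the joint distribution, under the CIA the parameters of interest are:
$$\theta_{i..}, \qquad \theta_{j|i} = \theta_{ij.} / \theta_{i..}, \qquad \theta_{k|i} = \theta_{i.k} / \theta_{i..}$$
In this context, the parameters of the joint distribution are computed according to the following formula:
$$\theta_{ijk} = \theta_{i..}\,\theta_{j|i}\,\theta_{k|i}$$
When the interest is only in the pairwise distribution of (Y, Z):
$$\theta_{.jk} = \sum_{i} \theta_{i..}\,\theta_{j|i}\,\theta_{k|i}$$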

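20 Parametric macro methods: the multinomial case
Given the sample A ∪ B, the maximum likelihood estimator is
$$\hat\theta_{i..} = \frac{n^A_{i..} + n^B_{i..}}{n_A + n_B}, \qquad \hat\theta_{j|i} = \frac{n^A_{ij.}}{n^A_{i..}}, \qquad \hat\theta_{k|i} = \frac{n^B_{i.k}}{n^B_{i..}}$$
where $n^A$ and $n^B$ denote the observed counts in A and B respectively.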
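For categorical data the same three-step logic reduces to counting. The following is a minimal sketch, assuming A is given as a list of (x, y) pairs and B as a list of (x, z) pairs; the function name `cia_multinomial_mle` and the toy data are hypothetical.

```python
from collections import Counter


def cia_multinomial_mle(sample_a, sample_b):
    """sample_a: list of (x, y) pairs; sample_b: list of (x, z) pairs."""
    n_a, n_b = len(sample_a), len(sample_b)
    x_in_a = Counter(x for x, _ in sample_a)
    x_in_b = Counter(x for x, _ in sample_b)
    # Marginal of X from the pooled sample.
    theta_x = {x: (x_in_a[x] + x_in_b[x]) / (n_a + n_b)
               for x in set(x_in_a) | set(x_in_b)}
    # Y given X from A only, Z given X from B only.
    theta_y_x = {(x, y): c / x_in_a[x] for (x, y), c in Counter(sample_a).items()}
    theta_z_x = {(x, z): c / x_in_b[x] for (x, z), c in Counter(sample_b).items()}
    # Joint distribution implied by the CIA.
    theta_xyz = {(x, y, z): theta_x[x] * p_y * p_z
                 for (x, y), p_y in theta_y_x.items()
                 for (x2, z), p_z in theta_z_x.items() if x2 == x}
    return theta_x, theta_y_x, theta_z_x, theta_xyz


# Made-up samples with I = 2, J = 2, K = 3.
sample_a = [(1, 1), (1, 2), (2, 1), (2, 1)]
sample_b = [(1, 1), (1, 1), (1, 3), (2, 2)]
theta_x, theta_y_x, theta_z_x, theta_xyz = cia_multinomial_mle(sample_a, sample_b)
```

Cells with zero estimated probability are simply absent from the returned dictionaries.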

21 Example
Let's consider the following two samples A and B, where I = 2, J = 2, K = 3.

22 Example The maximum likelihood estimates of the parameters are:

23 Example The maximum likelihood estimates of the parameters of the joint distribution are:

24 Parametric macro methods: conclusions
The CIA model is identifiable (i.e. it admits a unique estimate) for the data set A ∪ B.
The application of the maximum likelihood estimator is very easy.
Even though the problem is characterized by missing data, it can be split into three «complete data» subproblems, one for each parameter of the re-parameterization.
Other estimation methods can be statistically consistent, but incoherent.

26 Parametric micro methods
Approach: parametric. Output: micro.
We are still in the familiar context of parametric data models, but now the objective is micro, i.e. we are interested in a complete data set where (X, Y, Z) are jointly available. This is the context where imputation methods are usually used!

27 Objective and context
Objective: to create a complete data set for (X, Y, Z).
Context: partially observed data set.

28 Parametric micro methods: rationale
Method: imputation of missing values. In a parametric context:
1. Estimate the distribution parameters.
2. Take a (not necessarily random) value from the estimated distribution.
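The two steps can be illustrated for a normal model of Z given X. In the sketch below the regression parameters are assumed to have been estimated already (step 1); step 2 either takes the conditional mean (a non-random value) or draws from the estimated conditional distribution. The function name `impute_z` and all numeric values are hypothetical.

```python
import random


def impute_z(x_values, alpha, beta, resid_sd, rng=None):
    """Impute Z for recipient records from an estimated normal model of Z given X.

    rng=None  -> conditional mean matching (non-random value)
    rng given -> random draw from N(alpha + beta * x, resid_sd ** 2)
    """
    imputed = []
    for x in x_values:
        mean = alpha + beta * x
        imputed.append(mean if rng is None else rng.gauss(mean, resid_sd))
    return imputed


# Hypothetical estimates of the regression of Z on X.
z_mean = impute_z([1.0, 2.0], alpha=0.5, beta=2.0, resid_sd=1.0)
z_draw = impute_z([1.0, 2.0], alpha=0.5, beta=2.0, resid_sd=1.0,
                  rng=random.Random(0))
```

Conditional means reproduce the regression line exactly and therefore understate the variability of Z; adding a random residual restores it.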

30 Non parametric macro methods
Approach: nonparametric. Output: macro.
The family of distributions to which the distribution of (X, Y, Z) belongs cannot be represented by a finite number of parameters. Although the statistical literature on nonparametrics is huge, this is by far the most neglected situation in statistical matching! We consider it anyway, in order to link macro and micro methods as in the parametric case.

31 Non parametric macro methods: rationale
Although usually neglected in the statistical matching literature, these methodologies can be developed similarly to the parametric macro case. Two situations have been mainly studied:
1. X categorical, Y and Z ordered or numerical: estimation of the empirical cumulative distribution function.
2. X, Y and Z numerical: estimation of the nonparametric regression function.
The first approach is helpful for the random generation of imputations in the corresponding micro methods; the second for conditional mean matching, adding a random residual whenever possible.

33 Non parametric micro methods
Approach: nonparametric. Output: micro.
Those who applied these methods seldom assumed anything about the distribution of (X, Y, Z). Yet each micro method has a macro counterpart, i.e. an implicit representation of how the distribution of (X, Y, Z) is estimated. The question is whether one is aware of it or not.

34 Non parametric micro methods
The nonparametric micro matching methods consist essentially of three imputation procedures:
1. Random hot deck
2. Rank hot deck
3. Distance hot deck
As already seen in the parametric case, each of these methods corresponds to a specific nonparametric macro approach to the estimation of the distribution $f(x, y, z)$ or of a characteristic value. In general, these methods do not organize the two data sets A and B as a unique sample A ∪ B.

35 Parametric micro methods
A is the recipient file: these are the data to impute.
B is the donor file: these are the data to use for imputation.
The idea is to consider one file as the recipient and the other as the donor.

36 Example
In order to define the different hot deck methods, let's consider an example. Let A and B be the following:
A: $n_A = 6$, observed variables: Gender, Age, Income
B: $n_B = 10$, observed variables: Gender, Age, Expenditures
A = recipient, B = donor
Common variables: X = ($X_1$ = Gender, $X_2$ = Age); Y = (Income); Z = (Expenditures)

37 Example

38 Random hot deck: the method
1. Draw one value at random from B and assign it to the first record to impute in A.
2. Follow the same procedure for all the $a \in A$.
Example: in general we have $n_B^{n_A} = 10^6$ possible different ways to impute A.
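The procedure amounts to sampling donors with replacement. A minimal sketch follows; the function name `random_hot_deck` and the donor values are made up, and only the sample sizes 6 and 10 come from the example.

```python
import random


def random_hot_deck(n_recipients, donor_values, rng):
    """Assign to each recipient a value drawn at random (with replacement) from the donors."""
    return [rng.choice(donor_values) for _ in range(n_recipients)]


# Hypothetical expenditures observed on the 10 donors in B.
z_b = [110, 95, 130, 80, 150, 120, 70, 105, 90, 140]
imputed = random_hot_deck(6, z_b, random.Random(42))

# Every recipient can receive any donor: n_B ** n_A possible completed files.
n_ways = len(z_b) ** 6
```

Passing an explicit `random.Random` instance keeps the imputation reproducible.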

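39 Conditional random hot deck: the method
1. Fix a conditioning variable, e.g. $X_1$.
2. For the first record a = 1, draw one value at random from the subset of units in B with the same value of $X_1$ as the recipient (here $X_{1b} = F$).
3. Follow the same procedure for all the $a \in A$.
Example: let $m_A$ and $m_B$ be the numbers of units with $X_1 = F$ in A and B. Since each recipient chooses independently within its own class, the number of different completed data sets we can get is $m_B^{m_A}\,(n_B - m_B)^{n_A - m_A} = 6^4 \cdot 4^2 = 20736$.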
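Conditioning only changes the pool from which each donor is drawn. A minimal sketch, assuming donors are given as (class, value) pairs; the function name `conditional_random_hot_deck` and the data are hypothetical.

```python
import random
from collections import defaultdict


def conditional_random_hot_deck(recipient_classes, donors, rng):
    """donors: list of (class, z) pairs; each recipient draws within its own class.

    Raises IndexError if a recipient class has no donor in B.
    """
    pools = defaultdict(list)
    for cls, z in donors:
        pools[cls].append(z)
    return [rng.choice(pools[cls]) for cls in recipient_classes]


# Made-up donors: gender and expenditures.
donors = [("F", 100), ("F", 120), ("M", 90)]
imputed = conditional_random_hot_deck(["F", "M", "F"], donors, random.Random(7))
```

In practice, classes without donors have to be merged with neighbouring classes or handled by another method before imputing.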

40 Comments
1. Random hot deck corresponds to a random generation of the values to impute from the empirical cumulative distribution function of Z.
2. Conditional random hot deck corresponds to a random generation of the values to impute from the empirical cumulative distribution function of Z given X.
3. It is possible to eliminate the already drawn value from the set of possible donors (constrained procedure); however, the preservation of the observed distribution of Z, or of Z given X, in B is then jeopardized.

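41 Rank hot deck
Let's assume that $n_B = k\,n_A$, with k integer. Compute the empirical cumulative distribution functions of X in A and in B:
$$\hat F_X^A(x) = \frac{1}{n_A}\sum_{a=1}^{n_A} I(x_a \le x), \qquad \hat F_X^B(x) = \frac{1}{n_B}\sum_{b=1}^{n_B} I(x_b \le x)$$
To each $a \in A$ assign the $b^* \in B$ chosen so that
$$\left|\hat F_X^A(x_a) - \hat F_X^B(x_{b^*})\right| = \min_{b \in B} \left|\hat F_X^A(x_a) - \hat F_X^B(x_b)\right|$$
In other words, this method matches records whose quantiles of X are similar in A and B respectively.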

42 Rank hot deck
Rank the two samples A and B according to $X_1$.

43 Rank hot deck
These are the values of the empirical cumulative distribution function of $X_1$ in A and B respectively.

44 Rank hot deck
This is the result. In this example, there is only one way to impute a value.
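Rank hot deck can be sketched as follows: compute the empirical cumulative distribution function of the matching variable in each file, then give every recipient the donor whose value of that function is closest. The function names `ecdf` and `rank_hot_deck` and the data are hypothetical, and ties are broken by taking the first donor, which is an arbitrary choice of this sketch.

```python
from bisect import bisect_right


def ecdf(values):
    """Return the empirical cumulative distribution function of `values`."""
    ordered = sorted(values)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def rank_hot_deck(x_a, x_b, z_b):
    """Impute to each recipient the Z of the donor with the closest ECDF value."""
    f_a, f_b = ecdf(x_a), ecdf(x_b)
    f_b_values = [f_b(x) for x in x_b]
    imputed = []
    for x in x_a:
        target = f_a(x)
        best = min(range(len(x_b)), key=lambda b: abs(target - f_b_values[b]))
        imputed.append(z_b[best])
    return imputed


# Made-up ages and expenditures, with n_B = 2 * n_A.
x_a = [20, 40, 60]
x_b = [10, 25, 30, 45, 50, 70]
z_b = [1, 2, 3, 4, 5, 6]
imputed = rank_hot_deck(x_a, x_b, z_b)
```

Matching is on ranks, not on values: the recipient aged 60 is the oldest in A, so it receives the value of the oldest donor in B, aged 70.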

45 Distance hot deck
To each $a \in A$ assign the $b \in B$ that is nearest according to the common variables. This method depends on the distance function, and different choices are available:
If X is numeric, it is possible to choose the Manhattan distance, $d_{ab} = |x_a - x_b|$; another choice is the Euclidean distance.
If X is multivariate, the available distances include the Mahalanobis and the Canberra distances.
If X is categorical and unordered, it is possible to consider imputation classes (i.e. the distance is «equality»).

46 Distance hot deck - example
Let's consider $X_2$ as the variable used to compute distances (i.e. we choose as donors those records in B whose age is the most similar to the one in A). If several donors are at the same minimum distance, choose one of them at random.
The overall distance between donors and recipients is
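A minimal sketch of unconstrained distance hot deck on a single numeric matching variable, using the absolute difference as distance and a random choice among equally distant donors. The function name `distance_hot_deck` and the data are hypothetical.

```python
import random


def distance_hot_deck(x_a, x_b, z_b, rng):
    """Nearest-donor imputation; returns the imputed values and the overall distance."""
    imputed, total_distance = [], 0
    for xa in x_a:
        distances = [abs(xa - xb) for xb in x_b]
        d_min = min(distances)
        # Several donors may be equally close: pick one of them at random.
        nearest = [j for j, d in enumerate(distances) if d == d_min]
        imputed.append(z_b[rng.choice(nearest)])
        total_distance += d_min
    return imputed, total_distance


# Made-up ages and expenditures.
imputed, total = distance_hot_deck([30, 52], [28, 50, 51], [100, 200, 300],
                                   random.Random(0))
```

The same donor may be selected for several recipients, which is what the constrained variant avoids.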

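47 Constrained distance hot deck
In the former procedure, a donor can be chosen more than once if it is the nearest to more than one record in A. In order to avoid using the same information more than once, the following constrained procedure has been defined:
a. Minimize
$$\sum_{a=1}^{n_A}\sum_{b=1}^{n_B} d_{ab}\, w_{ab}$$
b. Under the constraints
$$\sum_{b=1}^{n_B} w_{ab} = 1 \ \ (a = 1, \dots, n_A), \qquad \sum_{a=1}^{n_A} w_{ab} \le 1 \ \ (b = 1, \dots, n_B), \qquad w_{ab} \in \{0, 1\}$$
where $d_{ab}$ is the distance between a and b, and $w_{ab} = 1$ if donor b is assigned to recipient a.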

48 Constrained distance hot deck: advantages and disadvantages
Constrained matching makes it possible to preserve the marginal distribution of the variable to impute (Z).
Constrained distance hot deck is characterized by a larger overall distance between donors and recipients than distance hot deck.

49 Constrained distance hot deck: example The overall donor recipient distance is
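For very small files the constrained problem can be solved by brute force, trying every assignment of distinct donors to recipients. The sketch below does exactly that; it is not the algorithm one would use in practice (the search grows factorially, and a dedicated assignment algorithm such as the Hungarian method would be used instead). The function name `constrained_distance_hot_deck` and the data are hypothetical.

```python
from itertools import permutations


def constrained_distance_hot_deck(x_a, x_b, z_b):
    """Each donor is used at most once; minimise the overall distance by enumeration."""
    best_total, best_assignment = None, None
    for assignment in permutations(range(len(x_b)), len(x_a)):
        total = sum(abs(xa - x_b[j]) for xa, j in zip(x_a, assignment))
        if best_total is None or total < best_total:
            best_total, best_assignment = total, assignment
    return [z_b[j] for j in best_assignment], best_total


# Made-up data: both recipients are closest to the same donor (age 30).
x_a, x_b, z_b = [30, 31], [30, 40, 50], [1, 2, 3]
imputed, constrained_total = constrained_distance_hot_deck(x_a, x_b, z_b)

# Unconstrained overall distance, for comparison.
unconstrained_total = sum(min(abs(xa - xb) for xb in x_b) for xa in x_a)
```

In this toy case the unconstrained overall distance is 1, while forcing distinct donors raises it to 9: the price paid for not reusing donors.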

50 Constrained distance hot deck: comments
Distance hot deck is equivalent to the estimation of a nonparametric regression function via the kNN method with k = 1.
These methods always produce live (i.e. actually observed) data as imputations.
Sometimes parametric and nonparametric procedures are applied together: mixed methods.
Example:
Regression step: impute intermediate values in A and B.
Matching step: use a distance hot deck, selecting the donor $b^*$ with the shortest distance.

52 Auxiliary information
In order to avoid the CIA, two different kinds of auxiliary information have usually been considered:
1) a third file C where either (X, Y, Z) or (Y, Z) are jointly observed;
2) a plausible value of the inestimable parameters of either (Y, Z | X) or (Y, Z).
These additional sources of information may originate from:
an outdated statistical investigation;
an administrative register;
a supplemental (even small) ad hoc survey;
proxy variables for (Y, Z).
Always pay attention to their accuracy!

53 Auxiliary information on parameters
Auxiliary information on parameters can be in terms of:
information about $\theta_{YZ|X}$;
information about $\theta_{YZ}$.
This kind of information restricts the parameter space $\Theta$ to a subspace $\Theta^*$, which contains all the parameters $\theta \in \Theta$ compatible with the auxiliary information.
NOTE: most of the time the unconstrained maximum likelihood estimate is not compatible with this information; as a result, the estimates of the estimable parameters differ from those obtained under the CIA.

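54 Example: auxiliary information on $\rho_{YZ} = \rho^*_{YZ}$
Let us suppose that the correlations between X and Y and between X and Z are given. The value $\rho^*_{YZ} = 0.7$ is compatible (the determinant of the correlation matrix $\rho$ is nonnegative), while $\rho^*_{YZ} = 0.9$ is not compatible: $\det(\rho) = -0.008$.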
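The compatibility check is a check that the 3 x 3 correlation matrix of (X, Y, Z) stays positive semidefinite once the auxiliary value is plugged in. The sketch below computes the determinant and the admissible interval for the correlation between Y and Z. The correlations 0.8 and 0.3 are hypothetical values chosen for illustration, not those of the example above, and the function names are made up.

```python
from math import sqrt


def corr_det(r_xy, r_xz, r_yz):
    """Determinant of the 3 x 3 correlation matrix of (X, Y, Z)."""
    return 1 - r_xy ** 2 - r_xz ** 2 - r_yz ** 2 + 2 * r_xy * r_xz * r_yz


def admissible_interval(r_xy, r_xz):
    """Values of r_yz for which the correlation matrix is positive semidefinite."""
    centre = r_xy * r_xz  # value of r_yz implied by the CIA
    half_width = sqrt((1 - r_xy ** 2) * (1 - r_xz ** 2))
    return centre - half_width, centre + half_width


# Hypothetical estimable correlations.
r_xy, r_xz = 0.8, 0.3
low, high = admissible_interval(r_xy, r_xz)
det_07 = corr_det(r_xy, r_xz, 0.7)  # positive: 0.7 is compatible
det_09 = corr_det(r_xy, r_xz, 0.9)  # negative: 0.9 is not compatible
```

The midpoint of the admissible interval is the value implied by the CIA, so auxiliary information far from that midpoint moves the estimates furthest from the CIA solution.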

55 Use of a third file C, complete on X, Y, Z
In a parametric macro approach: use the EM algorithm on the union of A, B and C.
In a parametric micro approach: use conditional mean matching.
In a nonparametric micro approach: use hot deck (first impute A with records from C, then impute live B values on A using the imputed records).
Parametric and nonparametric methods can also be mixed (e.g. impute A records by a conditional mean matching that makes use of the additional file C, then impute live B values).


More information

Multiple Imputation for Missing Data. Benjamin Cooper, MPH Public Health Data & Training Center Institute for Public Health

Multiple Imputation for Missing Data. Benjamin Cooper, MPH Public Health Data & Training Center Institute for Public Health Multiple Imputation for Missing Data Benjamin Cooper, MPH Public Health Data & Training Center Institute for Public Health Outline Missing data mechanisms What is Multiple Imputation? Software Options

More information

Dynamic Thresholding for Image Analysis

Dynamic Thresholding for Image Analysis Dynamic Thresholding for Image Analysis Statistical Consulting Report for Edward Chan Clean Energy Research Center University of British Columbia by Libo Lu Department of Statistics University of British

More information

Multiple-imputation analysis using Stata s mi command

Multiple-imputation analysis using Stata s mi command Multiple-imputation analysis using Stata s mi command Yulia Marchenko Senior Statistician StataCorp LP 2009 UK Stata Users Group Meeting Yulia Marchenko (StataCorp) Multiple-imputation analysis using mi

More information

Using Machine Learning to Optimize Storage Systems

Using Machine Learning to Optimize Storage Systems Using Machine Learning to Optimize Storage Systems Dr. Kiran Gunnam 1 Outline 1. Overview 2. Building Flash Models using Logistic Regression. 3. Storage Object classification 4. Storage Allocation recommendation

More information

Statistical Matching of Discrete Data by Bayesian Networks

Statistical Matching of Discrete Data by Bayesian Networks JMLR: Workshop and Conference Proceedings vol 52, 159-170, 2016 PGM 2016 Statistical Matching of Discrete Data by Bayesian Networks Eva Endres Thomas Augustin Department of Statistics, Ludwig-Maximilians-Universität

More information

Clustering. Mihaela van der Schaar. January 27, Department of Engineering Science University of Oxford

Clustering. Mihaela van der Schaar. January 27, Department of Engineering Science University of Oxford Department of Engineering Science University of Oxford January 27, 2017 Many datasets consist of multiple heterogeneous subsets. Cluster analysis: Given an unlabelled data, want algorithms that automatically

More information

Missing data analysis. University College London, 2015

Missing data analysis. University College London, 2015 Missing data analysis University College London, 2015 Contents 1. Introduction 2. Missing-data mechanisms 3. Missing-data methods that discard data 4. Simple approaches that retain all the data 5. RIBG

More information

Statistics (STAT) Statistics (STAT) 1. Prerequisites: grade in C- or higher in STAT 1200 or STAT 1300 or STAT 1400

Statistics (STAT) Statistics (STAT) 1. Prerequisites: grade in C- or higher in STAT 1200 or STAT 1300 or STAT 1400 Statistics (STAT) 1 Statistics (STAT) STAT 1200: Introductory Statistical Reasoning Statistical concepts for critically evaluation quantitative information. Descriptive statistics, probability, estimation,

More information

CS 229 Midterm Review

CS 229 Midterm Review CS 229 Midterm Review Course Staff Fall 2018 11/2/2018 Outline Today: SVMs Kernels Tree Ensembles EM Algorithm / Mixture Models [ Focus on building intuition, less so on solving specific problems. Ask

More information

The Use of Sample Weights in Hot Deck Imputation

The Use of Sample Weights in Hot Deck Imputation Journal of Official Statistics, Vol. 25, No. 1, 2009, pp. 21 36 The Use of Sample Weights in Hot Deck Imputation Rebecca R. Andridge 1 and Roderick J. Little 1 A common strategy for handling item nonresponse

More information

ON SOME METHODS OF CONSTRUCTION OF BLOCK DESIGNS

ON SOME METHODS OF CONSTRUCTION OF BLOCK DESIGNS ON SOME METHODS OF CONSTRUCTION OF BLOCK DESIGNS NURNABI MEHERUL ALAM M.Sc. (Agricultural Statistics), Roll No. I.A.S.R.I, Library Avenue, New Delhi- Chairperson: Dr. P.K. Batra Abstract: Block designs

More information

Statistical Analysis of List Experiments

Statistical Analysis of List Experiments Statistical Analysis of List Experiments Kosuke Imai Princeton University Joint work with Graeme Blair October 29, 2010 Blair and Imai (Princeton) List Experiments NJIT (Mathematics) 1 / 26 Motivation

More information

Digital Image Classification Geography 4354 Remote Sensing

Digital Image Classification Geography 4354 Remote Sensing Digital Image Classification Geography 4354 Remote Sensing Lab 11 Dr. James Campbell December 10, 2001 Group #4 Mark Dougherty Paul Bartholomew Akisha Williams Dave Trible Seth McCoy Table of Contents:

More information

Lecture 7: Linear Regression (continued)

Lecture 7: Linear Regression (continued) Lecture 7: Linear Regression (continued) Reading: Chapter 3 STATS 2: Data mining and analysis Jonathan Taylor, 10/8 Slide credits: Sergio Bacallado 1 / 14 Potential issues in linear regression 1. Interactions

More information

Performance of Sequential Imputation Method in Multilevel Applications

Performance of Sequential Imputation Method in Multilevel Applications Section on Survey Research Methods JSM 9 Performance of Sequential Imputation Method in Multilevel Applications Enxu Zhao, Recai M. Yucel New York State Department of Health, 8 N. Pearl St., Albany, NY

More information

Predict Outcomes and Reveal Relationships in Categorical Data

Predict Outcomes and Reveal Relationships in Categorical Data PASW Categories 18 Specifications Predict Outcomes and Reveal Relationships in Categorical Data Unleash the full potential of your data through predictive analysis, statistical learning, perceptual mapping,

More information

MULTIVARIATE TEXTURE DISCRIMINATION USING A PRINCIPAL GEODESIC CLASSIFIER

MULTIVARIATE TEXTURE DISCRIMINATION USING A PRINCIPAL GEODESIC CLASSIFIER MULTIVARIATE TEXTURE DISCRIMINATION USING A PRINCIPAL GEODESIC CLASSIFIER A.Shabbir 1, 2 and G.Verdoolaege 1, 3 1 Department of Applied Physics, Ghent University, B-9000 Ghent, Belgium 2 Max Planck Institute

More information

JMP Book Descriptions

JMP Book Descriptions JMP Book Descriptions The collection of JMP documentation is available in the JMP Help > Books menu. This document describes each title to help you decide which book to explore. Each book title is linked

More information

Instance-based Learning CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2015

Instance-based Learning CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2015 Instance-based Learning CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2015 Outline Non-parametric approach Unsupervised: Non-parametric density estimation Parzen Windows K-Nearest

More information

( ) =cov X Y = W PRINCIPAL COMPONENT ANALYSIS. Eigenvectors of the covariance matrix are the principal components

( ) =cov X Y = W PRINCIPAL COMPONENT ANALYSIS. Eigenvectors of the covariance matrix are the principal components Review Lecture 14 ! PRINCIPAL COMPONENT ANALYSIS Eigenvectors of the covariance matrix are the principal components 1. =cov X Top K principal components are the eigenvectors with K largest eigenvalues

More information

THE 2002 U.S. CENSUS OF AGRICULTURE DATA PROCESSING SYSTEM

THE 2002 U.S. CENSUS OF AGRICULTURE DATA PROCESSING SYSTEM Abstract THE 2002 U.S. CENSUS OF AGRICULTURE DATA PROCESSING SYSTEM Kara Perritt and Chadd Crouse National Agricultural Statistics Service In 1997 responsibility for the census of agriculture was transferred

More information

Olmo S. Zavala Romero. Clustering Hierarchical Distance Group Dist. K-means. Center of Atmospheric Sciences, UNAM.

Olmo S. Zavala Romero. Clustering Hierarchical Distance Group Dist. K-means. Center of Atmospheric Sciences, UNAM. Center of Atmospheric Sciences, UNAM November 16, 2016 Cluster Analisis Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster)

More information

Section 4 Matching Estimator

Section 4 Matching Estimator Section 4 Matching Estimator Matching Estimators Key Idea: The matching method compares the outcomes of program participants with those of matched nonparticipants, where matches are chosen on the basis

More information

A Monotonic Sequence and Subsequence Approach in Missing Data Statistical Analysis

A Monotonic Sequence and Subsequence Approach in Missing Data Statistical Analysis Global Journal of Pure and Applied Mathematics. ISSN 0973-1768 Volume 12, Number 1 (2016), pp. 1131-1140 Research India Publications http://www.ripublication.com A Monotonic Sequence and Subsequence Approach

More information

Missing Data Missing Data Methods in ML Multiple Imputation

Missing Data Missing Data Methods in ML Multiple Imputation Missing Data Missing Data Methods in ML Multiple Imputation PRE 905: Multivariate Analysis Lecture 11: April 22, 2014 PRE 905: Lecture 11 Missing Data Methods Today s Lecture The basics of missing data:

More information

Massive Data Analysis

Massive Data Analysis Professor, Department of Electrical and Computer Engineering Tennessee Technological University February 25, 2015 Big Data This talk is based on the report [1]. The growth of big data is changing that

More information

WELCOME! Lecture 3 Thommy Perlinger

WELCOME! Lecture 3 Thommy Perlinger Quantitative Methods II WELCOME! Lecture 3 Thommy Perlinger Program Lecture 3 Cleaning and transforming data Graphical examination of the data Missing Values Graphical examination of the data It is important

More information

Time Series Analysis by State Space Methods

Time Series Analysis by State Space Methods Time Series Analysis by State Space Methods Second Edition J. Durbin London School of Economics and Political Science and University College London S. J. Koopman Vrije Universiteit Amsterdam OXFORD UNIVERSITY

More information

Programs for MDE Modeling and Conditional Distribution Calculation

Programs for MDE Modeling and Conditional Distribution Calculation Programs for MDE Modeling and Conditional Distribution Calculation Sahyun Hong and Clayton V. Deutsch Improved numerical reservoir models are constructed when all available diverse data sources are accounted

More information

Detecting Burnscar from Hyperspectral Imagery via Sparse Representation with Low-Rank Interference

Detecting Burnscar from Hyperspectral Imagery via Sparse Representation with Low-Rank Interference Detecting Burnscar from Hyperspectral Imagery via Sparse Representation with Low-Rank Interference Minh Dao 1, Xiang Xiang 1, Bulent Ayhan 2, Chiman Kwan 2, Trac D. Tran 1 Johns Hopkins Univeristy, 3400

More information

- 1 - Fig. A5.1 Missing value analysis dialog box

- 1 - Fig. A5.1 Missing value analysis dialog box WEB APPENDIX Sarstedt, M. & Mooi, E. (2019). A concise guide to market research. The process, data, and methods using SPSS (3 rd ed.). Heidelberg: Springer. Missing Value Analysis and Multiple Imputation

More information

An Introduction to the Bootstrap

An Introduction to the Bootstrap An Introduction to the Bootstrap Bradley Efron Department of Statistics Stanford University and Robert J. Tibshirani Department of Preventative Medicine and Biostatistics and Department of Statistics,

More information

A Fast Clustering Algorithm with Application to Cosmology. Woncheol Jang

A Fast Clustering Algorithm with Application to Cosmology. Woncheol Jang A Fast Clustering Algorithm with Application to Cosmology Woncheol Jang May 5, 2004 Abstract We present a fast clustering algorithm for density contour clusters (Hartigan, 1975) that is a modified version

More information

Record Linkage for the American Opportunity Study: Formal Framework and Research Agenda

Record Linkage for the American Opportunity Study: Formal Framework and Research Agenda 1 / 14 Record Linkage for the American Opportunity Study: Formal Framework and Research Agenda Stephen E. Fienberg Department of Statistics, Heinz College, and Machine Learning Department, Carnegie Mellon

More information

Automatic Selection of Compiler Options Using Non-parametric Inferential Statistics

Automatic Selection of Compiler Options Using Non-parametric Inferential Statistics Automatic Selection of Compiler Options Using Non-parametric Inferential Statistics Masayo Haneda Peter M.W. Knijnenburg Harry A.G. Wijshoff LIACS, Leiden University Motivation An optimal compiler optimization

More information

Random projection for non-gaussian mixture models

Random projection for non-gaussian mixture models Random projection for non-gaussian mixture models Győző Gidófalvi Department of Computer Science and Engineering University of California, San Diego La Jolla, CA 92037 gyozo@cs.ucsd.edu Abstract Recently,

More information

book 2014/5/6 15:21 page v #3 List of figures List of tables Preface to the second edition Preface to the first edition

book 2014/5/6 15:21 page v #3 List of figures List of tables Preface to the second edition Preface to the first edition book 2014/5/6 15:21 page v #3 Contents List of figures List of tables Preface to the second edition Preface to the first edition xvii xix xxi xxiii 1 Data input and output 1 1.1 Input........................................

More information

Creating a data file and entering data

Creating a data file and entering data 4 Creating a data file and entering data There are a number of stages in the process of setting up a data file and analysing the data. The flow chart shown on the next page outlines the main steps that

More information

A Bayesian analysis of survey design parameters for nonresponse, costs and survey outcome variable models

A Bayesian analysis of survey design parameters for nonresponse, costs and survey outcome variable models A Bayesian analysis of survey design parameters for nonresponse, costs and survey outcome variable models Eva de Jong, Nino Mushkudiani and Barry Schouten ASD workshop, November 6-8, 2017 Outline Bayesian

More information

The Comparative Study of Machine Learning Algorithms in Text Data Classification*

The Comparative Study of Machine Learning Algorithms in Text Data Classification* The Comparative Study of Machine Learning Algorithms in Text Data Classification* Wang Xin School of Science, Beijing Information Science and Technology University Beijing, China Abstract Classification

More information

A Fast Multivariate Nearest Neighbour Imputation Algorithm

A Fast Multivariate Nearest Neighbour Imputation Algorithm A Fast Multivariate Nearest Neighbour Imputation Algorithm Norman Solomon, Giles Oatley and Ken McGarry Abstract Imputation of missing data is important in many areas, such as reducing non-response bias

More information

Chapter 1 Introduction. Chapter Contents

Chapter 1 Introduction. Chapter Contents Chapter 1 Introduction Chapter Contents OVERVIEW OF SAS/STAT SOFTWARE................... 17 ABOUT THIS BOOK.............................. 17 Chapter Organization............................. 17 Typographical

More information

The Use of Biplot Analysis and Euclidean Distance with Procrustes Measure for Outliers Detection

The Use of Biplot Analysis and Euclidean Distance with Procrustes Measure for Outliers Detection Volume-8, Issue-1 February 2018 International Journal of Engineering and Management Research Page Number: 194-200 The Use of Biplot Analysis and Euclidean Distance with Procrustes Measure for Outliers

More information

Introduction to Machine Learning. Xiaojin Zhu

Introduction to Machine Learning. Xiaojin Zhu Introduction to Machine Learning Xiaojin Zhu jerryzhu@cs.wisc.edu Read Chapter 1 of this book: Xiaojin Zhu and Andrew B. Goldberg. Introduction to Semi- Supervised Learning. http://www.morganclaypool.com/doi/abs/10.2200/s00196ed1v01y200906aim006

More information

CHAPTER 5. BASIC STEPS FOR MODEL DEVELOPMENT

CHAPTER 5. BASIC STEPS FOR MODEL DEVELOPMENT CHAPTER 5. BASIC STEPS FOR MODEL DEVELOPMENT This chapter provides step by step instructions on how to define and estimate each of the three types of LC models (Cluster, DFactor or Regression) and also

More information

Module 1 Lecture Notes 2. Optimization Problem and Model Formulation

Module 1 Lecture Notes 2. Optimization Problem and Model Formulation Optimization Methods: Introduction and Basic concepts 1 Module 1 Lecture Notes 2 Optimization Problem and Model Formulation Introduction In the previous lecture we studied the evolution of optimization

More information

BCLUST -- A program to assess reliability of gene clusters from expression data by using consensus tree and bootstrap resampling method

BCLUST -- A program to assess reliability of gene clusters from expression data by using consensus tree and bootstrap resampling method BCLUST -- A program to assess reliability of gene clusters from expression data by using consensus tree and bootstrap resampling method Introduction This program is developed in the lab of Hongyu Zhao

More information

Missing Data? A Look at Two Imputation Methods Anita Rocha, Center for Studies in Demography and Ecology University of Washington, Seattle, WA

Missing Data? A Look at Two Imputation Methods Anita Rocha, Center for Studies in Demography and Ecology University of Washington, Seattle, WA Missing Data? A Look at Two Imputation Methods Anita Rocha, Center for Studies in Demography and Ecology University of Washington, Seattle, WA ABSTRACT Statistical analyses can be greatly hampered by missing

More information

ECLT 5810 Clustering

ECLT 5810 Clustering ECLT 5810 Clustering What is Cluster Analysis? Cluster: a collection of data objects Similar to one another within the same cluster Dissimilar to the objects in other clusters Cluster analysis Grouping

More information

Application of Characteristic Function Method in Target Detection

Application of Characteristic Function Method in Target Detection Application of Characteristic Function Method in Target Detection Mohammad H Marhaban and Josef Kittler Centre for Vision, Speech and Signal Processing University of Surrey Surrey, GU2 7XH, UK eep5mm@ee.surrey.ac.uk

More information

Package midastouch. February 7, 2016

Package midastouch. February 7, 2016 Type Package Version 1.3 Package midastouch February 7, 2016 Title Multiple Imputation by Distance Aided Donor Selection Date 2016-02-06 Maintainer Philipp Gaffert Depends R (>=

More information

Understanding Clustering Supervising the unsupervised

Understanding Clustering Supervising the unsupervised Understanding Clustering Supervising the unsupervised Janu Verma IBM T.J. Watson Research Center, New York http://jverma.github.io/ jverma@us.ibm.com @januverma Clustering Grouping together similar data

More information

Excel 2010 with XLSTAT

Excel 2010 with XLSTAT Excel 2010 with XLSTAT J E N N I F E R LE W I S PR I E S T L E Y, PH.D. Introduction to Excel 2010 with XLSTAT The layout for Excel 2010 is slightly different from the layout for Excel 2007. However, with

More information

PSY 9556B (Feb 5) Latent Growth Modeling

PSY 9556B (Feb 5) Latent Growth Modeling PSY 9556B (Feb 5) Latent Growth Modeling Fixed and random word confusion Simplest LGM knowing how to calculate dfs How many time points needed? Power, sample size Nonlinear growth quadratic Nonlinear growth

More information

Multivariate Normal Random Numbers

Multivariate Normal Random Numbers Multivariate Normal Random Numbers Revised: 10/11/2017 Summary... 1 Data Input... 3 Analysis Options... 4 Analysis Summary... 5 Matrix Plot... 6 Save Results... 8 Calculations... 9 Summary This procedure

More information