Chapter II Multiple Correspondence Analysis (MCA)


Master MMAS - University of Bordeaux
Marie Chavent

Introduction

How to get information from a categorical data table of individuals × variables?

Example: a categorical data table where 27 dogs are described by 6 variables.

# load("chiens.Rdata")
load("dogs.Rdata")
print(data[1:8,])
##                Size Weight Velocity Intelligence Affectivity Aggressivness
## Beauceron       S++     W+      V++           I+         Af+           Ag+
## BassetHound      S-     W-       V-           I-         Af-           Ag+
## GermanShepherd  S++     W+      V++          I++         Af+           Ag+
## Boxer            S+     W+       V+           I+         Af+           Ag+
## Bulldog          S-     W-       V-           I+         Af+           Ag-
## BullMastiff     S++    W++       V-          I++         Af-           Ag+
## Poodle           S-     W-       V+          I++         Af+           Ag-
## Chihuahua        S-     W-       V-           I-         Af+           Ag-

- Which individuals are similar?
- Which variables are linked?

By looking at the matrix of the distances between the individuals?

d <- dist(data)
as.matrix(d)[1:5,1:5]
##                Beauceron BassetHound GermanShepherd Boxer Bulldog
## Beauceron              0          NA             NA    NA      NA
## BassetHound           NA           0             NA    NA      NA
## GermanShepherd        NA          NA              0    NA      NA
## Boxer                 NA          NA             NA     0      NA
## Bulldog               NA          NA             NA    NA       0

dist computes a Euclidean distance, which is not defined for factors, hence the NAs. How do we measure the distance between two rows of categorical data?

By looking at the matrix of the χ² statistics of independence between the pairs of variables?

p <- ncol(data) ; chi2 <- matrix(NA,p,p) ; pval <- matrix(NA,p,p)
rownames(pval) <- colnames(pval) <- rownames(chi2) <- colnames(chi2) <- colnames(data)
for (j in 1:p)
  for (k in 1:p) {
    tab <- table(data[,j],data[,k])
    chi2[j,k] <- chisq.test(tab)$statistic
    pval[j,k] <- chisq.test(tab)$p.value
  }
print(chi2,digits=2) # value of the chi2 statistic
print(round(pval,digits=3),digits=2) # p-value of the test of independence

[6 × 6 matrices of the χ² statistics and of the p-values over the pairs of variables Size, Weight, Velocity, Intelligence, Affectivity, Aggressivness]

By applying a multivariate statistical method?

Multiple Correspondence Analysis (MCA) gives graphical representations of the distances between the individuals and of the links between the categorical variables and their levels.

library(FactoMineR)
res <- MCA(data,graph=FALSE)
plot(res,choix="ind",invisible="var", title="",cex=1.5)
plot(res,choix="ind",invisible="ind", title="",cex=1.5)
plot(res,choix="var",invisible="ind", title="",cex=1.5)

[Three factor maps on Dim 1 (28.90%) × Dim 2 (23.08%): the 27 individuals, the 16 levels, and the 6 variables]

MCA is also a method of dimension reduction: it gives a small number of new synthetic numerical variables summarizing the initial variables.

Categorical data: the 6 initial categorical variables
[table of the first 8 dogs on Size, Weight, Velocity, Intelligence, Affectivity, Aggressivness, as above]

Numerical data: 3 synthetic numerical variables
[table of the first 8 dogs on Dim 1, Dim 2, Dim 3]

MCA is then also a method to transform categorical data into numerical data.

Plan

1 Basic notions
2 The MCA algorithm
3 Different implementations of MCA
4 Interpretation of the results

1 Basic notions

Let us consider a data table where n individuals are described by p categorical variables. Let:

- X = (x_{ij})_{n×p} denote the original data matrix, with x_{ij} ∈ M_j, where M_j is the set of the levels of the j-th variable,
- m_j = card(M_j) denote the number of levels of the j-th variable,
- m = m_1 + ... + m_p denote the total number of levels.

Example: categorical data with n = 27 individuals, p = 6 variables and m = 16 levels.

print(data[1:8,])
[table of the first 8 dogs on the 6 variables, as in the Introduction]

Levels of the variables: S-, S+, S++ (Size), W-, W+, W++ (Weight), etc.

Two approaches for recoding the categorical data into numerical data:
- build the disjunctive table, where each level is coded as a binary variable,
- build the Burt table (Anglo-Saxon approach), which gathers the contingency tables of all the pairs of variables.

The disjunctive table K = (k_{is})_{n×m} describes the n individuals on the m levels. Each column s is the indicator vector of the level s, with:

k_{is} = 1 if individual i has level s,
k_{is} = 0 otherwise.

Let n_s denote the number of individuals having level s (the total of column s).

Disjunctive table of the m = 16 levels:

library(FactoMineR)
K <- tab.disjonctif(data)
print(K[1:4,])
[binary table of the first 4 dogs on the 16 levels S-, S+, S++, W-, W+, W++, V-, V+, V++, I-, I+, I++, Af-, Af+, Ag-, Ag+]

Frequencies n_s of the levels:

ns <- apply(K,2,sum)
print(ns)

Relative frequencies n_s/n of the levels:

n <- nrow(K)
print(ns/n)

Centered disjunctive table

- The n rows of the matrix K (the disjunctive table) define a cloud of n points in R^m.
- Each individual i is weighted by w_i, and usually w_i = 1/n.

Matrix K of the original recoded data: column s has mean n_s/n.

Matrix Z of the centered data: z_{is} = k_{is} - n_s/n, so column s has mean 0 and variance (n_s/n)(1 - n_s/n).

Verify that var(z_s) = (n_s/n)(1 - n_s/n), where z_s ∈ R^n denotes the s-th column of Z (a numerical check follows below).
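A minimal numerical check of this property in R, assuming K, ns and n from the chunk above (note that the variance here uses the weights 1/n, i.e. it divides by n, whereas R's var divides by n-1):

Z <- scale(K, center=TRUE, scale=FALSE)        # centered disjunctive table
popvar <- apply(Z, 2, function(z) mean(z^2))   # column variances with weights 1/n
all.equal(unname(popvar), unname(ns/n * (1 - ns/n)))  # TRUE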

Distance between two individuals

- A weight m_s is associated with each level s in order to give more importance to rare levels: m_s = n/n_s.
- The metric M = diag(n/n_s, s = 1, ..., m), the diagonal matrix of the weights of the columns, gives:

d_M^2(z_i, z_{i'}) = \sum_{s=1}^{m} \frac{n}{n_s} (z_{is} - z_{i's})^2 = \sum_{s=1}^{m} \frac{n}{n_s} (k_{is} - k_{i's})^2

Two individuals are different if they have different levels, with more weight in the distance for rare levels (n_s small).

Example:

[first 5 rows of the disjunctive table K and the relative frequencies n_s/n of the levels, as above]

Squared distance between the two first dogs (Beauceron and BassetHound):

d_M^2(z_1, z_2) = \frac{n}{n_{S-}}(0-1)^2 + \frac{n}{n_{S+}}(0-0)^2 + \dots + \frac{n}{n_{Ag+}}(1-1)^2
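A short sketch computing this weighted squared distance directly, assuming K and ns from above (since the row differences cancel the centering, K can be used in place of Z):

d2 <- function(i, ip, K, ns) {
  n <- nrow(K)
  ki  <- as.numeric(K[i,])
  kip <- as.numeric(K[ip,])
  sum((n/ns) * (ki - kip)^2)   # chi2-type weighted squared distance
}
d2(1, 2, K, ns)   # Beauceron vs BassetHound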

Inertia of the disjunctive table

We have seen in the slides about the basic notions for PCA that:
- centering the data doesn't change the distances between the individuals, and hence the inertia,
- the inertia of a data table is the (weighted) sum of the variances of its columns.

In the particular case of a disjunctive table K this gives I(K) = I(Z), where Z is the centered disjunctive table, and:

I(Z) = \sum_{s=1}^{m} m_s \, var(z_s),

where m_s is the weight of the column (the level) s.

- When the rows are weighted by 1/n and the columns are weighted by m_s = n/n_s, this gives:

I(Z) = \sum_{s=1}^{m} \left(1 - \frac{n_s}{n}\right)

In practice:
- The contribution of a level s to the inertia of Z is all the more important as the level is rare.
- Levels that are too rare are therefore avoided (by pre-processing, for instance).

- This also gives:

I(Z) = \sum_{j=1}^{p} (m_j - 1)

In practice:
- The contribution of a variable j to the inertia of Z is all the more important as its number of levels m_j is high.
- Variables with very different numbers of levels are therefore avoided (by pre-processing, for instance).

- This finally gives:

I(Z) = m - p

Example of the dogs:

# number of variables
ncol(data)
## [1] 6
# number of levels
ncol(K)
## [1] 16

I(Z) = 16 - 6 = 10
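A sketch checking this total inertia numerically, assuming Z, ns and n from the chunks above:

ms <- n/ns                                     # column weights m_s = n/n_s
popvar <- apply(Z, 2, function(z) mean(z^2))   # column variances with weights 1/n
sum(ms * popvar)                               # total inertia, equal to m - p = 10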

The correlation ratio

The link between a numerical variable y and a categorical variable x is often measured by:

\eta^2(y|x) = \frac{var(\bar{y}_x)}{var(y)} = \frac{\sum_{s=1}^{m} \frac{n_s}{n} (\bar{y}_s - \bar{y})^2}{\frac{1}{n} \sum_{i=1}^{n} (y_i - \bar{y})^2}

where m is the number of levels of x and \bar{y}_s is the mean value of y computed over the individuals having the level s.

- This criterion is often named the correlation ratio.
- It takes its values in [0, 1].
- It measures the proportion of the variance of the numerical variable y explained by the categorical variable x.

In which situation is this criterion equal to 0? Equal to 1?

Example: the iris data

## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
[first rows of the iris data: setosa, versicolor, virginica]

Correlation ratios between the variable Species and the 4 numerical variables:

eta2 <- function(x, gpe) {
  moyennes <- tapply(x, gpe, mean)      # group means
  effectifs <- tapply(x, gpe, length)   # group sizes
  varinter <- (sum(effectifs * (moyennes - mean(x))^2))  # between-group sum of squares
  vartot <- (var(x) * (length(x) - 1))                   # total sum of squares
  res <- varinter/vartot
  return(res)
}
apply(iris[,-5],2,function(x){eta2(x,iris$Species)})
## Sepal.Length Sepal.Width Petal.Length Petal.Width

The variable Species explains:
- 94% of the variance of "Petal.Length",
- 40% of the variance of "Sepal.Width".

[Boxplots of Petal.Length and of Sepal.Width by Species: setosa, versicolor, virginica]

Give an interpretation of the graphical outputs below:

res <- PCA(iris,quali.sup = 5,graph=FALSE)
plot(res,choix="ind",habillage=5, title="",label="none",invisible="quali")
plot(res,choix="var",title="",cex=1.5)

[Individuals map colored by Species, and correlation circle of Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, on Dim 1 (72.96%) × Dim 2 (22.85%)]

How is this interpretation coherent with the results of the correlation ratios?

Plan

1 Basic notions
2 The MCA algorithm
3 Different implementations of MCA
4 Interpretation of the results

2 The MCA algorithm

Several algorithms exist to perform Multiple Correspondence Analysis (MCA), and MCA can be defined as:
- Correspondence Analysis (CA) applied to the Burt table (Anglo-Saxon approach) or to the disjunctive table (French approach),
- Principal Component Analysis (PCA) applied to the centered disjunctive table (the approach described in this chapter).

Because the CA method is not studied in this lecture, the MCA algorithm described hereafter is based on the general framework of PCA with metrics introduced in Section 4 of Chapter I.

The MCA algorithm

The data table to be analyzed by MCA comprises n individuals described by p categorical variables, and it is represented by the n × p categorical matrix X. Let m denote the total number of levels of the p categorical variables.

Step 1: the pre-processing step

1 Build the real matrix Z of dimension n × m as follows: each level is coded as a binary variable and the n × m disjunctive table K is constructed; Z is the centered version of K.
2 Build the diagonal matrix N of the weights of the rows of Z. The n rows are often weighted by 1/n, such that N = (1/n) I_n.
3 Build the diagonal matrix M of the weights of the columns of Z: the m columns (corresponding to the levels of the categorical variables) are weighted by n/n_s, where n_s, s = 1, ..., m, denotes the number of individuals that have the s-th level.
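A minimal sketch of this pre-processing step in R, assuming the dogs data and FactoMineR's tab.disjonctif:

library(FactoMineR)
K  <- as.matrix(tab.disjonctif(data))     # n x m disjunctive table
n  <- nrow(K) ; m <- ncol(K)
ns <- colSums(K)                          # level counts n_s
Z  <- scale(K, center=TRUE, scale=FALSE)  # centered disjunctive table
N  <- diag(1/n, n)                        # row weights: N = (1/n) I_n
M  <- diag(n/ns)                          # column weights (the metric)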

The metric

M = diag(n/n_1, ..., n/n_m)    (1)

indicates that the distance between two rows of Z is a weighted Euclidean distance, in the spirit of the χ² distance used in CA. This distance gives more importance to rare levels.

The total inertia of Z with this distance and the weights 1/n is equal to m - p.

Step 2: the factor coordinates processing step

1 The Generalized Singular Value Decomposition (GSVD) of Z with metrics N and M gives the decomposition:

Z = U Λ V^t    (2)

where:
- Λ = diag(√λ_1, ..., √λ_r) is the r × r diagonal matrix of the singular values of ZMZ^tN and Z^tNZM, and r denotes the rank of Z, which is here at most r = min(n-1, m-p);
- U is the n × r matrix of the first r eigenvectors of ZMZ^tN such that U^tNU = I_r, with I_r the identity matrix of size r;
- V is the m × r matrix of the first r eigenvectors of Z^tNZM such that V^tMV = I_r.
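One standard way to compute this GSVD (not necessarily the code used by the packages) is a plain SVD of the rescaled matrix N^{1/2} Z M^{1/2}, followed by a back-transformation; a sketch with the weights defined above:

Zs <- sqrt(1/n) * sweep(Z, 2, sqrt(n/ns), "*")  # N^{1/2} Z M^{1/2}
sv <- svd(Zs)
U  <- sqrt(n) * sv$u                   # U = N^{-1/2} u, so that U^t N U = I
V  <- sweep(sv$v, 1, sqrt(n/ns), "/")  # V = M^{-1/2} v, so that V^t M V = I
lambda <- sv$d^2                       # eigenvalues; they sum to m - p here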

2 The matrix F of dimension n × r of the factor coordinates of the individuals is defined by:

F = Z M V,    (3)

and we deduce from (2) that:

F = U Λ.    (4)

The columns f_α of F are the principal components; they have mean 0 and var(f_α) = λ_α. The columns u_α = f_α / √λ_α of U are the standardized principal components.
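Continuing the sketch, the factor coordinates can be computed and compared with FactoMineR's output; the comparison assumes the scaling discussed in the implementations section (a factor √p between the single-PCA coordinates and the CA-based ones) and allows for sign flips of the axes, since singular vectors are defined up to sign:

Fc <- Z %*% M %*% V                 # F = ZMV = U Lambda
res <- MCA(data, graph=FALSE)
head(Fc[,1:2] / sqrt(ncol(data)))   # rescaled single-PCA coordinates
head(res$ind$coord[,1:2])           # FactoMineR's coordinates (signs may differ)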

The matrix F

res <- MCA(data,graph=FALSE)
F <- res$ind$coord
F[,1:2]
[coordinates of the 27 dogs on Dim 1 and Dim 2]

Individuals plotted according to the two first PCs:

plot(res,choix="ind",invisible="var", cex=1.5,title="")
res$eig[1:2,1]

[Factor map of the 27 individuals on Dim 1 (28.90%) × Dim 2 (23.08%)]

3 The matrix A* of dimension m × r of the factor coordinates of the levels is defined by:

A* = M Z^t N U = M A,    (5)

and we deduce from (2) that:

A* = M V Λ.    (6)

Each coordinate a*_{sα} (element of A*) is the mean value of the standardized factor coordinates of the individuals that belong to level s:

a*_{sα} = \frac{1}{n_s} \sum_{i: k_{is}=1} \frac{f_{iα}}{\sqrt{λ_α}}

This relation is called the barycentric property. This property is fundamental for the interpretation of the graphical outputs of MCA.

The matrix A*

A <- res$var$coord
A[,1:2]
[coordinates of the 16 levels on Dim 1 and Dim 2]

Plot of the levels according to their factor coordinates on dimensions 1-2:

plot(res,choix="ind",invisible="ind", cex=1.5,title="")

[Factor map of the 16 levels on Dim 1 (28.90%) × Dim 2 (23.08%)]

The coordinates of the level W++ are the mean of the standardized coordinates of the dogs that belong to W++:

rownames(data)[which(data$Weight=="W++")]
## [1] "BullMastiff" "GermanMastiff" "Mastiff" "SaintBernard" "Newfoundland"
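A sketch checking this barycentric property numerically for the level W++, assuming res from the MCA call above:

idx <- which(data$Weight == "W++")
colMeans(res$ind$coord[idx,1:2]) / sqrt(res$eig[1:2,1])  # mean standardized coordinates
res$var$coord["W++",1:2]                                 # coordinates of the level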

Is it possible to plot both individuals and levels on the same map?

It is possible to plot the levels at the barycenter of the individuals by using the barycentric property:

a*_{sα} = \frac{1}{n_s} \sum_{i: k_{is}=1} \frac{f_{iα}}{\sqrt{λ_α}}

In that case two dimensions are chosen and:
- the individuals are plotted according to their standardized principal components f_α/√λ_α,
- the levels are plotted according to their factor coordinate vectors a*_α.

Example of the dogs data:

[Map "Levels at the barycenter of the individuals": the 27 dogs and the 16 levels plotted on the first and second standardized PCs]

For instance, the level W++ is plotted at the barycenter of the dogs that belong to W++:

rownames(data)[which(data$Weight=="W++")]
## [1] "BullMastiff" "GermanMastiff" "Mastiff" "SaintBernard" "Newfoundland"

However, this simultaneous representation of the levels at the barycenter of the individuals is not the standard output of the software implementing MCA, where the so-called quasi-barycentric property is usually used. The quasi-barycentric property is simply the barycentric property written as follows:

a*_{sα} = \frac{1}{\sqrt{λ_α}} \left( \frac{1}{n_s} \sum_{i: k_{is}=1} f_{iα} \right)

This reads: each coordinate a*_{sα} is the mean value of the factor coordinates of the individuals that belong to level s, up to the multiplicative coefficient 1/√λ_α.

It is then possible to plot the levels at the quasi-barycenter of the individuals:
- the individuals are plotted according to their principal components f_α,
- the levels are plotted according to their factor coordinate vectors a*_α.

The representation of the levels at the quasi-barycenter of the individuals:
- is the simultaneous representation usually implemented in the software,
- must be interpreted as follows: the cloud of the levels is the dilation (by 1/√λ_α in each dimension) of the cloud of the gravity centers of the individuals.

Example of the dogs data:

[Map "Levels at the quasi-barycenter of the individuals": the 27 dogs and the 16 levels plotted on the first and second PCs]

For instance, the level W++ is plotted at the barycenter of the dogs that belong to W++, dilated by 1/√λ_1 on the first dimension:

res$eig[1:2,1]
apply(F[which(data$Weight=="W++"),1:2],2,mean)/sqrt(res$eig[1:2,1])

Step 3: the squared loadings processing step

The contribution c_{jα} of the variable x_j (j-th column of X) to the variance of the principal component f_α is defined by:

c_{jα} = \sum_{s ∈ M_j} \frac{n_s}{n} a*^2_{sα}    (7)

The matrix C = (c_{jα}) of dimension p × r is called the squared loadings matrix, to draw an analogy with the squared loadings in PCA.

Each element c_{jα} is equal to the correlation ratio between x_j and f_α:

c_{jα} = η²(f_α | x_j)
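A sketch checking this equality on the dogs data, assuming res from the MCA call above and the eta2 function defined in the iris example (FactoMineR stores these quantities in res$var$eta2):

f1 <- res$ind$coord[,1]
sapply(data, function(v) eta2(f1, v))  # correlation ratios with the first PC
res$var$eta2[,1]                       # the same values, as stored by FactoMineR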

The matrix C

C <- res$var$eta2
C[,1:2]
[squared loadings of the 6 variables on Dim 1 and Dim 2]

Variables plotted according to their squared loadings:

plot(res,choix="var", cex=1.5,title="")

[Map of the 6 variables (Size, Weight, Velocity, Intelligence, Affectivity, Aggressivness) on Dim 1 (28.90%) × Dim 2 (23.08%)]

Plan

1 Basic notions
2 The MCA algorithm
3 Different implementations of MCA
4 Interpretation of the results

3 Different implementations of MCA

1 Implement MCA as a CA of the Burt table: the Anglo-Saxon approach

- CA is called simple Correspondence Analysis. In French, CA is AFC (Analyse Factorielle des Correspondances).
- CA analyzes a simple contingency table obtained by crossing two categorical variables.
- CA is a two-step procedure: first a PCA of the matrix of the row-profiles of the contingency table, and then a PCA of the matrix of the column-profiles. These PCAs use specific weights on the rows and columns, and hence specific metrics.
- Applying CA to the Burt table amounts to applying a single PCA (with specific metrics) to the matrix of the row-profiles of the Burt table. Indeed, the column-profiles are identical to the row-profiles in the Burt table.

Drawback: this algorithm gives the results (factor coordinates) for the levels but not for the individuals.

Implemented in the procedure CORRESP of the SAS software.

The Burt table is a symmetric table of size m × m which gathers the contingency tables of all the pairs of variables:

B = K^t K = (b_{ss'})_{m×m}

where:
- b_{ss'} = \sum_{i=1}^{n} k_{is} k_{is'} is the number of individuals having both levels s and s',
- b_{ss} = n_s is the number of individuals having level s.

Example: Burt table of the m = 16 levels

K <- tab.disjonctif(data)
B <- t(K) %*% K
print(B)
[16 × 16 Burt table of the levels S-, S+, S++, W-, W+, W++, V-, V+, V++, I-, I+, I++, Af-, Af+, Ag-, Ag+]

2 Implement MCA as a CA of the disjunctive table: the standard approach

- The disjunctive table is used as a contingency table.
- Applying CA to the disjunctive table is then a two-step procedure: first a PCA of the matrix of the row-profiles (the individuals), and then a PCA of the matrix of the column-profiles (the levels).

Advantage: this algorithm gives directly the results (factor coordinates) for the levels and for the individuals.

Implemented in the function MCA of the R package FactoMineR.

3 Perform a PCA of the disjunctive table: the single PCA approach

- This PCA uses specific weights for the columns (the levels), and hence a specific distance between two rows (individuals).
- Compared to the standard approach:
  - the factor coordinates of the levels are the same,
  - the factor coordinates of the individuals are multiplied by √p,
  - the total inertia is multiplied by p and is equal to m - p.

Advantage: it is not necessary to know the CA method to understand this algorithm.

Implemented in the function PCAmix of the R package PCAmixdata.

Plan

1 Basic notions
2 The MCA algorithm
3 Different implementations of MCA
4 Interpretation of the results

4 Interpretation of the results

Quality of the dimension reduction

The quality of the q first principal components is measured by the proportion of the inertia that they explain.

Inertia of the data: I(Z) = I(F) = λ_1 + ... + λ_r = m - p.

Proportion of inertia explained by the α-th principal component:

\frac{λ_α}{λ_1 + ... + λ_r}

In MCA, the percentages of inertia explained by the axes are "small" by construction. Some authors have proposed corrections of the eigenvalues in MCA (Greenacre, 1993).
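A sketch retrieving these quantities, assuming res from the MCA call above (res$eig holds, for each dimension, the eigenvalue, the percentage of explained inertia and the cumulative percentage; recall that the eigenvalue scale depends on the implementation, as discussed in the previous section):

res$eig                                         # eigenvalue / percentage / cumulative
round(100 * res$eig[,1] / sum(res$eig[,1]), 2)  # proportions recomputed by hand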

Original data (p = 6 and m = 16):
[first 5 dogs on the 6 categorical variables]

Reduction to the 3 first PCs:
[first 5 dogs on Dim 1, Dim 2, Dim 3]

What is the quality of this reduction?
[table of the eigenvalues: Eigenvalue, Proportion, Cumulative for dim 1 to dim 10]

- r = 10 non-null eigenvalues, because r = min(n-1, m-p) = 10,
- the sum of the eigenvalues is m - p = 10 (total inertia),
- the cumulative column gives the percentage of the inertia explained by the 3 first PCs.

Contributions of the individuals and of the levels

- The relative contribution of an individual i to the variance of an axis α is:

\frac{1}{n} \frac{f^2_{iα}}{λ_α}

The individuals far from the center of the factor map are those who contribute the most. They can be a source of instability and can be removed or used as illustrative individuals.

- The relative contribution of a level s to the variance of an axis α is:

\frac{n_s}{n} \frac{a*^2_{sα}}{λ_α}

The levels far from the center of the factor map are not necessarily those who contribute the most.
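FactoMineR stores these contributions directly (a sketch assuming res from above; the values are given in percent):

res$ind$contrib[,1:2]                            # contributions of the individuals to axes 1-2
res$var$contrib[,1:2]                            # contributions of the levels to axes 1-2
sort(res$var$contrib[,1], decreasing=TRUE)[1:5]  # the 5 levels contributing most to axis 1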

The 5 individuals which contribute the most; the 5 levels which contribute the most:

[Two maps on Dim 1 (28.90%) × Dim 2 (23.08%): left, the 5 most contributing individuals (Mastiff, Pekingese, Chihuahua, Dalmatien, Labrador); right, the 5 most contributing levels (W+, V-, S+, S-, W-)]

Contribution of the variables

The absolute contribution of a categorical variable j to the variance of an axis α is the sum of the contributions of its levels:

\sum_{s ∈ M_j} \frac{n_s}{n} a*^2_{sα} = η²(f_α | x_j)

The correlation ratios are signless measures of link, used to plot the categorical variables on a map.

[Two maps on Dim 1 (28.90%) × Dim 2 (23.08%): left, the 6 variables plotted according to their squared loadings; right, the simultaneous map of the individuals and the levels]

Quality of the projection of the individuals and of the levels

The quality of the projection of the individuals or of the levels is measured, as in PCA, by the so-called squared cosine.

- If two individuals are well projected, their distance on the factor map is not far from their true distance, knowing that in MCA the distance between two individuals is small if they have the same levels.
- If two levels are well projected, their distance on the factor map can be interpreted using the barycentric property:
  - two levels of two different variables are close if they are owned by the same individuals,
  - two levels of a same variable are close if the two associated groups of individuals are close.
- Take care of the dispersion of the individuals associated with each level before interpreting the proximity between two levels.
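A sketch retrieving the squared cosines from FactoMineR, assuming res from above:

res$ind$cos2[,1:2]   # squared cosines of the individuals on axes 1-2
res$var$cos2[,1:2]   # squared cosines of the levels on axes 1-2
sort(rowSums(res$ind$cos2[,1:2]), decreasing=TRUE)[1:10]  # 10 best projected individuals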

The 10 individuals best projected; the levels having a cos² > 0.5:

[Two maps on Dim 1 (28.90%) × Dim 2 (23.08%): left, the 10 best projected individuals; right, the levels with squared cosine greater than 0.5]
