Evaluation of PCA and ICA of Simulated ERPs: Promax vs. Infomax Rotations

Human Brain Mapping 28:742-763 (2007)

Joseph Dien (1,*), Wayne Khoe (2), and George R. Mangun (3)

(1) Department of Psychology, University of Kansas, Lawrence, Kansas
(2) Department of Neurosciences, University of California, San Diego, California
(3) Center for Mind and Brain and Departments of Neurology and Psychology, University of California, Davis, California

Abstract: Independent components analysis (ICA) and principal components analysis (PCA) are methods used to analyze event-related potential (ERP) and functional imaging (fMRI) data. In the present study, ICA and PCA were directly compared by applying them to simulated ERP datasets. Specifically, PCA was used to generate a subspace of the dataset followed by the application of PCA Promax or ICA Infomax rotations. The simulated datasets were composed of real background EEG activity plus two ERP simulated components. The results suggest that Promax is most effective for temporal analysis, whereas Infomax is most effective for spatial analysis. Failed analyses were examined and used to devise potential diagnostic strategies for both rotations. Finally, the results also showed that decomposition of subject averages yield better results than of grand averages across subjects. Hum Brain Mapp 28:742-763, 2007. © 2006 Wiley-Liss, Inc.

Key words: principal components analysis; independent components analysis; event-related potentials

Contract grant sponsor: National Institute of Mental Health (NIMH); Contract grant number: MH11751 (to J.D.); Contract grant numbers: MH55714 and MH02019 (to G.R.M.).
*Correspondence to: Joseph Dien, Department of Psychology, University of Kansas, Fraser Building, 1415 Jayhawk Blvd., Lawrence, Kansas. E-mail: jdien@ku.edu
Received for publication 15 August 2005; Accepted 15 May 2006
DOI: 10.1002/hbm.20304
Published online 28 November 2006 in Wiley InterScience (www.interscience.wiley.com).

INTRODUCTION
Principal components analysis (PCA) is a multivariate technique that seeks to uncover latent variables responsible for patterns of covariation in numerical datasets [Gorsuch, 1983; Harman, 1976]. It has long been used as a data description and reduction technique to manage the copious quantities of measurements obtained in event-related potential (ERP) studies [Donchin and Heffley, 1979; Möcks et al., 1991]. Although it has been shown to have limitations when applied to ERP data [Wood and McCarthy, 1984] and to be sensitive to parameters like component overlap and correlation [Dien, 1998a], it has nonetheless been utilized with reasonable success in numerous studies when applied in a judicious fashion [Dien, 1999; Dien et al., 1997, 2003a; Spencer et al., 2001; Squires et al., 1975].

Recognition of the limitations of the PCA procedure has given rise to efforts to improve on the process. It has been shown in simulations, for example, that the oblique rotation Promax yields more accurate results with correlated ERP components than the more customary orthogonal rotation Varimax [Dien, 1998a; Dien et al., 2005]. The use of a covariance matrix for the relationship matrix [Kayser and Tenke, 2003] and the inclusion of Kaiser normalization also yield improved results in comparison to using covariance loadings during rotation [Dien et al., 2005].

Recently, a related but quite different procedure called independent components analysis (ICA) has been proposed as an alternative to PCA, and some promising results have been reported with both ERPs [Jackson, 1991; Jung et al., 2000; Makeig et al., 1996, 1997, 1999a,b; Vigario, 1997] and hemodynamic measures [Calhoun et al., 2001; Dodel et al., 2000; McKeown et al., 1998; Park et al., 2003]. There has been interest in how the two techniques compare. It is not possible to state which will be more effective for ERP datasets on the basis of the statistical principles alone. Makeig et al. [1997: 10979] noted that "ICA ... requires the absence of higher-order as well as second-order correlations between time courses ... [and] is a stronger condition than decorrelation ..."; however, since the default setting in their implementation of ICA is to remove the second-order relationships prior to ICA decomposition via sphering and then to return them afterwards, ICA (used in this fashion) relies on different statistical information from PCA, rather than more statistical information. Indeed, the ability of ICA components to be correlated is a strength of the technique, in contrast to Varimax-rotated PCA solutions [Jung et al., 2000: 1756].

The goal of this exercise is not to determine which is globally better for all purposes, which would be an ill-posed question. Rather, this report will attempt to determine the relative characteristics of the two techniques that will allow investigators to determine which tool to use for a given project. Every statistical technique is based on certain implicit assumptions upon which the model is constructed; the effectiveness of a statistical technique is most often determined by the fit between the statistical assumptions and the characteristics of the datasets. We shall examine the unique aspects of ERP datasets, especially the distinction between using time points or electrodes as variables, and how they relate to the statistical assumptions. This report will first provide a brief review of the algorithms underlying PCA and ICA, followed by a series of tests using simulated and real data.
PRINCIPAL COMPONENTS ANALYSIS

Since comprehensive treatments of PCA are available elsewhere [Gorsuch, 1983; Harman, 1976], this review will focus on highlighting the aspects relevant to the present comparison. Further information on its application to ERP datasets is also available elsewhere [Dien and Frishkoff, 2004; Donchin and Heffley, 1979; Möcks and Verleger, 1991]. PCA has the ultimate purpose of expressing a dataset as a set of linear combinations of variables that are more interpretable, which is to say, relate simply to the latent variables rather than being some sort of complex combination of them. In the case of ERP data, some of these linear combinations would ideally correspond to the ERP components of interest. The linear combinations produced by PCA, as well as ICA, are conventionally termed components but in the remainder of this report will be termed factors to avoid confusion with ERP components.

The core procedure of PCA is the decomposition of the so-called relationship matrix. The relationship matrix, typically a correlation or covariance matrix, summarizes the relationships between each variable and every other variable. In a correlation matrix, the full set of variables is represented by the rows and again by the columns. The entry for each cell of the matrix is the correlation between the two variables represented by the respective row and column. The diagonal of the matrix is the correlation of each variable with itself (unity). A covariance matrix is the same as a correlation matrix except that the variables have not been standardized, so that the magnitude of the entries reflects the size of the variable variance as well as the degree of covariation. The PCA algorithm sequentially fits a linear combination to this matrix that accounts for the greatest possible variance. The matrix is then residualized, which means that the linear combination is subtracted out, leaving behind the data that has not been accounted for yet, and then the process is repeated with the remaining matrix.
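The fit-then-residualize loop described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes that the "linear combination accounting for the greatest possible variance" is taken as the leading eigenvector of the relationship matrix, and the function name `pca_by_deflation` and the toy data are hypothetical.

```python
import numpy as np

def pca_by_deflation(data, n_factors, use_correlation=False):
    """Extract factors one at a time from the relationship matrix.

    data: (n_observations, n_variables) array.
    Returns unrotated loadings (n_variables, n_factors) and the
    variance accounted for by each factor.
    """
    # Relationship matrix: covariance, or correlation (standardized variables).
    if use_correlation:
        rel = np.corrcoef(data, rowvar=False)
    else:
        rel = np.cov(data, rowvar=False)
    residual = rel.copy()
    loadings, variances = [], []
    for _ in range(n_factors):
        # The combination accounting for the most remaining variance is the
        # leading eigenvector of the residual matrix (eigh sorts ascending).
        eigvals, eigvecs = np.linalg.eigh(residual)
        lam, vec = eigvals[-1], eigvecs[:, -1]
        loadings.append(vec * np.sqrt(max(lam, 0.0)))
        variances.append(lam)
        # Residualize: subtract this factor's contribution before the next pass.
        residual = residual - lam * np.outer(vec, vec)
    return np.column_stack(loadings), np.array(variances)

# Hypothetical toy data: 200 observations of 5 variables with unequal variances.
rng = np.random.default_rng(0)
toy = rng.normal(size=(200, 5)) * np.array([3.0, 2.0, 1.0, 0.5, 0.2])
loadings, variances = pca_by_deflation(toy, n_factors=3)
```

In practice a single eigendecomposition gives the same factors in one step; the explicit loop is kept here only because it mirrors the sequential description in the text.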
In this fashion the dataset is re-expressed as a set of linear combinations (of equal number to the original variables in the absence of collinearity) arranged in order of decreasing size. These factors are uncorrelated with each other, regardless of the nature of the underlying data. The smallest (presumably uninterpretable) factors are then dropped from further analysis.

A rotation procedure is then utilized to increase interpretability of the obtained factors. This step is necessary since the statistically derived factors will usually be linear combinations of the actual latent variables of interest (combinations of different ERP components in the present case). For example, the Varimax rotation [Kaiser, 1958] translates the factors to a mathematically equivalent set of linear combinations, maximizing the variance of the squared factor loadings. This has the effect of generating factors that are as close to zero on some variables as possible, while as large as possible on the others; this may reasonably be expected to yield a solution in which the factors more closely correspond to single ERP components, since ERP components are nominally zero on most time points and maximal in a limited set of time points. This process can be graphed as a scatterplot in which each point represents a single variable and the axes represent the two factors. The rotation process rotates the axes of the coordinate system such that the axes pass through the densest groupings of points (which is equivalent to saying that the rotation will arrange for the factor loadings of each variable to be large for one factor and small for the other as much as possible). This process proceeds iteratively for each pairwise combination of the factors until a pass through the full set of pairwise rotations results in rotations that fall below a low criterion point.

The Promax rotation [Hendrickson and White, 1964] utilized in this report performs an initial Varimax rotation and then relaxes the orthogonality restrictions, allowing the factors to become correlated.
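A compact sketch of the two rotations may help make the criterion concrete. Several assumptions apply: the Varimax step below uses the widely used SVD-based iteration rather than the pairwise planar rotations described above, Kaiser normalization is omitted, and the Promax step follows the standard Hendrickson and White recipe of a least-squares fit to a sign-preserving powered target. The function names and the toy loadings are hypothetical.

```python
import numpy as np

def varimax(loadings, max_iter=200, tol=1e-10):
    """Orthogonal Varimax rotation (SVD-based iteration, no Kaiser normalization)."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the variance-of-squared-loadings criterion.
        grad = loadings.T @ (rotated**3 - rotated * (rotated**2).sum(axis=0) / n_vars)
        u, s, vt = np.linalg.svd(grad)
        rotation = u @ vt  # nearest orthogonal matrix to the gradient
        if s.sum() < criterion * (1 + tol):
            break
        criterion = s.sum()
    return loadings @ rotation, rotation

def promax(loadings, power=4):
    """Promax: Varimax, then an oblique least-squares fit to a powered target."""
    vmx, _ = varimax(loadings)
    # Target keeps each loading's sign but raises its magnitude to `power`,
    # enhancing the large loadings relative to the small ones.
    target = np.sign(vmx) * np.abs(vmx) ** power
    transform, *_ = np.linalg.lstsq(vmx, target, rcond=None)
    # Rescale columns so the implied factors have unit variance.
    scale = np.sqrt(np.diag(np.linalg.inv(transform.T @ transform)))
    transform = transform * scale
    pattern = vmx @ transform
    factor_corr = np.linalg.inv(transform.T @ transform)
    return pattern, factor_corr

# Hypothetical simple-structure loadings, mixed by a 30-degree rotation.
simple = np.array([[0.9, 0.0], [0.8, 0.0], [0.7, 0.0],
                   [0.0, 0.9], [0.0, 0.8], [0.0, 0.7]])
theta = np.deg2rad(30)
mix = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
unrotated = simple @ mix
vmx_loadings, rot = varimax(unrotated)
pattern, factor_corr = promax(unrotated)
```

The returned `factor_corr` matrix is where the relaxation shows up: its off-diagonal entries are the correlations Promax allows between factors, which Varimax alone would force to zero.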
Promax relaxes orthogonality computationally by rotating individual factors such that they approximate more closely a version of themselves taken to a higher power (such as a fourth power); in other words, enhancing the larger loadings relative to the smaller loadings. Graphically, this is equivalent to saying that each axis is rotated individually without attempting to maintain them at right angles to each other. The higher the power, the greater is this final rotation. If the underlying latent variables, like the ERP components, are in fact correlated, then this can allow for a more accurate solution [Dien, 1998a; Dien et al., 2003b, 2005].

Assumptions and Issues in PCA of ERPs

PCA does not make any strong assumptions about the data. No assumptions are made about the distribution of the variables or of the factor scores [Gorsuch, 1983: 24]. The only assumption is that the variables are linear functions of the factors. Variables do not even need to be linearly related as long as the assumption is met [Gorsuch, 1983: 18]. There is no particular reason to think that this assumption will be violated for ERP datasets.

However, several issues need to be considered for PCA to be successful, the first of which is factor overlap. Factors are defined as being a specific pattern of factor loadings (such as a particular time course for a temporal PCA or a particular scalp topography for a spatial PCA). ERP components that have an identical pattern (such as both peaking at 300 ms for a temporal PCA) cannot by definition be separated into different factors, even if they are separable by some of the variance present in the observations, such as condition variance. The more similar the two components, the more difficult it may be to successfully separate them. This is likely a greater concern for spatial PCA since volume conduction (the property of voltage fields of spreading throughout the conductive medium of the head) ensures that every electrode will be affected by a component and hence every component overlaps substantially with every other component [Dien, 1998a]; conversely, components in the time domain can be completely separate.

A second issue is that of factor correlation.
The initial factor decomposition and Varimax rotation are both orthogonal, meaning that the factors are constrained to be uncorrelated even if the actual ERP components are correlated. Such a constraint causes the statistical model to be distorted in order to force the factors to be orthogonal. This issue can be addressed, sometimes quite effectively, by the use of the Promax rotation, which adds a relaxation step [Dien, 1998a; Dien et al., 2003b, 2005]. What is not clear is to what degree this relaxation step can be effective. It is likely that this procedure will only be effective up to some unknown degree of factor correlation. Given the increased amount of overlap found in the spatial dimension, it is expected that this will be a greater concern for temporal PCA insofar as degree of spatial overlap induces factor correlation for temporal PCA, and vice versa, since it determines the extent to which the two components co-occur in the observations [Dien, 1998a]. Lack of spatial overlap would induce a negative correlation (observations containing one component would not contain the other component), but it would be diluted by the number of observations containing neither, of which there would be many in most temporal PCAs.

A third issue that can arise is misretention, leading to either underextraction or overextraction [Fava and Velicer, 1992; Wood et al., 1996]. This occurs when too few or too many factors are retained for rotation compared to the actual number of substantial latent variables in the dataset. Underextraction can cause ERP components to be combined into a single factor, whereas overextraction can cause minor (perhaps noise) factors to be built up at the expense of the major (ERP component) factors and/or produce factors with only one high loading [Comrey, 1978]. Careful attention to the use of factor retention rules [see Dien, 1998a] and evaluation of factor results are required to address this issue.

Two final issues have been identified for Varimax rotations that could affect the present simulations [Cureton and Mulaik, 1975: 224].
The first occurs when the bulk of the variables load on both factors. One way of describing this issue is by saying that Varimax makes an implicit assumption that there will be large clusters of variables that load only on one or the other factor; if this is not the case, then the rotation will not occur properly. The second occurs when a number of variables have zero loadings on the first unrotated factor. In this case the factor is essentially pinned against rotation since Varimax requires that for rotation to occur, the criterion must be increased for each pairwise rotation. In the language of connectionist models, the solution becomes trapped at a local minimum and cannot reach the global minimum. This situation only applies for factor solutions with at least three dimensions. The first unrotated factor typically has loadings on as many of the variables as possible, so it is not clear how often this situation occurs.

A variant of the Varimax rotation, the Weighted-Varimax, has been proposed to address these two situations [Cureton and D'Agostino, 1983; Cureton and Mulaik, 1975]. It gives the most weight to factor loadings that are located away from the initial unrotated factor (which is by far the largest), essentially making the assumption that the initial rotation is not aligned with the correct rotation. It is not clear in advance how problematic these two situations might be for spatial and temporal PCAs of ERP data, so it seems worthwhile to evaluate this rotation as well. Since Promax uses Varimax as an initial rotation, we implemented Promax with Weighted-Varimax to supplement the regular Promax with Varimax rotation.

INDEPENDENT COMPONENTS ANALYSIS

Independent components analysis provides an alternative approach to isolating ERP components. Since there are many varieties of ICA, this report will focus on the version most commonly applied to ERP data, the Infomax rotation [Bell and Sejnowski, 1995], as implemented by the EEGlab toolkit [Delorme and Makeig, 2004]. Since in-depth mathematical treatments already exist [Bell and Sejnowski, 1995; Makeig et al., 1997], this brief review will focus on a more applied description of the algorithm and its implications (as instantiated in the EEGlab software).

A source of confusion for psychologists when discussing ICA is the use of a different terminology grounded in the engineering literature. To reduce reader confusion, for the remainder of this text the equivalent terms from the PCA literature, as summarized in Table I, will be used to refer to both PCA and ICA. Another source of confusion for psychologists is that, unlike PCA, there is no separate extraction step; the rotations can be directly applied to the starting variables. Thus, the term PCA applies to the successive steps of extraction and rotation. In contrast, the term ICA in effect applies only to a rotation procedure, since no extraction is required (although, as will be discussed below, a PCA extraction may be used as a preprocessing step for ICA).

A fundamental difference between the PCA and the ICA procedures concerns the matrix being evaluated. In PCA, during the rotation stage the matrix being evaluated is the loading matrix, which represents the relationship between the factors and the variables; the rotation alters the matrix until the factor loadings meet the criterion (such as the Varimax criterion of maximizing the variance of the squared loadings). In ICA, the procedure evaluates the matrix of factor interrelationships; in other words, the factor scores rather than the factor loadings. The factors are systematically rotated until the relationships between the factors are as close to zero (i.e., independent) as possible.

The ICA algorithm begins by generating factor scores that are initially set equal to the variables (one for each). A relationship matrix is then generated between these factor scores.
The factor scoring matrix is then modified such that factors that are different from each other are made even more different. Through a sometimes lengthy training process the factor scoring matrix is modified. New factor scores are generated and used to compute a new relationships matrix; this process is repeated until the changes to the factors drop below a criterion threshold. In this manner the relationships between the factors are gradually reduced as they become increasingly differentiated from each other.

TABLE I. PCA and ICA glossary

PCA                                  ICA
Factor loading matrix                Mixing matrix
Factor scoring coefficient matrix    Separation matrix
Factor scores                        Activations

PCA, principal components analysis; ICA, independent components analysis.

Another difference is the metric by which these relationships are measured. A factor loading matrix, as used in PCA, can be thought of as containing the regression weights needed to predict the variables from the factors. Formally, correlation coefficients are the same as the regression weight needed to predict one variable by the other if the two variables are standardized: $Y = \beta X$ (where $Y$ is the variable, $X$ is the factor, and $\beta$ is the regression weight). In the ICA relationships matrix, the entries reflect the higher moments as well, such as the third moment: $Y = \beta X^2$ (keeping in mind that in this case $Y$ is a factor, like $X$, rather than a variable). These higher-order relations are represented by an exponential sigmoid function that has the form $y = 1/(1 + \exp(-u))$ and runs from $-1$ to $1$ after some rescaling ($2y - 1$), where $u$ is the factor score and $y$ is the sigma-transformed factor score. Just like with a correlation, a positive score means a tendency to vary in the same direction and a negative score means a tendency to vary in the opposite direction.

Another issue is that in the relationship matrix the columns are the factor scores and the rows are the sigmoid (sig) transformed versions of the factor scores (fac).
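The iterative procedure described above can be sketched as a batch natural-gradient Infomax loop. This is an illustrative sketch under stated assumptions, not the EEGlab implementation: sphering is done here with the inverse symmetric square root of the covariance matrix, the whole dataset is used at each step instead of small training blocks, and the learning rate, iteration count, function names, and toy sources are all hypothetical choices.

```python
import numpy as np

def sphere(data):
    """Decorrelate and standardize the variables (rows are observations)."""
    centered = data - data.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    # Inverse symmetric square root of the covariance matrix (an assumption;
    # other sphering conventions differ by a scale factor or rotation).
    sphering = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    return centered @ sphering, sphering

def infomax(data, n_iter=1000, rate=0.1, tol=1e-8):
    """Batch natural-gradient Infomax with a logistic sigmoid."""
    sphered, sphering = sphere(data)
    n_obs, n_vars = sphered.shape
    weights = np.eye(n_vars)  # factor scores start out equal to the variables
    for _ in range(n_iter):
        scores = sphered @ weights.T            # factor scores (activations)
        sig = 1.0 / (1.0 + np.exp(-scores))     # y = 1 / (1 + exp(-u))
        rescaled = 2.0 * sig - 1.0              # 2y - 1, runs from -1 to 1
        # Relationship matrix: rows are the sigmoid-transformed scores,
        # columns are the untransformed scores.
        relation = rescaled.T @ scores / n_obs
        # A positive entry subtracts a fraction of the other factor's
        # scoring coefficients; a negative entry adds them.
        update = rate * (np.eye(n_vars) - relation) @ weights
        weights = weights + update
        if np.abs(update).max() < tol:
            break
    # Second output maps centered raw data to factor scores.
    return sphered @ weights.T, weights @ sphering

# Hypothetical toy sources: two independent super-Gaussian (Laplace) signals.
rng = np.random.default_rng(0)
sources = rng.laplace(size=(5000, 2))
mixing = np.array([[1.0, 0.6], [0.4, 1.0]])
mixed = sources @ mixing.T
scores, unmixing = infomax(mixed)
```

Note that the relationship matrix computed in the loop is not symmetric: entry (i, j) pairs the transformed version of factor i with the raw scores of factor j, so each pair of factors is described by two numbers.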
This arrangement means that the relationship is asymmetric, with the relationship between each pair of factors represented by two numbers (i.e., the product of fac1 and sig[fac2] and the product of sig[fac1] and fac2). In the subsequent rotation step the first value determines the effect of the first factor on the second factor, whereas the second value determines the effect of the second factor on the first.

The ability for one factor to predict another based on these higher moments is related to its Gaussianity. Along the diagonal of the matrix, the entries represent the Gaussianity of the factors. A perfectly Gaussian factor would have a score of zero. The off-diagonals represent the non-Gaussianity of the two factors (i.e., the scores will be maximal when both factors are non-Gaussian and in the same way, which means that the two factors will be related through the higher-order relationships). With each iteration the degree to which a factor will be rotated depends on the relative difference between its diagonal (how much it will stay the same) and the off-diagonals (how much it will change). The more non-Gaussian a factor is, the less it will be rotated. This approach is based on the Central Limit Theorem, which indicates that a mix of two latent variables should be more Gaussian than the pure variables; maximizing non-Gaussianity of the factors should therefore maximize how purely they reflect a single latent variable [Hyvärinen et al., 2001: 9].

The sign of the relationship number controls how the factor scoring coefficients, and hence the factor scores, are changed at each iteration of the process. If the relationship number is positive, a fraction of the second factor's scoring coefficients is subtracted from those of the first. The more similar (and hence more positive the number), the more is subtracted. The reverse happens if the relationship number is negative (the two are similar in a mirror-like fashion). If the factors started out similar, this process will push them apart from each other. Ultimately, this process reaches an equilibrium where the changes to the two factors cancel out.

The strongest relationship is likely to be the second-order correlations. For this reason, the default approach is to decorrelate the matrix by sphering the data by using matrix division to divide it by the covariance matrix [see also Hyvärinen et al., 2001: 160]. The result is to eliminate the second-order covariances (they now equal zero) while leaving intact the higher-order relations (e.g., the product of the variable with its transformed version is not zero). The sphering operation also has the important effect of standardizing the data matrix, equalizing the contribution of the different variables to the results, much as PCA normally uses correlational factor loadings at the rotation step. This standardization could, in principle, be performed without also sphering the data; the two do not need to go together. Whereas PCA factor loadings are, by convention, interpreted in correlation form, ICA factor loadings are, by convention, interpreted in covariance form (with microvolt metric). This conversion to microvolt metric automatically occurs when the sphering operation is undone prior to interpretation, and hence simply represents a difference in convention rather than a fundamental difference between PCA and ICA, since PCA factor loadings can also be readily converted to microvolt metric [see Dien et al., 1997].

The sigmoid function also has the purpose of expanding the influence of the most informative part of the data distribution (the center) compared to the outer fringes (the outliers). The outliers are compacted into the floor and ceiling values of -1 and 1, whereas the central numbers, which may be closely spaced, are spaced further apart in the sigmoid-transformed variable. It is this maximization of the information value of the data by this transformation that leads to the name Infomax for this ICA algorithm [Bell and Sejnowski, 1995: 1130].

Assumptions and Issues in ICA of ERPs

ICA makes two assumptions about the data.
The first is that the data are non-Gaussian in their distribution over different possible values [Hyvärinen et al., 2001: 162], as can be graphed by a histogram, which is to say that they depart from normality. Such non-Gaussian distributions make it possible for the higher-order moments to differentiate the ERP components. It seems likely that most ERP components analyzed in a spatial approach will be highly non-Gaussian, since most of the observations will be zero with just a few time points being nonzero. The actual time course of the components will be relatively unimportant compared to this effect of being temporally circumscribed. It is not as clear what the case will be for a temporal approach.

The second is that the ERP components be independent of each other [Hyvärinen et al., 2001: 152], which means that they should not be only uncorrelated but also unrelated in terms of the higher-order relations, as described in the previous section. One of the prior simulation studies [Makeig et al., 2000] examined the effect of correlated components on ICA and showed that it can indeed cause distortions in the results. Decorrelating (sphering) the data before the ICA rotation (and recorrelating afterwards) may perhaps address this issue but it remains untested; furthermore, it remains unclear just how independent ERP components tend to be, aside from the second-order correlational moments, or what effect such nonindependence might have on the results. Like with PCA, one would expect that this issue be more serious for the temporal approach.

Another situation is when the factor loading matrix is almost singular [Bell and Sejnowski, 1995]. This could happen if two of the factors were too similar, and hence the weights also. This statement is therefore homologous to the issue discussed with regard to PCA that factors that are too similar may be difficult to separate. As with PCA, it is most likely to be an issue for the spatial approach.
The factor loading matrix could also be singular if variables are too similar or if there are more variables than there are latent variables to be modeled by factors. In such a case, the phenomenon of overfitting or overlearning can occur. One way this phenomenon can manifest is as factors with isolated temporal bumps that represent portions of a factor's time series being split between different factors [Hyvärinen et al., 2001; Särelä and Vigário, 2003]. It can also result in effects similar to that described for PCA, such as single ERP components being split into multiple single-loading factors.

A final issue more specific to the Infomax algorithm is that it is designed to handle super-Gaussian events that have large amplitudes but limited presence across the observations. When applied as a spatial ICA, typical ERP components meet this description, as they are high amplitude and short duration, which translates to being present in relatively few of the observations (time points). When applied as a temporal ICA, the observations are electrodes and this may no longer be the case. Because ERP components are present in most electrodes due to volume conduction, they should be present in most of the observations (channels) of a temporal ICA. It may therefore be the case that they are better described as having a sub-Gaussian distribution. A variant of the Infomax algorithm, called Extended ICA, has been developed for such cases [Lee et al., 1999]. This variant will also be applied to see if it provides more effective results for temporal ICA.

DIFFERENCES BETWEEN PCA AND ICA

PCA can be applied to ERP datasets using either a temporal [Donchin and Heffley, 1979] or a spatial approach [Dien, 1998a; Kavanagh et al., 1976], a distinction that will play a key part in the present simulation study. In the temporal approach (temporal PCA) the time points are arranged as the variables and the waveforms (combinations of channels, subjects, and conditions) are the observations. The factors are defined by a specific time course as described by their respective factor loadings. Since the factor loadings are fixed in nature, it is not possible to examine latency changes across conditions or subjects. Scalp topography information, as coded in the factor scores, is free to vary. In the spatial approach (spatial PCA) the channels are the variables and the scalp topographies (combinations of time points, subjects, and conditions) are the observations. With the spatial approach it is possible to examine latency effects but not topography changes. Counterintuitively, therefore, the spatial approach is better for studying temporal changes and vice versa.

Aside from the fundamental differences between PCA and ICA, there has also been a history of differences in the application of the techniques. These differences in application need to be addressed as well. At the risk of overgeneralizing, it can be said that PCA studies of ERPs have typically been temporal analyses on subject averages [e.g., Bentin et al., 1985; Chapman et al., 1978; Curry et al., 1983; Dien, 1999; Dien et al., 1997, 2003a; Friedman et al., 1981; Kayser et al., 1998; Kramer and Donchin, 1987; Lutzenberger et al., 1981; Polich, 1985; Rohrbaugh et al., 1978; Ruchkin et al., 1990; Yee et al., 1987], whereas ICA studies have typically been spatial analyses on either single subjects or multisubject grand averages [e.g., Jung et al., 2000; Makeig et al., 1997, 1999a,b; Vigario, 1997]. PCAs have historically been applied on a temporal basis, as ERP researchers typically characterize ERP components in terms of their time course, with scalp topography being a secondary, albeit important, characteristic. Subject averages have been analyzed because they facilitate subsequent analysis of variance (ANOVA) of the factor scores. The application of ICA to ERP datasets has been motivated, at least to some extent, by a basic difference between the two procedures: PCA is oriented toward lumping, while ICA is oriented toward splitting.
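To make the temporal versus spatial arrangement described above concrete, the following minimal sketch rearranges one and the same dataset into the two observation-by-variable matrices. The nested-list layout data[subject][condition][channel][time point], the function names, and the tiny dimensions are assumptions chosen for illustration; they are not taken from any toolbox.

```python
def temporal_matrix(data):
    """Temporal approach: rows are waveforms (one per subject, condition,
    and channel); columns (variables) are time points."""
    return [list(wave) for subj in data for cond in subj for wave in cond]

def spatial_matrix(data):
    """Spatial approach: rows are scalp topographies (one per subject,
    condition, and time point); columns (variables) are channels."""
    rows = []
    for subj in data:
        for cond in subj:
            n_time = len(cond[0])
            for t in range(n_time):
                rows.append([wave[t] for wave in cond])
    return rows

# Tiny hypothetical dataset indexed as data[subject][condition][channel][time point].
n_subjects, n_conditions, n_channels, n_points = 2, 2, 3, 4
data = [[[[float(1000 * s + 100 * c + 10 * ch + t) for t in range(n_points)]
          for ch in range(n_channels)]
         for c in range(n_conditions)]
        for s in range(n_subjects)]

temporal = temporal_matrix(data)
spatial = spatial_matrix(data)
print(len(temporal), len(temporal[0]))  # observations x variables (temporal)
print(len(spatial), len(spatial[0]))    # observations x variables (spatial)
```

With the dimensions of the simulation dataset described later in this report (20 subjects, two conditions, 65 channels, 125 time points), the same rearrangement would give a 2,600 x 125 matrix for the temporal approach and a 5,000 x 65 matrix for the spatial approach.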
Since the PCA algorithm begins by extracting linear combinations that account for as much variance as possible, the early factors it yields combine as many variables as possible. This results in a maximally parsimonious set of factors that will err toward conflating similar latent variables. ICA, on the other hand, starts the process with a different factor for every variable and, in seeking maximum independence, will tend to make fine distinctions. ICA factors therefore err in the direction of separating activity that should not be separated, sometimes splitting activity into multiple correlated factors that have primary loadings on different variables [Makeig et al., 1999a; McKeown et al., 1998], posing problems of parsimony. These multiple factors may, for example, reflect subtle individual differences. It can even result in background noise combining with a latent factor, splitting it into nearly identical versions at different variables (channels for a spatial analysis). Such splitting would complicate efforts to interpret the ICA results and to compare them with the PCA results.

This parsimony issue is largely avoided with sparse montages when using a spatial analysis. ICA articles that have successfully applied a spatial approach to subject averages have only examined the data from 14 ERP channel locations [i.e., Matsumoto et al., 2005; Sato et al., 2001]. Such an analysis would yield 14 factors (one for each channel), which would avoid parsimony issues since this is approximately the number of major ERP features (i.e., not leaving any factors to represent subject differences). A study using 29 channels [Pritchard et al., 1999] reported having to analyze each subject and condition separately, because combined analyses resulted in factors corresponding to only a single condition. While the single-subject approach has been successfully used in a case study [Makeig et al., 1997] and in artifact correction [Jung et al., 2000; Vigario, 1997], trying to apply it to multiple-subject datasets can be difficult [e.g., Jung et al., 2001].
Conducting a temporal analysis would similarly result in large numbers of factors (a 1-s epoch recorded at 250 Hz yields 250 time points and hence 250 factors) and high levels of splitting; in contrast, the lumping bias of PCA means that only a relative handful of these factors explain enough variance to be of interest, minimizing this concern.

One approach to countering this splitting issue is to use the multisubject grand average data to avoid individual difference factors and to reduce complications from background noise [Makeig et al., 1999a]. In principle, this approach could reduce the quality of the results since it loses information about individual difference variance that could be helpful for separating component activity. It also does not solve the fundamental issue of parsimony since a dataset will typically produce as many factors as there are variables, in the absence of collinearity. On the other hand, a multisubject grand average ERP could have the advantage of an improved signal-to-noise ratio with respect to subject averages.

A second approach is to use PCA as a preprocessing step to reduce the dimensionality of the dataset, an option apparently used in only one ERP report thus far [Johnson et al., 2001], but used in a number of fMRI analyses [Calhoun et al., 2001; Dodel et al., 2000; Greicius and Menon, 2004]. Reduction of data dimensionality has been advocated as a strategy for minimizing overfitting [Särelä and Vigário, 2003]. The current report will focus on using the PCA preprocessing approach since it also facilitates comparisons with PCA rotations. One can then conceptualize the contrast as between two different rotations, Promax and Infomax, of the same initial PCA decomposition. Issues about ICA component splitting and factor identification would be comparable to PCA. The multisubject grand average approach will also be evaluated in Simulation 4.
Two simulation comparisons (where it is possible to evaluate accuracy, since the true answer is known) have been made of PCA and ICA of ERP data, both recommending ICA (using the Infomax algorithm) over PCA [Makeig et al., 2000; Richards, 2004]. The present report will seek to extend these studies as follows. First, it will utilize real EEG for the background noise. Second, it will explicitly examine the distinction between spatial and temporal approaches. Third, it will seek to parameterize the cases in which one or the other technique fails so that users have some basis for choosing which one to use for a given dataset.

This report will also address the Varimax issues noted earlier when there are no large clusters of variables that load on only one factor or if there are a number of variables with zero loadings on the first unrotated factor [Cureton and D'Agostino, 1983: 224]. A variant of the Varimax rotation, the weighted-varimax, has been proposed to address these two situations [Cureton and D'Agostino, 1983; Cureton and Mulaik, 1975]. It gives the most weight to factor loadings that are located away from the initial unrotated factor (which is by far the largest), essentially making the assumption that the initial rotation is not aligned with the correct rotation. It is not clear in advance how problematic these two situations might be for spatial and temporal PCAs of ERP data, so it seems worthwhile to evaluate this rotation as well. Since Promax uses Varimax as an initial rotation, we implemented Promax with Weighted-Varimax to supplement the regular Promax with Varimax rotation. We will also examine the previously described Extended ICA algorithm, which is intended for sub-Gaussian distributions [Lee et al., 1999], to see if it provides more effective results for temporal ICA.

This report consists of five simulations. Simulation 1 examines the reliability of ICA results. Simulation 2 evaluates Infomax and Promax under minimal noise conditions. Simulation 3 examines the effects of different levels of real background EEG noise. Simulation 4 examines the effects of individual differences and of using multisubject grand averages rather than subject averages. Simulation 5 determines if these results still apply when all five simulated components are included in the simulations.

SIMULATION 1

Before direct comparisons can be made between ICA and PCA, an essential issue is determining whether ICA solutions are replicable, since there is a random element to the process (the random selection of data subsets) that can cause some variability in the results, as noted on the very helpful website of Makeig and colleagues (http://www.sccn.ucsd.edu/~scott/tutorial/icafaq.html).

Methods: Simulation 1

A realistic simulation dataset was constructed for testing purposes, as previously described [Dien et al., 2005]. The simulation dataset represents a typical ERP dataset with 20 subjects, two conditions, and 65 channels (using the original montage of the Electrical Geodesics, Eugene, OR, net). Realistic background noise was obtained by using the data obtained from 20 subjects with EEG free from artifacts from a previously published experiment [Dien et al., 2003a]. Trials containing blinks were rejected, resulting in an average of 55 trials per condition. The data were averaged using the ± reference [Schimmel, 1967], which eliminates the ERP signal, but preserves the random background noise level, by inverting every other trial. This noise average was filtered using a 30-Hz low-pass filter. The data represent 125 time points, starting 184 ms before baseline, with a sampling rate of 125 Hz. The standard deviation (SD) of the noise ranged from 0.46 to 1.37 (median 1.04) microvolts across the epoch. Each channel of the data was referenced to the average of the data at a given time point, otherwise known as the average reference [Bertrand et al., 1985; Dien, 1998b].

Superimposed on the noise average were two simulated ERP components (Fig. 1). The topography of the ERP components was generated by the Dipole Simulator v. 2.1.0.5 (written by Patrick Berg and available for download from http://www.megis.com/udbesa.htm). One dipole was oriented roughly toward scalp location Cz of the International 10-20 System of electrode placement [Jasper, 1958], while the other dipole was oriented roughly toward Pz. The time courses of the two components were generated using a half-sine wave covering 10 and 30 time points, respectively. The peak latencies of the two components are 160 and 256 ms, respectively. The amplitudes of the two components were separately varied from 2 to 4 microvolts. Subject variance (correlation between the two component amplitudes) was simulated by setting the amplitude of Component 2 to be equal to the Component 1 amplitude plus 2 to 4 microvolts, divided by 2. The peak amplitude of Component 2 at the focal channel (the channel with the highest amplitude) in the small and large Component 1 cells had a mean (SD in parentheses) of 1.72 µV (0.34) and 1.71 µV (0.34), respectively. A condition effect was introduced by multiplying Component 1 by a factor of 0.9 for the small Component 1 cell and 1.1 for the large Component 1 cell. The peak amplitude of Component 1 at the focal channel had a mean of 2.2 µV (0.30) in the small Component 1 cell and 2.6 µV (0.39) in the large Component 1 cell. This level of effect was intended to yield F-values comparable to published P300 studies, since it has been shown that unrealistically large condition effects can exaggerate the degree of misallocation variance effects [Beauducel and Debener, 2003]. The two components are temporally overlapping and spatially correlated, both of which can be deleterious for PCA solutions [Dien, 1998a].

Figure 1. Simulated ERP components. The scalp topographies represent the voltage map at the peak time point. The time courses represent the voltages at the peak channel.
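The amplitude scheme just described can be sketched in a few lines. This is a simplified, single-channel illustration rather than the code used for the simulations: the uniform distribution of the 2 to 4 microvolt draws, the onset positions of the two half-sine waves, the random seed, and all function names are assumptions, and the dipole-derived scalp topographies and the background noise are omitted.

```python
import math
import random

def half_sine(n_points, onset, duration):
    """Time course that is zero except for one half-sine bump of unit peak."""
    course = [0.0] * n_points
    for i in range(duration):
        course[onset + i] = math.sin(math.pi * (i + 0.5) / duration)
    return course

def draw_amplitudes(rng):
    """Amplitudes (microvolts) for one simulated subject, following the text."""
    amp1 = rng.uniform(2.0, 4.0)                 # assumed uniform draw
    amp2 = (amp1 + rng.uniform(2.0, 4.0)) / 2.0  # shares variance with amp1
    return amp1, amp2

CONDITION_GAIN = {"small": 0.9, "large": 1.1}    # applied to Component 1 only

rng = random.Random(0)                           # fixed seed for repeatability
course1 = half_sine(125, onset=38, duration=10)  # onsets are hypothetical
course2 = half_sine(125, onset=40, duration=30)

subjects = [draw_amplitudes(rng) for _ in range(20)]
waves = {
    cell: [[gain * a1 * c1 + a2 * c2 for c1, c2 in zip(course1, course2)]
           for a1, a2 in subjects]
    for cell, gain in CONDITION_GAIN.items()
}
print(len(waves["small"]), len(waves["small"][0]))  # subjects x time points
```

Because Component 2's amplitude is built partly from Component 1's, the two amplitudes covary across simulated subjects, which is the subject variance referred to in the text.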
Aside from the correlation from the simulated subject variance, the substantial temporal overlap induces a correlation between the two ERP components for a spatial analysis since they tend to be present in the same observations [Dien, 1998a], resulting, in this case, in a Pearson's r of 0.10 when calculated across all the observations (the observations are not independent so an inferential test is not warranted). The correlation was calculated for each simulation, Fisher-Z-transformed, averaged, and then back-transformed. A correlation is a reasonable measure of similarity in the context of PCA because even analyses with a covariance matrix actually use correlations (factor loadings) during the rotation step. The choice between covariance and correlation relationship matrices affects the factor retention step, not the rotation step.

In order to evaluate reproducibility of the ICA results, the same dataset was analyzed 100 times using a spatial analysis. EEGlab 4.08 [Delorme and Makeig, 2004] running under Matlab 7.01 (MathWorks, Natick, MA) was used to compute the ICA solutions. Similarity of the analyses was assessed by examining the two factors correlating most highly with the two simulated components. When two factors correlated most highly with the same simulated component, the one that correlated most highly was paired with it. Because PCA is biased toward combining latent variables together into a single factor, this procedure will tend to favor ICA; such cases will essentially be tabulated as being an error because the second simulated component will be paired with an unsuitable factor instead of the combined factor. Conversely, ICA has a bias toward splitting components into multiple factors, and since only one of these multiple factors will be chosen, this factor will fit only part of the variance and will score poorly. Implicit in this procedure, therefore, is the conscious judgment that both such situations represent an error on the part of the statistical analysis.
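The Fisher-Z averaging of correlations mentioned above can be written compactly. The sketch below is illustrative only; the function name and the example correlation values are hypothetical and are not results from the simulations.

```python
import math

def average_correlations(rs):
    """Average correlations via Fisher's Z: transform each r with atanh,
    average in Z space, then back-transform the mean with tanh."""
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))

# Hypothetical per-simulation correlations between two components.
rs = [0.05, 0.10, 0.15]
print(round(average_correlations(rs), 4))

# For high correlations the Fisher-Z average departs from the plain mean.
high = [0.50, 0.99]
fisher_mean = average_correlations(high)
plain_mean = sum(high) / len(high)
print(fisher_mean > plain_mean)
```

Averaging in Z space matters mainly when correlations are near 1, where the sampling distribution of r is strongly skewed; for small correlations such as the first example the two kinds of average are nearly identical.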
Correlations were assessed with the factor loadings scaled in microvolts, which for ICA takes the form of the pseudoinverse of the product of the sphering matrix and the weight matrix. To examine the effect of using a PCA preprocessing step, the exercise was repeated with six retained factors (as suggested by the Scree test).

Results: Simulation 1

While the results across the 100 analysis runs were quite similar, they were not identical. The correlation coefficients for the time course of Component 1, as regenerated from the factor scores, varied from 0.9858 to 0.9861, while the spatial distribution varied from 0.9979 to 0.9982. The time course of Component 2 correlated at 0.9993 in all cases, while the scalp distribution varied from 0.9890 to 0.9893. In general, the parameters were stable up to about the third digit. Examination of the actual factor loadings revealed a similar situation. PCA preprocessing yielded moderately more stable parameters, with numbers also being generally stable up to the third digit. Component 1 time courses varied from 0.9913 to 0.9915, while the spatial topography varied from 0.9913 to 0.9917. Component 2 ranged from 0.9982 to 0.9982 for temporal patterns and from 0.9863 to 0.9865 for spatial patterns.

Discussion: Simulation 1

Although the ICA solutions were largely reliable, variability is observable in the less significant digits. This variability can potentially have a noticeable impact. While the variability was not serious enough in the present study to be an issue, it is unknown whether it may be greater under some conditions such as when the signal-to-noise ratio of the data is lower. One way to address this issue is to standardize the random number generation. Conventional computers cannot generate truly random numbers since their programming is wholly deterministic (http://computer.howstuffworks.com/question697.htm); instead, they use a complicated formula with a starting seed number to produce unpredictable numbers (the output using the initial seed is utilized as the seed to generate the next random number).
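The seed-chaining idea described above can be illustrated with a toy linear congruential generator. This is a sketch of the general principle only: the function name is hypothetical, the constants are a commonly published textbook choice, and this is not the generator used by Matlab or by the ICA code.

```python
def lcg(seed, n, a=1664525, c=1013904223, m=2 ** 32):
    """Return n pseudorandom numbers in [0, 1) from a linear congruential
    generator; each output state becomes the seed for the next draw."""
    state = seed
    out = []
    for _ in range(n):
        state = (a * state + c) % m  # fully deterministic update rule
        out.append(state / m)
    return out

run_a = lcg(seed=0, n=5)
run_b = lcg(seed=0, n=5)   # same seed: identical sequence
run_c = lcg(seed=1, n=5)   # different seed: different sequence
print(run_a == run_b, run_a == run_c)
```

Because the whole sequence is determined by the starting seed, fixing the seed makes an analysis that depends on random draws exactly repeatable.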
It may be advisable to modify the pseudorandom number generator so that it uses a known seed that can be replicated as needed. Matlab's pseudorandom number generator is reset at startup, producing the identical output each time the program is started. The same result can be accomplished by inserting the command rand('state',0); at the start of the runica code. This approach will be used for the remainder of this report.

SIMULATION 2

With the reliability issues addressed in Simulation 1, a preliminary comparison of ICA and PCA can now be conducted. A simulation dataset will be analyzed using both spatial and temporal approaches with both PCA and ICA. To increase the generalizability to real datasets, as described in the Methods section, five simulated components were constructed from five ERP components from real datasets. For this initial simulation, only minimal noncoherent background noise was added to maximize interpretability of the results (described in Methods). For this reason, the results of Simulation 2 will provide observations about the basic principles that will be largely free from concerns about confounds from the background noise but will not be generalizable to real datasets (maximizing control at the cost of ecological validity). Because it is not possible to systematically vary all aspects of such complicated datasets, Simulation 2 will be approached as representative of a larger universe of possible datasets wherein general principles may be observed for greater understanding of the relevant parameters. We argue that making an effort to form a fully realistic ERP dataset and offering an evaluation on this basis would be an ill-posed question. Every ERP dataset will have different combinations of ERP components and it would not be possible to cover every eventuality. Instead, the goal of this article is to identify the parameters that result in successful separations of components, such as component correlations or the sigma covariance measure that we present later on.
We therefore used pairs of components, reasoning that larger sets of simulated components could be difficult to interpret, much as trying to interpret a five-way ANOVA is extremely complicated due to all the possible patterns of interactions. Any time an experimental study of any kind, simulation or otherwise, is conducted one must choose a balance between interpretability and ecological validity; we argue that the current work chose a balance that allowed us to best meet the goals of this study. We are, however, mindful of the concern that the dataset have a realistic level of dimensionality. We therefore include a high level of coherent background noise in Simulation 3 that provides this dimensionality, while addressing as best we can the potential for interactions between the simulated components and the background noise characteristics by comparing with Simulation 2 results (where there was no coherent background noise). Finally, in Simulation 5 we do include all five simulated components to determine if the prior results generalize to larger numbers of components.

It is not clear how weighted-varimax will perform with temporal and spatial PCAs. The pinning situation (described in the Introduction) only applies to three or more factors, so it will not occur with the present simulation. The off-axis clusters situation (described in the Introduction) might be more likely to occur with spatial PCAs because more overlap (and hence variables with loadings on both factors) should occur in the spatial domain. Extended ICA will be applied to determine if it provides any benefits. This variant will be most likely to improve temporal ICA since the data are most likely to be sub-Gaussian with this approach.

Methods: Simulation 2

Simulation 2 was constructed in the same fashion as Simulation 1 with two modifications. The first modification was that five real ERP components were utilized, as shown in Figure 1 and summarized in Table II. Two visual ERP components were obtained from a previously published study [Dien et al., 2003a].
The left frontal effect (focus near F3, peak at 432 ms), which we term an N400 because it is from an N400 study, although it has a more frontal distribution than usual, was obtained from the multisubject grand average difference wave between the congruent and incongruent ending conditions. A visual P1 was obtained from the same dataset from the congruent ending condition (focus near O2, peak at 120 ms). Three auditory ERP components were obtained from another previously published study [Dien et al., 1997]. An auditory N1 was obtained using the multisubject grand average for all auditory conditions (focus near Fz, peak at 108 ms). An auditory P300 was obtained from the difference wave between target and standard conditions (focus between Cz and Pz, peak at 400 ms). Finally, an auditory P2 was obtained from the multisubject grand average of all auditory conditions (focus near Fz, peak at 200 ms).

TABLE II. Statistical properties of the five simulated ERP components

Component  Peak    Temporal SD  Temporal skew  Temporal kurtosis  Spatial SD  Spatial skew  Spatial kurtosis
N400       432 ms  0.95         0.04           0.93               0.63        1.64          1.15
P1         120 ms  0.75         0.77           0.64               0.30        4.43          19.00
N1         108 ms  2.61         0.17           1.21               0.98        3.64          12.10
P300       400 ms  1.06         0.50           0.80               0.76        1.13          0.28
P2         200 ms  2.16         0.84           0.12               1.32        2.70          5.89

SD, standard deviation, represents the microvolt values at the peak channel. The parameters are calculated from the base components without the addition of condition or subject variance. Temporal columns represent the figures for the temporal approach and the spatial columns are for the spatial approach.

TABLE III. Relations between pairwise comparisons of simulated components

Pair  C1    C2    Temporal scores  Spatial scores  Temporal loadings  Spatial loadings  Temporal sig cov  Spatial sig cov
1     N400  P1    0.43             0.12            0.13               0.43              1.20              0.17
2     N400  N1    0.81             0.14            0.15               0.82              1.07              0.14
3     N400  P300  0.63             0.91            0.93               0.63              1.32              1.44
4     N400  P2    0.89             0.18            0.19               0.90              1.03              0.23
5     P1    N1    0.75             0.92            0.94               0.76              1.04              2.86
6     P1    P300  0.25             0.16            0.17               0.25              1.61              0.32
7     P1    P2    0.34             0.09            0.09               0.34              1.07              0.11
8     N1    P300  0.22             0.19            0.20               0.22              1.33              0.21
9     N1    P2    0.74             0.10            0.10               0.75              1.37              0.09
10    P300  P2    0.73             0.18            0.19               0.75              1.19              0.33

Sig cov is the sigma covariance measure as described in the text. C1 and C2 are the two simulated components in the dataset. Temporal scores are the correlation of the true factor scores for the temporal approach (if the factors are reconstructed accurately), including subject, cell, and spatial variance. Spatial scores are the equivalent figure for the spatial approach. The scores are the median results across the ten replicates. Temporal loadings are the correlation between the time courses (scaled loadings) of the two components, which quantifies the variable overlap for the temporal approach. Spatial loadings are the complementary figures for the spatial approach.

To ensure that these waveforms are statistically unidimensional in both the spatial and temporal dimensions (without additional dimensions contributed by noise or overlapping components), the components were constructed from the time course at the focus channel (with the periods before and after the component set to zero) matrix-multiplied by the maximum-normalized scalp distribution at the peak time point (with a reversed N1 topography to ensure correct polarity of the reconstructed component).
This procedure generated a component with the same time course at all channels and the same scalp distribution at all time points, as should be the case for a single component due to volume conduction of a single source electric field. This procedure was also necessary since it is known that some of these ERP features are in fact composed of multiple components, an issue that is outside the scope of this report [see Näätänen and Picton, 1987; Sutton and Ruchkin, 1984]. This procedure reduced these simulated ERP features from their true unknown dimensionality to a known single dimension.

The second modification is that for this initial simulation the background EEG noise was removed. To prevent the data matrices from becoming singular, a very small amount of noise was added to each data point (-0.01 to +0.01 microvolts). The noise had no coherence between data points. The same set of random noise was used for every simulation to keep it constant with regard to the manipulations of interest.

One hundred different datasets were generated and analyzed in each of the four approaches (spatial and temporal, each with PCA and ICA). Test simulations with the 10 different pairwise combinations of the five simulated components were generated. For each of these 10 combinations, 10 simulation datasets were generated for a total of 100 simulation datasets. The component amplitudes were varied as described for Simulation 1 with random variation for each simulated subject average, subject variance, and a condition effect for one of the two components. Table III presents the similarity of the pairs of components in terms of both the factor scores and the variables.

The PCA Toolbox 1.091 (http://www.people.ku.edu/~jdien/downloads.html) was used to compute the PCA solutions. As we have recommended elsewhere on the basis of simulation studies [Dien et al., 2005], the PCAs were carried out using covariance matrices, Kaiser normalization, Promax rotation, and correlation loadings.
Promax was conducted with a kappa of 3, which is the parameter that determines how oblique the rotations will be. To convert the factor loadings into microvolt metric for comparison with the original data, the factor pattern matrix was multiplied by the SDs of the variables [Dien et al., 1997].

The ICA was conducted using PCA preprocessing. Only two factors were retained since there is no coherent noise to be accounted for by the PCA solution. The extended-ICA was conducted with the specification that both of the factors would be sub-Gaussian (e.g., 'extended', -2).

Although we have some reservations about applying inferential statistics to artificial simulation data results, we applied paired t-tests to selected comparisons of interest to further evaluate the results. Note that t-tests compare means, whereas the tables present median statistics. We chose median statistics for the tables because of a judgment that overall consistency is more important than highlighting the effect of dramatic outliers. However, we chose conventional t-tests, which compare means, since they will be the most familiar for readers. For the most part, the resulting t-tests do seem to correspond with the median statistics.
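The distinction drawn above between median summaries and mean-based paired t-tests can be seen in a small sketch. The accuracy values below are invented for illustration (they are not results from this report), and the function name is hypothetical; the example is deliberately constructed so that a single dramatic outlier pulls the two kinds of summary apart.

```python
import math
import statistics

def paired_t(a, b):
    """Paired t statistic and degrees of freedom for two matched samples."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)          # sample SD of the differences
    return mean_d / (sd_d / math.sqrt(n)), n - 1

# Hypothetical accuracy correlations for two methods on the same five datasets;
# method A is usually better but fails badly on one dataset.
method_a = [0.99, 0.98, 0.99, 0.97, 0.40]
method_b = [0.95, 0.95, 0.96, 0.94, 0.93]

t_value, df = paired_t(method_a, method_b)
print("median A, median B:", statistics.median(method_a), statistics.median(method_b))
print("mean A, mean B:", statistics.mean(method_a), statistics.mean(method_b))
print("paired t:", round(t_value, 3), "df:", df)
```

In this constructed case the medians favor method A while the means, and hence the sign of the t statistic, favor method B, which is the kind of divergence the choice of median statistics for the tables is meant to guard against.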