Extraction of Uncorrelated Sparse Sources from Signal Mixtures using a Clustering Method


Malcolm Woolfson
Department of Electrical and Electronic Engineering, Faculty of Engineering, University of Nottingham, Nottingham, NG7 2RD, England
email: malcolm.woolfson@nottingham.ac.uk

Abstract

A blind source separation method is described to extract sources from data mixtures where the underlying sources are assumed to be sparse and uncorrelated. The approach used is to detect and analyse segments of time where one source exists on its own. Information from these segments is combined to counteract the effects of noise and the small random correlations between the sources that would occur in practice. This combined information can then be used to estimate the sources one at a time using a deflationary method. Probability density functions are not assumed for any of the sources. A comparison is made between the proposed method, the Minimum Heading Change method, Fast-ICA and Clusterwise PCA. It is shown, for the dataset used in this paper, that the proposed method has the best performance for clean signals if the input parameters are chosen correctly. However, the performance of this method can be very sensitive to these input parameters and can also be more sensitive to noise than the Fast-ICA and Clusterwise methods.

Keywords: Blind Source Separation, Sparse Component Analysis, Sparse Sources

1 Introduction

One general problem in signal processing is the extraction of individual source signals {s_j[n]} from measurements {z_i[n]} that are a linear combination of these sources:

    z_i[n] = Σ_{j=1}^{N} A_ij s_j[n]    (i = 1, 2, ..., M)    (1)

where {A_ij} are the mixing coefficients, M is the number of sets of measurement data and there are N underlying sources. In the case where both the sources and the mixing coefficients are unknown, this problem comes under the heading of Blind Source Separation (BSS). There are many applications in this area, for example the analysis of EPR data [1], NMR data [2], fetal ECG monitoring [3] and gene mapping [4].

BSS is an underdetermined problem, even when M ≥ N, as both {A_ij} and {s_j[n]} in Equation (1) are unknown, which means that linear estimation methods cannot be applied. There are various approaches to estimating the sources, to take a few examples: Principal Component Analysis (PCA) [5], forcing higher order cross-cumulants to zero [6] and Independent Component Analysis [7], [8]. In various approaches, the data are normally whitened first, using for example PCA or Gram-Schmidt, and then each transformed data component is normalised to have unity root mean squared (rms) value. The relation between the whitened components {e_i[n]} and the underlying sources {s_j[n]} can be written as

    e_i[n] = Σ_{j=1}^{N} B_ij s_j[n]    (2)

where {e_i[n]}, i = 1, 2, ..., M, are the whitened components and {B_ij} are the mixing coefficients. Whitening makes it easier to separate out the individual components, which are assumed uncorrelated. From now on we assume that the sources are uncorrelated and that the data has

been whitened and normalised to unit rms value; the method of whitening we will use is the Gram-Schmidt method.

The approaches to BSS described in [5-8] do not make any assumptions as to how the underlying sources vary with time. However, in many applications, for example monitoring of the fetal ECG from multilead measurements, the underlying sources are significant only for a segment of time; such sources are termed sparse. A looser definition of sparsity is that each source should be dominant over the others for a short period of time. The methods described in the previous paragraph can be applied to the case where the underlying sources are sparse; however, a group of BSS methods has been developed which makes use of the sparsity of the sources to extract them; such methods come under the heading of Sparse Component Analysis (SCA) [9-17].

Some methods for SCA make use of the following geometrical interpretation of sparsity. Suppose that we are processing two mixtures of two sources, so that M = N = 2 in (1). If one of the sources exists on its own for a segment of time then, if we plot one set of data against the other, the resulting phase plot will be a straight line during the time that that source is sparse.

Let us look at the simplest case of two mixtures of two sources that are non-overlapping in time. Each source is modelled as a Gaussian truncated in time:

    s_i(t) = a_i exp( -(t - t_0i)^2 / (2 σ_i^2) )    for |t - t_0i| ≤ 4σ_i    (3)
           = 0                                       for |t - t_0i| > 4σ_i

The following parameters are used for each source:

Source 1: a_1 = 1, t_01 = 0.1 s, σ_1 = 12.5 ms

Source 2: a_2 = 0.1, t_02 = 0.026 s, σ_2 = 6.25 ms

The simulated sources are shown in Figure 1(a); in this case the two sources are completely sparse. These sources are mixed, where the randomly chosen mixing coefficients {A_ij} in Equation (1) are given by the matrix

    A = [ 1.3   2    ]
        [ 1     2.85 ]    (4)

A sampling frequency of 250 Hz is used.

Figure 1(a)  Simulated Sparse Sources: top figure Source 1, lower figure Source 2

If one whitens the data using the Gram-Schmidt method and plots the mixed signal e_1 against e_2 (Equation (2)), the phase plot in Figure 1(b) is obtained. It can be seen that the points cluster in two directions, each direction corresponding to a particular source, and the principal directions are orthogonal because the underlying sources are uncorrelated.
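The construction just described (truncated Gaussian sources as in Equation (3), mixing via Equation (1) with the matrix of Equation (4), then Gram-Schmidt whitening to unit rms) can be sketched as follows; this is an illustrative reconstruction, not the authors' code:

```python
import numpy as np

# Sketch of the simulation above: two Gaussians truncated at 4 sigma (Equation (3)),
# mixed with the matrix of Equation (4), then whitened by Gram-Schmidt with each
# component normalised to unit rms, as assumed in the text.
fs = 250.0                                  # sampling frequency, Hz
t = np.arange(0, 0.16, 1.0 / fs)            # 40 samples, as in Figure 1(a)

def truncated_gaussian(t, a, t0, sigma):
    s = a * np.exp(-((t - t0) ** 2) / (2 * sigma ** 2))
    s[np.abs(t - t0) > 4 * sigma] = 0.0     # truncate outside |t - t0| <= 4 sigma
    return s

s1 = truncated_gaussian(t, 1.0, 0.1, 12.5e-3)
s2 = truncated_gaussian(t, 0.1, 0.026, 6.25e-3)
Z = np.array([[1.3, 2.0], [1.0, 2.85]]) @ np.vstack([s1, s2])   # Equation (1)

# Gram-Schmidt whitening: project out earlier components, normalise to unit rms.
E = Z.astype(float).copy()
for i in range(E.shape[0]):
    for k in range(i):
        E[i] -= (E[i] @ E[k]) / (E[k] @ E[k]) * E[k]
    E[i] /= np.sqrt(np.mean(E[i] ** 2))
```

Plotting E[1] against E[0] would reproduce the two orthogonal cluster directions of Figure 1(b).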

Figure 1(b)  Plot of e_2 against e_1 after whitening and normalisation

The principal directions of this plot are directly related to the coefficients {B_ij} in (2). Several clustering methods have been developed to detect those segments of the plot where straight lines occur, and these directions can then be used to estimate the underlying sources [9], [12]. This approach works best when all sources can be detected in the original data. However, sometimes certain sparse sources can be masked by one or more other sources, and this will result in errors in the estimation of the underlying sources. Another approach [15, 16], adopted in this paper, is to estimate one at a time, using deflation, the vectors in the phase plot corresponding to each sparse source.

Taking the simple case of two sources and two data mixtures, suppose that for the example in Figure 1(b), when e_2[n] is plotted against e_1[n], the points in the phase plot are joined up with straight lines in order of time, as shown in Figure 2.

Figure 2  Phase plot with points joined up

Let the vector e[n] be defined as

    e[n] = ( e_1[n], e_2[n] )    (5)

The velocity vectors for the phase plot are defined as

    v[n] = e[n] - e[n-1]    (6)

with the normalised heading vector given by

    r̂[n] = v[n] / |v[n]|    (7)

In [15], segments corresponding to a particular source are recognised by looking for three consecutive points M-2, M-1 and M where the magnitude of the change in normalised heading is a minimum over all the data points; this is deemed to correspond to a sparse source, and the estimate of the heading is taken as the most recent one found, r̂[M]. We then estimate the direction in the phase plot corresponding to source 1, R̂_1, from

    R̂_1 = r̂[M]    (8)
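Equations (6)-(7) can be sketched as follows; the small array E is an arbitrary stand-in for whitened data, not the paper's dataset:

```python
import numpy as np

# Sketch of Equations (6)-(7): velocity vectors v[n] = e[n] - e[n-1] and
# normalised headings r[n] = v[n] / |v[n]|, computed column-wise.
E = np.array([[0.0, 0.1, 0.3, 0.6, 0.6],
              [0.0, 0.2, 0.6, 1.2, 1.2]])    # toy whitened data e[n]
V = np.diff(E, axis=1)                       # one velocity vector per column
norms = np.linalg.norm(V, axis=0)
keep = norms > 0                             # zero-velocity points have no heading
R = V[:, keep] / norms[keep]                 # unit heading vectors
```

In this toy data all steps lie along one straight line, so every heading is the same unit vector, which is exactly the signature of a single sparse source being active.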

The estimate of source 1 is given by

    s̃_1[n] = R̂_1 · e[n]    (9)

Assuming uncorrelated sources, the directions in the phase plot corresponding to the other sparse sources are orthogonal to R̂_1, and hence s̃_1[n] will pick up contributions from source 1 only and not from the other sources. This estimate of source 1 can be subtracted from the phase plot as follows:

    z'[n] = e[n] - s̃_1[n] R̂_1    (10)

The algorithm is then applied to {z'[n]} to estimate source 2. This method, called the Minimum Heading Change (MHC) method, can be extended to mixtures of N sources; a full description of the method is contained in [15].

The MHC method was successfully applied to mixtures of uncorrelated sparse sources [15] and was extended to correlated sparse sources in [16]. However, as discussed in these two references, it can be more sensitive to noise than other methods. To illustrate the problem, Figure 3 shows the phase space plot of the whitened components corresponding to Figure 2 when noise is added to z_1 and z_2 in Equation (1).
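The deflation step of Equations (9)-(10) can be sketched as follows; the whitened data and the heading R1 are arbitrary stand-ins:

```python
import numpy as np

# Sketch of Equations (9)-(10): project the whitened data onto an estimated unit
# heading R1 to obtain source 1, then subtract that contribution (deflation).
rng = np.random.default_rng(2)
E = rng.normal(size=(2, 200))                 # stand-in whitened data e[n]
R1 = np.array([1.0, 2.0]) / np.sqrt(5.0)      # assumed unit heading for source 1
s1 = R1 @ E                                    # Equation (9): s1[n] = R1 . e[n]
Z_prime = E - np.outer(R1, s1)                 # Equation (10): deflated data z'[n]
```

After deflation the residual data has no component along R1, which is what allows the algorithm to be re-run on {z'[n]} to find the next source.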

Figure 3  Phase plot when noise is added

The directions in the phase plot corresponding to the two sources are still clear, but it can now be seen that noise is affecting the plots. This noise will cause a problem for the MHC in detecting these directions: only one heading value is chosen in each direction, and it may deviate significantly from the actual heading. The question to be asked is whether one can obtain more robustness to noise by clustering all heading vectors over the whole phase plot corresponding to each dominant source direction, similar to the clustering of points used in [12]. One could then perform a weighted average over the heading vectors to obtain a smoother estimate of the underlying sources. We will refer to this approach as the Global Method.

It should be pointed out that although the deflation approach has the advantage of being able to estimate sparse sources that are hidden in the original data, the required iterative procedure means that estimation errors can accumulate when separating more and more sources.

The aim of this paper is to derive a method based on clustering to find the dominant directions in the phase plot, and to assess any improvements of this method compared to the MHC derived in [15] when there is noise present, for data consisting of mixtures of uncorrelated sources.

2 Motivation of the Clustering Algorithm

In the MHC method, we are looking at the normalised headings in phase space

    r̂[n] = v[n] / |v[n]|    (11)

where

    v[n] = e[n] - e[n-1]    (12)

For sources which are sparse, we wish to cluster together headings corresponding to each source; this means that we are clustering headings which are deemed to be close enough to each other. The headings that are clustered may come from different, non-adjacent segments of the data.

We now need to consider how to cluster the headings corresponding to each source. Conventionally, clustering methods are applied to the original data and various iterative approaches are used, such as K-means clustering. However, in this paper a much simpler approach can be used: as the differences between adjacent data points are being analysed, one can directly associate similar heading vectors across the whole data. One way is to compute the magnitude of the differences between each pair of headings and look for close associations between these pairs. However, this approach has been found to be computationally expensive. An alternative method, which is adopted in this paper, is to cluster the headings component by component using a sorting method.

In the clustering procedure that is adopted in this paper, the following initial steps are carried out. The normalised heading vector at sample point n can be written in terms of its component values as

    r̂[n] = ( r̂_1[n], r̂_2[n], ..., r̂_N[n] )    (13)

Suppose that we calculate the magnitude of each heading component and then sort each heading component in ascending order of magnitude:

    { r̂_i^sort[m] } = sort{ |r̂_i[n]| }    i = 1, 2, ..., N    (14)

Next we plot each sorted normalised heading component as a function of the heading index, m, in the reordered sequence; the resulting plot will be different depending on the sparsity of the sources. To see this, let us look at three examples. In the first two examples, sources s1 and s2 are generated by a uniform random number generator over 100 samples.

(i) Both data inputs are identical to within a scaling constant

In the first example, suppose that the two data inputs are equal to within a scaling constant, so that there is perfect correlation between the two inputs. The resulting ordered plots of { r̂_i^sort[m] }, for the case where the scaling constant is 2, are shown for each component in Figure 4(a).

Figure 4(a)  Both inputs the same sequence of random numbers. Full line: Component 1; dashed line: Component 2.

The normalised heading components will then be identical for all data points. As expected, the sorted normalised heading components form a horizontal line in these plots, indicating that the two inputs are perfectly correlated.

(ii) Both data inputs are uncorrelated

In the second example, one data mixture is equal to source 1 (used in the previous example) and the other equal to source 2. In this example, the heading components are uncorrelated between components, so that if we plot the heading components in ascending order they are monotonically increasing, as shown in Figure 4(b).

Figure 4(b)  Inputs are different sequences of random numbers. Full line: Component 1; dashed line: Component 2.

(iii) Data inputs are mixtures of sparse sources

For the sake of example, let the two sources be u_1[n] and u_2[n], each generated by a uniform random number generator. Source 2 is shifted to the right by 90 samples, with 90 zero values added at the beginning, so that the shifted data are given by

    u_2'[n] = u_2[n - L]    for L < n ≤ 100 + L
    u_2'[n] = 0             for n ≤ L    (15)

where L = 90. Zeros are added to the end of Source 1 as follows:

    u_1'[n] = u_1[n]    for 1 ≤ n ≤ 100
    u_1'[n] = 0         for 100 < n ≤ 100 + L    (16)

Hence the two sources are now sparse. The two sources are mixed using the randomly selected mixing matrix:

    A = [ 0.799   0.498 ]
        [ 0.373   0.133 ]    (17)

The resulting phase plot of the data, after whitening, is shown in Figure 4(c), where it can be seen that there is a mixture of straight line segments, indicated by 1 and 2, and a random pattern elsewhere.

Figure 4(c)  Gram-Schmidt plot of the two sources

The resulting ordered plots of { r̂_i^sort[m] } are shown in Figure 4(d).
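The construction of the sparse sources and their mixing (Equations (15)-(17)) can be sketched as follows; the random seed is an arbitrary choice:

```python
import numpy as np

# Sketch of Equations (15)-(17): two length-100 uniform random sources, Source 2
# shifted right by L = 90 samples, both zero-padded so each is sparse, then mixed
# with the matrix of Equation (17).
rng = np.random.default_rng(3)
L, n = 90, 100
u1 = rng.uniform(size=n)
u2 = rng.uniform(size=n)
u1p = np.concatenate([u1, np.zeros(L)])      # Equation (16): zeros appended
u2p = np.concatenate([np.zeros(L), u2])      # Equation (15): zeros prepended
A = np.array([[0.799, 0.498],
              [0.373, 0.133]])               # Equation (17)
Z = A @ np.vstack([u1p, u2p])                # two mixtures of the sparse sources
```

Only samples 91-100 contain both sources at once, which is what produces the short random pattern between the two straight-line segments in Figure 4(c).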

Figure 4(d)  Mixture of Sparse Sources. Full line: Component 1; dashed line: Component 2.

In this third case, we can see that the resultant ordered normalised heading components consist of monotonically increasing values, characteristic of randomness, along with horizontal segments reflecting the data segments 1 and 2 in Figure 4(c) where one source exists on its own. It is the heading component values in these horizontal sections that we are interested in.

Now the heading components in each of the segments A and B in Figure 4(d) will come from two different sources, as will those in C and D. But the question is which of the headings in A are associated with C and D? A similar question can be asked for the headings in segment B. As there are only two sources in this example, and the magnitudes of the normalised heading vectors are one, it can be deduced that segment A is associated with segment C and B with D. However, if there are three or more sources, it would not be so easy to make this association; to address this problem we need to reorder the heading components as a function of time and look for sample values where both heading components

are simultaneously associated with one of the horizontal components in Figure 4(d). The heading vectors associated with each source can then be averaged in some way to reduce the effects of noise.

The potential advantage of this clustering method over the local MHC method is that, in the former method, we can associate headings that are not adjacent; in principle, estimating the average heading over several headings should be more accurate in the presence of noise than using the more localised MHC method.

3 Development of the Clustering Algorithm

The steps of the clustering algorithm are as follows:

(1) Input the whitened data {e_i[n]}, (i = 1, ..., N).

(2) For each component, calculate the velocity vectors from adjacent data points:

    v[n] = {v_i[n]} = {e_i[n] - e_i[n-1]}

To illustrate the clustering approach used in this work, we will use the following simplified example where there are 10 headings to sort into clusters and the number of sources N = 2, with the velocity components (v_1[n], v_2[n]) given in Table 1.

    Heading Number, n    v_1[n]    v_2[n]
            1               1         2
            2              -2         3
            3               1         2
            4               2         4
            5               5         3
            6              -1        -2
            7              -4         6
            8               5         5
            9              -4         6
           10               5        10

    Table 1  Velocity Components

(3) Calculate the heading vector

    r̂[n] = v[n] / |v[n]|

(4) Take the absolute value of each component of the normalised heading vector: |r̂_i[n]|.

After carrying out Steps (3) and (4) we obtain the magnitudes of the heading components as shown in Table 2.

    Normalised Heading Number, n    |r̂_1[n]|    |r̂_2[n]|
            1                        0.4472      0.8944
            2                        0.5547      0.8321
            3                        0.4472      0.8944
            4                        0.4472      0.8944
            5                        0.8575      0.5145
            6                        0.4472      0.8944
            7                        0.5547      0.8321
            8                        0.7071      0.7071
            9                        0.5547      0.8321
           10                        0.4472      0.8944

    Table 2  Magnitude of Normalised Heading Components

(5) For each component, sort the absolute values of the normalised heading components in ascending order:

    { r̂_i^sort[m] } = sort{ |r̂_i[n]| }

Note the following mapping between n and m for each component i:

    f_1i[m] = n

The sorted headings and the arrays {f_1i[m]} for the above example are shown in Table 3:

    Sorted Normalised Heading Number, m    r̂_1^sort[m]    f_11[m]    r̂_2^sort[m]    f_12[m]
            1                                0.4472          1          0.5145          5
            2                                0.4472          3          0.7071          8
            3                                0.4472          4          0.8321          2
            4                                0.4472          6          0.8321          7
            5                                0.4472         10          0.8321          9
            6                                0.5547          2          0.8944          1
            7                                0.5547          7          0.8944          3
            8                                0.5547          9          0.8944          4
            9                                0.7071          8          0.8944          6
           10                                0.8575          5          0.8944         10

    Table 3  Sorted Heading Components
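Steps (3)-(5) can be sketched for the Table 1 velocities as follows; note that argsort plays the role of the mapping f_1i[m], except that argsort indices are 0-based whereas the tables are 1-based:

```python
import numpy as np

# Sketch of Steps (3)-(5): normalised heading magnitudes (Table 2), then a
# per-component ascending sort whose argsort indices record which original
# sample n each sorted position m came from (the mapping f_1i[m]).
V = np.array([[1, -2, 1, 2, 5, -1, -4, 5, -4, 5],
              [2,  3, 2, 4, 3, -2,  6, 5,  6, 10]], dtype=float)
R_abs = np.abs(V) / np.linalg.norm(V, axis=0)        # |r_i[n]|, as in Table 2
order = np.argsort(R_abs, axis=1, kind="stable")      # sorted index m -> original n
R_sorted = np.take_along_axis(R_abs, order, axis=1)   # r_i^sort[m], as in Table 3
```

The repeated values 0.4472 at the start of the sorted component-1 sequence are the flat "cluster" runs that Step (6) will detect.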

(6) Cluster the heading components by looking at the differences between adjacent values of r̂_i^sort[m], given by r̂_i^sort[m] - r̂_i^sort[m-1]. Choose a threshold ε. If

    r̂_i^sort[m] - r̂_i^sort[m-1] ≤ ε

then C_i[m] = 1; else C_i[m] = 0. Let C be a matrix for which C_i[m] is the element in the m-th row and i-th column.

In the above example there is assumed to be no noise. In practice there will be noise, and that is the reason why we allow the difference between adjacent values of r̂_i^sort[m] to be less than some non-zero threshold ε; we will discuss later how to choose ε. For our example, the values of {C_i[m]} and f_1i are shown in Table 4.

    Sorted Normalised Heading Number, m    C_1[m]    f_11[m]    C_2[m]    f_12[m]
            1                                 0          1          0          5
            2                                 1          3          0          8
            3                                 1          4          0          2
            4                                 1          6          1          7
            5                                 1         10          1          9
            6                                 0          2          0          1
            7                                 1          7          1          3
            8                                 1          9          1          4
            9                                 0          8          1          6
           10                                 0          5          1         10

    Table 4  C values
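Step (6) can be sketched as follows, using the sorted component-1 values from Table 3; the threshold value is an arbitrary illustrative choice:

```python
import numpy as np

# Sketch of Step (6): flag adjacent sorted values whose difference is within a
# threshold eps; C[m] = 1 marks a continuation of a run of similar headings,
# and C[1] is always 0 because the first entry has no predecessor.
def cluster_flags(r_sorted, eps):
    C = np.zeros_like(r_sorted, dtype=int)
    C[1:] = (np.diff(r_sorted) <= eps).astype(int)
    return C

r1_sorted = np.array([0.4472, 0.4472, 0.4472, 0.4472, 0.4472,
                      0.5547, 0.5547, 0.5547, 0.7071, 0.8575])
C1 = cluster_flags(r1_sorted, eps=0.01)
```

For this input C1 reproduces the C_1 column of Table 4: the zeros at m = 1 and m = 6 separate the two runs of equal values.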

(7) Look for the column j in {C_i[m]} with the largest number of adjacent 1's.

Notes

(i) In Table 4 it can be seen that there are two such clusters of 1's: C_1[m] for 2 ≤ m ≤ 5 and C_2[m] for 7 ≤ m ≤ 10. In this case the software picks up the cluster of C_1 values, but the same final result will be obtained if the other cluster is picked first.

(ii) Note that C_1[1] = C_1[6] = C_2[3] = C_2[6] = 0. The reason for putting zeros at these points in the Table is to separate clusters of 1's corresponding to different headings; for example, if C_1[6] were put equal to 1 then there would be a continuous cluster of 1's from C_1[2] to C_1[8], implying that all these components come from the same heading, which clearly they do not.

(8) Now that we have identified a clustering of a heading component in Table 4, we need to associate these components with the time-ordered components shown in Table 2. This is where the values of f_11[m] and f_12[m] in Table 4 are used. In this Table, the following normalised sorted heading values for r̂_1 are clustered together: r̂_1^sort[2], r̂_1^sort[3], r̂_1^sort[4] and r̂_1^sort[5]. Following on from Note (ii) in Step (7) above, we should also include r̂_1^sort[1] in the clustering. Using the values of f_11[m] in this Table, these sorted heading components correspond to the following unsorted heading components: r̂_1[1], r̂_1[3], r̂_1[4], r̂_1[6] and r̂_1[10]. We now define a matrix C^U, where the element in the n-th row and i-th column is C^U_i[n]; we fill in 1's in column 1 at rows 1, 3, 4, 6 and 10, indicating that the clustering of the first heading component has taken place, as shown in Table 5.

    Normalised Heading Number, n    C^U_1[n]    C^U_2[n]
            1                          1
            2                          0
            3                          1
            4                          1
            5                          0
            6                          1
            7                          0
            8                          0
            9                          0
           10                          1

    Table 5  C^U

(9) Now, we have found that the largest cluster for component 1 corresponds to the original sample numbers 1, 3, 4, 6 and 10. We now need to determine how many heading values at these sample numbers for component 2 are also in a cluster. To determine this, look at Table 4. We can see that the values of C_2 corresponding to these time points are 0, 1, 1, 1, 1. However, for the reason stated in Step 7(ii), we need to put C_2[6] = 1, as this is part of the same cluster, corresponding to C^U_2[1] = 1. Hence the second column of the above table can be filled in as follows:

    Normalised Heading Number, n    C^U_1[n]    C^U_2[n]
            1                          1           1
            2                          0           0
            3                          1           1
            4                          1           1
            5                          0           0
            6                          1           1
            7                          0           0
            8                          0           0
            9                          0           0
           10                          1           1

    Table 6  Time ordered values for C^U

(10) In Table 6, we are looking for rows where all elements {C^U_i} are 1; in this case both components are part of a cluster. This can be achieved by performing a logical AND of the elements of each row to produce the following Table:

    Normalised Heading Number, n    D[p]
            1                        1
            2                        0
            3                        1
            4                        1
            5                        0
            6                        1
            7                        0
            8                        0
            9                        0
           10                        1

    Table 7  D[p], where D[p] = AND( C^U_1[p], C^U_2[p] )

(11) Each row, p, where D[p] = 1 corresponds to a heading in the same cluster; for the above example, the following velocity components form a cluster: v[1], v[3], v[4], v[6] and v[10], which agrees with Table 2.

Comment: one could look for other clusters of headings in Table 4, but in this paper we use a deflation approach: each heading is estimated iteratively and the corresponding estimated source is subtracted from the data as in Equation (10); the algorithm is then applied to the data {z'[n]} and the clustering method is applied again to the remaining data. This is repeated until there are no further sources to estimate. The subtraction process should make it easier to isolate the other components.
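Steps (9)-(11) can be sketched as follows, using the indicator columns of Tables 5 and 6:

```python
import numpy as np

# Sketch of Steps (9)-(11): the time-ordered cluster indicators C^U for each
# component are combined with a logical AND; rows where all components agree
# identify the velocity vectors belonging to one cluster (Table 7).
CU = np.array([[1, 0, 1, 1, 0, 1, 0, 0, 0, 1],    # C^U for component 1 (Table 5)
               [1, 0, 1, 1, 0, 1, 0, 0, 0, 1]])   # C^U for component 2 (Table 6)
D = np.logical_and.reduce(CU, axis=0).astype(int)  # D[p], as in Table 7
cluster_members = np.flatnonzero(D) + 1            # 1-based heading numbers
```

The result is the cluster {v[1], v[3], v[4], v[6], v[10]} found in Step (11).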

4 Estimation of the Heading Vector

Let v^a = ( v^a_1, v^a_2, ..., v^a_N ) be the actual non-normalised heading vector (which we also call the velocity vector). Suppose that the clustering algorithm has been carried out on each component, and let us look at velocity component i, v^a_i. Let us assume that there are J velocity vectors within a cluster.

Let r̂^e = { r̂^e_i } denote the estimate of the heading from a particular cluster of headings. We now need to estimate { r̂^e_i } from the set of velocity vectors that have been found from the clustering method: {v'[1]}, {v'[2]}, ..., {v'[J]}, where v'_i[n] is the i-th component of the n-th velocity vector. Each velocity component will be affected by noise. One possibility for determining the i-th component of the estimated heading, r̂^e_i, is to perform a direct average over j of v'_i[j]. However, this is not optimal, for the following reason. The relation between the clustered and actual i-th velocity components is given by

    v'_i[j] = v^a_i[j] + n_i[j]    (18)

where j refers to this velocity being the j-th member of the cluster and n_i[j] is a sample of noise, assumed Gaussian. The actual velocity to noise ratio is given by

    VNR_i[j] = |v^a_i[j]| / σ    (19)

where σ is the standard deviation of the noise. The corresponding relation for the k-th member of the cluster is

    v'_i[k] = v^a_i[k] + n_i[k]    (20)

with velocity to noise ratio

    VNR_i[k] = |v^a_i[k]| / σ    (21)

Because the magnitudes of any two velocities in the cluster may be different, in general

    VNR_i[j] ≠ VNR_i[k]    (22)

Noise will affect smaller magnitude velocities more than those with larger magnitude. Hence any estimator should take this into account by putting more weight on larger heading components than on smaller heading components when averaging over all components in a cluster, because the VNR for the larger headings is larger. This problem is addressed in [18] for the averaging of evoked potentials, where it is shown that the estimate, Ṽ_i, of the i-th velocity component is given by

    Ṽ_i = ( Σ_{j=1}^{J} M[j] · v'_i[j] ) / ( Σ_{j=1}^{J} (M[j])² )    (23)

where

    M[j] = √( Σ_{i=1}^{N} (v'_i[j])² )

When M[j] = 1 for all j, this reduces to a straight average over all headings. The above processing is applied to each velocity component (i = 1, ..., N), so that the estimate of the velocity vector becomes

    Ṽ = ( Ṽ_1, Ṽ_2, ..., Ṽ_N )

The estimate of the normalised heading vector is then

    R̂^e = Ṽ / |Ṽ|    (24)

R̂^e is then used in Equations (8)-(10) in place of R̂_1.
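Equations (23)-(24) can be sketched as follows, taking the weight M[j] to be the magnitude of the j-th clustered velocity vector (one plausible reading of the weighting in [18]); the clustered vectors below are those found in the Section 3 example:

```python
import numpy as np

# Sketch of Equations (23)-(24): a magnitude-weighted combination of the clustered
# velocity vectors, then normalisation to a unit heading. The weight M[j] is taken
# here as the magnitude of the j-th velocity vector (an assumed reading of [18]).
def heading_estimate(Vc):
    """Vc: N x J array, one clustered velocity vector per column."""
    M = np.linalg.norm(Vc, axis=0)              # weights M[j]
    V_tilde = (Vc @ M) / np.sum(M ** 2)         # Equation (23), per component
    return V_tilde / np.linalg.norm(V_tilde)    # Equation (24)

# Cluster from the Section 3 example: v[1], v[3], v[4], v[6], v[10]
Vc = np.array([[1.0, 1.0, 2.0, -1.0,  5.0],
               [2.0, 2.0, 4.0, -2.0, 10.0]])
R_hat = heading_estimate(Vc)
```

The larger vectors dominate the weighted sum, reflecting their higher velocity-to-noise ratio.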

5 Input Parameters

Let the N components of the velocity vector be written as

    v[n] = ( v_1[n], v_2[n], ..., v_N[n] )

In order to avoid the effects of spurious noise, a velocity vector is accepted at sample point n if

    V_max[n] ≥ v_th · v_max    (25)

where V_max[n] = max( |v_1[n]|, |v_2[n]|, ..., |v_N[n]| ), 0 ≤ v_th ≤ 1 is a chosen threshold and v_max = max( |v[1]|, |v[2]|, ..., |v[M]| ) is the maximum value of the magnitude of the velocity vector over all M sample points. This threshold is used in both the MHC and Global methods.

We now need to consider the choice of ε in Step 6 of the general clustering algorithm described in Section 3. This parameter is used to determine whether two heading components are associated with the same source. Looking at Figure 4(d), for example, it can be seen that in the regions where one source is on its own the ideal value for ε is zero. In practice, because of the effects of noise and other sources, a non-zero value for ε should be chosen. Referring to Figure 4(b), where the ordered headings are plotted for random noise, it can be seen that the graph increases in a non-linear way; this can be crudely approximated as linear, where the sorted normalised heading is given approximately by

    r̂^sort[m] ≈ m / M    (26)

where M is the number of headings. Hence the difference between adjacent ordered normalised headings can be approximated by

    r̂^sort[m] - r̂^sort[m-1] ≈ 1 / M    (27)

This implies that, for association between two sorted headings, we must choose the parameter ε such that

    ε = γ / M    (28)

where γ ≤ 1. If ε is chosen to be too small, then associations between heading components belonging to the same source will not occur; if too large, then too many false associations will occur between headings that are not from the same source. According to Step 10 in Section 3, a cluster is only declared if an association is found between all components of the heading; hence randomly associated heading components will tend to AND to zero. Extensive simulations have been carried out to optimise the parameter γ in (28), and it has been found that a good compromise value to use is γ = 1; this is used in all the simulations and data analysis carried out in this paper.

6 Results

In this Section, the performance of the Global Method is compared to the following three methods: (i) MHC [15], (ii) Fast-ICA [19] and (iii) Clusterwise PCA [9]. A selection of three sets of data is used: (i) sources that are purely sparse and uncorrelated, (ii) sources that are locally sparse and weakly correlated, and (iii) experimental abdominal and thoracic ECG data taken from an expectant mother. The aim here is to demonstrate when the proposed method works well compared with standard techniques and also to discuss the limitations of the method. It is of particular interest to compare the robustness to noise of the proposed technique with that of the other methods.

Fast-ICA is an implementation of the ICA method; it is not specifically designed for sparse signals and can be applied to mixtures of non-sparse sources. It is found, for the signals studied in this work, that the best results are obtained with the Fast-ICA method if one uses the deflation approach and the Gaussian non-linearity [19]. The Clusterwise PCA method is specifically designed to be applied to mixtures of sparse sources; clustering takes place using minimum component analysis to find the directions in phase space corresponding to each source [9].

Monte Carlo simulations are carried out to compare the robustness of all the methods to noise. Monte Carlo simulations are also carried out for the Fast-ICA and Clusterwise PCA methods when applied to clean signals, because the former method randomly initialises the weights and the latter uses a randperm function to define the hyperplanes in the search process.

The following procedure is adopted in this paper to arrive at a figure of merit that quantifies the difference between the actual and estimated sources. Firstly, when comparing actual and estimated sources, to take into account possible scaling of the estimates s̃, they are normalised so that their rms values are 1. The same normalisation procedure is carried out with the actual sources. Suppose that the actual sources are given by s_i[m], where m is a sample index. Suppose also that, at the q-th Monte Carlo run, the estimates are s̃_j^q[m]. Now, when applying BSS methods, the estimate may be a scaled version of the actual source, but it could also be inverted. To take into account possible inversion of the estimates, we compute the correlation coefficient between the j-th estimate and the i-th actual source using the corrcoef function of MATLAB. Suppose this value is c_ij^q.
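A sketch of the normalisation and correlation step, using NumPy's `corrcoef` in place of MATLAB's; the function name and the row-per-signal array layout are assumptions for illustration.

```python
import numpy as np

def correlation_table(S, S_hat):
    """Correlation coefficients between actual sources and estimates.

    S, S_hat : (M, n_samples) arrays, one signal per row.  Each row is
    first normalised to unit rms; C[i, j] is then the correlation
    coefficient between actual source i and estimate j.
    """
    S = S / np.sqrt(np.mean(S**2, axis=1, keepdims=True))
    S_hat = S_hat / np.sqrt(np.mean(S_hat**2, axis=1, keepdims=True))
    M = S.shape[0]
    C = np.empty((M, M))
    for i in range(M):
        for j in range(M):
            C[i, j] = np.corrcoef(S[i], S_hat[j])[0, 1]
    return C
```

A permuted or sign-flipped estimate shows up as an entry of C close to +1 or -1 in the corresponding row, which is exactly what the pairing procedure below exploits.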

At each Monte Carlo run, define the matrix [C^q] (i = 1,...,M; j = 1,...,M) whose (i,j)-th element is c_ij^q. An association between each source and estimate is determined by finding the row r and column s of the entry in the matrix [C^q] with the largest magnitude |c_rs^q|; the actual source r is then associated with the estimate s. The error between source r and estimate s is computed as follows:

    ε_s^q[n] = s_r[n] − s̃_s[n]    if c_rs^q > 0
    ε_s^q[n] = s_r[n] + s̃_s[n]    if c_rs^q < 0    (29)

where, in the above equations, it is assumed that both {s_r[n]} and {s̃_s[n]} have been normalised to unit rms value. The second condition is needed to take into account the possibility that the estimate s̃_s[n] has been inverted. The matrix elements in row r and column s of [C^q] are then eliminated, and the above procedure is carried out on the remaining rows and columns of the matrix to determine ε^q for the other actual sources.

The procedure in the previous paragraph is carried out for each Monte Carlo run. There are two figures of merit that can be used to assess the performance of the processing methods:

(1) Total RMS Estimation Error

The RMS estimation error at each time point, n, can be computed for each source by averaging (ε_s^q[n])^2 over all Monte Carlo runs and taking the square root:

    RMS[n] = sqrt( (1/Q) Σ_{q=1}^{Q} (ε_s^q[n])^2 )    (30)

where Q is the total number of Monte Carlo runs.
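The pairing-and-elimination procedure, the sign correction of Equation (29) and the RMS figures of merit of Equations (30)-(32) can be sketched as follows; this is a simplified stand-in, and the function and variable names are illustrative.

```python
import numpy as np

def match_and_errors(S, S_hat, C):
    """Greedy source/estimate pairing: repeatedly take the largest-magnitude
    entry of [C], pair that source and estimate, flip the estimate's sign
    if the correlation is negative (Eq. (29)), and eliminate the matched
    row and column before continuing."""
    C = C.astype(float).copy()
    M = C.shape[0]
    errors = {}
    for _ in range(M):
        r, s = np.unravel_index(np.argmax(np.abs(C)), C.shape)
        sign = 1.0 if C[r, s] > 0 else -1.0
        errors[r] = S[r] - sign * S_hat[s]   # error signal for source r
        C[r, :] = 0.0                        # eliminate matched row and column
        C[:, s] = 0.0
    return errors

def rms_figures(eps_runs):
    """RMS[n] over Monte Carlo runs (Eq. (30)), then the total and maximum
    RMS errors of Eqs. (31) and (32).  eps_runs : (Q, M_samples) array."""
    rms_n = np.sqrt(np.mean(eps_runs**2, axis=0))    # RMS[n]
    return np.sqrt(np.mean(rms_n**2)), rms_n.max()   # RMS_tot, RMS_max
```

Note that the greedy choice of the single largest |c_rs| at each step is not a globally optimal assignment, but it matches the elimination procedure described above.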

The total rms error over all data points can be calculated from

    RMS_tot = sqrt( (1/M) Σ_{n=1}^{M} RMS[n]^2 )    (31)

(2) Maximum RMS Estimation Error

This is calculated from the maximum over n of RMS[n] for each component:

    RMS_max = max_n { RMS[n] }    (32)

Ten sets of 1000 Monte Carlo runs are carried out. RMS_tot and RMS_max are averaged over these 10 sets, and the corresponding standard deviations are also calculated to assess the significance of the results.

To quantify the amount of noise that is added, the following convention is used. From Equation (1), the contribution of the j-th source to the i-th measurement is given by

    p_ij[n] = A_ij s_j[n]

We can quantify the peak value (positive or negative) of the contribution of source j to measurement i by P_ij^MAX, where

    P_ij^MAX = max_n | p_ij[n] |

Now the performance of the algorithm will depend on how the noise standard deviation compares to the minimum value of P_ij^MAX, which is the smallest contribution of any source to the data mixtures:

    R = min_{i,j} P_ij^MAX    (33)

The noise standard deviation will be quoted as a percentage of R.

6.1 Example 1: Mixtures of Sparse Sources

It is found that, for no noise, the MHC and Global methods are relatively insensitive to the choice of input parameters, and the following are chosen for the sake of example:

α = 1 (Equation (28));
v_th = 0.8 (Equation (25)) for the MHC;
v_th = 0.4 (Equation (25)) for the Global method.

In the absence of noise, it is found that the MHC, Global and Clusterwise PCA methods give almost perfect reconstruction of the original sources. The sources here are purely sparse, and a good performance is to be expected for these algorithms, which have been designed for such sources. However, it is found that there are significant estimation errors for both sources when applying Fast-ICA; in Figures 5(a) and 5(b) the estimated and actual sources 1 and 2 are shown; both the estimated and original sources have been normalised to unit rms values.

[Figure 5(a)]
Figure 5(a) - Comparison between the Original (Full Line) and Fast-ICA Estimate (Dashed Line) for Source 1

[Figure 5(b)]
Figure 5(b) - Comparison between the Actual (Full Line) and Fast-ICA Estimate (Dashed Line) for Source 2

For Fast-ICA, before applying PCA, the mean of the data is subtracted, which introduces correlations between the whitened sources; a consequence of this is that the estimate of one source is contaminated by a contribution from the other source. One can characterise the estimation error by quantifying the amplitude of the contaminating peak compared with the amplitude of the actual peak, expressed as a percentage; for Source 1 this percentage is 27% and the corresponding value for Source 2 is 4.7%.

Simulations are next carried out with noise added to the measured data with standard deviation 0.005, which is 2.5% of R defined in Equation (33). In this case, it is found that the performance of the Global method is sensitive to changes in the choice of v_th. The maximum of the RMS error, RMS_max (32), averaged over 10 sets of 1000 Monte Carlo runs, is tabulated as a function of v_th in Table 8. In this and subsequent tables, the numbers in brackets are the standard deviations of the rms errors computed over 10 sets of 1000 Monte Carlo runs.

Method               Source 1 (×10⁻³)   Source 2 (×10⁻³)
Fast-ICA             112 (0.102)        35.2 (0.391)
Clusterwise PCA      6.765 (0.102)      27.6 (0.441)
MHC, v_th = 0.7      11.7 (0.379)       27.0 (0.216)
Global, v_th = 0.3   23.5 (3.06)        29.5 (1.19)
Global, v_th = 0.35  9.74 (2.55)        26.8 (0.241)
Global, v_th = 0.4   7.24 (0.495)       26.8 (0.241)

Table 8 - Maximum of RMS Errors, noise sd = 0.005

It can be seen that the Global method is sensitive to the parameter v_th, with a 68% reduction in maximum error for Source 1 as v_th is increased from 0.3 to 0.4, and a corresponding reduction of 12% for Source 2. With v_th = 0.4, the Global method has a performance comparable to Clusterwise PCA and better than MHC and Fast-ICA. For v_th = 0.45, it is found that the Global method breaks down, as clusters are unable to be formed.

Also looking at Table 8, it can be seen that, for the larger Source 1, the Clusterwise PCA and Global methods have the best performances, although one has to set v_th for the Global method to an appropriate value. The MHC has a worse performance than these two methods because it only tries to find the best single heading to determine the dominant direction, whilst the Global method operates on a cluster of headings and the Clusterwise PCA method operates on the whole data. The Fast-ICA method still has the worst performance because of the subtraction of the mean, but its performance is not significantly worse than with no noise, indicating that it is more robust to noise than the other techniques. For the smaller Source 2, the four methods have comparable performances.

When the noise standard deviation is increased to 0.01, which is 5% of the maximum magnitude of the smaller source in the mixed data, the results in Table 9 are obtained for the maximum rms errors of the various methods.

Method               Source 1 (×10⁻³)   Source 2 (×10⁻³)
Fast-ICA             110.7 (0.300)      53.4 (0.772)
Clusterwise PCA      15.5 (0.257)       54.0 (1.04)
MHC, v_th = 0.8      29.4 (0.771)       52.0 (0.410)
Global, v_th = 0.3   34.5 (1.47)        52.7 (1.09)

Table 9 - Maximum of RMS Errors, noise sd = 0.01

All methods now give comparable performances for the estimation of Source 2. For Source 1, Clusterwise PCA yields the best results, with MHC having the second-best performance. The Global method is now fairly insensitive to changes in the parameter v_th and yields worse results than the MHC method: averaging over noisy vectors does not yield better performance than the MHC. The Global method breaks down for v_th ≥ 0.4, where it is unable to form clusters.

Overall, taking into account both clean and noisy data, the Clusterwise PCA method has the best performance of the methods tested on this dataset. The Global method has comparable performance to the Clusterwise PCA method for clean data, and also for a noise standard deviation of 0.005, as long as an appropriate value for v_th is chosen. For the largest noise standard deviation chosen, the Global method has an inferior performance compared with the Clusterwise PCA and MHC methods.

6.2 Example 2: Mixture of Locally Sparse Sources

Sound signals, for example speech and music, can in many cases be considered as approximately sparse, so it is of interest to see whether the proposed method can separate out the individual sources from mixtures of sound signals. The sources will, in general, be overlapping and correlated. The set of sources that we will use is taken from [20]. Further details concerning these data can be found in References [21] and [22].

The source files are taken from the development data and are wdrums_src_1, wdrums_src_2 and wdrums_src_3; these all represent music including drums. The sampling frequency is 16 kHz. The following randomly chosen mixing matrix is used:

        [ 0.91935141   0.35931601   0.37350394 ]
    A = [ 0.92906443   0.42466101   0.56348773 ]    (34)
        [ 0.29719212   0.15967404   0.68715652 ]

Data are taken between samples 19000 and 20000. The correlation matrix of the sources, over this data segment, is given by

    [ 1.0000   0.1937   0.0182 ]
    [ 0.1937   1.0000   0.0077 ]    (35)
    [ 0.0182   0.0077   1.0000 ]

where the (i,j)-th element is the correlation between sources i and j.

The sources are not localised, as they were in Example 1, and so it is found that a better figure of merit for assessing estimation errors is the total RMS estimation error (31). It is found that the MHC method performs optimally for v_th = 0.7 in (25). The RMS estimation errors for the Global method are calculated for v_th varying from 0.3 to 0.6 in (25). In Table 10, the performance of the Global method is compared to the MHC, Clusterwise PCA and Fast-ICA.

Method               Source 1 (×10⁻³)   Source 2 (×10⁻³)   Source 3 (×10⁻³)
Fast-ICA             7.77 (0.113)       12.9 (0.115)       7.32 (0.121)
Clusterwise PCA      20.4 (0.067)       16.9 (0.0553)      25.3 (0.0230)
MHC, v_th = 0.7      7.19               1.85               3.52
Global, v_th = 0.3   1.30               4.97               0.304
Global, v_th = 0.4   0.868              5.38               0.526
Global, v_th = 0.5   1.088              5.26               0.135
Global, v_th = 0.6   1.96               4.57               0.466

Table 10 - RMS Errors (clean signal)
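For concreteness, forming the mixtures and setting the noise level as a percentage of R can be sketched as follows. Gaussian stand-in sources are used here purely for illustration (the paper uses the audio data described above), and the variable names are assumptions.

```python
import numpy as np

# Mixing matrix of Equation (34).
A = np.array([[0.91935141, 0.35931601, 0.37350394],
              [0.92906443, 0.42466101, 0.56348773],
              [0.29719212, 0.15967404, 0.68715652]])

rng = np.random.default_rng(0)
S = rng.standard_normal((3, 1000))   # stand-in sources, one row per source
X = A @ S                            # mixtures, one row per measurement

# Source correlation matrix, cf. Equation (35).
corr = np.corrcoef(S)

# Noise level quoted as a percentage of R = min_{i,j} P^MAX_ij (Eq. (33)).
P_max = np.abs(A[:, :, None] * S[None, :, :]).max(axis=2)   # P^MAX_ij
R = P_max.min()
sd = 0.20 * R                        # e.g. noise sd equal to 20% of R
X_noisy = X + sd * rng.standard_normal(X.shape)
```

The broadcasted product A[:, :, None] * S[None, :, :] forms p_ij[n] = A_ij s_j[n] for every (i, j, n) in one step, so P^MAX_ij and hence R drop out of a single max/min reduction.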

The Global method has the best performance of the methods tested when estimating Sources 1 and 3, but MHC is better at estimating Source 2. Fast-ICA has a worse performance than these two methods, and Clusterwise PCA has the worst performance. Clusterwise PCA is, understandably, not working well because the sources are not purely sparse. The subtraction of the mean prior to applying Fast-ICA causes additional correlations to appear between sources; in addition, even without this subtraction, it can be seen from the matrix (35) that there is significant correlation between sources 1 and 2.

In this example, we have a mixture of signals where there are no segments in which one signal exists on its own; hence the sources are not purely sparse. However, there are segments where each source is dominant. When applying the MHC or Global methods, the estimate of the heading for one source will contain contaminations from the other sources. The Global method averages over these contaminations, treating them as noise, so in theory it should be better than the local MHC, which chooses one heading only; this is the case for Sources 1 and 3, but not for Source 2, which has small amplitude, so averaging here does not help because of the relatively large values of the contaminations.

In Figure 6, the normalised sources and the errors (29) in the estimates from the Global method (v_th = 0.5 in (25)) are plotted together as a function of sample number. It can be seen that the errors for Sources 1 and 3 are almost imperceptible, whilst the errors for Source 2 are more significant.


[Figure 6]
Figure 6 - Comparison between actual source (full line) and error in estimated source (dashed line) for each source. Top: Source 1; Middle: Source 2; Bottom: Source 3. All actual and estimated sources have been normalised to an rms value of 1.

In Table 11, the results are shown for the case where noise with standard deviation 0.0005 is added to the mixtures, corresponding to 20% of R given in Equation (33).

Method               Source 1 (×10⁻³)   Source 2 (×10⁻³)   Source 3 (×10⁻³)
Fast-ICA             7.52 (0.267)       17.4 (0.201)       6.82 (0.214)
Clusterwise PCA      22.7 (0.116)       23.1 (0.281)       23.4 (0.065)
MHC, v_th = 0.7      7.83 (0.094)       13.6 (0.012)       7.10 (0.095)
Global, v_th = 0.3   4.58 (0.014)       13.6 (0.009)       3.65 (0.016)
Global, v_th = 0.4   4.32 (0.015)       13.6 (0.008)       3.37 (0.017)
Global, v_th = 0.5   4.08 (0.018)       13.6 (0.013)       3.11 (0.019)

Table 11 - RMS Errors (noise sd = 0.0005)

Comparing Tables 10 and 11, it can be seen that the Global and MHC methods are more sensitive to noise than the Clusterwise PCA and Fast-ICA methods. The reason for this is that

the Fast-ICA and Clusterwise PCA methods process the whole data. The localised MHC processes one heading at a time and hence does not use the whole data to process the signal; this makes it more sensitive to noise than Clusterwise PCA and Fast-ICA. The Global method clusters part of the data, so it has better performance in noise than the local MHC for Sources 1 and 3 and is comparable with the MHC for Source 2, consistent with the case when no noise is added.

Finally, it should be noted that, when the MHC and Global methods are applied to the clean signal, the sources are estimated in the order 2, 1, 3. Source 2 has the highest frequency components, followed by Source 1 and then Source 3. This can be explained as follows: both the MHC and Global methods operate on the headings, which are taken from differences between adjacent data points. This differencing operation is equivalent to applying a high-pass filter to the data, which accentuates the highest frequency components. This may also explain the relative sensitivity of the MHC and Global methods to noise.

6.3 Example 3: 8-Lead Thoracic and Abdominal ECG Data from an Expectant Mother

In order to compare the performances of the Fast-ICA and Global methods, data taken from the DaISy database [23] will be analysed. These data consist of ECG signals taken from an expectant mother, recorded on 8 leads, leads 1 to 5 being abdominal and 6 to 8 thoracic. The first 1000 samples of the data are chosen; there is some uncertainty about the sampling frequency, also pointed out in [24], but it is probably 250 Hz. Data from a typical abdominal lead is shown in Figure 7.

[Figure 7: voltage (arbitrary units) versus sample number]
Figure 7 - Lead 1

These data represent a more demanding test for the proposed technique: for multi-lead ECG data, the morphology of each source can change with time, along with the amplitude. In addition, there is noise present, which will affect the performance of the method. In previous work [15,16], the localised MHC method gave maternal and fetal outputs comparable with those of the ICA method.

The component most resembling the fetal signal is chosen by visual inspection. It is found that, when applying the Global method, the quality of the extracted fetal signal is sensitive to the choice of v_th. In Figures 8(a) to 8(c), the component that looks most fetal is displayed for choices of the threshold parameter v_th = 0.1, 0.2 and 0.3 in (25). It can be seen that for v_th = 0.1 and 0.3 a good-quality fetal component is extracted. However, for v_th = 0.2 the quality of the fetal component is much worse. In Figure 8(d), the corresponding component extracted using Fast-ICA is shown, where the deflation method is used along with the Gaussian non-linearity. A good-quality fetal component is extracted, comparable to those extracted by the Global method for v_th = 0.1 and 0.3. Now, when applying Fast-ICA, one has the option of the deflation method or the symmetric

method; in addition, one has the choice of the following non-linearities: Gaussian, pow3, tanh and skew [7,8]. It is found that Fast-ICA consistently extracts fetal components with qualities similar to Figures 8(a) and 8(c), regardless of the combination of method and non-linearity being used, and also independently of the random choice of initial weights. Hence, although it is possible to extract fetal components using the Global method that are of similar quality to those extracted using Fast-ICA, the latter method is more robust to input parameters than the Global method.

[Figure 8(a)]

[Figure 8(b)]

[Figure 8(c)]

[Figure 8(d)]
Figure 8 - Best fetal estimation: (a) Global, v_th = 0.1; (b) Global, v_th = 0.2; (c) Global, v_th = 0.3; (d) Fast-ICA

7 Discussion

The novel method presented in this paper is one of a family of methods that have been specifically developed to separate sparse sources from mixtures of such signals [9-17]. In the particular approach adopted in this paper (and in the previous references), one first determines the segments of data where one source dominates and then uses that information to extract the sources one by one. The potential advantage of this approach over more general approaches to BSS is that one concentrates on the data from which most information about the sources can be obtained, which should hopefully lead to better extraction.

The approach used in this paper has been successful in extracting sources for the examples studied here. In some cases, better extraction than Fast-ICA has been achieved. However, the two main drawbacks of the method are: (1) sensitivity to the input parameter v_th (25); (2) sensitivity to noise.

It is observed with Example 2 (Section 6.2) that, when using the Global and MHC methods, the more rapidly varying components are detected first. This is due to the headings being calculated from differences of data; hence one is applying a high-pass filter. The advantage of this for Signal 2 is that, although the highest frequency source is not sparse, because the other sources were effectively attenuated, the filtered version becomes sparser and is hence easier to detect. The dominance of each frequency component in turn means that each component is detected against all the other components. The disadvantage of the high-pass filtering action is that high-frequency components of noise are amplified, leading to relatively poor performance in noise compared to the other methods.

The performances of the Fast-ICA and Clusterwise PCA methods are much less sensitive to input parameters. This means that, when implemented on-line, these two methods would have a more reliable performance than the Global and MHC methods, because of the uncertainty in the best choice of input parameter for a specific signal and signal-to-noise ratio.

8 Conclusions

A blind source separation method has been developed for mixtures of sparse sources. The method involves identifying data points that are dominated by one source and using this information to estimate that source by estimating a dominant direction in phase space. Principal directions in the phase plot are estimated using a simple clustering technique based on sorting the heading components in ascending order of magnitude. This estimated source is

then subtracted from the data, and the process is repeated to estimate the other sources. This method is an extension of the work in [15], where the headings are chosen at the point where there is least change between two adjacent headings. The method has been evaluated on simulated data where the sources are sparse, simulated data where they are semi-sparse, and experimental fetal ECG monitoring data.

The proposed method, called the Global method, is more robust to noise than the MHC method [15] and can lead to comparable and sometimes better estimates than the standard Clusterwise PCA and Fast-ICA methods. However, the proposed method (like the MHC) is sensitive to the choice of the input parameter v_th (25) and so may be best implemented off-line, where one has the time to perform several analyses of the data with different input parameters. The Global method, whilst more robust to noise than the MHC, can be more sensitive to noise than Fast-ICA and Clusterwise PCA; this is because the Global method processes only part of the data, whilst Fast-ICA and Clusterwise PCA process the whole data.

The Global method has been developed assuming that the sources are uncorrelated, so that at each iteration it is possible to extract the sources one by one. In Example 2 in Section 6.2, where the sources are weakly correlated, the Global method performed relatively well because, when estimating the heading for one source, the other sources contributed what was effectively noise to the estimate of that heading, which can then be averaged over using Equation (23). However, for strongly correlated sources the method will break down. The MHC method has been extended in [16] to the case of correlated sources, where it is assumed that all sources are sparse. It would be of interest to see whether the Global method can be extended in a similar way to deal with correlations between the underlying sources.

REFERENCES

[1] Ren, J.Y., Chang, C.Q., Fung, P.C.W., Shen, J.G., and Chan, F.H.Y.: 'Free Radical EPR Spectroscopy Analysis using BSS', Journal of Magnetic Resonance, 2004, vol. 166, (1), pp. 82-91

[2] Yin, P., Sun, Y., and Xin, J.: 'A Geometric Blind Source Separation Method Based on Facet Component Analysis', Signal, Image and Video Processing, 2016, vol. 10, pp. 19-28

[3] Varanini, M., Tartarisco, G., Billeci, L., Macerata, A., Pioggia, G., and Balocchi, R.: 'An efficient unsupervised fetal QRS complex detection from abdominal maternal ECG', Physiological Measurement, 2014, vol. 35, pp. 1607-1619

[4] Dawy, Z., Sarkis, M., Hagenauer, J., and Mueller, J.: 'A Novel Gene Mapping Algorithm based on Independent Component Analysis', IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2005, March 18-23, 2005, Volume V, pp. 381-384

[5] Shlens, J.: 'A Tutorial on Principal Component Analysis', http://arxiv.org/abs/1404.1100v1 (2014) - last accessed March 11th 2016

[6] Zarzoso, V., and Nandi, A.K.: 'Blind Separation of Independent Sources for Virtually Any Source Probability Density Function', IEEE Transactions on Signal Processing, 1999, vol. 47, (9), pp. 2419-2432

[7] Hyvärinen, A.: 'Independent Component Analysis: Recent Advances', Phil. Trans. R. Soc. A, 2013, 371: 20110534. http://dx.doi.org/10.1098/rsta.2011.0534

[8] Hyvärinen, A., and Oja, E.: 'Independent component analysis: algorithms and applications', Neural Networks, 2000, vol. 13, (4-5), pp. 411-430

[9] Babaie-Zadeh, M., Jutten, C., and Mansour, A.: 'Sparse ICA via Cluster-wise PCA', Neurocomputing, 2006, vol. 69, (13-15), pp. 1458-1466

[10] Chang, C., Fung, P.C.W., and Hung, Y.S.: 'On a Sparse Component Analysis Approach to BSS', 6th International Conference on Independent Component Analysis and Blind Source Separation (ICA 2006), Charleston, South Carolina, USA, March 2006, pp. 765-772

[11] O'Grady, P.D., Pearlmutter, B.A., and Rickard, S.T.: 'Survey of Sparse and Non-Sparse Methods in Source Separation', International Journal of Imaging Systems and Technology, 2005, vol. 15, (1), pp. 18-33

[12] Theis, F., Jung, A., Puntonet, C., and Lang, E.W.: 'Linear Geometric ICA: Fundamentals and Algorithms', Neural Computation, 2003, vol. 15, (2), pp. 419-439

[13] Georgiev, P., Theis, F., and Cichocki, A.: 'Sparse Component Analysis and Blind Source Separation of Underdetermined Mixtures', IEEE Transactions on Neural Networks, 2005, vol. 16, (4), pp. 992-996

[14] Davies, M., and Mitianoudis, N.: 'Simple Mixture Model for Sparse Overcomplete ICA', IEE Proceedings on Vision, Image and Signal Processing, 2004, vol. 151, (1), pp. 35-43

[15] Woolfson, M.S., Bigan, C., Crowe, J.A., and Hayes-Gill, B.R.: 'Method to separate sparse components from signal mixtures', Digital Signal Processing, 2008, vol. 18, (6), pp. 985-1012

[16] Woolfson, M.S., Bigan, C., Crowe, J.A., and Hayes-Gill, B.R.: 'Extraction of Correlated Sparse Sources from Signal Mixtures', ISRN Signal Processing, vol. 2013, Article ID 218651, 17 pages, 2013. doi:10.1155/2013/218651