Computers in Biology and Medicine 39 (2009)

Feature extraction and dimensionality reduction for mass spectrometry data

Yihui Liu
School of Computer Science and Information Technology, Shandong Institute of Light Industry, Jinan, Shandong 25353, China
E-mail address: Yihui_liu_25@yahoo.co.uk

Article history: Received 6 March 2007; accepted 29 June 2009

Keywords: Mass spectrometry data; Feature extraction; Wavelet analysis; Support vector machine

Abstract: Mass spectrometry is being used to generate protein profiles from human serum, and proteomic data obtained from mass spectrometry have attracted great interest for the detection of early stage cancer. However, high dimensional mass spectrometry data cause considerable challenges. In this paper we propose a feature extraction algorithm based on wavelet analysis for high dimensional mass spectrometry data. A set of wavelet detail coefficients at different scales is used to detect the transient changes of mass spectrometry data. The experiments are performed on 2 datasets. A highly competitive accuracy, compared with the best performance of other kinds of classification models, is achieved. Experimental results show that wavelet detail coefficients are an efficient way to characterize the features of high dimensional mass spectra and to reduce their dimensionality. © 2009 Elsevier Ltd. All rights reserved.

1. Background

Mass spectrometry is being used to generate protein profiles from human serum, and proteomic data obtained from mass spectrometry have attracted great interest for the detection of early stage cancer. Surface enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS), in combination with advanced data mining algorithms, is used to detect protein patterns associated with diseases [1-5]. As a kind of MS-based protein chip technology, SELDI-TOF-MS has been successfully used to detect several disease-associated proteins in complex biological specimens such as serum [6-8].

Lilien et al. [9] perform principal component analysis (PCA) for dimensionality reduction and linear discriminant analysis (LDA) with a nearest centroid classifier [10] for the classification of mass spectra. Wu et al. [11] compare two feature extraction algorithms with several classification approaches on MALDI-TOF acquired data. The t-test is used to rank features. Support vector machines (SVMs), random forests, linear/quadratic discriminant analysis (LDA/QDA), k nearest neighbors, and bagged/boosted decision trees are used to classify the data. In the paper of Jeffries [12], both a genetic algorithm (GA) approach and a nearest shrunken centroid (NSC) approach are found inferior to a boosting based feature selection method. Levner [13] examines the performance of the nearest centroid classifier using the following feature selection algorithms. For filter-based feature ranking methods, the univariate statistics of the student-t test, the Kolmogorov-Smirnov test, and the P-test are used; for the wrapper methods, sequential forward selection (SFS) and a modified version of sequential backward selection (SBS) are tested; for embedded approaches, the shrunken nearest centroid and a novel version of boosting based feature selection are investigated. Several dimensionality reduction approaches are also tested, such as the PCA and LDA methods.
For a transform space, a new basis is normally created for the data. The selection of the new basis determines the properties that will be held by the transformed data. Principal component analysis is used to extract the main components from mass spectra; linear discriminant analysis is used to extract discriminant information from mass spectra. But these methods lose the time property and do not detect the localized features of mass spectra. For the wavelet transform, a set of wavelet basis functions aims to detect the localized features contained in mass spectra. The difference between cancer tissue and normal tissue can be measured using the wavelet basis, based on the compactness and finite energy characteristics of the wavelet function.

Yu et al. [14] developed a four-step strategy for dimensionality reduction and tested it on a published ovarian high-resolution SELDI-TOF dataset. The four steps are: (1) binning, (2) the Kolmogorov-Smirnov test, (3) restriction of the coefficient of variation and (4) wavelet analysis. They indicated that "For the high-resolution ovarian data, the vector of detail coefficients contains almost no information for the healthy, since SVMs identify all the data as cancers." In their proposed method the detail coefficients do not work on high-resolution mass spectrometry data. They use the approximation coefficients of the wavelet decomposition at the first level. They also indicated that "Theoretically, a heavier compression rate can be achieved, at the risk of losing some useful information, by choosing a higher level of approximation coefficients." They only used first level wavelet approximation coefficients in their wavelet analysis.
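Step (2) of that four-step strategy is a univariate screen of individual m/z features. As a minimal sketch of how such a Kolmogorov-Smirnov filter can be written (the function name, matrix shapes and threshold below are illustrative assumptions, not code from [14]):

```python
import numpy as np
from scipy.stats import ks_2samp

def ks_screen(X_cancer, X_control, alpha=0.01):
    """Rank m/z features by a two-sample Kolmogorov-Smirnov test.

    X_cancer, X_control: (n_samples, n_features) intensity matrices.
    Returns indices of features whose KS p-value falls below alpha.
    """
    n_features = X_cancer.shape[1]
    pvals = np.empty(n_features)
    for j in range(n_features):
        # The KS statistic compares the empirical intensity distributions
        # of the two classes at one m/z position.
        pvals[j] = ks_2samp(X_cancer[:, j], X_control[:, j]).pvalue
    return np.where(pvals < alpha)[0]

# Toy usage with random data standing in for binned spectra.
rng = np.random.default_rng(0)
X_cancer = rng.normal(1.0, 1.0, size=(40, 200))
X_control = rng.normal(0.0, 1.0, size=(40, 200))
print(len(ks_screen(X_cancer, X_control)), "features retained")
```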

However, from another view, it is the wavelet detail coefficients that characterize the localized features hidden in mass spectra, and the approximation coefficients only compress the mass spectra. Higher level wavelet decomposition makes the features of mass spectra more significant and clear. In this study we develop a feature extraction method based on wavelet detail coefficients. Multi-level wavelet analysis is performed on the mass spectrometry data. Vectors of detail coefficients in the wavelet subspace are extracted to detect the localized or transient changes of mass spectra based on the time property of wavelets, and the difference between cancer tissue and normal tissue can be measured using a set of orthogonal wavelet basis functions. Finally the wavelet features of mass spectra are input into the SVM classifier to distinguish the diagnostic classes.

2. Methods

Fig. 2. Multilevel wavelet decomposition tree for mass spectra. Symbol s represents the mass spectra; a1, ..., a4 represent the wavelet approximations from the first level to the fourth level; d1, ..., d4 represent the wavelet details from the first level to the fourth level.

In this research we develop a new application of a wavelet feature extraction method for mass spectrometry data. The wavelet high frequency part (detail coefficients) is extracted to characterize the features of mass spectrometry data. The extracted features are used to build the SVM classification model. Fig. 1 shows the general framework of the proposed method.

Fig. 1. The framework of the proposed method: mass spectrometry data -> extract wavelet features -> build SVM classifier -> classifier model.

2.1. Wavelet feature extraction

For one dimensional wavelet analysis [15,16], a signal can be represented as a sum of wavelets at different time shifts and scales (frequencies) using the discrete wavelet transform (DWT). The DWT is capable of extracting the features of transient signals by separating signal components in both time and frequency. According to the DWT, a time-varying function (signal) $f(t) \in L^2(R)$ can be expressed in terms of $\phi(t)$ and $\psi(t)$ as follows:

$$f(t) = \sum_{k} c_{0}(k)\,\phi(t-k) + \sum_{k}\sum_{j=1}^{\infty} d_{j}(k)\,2^{j/2}\,\psi(2^{j}t-k) = \sum_{k} c_{j_0}(k)\,2^{j_0/2}\,\phi(2^{j_0}t-k) + \sum_{k}\sum_{j=j_0}^{\infty} d_{j}(k)\,2^{j/2}\,\psi(2^{j}t-k)$$

where $\phi(t)$, $\psi(t)$, $c_{0}$, and $d_{j}$ represent the scaling function, the wavelet function, the scaling coefficients at scale 0, and the wavelet detail coefficients at scale $j$, respectively. The variable $k$ is the translation coefficient for the localization of a signal in time, the scales $2^{j}$ denote the different (high to low) frequency bands, and $j_0$ is the selected scale number.

Fig. 2 shows the wavelet decomposition tree at 4 levels. Fig. 3 shows the original mass spectra, the wavelet approximations and the wavelet details at 4 levels. When the decomposition level is increased, the localized or transient features, which are detected based on the detail coefficients, change from fine to coarse, or from small to large. In our study the purpose of wavelet analysis is to detect the localized features hidden in mass spectra, in order to measure the difference between cancer tissue and normal tissue. Multi-level wavelet analysis makes it possible to detect the transient changes in one of the mass spectra's derivatives. Wavelet detail coefficients at 4 levels reflect the localized features in the first, second, third, and fourth derivative.
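A minimal sketch of this multilevel decomposition using the PyWavelets package (an assumed tool choice; the db7 wavelet and symmetric padding follow the settings stated in Section 3, and the synthetic spectrum merely stands in for real data):

```python
import numpy as np
import pywt  # PyWavelets

# A synthetic stand-in for one mass spectrum (15,000 intensity values).
rng = np.random.default_rng(0)
spectrum = rng.normal(size=15000).cumsum()

# Four-level DWT with the db7 wavelet and symmetric boundary padding.
coeffs = pywt.wavedec(spectrum, "db7", mode="symmetric", level=4)
a4, d4, d3, d2, d1 = coeffs  # approximation a4, details d4..d1

# The detail vectors at levels 2-4 are the candidate feature vectors;
# each level roughly halves the length of the previous one.
for name, d in [("d2", d2), ("d3", d3), ("d4", d4)]:
    print(name, len(d))
```

For a 15,000-point spectrum this yields detail vectors of lengths 3759, 1886 and 949 at levels 2-4, which matches the feature dimensionalities reported for the resampled ovarian data in Section 3.1.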
Wavelets tend to be irregular and asymmetric, and wavelet analysis is capable of revealing aspects of data that other analysis techniques miss, aspects such as trends (approximation coefficients) and discontinuities in higher derivatives (detail coefficients). The detail coefficients represent how closely correlated the wavelet is with a localized section of the mass spectra: the higher the coefficients, the greater the similarity.

Fig. 3. Mass spectra, wavelet approximations and wavelet details at 4 levels.

The presence of noise is a fairly common situation in mass spectra processing, and it makes the identification of transient changes more complicated. If the first levels of the decomposition can be used to eliminate a large part of the noise, the successive details characterize more significant features hidden in the mass spectra. In our study the detail coefficients at the second, third and fourth level are used respectively to characterize the features of mass spectra, removing noise and reducing the dimensionality. The detail coefficients determine the position of the change (time), the type of change (a localized feature in a particular derivative), and the amplitude of the change.
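As a rough illustration of reading the position and amplitude of a transient change off the detail coefficients (the index-to-position mapping below ignores filter delay and is an approximation for illustration, not a formula from the paper):

```python
import numpy as np
import pywt

rng = np.random.default_rng(1)
spectrum = rng.normal(scale=0.1, size=15000)
spectrum[9000:9040] += 5.0  # an artificial localized peak

coeffs = pywt.wavedec(spectrum, "db7", mode="symmetric", level=4)
d3 = coeffs[2]  # detail coefficients at level 3

# The largest-magnitude level-3 coefficient marks the transient change;
# index k at level j corresponds roughly to position k * 2**j in the
# original vector.
k = int(np.argmax(np.abs(d3)))
print("position ~", k * 2**3, "amplitude", d3[k])
```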

The prostate cancer dataset [21] has 322 samples, including 69 cancer samples and 253 normal samples, and 15154 dimensions for each sample vector. We perform one dimensional wavelet analysis on each sample vector to obtain the detail coefficients. Each sample vector is computed independently and is characterized by a set of orthogonal wavelet basis functions. We obtain 3798, 1905, and 959 dimensional feature vectors at the second, third and fourth level respectively. However, other methods of feature extraction, such as PCA, LDA, etc., calculate the new transform feature space based on the training dataset; once the training samples change, the new transform feature space needs to be calculated again based on the new training dataset, and the feature vector of each sample also needs to be computed again based on the changed feature space. This adds to the computation load. Because mass spectra hold a high dimensionality, the transforms of PCA, LDA, etc. require large matrix computations and a large computation load. Compared with these methods of feature extraction, the wavelet feature extraction method does not rely on the training dataset: it is a set of orthogonal wavelet basis functions that represents the features of the sample vectors. The vector of mass spectra is convolved with the high-pass wavelet filter and the convolved coefficients are downsampled by keeping the even indexed elements to form the wavelet feature vector. This needs only a small computation load.

Fig. 4. The process of the k fold cross validation experiments based on wavelet decomposition at the ith level. 1D DWT represents the one dimensional discrete wavelet transform. Y, F, Ftr, and Fte represent the original vector of mass spectra (1 x d_ori), the wavelet feature vectors (detail coefficients, N x w_i), the training vectors (Ntr x w_i) and the test vectors (Nte x w_i) respectively. N, Ntr, Nte, d_ori, and w_i represent the sample number of mass spectra, the training and test vector numbers of the k fold cross validation, the dimension number of the original mass spectra, and the dimension number of the wavelet feature vectors. w_i is 3798, 1905, and 959 dimensions based on wavelet decomposition at the second, third, and fourth level respectively for the prostate cancer dataset of 15154 dimensions.

2.2. SVM classifier

The SVM originated from the idea of structural risk minimization developed by Vapnik [17]. The SVM is an effective algorithm to find the maximal margin hyperplane separating two classes of patterns. A transform that nonlinearly maps the data into a higher-dimensional space allows a linear separation of classes which could not be linearly separated in the original space. The objects that are located on the two marginal hyperplanes are the so-called support vectors. The maximal margin hyperplane, which is uniquely defined by the support vectors, gives the best separation between the classes. The support vectors can be regarded as the selected representatives of the training wavelet features, and they are the most critical for the separation of the two classes. As usually only a few support vectors are used, only some parameters are adjustable by the algorithm and thus overfitting is unlikely to occur. The radial basis function (RBF) kernel $K(x_i, x_j) = e^{-\|x_i - x_j\|^2/r_1}$ is used, where $r_1$ is a strictly positive constant and is set to 1. Apparently the linear kernel is less complex than the polynomial and RBF kernels. The RBF kernel usually has a better boundary response as it allows for extrapolation, and most high dimensional data can be approximated by Gaussian-like distributions similar to those used by RBF networks [18].
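This kernel choice translates directly into code. A minimal sketch assuming scikit-learn's SVC (the paper does not name an implementation); note that scikit-learn parameterizes the RBF kernel as exp(-gamma * ||xi - xj||^2), so r1 = 1 corresponds to gamma = 1:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical wavelet feature matrix (N samples x w_i coefficients)
# and binary labels (1 = cancer, 0 = normal); the values are synthetic.
rng = np.random.default_rng(2)
F = rng.normal(size=(100, 1905))
y = (rng.random(100) > 0.5).astype(int)

# RBF kernel K(xi, xj) = exp(-||xi - xj||^2 / r1), i.e. gamma = 1 / r1.
clf = SVC(kernel="rbf", gamma=1.0)
clf.fit(F, y)
print("support vectors per class:", clf.n_support_)
```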
3. Experiments and Results

In this study we use the classification accuracy, sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) to evaluate the performance of the proposed method. Let TP, TN, FP, and FN be the numbers of true positive (cancer), true negative (normal), false positive and false negative samples. Sensitivity is defined as TP/(TP + FN); specificity is defined as TN/(TN + FP); positive predictive value is defined as TP/(TP + FP); negative predictive value is defined as TN/(TN + FN); accuracy is defined as (TP + TN)/(TP + TN + FP + FN). The balanced correct rate (BACC) is defined as 1/2 (TP/(TP + FN) + TN/(TN + FP)), which is the average of sensitivity and specificity.

The Daubechies wavelet db7 [19], which has seven non-zero coefficients of the compactly supported orthogonal wavelet basis, is used for the wavelet analysis of the mass spectrometry data, and the boundary values are symmetrically padded. The multilevel discrete wavelet transform is performed on the mass spectra to extract the features. K fold cross validation experiments are performed to evaluate our proposed method. K fold cross validation randomly generates indices, containing equal (or approximately equal) proportions of the integers 1 through K, that define a partition of the N observations into K disjoint subsets. In K fold cross validation, K - 1 folds are used for training and the last fold is used for evaluation. This process is repeated K times, leaving a different fold for evaluation each time. In our study we use two- and threefold cross validation experiments to evaluate our proposed method. We run each K fold cross validation experiment 20 times. Fig. 4 shows the process of the k fold cross validation experiments based on the wavelet detail coefficients at the ith level.
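The evaluation protocol can be sketched as follows. Stratified splits and the helper names bacc and repeated_kfold_bacc are assumptions for illustration (the paper specifies random k fold partitions and repeated runs, not stratification):

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC

def bacc(y_true, y_pred):
    # Balanced correct rate: the mean of sensitivity and specificity.
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))

def repeated_kfold_bacc(F, y, k=3, runs=20):
    """Repeat k-fold cross validation `runs` times and average BACC."""
    scores = []
    for run in range(runs):
        cv = StratifiedKFold(n_splits=k, shuffle=True, random_state=run)
        for train_idx, test_idx in cv.split(F, y):
            clf = SVC(kernel="rbf", gamma=1.0).fit(F[train_idx], y[train_idx])
            scores.append(bacc(y[test_idx], clf.predict(F[test_idx])))
    return float(np.mean(scores))

# Toy usage (runs reduced for speed).
rng = np.random.default_rng(5)
F = rng.normal(size=(60, 100))
y = np.array([0, 1] * 30)
print(round(repeated_kfold_bacc(F, y, k=3, runs=2), 3))
```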
3.1. High resolution ovarian dataset

The raw ovarian high-resolution SELDI-TOF dataset is composed of 95 control samples and 121 cancer samples; it is provided by the National Cancer Institute (http://home.ccr.cancer.gov/ncifdaproteomics/ppatterns.asp). Resampling of mass spectrometry data homogenizes the mass/charge (M/Z) vector in order to compare different spectra under the same reference and at the same resolution. High resolution spectra contain redundant information; after resampling, the signal is decimated into a more manageable M/Z vector, preserving the information content of the spectra. Resampling selects a new M/Z vector and also applies an antialias filter that prevents high frequency noise from folding into the lower frequencies [20]. We resample the mass spectrometry data to 15,000 M/Z points, as sketched below.
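The paper does not name a resampling tool, so the sketch below uses scipy's Fourier-domain resample as an assumed stand-in (its band-limiting plays the role of the antialias filter described above); the M/Z range and raw spectrum length are illustrative:

```python
import numpy as np
from scipy.signal import resample

def resample_spectrum(mz, intensity, n_points=15000):
    """Decimate a raw spectrum onto a uniform M/Z grid of n_points.

    Fourier-domain resampling is inherently band-limiting, which
    prevents high frequency noise from aliasing into low frequencies.
    """
    new_intensity, new_mz = resample(intensity, n_points, t=mz)
    return new_mz, new_intensity

# Toy usage: a raw high-resolution spectrum on a uniform M/Z grid.
rng = np.random.default_rng(3)
mz = np.linspace(700.0, 12000.0, 370000)  # illustrative range/length
intensity = np.abs(rng.normal(size=mz.size))
new_mz, new_intensity = resample_spectrum(mz, intensity)
print(new_intensity.shape)  # (15000,)
```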
After wavelet decomposition of the mass spectra, we have 3759, 1886, and 949 dimensions for the wavelet detail coefficients at the second, third and fourth level respectively, which reduces the dimensionality of the mass spectra. Fig. 5 shows the detail coefficients of the wavelet decomposition at the second, third and fourth level for the cancer and control samples.

Fig. 5. Wavelet features of high resolution ovarian mass spectra: detail coefficients of the wavelet decomposition at the second, third and fourth level.

Table 1 shows the performance of the two- and threefold cross validation experiments using the detail coefficients at the second, third and fourth level respectively. BACC values of 95.3% and 95.71% are obtained in the two- and threefold cross validation experiments using the 3759 wavelet features at the second level; the 1886 wavelet features at the third level achieve 97.86% and 98.21% BACC for the two- and threefold cross validation experiments; and the 949 wavelet features at the fourth level give 97.6% and 97.8% BACC in the two- and threefold cross validation experiments.

Table 1. Performance of the high resolution ovarian dataset, by decomposition level and K fold: correct rate, sensitivity, specificity, PPV, NPV and BACC. PPV stands for positive predictive value; NPV stands for negative predictive value; BACC stands for balanced correct rate.

Yu et al. [14] indicated that "For the high-resolution ovarian data, the vector of detail coefficients contains almost no information for the healthy, since SVMs identify all the data as cancers." Their experimental results based on the wavelet approximation coefficients at the first level are shown in Table 2. They achieved 95.34% BACC and 95.88% BACC for the twofold and threefold cross validation experiments. In our proposed method, the wavelet detail coefficients perform very well: the detail coefficients at both the third and the fourth level outperform their four-step strategy.

Table 2. Performance using wavelet approximation coefficients [14] on the high resolution ovarian dataset, by K fold: BACC and standard deviation (SD).

Our results also outperform other methods, as shown in Table 3. The voted perceptron (VP) has 94.99% BACC; quadratic discriminant analysis (QDA) has 93.15% BACC; linear discriminant analysis (LDA) and Mahalanobis discriminant analysis (MDA) obtain 93.23% and 92.73% BACC respectively. We can see that our proposed method is better than these other methods, such as QDA, LDA and MDA, etc.

Table 3. Performance of different methods [14] on the high resolution ovarian dataset (twofold cross validation): mean sensitivity, mean specificity and BACC for VP, QDA, LDA, MDA, NB, Bagging, k-NN, ADtree and J48tree. VP stands for voted perceptron; QDA stands for quadratic discriminant analysis; LDA stands for linear discriminant analysis; MDA stands for Mahalanobis discriminant analysis; k-NN stands for k-nearest neighbor; NB stands for Naïve Bayes; Bagging stands for bootstrap aggregating; ADtree stands for alternating decision trees; J48tree is a version of C4.5 in the Weka classifier package.

3.2. Prostate cancer dataset

This dataset was collected using the H4 protein chip (JNCI dataset 7-3-02) [21]. There are 322 samples, including 190 samples of benign prostate hyperplasia with PSA levels greater than 4, 63 samples with no evidence of disease and PSA levels below 1, 26 samples of prostate cancer with PSA levels 4 through 10, and 43 samples of prostate cancer with PSA levels greater than 10. Each sample is composed of 15154 features. We combine the benign prostate hyperplasia samples and those with no evidence of disease to form the normal class; the rest of the samples are placed in the cancer category. We thus have 69 cancer samples and 253 normal samples, the same as in the paper [13]; this grouping is written out in the sketch following Table 5.

The original prostate mass spectra have 15154 features. After wavelet decomposition of the mass spectra, 3798, 1905 and 959 dimensions are obtained for the wavelet detail coefficients at the second, third and fourth level respectively. Fig. 6 shows the detail coefficients of the wavelet decomposition at the second, third and fourth level for the cancer and normal samples.

Fig. 6. Wavelet features of prostate mass spectra: detail coefficients of the wavelet decomposition at the second, third and fourth level.

Table 4 shows the performance of the two- and threefold cross validation experiments using the detail coefficients at the second, third and fourth level respectively. BACC values of 83.9%, 86.18% and 82.36% are obtained in the threefold cross validation experiments based on the detail coefficients at the second, third, and fourth level respectively.

Table 4. Performance of the prostate cancer dataset, by decomposition level and K fold: correct rate, sensitivity, specificity, PPV, NPV and BACC. PPV stands for positive predictive value; NPV stands for negative predictive value; BACC stands for balanced correct rate.

Levner [13] performed threefold cross validation experiments using different methods; the results are shown in Table 5. Their best result is 90.6% BACC for the boosted FE method, which combines a boosting algorithm with the sequential forward selection (SFS) method. Our best performing method achieves 86.18% BACC, which outperforms the student-t test (T-test), the Kolmogorov-Smirnov test (KS-test), the P-test, PCA/LDA, SFS, sequential backward selection (SBS), the nearest shrunken centroid (NSC) and the boosting algorithm, and is worse than the boosting algorithm combined with sequential forward selection.

Table 5. Performance of different methods [13] on the prostate cancer dataset (threefold cross validation): BACC, specificity, sensitivity and PPV for No FE, PCA, PCA/LDA, SFS, SBS, P-test, T-test, KS-test, NSC(2), Boosted and Boosted FE. PPV stands for positive predictive value; BACC stands for balanced correct rate; No FE stands for the nearest centroid classifier without feature selection; PCA and LDA stand for principal component analysis and linear discriminant analysis; SFS and SBS stand for sequential forward selection and sequential backward selection; T-test is the student-t test; KS-test the Kolmogorov-Smirnov test; NSC the nearest shrunken centroid; Boosted the boosting algorithm; Boosted FE combines the boosting algorithm and the sequential forward selection method.
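The two-class grouping described above can be written down directly. In this sketch only the sample counts come from the text; the intensity values are synthetic placeholders:

```python
import numpy as np

# Hypothetical per-subgroup feature matrices for the four JNCI groups.
rng = np.random.default_rng(4)
n_features = 15154
bph = rng.normal(size=(190, n_features))       # BPH, PSA > 4
healthy = rng.normal(size=(63, n_features))    # no evidence of disease
cancer_lo = rng.normal(size=(26, n_features))  # cancer, PSA 4-10
cancer_hi = rng.normal(size=(43, n_features))  # cancer, PSA > 10

# Normal class = BPH + healthy (253); cancer class = both cancer groups (69).
X = np.vstack([bph, healthy, cancer_lo, cancer_hi])
y = np.array([0] * (190 + 63) + [1] * (26 + 43))
print(X.shape, "normal:", (y == 0).sum(), "cancer:", (y == 1).sum())
```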
4. Discussion and conclusions

In this paper we propose a feature extraction algorithm based on multilevel wavelet decomposition for high dimensional mass spectra. A set of wavelet detail coefficients at different levels is used to reduce the dimensionality of the mass spectra and to characterize their transient changes, in order to detect the difference between cancer tissue and normal tissue. The feature extraction method based on wavelet detail coefficients is a novel application to mass spectrometry data.

A set of orthogonal wavelet basis functions is used to represent the features of mass spectra. Compared to the PCA and LDA methods, the wavelet feature extraction method does not depend on the training dataset to obtain the basis of the feature space; it is the wavelet basis that constructs the feature space. So the wavelet feature extraction method not only keeps the time property of mass spectra, but also dramatically reduces the computation load compared to the PCA and LDA methods. The wavelet detail coefficients are the high frequency part of the mass spectra and are usually discarded in favour of the low frequency part, the wavelet approximation. The wavelet detail coefficients have small energy and normally contain the noise from the acquisition of mass spectra. After removing the noise using the first levels of the wavelet decomposition, the detail coefficients at the third level achieve competitive performance compared to other feature extraction and feature selection methods. The experimental results suggest that the wavelet detail coefficients at the third level are an efficient way to characterize the features of high dimensional mass spectra.

Conflict of interest statement

None declared.

Acknowledgements

This work was supported by SRF for ROCS, SEM, and the Natural Science Foundation of Shandong Province (Y28G3), China.

References

[1] E.F. Petricoin, A.M. Ardekani, B.A. Hitt, P.J. Levine, V.A. Fusaro, S.M. Steinberg, G.B. Mills, C. Simone, D.A. Fishman, E.C. Kohn, L.A. Liotta, Use of proteomic patterns in serum to identify ovarian cancer, The Lancet 359 (2002).
[2] J.M. Sorace, M. Zhan, A data review and re-assessment of ovarian cancer serum proteomic profiling, BMC Bioinformatics 4 (2003) 24.
[3] C.M. Michener, A.M. Ardekani, E.F. Petricoin III, L.A. Liotta, E.C. Kohn, Genomics and proteomics: application of novel technology to early detection and prevention of cancer, Cancer Detection and Prevention 26 (2002).
[4] E.F. Petricoin, K.C. Zoon, E.C. Kohn, J.C. Barrett, L.A. Liotta, Clinical proteomics: translating benchside promise into bedside reality, Nature Reviews Drug Discovery 1 (2002).
[5] P.R. Srinivas, M. Verma, Y. Zhao, S. Srivastava, Proteomics for cancer biomarker discovery, Clinical Chemistry 48 (2002).
[6] P.C. Herrmann, L.A. Liotta, E.F. Petricoin III, Cancer proteomics: the state of the art, Disease Markers 17 (2001).
[7] G.L. Wright Jr., L.H. Cazares, S.M. Leung, S. Nazim, B.L. Adam, T.T. Yip, P.F. Schellhammer, L. Gong, A. Vlahou, Proteinchip surface enhanced laser desorption/ionization (SELDI) mass spectrometry: a novel protein biochip technology for detection of prostate cancer biomarkers in complex protein mixtures, Prostate Cancer and Prostatic Diseases 2 (1999).
[8] A. Vlahou, P.F. Schellhammer, S. Mendrinos, K. Patel, F.L. Kondylis, L. Gong, S. Nazim, G.L. Wright Jr., Development of a novel proteomic approach for the detection of transitional cell carcinoma of the bladder in urine, American Journal of Pathology 158 (2001).
[9] R.H. Lilien, H. Farid, B.R. Donald, Probabilistic disease classification of expression-dependent proteomic data from mass spectrometry of human serum, Journal of Computational Biology 10 (6) (2003).
[10] H. Park, M. Jeon, J.B. Rosen, Lower dimensional representation of text data based on centroids and least squares, BIT 43 (2003).
[11] B. Wu, T. Abbott, D. Fishman, W. McMurray, G. Mor, K. Stone, D. Ward, K. Williams, H. Zhao, Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data, Bioinformatics 19 (2003).
[12] N.O. Jeffries, Performance of a genetic algorithm for mass spectrometry proteomics, BMC Bioinformatics 5 (2004).
[13] I. Levner, Feature selection and nearest centroid classification for protein mass spectrometry, BMC Bioinformatics 6 (2005).
[14] J.S. Yu, S. Ongarello, R. Fiedler, X.W. Chen, G. Toffolo, C. Cobelli, Z. Trajanoski, Ovarian cancer identification based on dimensionality reduction for high-throughput mass spectrometry data, Bioinformatics 21 (2005).
[15] A. Grossmann, J. Morlet, Decomposition of Hardy functions into square integrable wavelets of constant shape, SIAM Journal on Mathematical Analysis 15 (1984).
[16] S. Mallat, A theory for multiresolution signal decomposition: the wavelet representation, IEEE Transactions on Pattern Analysis and Machine Intelligence 11 (1989).
[17] V.N. Vapnik, Statistical Learning Theory, Wiley, New York, 1998.
[18] C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Kluwer Academic Publishers, Dordrecht, 1998.
[19] I. Daubechies, Orthonormal bases of compactly supported wavelets, Communications on Pure and Applied Mathematics 41 (1988).
[20] IEEE DSP Committee (Ed.), Programs for Digital Signal Processing, IEEE Press, New York, 1979.
[21] E.F. Petricoin III, D.K. Ornstein, C.P. Paweletz, A. Ardekani, P.S. Hackett, B.A. Hitt, A. Velassco, C. Trucco, L. Wiegand, K. Wood, C.B. Simone, P.J. Levine, W.M. Linehan, M.R. Emmert-Buck, S.M. Steinberg, E.C. Kohn, L.A. Liotta, Serum proteomic patterns for detection of prostate cancer, Journal of the National Cancer Institute 94 (2002).


More information

2. On classification and related tasks

2. On classification and related tasks 2. On classification and related tasks In this part of the course we take a concise bird s-eye view of different central tasks and concepts involved in machine learning and classification particularly.

More information

CS145: INTRODUCTION TO DATA MINING

CS145: INTRODUCTION TO DATA MINING CS145: INTRODUCTION TO DATA MINING 08: Classification Evaluation and Practical Issues Instructor: Yizhou Sun yzsun@cs.ucla.edu October 24, 2017 Learnt Prediction and Classification Methods Vector Data

More information

Facial Expression Classification with Random Filters Feature Extraction

Facial Expression Classification with Random Filters Feature Extraction Facial Expression Classification with Random Filters Feature Extraction Mengye Ren Facial Monkey mren@cs.toronto.edu Zhi Hao Luo It s Me lzh@cs.toronto.edu I. ABSTRACT In our work, we attempted to tackle

More information

FACE RECOGNITION USING SUPPORT VECTOR MACHINES

FACE RECOGNITION USING SUPPORT VECTOR MACHINES FACE RECOGNITION USING SUPPORT VECTOR MACHINES Ashwin Swaminathan ashwins@umd.edu ENEE633: Statistical and Neural Pattern Recognition Instructor : Prof. Rama Chellappa Project 2, Part (b) 1. INTRODUCTION

More information

Categorization of Sequential Data using Associative Classifiers

Categorization of Sequential Data using Associative Classifiers Categorization of Sequential Data using Associative Classifiers Mrs. R. Meenakshi, MCA., MPhil., Research Scholar, Mrs. J.S. Subhashini, MCA., M.Phil., Assistant Professor, Department of Computer Science,

More information

Linear methods for supervised learning

Linear methods for supervised learning Linear methods for supervised learning LDA Logistic regression Naïve Bayes PLA Maximum margin hyperplanes Soft-margin hyperplanes Least squares resgression Ridge regression Nonlinear feature maps Sometimes

More information

Flexible-Hybrid Sequential Floating Search in Statistical Feature Selection

Flexible-Hybrid Sequential Floating Search in Statistical Feature Selection Flexible-Hybrid Sequential Floating Search in Statistical Feature Selection Petr Somol 1,2, Jana Novovičová 1,2, and Pavel Pudil 2,1 1 Dept. of Pattern Recognition, Institute of Information Theory and

More information

An Effective Performance of Feature Selection with Classification of Data Mining Using SVM Algorithm

An Effective Performance of Feature Selection with Classification of Data Mining Using SVM Algorithm Proceedings of the National Conference on Recent Trends in Mathematical Computing NCRTMC 13 427 An Effective Performance of Feature Selection with Classification of Data Mining Using SVM Algorithm A.Veeraswamy

More information

CSE 158. Web Mining and Recommender Systems. Midterm recap

CSE 158. Web Mining and Recommender Systems. Midterm recap CSE 158 Web Mining and Recommender Systems Midterm recap Midterm on Wednesday! 5:10 pm 6:10 pm Closed book but I ll provide a similar level of basic info as in the last page of previous midterms CSE 158

More information

FEATURE EXTRACTION TECHNIQUES USING SUPPORT VECTOR MACHINES IN DISEASE PREDICTION

FEATURE EXTRACTION TECHNIQUES USING SUPPORT VECTOR MACHINES IN DISEASE PREDICTION FEATURE EXTRACTION TECHNIQUES USING SUPPORT VECTOR MACHINES IN DISEASE PREDICTION Sandeep Kaur 1, Dr. Sheetal Kalra 2 1,2 Computer Science Department, Guru Nanak Dev University RC, Jalandhar(India) ABSTRACT

More information

Feature Selection and Classification for Small Gene Sets

Feature Selection and Classification for Small Gene Sets Feature Selection and Classification for Small Gene Sets Gregor Stiglic 1,2, Juan J. Rodriguez 3, and Peter Kokol 1,2 1 University of Maribor, Faculty of Health Sciences, Zitna ulica 15, 2000 Maribor,

More information

Estimating Error-Dimensionality Relationship for Gene Expression Based Cancer Classification

Estimating Error-Dimensionality Relationship for Gene Expression Based Cancer Classification 1 Estimating Error-Dimensionality Relationship for Gene Expression Based Cancer Classification Feng Chu and Lipo Wang School of Electrical and Electronic Engineering Nanyang Technological niversity Singapore

More information