Performance Evaluation of Mutation / Non- Mutation Based Classification With Missing Data

N.C. Vinod, Research Scholar, Manonmaniam Sundaranar University, Tirunelveli, Tamil Nadu, India
Dr. M. Punithavalli, Research Supervisor, Manonmaniam Sundaranar University, Tirunelveli, Tamil Nadu, India

Abstract

A common problem encountered by many data mining techniques is missing data. A missing value is defined as an attribute or feature in a dataset that has no associated data value. Correct treatment of such data is crucial, as it has a negative impact on the interpretation and results of data mining processes. Missing value handling techniques can be grouped into four categories, namely complete case analysis, imputation methods, maximum likelihood methods and machine learning methods. Of these, imputation methods are the most widely used solution for handling missing values. However, there are situations in which imputation methods might not work correctly. This study analyzes the performance of two algorithms for classification on missing data, one based on imputation and the other working without imputation.

Keywords - Missing Values, Imputation, Non-imputation, Classification with missing data.

I. INTRODUCTION

The process of extracting hidden knowledge from databases is called data mining. Data mining methods have great potential in predicting future trends and behaviours, allowing businesses to make proactive, knowledge-driven decisions. It is a multidisciplinary field, drawing work from areas including database technology, artificial intelligence, machine learning, neural networks, statistics, pattern recognition, knowledge-based systems, knowledge acquisition, information retrieval, high-performance computing and data visualization [26]. The era of information explosion is currently encountering a sudden upsurge in both the size and the number of databases. This increase far exceeds the ability of humans to analyze and extract knowledge from these databases. A common problem encountered by many database and data mining applications is missing data.
A missing value is defined as an attribute or feature in a dataset that has no associated data value stored for it. Correct treatment of missing data is essential because, if left untreated, it has a negative impact on the interpretation of results. A missing rate of more than 15% is a clear indication that the application in question should implement more sophisticated tools to handle the missing data correctly. Several studies have focused on developing algorithms that deal with missing data ([22], [24]), but most of these have drawbacks when applied to classification tasks. Classification is the process of dividing a dataset into mutually exclusive groups such that the members of each group are as close as possible to one another, and different groups are as far as possible from one another, where distance is measured with respect to the specific variable(s) used for prediction. Mishandling missing values during classification may produce erroneous classification results ([1], [5], [11], [25]).

Generally, the methods that deal with missing values can be grouped into four categories, based on the technique used to solve the problem. The first is complete case analysis. These algorithms ignore the records with missing values, so the analysis is based on the complete data only and works well only with large databases [24]. The demerit of this approach is that the amount of data discarded has a direct impact on the efficiency of classification. The second approach is the imputation method [32]. Imputation is a class of procedures that aims to fill in the missing values with estimated ones. The process uses known relationships that exist among the complete values in the dataset to estimate the missing data. After the missing data are filled with the estimated values, the classifier learns from the modified dataset. The third approach is the maximum likelihood method [27]. In this method, the input data distribution is modeled using either the Expectation-Maximization (EM) algorithm or one of its variants to handle the missing values.
This is followed by the classification task, performed by means of the Bayes rule [17]. The fourth approach comprises the machine learning methods, which deal with missing values without any imputation and use techniques such as decision trees [30] and fuzzy neural networks [15]. Most of the methods in this category require a complete input data matrix for processing and require considerable effort. Of these four approaches, imputation methods have been the most widely used. Methods such as mean and mode imputation [2], hot deck imputation [24], prediction models [32], artificial neural network imputation [13], recurrent neural network imputation [18], auto-associative imputation [20], k-nearest neighbour imputation [19] and Self-Organizing Map imputation [23] have been reported to be successful in handling missing values.

ISSN : Vol. 5 No. 02 Feb

Recently, Garcia-Laencina et al. [16] proposed a solution for handling missing data based on the popular k-nearest neighbour (KNN) imputation process. This procedure uses a feature-weighted distance metric based on mutual information (MI) to estimate the missing values so as to improve the classification task. MI is a measure of dependence between random variables and has been used in the past as a relevance measure in several selection algorithms ([21], [31]). This model is referred to as the GL-Model in this paper. The primary objective of this model is to provide a solution to the missing data problem by focusing on imputation solutions based on the K Nearest Neighbours (KNN) algorithm ([4], [5], [33]). The KNN algorithm is one of the most popular approaches for solving incomplete data problems. It selects the K closest observations (neighbours) from an incomplete dataset using a distance metric. The selected observations present known values on the features to be imputed. The weighted average of these values is calculated and used as an estimate for each incomplete feature value. Next, to improve classification performance, a novel KNN imputation procedure based on a feature-weighted distance metric is used. The distance metric takes into account the relevance of each input attribute for classification, as measured by Mutual Information (MI).

Imputation might not work in situations such as the following. Consider a set of diagnostic data of patients and normal people collected from different hospitals distributed geographically. The missing rate in this type of situation is very high, and techniques based on feature deletion and imputation strategies are not suitable. The reason is that the missing values may be distributed differently from the existing ones, so these techniques either introduce bias or report inaccurate results.
For example, the diagnosis data of patients and of normal people have different distributions. To solve this problem, Qu et al. [29] proposed a two-stage model that avoids imputation and deletion. In phase I, the dataset is divided into disjoint subsets based on the attributes with missing values. In phase II, each subset is used to train an appropriate classification algorithm, in parallel. This model is referred to as the QU-Model in this paper.

The main objective of this paper is to compare the performance of the GL-Model and the QU-Model on different datasets whose missing mechanism is Missing At Random (MAR). Under MAR, the probability of the observed missing pattern, given the observed and unobserved data, does not depend on the values of the unobserved data. This mechanism is common in practice and is generally considered the default type of missing data. The paper is organized as follows. Section II explains the GL-Model, while Section III explains the QU-Model. Section IV presents the experimental results and performs the comparison. Section V presents the summary of the work.

II. GL-MODEL

The KNN method selects K patterns from the full dataset in such a way that they minimize a distance measure. After finding the K nearest neighbours, a replacement value to substitute for the missing attribute must be estimated. The method used for estimation depends on the type of data; the mode can be used for discrete data and the mean for continuous data. An improved alternative is to weight the contribution of each neighbour according to its distance to the incomplete pattern whose values will be imputed, giving greater contribution to close neighbours [21]. An advantage over mean/mode imputation and the simple hot deck method (in fact, KNNimpute with K = 1) is that the replacement values are influenced only by the most similar cases rather than by all cases or by the single most similar one, respectively.
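The estimation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `knn_impute`, the use of `None` to mark a missing value, the plain Euclidean distance over the observed numeric attributes, and the inverse-distance weights are all assumptions made for illustration.

```python
import math

def knn_impute(target, complete_rows, missing_idx, k=3, discrete=False):
    """Estimate target[missing_idx] from the k nearest complete rows.

    Assumes every attribute used for the distance is numeric and that
    None marks a missing value in `target` (hypothetical conventions).
    """
    observed = [j for j, v in enumerate(target)
                if v is not None and j != missing_idx]

    def dist(row):
        # Euclidean distance over the attributes observed in `target`.
        return math.sqrt(sum((target[j] - row[j]) ** 2 for j in observed))

    neighbours = sorted(complete_rows, key=dist)[:k]
    values = [row[missing_idx] for row in neighbours]
    if discrete:
        # Mode of the neighbours' values for discrete data.
        return max(values, key=values.count)
    # Inverse-distance weights so that closer neighbours contribute more.
    weights = [1.0 / (dist(row) + 1e-9) for row in neighbours]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)
```

With `k=1` this reduces to copying the value of the single most similar complete case, which matches the remark that simple hot deck imputation is KNNimpute with K = 1.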
The main drawback of this approach is that whenever KNNimpute looks for the most similar patterns, the algorithm searches through the whole training dataset (in the complete data portion), which implies a high computational cost. One important step in KNNImpute is the selection of the distance metric. The distance metric used in the GL-Model is the heterogeneous Euclidean-overlap metric (HEOM) [5]. HEOM is described as follows. The distance between two input vectors $x_a$ and $x_b$ is denoted $d(x_a, x_b)$ and is calculated using Equation (1).

$$ d(x_a, x_b) = \sqrt{\sum_{j=1}^{n} d_j(x_{aj}, x_{bj})^2} \qquad (1) $$

where $n$ is the number of attributes and $d_j(x_{aj}, x_{bj})$ is the distance between $x_a$ and $x_b$ on the jth attribute, given by Equation (2).

$$ d_j(x_{aj}, x_{bj}) = \begin{cases} 1, & (1 - m_{aj})(1 - m_{bj}) = 0 \\ d_0(x_{aj}, x_{bj}), & x_j \text{ is qualitative} \\ d_N(x_{aj}, x_{bj}), & x_j \text{ is quantitative} \end{cases} \qquad (2) $$

When an input value is unknown, it is treated as missing data and a distance value of 1 is returned; the first case of Equation (2) expresses this, with $m_{aj}$ acting as the missing-value indicator of $x_{aj}$. The overlap function $d_0$ takes the value 0 if the qualitative features are the same and 1 otherwise (Equation 3), and $d_N$ is the range-normalized difference distance function given by Equation (4).

$$ d_0(x_{aj}, x_{bj}) = \begin{cases} 0, & x_{aj} = x_{bj} \\ 1, & x_{aj} \neq x_{bj} \end{cases} \qquad (3) $$

$$ d_N(x_{aj}, x_{bj}) = \frac{|x_{aj} - x_{bj}|}{\max(x_j) - \min(x_j)} \qquad (4) $$

where $\max(x_j)$ and $\min(x_j)$ are the maximum and minimum values observed in the N training instances for the jth attribute.

To classify an unlabeled pattern x, the distances from x to the labeled instances are computed, its K nearest neighbours are identified, and the class labels of these nearest neighbours are then used to determine the class label of x. According to the voting KNN rule, x is assigned to the class represented by a majority of its K nearest neighbours [9]. In the standard KNN algorithm, the K neighbours are implicitly assumed to have equal weight in the decision, regardless of their distances to the pattern to be classified. Some approaches assign different weights to the K neighbours according to their distances to x, with closer instances having greater weights. In the GL-Model, the weighted KNN algorithm is referred to as KNNclassify. Following the distance-weighted rule proposed by Dudani [12], KNNclassify is implemented using the weighting procedure described in [7]. Thus, a weight $a_k$ is assigned to each nearest neighbour $v_k$ of x, with k = 1, 2, ..., K. The nearest neighbour receives a weight of 1, the furthest neighbour a weight of 0, and the remaining neighbours are scaled linearly between 0 and 1. An unlabeled pattern is assigned to the class producing the highest summed weight among its reference neighbours.

In the GL-Model, the approach for missing data imputation and classification is a modified KNNImpute. The KNNImpute method is modified because the learning process of the conventional method is not oriented towards providing an imputed dataset appropriate for solving the classification task. The modified method selects the neighbourhood by considering the relevance of the input attributes for classification. For each incomplete pattern, its K selected neighbours are used to provide imputed values that can make the classifier design easier, and thus the classification accuracy is increased.
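As a rough illustration of Equations (1) to (4) and of the distance-weighted vote, the following sketch may help. It is written under stated assumptions rather than taken from the paper: the names `heom` and `knn_classify`, the use of `None` for a missing value, the `qualitative` set of attribute indices, and the `ranges` mapping (max minus min per quantitative attribute, assumed non-zero) are all hypothetical.

```python
import math
from collections import defaultdict

def heom(xa, xb, qualitative, ranges):
    """HEOM distance between two patterns; None marks a missing value."""
    total = 0.0
    for j, (a, b) in enumerate(zip(xa, xb)):
        if a is None or b is None:
            d = 1.0                      # missing value: distance of 1
        elif j in qualitative:
            d = 0.0 if a == b else 1.0   # overlap distance, Equation (3)
        else:
            d = abs(a - b) / ranges[j]   # range-normalized difference, Equation (4)
        total += d * d
    return math.sqrt(total)

def knn_classify(x, train, labels, k, qualitative, ranges):
    """Distance-weighted vote: nearest neighbour weighs 1, furthest weighs 0."""
    nearest = sorted((heom(x, row, qualitative, ranges), y)
                     for row, y in zip(train, labels))[:k]
    d1, dk = nearest[0][0], nearest[-1][0]
    votes = defaultdict(float)
    for d, y in nearest:
        # Linear scaling between the nearest and the furthest neighbour.
        votes[y] += 1.0 if dk == d1 else (dk - d) / (dk - d1)
    return max(votes, key=votes.get)
```

When all K neighbours are equally distant, every weight is set to 1, so the rule falls back to a plain majority vote.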
This approach uses a feature-weighted distance metric based on MI, which is a good indicator of dependence between random variables. Feature selection algorithms, in general, assign binary weights to features, that is, a weight of 1 for selected relevant attributes and a weight of 0 for irrelevant features. This has the advantage of reducing input dimensionality and computational complexity. In the GL-Model, the feature-weighted procedure assigns one weight per feature according to the MI estimate between that feature and the target class variable. Details of MI estimation can be found in [21]. For classification, the MI measures the amount of information contained in an input feature for predicting the target class variable [10]. A high MI between an input feature and the target means that the feature is relevant, regardless of the classification algorithm. Conversely, when the information shared between the two variables is small, the input feature is irrelevant for the classification task. In the GL-Model, the MI concept is used to weight the per-feature distances in (1) according to their relevance for classification. The method assigns a weight $\lambda_j$ (Equation 5) to each jth input feature according to the amount of information that this attribute contains about the target class variable. The scaling factors $\lambda_j$ are computed heuristically, as given by Weinberger et al. [34]. The higher $\lambda_j$ is, the more relevant $X_j$ is for classification.

$$ \lambda_j = \frac{I(X_j, C)}{\sum_{j'=1}^{n} I(X_{j'}, C)} \qquad (5) $$

According to the MI concept, the feature-weighted distance metric between two input vectors $x_a$ and $x_b$ is computed using Equation (6).

$$ d_I(x_a, x_b) = \sqrt{\sum_{j=1}^{n} \left( \lambda_j \, d_j(x_{aj}, x_{bj}) \right)^2} \qquad (6) $$

where $d_j$ is the distance defined in Equation (2). When this MI-weighted distance metric is implemented in KNNImpute, the modified MI-KNNImpute is formed.
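A small sketch of Equations (5) and (6) is given below. Note the swapped-in technique: the paper estimates MI with the Parzen window method, whereas this sketch uses a simple plug-in (frequency-count) estimate that only suits discrete features. The function names and the column-wise data layout are assumptions, and the sketch presumes at least one feature has non-zero MI so that the normalisation in Equation (5) is defined.

```python
import math
from collections import Counter

def mutual_information(xs, cs):
    """Plug-in estimate of I(X, C) in bits for two discrete sequences."""
    n = len(xs)
    px, pc, pxc = Counter(xs), Counter(cs), Counter(zip(xs, cs))
    return sum((nxc / n) * math.log2(nxc * n / (px[x] * pc[c]))
               for (x, c), nxc in pxc.items())

def mi_weights(columns, classes):
    """Equation (5): each feature's MI divided by the sum over all features."""
    mi = [mutual_information(col, classes) for col in columns]
    total = sum(mi)  # assumed to be non-zero
    return [m / total for m in mi]

def weighted_distance(per_feature_d, weights):
    """Equation (6): per-feature distances d_j scaled by lambda_j."""
    return math.sqrt(sum((w * d) ** 2 for w, d in zip(weights, per_feature_d)))
```

A feature that carries no information about the class receives a weight of 0 and therefore drops out of the distance, which is how the metric steers neighbour selection towards class-relevant attributes.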
The advantage of this method is that it selects the K nearest cases by considering the relevance of the input attributes to the target class, thus adding useful information about the classification task during the imputation stage and providing missing data estimates oriented towards solving the classification task. The MI between an attribute and the target class variable is estimated by the Parzen window method [21]. The MI-KNN classifier assigns a weight $\beta_k$ to the k-th nearest neighbour using Equation (7).

$$ \beta_k(x) = \begin{cases} \dfrac{d_I(v_K, x) - d_I(v_k, x)}{d_I(v_K, x) - d_I(v_1, x)}, & d_I(v_K, x) \neq d_I(v_1, x) \\ 1, & d_I(v_K, x) = d_I(v_1, x) \end{cases} \qquad (7) $$

where $d_I(\cdot)$ is the distance measure based on the MI concept defined by (6) and $V_x$ is the set of K nearest neighbours of x arranged in increasing order of distance. During classification, x is assigned to the class for which the weights $\beta_k$ of the representatives among the $K_C$ nearest neighbours sum to the largest value. This is performed by considering the relevance of the input features to the target class variable.

III. QU-MODEL

The QU-Model proposes a two-stage algorithm for situations where imputation methods are not applicable. In the first stage, the original dataset is segmented into disjoint sets according to the information gain of the features with missing values (Equation 8). In the second stage, the data in each set obtained from stage 1 are used to train a classifier.

$$ \mathrm{Gain}(S, F) = E(S) - \sum_{v \in \{F_N, F_A\}} \frac{|S_v|}{|S|} E(S_v) \qquad (8) $$

where S is the original dataset, F is the feature with missing values, and $F_N$ and $F_A$ represent the subsets of F depending on whether the value of F is missing (N) or available (A). E(S) is the information entropy of S (Equation 9), where n is the number of categories and $p_i$ is the probability that the samples in dataset S belong to the ith class.

$$ E(S) = -\sum_{i=1}^{n} p_i \log_2(p_i) \qquad (9) $$

The method used for training during stage I is given in Figure 1.

1. Given a training dataset D, calculate the information gain of each feature with missing data.
2. Select the feature F with the highest information gain. Split D into two subsets $D_N$ and $D_A$ based on the values of F (missing or available). In other words, $D_A$ consists of the samples where the values of F are available and $D_N$ contains the remaining samples. Since the values of F are all missing in $D_N$, F is removed and the dimension of $D_N$ is reduced by one. A binary tree T is built in which D is the root node and the child nodes are $D_A$ and $D_N$.
3. For datasets $D_A$ and $D_N$, repeat step 1 and step 2 until there are no missing values in each child node.
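The three training steps above, together with Equations (8) and (9), can be sketched as follows. This is an illustrative reading, not the authors' code: the names `entropy`, `missing_gain` and `partition`, the use of `None` for a missing value, and the flat list of leaves returned in place of an explicit binary tree T are assumptions. The minimum missing rate and the per-subset classifier training are left out.

```python
import math
from collections import Counter

def entropy(labels):
    """Equation (9): entropy of the class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def missing_gain(rows, labels, j):
    """Equation (8): gain from splitting on whether feature j is missing."""
    gain = entropy(labels)
    for want_missing in (True, False):
        part = [y for r, y in zip(rows, labels) if (r[j] is None) == want_missing]
        if part:
            gain -= len(part) / len(labels) * entropy(part)
    return gain

def partition(rows, labels, features):
    """Split recursively until no subset has missing values in its features.

    Returns a list of (feature_tuple, rows, labels) leaves.
    """
    candidates = [j for j in features if any(r[j] is None for r in rows)]
    if not candidates:
        return [(tuple(features), rows, labels)]
    f = max(candidates, key=lambda j: missing_gain(rows, labels, j))
    leaves = []
    for want_missing in (True, False):
        idx = [i for i, r in enumerate(rows) if (r[f] is None) == want_missing]
        if not idx:
            continue
        # F is dropped in the subset where it is always missing (D_N).
        kept = [j for j in features if not (want_missing and j == f)]
        leaves += partition([rows[i] for i in idx], [labels[i] for i in idx], kept)
    return leaves
```

Each leaf carries the feature indices that remain fully observed in that subset, so a separate classifier could be trained on each leaf using only those columns.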
Figure 1: Training Stage I

After Stage 1, the dimensions of most subsets are lower than that of the original dataset D, which may help improve the performance of the classifiers used in Phase II. In stage 2, the original dataset D has been split into several subsets $D_i$. A separate classifier $C_i$ is trained on each subset with the classification algorithm provided by LIBSVM [8]. When the number of features with missing values is high, a minimum missing rate is applied to avoid generating too many subsets (the number of samples in each subset needs to be maintained at a reasonable level for classification). Testing is performed as follows. In Stage 1, for each unknown sample X to be classified, find the corresponding subset $D_i$ in the tree T generated above. In Stage 2, apply $C_i$ (trained on $D_i$) to assign a class label to X.

IV. EXPERIMENTAL RESULTS

The experiments were conducted with the objective of testing the efficiency of the GL and QU models in solving classification tasks with incomplete and complete data. To compare the results, the KNNImpute approach, the GL-Model and the QU-Model were tested with the C4.5 classifier. The efficiency of the algorithms was analyzed using the classification accuracy and the time taken to classify. The real-world datasets used are the Abalone, Credit Approval and Annealing datasets, obtained from the UCI repository [28]. 10-fold cross validation [6] was used in all experiments.

Abalone [3] is a multi-class dataset and, for the sake of clarity, a new two-class dataset was created by selecting a subset of samples and merging different classes. The resulting dataset was a relatively balanced one containing 689 class 1 samples and 709 class 2 samples. The dataset has a feature sex with three possible values, M, F and I, and all sex values marked I were regarded as missing data. As a result, the incomplete dataset had 578 samples with missing values. The credit card dataset ( br:443/pakdd2009) contains customer information and credit levels (a 2-class problem).
A subset of the original data was selected based on shops (ID_Shop), and then the values of the feature Months_In_Job were discarded for some selected shops to create an incomplete dataset. This was to simulate the real-world scenario where certain data from specific shops are missing. In the final dataset, 3171 of the samples had missing values. The annealing dataset [14] is a multi-class dataset having 798 instances with 38 attributes. In the dataset, the presence of the character - in any attribute denotes a missing value, resulting in an incomplete dataset. A subset of the original data was selected based on family type (family), totalling 70 cases of missing data.

Table I shows the classification accuracy for the three datasets selected. The experiments were conducted with subset selection and with the full dataset.

Table I. Classification Accuracy

Dataset                         C4.5    GL Model    QU Model
Abalone
CreditApproval (subset)
CreditApproval (full dataset)
Annealing (subset)
Annealing (full dataset)

Taking execution time into consideration, the QU algorithm was faster in producing classification results than the GL algorithm. However, the C4.5 algorithm without a missing value handler was the quickest of the three. The reason is that the additional algorithms require a small amount of extra time to handle the incomplete data during classification. This difference is minimal (on average 0.09% with the GL-Model and 0.04% with the QU-Model), and the accuracy gains obtained by the GL and QU models outweigh it. During experimentation, when the full dataset with missing data was considered, the GL-Model produced better classification accuracy. When a subset of the dataset was considered, as in the Credit Approval dataset, where the sex attribute took only two nominal values, the imputation algorithm's performance degraded and the QU-Model ranked better than the GL-Model. In all cases, the classifier without a missing value handler produced poor results. Figure 2 shows the classification time of the algorithms.

Figure 2: Performance based on Classification Time (vertical axis: Time (Seconds); groups: Abalone, CreditApproval (subset), CreditApproval (full dataset), Annealing (subset), Annealing (full dataset); series: C4.5, GL-Model, QU-Model)

V. CONCLUSION

Missing data in databases creates problems in almost all steps of data mining. In this paper, a method based on KNN combined with imputation and a method which uses a two-stage approach without imputation are analyzed.
The first method uses a feature-weighted distance metric combined with a KNN classifier to handle incomplete data for classification. The second method first divides the dataset into disjoint subsets according to the attributes with missing values, and classification is then performed using these subsets. Several experiments were conducted, and the results indicate that while both models perform well with missing data, the QU-Model performs better than the GL-Model when the attribute value is binary. It would be worthwhile to pursue research into developing a hybrid model that combines the GL and QU models by applying the imputation method to the subsets produced in the first stage and then performing classification, and to study and analyze the effect on classification.

REFERENCES

[1] Acuna, E. and Rodriguez, C. (2004) The treatment of missing values and its effect in the classifier accuracy, in: D. Banks, L. House, F.R. McMorris, P. Arabie, W. Gaul (Eds.), Classification, Clustering and Data Mining Applications, Springer, Berlin.
[2] Allison, P.D. (2001) Missing data, Sage University Papers Series on Quantitative Applications in the Social Sciences, Thousand Oaks, California, USA.
[3] Asuncion, A. and Newman, D.J. (2007) UCI Machine Learning Repository, University of California.
[4] Batista, G.E. and Monard, M.C. (2002) A study of k-nearest neighbour as an imputation method, in: Second International Conference on Hybrid Intelligent Systems, Vol. 87, Santiago, Chile, 2002.
[5] Batista, G.E. and Monard, M.C. (2003) An analysis of four missing data treatment methods for supervised learning, Applied Artificial Intelligence, Vol. 17, No. 5-6.
[6] Bishop, C.M. (1995) Neural Networks for Pattern Recognition, Oxford University Press, Oxford, UK.
[7] Brown, J.G. (2002) Using a multiple imputation technique to merge data sets, Applied Economics Letters, Vol. 9, No. 5.
[8] Chang, C. and Lin, C. (2001) LIBSVM: a library for support vector machines.
[9] Cover, T.M. and Hart, P.E. (1967) Nearest neighbor pattern classification, IEEE Transactions on Information Theory, Vol. 13, No. 1.
[10] Cover, T.M. and Thomas, J.A. (1991) Elements of Information Theory, Wiley-Interscience, New York.
[11] Duda, R.O., Hart, P.E. and Stork, D.G. (2000) Pattern Classification, Wiley-Interscience, New York.
[12] Dudani, S.A. (1976) The distance-weighted k-nearest-neighbor rule, IEEE Transactions on Systems, Man and Cybernetics, Vol. 6, No. 4.
[13] Francois, D., Rossi, F., Wertz, V. and Verleysen, M. (2007) Resampling methods for parameter-free and robust feature selection with mutual information, Neurocomputing, Vol. 70, Issue 7-9.
[14] Frank, A. and Asuncion, A. (2010) UCI Machine Learning Repository, Irvine, CA: University of California, School of Information and Computer Science.
[15] Gabrys, B. (2002) Neuro-fuzzy approach to processing inputs with missing values in pattern recognition problems, International Journal of Approximate Reasoning, Vol. 30, No. 3.
[16] Garcia-Laencina, P.J., Sancho-Gomez, J., Figueiras-Vidal, A.R.C and Verleysen, M. (2009) K nearest neighbours with mutual information for simultaneous classification and missing data imputation, Journal Neurocomputing, Elsevier Publications, Vol. 72, Issue 7-9.
[17] Ghahramani, Z. and Jordan, M.I. (1994) Supervised learning from incomplete data via an EM approach, in: J.D. Cowan, G. Tesauro, J. Alspector (Eds.), Advances in NIPS, Vol. 6, Morgan Kaufmann Publishers, Inc., Los Altos, CA.
[18] Hammer, B. and Villmann, T. (2002) Generalized relevance learning vector quantization, Neural Networks, Vol. 15, Issues 8-9.
[19] Hechenbichler, K. and Schliep, K. (2007) Weighted k-nearest-neighbor techniques and ordinal classification, Technical Report, Ludwig Maximilians University Munich.
[20] Jerez, J.M., Molina, I., Subirats, J.L. and Franco, L. (2006) Missing data imputation in breast cancer prognosis, in: BioMed 06: Proceedings of the 24th IASTED International Conference on Biomedical Engineering, ACTA Press, Anaheim, CA, USA.
[21] Kwak, N. and Choi, C.H. (2002) Input feature selection by mutual information based on Parzen window, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 12.
[22] Lall, U. and Sharma, A. (1996) A nearest-neighbor bootstrap for resampling hydrologic time series, Water Resource. Res., Vol. 32.
[23] Little, R.J.A. (1999) Methods for handling missing values in clinical trials, Journal of Rheumatology, Vol. 26, No. 8.
[24] Little, R.J.A. and Rubin, D.B. (2002) Statistical Analysis with Missing Data, 2nd Edition, John Wiley and Sons, New York.
[25] Markey, M.K. and Patel, A. (2004) Impact of missing data in training artificial neural networks for computer-aided diagnosis, in: International Conference on Machine Learning and Applications.
[26] Martin, T. (2003) A day in the life of a Data Miner, Bulletin of the International Statistical Institute, 54th Session, Vol. LX, Invited Papers, August 2003, Berlin, Germany.
[27] McLachlan, G.J. and Krishnan, T. (1997) The EM Algorithm and Extensions, Wiley, New York.
[28] Newman, D.J., Hettich, S., Blake, C.L. and Merz, C.J. (1998) UCI repository of machine learning databases.
[29] Qu, X., Yuan, B. and Liu, W. (2009) A Novel Two-Phase Method for the Classification of Incomplete Data, International Conference on Information Management, Innovation Management and Industrial Engineering.
[30] Quinlan, J.R. (1993) C4.5: Programs for Machine Learning, Morgan Kaufmann, Los Altos, CA.
[31] Rossi, F., Lendasse, A., Francois, D., Wertz, V. and Verleysen, M. (2006) Mutual information for the selection of relevant variables in spectrometric nonlinear modeling, Chemometrics and Intelligent Laboratory Systems, Vol. 80.
[32] Schafer, J.L. (1997) Analysis of Incomplete Multivariate Data, Chapman & Hall, Florida, USA.
[33] Troyanskaya, O., Cantor, M., Alter, O., Sherlock, G., Brown, P., Botstein, D., Tibshirani, R., Hastie, T. and Altman, R. (2001) Missing value estimation methods for DNA microarrays, Bioinformatics, Vol. 17, No. 6.
[34] Weinberger, K., Blitzer, J. and Saul, L. (2006) Distance metric learning for large margin nearest neighbor classification, in: Y. Weiss, B. Scholkopf, J. Platt (Eds.), Advances in NIPS 18, MIT Press, Cambridge, MA.
