Learning Topic Structure in Text Documents using Generative Topic Models
Nitish Srivastava
CS 397 Report
Advisor: Dr. Harish Karnick

Abstract—We present a method for estimating the topic structure of a document corpus by combining efficient clustering algorithms with generative topic models. Our method builds the topic structure as a DAG in a layer-wise manner, estimating the number of topics in each layer and constructing topic models for them. It discovers subsumption relations between topic nodes in consecutive layers by mapping the problem to finding maximum edge-weighted cliques on small, sparse graphs. We analyze the sparsity of the induced graphs and give bounds on the running time of our algorithm. Most methods for this problem use variants of hierarchical Dirichlet processes to provide a nonparametric prior on the number of topics, estimating the number as the topic model is built. While these models have been shown to perform well under certain metrics, the estimated number of topics is often quite large, limiting the utility of the resulting topic structures. Our approach is to build structures in which the number of topics is smaller and comparable to that in structures built by human experts. We evaluate our method using real-world text datasets.

Keywords—Data management; Data models; Hierarchical systems; Pattern clustering methods

I. INTRODUCTION

Topic models have been shown to be effective tools for the analysis of document corpora, and learning the topic structure is an important problem in this regard. Our method builds arbitrary DAG topic structures over a collection of documents, where nodes represent topics and edges point from a topic to a more specific sub-topic. Our key contributions are the estimation of the number of topics in each layer of the structure, and the discovery of subsumption relations between topic nodes in adjacent layers, cast as a max-clique problem. Topic models have been proposed which automatically learn the number of topics using various forms of Hierarchical Dirichlet Processes (HDP, Teh et al. [1]), such as nonparametric Bayes PAM [2].
[Description of HDP, PAM, NPB-PAM; criticism.]

We explore a different strategy for topic structure extraction: we separate the process of learning the topic structure from that of learning the topic model. We build the topic structure using simple clustering techniques combined with single-layer topic models. These structures can then be used by the various sophisticated topic models which require the number of topics and the hierarchy structure to be specified; hierarchical Latent Dirichlet Allocation (hLDA, [3]), the Correlated Topic Model (CTM, [4]), the Pachinko Allocation Model (PAM, [5]), and mixture-model extensions of these are some such methods.

Document clustering is a widely studied field; Zhao et al. [6] give an excellent comparison of several clustering paradigms. We use a clustering algorithm based on recursive bisection [7] to k-cluster a dataset and evaluate the quality of each cluster. We use this to approximate the number of topics at different depths in the DAG. Successive levels of depth in the DAG ("layers") represent more fine-grained topics. We learn a single-layer topic model for each depth. The topics in consecutive layers are then linked by subsumption relations. We cast the problem of discovering subsumption relations as finding a maximum edge-weighted clique over a sparse topic graph. Besides finding subsumption relations, our formulation also merges very similar topics in the same layer, further pruning the topic structure and making up for errors made in the clustering phase.

In order to validate our results, we use the 20 newsgroups comp5 (ngcomp5) and NIPS abstracts datasets. These datasets have single-level categories. To do a more meaningful evaluation, we also collected a hierarchical dataset from a crawl of the webpages linked from the Open Directory Project (ODP). This allows us to compare against human-expert-determined topic structures.

II. ESTIMATING THE NUMBER OF TOPICS

In this section, we describe a simple approach to obtaining a rough estimate of the number of topics in a document corpus. Nonparametric variants of topic models which use Hierarchical Dirichlet Processes (HDP), such as Latent Dirichlet Allocation [1] and the Pachinko Allocation Model (NPB-PAM [2]), have been used to estimate the number of topics while building the corresponding topic model. These methods place a nonparametric prior on the number of topics and estimate the number as the model is built. They have been shown to work well in terms of evaluation metrics involving likelihood estimates (empirical or otherwise) or classification accuracies on held-out data. Inferring the number of topics using HDPs gives more specific topics (as qualitatively shown for the case of NPB-PAM). However, the average number of topics estimated is often very high; for example, NPB-PAM discovers 179 sub-topics in the 20newsgroups comp5 dataset, which contains 5 topics. Such models are good for application domains where the topic model is meant for tasks where the
discovered topics themselves serve only as latent variables. It is not clear whether these topics would correspond to a human-determined ontology. Our goal is to generate a topic structure with typically fewer topic nodes and more resemblance to a topic hierarchy, so that it can be applied in domains which use the actual topic structure for organizing unstructured text information.

[Figure 1. Plot of the criterion function vs. the number of topics: (a) ODP, (b) NIPS, (c) ODP/Computers.]

We begin by obtaining an approximate number of topics using a clustering algorithm based on repeated bisections. Zhao and Karypis [6] have shown that this method produces better clusterings than agglomerative or graph-based methods. A key characteristic of this approach is that it uses a global criterion function whose optimization drives the entire clustering process. The criterion function used is

$$\text{maximize} \quad \sum_{i=1}^{k} \sqrt{\sum_{u,v \in S_i} u \cdot v}$$

where each document u is represented as a vector of normalized tf values. We k-cluster the set of documents for each k ∈ {2, 3, ..., K}, where K is the maximum number of topics allowed in any layer of the topic DAG. Computing a K-clustering takes O(NNZ log K) time, where NNZ is the number of non-zero entries in the document similarity matrix. In the process of K-clustering the dataset, the algorithm also produces k-clusterings for all k < K, since it performs repeated bisections. Hence, computing all clusterings from 2 to K also takes O(NNZ log K) time. Each clustering C_k corresponds to a set of clusters {C_k^1, C_k^2, ..., C_k^k}. The quality of each C_k is evaluated using the function

$$F(\mu_{int}, \mu_{ext}, \sigma_{int}) = \sum_{i=1}^{k} \frac{|C_k^i|}{|C_k|} \cdot \frac{\mu_{ext,i}\,\sigma_{int,i}}{\mu_{int,i}} \qquad (1)$$

Here µ_int denotes the vector of average intra-cluster similarities, µ_ext the vector of average inter-cluster similarities, and σ_int the vector of standard deviations of intra-cluster similarity. All similarities are cosine similarities. A lower value of the function indicates a better clustering.
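The quality function of Eq. (1) and the search for local minima of the resulting quality sequence can be sketched as follows. This is a minimal pure-Python illustration with names of our choosing, not code from the paper:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def cluster_quality(docs, labels):
    # Eq. (1): F = sum_i (|C_i|/N) * (mu_ext_i * sigma_int_i) / mu_int_i
    # Lower is better: tight clusters (high mu_int, low sigma_int)
    # that are well separated (low mu_ext) give small values.
    n = len(docs)
    total = 0.0
    for c in set(labels):
        inside = [i for i in range(n) if labels[i] == c]
        outside = [i for i in range(n) if labels[i] != c]
        intra = [cosine(docs[i], docs[j]) for i in inside for j in inside if i < j]
        mu_int = sum(intra) / len(intra) if intra else 1.0
        sigma_int = (math.sqrt(sum((s - mu_int) ** 2 for s in intra) / len(intra))
                     if intra else 0.0)
        ext = [cosine(docs[i], docs[j]) for i in inside for j in outside]
        mu_ext = sum(ext) / len(ext) if ext else 0.0
        total += (len(inside) / n) * (mu_ext * sigma_int) / max(mu_int, 1e-12)
    return total

def local_minima(q):
    # indices k where the quality sequence q has a local minimum
    return [k for k in range(1, len(q) - 1) if q[k] < q[k - 1] and q[k] < q[k + 1]]
```

Running `cluster_quality` for each k-clustering and passing the resulting sequence to `local_minima` yields the candidate topic numbers {m_1, m_2, ...}.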
The intuition behind the criterion function is that documents in a cluster should have low similarity with documents outside the cluster and high similarity, with low standard deviation, with documents inside it. This ensures that the clusters are tight. The ratio is weighted by the size of the cluster. This function is computed for each k-clustering, giving a sequence of quality values q_1, q_2, ..., q_K corresponding to the sequence of topic numbers; a low value implies a good clustering. Let {m_1, m_2, ..., m_l} be the sequence of topic numbers corresponding to local minima in the sequence of quality values. These minima represent regions where the number of topics is such that the clustering produced is locally optimal; they are potentially the numbers of topics at increasing depths in the topic structure. The intuition behind this is as follows. Suppose a document corpus has c topics. If the corpus were clustered into a (c-1)- or (c+1)-clustering, some of the actual c clusters would have to redistribute their documents to fit. This would decrease the tightness of the clusters, which is captured by the criterion function in Eq. (1). We therefore expect the value of the function to be higher on either side of a good clustering. Figure 1 shows plots of q_1, q_2, ..., q_30 for different datasets. With this heuristic in place, the goal of the clustering step is to determine {m_1, m_2, ..., m_L}, the sequence of approximate numbers of topics in each layer. For example, in Figure 1(b) the sequence of minima is {6, 9, 14, 18, ...}. These will be chosen as the approximate numbers of topics
in the topic structure corresponding to the NIPS abstracts dataset.

The underlying assumption that mutually exclusive document clusters are representative of topics is not entirely accurate, since the actual topic structure may be more complex: documents may contain content best described as a mixture of different topics, and topics may exhibit significant correlations. However, the purpose of this clustering is not to discover this rich underlying structure but to get a rough estimate of the number of topics in each layer of the topic structure. This estimate is important because it gives a starting point for the structure-building algorithm; a more accurate number of topics, along with subsumption relations between them, is discovered later. Table ?? shows the top few words in the clusters corresponding to the local minima of the criterion function. While the clusters may not be extremely accurate, they are good enough to justify their use as approximations. Section ?? compares the final topic numbers and structure with these approximations.

III. BUILDING THE DAG

A. Building Single-Layer Topic Models

The next step in building the topic structure is to find topic models for each value of the number of topics {m_1, m_2, ..., m_l} found in the previous section. We treat each layer independently and build a single-layer topic model for it. Our method allows the use of any statistical topic model, as long as it is possible to infer the topic proportions present in each document under that model. In this paper we demonstrate our method using simple Latent Dirichlet Allocation [8], though it is possible to use more sophisticated models; the simplicity of the model lets us demonstrate the key idea behind discovering subsumption relations more effectively. The framework of building independent models and then subsuming consecutive layers has previously been used by Zavitsanos et al. [9], who demonstrated its use in building document ontologies using LDA. Latent Dirichlet Allocation is a generative probabilistic model in which documents are represented as random mixtures over latent topics.
Topics are themselves modeled as an infinite mixture over a set of topic probabilities. The generative process can be summarized as: for each document, draw topic proportions from a Dirichlet prior; for each word position, draw a topic from those proportions and then draw the word from that topic's distribution over the vocabulary. We use the variational methods described in Blei et al. [8] to infer the topic proportions in each document. We have the sequence (m_i), i = 1, ..., L, of numbers of topics estimated from the clustering step. For each element m_i of this sequence, we build an LDA topic model T_i. For each document d of the corpus, the variational inference method gives a posterior Dirichlet parameter γ_d, where the proportion of topic u_i is represented by the component γ_{d,u_i}. One way to interpret this is to define a probability function P over the set of topics {u_1, u_2, ..., u_m}, where P(u_i) denotes the probability of occurrence of topic u_i in a random document. Given the document corpus D, this can be estimated as

$$P(u_i) = \frac{1}{|D|} \sum_{d \in D} \bar\gamma_{d,u_i}, \qquad \bar\gamma_{d,u_i} = \frac{\gamma_{d,u_i}}{\sum_{l=1}^{m} \gamma_{d,u_l}}$$

The joint probability distribution P(u_i, u_j) can be estimated as

$$P(u_i, u_j) = \frac{1}{|D|} \sum_{d \in D} \bar\gamma_{d,u_i}\,\bar\gamma_{d,u_j} \qquad (2)$$

The joint probabilities are a measure of the co-occurrence of different topics. This idea is exploited next for discovering subsumption relations.

B. Subsumption as a Maximum Weight Clique Problem

Now that we have topic models for each layer, the next step is to find subsumption relations between consecutive layers. The problem of discovering subsumption relations in ontologies has been widely studied. One of the most relevant works in the context of building topic structures is by Zavitsanos et al. [9], who use the idea of conditional independence between sub-topics given a candidate parent topic to decide subsumption relations. In their method, if {u_1, u_2, ..., u_m} is a topic layer and {v_1, v_2, ..., v_n} is the layer of topics just below it, then each pair (v_i, v_j) is assigned a parent u_k if the occurrence of u_k in a document makes the occurrences of topics v_i and v_j conditionally independent. The intuition is that if the co-occurrence of topics v_i and v_j is nullified once we know the parent u_k occurs, then the parent captures what is common between the topics and is therefore a good choice to subsume v_i and v_j.
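The estimates of Eq. (2) reduce to normalizing each document's γ vector and averaging. A small sketch, assuming γ is given as one list of topic weights per document (the helper name is hypothetical):

```python
def topic_probabilities(gamma):
    """Estimate P(u_i) and P(u_i, u_j) from LDA posterior Dirichlet
    parameters, as in Eq. (2). gamma: one list of per-topic weights
    per document."""
    D = len(gamma)
    m = len(gamma[0])
    # normalize each document's gamma into topic proportions
    bar = [[g / sum(row) for g in row] for row in gamma]
    # marginal probability of each topic, averaged over documents
    p = [sum(bar[d][i] for d in range(D)) / D for i in range(m)]
    # joint co-occurrence estimate for each pair of topics
    joint = [[sum(bar[d][i] * bar[d][j] for d in range(D)) / D
              for j in range(m)] for i in range(m)]
    return p, joint
```

By construction the marginals sum to 1 and the joint matrix is symmetric, which the subsumption step below relies on.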
Such a criterion for subsumption is reasonable, but it suffers from the constraint that it decides subsumption relations based on pair-wise independence alone. In this paper, we use a different criterion and a more general graph framework that captures pair-wise relations, uses a clique-based method to build higher-order sub-topic sets, and then determines subsumption relations.

Let there be m parent topics and n child topics in some pair of consecutive layers. For each child topic v_i we find the u_k for which P(v_i | u_k) is maximum, where P is defined as in Eq. (2); the topic v_i is then directly subsumed under u_k. This process associates each child topic with exactly one parent, building a tree structure. Next, we discover pair-wise subsumption relations, i.e., for each pair of distinct child topics (v_i, v_j) we find the parent topic u_k which best subsumes them. We later use the pair-wise relations to determine global subsumption relations. We would like u_k to subsume v_i and v_j if they could be considered sub-topics of u_k. For them to be sub-topics, two conditions must hold: v_i and v_j should be associated with u_k, but at the same time they must be sufficiently separate that they can be
considered separate sub-topics. These two conditions can be interpreted as demanding that P(v_i|u_k) and P(v_j|u_k) should both be high but P(v_i, v_j|u_k) must be low, i.e., the joint probability that v_i and v_j occur together given u_k must be low. In other words, each sub-topic individually must have a high conditional probability of occurrence given that the parent topic occurs (indicating that the sub-topics capture a part of the parent topic's vocabulary), but the two sub-topics must not occur together very often given the parent (if they do, they are not separate enough to be considered individual sub-topics). In this sense, the occurrences of v_i and v_j must be negatively correlated given u_k. Hence, for every (v_i, v_j), we find a u_k

$$u_k = \operatorname*{argmax}_{u_k} \left( P(v_i|u_k)\,P(v_j|u_k) - P(v_i, v_j|u_k) \right) \qquad (3)$$

subject to the objective value being at least th, where th is a positive threshold on the objective function to suppress very small values. If no such u_k exists, then (v_i, v_j) cannot be subsumed under any parent topic. A positive threshold ensures that the pair of sub-topics respects the two conditions above. A slightly different way to look at this objective function is to note that we want

$$P(v_j|v_i, u_k) \le P(v_j|u_k) \iff P(v_i|u_k)\,P(v_j|v_i, u_k) \le P(v_i|u_k)\,P(v_j|u_k) \iff P(v_i, v_j|u_k) \le P(v_i|u_k)\,P(v_j|u_k)$$

and the smaller the left-hand side is relative to the right, the better.

Solving the optimization problem in Eq. (3) gives the best parent topic under which a pair of child topics may be subsumed. We emphasize that neither of these child topics might actually be subsumed under this parent in the final hierarchy; this is just the best parent that could subsume this pair. Next, we construct a graph G_k for each parent topic u_k. G_k consists of those sub-topics which occur as part of a pair that is best subsumed under u_k, i.e.,

$$V(G_k) = \{\, i : u_k \text{ subsumes } (v_i, v_j) \text{ for some } j \,\} \qquad (4)$$
$$E(G_k) = \{\, (i, j) : u_k \text{ subsumes } (v_i, v_j) \,\} \qquad (5)$$
$$wt(i, j) = P(v_i|u_k)\,P(v_j|u_k) - P(v_i, v_j|u_k) \qquad (6)$$

Figure 2 shows some examples of such graphs for the Reuters dataset. The three graphs correspond to the three parent topics; the child layer consists of 8 topics. Note that any clique C in such a graph represents a set of topics which are mutually negatively correlated in the sense of Eq. (3). We want only the largest such set to be subsumed under the corresponding parent. For example, in the first graph in Figure 2, topics 8 and 5 are connected by an edge, meaning that they are negatively correlated; topics 8 and 4 are also connected. However, topics 4 and 5 are not connected, which means they are not sufficiently negatively correlated to be considered separate sub-topics of the parent topic. Hence we would not want both 4 and 5 to be subsumed. The cliques consisting of topics (8,1,4) or (8,2,4) are better choices, since each topic is then sufficiently different from the others; topic 5 would, in essence, be captured by topic 4. The weights associated with the edges denote the strength of the corresponding negative correlations. All vertices in the maximum edge-weighted clique should be subsumed under the parent topic. Hence the problem of determining the best subsumption set for any parent topic u_k reduces to finding the maximum weight clique in the graph G_k.

Though this problem is hard to solve in general, we observed that the graphs G_k induced by the real datasets we experimented on are typically very sparse (not more than 1 connected component and not more than 6-7 vertices in any graph). Hence, even an exponential-time algorithm is acceptable, keeping in mind that this ontology-building exercise is done offline. The pair-wise independence criterion used by Zavitsanos et al. [9] can be seen as a special case of this method, where only the maximum 2-clique is used for building subsumption relations. The next section explores this graph and the weight function in more detail and gives bounds on the sparsity of the induced graphs.

C. Sparsity Bounds

Let there be m parent topics and n child topics in some pair of consecutive layers.
An edge (i, j) can be in G_k for only one value of k, since only the best candidate parent u_k subsumes (v_i, v_j). There are O(n^2) such edges distributed across m graphs, so the expected number of edges in each graph is O(n^2/m). Since m and n are the numbers of topics in adjacent layers, we can assume m = O(n); therefore, the expected number of edges per graph is O(n), which means the maximum clique can be of size O(√n). Consider the function

$$f_k(i, j) = P(i|k)\,P(j|k) - P(i, j|k)$$

where i and j refer to child topics and k refers to the parent. If the occurrences of topics i and j are independent given k, then P(i, j|k) = P(i|k) P(j|k) and hence f_k(i, j) = 0. If the occurrences of the topics are negatively correlated, f_k(i, j) > 0.
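Because the induced graphs are so small, the subsumption step of Section III-B can be sketched with exhaustive search. The data layout (nested lists for P(v_i|u_k) and P(v_i, v_j|u_k)) and the function names below are our assumptions, not the authors' code:

```python
from itertools import combinations

def best_parent(i, j, cond, joint_cond, th):
    """Best parent for child pair (i, j) under Eq. (3).
    cond[k][i] = P(v_i | u_k); joint_cond[k][i][j] = P(v_i, v_j | u_k).
    Returns (k, weight), or None if no parent clears the threshold th."""
    best = None
    for k in range(len(cond)):
        w = cond[k][i] * cond[k][j] - joint_cond[k][i][j]
        if w >= th and (best is None or w > best[1]):
            best = (k, w)
    return best

def max_weight_clique(vertices, wt):
    """Exhaustive maximum edge-weighted clique. wt maps sorted vertex
    pairs to edge weights; missing pairs are non-edges. Exponential in
    |vertices|, which is fine for the 6-7 vertex graphs seen here."""
    best, best_w = [], 0.0
    verts = sorted(vertices)
    for r in range(2, len(verts) + 1):
        for cand in combinations(verts, r):
            pairs = list(combinations(cand, 2))
            if all(p in wt for p in pairs):      # all edges present?
                w = sum(wt[p] for p in pairs)
                if w > best_w:
                    best, best_w = list(cand), w
    return best, best_w
```

Calling `best_parent` over all child pairs populates the edge sets of Eqs. (4)-(6); `max_weight_clique` then picks the sub-topic set to subsume under each parent.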
[Figure 2. Determining subsumption relations: (a) graphs G_k for Reuters, 25 topics; (b) top few layers of the ontology.]

Also,

$$\sum_{i,j} f_k(i, j) = \sum_{i,j} \left( P(i|k)\,P(j|k) - P(i, j|k) \right) = \sum_i P(i|k) \sum_j P(j|k) - \sum_{i,j} P(i, j|k) = 1 \cdot 1 - 1 = 0$$

D. Model Trimming

So far we have built the topic structure assuming that the numbers of topics chosen in the clustering step were correct. Clustering methods are, however, susceptible to errors and use a very primitive notion of similarity. The objective function of Eq. (3) can be put to further use for trimming the topic structure. This time, we look at it as

$$w_{i,j}(k) = P(v_i|u_k)\,P(v_j|u_k) - P(v_i, v_j|u_k)$$

Recall that a high value of w_{i,j}(k) means that topics v_i and v_j are separated enough from each other, yet associated sufficiently with u_k, to be considered its sub-topics. A low value, on the other hand, means that the topics are quite similar and, as far as u_k is concerned, might be merged. This can be seen as a scheme where each parent topic u_k votes on whether (v_i, v_j) should be merged, the value of the vote being w_{i,j}(k). A high vote value means that the corresponding parent topic wants the pair to be kept separate; a low value indicates that they should be merged. These votes are then pooled to get a final value:

$$W(i, j) = \sum_{k=1}^{m} w_{i,j}(k)$$

If W(i, j) is positive, the pair is left alone; otherwise it is merged. A possible improvement is to weight the vote of each parent k by P(v_i|u_k) P(v_j|u_k). This factor ensures that the vote of a parent topic whose occurrence is more correlated with the occurrences of the child topics is taken more seriously, the idea being that such a parent topic carries statistically more weight than a parent topic which has nothing to do with this pair of child topics:

$$W(i, j) = \sum_{k=1}^{m} P(v_i|u_k)\,P(v_j|u_k)\,w_{i,j}(k); \qquad \text{merge } (v_i, v_j) \text{ if } W(i, j) < 0$$

While the intuition behind the method seems sound, more experiments need to be done to evaluate it.

IV. EVALUATION

The aim of this study is to provide a method to build semantically meaningful topic structures which are similar to those built by human experts.
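For concreteness, the weighted-vote merge test of Section III-D can be sketched as follows (the array layout and names are our assumptions):

```python
def should_merge(i, j, cond, joint_cond):
    """Weighted-vote merge test of Section III-D: merge child pair
    (v_i, v_j) when the pooled vote W(i, j) is negative.
    cond[k][i] = P(v_i | u_k); joint_cond[k][i][j] = P(v_i, v_j | u_k)."""
    W = 0.0
    for k in range(len(cond)):
        w = cond[k][i] * cond[k][j] - joint_cond[k][i][j]   # w_ij(k)
        W += cond[k][i] * cond[k][j] * w                    # weighted vote
    return W < 0
```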
We compare the topic structures determined using our method with human-built ones.

A. Datasets

We chose a subset of the Open Directory Project (ODP) category structure and crawled a random number of webpages linked under those categories. This data was collected using the RDF dump available from the ODP homepage, accessed on 10th April. Besides this, we use standard datasets such as the NIPS abstracts dataset, Reuters, and the 20 newsgroups dataset.
[Figure 3. Plot of the criterion function vs. the number of topics for different subsets of the Reuters dataset: (a) Reuters-13, (b) Reuters-25.]
[Figure 4. Top 3 layers of the ontology for the Reuters dataset. Black lines indicate the major parent; blue lines indicate a lesser degree of association.]

B. Quality of Clustering

Reuters 25-topic dataset. The criterion function has minima at 3, 8, 12, 20 and 24 topics; see Figure 3.

Top few words for a 3-topic clustering using LDA:
Topic 1: ship, offic, union, strike, gulf
Topic 2: tonne, mln, wheat, sugar, export, grain
Topic 3: oil, price, mln, dlr, pct, produc

Top few words for an 8-topic clustering using LDA:
Topic 1: price, market, dlr, futur, exchange, trade
Topic 2: oil, mln, pct, dlr, gold
Topic 3: tonne, export, sugar, wheat, mln
Topic 4: ship, strike, compan, port, union
Topic 5: oil, opec, price, mln, bpd, saudi
Topic 6: coffee, produc, export, quota, stock, cocoa
Topic 7: mln, crop, tonne, pct, product, grain, plant
Topic 8: propos, offic, farm, wheat, agricultur, grain

Figure 2 shows the graphs generated for deciding subsumption between the 8-topic layer and the 3-topic layer. Figure 4 shows the first 3 layers of the generated ontology tree. The subsumption seems quite appropriate considering the topic keywords given above.

ODP/Comp. This data was collected by us using the RDF dump available from the Open Directory Project page, accessed on 10th April. A subset of the science directory URLs was crawled and converted to bags of words. The topic structure is:

- Algorithms
- Artificial Intelligence: Academic Departments, People, Conferences and Events, Machine Learning, Natural Language, Neural Networks, Vision
- Hardware: Buses, Cables, Peripherals (Audio, Keyboards, Displays, Printers)
- Open Source
- Systems
- Internet: Organizations, Searching, Web design and development
- Security: Firewalls, Intrusion Detection Systems, Malicious software

Figure 1(c) shows the plot of the cluster quality criterion function. A significant amount of data cleaning needs to be done before this set is usable for further experiments.

This is a very qualitative analysis of the method; a more quantitative comparison needs to be done to fully validate it. We are currently working on implementing other methods, such as Non-Parametric LDA and Non-Parametric PAM, which also try to estimate the size and structure of a topic hierarchy. We plan to compare our method against these on metrics such as perplexity and deviation from human-defined structure.

V. CONCLUSIONS AND FUTURE WORK

Our method estimates the size and structure of the topic space underlying a set of documents. However, an important step in applying this method to a large number of practical problems is to develop a method for category naming, so that each of the discovered categories can be labeled. Our method currently lacks a formal justification and a quantitative analysis; we plan to work further in these directions.

REFERENCES

[1] Y. W. Teh, M. I. Jordan, M. J. Beal, and D. M. Blei, "Hierarchical Dirichlet processes," Journal of the American Statistical Association, vol. 101, no. 476, 2006.
[2] W. Li, D. Blei, and A. McCallum, "Nonparametric Bayes pachinko allocation," in UAI '07, 2007.
[3] D. Blei, T. L. Griffiths, M. I. Jordan, and J. B. Tenenbaum, "Hierarchical topic models and the nested Chinese restaurant process," in Advances in Neural Information Processing Systems 16, 2004.
[4] D. M. Blei and J. D. Lafferty, "Correlated topic models," in NIPS, 2005.
[5] W. Li and A. McCallum, "Pachinko allocation: DAG-structured mixture models of topic correlations," in ICML '06: Proceedings of the 23rd International Conference on Machine Learning. New York, NY, USA: ACM, 2006.
[6] Y. Zhao and G. Karypis, "Empirical and theoretical comparisons of selected criterion functions for document clustering," Machine Learning, vol. 55, no. 3, 2004.
[7] M. Steinbach, G. Karypis, and V. Kumar, "A comparison of document clustering techniques," in KDD Workshop on Text Mining, 2000.
[8] D. Blei, A. Y. Ng, and M. I. Jordan, "Latent Dirichlet allocation," Journal of Machine Learning Research, vol. 3, 2003.
[9] E. Zavitsanos, S. Petridis, G. Paliouras, and G. A. Vouros, "Determining automatically the size of learned ontologies," in ECAI 2008, ser. Frontiers in Artificial Intelligence and Applications, M. Ghallab, C. D. Spyropoulos, N. Fakotakis, and N. M. Avouris, Eds. IOS Press, 2008.
More information6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour
6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the
More informationSupport Vector Machines
/9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.
More informationFor instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1)
Secton 1.2 Subsets and the Boolean operatons on sets If every element of the set A s an element of the set B, we say that A s a subset of B, or that A s contaned n B, or that B contans A, and we wrte A
More informationHarvard University CS 101 Fall 2005, Shimon Schocken. Assembler. Elements of Computing Systems 1 Assembler (Ch. 6)
Harvard Unversty CS 101 Fall 2005, Shmon Schocken Assembler Elements of Computng Systems 1 Assembler (Ch. 6) Why care about assemblers? Because Assemblers employ some nfty trcks Assemblers are the frst
More informationParallelism for Nested Loops with Non-uniform and Flow Dependences
Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr
More informationConcurrent Apriori Data Mining Algorithms
Concurrent Apror Data Mnng Algorthms Vassl Halatchev Department of Electrcal Engneerng and Computer Scence York Unversty, Toronto October 8, 2015 Outlne Why t s mportant Introducton to Assocaton Rule Mnng
More informationy and the total sum of
Lnear regresson Testng for non-lnearty In analytcal chemstry, lnear regresson s commonly used n the constructon of calbraton functons requred for analytcal technques such as gas chromatography, atomc absorpton
More informationKent State University CS 4/ Design and Analysis of Algorithms. Dept. of Math & Computer Science LECT-16. Dynamic Programming
CS 4/560 Desgn and Analyss of Algorthms Kent State Unversty Dept. of Math & Computer Scence LECT-6 Dynamc Programmng 2 Dynamc Programmng Dynamc Programmng, lke the dvde-and-conquer method, solves problems
More informationSmoothing Spline ANOVA for variable screening
Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory
More informationBAYESIAN MULTI-SOURCE DOMAIN ADAPTATION
BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION SHI-LIANG SUN, HONG-LEI SHI Department of Computer Scence and Technology, East Chna Normal Unversty 500 Dongchuan Road, Shangha 200241, P. R. Chna E-MAIL: slsun@cs.ecnu.edu.cn,
More informationReducing Frame Rate for Object Tracking
Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg
More informationOptimizing Document Scoring for Query Retrieval
Optmzng Document Scorng for Query Retreval Brent Ellwen baellwe@cs.stanford.edu Abstract The goal of ths project was to automate the process of tunng a document query engne. Specfcally, I used machne learnng
More informationAn Optimal Algorithm for Prufer Codes *
J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,
More informationFEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur
FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents
More informationMachine Learning 9. week
Machne Learnng 9. week Mappng Concept Radal Bass Functons (RBF) RBF Networks 1 Mappng It s probably the best scenaro for the classfcaton of two dataset s to separate them lnearly. As you see n the below
More informationImprovement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration
Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,
More informationDetermining the Optimal Bandwidth Based on Multi-criterion Fusion
Proceedngs of 01 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 5 (01) (01) IACSIT Press, Sngapore Determnng the Optmal Bandwdth Based on Mult-crteron Fuson Ha-L Lang 1+, Xan-Mn
More information12/2/2009. Announcements. Parametric / Non-parametric. Case-Based Reasoning. Nearest-Neighbor on Images. Nearest-Neighbor Classification
Introducton to Artfcal Intellgence V22.0472-001 Fall 2009 Lecture 24: Nearest-Neghbors & Support Vector Machnes Rob Fergus Dept of Computer Scence, Courant Insttute, NYU Sldes from Danel Yeung, John DeNero
More informationSLAM Summer School 2006 Practical 2: SLAM using Monocular Vision
SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,
More informationBioTechnology. An Indian Journal FULL PAPER. Trade Science Inc.
[Type text] [Type text] [Type text] ISSN : 0974-74 Volume 0 Issue BoTechnology 04 An Indan Journal FULL PAPER BTAIJ 0() 04 [684-689] Revew on Chna s sports ndustry fnancng market based on market -orented
More informationCSCI 5417 Information Retrieval Systems Jim Martin!
CSCI 5417 Informaton Retreval Systems Jm Martn! Lecture 11 9/29/2011 Today 9/29 Classfcaton Naïve Bayes classfcaton Ungram LM 1 Where we are... Bascs of ad hoc retreval Indexng Term weghtng/scorng Cosne
More informationWishing you all a Total Quality New Year!
Total Qualty Management and Sx Sgma Post Graduate Program 214-15 Sesson 4 Vnay Kumar Kalakband Assstant Professor Operatons & Systems Area 1 Wshng you all a Total Qualty New Year! Hope you acheve Sx sgma
More information7/12/2016. GROUP ANALYSIS Martin M. Monti UCLA Psychology AGGREGATING MULTIPLE SUBJECTS VARIANCE AT THE GROUP LEVEL
GROUP ANALYSIS Martn M. Mont UCLA Psychology NITP AGGREGATING MULTIPLE SUBJECTS When we conduct mult-subject analyss we are tryng to understand whether an effect s sgnfcant across a group of people. Whether
More informationSimulation: Solving Dynamic Models ABE 5646 Week 11 Chapter 2, Spring 2010
Smulaton: Solvng Dynamc Models ABE 5646 Week Chapter 2, Sprng 200 Week Descrpton Readng Materal Mar 5- Mar 9 Evaluatng [Crop] Models Comparng a model wth data - Graphcal, errors - Measures of agreement
More informationGraph-based Clustering
Graphbased Clusterng Transform the data nto a graph representaton ertces are the data ponts to be clustered Edges are eghted based on smlarty beteen data ponts Graph parttonng Þ Each connected component
More informationEECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science
EECS 730 Introducton to Bonformatcs Sequence Algnment Luke Huan Electrcal Engneerng and Computer Scence http://people.eecs.ku.edu/~huan/ HMM Π s a set of states Transton Probabltes a kl Pr( l 1 k Probablty
More informationNUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS
ARPN Journal of Engneerng and Appled Scences 006-017 Asan Research Publshng Network (ARPN). All rghts reserved. NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS Igor Grgoryev, Svetlana
More informationAn Entropy-Based Approach to Integrated Information Needs Assessment
Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology
More informationA Post Randomization Framework for Privacy-Preserving Bayesian. Network Parameter Learning
A Post Randomzaton Framework for Prvacy-Preservng Bayesan Network Parameter Learnng JIANJIE MA K.SIVAKUMAR School Electrcal Engneerng and Computer Scence, Washngton State Unversty Pullman, WA. 9964-75
More informationIntelligent Information Acquisition for Improved Clustering
Intellgent Informaton Acquston for Improved Clusterng Duy Vu Unversty of Texas at Austn duyvu@cs.utexas.edu Mkhal Blenko Mcrosoft Research mblenko@mcrosoft.com Prem Melvlle IBM T.J. Watson Research Center
More informationAdaptive Transfer Learning
Adaptve Transfer Learnng Bn Cao, Snno Jaln Pan, Yu Zhang, Dt-Yan Yeung, Qang Yang Hong Kong Unversty of Scence and Technology Clear Water Bay, Kowloon, Hong Kong {caobn,snnopan,zhangyu,dyyeung,qyang}@cse.ust.hk
More informationThree supervised learning methods on pen digits character recognition dataset
Three supervsed learnng methods on pen dgts character recognton dataset Chrs Flezach Department of Computer Scence and Engneerng Unversty of Calforna, San Dego San Dego, CA 92093 cflezac@cs.ucsd.edu Satoru
More informationClustering is a discovery process in data mining.
Cover Feature Chameleon: Herarchcal Clusterng Usng Dynamc Modelng Many advanced algorthms have dffculty dealng wth hghly varable clusters that do not follow a preconceved model. By basng ts selectons on
More informationTerm Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task
Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto
More informationMULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION
MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION Paulo Quntlano 1 & Antono Santa-Rosa 1 Federal Polce Department, Brasla, Brazl. E-mals: quntlano.pqs@dpf.gov.br and
More informationClassifier Selection Based on Data Complexity Measures *
Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.
More informationTsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance
Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for
More informationX- Chart Using ANOM Approach
ISSN 1684-8403 Journal of Statstcs Volume 17, 010, pp. 3-3 Abstract X- Chart Usng ANOM Approach Gullapall Chakravarth 1 and Chaluvad Venkateswara Rao Control lmts for ndvdual measurements (X) chart are
More informationBackpropagation: In Search of Performance Parameters
Bacpropagaton: In Search of Performance Parameters ANIL KUMAR ENUMULAPALLY, LINGGUO BU, and KHOSROW KAIKHAH, Ph.D. Computer Scence Department Texas State Unversty-San Marcos San Marcos, TX-78666 USA ae049@txstate.edu,
More informationSHAPE RECOGNITION METHOD BASED ON THE k-nearest NEIGHBOR RULE
SHAPE RECOGNITION METHOD BASED ON THE k-nearest NEIGHBOR RULE Dorna Purcaru Faculty of Automaton, Computers and Electroncs Unersty of Craoa 13 Al. I. Cuza Street, Craoa RO-1100 ROMANIA E-mal: dpurcaru@electroncs.uc.ro
More informationEdge Detection in Noisy Images Using the Support Vector Machines
Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona
More informationA Fast Content-Based Multimedia Retrieval Technique Using Compressed Data
A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,
More informationKeyword-based Document Clustering
Keyword-based ocument lusterng Seung-Shk Kang School of omputer Scence Kookmn Unversty & AIrc hungnung-dong Songbuk-gu Seoul 36-72 Korea sskang@kookmn.ac.kr Abstract ocument clusterng s an aggregaton of
More informationSVM-based Learning for Multiple Model Estimation
SVM-based Learnng for Multple Model Estmaton Vladmr Cherkassky and Yunqan Ma Department of Electrcal and Computer Engneerng Unversty of Mnnesota Mnneapols, MN 55455 {cherkass,myq}@ece.umn.edu Abstract:
More informationSupport Vector Machines
Support Vector Machnes Decson surface s a hyperplane (lne n 2D) n feature space (smlar to the Perceptron) Arguably, the most mportant recent dscovery n machne learnng In a nutshell: map the data to a predetermned
More informationAssembler. Building a Modern Computer From First Principles.
Assembler Buldng a Modern Computer From Frst Prncples www.nand2tetrs.org Elements of Computng Systems, Nsan & Schocken, MIT Press, www.nand2tetrs.org, Chapter 6: Assembler slde Where we are at: Human Thought
More informationUnsupervised Learning and Clustering
Unsupervsed Learnng and Clusterng Supervsed vs. Unsupervsed Learnng Up to now we consdered supervsed learnng scenaro, where we are gven 1. samples 1,, n 2. class labels for all samples 1,, n Ths s also
More informationAssembler. Shimon Schocken. Spring Elements of Computing Systems 1 Assembler (Ch. 6) Compiler. abstract interface.
IDC Herzlya Shmon Schocken Assembler Shmon Schocken Sprng 2005 Elements of Computng Systems 1 Assembler (Ch. 6) Where we are at: Human Thought Abstract desgn Chapters 9, 12 abstract nterface H.L. Language
More informationProblem Definitions and Evaluation Criteria for Computational Expensive Optimization
Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty
More informationPrivate Information Retrieval (PIR)
2 Levente Buttyán Problem formulaton Alce wants to obtan nformaton from a database, but she does not want the database to learn whch nformaton she wanted e.g., Alce s an nvestor queryng a stock-market
More informationTESTING AND IMPROVING LOCAL ADAPTIVE IMPORTANCE SAMPLING IN LJF LOCAL-JT IN MULTIPLY SECTIONED BAYESIAN NETWORKS
TESTING AND IMPROVING LOCAL ADAPTIVE IMPORTANCE SAMPLING IN LJF LOCAL-JT IN MULTIPLY SECTIONED BAYESIAN NETWORKS Dan Wu 1 and Sona Bhatt 2 1 School of Computer Scence Unversty of Wndsor, Wndsor, Ontaro
More informationHelsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)
Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute
More informationUB at GeoCLEF Department of Geography Abstract
UB at GeoCLEF 2006 Mguel E. Ruz (1), Stuart Shapro (2), June Abbas (1), Slva B. Southwck (1) and Davd Mark (3) State Unversty of New York at Buffalo (1) Department of Lbrary and Informaton Studes (2) Department
More informationCE 221 Data Structures and Algorithms
CE 1 ata Structures and Algorthms Chapter 4: Trees BST Text: Read Wess, 4.3 Izmr Unversty of Economcs 1 The Search Tree AT Bnary Search Trees An mportant applcaton of bnary trees s n searchng. Let us assume
More informationText Similarity Computing Based on LDA Topic Model and Word Co-occurrence
2nd Internatonal Conference on Software Engneerng, Knowledge Engneerng and Informaton Engneerng (SEKEIE 204) Text Smlarty Computng Based on LDA Topc Model and Word Co-occurrence Mngla Shao School of Computer,
More informationGreedy Technique - Definition
Greedy Technque Greedy Technque - Defnton The greedy method s a general algorthm desgn paradgm, bult on the follong elements: confguratons: dfferent choces, collectons, or values to fnd objectve functon:
More informationA Statistical Model Selection Strategy Applied to Neural Networks
A Statstcal Model Selecton Strategy Appled to Neural Networks Joaquín Pzarro Elsa Guerrero Pedro L. Galndo joaqun.pzarro@uca.es elsa.guerrero@uca.es pedro.galndo@uca.es Dpto Lenguajes y Sstemas Informátcos
More informationCS246: Mining Massive Datasets Jure Leskovec, Stanford University
CS46: Mnng Massve Datasets Jure Leskovec, Stanford Unversty http://cs46.stanford.edu /19/013 Jure Leskovec, Stanford CS46: Mnng Massve Datasets, http://cs46.stanford.edu Perceptron: y = sgn( x Ho to fnd
More informationFuzzy Filtering Algorithms for Image Processing: Performance Evaluation of Various Approaches
Proceedngs of the Internatonal Conference on Cognton and Recognton Fuzzy Flterng Algorthms for Image Processng: Performance Evaluaton of Varous Approaches Rajoo Pandey and Umesh Ghanekar Department of
More informationLECTURE : MANIFOLD LEARNING
LECTURE : MANIFOLD LEARNING Rta Osadchy Some sldes are due to L.Saul, V. C. Raykar, N. Verma Topcs PCA MDS IsoMap LLE EgenMaps Done! Dmensonalty Reducton Data representaton Inputs are real-valued vectors
More informationThe Codesign Challenge
ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.
More informationLearning an Image Manifold for Retrieval
Learnng an Image Manfold for Retreval Xaofe He*, We-Yng Ma, and Hong-Jang Zhang Mcrosoft Research Asa Bejng, Chna, 100080 {wyma,hjzhang}@mcrosoft.com *Department of Computer Scence, The Unversty of Chcago
More informationA Fast Visual Tracking Algorithm Based on Circle Pixels Matching
A Fast Vsual Trackng Algorthm Based on Crcle Pxels Matchng Zhqang Hou hou_zhq@sohu.com Chongzhao Han czhan@mal.xjtu.edu.cn Ln Zheng Abstract: A fast vsual trackng algorthm based on crcle pxels matchng
More informationVirtual Machine Migration based on Trust Measurement of Computer Node
Appled Mechancs and Materals Onlne: 2014-04-04 ISSN: 1662-7482, Vols. 536-537, pp 678-682 do:10.4028/www.scentfc.net/amm.536-537.678 2014 Trans Tech Publcatons, Swtzerland Vrtual Machne Mgraton based on
More informationCHAPTER 2 DECOMPOSITION OF GRAPHS
CHAPTER DECOMPOSITION OF GRAPHS. INTRODUCTION A graph H s called a Supersubdvson of a graph G f H s obtaned from G by replacng every edge uv of G by a bpartte graph,m (m may vary for each edge by dentfyng
More informationGENETIC ALGORITHMS APPLIED FOR PATTERN GENERATION FOR DOWNHOLE DYNAMOMETER CARDS
GENETIC ALGORITHMS APPLIED FOR PATTERN GENERATION FOR DOWNHOLE DYNAMOMETER CARDS L. Schntman 1 ; B.C.Brandao 1 ; H.Lepkson 1 ; J.A.M. Felppe de Souza 2 ; J.F.S.Correa 3 1 Unversdade Federal da Baha- Brazl
More informationAnnouncements. Supervised Learning
Announcements See Chapter 5 of Duda, Hart, and Stork. Tutoral by Burge lnked to on web page. Supervsed Learnng Classfcaton wth labeled eamples. Images vectors n hgh-d space. Supervsed Learnng Labeled eamples
More information