Learning Tag Embeddings and Tag-specific Composition Functions in Recursive Neural Network


Qiao Qian, Bo Tian, Minlie Huang, Yang Liu*, Xuan Zhu*, Xiaoyan Zhu

State Key Lab. of Intelligent Technology and Systems, National Lab. for Information Science and Technology, Dept. of Computer Science and Technology, Tsinghua University, Beijing, PR China
*Samsung R&D Institute Beijing, China

qianqiaodecember29@126.com, smxtianbo@gmail.com, aihuang@tsinghua.edu.cn, yang.liu@samsung.com, xuan.zhu@samsung.com, zxy-dcs@tsinghua.edu.cn

Abstract

Recursive neural network is one of the most successful deep learning models for natural language processing due to the compositional nature of text. The model recursively composes the vector of a parent phrase from those of child words or phrases, with a key component named the composition function. Although a variety of composition functions have been proposed, syntactic information has not been fully encoded in the composition process. We propose two models: Tag Guided RNN (TG-RNN for short), which chooses a composition function according to the part-of-speech tag of a phrase, and Tag Embedded RNN/RNTN (TE-RNN/RNTN for short), which learns tag embeddings and then combines tag and word embeddings together. In fine-grained sentiment classification, experiment results show that the proposed models obtain remarkable improvements: TG-RNN/TE-RNN improve remarkably over baselines, TE-RNTN obtains the second best result among all the top performing models, and all the proposed models have far fewer parameters and lower complexity than their counterparts.

1 Introduction

Among a variety of deep learning models for natural language processing, Recursive Neural Network (RNN) may be one of the most popular models. Thanks to the compositional nature of natural text, recursive neural network utilizes the recursive structure of the input, such as a phrase or sentence, and has been shown to be very effective for many natural language processing tasks including semantic relationship classification (Socher et al., 2012), syntactic parsing (Socher et al., 2013a), sentiment analysis (Socher et al., 2013b), and machine translation (Li et al., 2013).

The key component of RNN and its variants is the composition function: how to compose the vector representation for a longer text from the vectors of its child words or phrases. For instance, as shown in Figure 1, the vector of "is very interesting" can be composed from the vector of the left node "is" and that of the right node "very interesting". It is worth mentioning again that the composition process is conducted along the syntactic structure of the text, making RNN more interpretable than other deep learning models.

Figure 1: The example process of vector composition in RNN. The vector of node "very interesting" is composed from the vectors of node "very" and node "interesting". Similarly, the node "is very interesting" is composed from the phrase node "very interesting" and the word node "is".

There are various attempts to design the composition function in RNN (or related models). In RNN (Socher et al., 2011), a global matrix is used to linearly combine the elements of vectors. In RNTN (Socher et al., 2013b), a global tensor is used to compute the tensor products of dimensions to favor the association between different

elements of the vectors. Sometimes it is challenging to find a single function to model the composition process. As an alternative, multiple composition functions can be used. For instance, in MV-RNN (Socher et al., 2012), different matrices are designed for different words, though the model suffers from too many parameters. In AdaMC-RNN/RNTN (Dong et al., 2014), a fixed number of composition functions are linearly combined and the weight for each function is adaptively learned.

In spite of the success of RNN and its variants, the syntactic knowledge of the text is not yet fully employed in these models. Two ideas are motivated by the example shown in Figure 2: First, the composition function for the noun phrase "the movie"/NP should be different from that for the adjective phrase "very interesting"/ADJP, since the two phrases are quite syntactically different. More specifically to sentiment analysis, a noun phrase is much less likely to express sentiment than an adjective phrase. Two notable works should be mentioned here: (Socher et al., 2013a) proposed to combine the parsing and composition processes, but the purpose is parsing; (Hermann and Blunsom, 2013) designed composition functions according to the combinatory rules and categories in CCG grammar, however, only marginal improvement against Naive Bayes was reported. Our proposed model, Tag Guided RNN (TG-RNN), is designed to use the syntactic tag of the parent phrase to guide the composition process from the child nodes. As an example, we design one function for composing noun phrases (NP) and another for adjective phrases (ADJP). This simple strategy obtains remarkable improvements against strong baselines.

Figure 2: The parse tree for the sentence "The movie is very interesting", built by the Stanford Parser.

Second, when composing the adjective phrase "very interesting"/ADJP from the left node "very"/RB and the right node "interesting"/JJ, the right node is obviously more important than the left one. Furthermore, the right node "interesting"/JJ apparently contributes more to sentiment expression. To address this issue, we propose Tag Embedded RNN/RNTN (TE-RNN/RNTN), to learn an embedding vector for each word/phrase tag, and concatenate the tag vector with the word/phrase vector as input to the composition function. For instance, we have tag vectors for DT, NN, RB, JJ, ADJP, NP, etc., and the tag vectors are then used in composing the parent's vector. The proposed TE-RNTN obtains the second best result among all the top performing models, but with far fewer parameters and lower complexity. To the best of our knowledge, this is the first time that tag embedding has been proposed.

To summarize, the contributions of our work are as follows:

- We propose tag-guided composition functions in recursive neural network, TG-RNN. Tag-guided RNN allocates a composition function to a phrase according to the part-of-speech tag of the phrase.
- We propose to learn embedding vectors for part-of-speech tags of words/phrases, and integrate the tag embeddings in RNN and RNTN respectively. The two models, TE-RNN and TE-RNTN, can leverage the syntactic information of child nodes when generating the vectors of parent nodes.
- The proposed models are efficient and effective. The scale of the parameters is well controlled. Experimental results on the Stanford Sentiment Treebank corpus show the effectiveness of the models. TE-RNTN obtains the second best result among all publicly reported approaches, but with far fewer parameters and lower complexity.

The rest of the paper is structured as follows: in Section 2, we survey related work.
In Section 3, we introduce the traditional recursive neural network as background. We present our ideas in Section 4. The experiments are introduced in Section 5. We summarize the work in Section 6.

2 Related Work

Different kinds of representations are used in sentiment analysis. Traditionally, bag-of-words representations are used for sentiment analysis (Pang and Lee, 2008). To exploit the relationship between words, word co-occurrence (Turney et al., 2010) and syntactic contexts (Padó

and Lapata, 2007) are considered. In order to distinguish antonyms with similar contexts, neural word vectors (Bengio et al., 2003) were proposed, which can be learnt in an unsupervised manner. Word2vec (Mikolov et al., 2013a) introduces a simpler network structure, making computation more efficient and making billions of training samples feasible.

Semantic composition deals with representing a longer text from its shorter components, which has been extensively studied recently. In many previous works, a phrase vector is usually obtained by average (Landauer and Dumais, 1997), addition, element-wise multiplication (Mitchell and Lapata, 2008), or tensor product (Smolensky, 1990) of word vectors. In addition to using vector representations, matrices can also be used to represent phrases, and the composition process can then be done through matrix multiplication (Rudolph and Giesbrecht, 2010; Yessenalina and Cardie, 2011).

Recursive neural models utilize the recursive structure (usually a parse tree) of a phrase or sentence for semantic composition. In Recursive Neural Network (Socher et al., 2011), the tree with the least reconstruction error is built and the vectors of interior nodes are composed by a global matrix. Matrix-Vector Recursive Neural Network (MV-RNN) (Socher et al., 2012) assigns a matrix to every word so that it can capture the relationship between two children. In Recursive Neural Tensor Networks (RNTN) (Socher et al., 2013b), the composition process is performed on a parse tree in which every node is annotated with fine-grained sentiment labels, and a global tensor is used for composition. Adaptive Multi-Compositionality (Dong et al., 2014) uses multiple weighted composition matrices instead of sharing a single matrix.

The employment of syntactic information in RNN is still in its infancy. In (Socher et al., 2013a), the part-of-speech tags of child nodes are considered in combining the processes of composition and parsing. The main purpose is better parsing by employing RNN; it is not designed for sentiment analysis. In (Hermann and Blunsom, 2013), the authors designed composition functions according to the combinatory rules and categories in CCG grammar. However, only marginal improvement against Naive Bayes was reported. Unlike (Hermann and Blunsom, 2013), our TG-RNN obtains remarkable improvements against strong baselines, and we are the first to propose tag embedded RNTN, which obtains the second best result among all reported approaches.

3 Background: Recursive Neural Models

In recursive neural models, the vector of a longer text (e.g., a sentence) is composed from those of its shorter components (e.g., words or phrases). To compose a sentence vector through word/phrase vectors, a binary parse tree has to be built with a parser. The leaf nodes represent words and interior nodes represent phrases. Vectors of interior nodes are computed recursively by composition of child nodes' vectors. Specially, the root vector is regarded as the sentence representation. The composition process is shown in Figure 1.

More formally, the vector $v_i \in \mathbb{R}^d$ for node $i$ is calculated via

$$v_i = f(g(v_i^l, v_i^r)) \quad (1)$$

where $v_i^l$ and $v_i^r$ are the child vectors, $g$ is a composition function, and $f$ is a nonlinearity function, usually $\tanh$. Different recursive neural models mainly differ in the composition function. For example, the composition function for RNN is

$$g(v_i^l, v_i^r) = W \begin{bmatrix} v_i^l \\ v_i^r \end{bmatrix} + b \quad (2)$$

where $W \in \mathbb{R}^{d \times 2d}$ is a composition matrix and $b$ is a bias vector. And the composition function for RNTN is

$$g(v_i^l, v_i^r) = \begin{bmatrix} v_i^l \\ v_i^r \end{bmatrix}^{T} T^{[1:d]} \begin{bmatrix} v_i^l \\ v_i^r \end{bmatrix} + W \begin{bmatrix} v_i^l \\ v_i^r \end{bmatrix} + b \quad (3)$$

where $W$ and $b$ are defined as in the previous model and $T^{[1:d]} \in \mathbb{R}^{2d \times 2d \times d}$ is the tensor that defines multiple bilinear forms.
The vectors are used as feature inputs to a softmax classifier. The posterior probability over class labels on a node vector $v_i$ is given by

$$y_i = \mathrm{softmax}(W_s v_i + b_s) \quad (4)$$

The parameters in these models include the word table $L$, the composition matrix $W$ in RNN ($W$ and $T^{[1:d]}$ in RNTN), and the classification matrix $W_s$ for the softmax classifier.
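To make the shapes in Equations 1-4 concrete, here is a minimal NumPy sketch of the RNN and RNTN composition functions and the softmax classifier. The paper's own implementation used Theano; this standalone version, with an assumed dimension and randomly initialized parameters, is only meant to illustrate the computation:

```python
import numpy as np

d = 25                      # dimension of word/phrase vectors (assumed)
rng = np.random.default_rng(0)

# Parameters of Equations 2-4 (randomly initialized for illustration)
W   = rng.normal(scale=0.01, size=(d, 2 * d))        # composition matrix
b   = np.zeros(d)                                    # bias vector
T   = rng.normal(scale=0.01, size=(2 * d, 2 * d, d)) # RNTN tensor (Eq. 3)
W_s = rng.normal(scale=0.01, size=(5, d))            # softmax weights, 5 labels
b_s = np.zeros(5)

def compose_rnn(v_l, v_r):
    """Equation 2: linear composition of the two child vectors."""
    c = np.concatenate([v_l, v_r])
    return np.tanh(W @ c + b)

def compose_rntn(v_l, v_r):
    """Equation 3: bilinear (tensor) term plus the linear term."""
    c = np.concatenate([v_l, v_r])
    bilinear = np.einsum('i,ijk,j->k', c, T, c)   # c^T T^[1:d] c
    return np.tanh(bilinear + W @ c + b)

def classify(v):
    """Equation 4: posterior over the 5 sentiment labels."""
    z = W_s @ v + b_s
    e = np.exp(z - z.max())
    return e / e.sum()

# Compose "is very interesting" bottom-up, as in Figure 1
v_is, v_very, v_interesting = rng.normal(size=(3, d))
v_phrase = compose_rnn(v_very, v_interesting)   # "very interesting"
v_root = compose_rnn(v_is, v_phrase)            # "is very interesting"
print(classify(v_root))
```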

4 Incorporating Syntactic Knowledge into Recursive Neural Models

The central idea of the paper is inspired by the fact that words/phrases of different part-of-speech tags play different roles in semantic composition. As discussed in the introduction, a noun phrase (e.g., "a movie"/NP) may be composed differently from a verb phrase (e.g., "love movie"/VP). Furthermore, when composing the phrase "a movie"/NP, the two child words, "a"/DT and "movie"/NN, may play different roles in the composition process. Unfortunately, previous RNN models neglect such syntactic information, though the models do employ the parse structure of a sentence.

We have two approaches to improve the composition process by leveraging tags on parent nodes and child nodes. One approach is to use different composition matrices for parent nodes with different tags so that the composition process is guided by phrase type; for example, the matrix for NP is different from that for VP. The other approach is to introduce tag embeddings for words and phrases, for example, to learn tag vectors for NP, VP, ADJP, etc., and then integrate the tag vectors with the word/phrase vectors during the composition process.

4.1 Tag Guided RNN (TG-RNN)

We propose Tag Guided RNN (TG-RNN) to respect the tag of a parent phrase during the composition process. The model chooses a composition function according to the part-of-speech tag of the phrase. For example, "the movie" has tag NP and "very interesting" has tag ADJP, so the two phrases have different composition matrices.

More formally, we design composition functions $g$ with a factor of the phrase tag of the parent node. The composition function becomes

$$g(t_i, v_i^l, v_i^r) = g_{t_i}(v_i^l, v_i^r) = W_{t_i} \begin{bmatrix} v_i^l \\ v_i^r \end{bmatrix} + b_{t_i} \quad (5)$$

where $t_i$ is the phrase tag for node $i$, and $W_{t_i}$ and $b_{t_i}$ are the parameters of function $g_{t_i}$, defined as in Equation 2. In other words, phrase nodes with various tags have their own composition functions such as $g_{NP}$, $g_{VP}$, and so on. There are $k$ composition functions in total in this model, where $k$ is the number of phrase tags. When composing child vectors, a function is chosen from the function pool according to the tag of the parent node. The process is depicted in Figure 3. We term this model Tag guided RNN, TG-RNN for short.

Figure 3: The vector of phrase "very interesting" is composed with the highlighted $g_{ADJP}$, and "is very interesting" with $g_{VP}$.

But some tags have few occurrences in the corpus. It is hard and meaningless to train composition functions for those infrequent tags. So we simply choose the top $k$ frequent tags and train $k$ composition functions. A common composition function is shared across phrases with all infrequent tags. The value of $k$ depends on the size of the training set and the occurrences of each tag. Specially, when $k = 0$, the model is the same as the traditional RNN.
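A minimal sketch of Equation 5's function-pool lookup, continuing the NumPy setup above. The six frequent tags listed here are the top-6 phrase tags from Table 3; the shared fallback function for infrequent tags is as described in the text:

```python
import numpy as np

d, k = 25, 6
rng = np.random.default_rng(1)

top_tags = ['NP', 'S', 'VP', 'PP', 'ADJP', 'SBAR']   # top-k tags (Table 3)
# One (W_t, b_t) pair per frequent tag (Eq. 5), plus a shared fallback
W_pool = {t: rng.normal(scale=0.01, size=(d, 2 * d)) for t in top_tags}
b_pool = {t: np.zeros(d) for t in top_tags}
W_shared = rng.normal(scale=0.01, size=(d, 2 * d))
b_shared = np.zeros(d)

def compose_tg_rnn(parent_tag, v_l, v_r):
    """Pick the composition function by the parent phrase's tag."""
    W_t = W_pool.get(parent_tag, W_shared)   # infrequent tags share one function
    b_t = b_pool.get(parent_tag, b_shared)
    return np.tanh(W_t @ np.concatenate([v_l, v_r]) + b_t)

v_very, v_interesting = rng.normal(size=(2, d))
v_phrase = compose_tg_rnn('ADJP', v_very, v_interesting)  # uses g_ADJP
```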
4.2 Tag Embedded RNN and RNTN (TE-RNN/RNTN)

In this section, we propose tag embedded RNN (TE-RNN) and tag embedded RNTN (TE-RNTN) to respect the part-of-speech tags of child nodes during composition. As mentioned above, tags of parent nodes have an impact on composition. However, some phrases with the same tag should be composed in different ways. For example, "is interesting" and "like swimming" have the same tag VP, but it is not reasonable to compose the two phrases using the previous model because the part-of-speech tags of their children are quite different. If we used different composition functions for children with different tags as in TG-RNN, the number of tag pairs would amount to as many as $k \times k$, which makes the model infeasible due to too many parameters.

In order to capture the compositional effects of the tags of child nodes, an embedding $e_t \in \mathbb{R}^{d_e}$ is created for every tag $t$, where $d_e$ is the dimension of the tag vector. The tag vector and phrase vector are concatenated during composition, as illustrated in Figure 4. Formally, the phrase vector is composed by the function

$$g(v_i^l, e_{t_i^l}, v_i^r, e_{t_i^r}) = W \begin{bmatrix} v_i^l \\ e_{t_i^l} \\ v_i^r \\ e_{t_i^r} \end{bmatrix} + b \quad (6)$$

where $t_i^l$ and $t_i^r$ are the tags of the left and right nodes respectively, $e_{t_i^l}$ and $e_{t_i^r}$ are the tag vectors, and $W \in \mathbb{R}^{d \times (2d + 2d_e)}$ is the composition matrix. We term this model Tag embedded RNN, TE-RNN for short.

Figure 4: RNN with tag embedding. There is a tag embedding table storing vectors for RB, JJ, ADJP, etc. We compose the phrase vector of "very interesting" from the vectors for "very" and "interesting", and the tag vectors for RB and JJ.
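A minimal sketch of Equation 6, again in NumPy with illustrative dimensions ($d_e = 8$ is the optimal TE-RNN setting reported in Section 5.4). The only change from plain RNN composition is that each child's tag embedding is looked up and concatenated with its vector before the matrix multiply:

```python
import numpy as np

d, d_e = 25, 8                      # phrase and tag embedding dimensions
rng = np.random.default_rng(2)

tags = ['RB', 'JJ', 'ADJP', 'VBZ', 'VP', 'NP']
E = {t: rng.normal(scale=0.01, size=d_e) for t in tags}   # tag embedding table
W = rng.normal(scale=0.01, size=(d, 2 * d + 2 * d_e))     # Eq. 6 matrix
b = np.zeros(d)

def compose_te_rnn(v_l, tag_l, v_r, tag_r):
    """Equation 6: concatenate child vectors with their tag embeddings."""
    c = np.concatenate([v_l, E[tag_l], v_r, E[tag_r]])
    return np.tanh(W @ c + b)

v_very, v_interesting = rng.normal(size=(2, d))
v_phrase = compose_te_rnn(v_very, 'RB', v_interesting, 'JJ')  # "very interesting"
```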

Similarly, this idea can be applied to the Recursive Neural Tensor Network (Socher et al., 2013b). In RNTN, the tag vector and the phrase vector can be interweaved together through a tensor. More specifically, the phrase vectors and tag vectors are multiplied by the composition tensor. The composition function changes to the following:

$$g(v_i^l, e_{t_i^l}, v_i^r, e_{t_i^r}) = \begin{bmatrix} v_i^l \\ e_{t_i^l} \\ v_i^r \\ e_{t_i^r} \end{bmatrix}^{T} T^{[1:d]} \begin{bmatrix} v_i^l \\ e_{t_i^l} \\ v_i^r \\ e_{t_i^r} \end{bmatrix} + W \begin{bmatrix} v_i^l \\ e_{t_i^l} \\ v_i^r \\ e_{t_i^r} \end{bmatrix} + b \quad (7)$$

where the variables are similar to those defined in Equation 3 and Equation 6. We term this model Tag embedded RNTN, TE-RNTN for short.

The phrase vectors and tag vectors are used as input to a softmax classifier, giving the posterior probability over labels via

$$y_i = \mathrm{softmax}\left(W_s \begin{bmatrix} v_i \\ e_{t_i} \end{bmatrix} + b_s\right) \quad (8)$$

4.3 Model Training

Let $y_i$ be the target distribution for node $i$ and $\hat{y}_i$ the predicted sentiment distribution. Our goal is to minimize the cross-entropy error between $y_i$ and $\hat{y}_i$ for all nodes. The loss function is defined as

$$E(\theta) = -\sum_i \sum_j y_i^j \log \hat{y}_i^j + \lambda \lVert \theta \rVert_2^2 \quad (9)$$

where $j$ is the label index, $\lambda$ is the $L_2$-regularization weight, and $\theta$ is the parameter set. Similar to RNN, the parameters for our models include the word vector table $L$, the composition matrix $W$, and the sentiment classification matrix $W_s$. Besides, our models have some additional parameters, as discussed below:

TG-RNN: There are $k$ composition matrices for the top $k$ frequent tags, defined as $W_t \in \mathbb{R}^{k \times d \times 2d}$. The original composition matrix $W$ is used for all infrequent tags. As a result, the parameter set of TG-RNN is $\theta = (L, W, W_t, W_s)$.

TE-RNN: The parameters include the tag embedding table $E$, which contains all the embeddings for part-of-speech tags of words and phrases. The composition matrix becomes $W \in \mathbb{R}^{d \times (2d + 2d_e)}$ and the softmax classifier $W_s \in \mathbb{R}^{N \times (d_e + d)}$. The parameter set of TE-RNN is $\theta = (L, E, W, W_s)$.

TE-RNTN: This model has one more tensor $T \in \mathbb{R}^{(2d + 2d_e) \times (2d + 2d_e) \times d}$ than TE-RNN. The parameter set of TE-RNTN is $\theta = (L, E, W, T, W_s)$.
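As a rough illustration of the objective in Equation 9, the sketch below evaluates the loss of a TE-RNN over one labeled tree: the summed node-level cross-entropy plus the L2 term. It is a schematic stand-in, not the paper's implementation (which computes gradients by backpropagation through structure and trains with minibatch SGD with momentum in Theano); the node encoding and the $\lambda$ value are placeholders we chose for the example:

```python
import numpy as np

rng = np.random.default_rng(3)
d, d_e, N = 25, 8, 5
tags = ['RB', 'JJ', 'ADJP']
params = {
    'E':   {t: rng.normal(scale=0.01, size=d_e) for t in tags},
    'W':   rng.normal(scale=0.01, size=(d, 2 * d + 2 * d_e)),
    'b':   np.zeros(d),
    'W_s': rng.normal(scale=0.01, size=(N, d + d_e)),
    'b_s': np.zeros(N),
}

def node_loss(node):
    """Cross-entropy at this node (Eqs. 6, 8) plus its subtree's losses."""
    if len(node) == 3:                              # leaf: (word_vec, tag, label)
        v, tag, label = node
        below = 0.0
    else:                                           # interior: (left, right, tag, label)
        left, right, tag, label = node
        loss_l, v_l, tag_l = node_loss(left)
        loss_r, v_r, tag_r = node_loss(right)
        c = np.concatenate([v_l, params['E'][tag_l], v_r, params['E'][tag_r]])
        v = np.tanh(params['W'] @ c + params['b'])  # Eq. 6
        below = loss_l + loss_r
    z = params['W_s'] @ np.concatenate([v, params['E'][tag]]) + params['b_s']
    p = np.exp(z - z.max()); p /= p.sum()
    return below - np.log(p[label]), v, tag        # one-hot cross-entropy

def tree_loss(root, lam=1e-4):                      # lam: placeholder value
    """Equation 9: summed node cross-entropy plus L2 on the parameters."""
    ce, _, _ = node_loss(root)
    mats = [params['W'], params['b'], params['W_s'], params['b_s'],
            *params['E'].values()]
    return ce + lam * sum(np.sum(m ** 2) for m in mats)

# "very/RB interesting/JJ" -> "very interesting"/ADJP, labels in {0..4}
leaf_very = (rng.normal(size=d), 'RB', 2)
leaf_interesting = (rng.normal(size=d), 'JJ', 3)
print(tree_loss((leaf_very, leaf_interesting, 'ADJP', 4)))
```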

5 Experiment

5.1 Dataset and Experiment Setting

We evaluate our models on the Stanford Sentiment Treebank, which contains fully labeled parse trees. It is built upon 10,662 reviews and each sentence has sentiment labels on every node of the parse tree. The sentiment label set is {0, 1, 2, 3, 4}, where the numbers mean very negative, negative, neutral, positive, and very positive, respectively. We use the standard split (train: 8,544, dev: 1,101, test: 2,210) of the corpus in our experiments. In addition, we add the part-of-speech tag for each leaf node and the phrase-type tag for each interior node using the latest version of the Stanford Parser. Because the newer parser generated trees different from those provided in the dataset, 74/11/11 reviews in the train/dev/test sets are ignored. After removing the broken reviews, our dataset contains the remaining reviews (train: 8,470, dev: 1,090, test: 2,199).

The word vectors were pre-trained on an unlabeled corpus (about 100,000 movie reviews) by word2vec (Mikolov et al., 2013b) as initial values, and the other vectors are initialized by sampling from a uniform distribution $U(-\epsilon, \epsilon)$, where $\epsilon$ is 0.01 in our experiments. The dimension of word vectors is 25 for RNN models and 20 for RNTN models. Tanh is chosen as the nonlinearity function. After computing the output of node $i$ with $v_i = f(g(v_i^l, v_i^r))$, we set $v_i = v_i / \lVert v_i \rVert$ so that the resulting vector has a limited norm. The backpropagation algorithm (Rumelhart et al., 1986) is used to compute gradients, and we use minibatch SGD with momentum as the optimization method, implemented with Theano (Bastien et al., 2012). We trained all our models using stochastic gradient descent with a batch size of 30 examples, momentum of 0.9, a fixed L2-regularization weight, and a constant learning rate.

5.2 System Comparison

We compare our models with several methods which have been evaluated on the Sentiment Treebank corpus. The baseline results are reported in (Dong et al., 2014) and (Kim, 2014). We make comparison to the following baselines:

SVM. An SVM model with bag-of-words representation (Pang and Lee, 2008).

MNB/b-MNB. Multinomial Naive Bayes and its bigram variant, adopted from (Wang and Manning, 2012).

RNN. The first Recursive Neural Network model, proposed by (Socher et al., 2011).

MV-RNN. Matrix-Vector Recursive Neural Network (Socher et al., 2012) represents each word and phrase with both a vector and a matrix. As reported, this model suffers from too many parameters.

RNTN. Recursive Neural Tensor Network (Socher et al., 2013b) employs a tensor for the composition function, which can model the meaning of longer phrases and capture negation rules.

AdaMC. Adaptive Multi-Compositionality for RNN and RNTN (Dong et al., 2014) trains more than one composition function and adaptively learns the weight for each function.

DCNN/CNN. Dynamic Convolutional Neural Network (Kalchbrenner et al., 2014) and a simple Convolutional Neural Network (Kim, 2014). Though these models are of a different genre to RNN, we include them here for fair comparison since they are among the top performing approaches on this task.

Para-Vec. A word2vec variant (Le and Mikolov, 2014) that encodes paragraph information into word embedding learning. A simple but very competitive model.

DRNN. Deep Recursive Neural Network (Irsoy and Cardie, 2014) stacks multiple recursive layers.

Method            Fine-grained   Pos./Neg.
SVM               40.7           79.4
MNB               41.0           81.8
b-MNB             41.9           83.1
RNN               43.2           82.4
MV-RNN            44.4           82.9
RNTN              45.7           85.4
AdaMC-RNN         45.8           87.1
AdaMC-RNTN        46.7           88.5
DRNN              49.8           86.6
TG-RNN (ours)     47.0           86.3
TE-RNN (ours)     48.0           86.8
TE-RNTN (ours)    48.9           87.7
CNN               48.0           88.1
DCNN              48.5           86.8
Para-Vec          48.7           87.8

Table 1: Classification accuracy. Fine-grained stands for 5-class prediction and Pos./Neg. means binary prediction, which ignores all neutral instances. All the accuracy is at the sentence level (root).

The comparative results are shown in Table 1. As illustrated, TG-RNN outperforms RNN, RNTN, MV-RNN, and AdaMC-RNN/RNTN. Compared with RNN, the fine-grained accuracy and binary accuracy of TG-RNN are improved by 3.8% and 3.9% respectively. When compared with AdaMC-RNN, the accuracy of our method rises by 1.2% on the fine-grained prediction. The results show that syntactic knowledge does facilitate phrase vector composition in this task.

As for TE-RNN/RNTN, the fine-grained accuracy of TE-RNN is boosted by 4.8% compared with RNN, and the accuracy of TE-RNTN by 3.2% compared with RNTN. TE-RNTN also beats AdaMC-RNTN by 2.2% on the fine-grained classification task. TE-RNN is comparable to CNN and DCNN, another line of models for this task. TE-RNTN is better than CNN, DCNN, and Para-Vec, which are the top performing approaches on this task. TE-RNTN is worse than DRNN, but the complexity of DRNN is much higher than that of TE-RNTN, as will be discussed in the next section. Furthermore, TE-RNN is also better than TG-RNN. This implies that learning tag embeddings for child nodes is more effective than simply using the tags of parent phrases in composition.

Note that the fine-grained accuracy is more convincing and reliable for comparing different approaches, due to two facts: First, for the binary classification task, some approaches train another binary classifier for positive/negative classification while other approaches, like ours, directly use the fine-grained classifier for this purpose. Second, how the neutral instances are processed is quite tricky and the details are not reported in the literature. In our work, we simply remove neutral instances from the test data before the evaluation. Let the 5-dimensional vector $y$ give the probabilities for each sentiment label in a test instance. The prediction is positive if $\arg\max_{i, i \neq 2} y_i$ is greater than 2, and negative otherwise, where $i \in \{0, 1, 2, 3, 4\}$ indexes very negative, negative, neutral, positive, and very positive, respectively.
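For concreteness, this is our reading of that binary decision rule as a small sketch:

```python
import numpy as np

def binary_prediction(y):
    """Map a 5-class distribution to positive/negative, ignoring the
    neutral class (index 2), as described in Section 5.2."""
    labels = np.array([0, 1, 3, 4])           # skip neutral
    best = labels[np.argmax(y[labels])]       # argmax over non-neutral classes
    return 'positive' if best > 2 else 'negative'

print(binary_prediction(np.array([0.10, 0.20, 0.40, 0.25, 0.05])))  # -> negative
```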

5.3 Complexity Analysis

To gain a deeper understanding of the models presented in Table 1, we discuss the parameter scale of the RNN/RNTN models, since the prediction power of neural network models is highly correlated with the number of parameters. The analysis is presented in Table 2 (the optimal values are adopted from the cited papers). The parameters for the word table have the same size $n \times d$ across all recursive neural models, where $n$ is the number of words and $d$ is the dimension of word vectors. Therefore, we ignore this part and focus on the parameters of the composition functions, termed model size. Our models, TG-RNN/TE-RNN, have far fewer parameters than RNTN and AdaMC-RNN/RNTN, but much better performance. Although TE-RNTN is worse than DRNN, the parameters of DRNN are almost 9 times ours. This indicates that DRNN is much more complex, requiring much more data and time to train. As a matter of fact, our TE-RNTN takes only 20 epochs for training, which is 10 times fewer than DRNN.

Method            Model size        # of parameters
RNN               2d^2              1.8K
RNTN              4d^3              108K
AdaMC-RNN         2d^2 c            18.7K
AdaMC-RNTN        4d^3 c            202K
DRNN              dhl + 2h^2 l      451K
TG-RNN (ours)     2d^2 (k+1)        8.8K
TE-RNN (ours)     2(d+d_e) d        1.7K
TE-RNTN (ours)    4(d+d_e)^2 d      54K

Table 2: The model size. $d$ is the dimension of word/phrase vectors (the optimal value is 30 for RNN & RNTN, 25 for AdaMC-RNN, 15 for AdaMC-RNTN, 300 for DRNN). For AdaMC, $c$ is the number of composition functions (15 is the optimal setting). For DRNN, $l$ and $h$ are the number of layers and the width of each layer (the optimal values are $l = 4$, $h = 174$). For our methods, $k$ is the number of unshared composition matrices and $d_e$ the dimension of tag embeddings; for the optimal settings refer to Section 5.4.
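As a quick sanity check, the model-size formulas of Table 2, evaluated at the cited optimal hyperparameters, reproduce the reported parameter counts:

```python
# Model-size formulas from Table 2, evaluated at the cited optimal settings.
sizes = {
    'RNN':        lambda d=30: 2 * d**2,
    'RNTN':       lambda d=30: 4 * d**3,
    'AdaMC-RNN':  lambda d=25, c=15: 2 * d**2 * c,
    'AdaMC-RNTN': lambda d=15, c=15: 4 * d**3 * c,
    'DRNN':       lambda d=300, h=174, l=4: d * h * l + 2 * h**2 * l,
    'TG-RNN':     lambda d=25, k=6: 2 * d**2 * (k + 1),
    'TE-RNN':     lambda d=25, d_e=8: 2 * (d + d_e) * d,
    'TE-RNTN':    lambda d=20, d_e=6: 4 * (d + d_e)**2 * d,
}
for name, formula in sizes.items():
    print(f'{name:12s} {formula():>8,d}')   # e.g. TE-RNTN -> 54,080 (~54K)
```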
5.4 Parameter Analysis

We have two key parameters to tune in our proposed models.

For TG-RNN, the number of composition functions $k$ is an important parameter, which corresponds to the number of distinct POS tags of phrases. Let's start with corpus analysis. As shown in Table 3, the corpus contains 215,154 phrases, but the distribution of phrase tags is extremely imbalanced. For example, the phrase tag NP appears 60,239 times while NAC appears only 10 times. Hence, it is impossible to learn a composition function for the infrequent phrase tags.

Phrase tag   Frequency     Phrase tag   Frequency
NP           60,239        ADVP         1,140
S            33,138        PRN          976
VP           26,956        FRAG         792
PP           14,979        UCP          362
ADJP         7,912         SINV         266
SBAR         5,308         others       1,102

Table 3: The distribution of phrase-type tags in the training data. The top 6 most frequent tags cover more than 95% of the phrases.

Each of the top $k$ frequent phrase tags corresponds to a unique composition function, while all the other phrase tags share a common function. We compare different values of $k$ for TG-RNN. The accuracy is shown in Figure 5. Our model obtains the best performance when $k$ is 6, which is accordant with the statistics in Table 3.

Figure 5: The accuracy for TG-RNN with different $k$, compared against AdaMC-RNN and RNN.

For TE-RNN/RNTN, the key parameter to tune is the dimension of the tag vectors. In the corpus, we have 70 types of tags for leaf nodes (words) and interior nodes (phrases). Infrequent tags whose frequency is less than 1,000 are ignored. There are 30 tags left and we learn an embedding for each of these frequent tags. We vary the dimension of the embedding $d_e$ from 0 to 30. Figure 6 shows the accuracy for TE-RNN and TE-RNTN with different dimensions $d_e$. Our model obtains the best performance when $d_e$ is 8 for TE-RNN and 6 for TE-RNTN. The results show that too small a dimension may not be sufficient to encode the syntactic information of tags, while too large a dimension damages the performance.

Figure 6: The accuracy for TE-RNN and TE-RNTN with different dimensions $d_e$, compared against AdaMC-RNN/RNTN and RNN/RNTN.

5.5 Tag Vectors Analysis

In order to show that the tag vectors obtained from the tag embedded models are meaningful, we inspect the similarity between vectors of tags. For each tag vector, we find the nearest neighbors based on Euclidean distance, summarized in Table 4.

Tag                  Most Similar Tags
JJ (Adjective)       ADJP (Adjective Phrase)
VP (Verb Phrase)     VBD (past tense), VBN (past participle)
. (Dot)              : (Colon)

Table 4: Top 1 or 2 nearest neighboring tags, with definitions in brackets.

Adjectives and verbs are of significant importance in sentiment analysis. Although JJ and ADJP are a word tag and a phrase tag respectively, they have similar tag vectors because they play the same role of adjective in sentences. VP, VBD and VBN, with similar representations, all represent verbs. What is more interesting is that the nearest neighbor of the dot is the colon, probably because both of them are punctuation marks. Note that tag classification is not one of our training objectives, yet surprisingly the vectors of similar tags are clustered together, which can provide additional information during sentence composition.
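The nearest-neighbor inspection behind Table 4 amounts to ranking tag embeddings by Euclidean distance. A minimal sketch with a toy embedding table (the vectors here are random placeholders, not the learned ones, so the printed neighbors will not match Table 4):

```python
import numpy as np

rng = np.random.default_rng(4)
E = {t: rng.normal(size=8) for t in ['JJ', 'ADJP', 'VP', 'VBD', 'VBN', '.', ':']}

def nearest_tags(query, k=2):
    """Rank all other tags by Euclidean distance to the query tag's vector."""
    dists = {t: np.linalg.norm(E[query] - v) for t, v in E.items() if t != query}
    return sorted(dists, key=dists.get)[:k]

print(nearest_tags('JJ'))   # with the learned embeddings, ADJP ranks first
```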

6 Conclusion

In this paper, we present two ways to leverage syntactic knowledge in Recursive Neural Networks. The first way is to use different composition functions for phrases with different tags, so that the composition process is guided by phrase type (TG-RNN). The second way is to learn tag embeddings and combine tag and word embeddings during composition (TE-RNN/RNTN). The proposed models are not only effective (w.r.t. competitive performance) but also efficient (w.r.t. a well-controlled parameter scale). Experiment results show that our models are among the top performing approaches to date, but with far fewer parameters and lower complexity.

Acknowledgments

This work was partly supported by the National Basic Research Program (973 Program) under grant No. 2012CB316301/2013CB329403, the National Science Foundation of China, and the Beijing Higher Education Young Elite Teacher Project. The work was also supported by the Tsinghua University Beijing Samsung Telecom R&D Center Joint Laboratory for Intelligent Media Computing.

References

Frédéric Bastien, Pascal Lamblin, Razvan Pascanu, James Bergstra, Ian J. Goodfellow, Arnaud Bergeron, Nicolas Bouchard, and Yoshua Bengio. 2012. Theano: new features and speed improvements. Deep Learning and Unsupervised Feature Learning NIPS 2012 Workshop.

Yoshua Bengio, Réjean Ducharme, Pascal Vincent, and Christian Janvin. 2003. A neural probabilistic language model. Journal of Machine Learning Research, 3:1137-1155.

Li Dong, Furu Wei, Ming Zhou, and Ke Xu. 2014. Adaptive multi-compositionality for recursive neural models with applications to sentiment analysis. In AAAI. AAAI.

Karl Moritz Hermann and Phil Blunsom. 2013. The role of syntax in vector space models of compositional semantics. In ACL. Association for Computational Linguistics.

Ozan Irsoy and Claire Cardie. 2014. Deep recursive neural networks for compositionality in language. In NIPS.

Nal Kalchbrenner, Edward Grefenstette, and Phil Blunsom. 2014. A convolutional neural network for modelling sentences. In ACL. Association for Computational Linguistics.

Yoon Kim. 2014. Convolutional neural networks for sentence classification. In EMNLP. Association for Computational Linguistics.

Thomas K. Landauer and Susan T. Dumais. 1997. A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104(2):211.

Quoc V. Le and Tomas Mikolov. 2014. Distributed representations of sentences and documents. In ICML, volume 32.

Peng Li, Yang Liu, and Maosong Sun. 2013. Recursive autoencoders for ITG-based translation. In EMNLP. Association for Computational Linguistics.

Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013a. Efficient estimation of word representations in vector space. CoRR.

Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. 2013b. Distributed representations of words and phrases and their compositionality. In NIPS.

Jeff Mitchell and Mirella Lapata. 2008. Vector-based models of semantic composition. In ACL.

Sebastian Padó and Mirella Lapata. 2007. Dependency-based construction of semantic space models. Computational Linguistics, 33(2):161-199.

Bo Pang and Lillian Lee. 2008. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2(1-2):1-135.

Sebastian Rudolph and Eugenie Giesbrecht. 2010. Compositional matrix-space models of language. In ACL. Association for Computational Linguistics.

David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams. 1986. Learning representations by back-propagating errors. Nature, 323:533-536.

Paul Smolensky. 1990. Tensor product variable binding and the representation of symbolic structures in connectionist systems. Artificial Intelligence, 46(1):159-216.

Richard Socher, Jeffrey Pennington, Eric H. Huang, Andrew Y. Ng, and Christopher D. Manning. 2011. Semi-supervised recursive autoencoders for predicting sentiment distributions.
In EMNLP. Association for Computational Linguistics.

Richard Socher, Brody Huval, Christopher D. Manning, and Andrew Y. Ng. 2012. Semantic compositionality through recursive matrix-vector spaces. In EMNLP. Association for Computational Linguistics.

Richard Socher, John Bauer, Christopher D. Manning, and Andrew Y. Ng. 2013a. Parsing with compositional vector grammars. In ACL. Association for Computational Linguistics.

Richard Socher, Alex Perelygin, Jean Y. Wu, Jason Chuang, Christopher D. Manning, Andrew Y. Ng, and Christopher Potts. 2013b. Recursive deep models for semantic compositionality over a sentiment treebank. In EMNLP. Association for Computational Linguistics.

Peter D. Turney, Patrick Pantel, et al. 2010. From frequency to meaning: Vector space models of semantics. Journal of Artificial Intelligence Research, 37(1):141-188.

Sida I. Wang and Christopher D. Manning. 2012. Baselines and bigrams: Simple, good sentiment and topic classification. In ACL. Association for Computational Linguistics.

Ainur Yessenalina and Claire Cardie. 2011. Compositional matrix-space models for sentiment analysis. In EMNLP. Association for Computational Linguistics.


A Novel Adaptive Descriptor Algorithm for Ternary Pattern Textures

A Novel Adaptive Descriptor Algorithm for Ternary Pattern Textures A Novel Adaptve Descrptor Algorthm for Ternary Pattern Textures Fahuan Hu 1,2, Guopng Lu 1 *, Zengwen Dong 1 1.School of Mechancal & Electrcal Engneerng, Nanchang Unversty, Nanchang, 330031, Chna; 2. School

More information

Efficient Text Classification by Weighted Proximal SVM *

Efficient Text Classification by Weighted Proximal SVM * Effcent ext Classfcaton by Weghted Proxmal SVM * Dong Zhuang 1, Benyu Zhang, Qang Yang 3, Jun Yan 4, Zheng Chen, Yng Chen 1 1 Computer Scence and Engneerng, Bejng Insttute of echnology, Bejng 100081, Chna

More information

Object-Based Techniques for Image Retrieval

Object-Based Techniques for Image Retrieval 54 Zhang, Gao, & Luo Chapter VII Object-Based Technques for Image Retreval Y. J. Zhang, Tsnghua Unversty, Chna Y. Y. Gao, Tsnghua Unversty, Chna Y. Luo, Tsnghua Unversty, Chna ABSTRACT To overcome the

More information

The Shortest Path of Touring Lines given in the Plane

The Shortest Path of Touring Lines given in the Plane Send Orders for Reprnts to reprnts@benthamscence.ae 262 The Open Cybernetcs & Systemcs Journal, 2015, 9, 262-267 The Shortest Path of Tourng Lnes gven n the Plane Open Access Ljuan Wang 1,2, Dandan He

More information

Learning to Order Natural Language Texts

Learning to Order Natural Language Texts Learnng to Order Natural Language Texts Jwe Tan a, b, Xaojun Wan a * and Janguo Xao a a Insttute of Computer Scence and Technology, The MOE Key Laboratory of Computatonal Lngustcs, Pekng Unversty, Chna

More information

Learning a Class-Specific Dictionary for Facial Expression Recognition

Learning a Class-Specific Dictionary for Facial Expression Recognition BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 16, No 4 Sofa 016 Prnt ISSN: 1311-970; Onlne ISSN: 1314-4081 DOI: 10.1515/cat-016-0067 Learnng a Class-Specfc Dctonary for

More information

Range images. Range image registration. Examples of sampling patterns. Range images and range surfaces

Range images. Range image registration. Examples of sampling patterns. Range images and range surfaces Range mages For many structured lght scanners, the range data forms a hghly regular pattern known as a range mage. he samplng pattern s determned by the specfc scanner. Range mage regstraton 1 Examples

More information

Journal of Chemical and Pharmaceutical Research, 2014, 6(6): Research Article. A selective ensemble classification method on microarray data

Journal of Chemical and Pharmaceutical Research, 2014, 6(6): Research Article. A selective ensemble classification method on microarray data Avalable onlne www.ocpr.com Journal of Chemcal and Pharmaceutcal Research, 2014, 6(6):2860-2866 Research Artcle ISSN : 0975-7384 CODEN(USA) : JCPRC5 A selectve ensemble classfcaton method on mcroarray

More information

Experiments in Text Categorization Using Term Selection by Distance to Transition Point

Experiments in Text Categorization Using Term Selection by Distance to Transition Point Experments n Text Categorzaton Usng Term Selecton by Dstance to Transton Pont Edgar Moyotl-Hernández, Héctor Jménez-Salazar Facultad de Cencas de la Computacón, B. Unversdad Autónoma de Puebla, 14 Sur

More information

From Comparing Clusterings to Combining Clusterings

From Comparing Clusterings to Combining Clusterings Proceedngs of the Twenty-Thrd AAAI Conference on Artfcal Intellgence (008 From Comparng Clusterngs to Combnng Clusterngs Zhwu Lu and Yuxn Peng and Janguo Xao Insttute of Computer Scence and Technology,

More information

Resolving Surface Forms to Wikipedia Topics

Resolving Surface Forms to Wikipedia Topics Resolvng Surface Forms to Wkpeda Topcs Ypng Zhou Lan Ne Omd Rouhan-Kalleh Flavan Vasle Scott Gaffney Yahoo! Labs at Sunnyvale {zhouy,lanne,omd,flavan,gaffney}@yahoo-nc.com Abstract Ambguty of entty mentons

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

Solving two-person zero-sum game by Matlab

Solving two-person zero-sum game by Matlab Appled Mechancs and Materals Onlne: 2011-02-02 ISSN: 1662-7482, Vols. 50-51, pp 262-265 do:10.4028/www.scentfc.net/amm.50-51.262 2011 Trans Tech Publcatons, Swtzerland Solvng two-person zero-sum game by

More information

Local Quaternary Patterns and Feature Local Quaternary Patterns

Local Quaternary Patterns and Feature Local Quaternary Patterns Local Quaternary Patterns and Feature Local Quaternary Patterns Jayu Gu and Chengjun Lu The Department of Computer Scence, New Jersey Insttute of Technology, Newark, NJ 0102, USA Abstract - Ths paper presents

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

Signed Distance-based Deep Memory Recommender

Signed Distance-based Deep Memory Recommender Sgned Dstance-based Deep Memory Recommender ABSTRACT Personalzed recommendaton algorthms learn a user s preference for an tem, by measurng a dstance/smlarty between them. However, some of exstng recommendaton

More information

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices Internatonal Mathematcal Forum, Vol 7, 2012, no 52, 2549-2554 An Applcaton of the Dulmage-Mendelsohn Decomposton to Sparse Null Space Bases of Full Row Rank Matrces Mostafa Khorramzadeh Department of Mathematcal

More information

Under-Sampling Approaches for Improving Prediction of the Minority Class in an Imbalanced Dataset

Under-Sampling Approaches for Improving Prediction of the Minority Class in an Imbalanced Dataset Under-Samplng Approaches for Improvng Predcton of the Mnorty Class n an Imbalanced Dataset Show-Jane Yen and Yue-Sh Lee Department of Computer Scence and Informaton Engneerng, Mng Chuan Unversty 5 The-Mng

More information

A Clustering Algorithm for Chinese Adjectives and Nouns 1

A Clustering Algorithm for Chinese Adjectives and Nouns 1 Clusterng lgorthm for Chnese dectves and ouns Yang Wen, Chunfa Yuan, Changnng Huang 2 State Key aboratory of Intellgent Technology and System Deptartment of Computer Scence & Technology, Tsnghua Unversty,

More information

On Some Entertaining Applications of the Concept of Set in Computer Science Course

On Some Entertaining Applications of the Concept of Set in Computer Science Course On Some Entertanng Applcatons of the Concept of Set n Computer Scence Course Krasmr Yordzhev *, Hrstna Kostadnova ** * Assocate Professor Krasmr Yordzhev, Ph.D., Faculty of Mathematcs and Natural Scences,

More information

Improved Relation Classification by Deep Recurrent Neural Networks with Data Augmentation

Improved Relation Classification by Deep Recurrent Neural Networks with Data Augmentation Improved Relaton Classfcaton by Deep Recurrent Neural Networks wth Data Augmentaton Yan Xu, 1,, Ran Ja, 1, Ll Mou, 1 Ge L, 1, Yunchuan Chen, 2 Yangyang Lu, 1 Zh Jn 1, 1 Key Laboratory of Hgh Confdence

More information

Investigating the Performance of Naïve- Bayes Classifiers and K- Nearest Neighbor Classifiers

Investigating the Performance of Naïve- Bayes Classifiers and K- Nearest Neighbor Classifiers Journal of Convergence Informaton Technology Volume 5, Number 2, Aprl 2010 Investgatng the Performance of Naïve- Bayes Classfers and K- Nearest Neghbor Classfers Mohammed J. Islam *, Q. M. Jonathan Wu,

More information

TPL-Aware Displacement-driven Detailed Placement Refinement with Coloring Constraints

TPL-Aware Displacement-driven Detailed Placement Refinement with Coloring Constraints TPL-ware Dsplacement-drven Detaled Placement Refnement wth Colorng Constrants Tao Ln Iowa State Unversty tln@astate.edu Chrs Chu Iowa State Unversty cnchu@astate.edu BSTRCT To mnmze the effect of process

More information