Making Tree Kernels practical for Natural Language Learning

Alessandro Moschitti
Department of Computer Science, University of Rome "Tor Vergata"
Rome, Italy

Abstract

In recent years tree kernels have been proposed for the automatic learning of natural language applications. Unfortunately, they show (a) an inherent super-linear complexity and (b) a lower accuracy than traditional attribute/value methods. In this paper, we show that tree kernels are very helpful in the processing of natural language as (a) we provide a simple algorithm to compute tree kernels in linear average running time and (b) our study on the classification properties of diverse tree kernels shows that kernel combinations always improve the traditional methods. Experiments with Support Vector Machines on the predicate argument classification task provide empirical support to our thesis.

1 Introduction

In recent years tree kernels have been shown to be interesting approaches for the modeling of syntactic information in natural language tasks, e.g. syntactic parsing (Collins and Duffy, 2002), relation extraction (Zelenko et al., 2003), Named Entity recognition (Cumby and Roth, 2003; Culotta and Sorensen, 2004) and Semantic Parsing (Moschitti, 2004).

The main tree kernel advantage is the possibility to generate a high number of syntactic features and let the learning algorithm select those most relevant for a specific application. In contrast, their major drawbacks are (a) the computational time complexity, which is super-linear in the number of tree nodes, and (b) the accuracy they produce, which is often lower than that provided by linear models on manually designed features.

To solve problem (a), a linear complexity algorithm for the subtree (ST) kernel computation was designed in (Vishwanathan and Smola, 2002). Unfortunately, the ST set is rather poorer than the one generated by the subset tree (SST) kernel designed in (Collins and Duffy, 2002). Intuitively, an ST rooted in a node n of the target tree always contains all of n's descendants down to the leaves. This does not hold for the SSTs, whose leaves can be internal nodes.

To solve problem (b), a study on different tree substructure spaces should be carried out to derive the tree kernel that provides the highest accuracy. On the one hand, SSTs provide learning algorithms with richer information, which may be critical to capture the syntactic properties of parse trees, as shown, for example, in (Zelenko et al., 2003; Moschitti, 2004). On the other hand, if the SST space contains too many irrelevant features, overfitting may occur and decrease the classification accuracy (Cumby and Roth, 2003). As a consequence, the fewer features of the ST approach may be more appropriate.

In this paper, we aim to solve the above problems. We present (a) an algorithm for the evaluation of the ST and SST kernels which runs in linear average time and (b) a study of the impact of diverse tree kernels on the accuracy of Support Vector Machines (SVMs). Our fast algorithm computes the kernels between two syntactic parse trees in O(m + n) average time, where m and n are the number of nodes in the two trees. This low complexity allows SVMs to carry out experiments on hundreds of thousands of training instances, since it is not higher than the complexity of the polynomial kernel, widely used in large-scale experimentation, e.g. (Pradhan et al., 2004).

To confirm such a hypothesis, we measured the impact of the algorithm on the time required by SVMs for the learning of about 122,774 predicate argument examples annotated in PropBank (Kingsbury and Palmer, 2002) and 37,948 instances annotated in FrameNet (Fillmore, 1982).

Regarding the classification properties, we studied the argument labeling accuracy of the ST and SST kernels and their combinations with the standard features (Gildea and Jurafsky, 2002). The results show that, on both the PropBank and FrameNet datasets, the SST-based kernel, i.e. the richest in terms of substructures, produces the highest SVM accuracy. When SSTs are combined with the manually designed features, we always obtain the best classifier. This suggests that the many fragments included in the SST space are relevant and, since their manual design may be problematic (requiring higher programming effort and deeper knowledge of the linguistic phenomenon), tree kernels provide remarkable help in feature engineering.

In the remainder of this paper, Section 2 describes the parse tree kernels and our fast algorithm. Section 3 introduces the predicate argument classification problem and its solution. Section 4 shows the comparative performance in terms of execution time and accuracy. Finally, Section 5 discusses the related work whereas Section 6 summarizes the conclusions.

2 Fast Parse Tree Kernels

The kernels that we consider represent trees in terms of their substructures (fragments). These latter define feature spaces which, in turn, are mapped into vector spaces, e.g. R^n. The associated kernel function measures the similarity between two trees by counting the number of their common fragments. More precisely, a kernel function detects if a tree subpart (common to both trees) belongs to the feature space that we intend to generate. For such purpose, the fragment types need to be described. We consider two important characterizations: the subtrees (STs) and the subset trees (SSTs).

2.1 Subtrees and Subset Trees

In our study, we consider syntactic parse trees; consequently, each node with its children is associated with a grammar production rule, where the symbol at the left-hand side corresponds to the parent node and the symbols at the right-hand side are associated with its children. The terminal symbols of the grammar are always associated with the leaves of the tree. For example, Figure 1 illustrates the syntactic parse of the sentence "Mary brought a cat to school".

Figure 1: A syntactic parse tree.

We define as a subtree (ST) any node of a tree along with all its descendants. For example, the line in Figure 1 circles the subtree rooted in the NP node. A subset tree (SST) is a more general structure. The difference with the subtrees is that the leaves can be associated with non-terminal symbols. The SSTs satisfy the constraint that they are generated by applying the same grammatical rule set which generated the original tree. For example, [S [N VP]] is an SST of the tree in Figure 1 which has two non-terminal symbols, N and VP, as leaves.

Figure 2: A syntactic parse tree with its subtrees (STs).

Figure 3: A tree with some of its subset trees (SSTs).

Given a syntactic tree, we can use as feature representation the set of all its STs or SSTs. For example, Figure 2 shows the parse tree of the sentence "Mary brought a cat" together with its 6 STs, whereas Figure 3 shows 10 SSTs (out of 17) of the subtree of Figure 2 rooted in VP.
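To make these fragment definitions concrete, the following Python sketch (not from the paper; the node() helper and the simplified labels are illustrative assumptions) shows one possible way to represent the example parse tree and to read off the grammar production associated with each internal node:

    # A node is a (label, children) pair; a leaf has an empty child list.
    def node(label, *children):
        return (label, list(children))

    # A simplified, hypothetical parse of "Mary brought a cat to school".
    tree = node("S",
                node("N", node("Mary")),
                node("VP",
                     node("V", node("brought")),
                     node("NP", node("D", node("a")), node("N", node("cat"))),
                     node("PP", node("IN", node("to")), node("N", node("school")))))

    def production(n):
        """Production rule of a node: parent label plus the tuple of child labels."""
        return (n[0], tuple(c[0] for c in n[1]))

    def nodes(t):
        """All nodes of a tree, in pre-order."""
        yield t
        for c in t[1]:
            yield from nodes(c)

    for n in nodes(tree):
        if n[1]:                      # leaves carry no production
            print(production(n))

In these terms, an ST is a node together with all of its descendants, while an SST is any fragment in which every retained node keeps either all of its children or none of them, so that the original productions are respected.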

The considerably different number of substructures gives an intuitive quantification of the different information level between the two tree-based representations.

2.2 The Tree Kernel Functions

The main idea of tree kernels is to compute the number of common substructures between two trees T_1 and T_2 without explicitly considering the whole fragment space. For this purpose, we slightly modified the kernel function proposed in (Collins and Duffy, 2002) by introducing a parameter σ which enables the ST or the SST evaluation.

Given the set of fragments F = {f_1, f_2, ...}, we define the indicator function I_i(n), which is equal to 1 if the target fragment f_i is rooted at node n and 0 otherwise. We define

    K(T_1, T_2) = \sum_{n_1 \in N_{T_1}} \sum_{n_2 \in N_{T_2}} \Delta(n_1, n_2)    (1)

where N_{T_1} and N_{T_2} are the sets of nodes of T_1 and T_2, respectively, and \Delta(n_1, n_2) = \sum_{i=1}^{|F|} I_i(n_1) I_i(n_2). This latter is equal to the number of common fragments rooted in the nodes n_1 and n_2. We can compute Δ as follows:

1. if the productions at n_1 and n_2 are different then Δ(n_1, n_2) = 0;

2. if the productions at n_1 and n_2 are the same, and n_1 and n_2 have only leaf children (i.e. they are pre-terminal symbols) then Δ(n_1, n_2) = 1;

3. if the productions at n_1 and n_2 are the same, and n_1 and n_2 are not pre-terminals then

    \Delta(n_1, n_2) = \prod_{j=1}^{nc(n_1)} (\sigma + \Delta(c^j_{n_1}, c^j_{n_2}))    (2)

where σ ∈ {0, 1}, nc(n_1) is the number of children of n_1 and c^j_n is the j-th child of node n. Note that, since the productions are the same, nc(n_1) = nc(n_2).

When σ = 0, Δ(n_1, n_2) is equal to 1 only if Δ(c^j_{n_1}, c^j_{n_2}) = 1 for every j, i.e. all the productions associated with the children are identical. By recursively applying this property, it follows that the subtrees rooted in n_1 and n_2 are identical. Thus, Eq. 1 evaluates the subtree (ST) kernel. When σ = 1, Δ(n_1, n_2) evaluates the number of SSTs common to n_1 and n_2, as proved in (Collins and Duffy, 2002).

Additionally, we study some variations of the above kernels which include the leaves in the fragment space. For this purpose, it is enough to add the condition:

0. if n_1 and n_2 are leaves and their associated symbols are equal then Δ(n_1, n_2) = 1

to the recursive rule set for the Δ evaluation (Zhang and Lee, 2003). We will refer to such extended kernels as ST+bow and SST+bow (bag-of-words). Moreover, we add the decay factor λ by modifying steps (2) and (3) as follows¹:

2. Δ(n_1, n_2) = λ,
3. \Delta(n_1, n_2) = \lambda \prod_{j=1}^{nc(n_1)} (\sigma + \Delta(c^j_{n_1}, c^j_{n_2})).

The computational complexity of Eq. 1 is O(|N_{T_1}| × |N_{T_2}|). We will refer to this basic implementation as the Quadratic Tree Kernel (QTK). However, as observed in (Collins and Duffy, 2002), this worst case is quite unlikely for the syntactic trees of natural language sentences; thus, we can design algorithms that run in linear time on average.

    function Evaluate_Pair_Set(Tree T_1, T_2)
    returns NODE_PAIR_SET;
    LIST L_1, L_2;
    NODE_PAIR_SET N_p;
    begin
      L_1 = T_1.ordered_list;
      L_2 = T_2.ordered_list;   /* the lists were sorted at loading time */
      n_1 = extract(L_1);       /* get the head element and */
      n_2 = extract(L_2);       /* remove it from the list  */
      while (n_1 and n_2 are not NULL)
        if (production_of(n_1) > production_of(n_2))
          then n_2 = extract(L_2);
        else if (production_of(n_1) < production_of(n_2))
          then n_1 = extract(L_1);
        else
          while (production_of(n_1) == production_of(n_2))
            while (production_of(n_1) == production_of(n_2))
              add(<n_1, n_2>, N_p);
              n_2 = get_next_elem(L_2);  /* get the head element and move
                                            the pointer to the next element */
            end
            n_1 = extract(L_1);
            reset(L_2);                  /* set the pointer at the first element */
          end
      end
      return N_p;
    end

Table 1: Pseudo-code for the fast evaluation of the node pair set used in the Fast Tree Kernel.
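As a concrete illustration of Eq. 1 and Eq. 2, the following Python sketch (a minimal reading of the definitions above, not the author's implementation; it assumes the (label, children) tuple representation of the previous sketch) computes Δ and the resulting quadratic kernel; sigma selects the ST (0) or SST (1) space, lam is the decay factor, and memoizing delta() would give the dynamic programming behavior discussed in Section 2.3:

    def production(n):
        return (n[0], tuple(c[0] for c in n[1]))

    def nodes(t):
        yield t
        for c in t[1]:
            yield from nodes(c)

    def is_preterminal(n):
        return len(n[1]) > 0 and all(len(c[1]) == 0 for c in n[1])

    def delta(n1, n2, sigma=1, lam=1.0):
        # Rule 1: leaves, or different productions -> no common fragments rooted here.
        if not n1[1] or not n2[1] or production(n1) != production(n2):
            return 0.0
        # Rule 2 (with decay): identical pre-terminal productions.
        if is_preterminal(n1):
            return lam
        # Rule 3 (with decay): same production, recurse on the aligned children.
        result = lam
        for c1, c2 in zip(n1[1], n2[1]):
            result *= sigma + delta(c1, c2, sigma, lam)
        return result

    def qtk(t1, t2, sigma=1, lam=1.0):
        """Quadratic Tree Kernel: Eq. 1 summed over all node pairs."""
        return sum(delta(n1, n2, sigma, lam)
                   for n1 in nodes(t1) for n2 in nodes(t2))

With sigma=0 the product is non-zero only when every child pair matches recursively, which yields the ST kernel; with sigma=1 it counts the common SSTs. The +bow variants would simply add the leaf-matching condition (rule 0) in front of rule 1.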
2.3 A Fast Tree Kernel (FTK)

To compute the kernels defined in the previous section, we sum the Δ function for each pair <n_1, n_2> ∈ N_{T_1} × N_{T_2} (Eq. 1). When the productions associated with n_1 and n_2 are different, we can avoid evaluating Δ(n_1, n_2) since it is 0.

¹ To have a similarity score between 0 and 1, we also apply the normalization in the kernel space, i.e. K'(T_1, T_2) = K(T_1, T_2) / \sqrt{K(T_1, T_1) \times K(T_2, T_2)}.
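The following Python sketch illustrates the pair-set idea behind Table 1 (it is an assumption-laden illustration, not the author's SVM-light-TK code, and it reuses the delta(), nodes() and production() helpers of the previous sketch); instead of the merge scan over two sorted lists, it groups the nodes of each tree by production, which plays the same role, and then sums Δ only over the matching pairs. The normalization of footnote 1 is also shown:

    import math
    from itertools import groupby

    def ordered_nodes(t):
        """Internal nodes sorted by production (can be done once, at loading time)."""
        return sorted((n for n in nodes(t) if n[1]), key=production)

    def node_pair_set(t1, t2):
        """All pairs <n1, n2> such that p(n1) == p(n2)."""
        g1 = {p: list(g) for p, g in groupby(ordered_nodes(t1), key=production)}
        g2 = {p: list(g) for p, g in groupby(ordered_nodes(t2), key=production)}
        for p in g1.keys() & g2.keys():
            for n1 in g1[p]:
                for n2 in g2[p]:
                    yield n1, n2

    def ftk(t1, t2, sigma=1, lam=1.0):
        """Fast Tree Kernel: Delta is summed only over the matching node pairs."""
        return sum(delta(n1, n2, sigma, lam) for n1, n2 in node_pair_set(t1, t2))

    def normalized_kernel(t1, t2, kernel=ftk):
        """Similarity score between 0 and 1 (footnote 1)."""
        return kernel(t1, t2) / math.sqrt(kernel(t1, t1) * kernel(t2, t2))

On real parse trees most productions occur only a few times, so the number of matching pairs, and hence the running time, grows roughly linearly with the tree size; two trees built from a single repeated production would still degenerate to the quadratic worst case.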

Figure 4: Tree substructure space for predicate argument classification.

Thus, we look for the node pair set N_p = {<n_1, n_2> ∈ N_{T_1} × N_{T_2} : p(n_1) = p(n_2)}, where p(n) returns the production rule associated with n. To efficiently build N_p, we (i) extract the lists L_1 and L_2 of the production rules from T_1 and T_2, (ii) sort them in alphanumeric order and (iii) scan them to find the node pairs <n_1, n_2> such that p(n_1) = p(n_2) ∈ L_1 ∩ L_2. Step (iii) may require only O(|N_{T_1}| + |N_{T_2}|) time but, if p(n_1) appears r_1 times in T_1 and p(n_2) is repeated r_2 times in T_2, we need to consider r_1 × r_2 pairs. The formal algorithm is given in Table 1.

Note that: (a) The list sorting can be done only once, at data preparation time (i.e. before training), in O(|N_{T_1}| log(|N_{T_1}|)). (b) The algorithm shows that the worst case occurs when the parse trees are both generated using only one production rule, i.e. the two internal while cycles carry out |N_{T_1}| × |N_{T_2}| iterations. In contrast, two identical parse trees may generate a linear number of non-null pairs if there are only small groups of nodes associated with the same production rule. (c) Such an approach is perfectly compatible with the dynamic programming algorithm which computes Δ. In fact, the only difference with the original approach is that the matrix entries corresponding to pairs of different production rules are not considered. Since such entries contain null values, they do not affect the application of the original dynamic programming. Moreover, the order of pair evaluation can be established at run time, starting from the root nodes towards the children.

3 A Semantic Application of Parse Tree Kernels

An interesting application of the SST kernel is the classification of the predicate argument structures defined in PropBank (Kingsbury and Palmer, 2002) or FrameNet (Fillmore, 1982). Figure 4 shows the parse tree of the sentence "Mary brought a cat to school" along with the predicate argument annotation proposed in the PropBank project. Only verbs are considered as predicates, whereas arguments are labeled sequentially from ARG0 to ARG9.

FrameNet also describes predicate/argument information, but for this purpose richer semantic structures called Frames are used. The Frames are schematic representations of situations involving various participants, properties and roles in which a word may typically be used. Frame elements or semantic roles are arguments of predicates called target words. For example, the following sentence is annotated according to the ARREST frame: [Time One Saturday night] [Authorities police in Brooklyn] [Target apprehended] [Suspect sixteen teenagers]. The roles Suspect and Authorities are specific to the frame.

The common approach to learning the classification of predicate arguments relies on the extraction of features from the syntactic parse tree of the target sentence. In (Gildea and Jurafsky, 2002) seven different features², which aim to capture the relation between the predicate and its arguments, were proposed. For example, the Parse Tree Path of the pair <brought, ARG1> in the syntactic tree of Figure 4 is V↑VP↓NP. It encodes the dependency between the predicate and the argument as a sequence of non-terminal labels linked by direction symbols (up or down).

An alternative tree kernel representation, proposed in (Moschitti, 2004), is the selection of the minimal tree subset that includes a predicate with only one of its arguments. For example, in Figure 4, the substructures inside the three frames are the semantic/syntactic structures associated with the three arguments of the verb to bring, i.e. S_ARG0, S_ARG1 and S_ARGM.
² Namely, they are Phrase Type, Parse Tree Path, Predicate Word, Head Word, Governing Category, Position and Voice.
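As an illustration of the Parse Tree Path feature, the following sketch (a hypothetical re-implementation, not Gildea and Jurafsky's code; it uses the (label, children) representation of the earlier sketches and writes the direction symbols as ↑ and ↓) climbs from the predicate node to the lowest common ancestor and then descends to the argument node:

    def path_to(root, target, prefix=()):
        """Labels from the root down to (and including) the target node, or None."""
        prefix = prefix + (root[0],)
        if root is target:
            return prefix
        for child in root[1]:
            found = path_to(child, target, prefix)
            if found:
                return found
        return None

    def parse_tree_path(root, predicate_node, argument_node):
        up = path_to(root, predicate_node)
        down = path_to(root, argument_node)
        i = 0                                  # index of the lowest common ancestor
        while i + 1 < min(len(up), len(down)) and up[i + 1] == down[i + 1]:
            i += 1
        rising = list(reversed(up[i:]))        # predicate ... LCA
        falling = list(down[i + 1:])           # below the LCA ... argument
        path = "↑".join(rising)
        if falling:
            path += "↓" + "↓".join(falling)
        return path

Applied to the V node of the predicate and the NP node of ARG1 in the example tree, this would return V↑VP↓NP, matching the path discussed above.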

Given a feature representation of the predicate arguments, we can build an individual ONE-vs-ALL (OvA) classifier C_i for each argument i. As the final decision of the multiclassifier, we select the argument type ARG_t associated with the maximum value among the scores provided by the C_i, i.e. t = argmax_{i ∈ S} score(C_i), where S is the set of argument types. We adopted the OvA approach as it is simple and effective, as shown in (Pradhan et al., 2004).
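A minimal sketch of this ONE-vs-ALL decision rule (the classifier interface is a hypothetical stand-in for the trained SVMs):

    def classify_argument(instance, classifiers):
        """classifiers maps an argument label (e.g. 'ARG0') to a scoring function;
        the label whose classifier gives the maximum score is returned."""
        return max(classifiers, key=lambda label: classifiers[label](instance))

    # Toy usage with constant scorers standing in for SVM decision functions:
    toy = {"ARG0": lambda x: 0.2, "ARG1": lambda x: 0.7, "ARGM": lambda x: -0.1}
    assert classify_argument("encoded example", toy) == "ARG1"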
Note that the representation in Figure 4 is quite intuitive and, to conceive it, the designer requires much less linguistic knowledge about semantic roles than is necessary to define relevant features manually. To understand this point, we should take a step back to before Gildea and Jurafsky defined the first set of features for Semantic Role Labeling (SRL). The idea that syntax may be useful to derive semantic information had already been suggested by linguists but, from a machine learning point of view, deciding which tree fragments may be useful for semantic role labeling was not an easy task. In principle, the designer would have had to select and experiment with all possible tree subparts. This is exactly what tree kernels can do automatically: the designer just needs to roughly select the interesting whole subtree (correlated with the linguistic phenomenon) and the tree kernel will generate all possible syntactic features from it. The task of selecting the most relevant substructures is carried out by the kernel machines themselves.

4 The Experiments

The aim of the experiments is twofold. On the one hand, we show that the FTK running time is linear in the average case and is much faster than QTK. This is accomplished by measuring the learning time and the average kernel computation time. On the other hand, we study the impact of the different tree-based kernels on the predicate argument classification accuracy.

4.1 Experimental Set-up

We used two different corpora: PropBank, along with the Penn Treebank 2 (Marcus et al., 1993), and FrameNet.

PropBank contains about 53,700 sentences and a fixed split between training and testing which has been used in other research, e.g. (Gildea and Palmer, 2002; Pradhan et al., 2004). In this split, sections from 02 to 21 are used for training, section 23 for testing and sections 1 and 22 as development set. We considered a total of 122,774 and 7,359 arguments (from ARG0 to ARG9, plus ARGA and ARGM) in training and testing, respectively. Their tree structures were extracted from the Penn Treebank. It should be noted that the main contribution to the global accuracy is given by ARG0, ARG1 and ARGM.

From the FrameNet corpus, we extracted all 24,558 sentences of the 40 Frames selected for the Automatic Labeling of Semantic Roles task of Senseval 3. We mapped together the semantic roles having the same name and we considered only the 18 most frequent roles associated with verbal predicates, for a total of 37,948 arguments. We randomly selected 30% of the sentences for testing and 70% for training. Additionally, 30% of the training set was used as a validation set. Note that, since the FrameNet data does not include deep syntactic tree annotation, we processed the FrameNet data with the Collins parser (Collins, 1997); consequently, the experiments on FrameNet relate to automatic syntactic parse trees.

The classifier evaluations were carried out with the SVM-light-TK software, which encodes the ST and SST kernels in the SVM-light software (Joachims, 1999). We used the default linear (Linear) and polynomial (Poly) kernels for the evaluations with the standard features defined in (Gildea and Jurafsky, 2002).

We adopted the default regularization parameter (i.e., the average of 1/||x||) and we tried a few cost-factor values (i.e., j ∈ {1, 3, 7, 10, 30, 100}) to adjust the rate between Precision and Recall on the validation set. For the ST and SST kernels, we found that the best λ values (see Section 2.2) were 1 and 0.4, respectively.

The classification performance was evaluated using the F_1 measure³ for the single arguments and the accuracy for the final multiclassifier. This latter choice allows us to compare our results with previous work, e.g. (Gildea and Jurafsky, 2002; Pradhan et al., 2004).

4.2 Time Complexity Experiments

In this section we compare our Fast Tree Kernel (FTK) approach with the Quadratic Tree Kernel (QTK) algorithm. The latter refers to the naive evaluation of Eq. 1 as presented in (Collins and Duffy, 2002).

³ F_1 assigns equal importance to Precision P and Recall R, i.e. F_1 = 2PR / (P + R).

Figure 5 shows the learning time⁴ of the SVMs using QTK and FTK (over the SST structures) for the classification of one large argument (i.e. ARG0), according to different percentages of training data. We note that, with 70% of the training data, FTK is about 10 times faster than QTK. With all the training data, FTK terminated in 6 hours whereas QTK required more than 1 week.

Figure 5: ARG0 classifier learning time (in hours) according to different training data percentages, for FTK and QTK, with interpolating trend lines.

Figure 6: Average kernel evaluation time (in microseconds) for the QTK and FTK evaluations, as a function of the number of tree nodes, with interpolating trend lines.

Figure 7: Multiclassifier accuracy according to different training set percentages for the ST, SST, ST+bow, SST+bow, Linear and Poly kernels.

The above results are quite interesting because they show that (1) we can use tree kernels with SVMs on huge training sets, e.g. on 122,774 instances, and (2) the time needed to converge is approximately the one required by SVMs when using the polynomial kernel, which represents the minimal complexity needed to work in the dual space.

To study the FTK running time, we extracted from the Penn Treebank the first 500 trees⁵ containing exactly n nodes; then, we evaluated all 25,000 possible tree pairs. Each point of Figure 6 shows the average computation time over all the tree pairs of a fixed size n. In the figures, the trend lines which best interpolate the experimental values are also shown. It clearly appears that the training time is quadratic, as SVMs have quadratic learning time complexity (see Figure 5), whereas the FTK running time has a linear behavior (Figure 6). The QTK algorithm shows a quadratic running time complexity, as expected.

4.3 Accuracy of the Tree Kernels

In these experiments, we investigate which kernel is the most accurate for predicate argument classification. First, we ran the ST, SST, ST+bow, SST+bow, Linear and Poly kernels over different training-set sizes of PropBank. Figure 7 shows the learning curves associated with the above kernels for the SVM-based multiclassifier. We note that (a) SSTs have higher accuracy than STs, (b) bow does not improve either the ST or the SST kernels and (c) in the final part of the plot SST shows a higher gradient than ST, Linear and Poly. This latter produces the best accuracy, 90.5%, in line with the literature findings using standard features and polynomial SVMs, e.g. 87.1%⁶ in (Pradhan et al., 2004).

Second, in Tables 2 and 3, we report the results obtained using all the available training data on the PropBank and FrameNet test sets, respectively. Each row of the two tables shows the F_1 measure of the individual classifiers using different kernels, whereas the last column illustrates the global accuracy of the multiclassifier.

⁴ We ran the experiments on a Pentium 4, 2 GHz, with 1 Gb RAM.
⁵ We also measured the computation time for the incomplete trees associated with the predicate argument structures (see Section 3); we obtained the same results.
⁶ The small difference (2.4%) is mainly due to the different treatment of ARGMs: we built a single ARGM class for all subclasses, e.g. ARGM-LOC and ARGM-TMP, whereas in (Pradhan et al., 2004) the ARGMs were evaluated separately.
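One way to reproduce this kind of measurement is sketched below (under the assumptions that trees_of_size_n is a list of parse trees, each with exactly n nodes, and that ftk and qtk are the kernel functions of the earlier sketches):

    import time
    from itertools import product

    def average_pair_time_us(trees, kernel):
        """Average kernel evaluation time per tree pair, in microseconds."""
        pairs = list(product(trees, trees))      # all ordered pairs
        start = time.perf_counter()
        for t1, t2 in pairs:
            kernel(t1, t2)
        return 1e6 * (time.perf_counter() - start) / len(pairs)

    # e.g. compare average_pair_time_us(trees_of_size_n, ftk)
    #      with    average_pair_time_us(trees_of_size_n, qtk)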

We note that the F_1 of the single arguments across the different kernels follows the same behavior as the global multiclassifier accuracy. On FrameNet, the impact of bow on the ST and SST accuracy is higher than on PropBank, as it produces an improvement of about 1.5%. This suggests that (1) to detect semantic roles, lexical information is very important, (2) bow gives a higher contribution as errors in POS-tagging make the word + POS fragments less reliable and (3) as the FrameNet trees are obtained with the Collins syntactic parser, tree kernels seem robust to incorrect parse trees.

Third, we point out that the polynomial kernel on flat features is more accurate than the tree kernels, but the design of such effective features required noticeable knowledge and effort (Gildea and Jurafsky, 2002). On the contrary, the choice of subtrees suitable to syntactically characterize a target phenomenon seems an easier task (see Section 3 for the predicate argument case). Moreover, by combining the polynomial and SST kernels, we can improve the classification accuracy (Moschitti, 2004), i.e. tree kernels provide the learning algorithm with many relevant fragments which can hardly be designed by hand. In fact, as many predicate argument structures are quite large (up to 100 nodes), they contain many fragments.

Table 2: Evaluation of the kernels (ST, SST, ST+bow, SST+bow, Linear and Poly) on PropBank: F_1 of the individual ARG0-ARG4 and ARGM classifiers and global multiclassifier accuracy.

Table 3: Evaluation of the kernels on the FrameNet semantic roles: F_1 of the agent, theme, goal, path, manner, source, time and reason classifiers and accuracy over the 18 roles.

Finally, to study the combined kernels, we applied the formula K_1 + γ K_2, where K_1 is either the Linear or the Poly kernel and K_2 is the ST or the SST kernel. Table 4 shows the results of the four kernel combinations. We note that (a) STs and SSTs improve Poly (by about 0.5 and 2 percentage points on PropBank and FrameNet, respectively) and (b) the linear kernel, which uses fewer features than Poly, is more enhanced by the SSTs than by the STs (for example, on PropBank we have 89.4% and 88.6% vs. 87.6%), i.e. Linear takes advantage of the richer feature set of the SSTs.

Table 4: Multiclassifier accuracy using kernel combinations (Poly, ST+Linear, SST+Linear, ST+Poly and SST+Poly) on PropBank and FrameNet.

It should be noted that our results for kernel combinations on FrameNet are in contrast with (Moschitti, 2004), where no improvement was obtained. Our explanation is that, thanks to the fast evaluation of FTK, we could carry out an adequate parameterization.
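A minimal sketch of such a combination (hypothetical code, not the SVM-light-TK interface; poly_kernel operates on the vectors of manually designed features and tree_kernel can be, for instance, the ftk() of the earlier sketches):

    def poly_kernel(x, z, degree=3, c=1.0):
        """Polynomial kernel on flat feature vectors."""
        return (sum(a * b for a, b in zip(x, z)) + c) ** degree

    def combined_kernel(example1, example2, tree_kernel, gamma=1.0):
        """example = (feature_vector, parse_tree); returns K1 + gamma * K2."""
        (x1, t1), (x2, t2) = example1, example2
        return poly_kernel(x1, x2) + gamma * tree_kernel(t1, t2)

Since the sum of two valid kernels (with a non-negative γ) is itself a valid kernel, such a function can be plugged directly into a kernel machine.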
5 Related Work

Recently, several tree kernels have been designed. In the following, we highlight their differences and properties.

In (Collins and Duffy, 2002), the SST tree kernel was experimented with the Voted Perceptron for the parse-tree reranking task. The combination with the original PCFG model improved syntactic parsing. Additionally, it was suggested that the average execution time depends on the number of repeated productions.

In (Vishwanathan and Smola, 2002), a linear complexity algorithm (in the worst case) for the computation of the ST kernel is provided. The main idea is the use of suffix trees to store partial matches for the evaluation of the string kernel (Lodhi et al., 2000). This can be used to compute the ST fragments once the tree is converted into a string. To our knowledge, ours is the first application of the ST kernel to a natural language task.

In (Kazama and Torisawa, 2005), an interesting algorithm that speeds up the average running time is presented. Such an algorithm looks for node pairs that have a large number of trees in common (malicious nodes) and applies a transformation to the trees rooted in such nodes to make the kernel computation faster. The results show an increase in speed similar to the one produced by our method.

In (Zelenko et al., 2003), two kernels over syntactic shallow parser structures were devised for the extraction of linguistic relations, e.g. person-affiliation. To measure the similarity between two nodes, the contiguous string kernel and the sparse string kernel (Lodhi et al., 2000) were used.

In (Culotta and Sorensen, 2004) such kernels were slightly generalized by providing a matching function for the node pairs. The time complexity of their computation limited the experiments to a data set of just 200 news items. Moreover, we note that the above tree kernels are not convolution kernels like those proposed in this article.

In (Shen et al., 2003), a tree kernel based on Lexicalized Tree Adjoining Grammar (LTAG) for the parse-reranking task was proposed. Since QTK was used for the kernel computation, the high learning complexity forced the authors to train different SVMs on different slices of the training data. Our FTK, adapted to the LTAG tree kernel, would have allowed SVMs to be trained on the whole data.

In (Cumby and Roth, 2003), a feature description language was used to extract structural features from the syntactic shallow parse trees associated with named entities. The experiments on named entity categorization showed that, when the description language selects an adequate set of tree fragments, the Voted Perceptron algorithm increases its classification accuracy. The explanation was that the complete tree fragment set contains many irrelevant features and may cause overfitting.

6 Conclusions

In this paper, we have shown that tree kernels can effectively be adopted in practical natural language applications. The main arguments against their use are their efficiency and an accuracy lower than that of traditional feature-based approaches. We have shown that a fast algorithm (FTK) can evaluate tree kernels in linear average running time and also that the overall convergence time required by SVMs is compatible with very large data sets. Regarding the accuracy, the experiments with Support Vector Machines on the PropBank and FrameNet predicate argument structures show that: (a) the richer the kernel is in terms of substructures (e.g. SST), the higher the accuracy is, (b) tree kernels are effective also in the case of automatic parse trees and (c) as kernel combinations always improve traditional feature models, the best approach is to combine scalar-based and structure-based kernels.

Acknowledgments

I would like to thank the AI group at the University of Rome "Tor Vergata". Many thanks to the EACL 2006 anonymous reviewers, Roberto Basili and Giorgio Satta, who provided me with valuable suggestions. This research is partially supported by the Presto Space EU Project #: FP.

References

Michael Collins and Nigel Duffy. 2002. New ranking algorithms for parsing and tagging: Kernels over discrete structures, and the voted perceptron. In Proceedings of ACL 2002.

Michael Collins. 1997. Three generative, lexicalized models for statistical parsing. In Proceedings of ACL 1997, Madrid, Spain.

Aron Culotta and Jeffrey Sorensen. 2004. Dependency tree kernels for relation extraction. In Proceedings of ACL 2004, Barcelona, Spain.

Chad Cumby and Dan Roth. 2003. Kernel methods for relational learning. In Proceedings of ICML 2003, Washington, DC, USA.

Charles J. Fillmore. 1982. Frame semantics. In Linguistics in the Morning Calm.

Daniel Gildea and Daniel Jurafsky. 2002. Automatic labeling of semantic roles. Computational Linguistics, 28(3).

Daniel Gildea and Martha Palmer. 2002. The necessity of parsing for predicate argument recognition. In Proceedings of ACL 2002, Philadelphia, PA.

T. Joachims. 1999. Making large-scale SVM learning practical. In B. Schölkopf, C. Burges, and A. Smola, editors, Advances in Kernel Methods - Support Vector Learning.

Jun'ichi Kazama and Kentaro Torisawa. 2005. Speeding up training with tree kernels for node relation labeling. In Proceedings of EMNLP 2005, Toronto, Canada.

Paul Kingsbury and Martha Palmer. 2002. From Treebank to PropBank. In Proceedings of LREC 2002, Spain.
Huma Lodhi, Craig Saunders, John Shawe-Taylor, Nello Cristianini, and Christopher Watkins. 2000. Text classification using string kernels. In NIPS 02, Vancouver, Canada.

M. P. Marcus, B. Santorini, and M. A. Marcinkiewicz. 1993. Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19.

Alessandro Moschitti. 2004. A study on convolution kernels for shallow semantic parsing. In Proceedings of ACL 2004, Barcelona, Spain.

Sameer Pradhan, Kadri Hacioglu, Valerie Krugler, Wayne Ward, James H. Martin, and Daniel Jurafsky. 2004. Support vector learning for semantic argument classification. Machine Learning Journal.

Libin Shen, Anoop Sarkar, and Aravind Joshi. 2003. Using LTAG based features in parse reranking. In Proceedings of EMNLP 2003, Sapporo, Japan.

Ben Taskar, Dan Klein, Mike Collins, Daphne Koller, and Christopher Manning. 2004. Max-margin parsing. In Proceedings of EMNLP 2004, Barcelona, Spain.

S. V. N. Vishwanathan and A. J. Smola. 2002. Fast kernels on strings and trees. In Proceedings of Neural Information Processing Systems.

D. Zelenko, C. Aone, and A. Richardella. 2003. Kernel methods for relation extraction. Journal of Machine Learning Research.

Dell Zhang and Wee Sun Lee. 2003. Question classification using support vector machines. In Proceedings of SIGIR 2003, ACM Press.


More information

Automated and Quality-driven Requirements Engineering

Automated and Quality-driven Requirements Engineering Automted nd Qulity-driven Requirements Engineering Rolf Drechsler Mthis Soeken Robert Wille Deprtment of Mmtics nd Computer Science, University of Bremen, Germny Cyber-Physicl Systems, DFKI GmbH, Bremen,

More information

Lecture T4: Pattern Matching

Lecture T4: Pattern Matching Introduction to Theoreticl CS Lecture T4: Pttern Mtching Two fundmentl questions. Wht cn computer do? How fst cn it do it? Generl pproch. Don t tlk bout specific mchines or problems. Consider miniml bstrct

More information

An Efficient Divide and Conquer Algorithm for Exact Hazard Free Logic Minimization

An Efficient Divide and Conquer Algorithm for Exact Hazard Free Logic Minimization An Efficient Divide nd Conquer Algorithm for Exct Hzrd Free Logic Minimiztion J.W.J.M. Rutten, M.R.C.M. Berkelr, C.A.J. vn Eijk, M.A.J. Kolsteren Eindhoven University of Technology Informtion nd Communiction

More information

CSE 401 Midterm Exam 11/5/10 Sample Solution

CSE 401 Midterm Exam 11/5/10 Sample Solution Question 1. egulr expressions (20 points) In the Ad Progrmming lnguge n integer constnt contins one or more digits, but it my lso contin embedded underscores. Any underscores must be preceded nd followed

More information

Discovering Program s Behavioral Patterns by Inferring Graph-Grammars from Execution Traces

Discovering Program s Behavioral Patterns by Inferring Graph-Grammars from Execution Traces Discovering Progrm s Behviorl Ptterns by Inferring Grph-Grmmrs from Execution Trces Chunying Zho 1, Keven Ates 1, Jun Kong 2, Kng Zhng 1 1 The University of Texs t Dlls {cxz051000, tescomp, kzhng }@utdlls.edu

More information

CS201 Discussion 10 DRAWTREE + TRIES

CS201 Discussion 10 DRAWTREE + TRIES CS201 Discussion 10 DRAWTREE + TRIES DrwTree First instinct: recursion As very generic structure, we could tckle this problem s follows: drw(): Find the root drw(root) drw(root): Write the line for the

More information

a(e, x) = x. Diagrammatically, this is encoded as the following commutative diagrams / X

a(e, x) = x. Diagrammatically, this is encoded as the following commutative diagrams / X 4. Mon, Sept. 30 Lst time, we defined the quotient topology coming from continuous surjection q : X! Y. Recll tht q is quotient mp (nd Y hs the quotient topology) if V Y is open precisely when q (V ) X

More information

CS321 Languages and Compiler Design I. Winter 2012 Lecture 5

CS321 Languages and Compiler Design I. Winter 2012 Lecture 5 CS321 Lnguges nd Compiler Design I Winter 2012 Lecture 5 1 FINITE AUTOMATA A non-deterministic finite utomton (NFA) consists of: An input lphet Σ, e.g. Σ =,. A set of sttes S, e.g. S = {1, 3, 5, 7, 11,

More information

CS143 Handout 07 Summer 2011 June 24 th, 2011 Written Set 1: Lexical Analysis

CS143 Handout 07 Summer 2011 June 24 th, 2011 Written Set 1: Lexical Analysis CS143 Hndout 07 Summer 2011 June 24 th, 2011 Written Set 1: Lexicl Anlysis In this first written ssignment, you'll get the chnce to ply round with the vrious constructions tht come up when doing lexicl

More information

Tixeo compared to other videoconferencing solutions

Tixeo compared to other videoconferencing solutions compred to other videoconferencing solutions for V171026EN , unique solution on the video conferencing field Adobe Connect Web RTC Vydio for High security level, privcy Zero impct on network security policies

More information

Integration. October 25, 2016

Integration. October 25, 2016 Integrtion October 5, 6 Introduction We hve lerned in previous chpter on how to do the differentition. It is conventionl in mthemtics tht we re supposed to lern bout the integrtion s well. As you my hve

More information