MINIMUM DESCRIPTION LENGTH BASED PROTEIN SECONDARY STRUCTURE PREDICTION

Size: px
Start display at page:

Download "MINIMUM DESCRIPTION LENGTH BASED PROTEIN SECONDARY STRUCTURE PREDICTION"

Transcription

1 MINIMUM DESCRIPTION LENGTH BASED PROTEIN SECONDARY STRUCTURE PREDICTION Andrea Hategan and Ioan Tabus Insttute of Sgnal Processng, Tampere Unversty of Technology P.O. Box 553, FIN Tampere, Fnland phone: +(358) , fax: +(358) , emal: web: hategan, tabus ABSTRACT Ths paper ntroduces a new algorthm for predctng the secondary structure of a proten based on the proten s prmary structure,.e. ts amno acd sequence. The problem conssts n fndng the segmentaton of the ntal amno acd sequence, where each segment carres the label of a secondary structure, e.g., helx, strand, and col. Our algorthm s dfferent from other exstng probablstc nference algorthms n that t uses probablstc models sutable for drectly encodng the jont nformaton represented by the par (amno acd sequence, secondary structure labels), and chooses as wnner the secondary structure sequence provdng the mnmum representaton, or descrpton length, n lne wth the mnmum descrpton length prncple. An addtonal beneft of our approach s that we provde not only a secondary structure predcton tool, but also a tool that s able to compress n an effcent manner the jont sequences that defne the prmary and secondary structure nformaton n protens. The prelmnary results obtaned for predcton and compresson show a good performance, whch s better n certan aspects than that of comparable algorthms. 1. INTRODUCTION Protens are essental molecules to sustan lfe n all lvng organsms. The process by whch a proten s foldng n ts three dmensonal shape s a prerequste for a proten to be able to perform ts bologcal functon. Because many dseases are the consequence of some protens falure to adopt ther 3-D functonal shape (.e. proten msfoldng [1]) t s mportant to understand the rules by whch the protens fold n the three dmensonal shape. The three dmensonal shape of a proten, known as the tertary structure, s nduced by ts prmary structure,.e. the amno acd sequence, and the envronmental condton. It s a lot easer to expermentally determne the prmary structure of a proten than t s to determne ts tertary structure, and thus the methods for predctng the tertary structure from the prmary structure are a current topc of research. Presently, there s a huge gap between the number of protens for whch the prmary structure s known and the number of protens for whch the tertary structure, and thus ther functon, s known. As a precondton for foldng n the three dmensonal shape, protens form some local conformatons, e.g. alphahelces and beta-strands, called the secondary structure, Ths work was supported by the Academy of Fnland (applcaton number , Fnnsh Programme for Centers of Excellence n Research ) and the Graduate school n Electroncs, Telecommuncaton and Automaton (GETA). whch fnally are essental for acqurng the three dmensonal shape. Because these local conformatons mpose geometrcal constrants on the three dmensonal shape of a proten, the predcton of the secondary structure from the amno acd sequence s also an mportant stage n the process of predctng the tertary structure. In ths paper the am s to predct such local conformatons from the amno acd sequence,.e. for each amno acd n a gven proten sequence to predct a class label that specfes what s the type of secondary structure to whch t belongs. The feld of proten secondary structure predcton from amno acd sequences has developed durng four decades resultng n a large number of methods whch can be classfed n three generatons [2]. The frst generaton methods, that appeared n 1960s and 1970s, evaluated the lkelhood of dfferent amno acds to form dfferent local conformatons. The second generaton methods were domnatng untl early 1990s and used the lkelhood of groups of adjacent amno acds to form dfferent secondary structures. The thrd generaton methods, of the last decade, addtonally consder the evolutonary nformaton contaned n multple algnments for refnng the predcton of the secondary structure. A multple algnment can be bult between several homologous protens,.e. protens that have smlar known functon, but have dfferent amno acd sequences. Usng such an algnment, one can buld dstrbuton profles for each poston n the sequence, thus provdng a stronger dscrmnatve nformaton, by accountng for the varablty whch does not affect the functon of a proten (thus beng also less mportant for ts 3-D shape). Varous predctors can be bult to use the dstrbuton profles, e.g., based on two layer neural networks. Although the accuracy of the methods usng evolutonary nformaton s hgher than that of the methods that are usng only the amno acd sequence, the predcton of the secondary structure from amno acd sequence alone s stll an mportant task because there are many orphan protens [3] for whch one needs to determne the secondary structure. For an orphan proten the evolutonary nformaton s not avalable, because ts amno acd sequence s not sgnfcantly smlar to that of protens wth known secondary structure and functon. The most nvestgated approach for proten secondary structure predcton s to consder a sldng wndow of typcally 15 amno acds and to predct the class label of the central amno acd knowng only the neghborng amno acds. The man problem of ths approach s that t cannot capture amno acd correlatons at a greater dstance than the wndow length. For ths reason, sldng wndow methods have a poor predcton accuracy for beta strands, whch have to form hy-

2 MWMPPRPEEVARKLRRLGFVERMAKGGHRLYTHPDGR LLLLLLHHHHHHHHHHHLLLEEEEELLEEEEELLLLL k=7 S={L,H,L,E,L,E,L} L ={6,11,3,5,2,5,5} Fgure 1: Example of a proten sequence wth n = 37 amno acds and ts secondary structure annotaton drogen bonds wth other beta strands stuated at a larger dstance than the length of the sldng wndow, n order to create beta sheets n the foldng process. The problem of secondary structure predcton can be formulated also as a segmentaton problem, where one starts from the prmary structure sequence and determnes: the number of segments, the class label of each segment and the length of each segment. Ths approach has been already nvestgated n the BSPSS [4] and IPSSP [3] algorthms. In these algorthms the goal s to fnd the best segmentaton by maxmzng the posteror probablty of the segmentaton gven the proten sequence, va Bayesan nference. In our algorthm, called mdlpssp (mnmum descrpton length proten secondary structure predcton), the best segmentaton s chosen va MDL prncple [5], by selectng the hypothess,.e. the segmentaton, that yelds the shortest descrpton length of the data and of the model parameters. The probablstc models and the searchng method used n our algorthm are dfferent than those used n [4] and [3], beng addtonally sutable for real encodng of the jont prmary and secondary structure nformaton. The paper s organzed as follows: n the next secton the problem of proten secondary structure predcton s formulated; the thrd secton presents the man features, modellng, codng, and searchng, of the mdlpssp algorthm. The expermental results are presented n the fourth secton, whle the conclusons are drawn n the last secton. 2. PROBLEM STATEMENT We need to predct the secondary structure annotaton of a gven proten sequence X n = {x 1...x n },.e. to fnd the trplet (k,s,l ) such that (k,s,l ) = mn (k,s,l ) CL(X n k,s,l ) (1) where k s the number of secondary structure segments, S = {s 1...s k } specfes the secondary structure type for each segment, L = {l 1...l k } specfes the length of each segment such that k =1 l = n, and CL(X n k,s,l ) s the cost of encodng the proten sequence X n when the secondary structure annotaton s gven by (k,s,l ). An example llustratng the notatons s gven n Fgure 1 where the amno acd sequence of a proten s shown n the frst row and ts secondary structure annotaton n the second row. The amno acd sequence represents the nput data of the algorthm. The letters n the second row specfy the secondary structure class label for each amno acd n the frst row. The secondary structure nformaton can be represented by the trplet (k,s,l ) and ths trplet s the output of the algorthm. In order to be able to compute the codelength of a proten sequence X n gven ts secondary structure annotaton (k,s,l ) we have to choose a set of probablstc models, that wll be denoted by M. In order to deal wth sequentally computable probabltes, we are gong to assume a parametrc frst order dependency, p(s s 1,M ), for the class labels, and a parametrc dependency of the amno acd probablty, gven ther class labels, such that the crteron n (1) wll decouple as: CL(X n k,s,l ;M ) = k =1 CL(X l s,s 1 ;M ) (2) For a gven (hypotheszed) secondary structure segmentaton, one can use a procedure to compute the cost of the secondary structure segments CL(X l s,s 1 ;M ), and add all results nto the overall cost (2). Our algorthm wll conssts of a search through the space of all segmentatons n order to fnd the one havng the mnmum assocated cost (1). The adopted model makes the search feasble by usng the dynamc programmng method. The next secton presents the set of models M used, the procedure to compute the cost of encodng a secondary structure segment CL(X l s,s 1 ;M ) and the searchng algorthm for the best segmentaton. 3. THE MDLPSSP ALGORITHM As n any machne learnng algorthm we wll defne the procedures for the two steps of tranng and testng. In the tranng step, the algorthm adjusts the parameters and structure wthn the adopted classes of parametrc models, n order to acqure knowledge about the dependences expected between secondary and prmary structures. In the testng step, the algorthm s provded wth some new proten prmary sequences, dfferent than the ones gven n the tranng step, and s asked to provde the segmentaton, or secondary structure, whch s subsequently assessed aganst the true one. For the proten secondary structure predcton problem the tranng step mples choosng the structure of the parametrc probablstc models M and adjustng the parameter values of these models based on the average codelength resulted for the observed segments of a gven secondary structure class. The ssues concernng the tranng step are dscussed n subsecton 3.1. In the testng step, the secondary structure of a new proten s predcted usng the models developed n the tranng step, through a searchng algorthm descrbed n 3.2. The predcted secondary structure s then compared wth the true secondary structure annotaton n order to assess the performance of the algorthm. 3.1 Codng and modellng We start by descrbng how the codelength of a gven secondary structure segment CL(X l s,s 1 ;M ) s calculated. In a typcal data compresson framework the encoder wants to send some data to the decoder by creatng a btstream that has to contan all the nformaton needed for the decoder to be able to recover the orgnal data. When the encoder has to send one secondary structure segment X l to a decoder, the encoder uses a set of models M to create a btstream that encodes the followng nformaton: the type s, the length l, and the amno acd sequence X l = {x 1...x l }. Usng the btstream and the same set of models M, the decoder s able to decode from the btstream all the nformaton needed to recover the sent secondary structure segment. The length of the resulted btstream s the cost of encodng the gven secondary structure segment usng the set of models M,.e. CL(X l s,s 1 ;M ).

3 Let M = {M S,M H,M E,M L } be the set of followng models: M S specfes the transton between consecutve secondary structure labels; M H,M E,M L are condtonal (tree) models for the amno acd beng n one of the three secondary structure types, helx H, strand E, and col L, respectvely. Then, usng the arthmetc codes [6], one can encode n the btstream all nformaton, wth the cost of segment gven by: CL(X l l +1 s,s 1 ;M ) = log 2 P(s s 1 ;M S ) log 2 P(x j c j ;M s ) (3) where M S s a frst order Markov model that gves the probablty of observng a segment of type s followng a segment of type s 1 and M s s a Tree Machne (TM) [7] that gves the probablty of observng the amno acd x j n the context c j n a segment of type s. The summaton n (3) has upper lmt l + 1 snce the encoder has to notfy when a segment ends, by sendng the STOP symbol. Thus, the models M H,E,L nclude along wth the standard amno acds the STOP symbol yeldng an alphabet A wth n a = A = 21 symbols. A TM s bult for each secondary structure type. Frst, a maxmal tree machne (MTM) s bult from a tranng set for each type and then the maxmal trees have to be pruned n order to dscrmnate as much as possble between the three secondary structure types. Encodng wth a TM, mples selectng the context,.e. a certan node n the tree, n whch a symbol s encoded. All these ssues are dscussed next. The tree machnes we use here are data structures smlar to those ntroduced by Rssanen n [7] that keep a sorted lst of pars (context, symbol). The contexts can be seen as nodes n a tree, whle the symbols formng a context defne a path from the root to the node. Each node n the tree stores a probablty dstrbutons of the symbols that were seen at that node. In our applcaton the MTM s bult from a large set of short segments, where each segment has a label, H,E, or L. The constructon of the MTM s explaned by the followng example and t s llustrated n Fgure 2. Suppose that the tranng set s composed of two segments: aabacb and abac. Frst, we pad each segment wth the extra symbol, to mark ther begnnng and endng. Only for ths example, we assume that the alphabet s A = {a,b,c, }. Then, for each symbol n all segments we select the longest context n whch t appears,.e. the prefx of each segment up to that symbol (the left table n Fgure 2). The resulted prefxes are sorted from rght to left n lexcal order. The contexts that wll defne the MTM are selected such that the prefx unquely dentfes the assocated symbol or the begnnng of a segment has been reached (the rght table n Fgure 2). The resulted contexts are arranged n a tree such that each context defnes a leaf. When a context s added to the tree, the tree s clmbed startng from root and followng the branches defned by context symbols from rght to left. The frequency of the symbol pared wth the added context s ncreased n all nodes (contexts) encountered on the path from root to the leaf defned by the context. For example, for the par ( aba, c), the resulted context s ba and the assocated symbol s c. Ths par s added n the MTM as follows: ncrease the count for c n the root and then take the branch a. At the arrved node ncrease the count for c and take the branch b. At the arrved leaf ncrease the count for c. Snce a context defnes a node j=1 aabacb abac a a aa b aab a aaba c aabac b aabacb a b ab a aba c abac prefxes aabacb abac a a a b aa b aba c aaba c ab a aab a aabacb abac aabac b contexts Fgure 2: Maxmal Tree Machne buldng n the tree and a node defnes a context, the two terms can be used nterchangeably. In the encodng process, the context for the current symbol x j s selected by clmbng the tree startng from the root and then takng the branches defned by the prevous symbols x j 1,x j 2,... untl a leaf s reached or untl at the arrved node r = x j 1 x j 2...x j t there s no branch wth the label x j t 1. If the MTM s used to encode the tranng set, almost all symbols, except those at the begnnng of segments, wll be encoded usng the dstrbutons n leaves, where the dstrbutons are more relevant and gve better estmates for the probablty of the next symbol. If the MTM s to be used for a data set dfferent than the tranng set, the so called zero-frequency problem wll occur. Ths problem deals wth the stuaton when at the arrved node r, the probablty of the current symbol x j s zero. Ths means that n the tranng phase the current symbol x j has never been observed n the context r. Because our goal s to use the MTM for data dfferent than the tranng set, we have to assgn probabltes for all symbols n the alphabet at each node. Ths can be acheved by ntroducng the probablty for the escape symbol [8], P esc. In the encodng process, f n the selected context the probablty of the current symbol s zero, the encoder frst sends the escape symbol to nform the decoder that the current symbols has not been seen n the current context. After ths, the encoder specfes whch of the unseen symbols n the current context, actually s sent, usng an equprobable dstrbuton over the unseen symbols. By ntroducng the probablty for the escape symbol, the collected dstrbutons at each node wll be adjusted to nclude also ths probablty. Thus, f the probablty of a symbol x j n the context r s P(x j r), the adjusted probablty wll be ˆP(x j r) = (1 P esc )P(x j r). Although the longer contexts are more relevant and provde better probablty estmates for the probablty of the next symbol, the problem s that these longer contexts also appear less often than the shorter ones. Thus, the statstcs kept n longer context accumulate slower. Even f the zerofrequency problem has been solved by ntroducng the escape symbol, when encodng a new data set we would lke to avod long contexts that are ftted to the tranng set, because here t s very lkely that the current symbol wll be encoded va the escape event, whch yelds a longer codelength than the one n a smaller context, where the symbol s more lkely to be observed. Then, the goal s to prune the MTM such that to get rde of those long contexts, (over-)ftted to the tranng set, where encodng s neffcent. After one maxmal tree MTM s has been created for each

4 of the three secondary types, s = {H,E,L}, the goal of prunng s to carve such trees, whch wll dscrmnate well between the three classes. From each MTM s, we have created a set of pruned tree machnes TM d s,d = 1 : D s that keep only the nodes from root untl depth level d, where D s s the maxmum level reached over the tranng set. The problem now s to fnd the optmum level d s yeldng the most dscrmnatve collecton of three trees. The resultng trees, TM d s s, and the dstrbuton collected at ther nodes represent the abstract models denoted earler M s. The calbraton process was carred out on a calbraton set, that s dfferent from both the tranng and testng sets. Denotng by C s the calbraton set contanng all amno acd segments for type s, the best value of ds was selected by the mnmzaton : ds = mn{cl(c s TMs j j )+ [CL(C s TMs j ) CL(C q TMs j )]} q s (4) Wth all the nformaton descrbed, we are able now to compute the codelength of a gven secondary structure segment (3) and further to compute the codelength of a gven proten sequence X n gven ts secondary structure annotaton (k,s,l ) (2). 3.2 Searchng In the prevous sectons we descrbed how to encode a gven proten sequence X n when ts secondary structure annotaton (k,s,l ) s also gven, based on the set of proposed models M. Next we move to the problem of actually fndng the secondary structure annotaton (k,s,l ) such that the resulted codelength s mnmzed (1). The exhaustve search through all possble segmentatons of the gven proten sequence for choosng the mnmzer n (1) s too expensve or even nfeasble for most of the proten sequences, whch have commonly hundreds of amno acds. For ths reason we resort to dynamc programmng, whch can be appled here due to the separablty of the crteron n (2). We have to defne the states and ther assocated costs, as well as the costs for a transton from one state to another. For mnmzng (2), a state can be defned as Z( j,k,s), where j represents the poston of the current amno acd x j n the gven proten sequence X n, k represents the number of segments up to poston j, and s represents the secondary structure label for the segment that ends at poston j. Each such state has assocated a cost that represents the codelength of the proten sequence X j, gven that there are k segments and the last one has the label s. From the state Z( j,k,s), the algorthm can make a transton to the new state Z(t,k + 1, p), wth the transton cost gven by CL(X j+1:t s, p;m p ). For each state we mark the state from whch we arrved wth the mnmum overall cost (cost of startng state plus cost of transton). For j = n, there are (n k mn ) 3 states, where k mn = n/l max and L max s the maxmum allowed segment. When arrvng at the states wth j = n, we select the one wth the mnmum cost and the best segmentaton s found by followng the ponters kept at each state, untl the state Z(0,0,0) s reached. 4. EXPERIMENTAL RESULTS In order to assess the performance of any predcton algorthm, the tranng set and the testng set should be dstnct. For the proten secondary structure predcton problem, the requrements are tradtonally more strngent,.e. the two data set should not contan sequences that share sgnfcant sequence smlarty. Ths condton s necessary snce keepng the sequences that share a certan degree of sequence smlarty (lkely to be homologous protens) wll artfcally lower the reported predcton errors. In our experments, we have used three data sets. For tranng, we have used the PSIPRED data set from for the calbraton process we have used the EVA data set from ftp://cubc.boc.columba.edu/pub/eva/unque lst.txt and for testng we have used the CASP6 targets data set vew.cg?loc=predctoncenter.org;page=casp6/. The tranng and testng sets are the same as n [3] for comparson purposes. All the data sets contan the followng nformaton for each proten: the sequence dentfer, the amno acd sequence and the secondary structure annotaton as defned by DSSP [9]. DSSP defnes eght secondary structure classes (H,I,G,E,B,S,T,-), but usually the eght classes are mapped to three classes as suggested n [10]: (H,G) H, (E,B) E and (I,S,T,-) L, whch we adopt also here. The performance of the predcton methods were compared n terms of per state predcton accuracy defned as Q (%) = N pred /N obs 100, where = {H,E,L,3} wth = 3 meanng the predcton over all three states, N obs s the number of observed amno acds n the secondary structure type and N pred s the number of amno acds correctly predcted n state. The performance of the new algorthm when workng as a compresson algorthm,.e. when the secondary structure s gven, was evaluated n terms of needed bts per symbol. Here, symbol s taken as the par: amno acd and secondary structure label. In ths case, one needs log log 2 3 = 5.9 bts per symbol on average to encode a proten sequence and ts secondary structure nformaton. Snce the secondary structure sequence s hghly correlated, we also consder two smple alternatves. In the frst one each amno acd s encoded usng log 2 20 bts per amno acd, whle for each secondary structure segment the class label s encoded usng log 2 3 and the length s encoded by Elas Gamma codes for ntegers. The second method, transforms the secondary structure annotaton n a strng of 0,1 and 2. The 0 s used to mark that the class label does not change, whle 1 and 2 are used to specfy where the class label changes and whch s the new label. The resulted strng s encoded usng the arthmetc codes [6]. The frst method s denoted Clear1 n Table 2, whle the second s denoted Clear2. Table 1 presents the results obtaned n the calbraton process on the EVA data set. The table s splt n three parts, one for each secondary structure type. The second column presents the average codelength when the segments from one class have been encoded wth dfferent pruned trees resulted from the maxmal trees traned on the same class (HH,EE,LL). The thrd and the forth column represent the average codelength when the segments from one class are encoded usng trees traned on dfferent classes. The expresson n (4) tells to choose the level that yelds the mnmum value n the last column, for each secondary structure type.

5 d H HH HE HL HH + (HH-HE) + (HH-HL) d E EE EH EL EE+(EE-EH)+(EE-EL) d L LL LH LE LL+(LL-LH)+(LL-LE) Table 1: Calbraton results on EVA set Algorthm Q H (%) Q E (%) Q L (%) Q 3 (%) PSIPRED BSPSS IPSSP mdlpssp Table 2: Predcton accuracy results: per state accuracy. Table 2 presents the results for the predcton accuracy. For the mdlpssp algorthm, the best results were obtaned when the pruned trees had: 3 levels for H, 2 levels for E and 2 levels for L. Note that from Table 1, ths corresponds to the best results when the segments of a certan type are encoded wth the tree traned on the same secondary structure type (column 2). The results n Table 2 n the frst three lnes, were taken from [3]. PSIPRED s a predcton algorthm that uses addtonally evolutonary nformaton, but for the results n ths table t was tested under sngle-sequence condton. The new algorthm mdlppss has the best performance for the helx class. The results are prelmnary snce we contnue to experment wth more refned prunng technques. On average, the predcton takes 22 seconds for one sequence on the CASP6 test set. Table 3 presents the results for the mdlpssp algorthm, when t works as a compresson algorthm for a proten sequence and ts secondary structure nformaton. The results are better for both data sets when compared wth two methods presented earler for the clear representaton. The column mc represents the cost for the set of models M f these should also be sent to the decoder. Ths s compulsory for the PSIPRED data set, because the models were traned on the same set. For the EVA set, t can be assumed that the models are generc, known to both the encoder and the decoder. The column sc shows the average codelength obtaned for encodng a proten sequence and ts secondary Data #seqs mc sc tc Clear 1 Clear 2 EVA PSIPRED Table 3: Compresson results: bts per symbol (amno acd and secondary structure label). structure annotaton. The column denoted by tc represents the total cost, for the model and the data. The last two methods show the average codelength obtaned for the two clear representatons. On average, the compresson takes 0.08 seconds for one sequence on the EVA set and 0.07 seconds for a sequence n the PSIPRED data set. The compresson results are compared wth the clear representaton because, to our knowledge, there s no other avalable algorthm for jontly compressng ndvdual protens and ther secondary structure nformaton. In a prevous work, we took advantages of the secondary structure annotaton n a compresson algorthm for amno acd sequences, [11], but n that algorthm we targeted compresson of full proteomes, whch are the collecton of all proten sequences n a gven organsm. The proteomes have length n the range of hundred thousand amno acds, and the approxmate repettons have a key role for achevng compresson, whch cannot be exploted n the current applcaton, so the method from [11] cannot be appled to the present dataset. 5. CONCLUSIONS Ths paper ntroduces a new algorthm, mdlpssp, useful n two dstnct applcatons: proten secondary structure predcton and jont compresson of prmary and secondary structure of protens. The prelmnary predcton results show better performance only for one of the classes, but the current prunng technque s stll not well adapted for predcton and a more refned prunng may lead to a better performance for all classes. The algorthm can work also as a compresson algorthm for ndvdual proten sequences and ther secondary structure nformaton, beng the frst such algorthm for these types of data. The compresson performance s sgnfcantly better when compared to a smple n clear representaton. REFERENCES [1] C.M. Dobson, Proten foldng and msfoldng, Nature, vol. 426, pp , Dec [2] B. Rost, Revew: proten secondary structure predcton contnues to rse, Journal of Structural Bology, vol. 134, pp , [3] Z. Aydn, Y. Altunbasak, and M. Borodovsky, Proten secondary structure predcton for a sngle sequence usng hdden sem-markov models, BMC Bonformatcs, vol. 7, no. 178, [4] S.C. Schmdler, J.S. Lu, D.L. Brutlag, Bayesan segmentaton of proten secondary structure, Journal of Computatonal Bology, vol. 7, no. 1/2, pp , [5] J. Rssanen, Modellng by the shortest data descrpton, Automatca, vol. 14, pp , [6] J. Rssanen, Generalzed Kraft nequalty and arthmetc codng, IBM Journal of Research and Development, vol. 20, no. 3, pp , [7] J. Rssanen, A unversal data compresson system, IEEE Trans. on Informaton Theory, vol. IT-29, no. 5, pp , [8] J. Rssanen, A lossless data compresson system, US Patent, no , [9] W. Kabsch, C. Sander, Dctonary of proten secondary structure: pattern recognton of hydrogen-bonded and geometrcal features, Bopolymers, vol. 22, pp , [10] B. Rost, C. Sander, Predcton of proten secondary structure at better than 70% accuracy, Journal of Molecular Bology, vol. 232, , [11] A. Hategan, I. Tabus, Jontly encodng proten sequences and ther secondary structre annotaton, n Proc. GENSIPS 2007, Tuusula, Fnland, June

Application of Maximum Entropy Markov Models on the Protein Secondary Structure Predictions

Application of Maximum Entropy Markov Models on the Protein Secondary Structure Predictions Applcaton of Maxmum Entropy Markov Models on the Proten Secondary Structure Predctons Yohan Km Department of Chemstry and Bochemstry Unversty of Calforna, San Dego La Jolla, CA 92093 ykm@ucsd.edu Abstract

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points; Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

CS 534: Computer Vision Model Fitting

CS 534: Computer Vision Model Fitting CS 534: Computer Vson Model Fttng Sprng 004 Ahmed Elgammal Dept of Computer Scence CS 534 Model Fttng - 1 Outlnes Model fttng s mportant Least-squares fttng Maxmum lkelhood estmaton MAP estmaton Robust

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

Parallel matrix-vector multiplication

Parallel matrix-vector multiplication Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more

More information

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science EECS 730 Introducton to Bonformatcs Sequence Algnment Luke Huan Electrcal Engneerng and Computer Scence http://people.eecs.ku.edu/~huan/ HMM Π s a set of states Transton Probabltes a kl Pr( l 1 k Probablty

More information

Programming in Fortran 90 : 2017/2018

Programming in Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Exercse 1 : Evaluaton of functon dependng on nput Wrte a program who evaluate the functon f (x,y) for any two user specfed values

More information

EXTENDED BIC CRITERION FOR MODEL SELECTION

EXTENDED BIC CRITERION FOR MODEL SELECTION IDIAP RESEARCH REPORT EXTEDED BIC CRITERIO FOR ODEL SELECTIO Itshak Lapdot Andrew orrs IDIAP-RR-0-4 Dalle olle Insttute for Perceptual Artfcal Intellgence P.O.Box 59 artgny Valas Swtzerland phone +4 7

More information

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 A mathematcal programmng approach to the analyss, desgn and

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information

Hermite Splines in Lie Groups as Products of Geodesics

Hermite Splines in Lie Groups as Products of Geodesics Hermte Splnes n Le Groups as Products of Geodescs Ethan Eade Updated May 28, 2017 1 Introducton 1.1 Goal Ths document defnes a curve n the Le group G parametrzed by tme and by structural parameters n the

More information

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms Course Introducton Course Topcs Exams, abs, Proects A quc loo at a few algorthms 1 Advanced Data Structures and Algorthms Descrpton: We are gong to dscuss algorthm complexty analyss, algorthm desgn technques

More information

Classifier Selection Based on Data Complexity Measures *

Classifier Selection Based on Data Complexity Measures * Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.

More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique //00 :0 AM Outlne and Readng The Greedy Method The Greedy Method Technque (secton.) Fractonal Knapsack Problem (secton..) Task Schedulng (secton..) Mnmum Spannng Trees (secton.) Change Money Problem Greedy

More information

CMPS 10 Introduction to Computer Science Lecture Notes

CMPS 10 Introduction to Computer Science Lecture Notes CPS 0 Introducton to Computer Scence Lecture Notes Chapter : Algorthm Desgn How should we present algorthms? Natural languages lke Englsh, Spansh, or French whch are rch n nterpretaton and meanng are not

More information

Support Vector Machines

Support Vector Machines Support Vector Machnes Decson surface s a hyperplane (lne n 2D) n feature space (smlar to the Perceptron) Arguably, the most mportant recent dscovery n machne learnng In a nutshell: map the data to a predetermned

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

Load Balancing for Hex-Cell Interconnection Network

Load Balancing for Hex-Cell Interconnection Network Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,

More information

Problem Set 3 Solutions

Problem Set 3 Solutions Introducton to Algorthms October 4, 2002 Massachusetts Insttute of Technology 6046J/18410J Professors Erk Demane and Shaf Goldwasser Handout 14 Problem Set 3 Solutons (Exercses were not to be turned n,

More information

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide Lobachevsky State Unversty of Nzhn Novgorod Polyhedron Quck Start Gude Nzhn Novgorod 2016 Contents Specfcaton of Polyhedron software... 3 Theoretcal background... 4 1. Interface of Polyhedron... 6 1.1.

More information

Machine Learning. Topic 6: Clustering

Machine Learning. Topic 6: Clustering Machne Learnng Topc 6: lusterng lusterng Groupng data nto (hopefully useful) sets. Thngs on the left Thngs on the rght Applcatons of lusterng Hypothess Generaton lusters mght suggest natural groups. Hypothess

More information

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster

More information

A Clustering Algorithm for Chinese Adjectives and Nouns 1

A Clustering Algorithm for Chinese Adjectives and Nouns 1 Clusterng lgorthm for Chnese dectves and ouns Yang Wen, Chunfa Yuan, Changnng Huang 2 State Key aboratory of Intellgent Technology and System Deptartment of Computer Scence & Technology, Tsnghua Unversty,

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning Outlne Artfcal Intellgence and ts applcatons Lecture 8 Unsupervsed Learnng Professor Danel Yeung danyeung@eee.org Dr. Patrck Chan patrckchan@eee.org South Chna Unversty of Technology, Chna Introducton

More information

The Research of Support Vector Machine in Agricultural Data Classification

The Research of Support Vector Machine in Agricultural Data Classification The Research of Support Vector Machne n Agrcultural Data Classfcaton Le Sh, Qguo Duan, Xnmng Ma, Me Weng College of Informaton and Management Scence, HeNan Agrcultural Unversty, Zhengzhou 45000 Chna Zhengzhou

More information

Classifying Acoustic Transient Signals Using Artificial Intelligence

Classifying Acoustic Transient Signals Using Artificial Intelligence Classfyng Acoustc Transent Sgnals Usng Artfcal Intellgence Steve Sutton, Unversty of North Carolna At Wlmngton (suttons@charter.net) Greg Huff, Unversty of North Carolna At Wlmngton (jgh7476@uncwl.edu)

More information

Learning-Based Top-N Selection Query Evaluation over Relational Databases

Learning-Based Top-N Selection Query Evaluation over Relational Databases Learnng-Based Top-N Selecton Query Evaluaton over Relatonal Databases Lang Zhu *, Wey Meng ** * School of Mathematcs and Computer Scence, Hebe Unversty, Baodng, Hebe 071002, Chna, zhu@mal.hbu.edu.cn **

More information

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1)

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1) Secton 1.2 Subsets and the Boolean operatons on sets If every element of the set A s an element of the set B, we say that A s a subset of B, or that A s contaned n B, or that B contans A, and we wrte A

More information

A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION

A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION 1 THE PUBLISHING HOUSE PROCEEDINGS OF THE ROMANIAN ACADEMY, Seres A, OF THE ROMANIAN ACADEMY Volume 4, Number 2/2003, pp.000-000 A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION Tudor BARBU Insttute

More information

Analysis of Continuous Beams in General

Analysis of Continuous Beams in General Analyss of Contnuous Beams n General Contnuous beams consdered here are prsmatc, rgdly connected to each beam segment and supported at varous ponts along the beam. onts are selected at ponts of support,

More information

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty

More information

Collaboratively Regularized Nearest Points for Set Based Recognition

Collaboratively Regularized Nearest Points for Set Based Recognition Academc Center for Computng and Meda Studes, Kyoto Unversty Collaboratvely Regularzed Nearest Ponts for Set Based Recognton Yang Wu, Mchhko Mnoh, Masayuk Mukunok Kyoto Unversty 9/1/013 BMVC 013 @ Brstol,

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS ARPN Journal of Engneerng and Appled Scences 006-017 Asan Research Publshng Network (ARPN). All rghts reserved. NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS Igor Grgoryev, Svetlana

More information

Protein Secondary Structure Prediction Using Support Vector Machines, Nueral Networks and Genetic Algorithms

Protein Secondary Structure Prediction Using Support Vector Machines, Nueral Networks and Genetic Algorithms Georga State Unversty ScholarWorks @ Georga State Unversty Computer Scence Theses Department of Computer Scence 5-3-2007 Proten Secondary Structure Predcton Usng Support Vector Machnes, Nueral Networks

More information

Chapter 6 Programmng the fnte element method Inow turn to the man subject of ths book: The mplementaton of the fnte element algorthm n computer programs. In order to make my dscusson as straghtforward

More information

Improving Low Density Parity Check Codes Over the Erasure Channel. The Nelder Mead Downhill Simplex Method. Scott Stransky

Improving Low Density Parity Check Codes Over the Erasure Channel. The Nelder Mead Downhill Simplex Method. Scott Stransky Improvng Low Densty Party Check Codes Over the Erasure Channel The Nelder Mead Downhll Smplex Method Scott Stransky Programmng n conjuncton wth: Bors Cukalovc 18.413 Fnal Project Sprng 2004 Page 1 Abstract

More information

Investigating the Performance of Naïve- Bayes Classifiers and K- Nearest Neighbor Classifiers

Investigating the Performance of Naïve- Bayes Classifiers and K- Nearest Neighbor Classifiers Journal of Convergence Informaton Technology Volume 5, Number 2, Aprl 2010 Investgatng the Performance of Naïve- Bayes Classfers and K- Nearest Neghbor Classfers Mohammed J. Islam *, Q. M. Jonathan Wu,

More information

Related-Mode Attacks on CTR Encryption Mode

Related-Mode Attacks on CTR Encryption Mode Internatonal Journal of Network Securty, Vol.4, No.3, PP.282 287, May 2007 282 Related-Mode Attacks on CTR Encrypton Mode Dayn Wang, Dongda Ln, and Wenlng Wu (Correspondng author: Dayn Wang) Key Laboratory

More information

Non-Split Restrained Dominating Set of an Interval Graph Using an Algorithm

Non-Split Restrained Dominating Set of an Interval Graph Using an Algorithm Internatonal Journal of Advancements n Research & Technology, Volume, Issue, July- ISS - on-splt Restraned Domnatng Set of an Interval Graph Usng an Algorthm ABSTRACT Dr.A.Sudhakaraah *, E. Gnana Deepka,

More information

Machine Learning 9. week

Machine Learning 9. week Machne Learnng 9. week Mappng Concept Radal Bass Functons (RBF) RBF Networks 1 Mappng It s probably the best scenaro for the classfcaton of two dataset s to separate them lnearly. As you see n the below

More information

y and the total sum of

y and the total sum of Lnear regresson Testng for non-lnearty In analytcal chemstry, lnear regresson s commonly used n the constructon of calbraton functons requred for analytcal technques such as gas chromatography, atomc absorpton

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

Data Representation in Digital Design, a Single Conversion Equation and a Formal Languages Approach

Data Representation in Digital Design, a Single Conversion Equation and a Formal Languages Approach Data Representaton n Dgtal Desgn, a Sngle Converson Equaton and a Formal Languages Approach Hassan Farhat Unversty of Nebraska at Omaha Abstract- In the study of data representaton n dgtal desgn and computer

More information

CHAPTER 3 SEQUENTIAL MINIMAL OPTIMIZATION TRAINED SUPPORT VECTOR CLASSIFIER FOR CANCER PREDICTION

CHAPTER 3 SEQUENTIAL MINIMAL OPTIMIZATION TRAINED SUPPORT VECTOR CLASSIFIER FOR CANCER PREDICTION 48 CHAPTER 3 SEQUENTIAL MINIMAL OPTIMIZATION TRAINED SUPPORT VECTOR CLASSIFIER FOR CANCER PREDICTION 3.1 INTRODUCTION The raw mcroarray data s bascally an mage wth dfferent colors ndcatng hybrdzaton (Xue

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Efficient Distributed File System (EDFS)

Efficient Distributed File System (EDFS) Effcent Dstrbuted Fle System (EDFS) (Sem-Centralzed) Debessay(Debsh) Fesehaye, Rahul Malk & Klara Naherstedt Unversty of Illnos-Urbana Champagn Contents Problem Statement, Related Work, EDFS Desgn Rate

More information

A User Selection Method in Advertising System

A User Selection Method in Advertising System Int. J. Communcatons, etwork and System Scences, 2010, 3, 54-58 do:10.4236/jcns.2010.31007 Publshed Onlne January 2010 (http://www.scrp.org/journal/jcns/). A User Selecton Method n Advertsng System Shy

More information

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints Australan Journal of Basc and Appled Scences, 2(4): 1204-1208, 2008 ISSN 1991-8178 Sum of Lnear and Fractonal Multobjectve Programmng Problem under Fuzzy Rules Constrants 1 2 Sanjay Jan and Kalash Lachhwan

More information

A New Approach For the Ranking of Fuzzy Sets With Different Heights

A New Approach For the Ranking of Fuzzy Sets With Different Heights New pproach For the ankng of Fuzzy Sets Wth Dfferent Heghts Pushpnder Sngh School of Mathematcs Computer pplcatons Thapar Unversty, Patala-7 00 Inda pushpndersnl@gmalcom STCT ankng of fuzzy sets plays

More information

Intro. Iterators. 1. Access

Intro. Iterators. 1. Access Intro Ths mornng I d lke to talk a lttle bt about s and s. We wll start out wth smlartes and dfferences, then we wll see how to draw them n envronment dagrams, and we wll fnsh wth some examples. Happy

More information

BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION

BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION SHI-LIANG SUN, HONG-LEI SHI Department of Computer Scence and Technology, East Chna Normal Unversty 500 Dongchuan Road, Shangha 200241, P. R. Chna E-MAIL: slsun@cs.ecnu.edu.cn,

More information

Predicting Transcription Factor Binding Sites with an Ensemble of Hidden Markov Models

Predicting Transcription Factor Binding Sites with an Ensemble of Hidden Markov Models Vol. 3, No. 1, Fall, 2016, pp. 1-10 ISSN 2158-835X (prnt), 2158-8368 (onlne), All Rghts Reserved Predctng Transcrpton Factor Bndng Stes wth an Ensemble of Hdden Markov Models Yngle Song 1 and Albert Y.

More information

An Image Fusion Approach Based on Segmentation Region

An Image Fusion Approach Based on Segmentation Region Rong Wang, L-Qun Gao, Shu Yang, Yu-Hua Cha, and Yan-Chun Lu An Image Fuson Approach Based On Segmentaton Regon An Image Fuson Approach Based on Segmentaton Regon Rong Wang, L-Qun Gao, Shu Yang 3, Yu-Hua

More information

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed

More information

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009.

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009. Farrukh Jabeen Algorthms 51 Assgnment #2 Due Date: June 15, 29. Assgnment # 2 Chapter 3 Dscrete Fourer Transforms Implement the FFT for the DFT. Descrbed n sectons 3.1 and 3.2. Delverables: 1. Concse descrpton

More information

The Codesign Challenge

The Codesign Challenge ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.

More information

Specifications in 2001

Specifications in 2001 Specfcatons n 200 MISTY (updated : May 3, 2002) September 27, 200 Mtsubsh Electrc Corporaton Block Cpher Algorthm MISTY Ths document shows a complete descrpton of encrypton algorthm MISTY, whch are secret-key

More information

X- Chart Using ANOM Approach

X- Chart Using ANOM Approach ISSN 1684-8403 Journal of Statstcs Volume 17, 010, pp. 3-3 Abstract X- Chart Usng ANOM Approach Gullapall Chakravarth 1 and Chaluvad Venkateswara Rao Control lmts for ndvdual measurements (X) chart are

More information

A Statistical Model Selection Strategy Applied to Neural Networks

A Statistical Model Selection Strategy Applied to Neural Networks A Statstcal Model Selecton Strategy Appled to Neural Networks Joaquín Pzarro Elsa Guerrero Pedro L. Galndo joaqun.pzarro@uca.es elsa.guerrero@uca.es pedro.galndo@uca.es Dpto Lenguajes y Sstemas Informátcos

More information

News. Recap: While Loop Example. Reading. Recap: Do Loop Example. Recap: For Loop Example

News. Recap: While Loop Example. Reading. Recap: Do Loop Example. Recap: For Loop Example Unversty of Brtsh Columba CPSC, Intro to Computaton Jan-Apr Tamara Munzner News Assgnment correctons to ASCIIArtste.java posted defntely read WebCT bboards Arrays Lecture, Tue Feb based on sldes by Kurt

More information

Lecture 5: Multilayer Perceptrons

Lecture 5: Multilayer Perceptrons Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented

More information

Adaptive Virtual Support Vector Machine for the Reliability Analysis of High-Dimensional Problems

Adaptive Virtual Support Vector Machine for the Reliability Analysis of High-Dimensional Problems Proceedngs of the ASME 2 Internatonal Desgn Engneerng Techncal Conferences & Computers and Informaton n Engneerng Conference IDETC/CIE 2 August 29-3, 2, Washngton, D.C., USA DETC2-47538 Adaptve Vrtual

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics Introducton G10 NAG Fortran Lbrary Chapter Introducton G10 Smoothng n Statstcs Contents 1 Scope of the Chapter... 2 2 Background to the Problems... 2 2.1 Smoothng Methods... 2 2.2 Smoothng Splnes and Regresson

More information

Hierarchical clustering for gene expression data analysis

Hierarchical clustering for gene expression data analysis Herarchcal clusterng for gene expresson data analyss Gorgo Valentn e-mal: valentn@ds.unm.t Clusterng of Mcroarray Data. Clusterng of gene expresson profles (rows) => dscovery of co-regulated and functonally

More information

TN348: Openlab Module - Colocalization

TN348: Openlab Module - Colocalization TN348: Openlab Module - Colocalzaton Topc The Colocalzaton module provdes the faclty to vsualze and quantfy colocalzaton between pars of mages. The Colocalzaton wndow contans a prevew of the two mages

More information

Unsupervised Learning

Unsupervised Learning Pattern Recognton Lecture 8 Outlne Introducton Unsupervsed Learnng Parametrc VS Non-Parametrc Approach Mxture of Denstes Maxmum-Lkelhood Estmates Clusterng Prof. Danel Yeung School of Computer Scence and

More information

Meta-heuristics for Multidimensional Knapsack Problems

Meta-heuristics for Multidimensional Knapsack Problems 2012 4th Internatonal Conference on Computer Research and Development IPCSIT vol.39 (2012) (2012) IACSIT Press, Sngapore Meta-heurstcs for Multdmensonal Knapsack Problems Zhbao Man + Computer Scence Department,

More information

Lossless Compression of Map Contours by Context Tree Modeling of Chain Codes

Lossless Compression of Map Contours by Context Tree Modeling of Chain Codes Lossless Compresson of Map Contours by Context Tree Modelng of Chan Codes Alexander Akmo, Alexander Kolesnko, and Pas Fränt Department of Computer Scence, Unersty of Joensuu, P.O. Box 111, 80110 Joensuu,

More information

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION Paulo Quntlano 1 & Antono Santa-Rosa 1 Federal Polce Department, Brasla, Brazl. E-mals: quntlano.pqs@dpf.gov.br and

More information

F Geometric Mean Graphs

F Geometric Mean Graphs Avalable at http://pvamu.edu/aam Appl. Appl. Math. ISSN: 1932-9466 Vol. 10, Issue 2 (December 2015), pp. 937-952 Applcatons and Appled Mathematcs: An Internatonal Journal (AAM) F Geometrc Mean Graphs A.

More information

A Hidden Markov Model Variant for Sequence Classification

A Hidden Markov Model Variant for Sequence Classification Proceedngs of the Twenty-Second Internatonal Jont Conference on Artfcal Intellgence A Hdden Markov Model Varant for Sequence Classfcaton Sam Blasak and Huzefa Rangwala Computer Scence, George Mason Unversty

More information

Query Clustering Using a Hybrid Query Similarity Measure

Query Clustering Using a Hybrid Query Similarity Measure Query clusterng usng a hybrd query smlarty measure Fu. L., Goh, D.H., & Foo, S. (2004). WSEAS Transacton on Computers, 3(3), 700-705. Query Clusterng Usng a Hybrd Query Smlarty Measure Ln Fu, Don Hoe-Lan

More information

Design and Analysis of Algorithms

Design and Analysis of Algorithms Desgn and Analyss of Algorthms Heaps and Heapsort Reference: CLRS Chapter 6 Topcs: Heaps Heapsort Prorty queue Huo Hongwe Recap and overvew The story so far... Inserton sort runnng tme of Θ(n 2 ); sorts

More information

Machine Learning: Algorithms and Applications

Machine Learning: Algorithms and Applications 14/05/1 Machne Learnng: Algorthms and Applcatons Florano Zn Free Unversty of Bozen-Bolzano Faculty of Computer Scence Academc Year 011-01 Lecture 10: 14 May 01 Unsupervsed Learnng cont Sldes courtesy of

More information

such that is accepted of states in , where Finite Automata Lecture 2-1: Regular Languages be an FA. A string is the transition function,

such that is accepted of states in , where Finite Automata Lecture 2-1: Regular Languages be an FA. A string is the transition function, * Lecture - Regular Languages S Lecture - Fnte Automata where A fnte automaton s a -tuple s a fnte set called the states s a fnte set called the alphabet s the transton functon s the ntal state s the set

More information

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,

More information

MOTION PANORAMA CONSTRUCTION FROM STREAMING VIDEO FOR POWER- CONSTRAINED MOBILE MULTIMEDIA ENVIRONMENTS XUNYU PAN

MOTION PANORAMA CONSTRUCTION FROM STREAMING VIDEO FOR POWER- CONSTRAINED MOBILE MULTIMEDIA ENVIRONMENTS XUNYU PAN MOTION PANORAMA CONSTRUCTION FROM STREAMING VIDEO FOR POWER- CONSTRAINED MOBILE MULTIMEDIA ENVIRONMENTS by XUNYU PAN (Under the Drecton of Suchendra M. Bhandarkar) ABSTRACT In modern tmes, more and more

More information

Fast Feature Value Searching for Face Detection

Fast Feature Value Searching for Face Detection Vol., No. 2 Computer and Informaton Scence Fast Feature Value Searchng for Face Detecton Yunyang Yan Department of Computer Engneerng Huayn Insttute of Technology Hua an 22300, Chna E-mal: areyyyke@63.com

More information

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems A Unfed Framework for Semantcs and Feature Based Relevance Feedback n Image Retreval Systems Ye Lu *, Chunhu Hu 2, Xngquan Zhu 3*, HongJang Zhang 2, Qang Yang * School of Computng Scence Smon Fraser Unversty

More information

Clustering System and Clustering Support Vector Machine for Local Protein Structure Prediction

Clustering System and Clustering Support Vector Machine for Local Protein Structure Prediction Georga State Unversty ScholarWorks @ Georga State Unversty Computer Scence Dssertatons Department of Computer Scence 8-2-2006 Clusterng System and Clusterng Support Vector Machne for Local Proten Structure

More information

Can We Beat the Prefix Filtering? An Adaptive Framework for Similarity Join and Search

Can We Beat the Prefix Filtering? An Adaptive Framework for Similarity Join and Search Can We Beat the Prefx Flterng? An Adaptve Framework for Smlarty Jon and Search Jannan Wang Guolang L Janhua Feng Department of Computer Scence and Technology, Tsnghua Natonal Laboratory for Informaton

More information

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory Background EECS. Operatng System Fundamentals No. Vrtual Memory Prof. Hu Jang Department of Electrcal Engneerng and Computer Scence, York Unversty Memory-management methods normally requres the entre process

More information

Comparison Study of Textural Descriptors for Training Neural Network Classifiers

Comparison Study of Textural Descriptors for Training Neural Network Classifiers Comparson Study of Textural Descrptors for Tranng Neural Network Classfers G.D. MAGOULAS (1) S.A. KARKANIS (1) D.A. KARRAS () and M.N. VRAHATIS (3) (1) Department of Informatcs Unversty of Athens GR-157.84

More information

Classification Based Mode Decisions for Video over Networks

Classification Based Mode Decisions for Video over Networks Classfcaton Based Mode Decsons for Vdeo over Networks Deepak S. Turaga and Tsuhan Chen Advanced Multmeda Processng Lab Tranng data for Inter-Intra Decson Inter-Intra Decson Regons pdf 6 5 6 5 Energy 4

More information

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z.

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z. TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS Muradalyev AZ Azerbajan Scentfc-Research and Desgn-Prospectng Insttute of Energetc AZ1012, Ave HZardab-94 E-mal:aydn_murad@yahoocom Importance of

More information

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between

More information

High-Boost Mesh Filtering for 3-D Shape Enhancement

High-Boost Mesh Filtering for 3-D Shape Enhancement Hgh-Boost Mesh Flterng for 3-D Shape Enhancement Hrokazu Yagou Λ Alexander Belyaev y Damng We z Λ y z ; ; Shape Modelng Laboratory, Unversty of Azu, Azu-Wakamatsu 965-8580 Japan y Computer Graphcs Group,

More information

Optimal Workload-based Weighted Wavelet Synopses

Optimal Workload-based Weighted Wavelet Synopses Optmal Workload-based Weghted Wavelet Synopses Yoss Matas School of Computer Scence Tel Avv Unversty Tel Avv 69978, Israel matas@tau.ac.l Danel Urel School of Computer Scence Tel Avv Unversty Tel Avv 69978,

More information

Help for Time-Resolved Analysis TRI2 version 2.4 P Barber,

Help for Time-Resolved Analysis TRI2 version 2.4 P Barber, Help for Tme-Resolved Analyss TRI2 verson 2.4 P Barber, 22.01.10 Introducton Tme-resolved Analyss (TRA) becomes avalable under the processng menu once you have loaded and selected an mage that contans

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

Brushlet Features for Texture Image Retrieval

Brushlet Features for Texture Image Retrieval DICTA00: Dgtal Image Computng Technques and Applcatons, 1 January 00, Melbourne, Australa 1 Brushlet Features for Texture Image Retreval Chbao Chen and Kap Luk Chan Informaton System Research Lab, School

More information