arxiv: v1 [cs.sd] 22 Dec 2017

Music Genre Classification with Paralleling Recurrent Convolutional Neural Network

Lin Feng, Shenlan Liu, Jianing Yao

December 2017

Abstract

Deep learning has demonstrated its effectiveness and efficiency in music genre classification. However, existing approaches still have several shortcomings that impair the performance of this classification task. In this paper, we propose a hybrid architecture that consists of paralleling CNN and Bi-RNN blocks, which focus on extracting spatial features and temporal frame orders, respectively. The two outputs are then fused into one powerful representation of the musical signal and fed into a softmax function for classification. The paralleling network makes the extracted features robust enough to represent music. Moreover, the experiments show that our proposed architecture improves music genre classification performance and that the additional Bi-RNN block is a supplement to CNNs.

1 Introduction

With the extensive use of various music platforms, an increasing amount of music is widely spread, which makes it chaotic for audiences and platforms to organize this music. Furthermore, it is impossible to organize and distinguish such a large amount of music manually. Therefore, constructing a convenient way to deal with this problem is of vital importance but challenging. Most state-of-the-art methods aim to classify the music genre, a top-level label that helps audiences categorize and describe music [1]. Meanwhile, accurate genre classification is crucial for music platforms to organize music into different groups. For this reason, music genre classification has attracted wide attention in the field of music information retrieval (MIR) [2][3].
As two crucial components of music genre classification, feature extraction and classifier learning greatly influence the performance of most classification systems [4]. Feature extraction concentrates on exploring suitable representations of the samples to be classified, in terms of feature vectors or pairwise similarities [5]. After feature extraction, the features and representations of music are fed into a classifier, which aims to map feature vectors into different music genres. Baniya et al. [6] adopt timbral texture features (i.e., Mel-frequency Cepstral Coefficients) and rhythm content features such as the beat histogram (BH) [1] to represent music signals. They then combine the Extreme Learning Machine (ELM) [7] with bagging [8] as a classifier. Arabi et al. [9] introduce chord features and chord progression information into feature extraction. In addition, using a Support Vector Machine (SVM), they show that chord features in conjunction with low-level features [5] can provide higher classification accuracy. The state-of-the-art result is reported by Sarkar et al. [10], who employ Empirical Mode Decomposition (EMD) for signal component extraction and depend only on pitch-based features.

Even though all the methods above achieve good performance in certain situations, hand-crafted features cannot avoid some serious disadvantages. Extracting hand-crafted features from music signals requires complex processing, and therefore expertise in the musical domain. Furthermore, features extracted for one task lack universality, since they may perform poorly in other tasks.

In recent years, deep learning, especially Convolutional Neural Networks (CNNs), has been used successfully in various image classification tasks [11][12]. Meanwhile, Dieleman et al. [13] show that, like ordinary images, spectrograms of music audio can also achieve good performance with CNNs. Consequently, there is a growing tendency to learn robust feature representations from music spectrograms with CNNs [14][15]. In contrast with traditional methods, CNNs provide an end-to-end training architecture that combines feature extraction and music classification in one stage, and multiple CNN-based works have shown their superiority for music genre classification. It is worth noting, however, that unlike ordinary images, spectrograms of music contain strong sequential relationships.
However, existing CNN-based music genre classifiers are not able to model the long-term temporal information in spectrograms of music data. Recurrent Neural Networks (RNNs) [16] can model long-term dependencies such as music structure or recurrent harmonies [17], which are significant for music classification. To address the limitations mentioned above, we propose a hybrid learning architecture named Paralleling Recurrent and Convolutional Neural Network (PRCNN), which consists of a CNN block and a Bidirectional Recurrent Neural Network (Bi-RNN) block [18]. The main contribution of our proposed architecture is that the hybrid structure models not only spatial features but also temporal frame orders of music data, which complements music genre classification compared with plain CNNs.

The rest of this paper is organized as follows. In Section 2, we review related work on music genre classification and analyze its contributions as well as its limitations. Section 3 describes the construction of our proposed hybrid architecture PRCNN in detail. In Section 4, we conduct experiments on several datasets and demonstrate the validity of PRCNN. Finally, we draw a conclusion and present some future work in Section 5.

2 Related work

Music genre classification is a widely studied area in Music Information Retrieval for categorizing and describing the enormous amount of music [1]. Various studies indicate that extracting representative features from music signals can greatly improve classification performance. Thus, most existing works focus on extracting robust features to represent music in order to improve music genre classification.

Motivated by the success of CNNs in computer vision [19], CNNs have also attracted much attention in the field of music genre classification. By training an end-to-end architecture, CNNs have a powerful capacity to represent various music with higher-level features. In addition, CNNs require less engineering effort and prior knowledge of a particular field. Li et al. [14] state that the variations of musical patterns after a transformation such as the Fast Fourier Transform (FFT) or Mel-frequency Cepstral Coefficients (MFCC) are similar to images, which work well with CNNs in image classification [12]. Moreover, they show that CNNs are feasible alternatives for extracting musical pattern features automatically. Although their work brings opportunities to replace hand-crafted features, the experimental results show that the proposed structure is not robust enough for test data to perform as well as training data.

Zhang et al. [15] proposed two networks to improve CNN-based music genre classification. In one network, max- and average-pooling are applied jointly across the entire time axis in order to offer more statistical information to the following layers. In the other network, aiming to improve accuracy through increased depth, they use shortcut connections inspired by residual learning [20]. Both CNNs are shown to improve on previous results on the GTZAN dataset [1].

However, as mentioned in the previous section, musical patterns have temporal relationships that are crucial for music genre classification but are dropped in CNNs. For this reason, Choi et al. [21] design a hybrid model named convolutional recurrent neural network (CRNN), in which CNNs and RNNs are used as feature extractor and temporal summarizer, respectively. Compared with three existing CNNs, CRNN is shown to improve music classification performance by learning more temporal information. But this hybrid model also has a limitation that impairs its performance. Even though CRNN uses RNNs as the temporal summarizer, it can only summarize temporal information from the output of the CNNs; the temporal relationships of the original musical signals are not preserved during the CNN operations. To preserve both spatial features and temporal frame orders of the original music signals, we design a hybrid model that consists of paralleling CNN and Bi-RNN blocks. In the next section, we describe our proposed hybrid architecture for music genre classification in detail.

Figure 1: The network architecture of PRCNN

3 Methodology

As illustrated in Figure 1, our proposed hybrid architecture is divided into four blocks whose weights play different roles. At the bottom of Figure 1, we use the Short-time Fourier Transform (STFT) spectrogram of the musical signal as the input of our network. The input, a spectrogram of 128 frames with 513 values per frame (Section 4.2), is simultaneously fed into the paralleling CNN and Bi-RNN blocks for feature extraction. As mentioned above, CNNs perform very well at extracting spatial features of music. However, the STFT spectrogram of a musical signal has significant sequential relationships that are lost in CNNs during supervised learning. Thus, the paralleling Bi-RNN block is employed to extract temporal frame orders from the spectrogram as a supplement. The outputs of the two paralleling blocks are then fused into one feature vector, which is classified next. After a dense layer, we apply a softmax operation as a post-processing stage to obtain a vector of normalized probabilities of the different music genres. As mentioned in Section 1, feature extraction is a crucial part of music genre classification. Therefore, in the rest of this section, we describe the paralleling CNN and Bi-RNN blocks used for feature extraction in detail.

3.1 Convolutional Neural Network Block

Excluding the input and output layers, the CNN block of our proposed hybrid architecture has 10 layers, comprising five convolution-pooling pairs: each convolutional layer is followed by a max-pooling operation that further processes its output. Each kernel detects a fixed 3 × 1 region in the previous layer with 1 × 1 padding. The padding is designed to reduce the information loss during convolution. In order to acquire more meaningful representations from the spectrogram, we design the five convolutional layers with 16, 32, 64, 128 and 64 filters, respectively. The first three max-pooling layers output the maximum value within a 2 × 2 rectangular neighborhood with strides of 2 × 2.
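As a quick sanity check on the layer settings of Section 3.1, the sketch below traces the feature-map shape through the five convolution-pooling stages. It rests on several assumptions that the text does not confirm: a 513 × 128 input (the spectrogram size given in Section 4.2), convolutions whose padding leaves height and width unchanged, and pooling that drops incomplete border windows. The function names are illustrative, and the last two stages use the 4 × 4 pooling that Section 3.1 specifies for the upper two max-pooling layers.

```python
def pool_out(size, window, stride):
    """Output length of a pooling layer that drops incomplete windows."""
    return (size - window) // stride + 1


def trace_cnn_block(height=513, width=128):
    """Trace (channels, height, width) through the five conv-pool stages."""
    filters = [16, 32, 64, 128, 64]  # filters per convolutional layer
    pools = [2, 2, 2, 4, 4]          # pooling window and stride per stage
    shapes = []
    for n_filters, p in zip(filters, pools):
        # Assumption: the convolution keeps height and width unchanged,
        # so only the pooling layer shrinks the feature map.
        height = pool_out(height, p, p)
        width = pool_out(width, p, p)
        shapes.append((n_filters, height, width))
    return shapes


shapes = trace_cnn_block()
for stage, (ch, h, w) in enumerate(shapes, start=1):
    print(f"stage {stage}: channels={ch}, height={h}, width={w}")

channels, height, width = shapes[-1]
flattened = channels * height * width
print("flattened size:", flattened)
```

Under these assumptions the last feature map is 64 × 4 × 1, which flattens to 256 values and agrees with the 256-dimensional block output stated in Section 3.3.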
The upper two max-pooling layers of the CNN block report the maximum value of a 4 × 4 region with 4 × 4 strides to extract more robust representations. The output of the CNN block is a 256-dimensional vector (Section 3.3), which is fed into the classifier in conjunction with the output of the Bi-RNN block.

Convolution kernel size

Kernels are regarded as feature detectors in convolutional layers. In general, a kernel size defined as k × r × c means the layer can learn k features of size r × c, where r and c refer to the rows and columns of a kernel, respectively. The kernel size determines the range of a feature map that the kernel can precisely detect, and thus affects the performance of feature learning. When the kernel size is too small, it is not capable of learning representative features from the given data. Thus some researchers, such as Krizhevsky et al. [22], proposed large convolution kernels sized as [...] to detect features. However, increasing the kernel size increases the number of parameters per feature detector, and with it the storage and computation. Moreover, large kernels lose the invariance within their ranges [23]. Aiming to learn more representative features with fewer parameters, the kernel size used in our proposed architecture is 3 × 1, which has shown excellent feature-detection performance with modest parameter storage and computation.

Pooling

The pooling function is regarded as a subsampling process and is a crucial stage in CNNs. In contrast with convolution, max-pooling is a non-linear operation that produces a summary statistic of the nearby outputs. The max-pooling operation employed in the CNN block can represent the most prominent features of music, such as amplitudes. Max-pooling also reduces the dimension of the previous output, and therefore helps prevent the network from overfitting with fewer parameters. Meanwhile, the pooling size is also an important aspect that influences music genre classification. In general, an undersized pooling region makes the network not invariant enough to small translations. On the contrary, if the pooling region is oversized, some requisite feature locations will be lost and errors may be introduced into the classification result.

Rectified Linear Units

Convolution is a linear operation, which is usually not enough to reflect the representations of features. Thus, we employ Rectified Linear Units (ReLUs) [24] to achieve non-linear behavior. The ReLU activation function is defined as f(x) = max(0, x). ReLUs produce sparse feature representations in hidden layers, since components below 0 are cut off. In contrast with the sigmoid, ReLUs do not saturate at 1, and the derivative of the activation function is 1 for every positive input, which mitigates the vanishing gradient problem to some degree. Meanwhile, ReLUs also converge faster than traditional sigmoid and tanh activations.

3.2 Bidirectional Gated Recurrent Units Block

As illustrated in Figure 1, the BGRU-RNN block consists of 7 layers, excluding the input and fused output layers. In this block, the input is first processed by a max-pooling layer to reduce the dimension.
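Max-pooling, used here to shrink the input of the BGRU block and throughout the CNN block of Section 3.1, can be illustrated with a minimal pure-Python sketch. The function name `max_pool_2d` and the toy input are illustrative assumptions, and incomplete windows at the border are dropped, which the text does not specify.

```python
def max_pool_2d(matrix, window=(2, 2), stride=(2, 2)):
    """Max-pool a 2-D list of numbers; incomplete border windows are dropped."""
    rows, cols = len(matrix), len(matrix[0])
    win_r, win_c = window
    step_r, step_c = stride
    pooled = []
    for top in range(0, rows - win_r + 1, step_r):
        pooled_row = []
        for left in range(0, cols - win_c + 1, step_c):
            # Collect every value inside the current window.
            patch = [matrix[r][c]
                     for r in range(top, top + win_r)
                     for c in range(left, left + win_c)]
            pooled_row.append(max(patch))
        pooled.append(pooled_row)
    return pooled


toy = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 2, 1, 1],
]
small_windows = max_pool_2d(toy)                 # 2 x 2 windows, stride 2 x 2
one_window = max_pool_2d(toy, (4, 4), (4, 4))    # a single 4 x 4 window
print(small_windows)  # [[4, 5], [3, 7]]
print(one_window)     # [[7]]
```

The 2 × 2 setting halves each axis while keeping the largest value of every neighborhood; the 4 × 4 setting reduces the same toy input to a single value, which shows how a larger window trades positional detail for a smaller output.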
After this max-pooling step, the dimension of the spectrogram is reduced to [...]. Since the upper BGRU layers are rather complex, we employ an embedding layer for further dimension reduction to decrease the number of parameters. After the pre-training, a [...] input is fed into two stacked BGRUs, illustrated in Figure 2, for feature extraction. In contrast to the output of the CNN block, we simply concatenate the outputs of the two stacked BGRU layers into one 256-dimensional feature vector.

Standard recurrent neural networks (RNNs) only take advantage of previous context and ignore the backward dependencies, which are also important for feature learning. However, many applications have demonstrated that the prediction of $y^{(t)}$ depends heavily on the whole input sequence, including both past and future information. Another limitation of traditional RNNs is that they suffer from vanishing and exploding gradients when dealing with long-term dependencies. Thus, in our hybrid architecture, we use two stacked BGRUs, a variant of RNNs, to improve feature extraction. The structure of a BGRU is shown in Figure 2 and described in detail below.

Figure 2: The network architecture of BGRU (input layer, forward layer, backward layer and output layer, unrolled over time steps t-1, t, t+1)

Bidirectional Gated Recurrent Units

The design of the BGRU is motivated by two main considerations: 1) using Gated Recurrent Units (GRUs) to extract temporal features from the spectrogram of musical signals, which are lost in CNNs; 2) extracting powerful representations by taking full advantage of the past and future information of a sequence. The GRU is proposed in [25] to make the recurrent blocks adaptively capture information from variable-length sequences. A BGRU architecture means that we employ GRUs in both the forward-state part and the backward-state part. As illustrated in Figure 2, the input layer is fed into both the forward and backward layers, and the output layer is produced by both of them, but the two opposite layers have no direct connections.

The GRU is a simplified variation of the Long Short-term Memory (LSTM) [26] that integrates the input and forget gates into one update gate and adds a reset gate. In a GRU, a single gating unit simultaneously controls the forgetting factor and the decision to update the state unit. In the $i$-th GRU, the activation $h_i^{(t)}$ at time $t$ is calculated from the previous activation $h_i^{(t-1)}$ and the current candidate update $\tilde{h}_i^{(t)}$:

$$h_i^{(t)} = u_i^{(t)} \tilde{h}_i^{(t)} + \left(1 - u_i^{(t)}\right) h_i^{(t-1)}, \qquad (1)$$

where $u_i$ and $\tilde{h}_i$ respectively stand for the update gate and the candidate activation. The update gate decides how much the unit updates its activation:

$$u_i^{(t)} = \sigma\left(b_i^{u} + U_i^{u} x^{(t)} + W_i^{u} h^{(t-1)}\right), \qquad (2)$$

where $b$, $U$ and $W$ respectively denote the biases, input weights and recurrent weights into the $i$-th GRU, and $\sigma$ is the logistic sigmoid. The input vector at time $t$ is $x^{(t)}$. The candidate activation is computed analogously to the update gate:

$$\tilde{h}_i^{(t)} = \tanh\left(b_i + U_i x^{(t)} + W_i \left(r^{(t)} \odot h^{(t-1)}\right)\right), \qquad (3)$$

where $r$ stands for the reset gate and $\odot$ denotes element-wise multiplication. If $r_i^{(t)}$ is close to 0, the reset gate is off and the unit forgets the past information. The reset gate is defined as:

$$r_i^{(t)} = \sigma\left(b_i^{r} + U_i^{r} x^{(t)} + W_i^{r} h^{(t-1)}\right). \qquad (4)$$

The update and reset gates can each ignore parts of the state vector. The update gates decide how much the past states should impact the current states, while the reset gates introduce a nonlinear effect into the relationship between the past state and the future state: they decide which parts of the state are used to compute the future state.

In our bidirectional architecture, the forward GRUs are computed from past states along the positive time axis, while the backward GRUs are computed from future states along the reverse time axis. For instance, the activation at time $t$ of a backward GRU is calculated from the future activation $h_i^{(t+1)}$ and the current candidate update:

$$h_i^{(t)} = u_i^{(t)} \tilde{h}_i^{(t)} + \left(1 - u_i^{(t)}\right) h_i^{(t+1)}, \qquad (5)$$

and the other formulas are analogous, computed along the reverse time axis.

Compared with the LSTM, the GRU has a simpler structure that still captures temporal correlations in musical signals while mitigating the vanishing and exploding gradient problem. GRU and LSTM can both preserve important information through their internal gates when dealing with long-term dependencies. But in the GRU, the activations of the gates depend only on the previous output and the current input. Thus, the simpler GRU is less prone to overfitting and tends to converge faster than the LSTM, with fewer parameters.
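The recurrences in Eqs. (1)-(5) can be made concrete with a small pure-Python sketch. It is only an illustration under stated assumptions: the parameter dictionary layout, the helper names (`affine`, `gru_step`, `run_gru`, `bidirectional_gru`, `toy_params`), the zero initial state, the constant toy weights, and the choice to concatenate the forward and backward states at each time step are all hypothetical and not taken from the paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(bias, U, x, W, h):
    """Return bias + U x + W h as a list (U and W are lists of rows)."""
    return [b + sum(u * xj for u, xj in zip(U_row, x))
              + sum(w * hj for w, hj in zip(W_row, h))
            for b, U_row, W_row in zip(bias, U, W)]


def gru_step(x, h_prev, p):
    """One GRU update following Eqs. (1)-(4)."""
    u = [sigmoid(z) for z in affine(p["b_u"], p["U_u"], x, p["W_u"], h_prev)]  # Eq. (2)
    r = [sigmoid(z) for z in affine(p["b_r"], p["U_r"], x, p["W_r"], h_prev)]  # Eq. (4)
    gated = [ri * hi for ri, hi in zip(r, h_prev)]  # element-wise r * h
    cand = [math.tanh(z) for z in affine(p["b"], p["U"], x, p["W"], gated)]    # Eq. (3)
    return [ui * ci + (1.0 - ui) * hi for ui, ci, hi in zip(u, cand, h_prev)]  # Eq. (1)


def run_gru(frames, p, hidden, reverse=False):
    """Run a GRU over the frames; reverse=True walks the reverse time axis (Eq. 5)."""
    order = list(reversed(frames)) if reverse else list(frames)
    h = [0.0] * hidden  # assumed zero initial state
    states = []
    for x in order:
        h = gru_step(x, h, p)
        states.append(h)
    return states[::-1] if reverse else states  # re-align with the input order


def bidirectional_gru(frames, p_fwd, p_bwd, hidden):
    """Assumption: forward and backward states are concatenated per time step."""
    fwd = run_gru(frames, p_fwd, hidden)
    bwd = run_gru(frames, p_bwd, hidden, reverse=True)
    return [f + b for f, b in zip(fwd, bwd)]


def toy_params(n_in, n_hidden, scale):
    """Hypothetical constant weights, only to make the example runnable."""
    U = [[scale] * n_in for _ in range(n_hidden)]
    W = [[scale] * n_hidden for _ in range(n_hidden)]
    b = [0.0] * n_hidden
    return {"b_u": b, "U_u": U, "W_u": W, "b_r": b, "U_r": U, "W_r": W,
            "b": b, "U": U, "W": W}


frames = [[0.5, -1.0, 0.25], [1.0, 0.0, -0.5], [0.0, 0.75, 0.5]]  # 3 toy frames
out = bidirectional_gru(frames, toy_params(3, 2, 0.1), toy_params(3, 2, -0.1), hidden=2)
print(len(out), len(out[0]))  # 3 time steps, 2 forward + 2 backward units
```

Running the backward pass on the reversed sequence means that the state passed into `gru_step` is the one from time t+1, which is how Eq. (5) differs from Eq. (1).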
3.3 Feature Fusion and Classifier Block

The outputs of the two paralleling blocks are two 256-dimensional vectors. In our hybrid architecture, the CNN and Bi-RNN blocks focus on extracting spatial features and temporal frame orders of musical signals, respectively. Thus, the two vectors need to be fused into one powerful representation to improve music genre classification. Since the two vectors have the same size, we consider two methods of fusing them into one feature representation: 1) add the values of the two vectors element-wise, which yields a new 256-dimensional vector; 2) keep the original values of the two vectors and concatenate them into a 512-dimensional vector. After feature fusion, the fused representation is fed into dense and softmax layers for classification.

In the classifier block, a dense layer maps the fused vector into a feature vector of size 10. A softmax function is then applied to this feature vector for music genre classification. The softmax function is defined as:

$$P(i) = \frac{\exp(x_i)}{\sum_{k=1}^{K} \exp(x_k)}, \qquad (6)$$

where $P(i)$ and $x_i$ respectively represent the probability of the $i$-th music genre and the $i$-th value of the feature vector, and $K = 10$ is the number of genres. The softmax function maps each value of the feature vector into the range between 0 and 1, and the resulting values sum to 1, i.e. $\sum_{i=1}^{K} P(i) = 1$. The 10 values can therefore be regarded as the probabilities of the 10 music genres.

4 Experiment

In this section, we introduce the two datasets used in our experiments and report comparative experimental results that validate the effectiveness of the proposed paralleling architecture.

4.1 Dataset Description

Two classical datasets are used in our experiments. One is the GTZAN dataset [1], which has been used as a benchmark in various music genre classification systems. It consists of 1000 song excerpts evenly distributed over ten genres: Blues, Classical, Country, Disco, Hiphop, Jazz, Metal, Pop, Reggae and Rock. Each excerpt is about 30 seconds long and sampled at 22050 Hz with 16 bits. The other is the Extended Ballroom dataset [27], an extended version of the Ballroom dataset [28]. The Extended Ballroom dataset we use for training and testing consists of 4180 excerpts of 30 seconds each.
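Returning to the fusion and classifier block of Section 3.3, the two fusion options and the softmax of Eq. (6) can be sketched in plain Python. The dense layer is omitted, the function names and the input values are made up for illustration, and subtracting the maximum score before exponentiating is a common numerical-stability step that is not part of Eq. (6) but leaves its result unchanged.

```python
import math


def fuse_add(cnn_vec, rnn_vec):
    """Fusion option 1: element-wise sum (output keeps the input length)."""
    if len(cnn_vec) != len(rnn_vec):
        raise ValueError("element-wise fusion needs equal-length vectors")
    return [a + b for a, b in zip(cnn_vec, rnn_vec)]


def fuse_concat(cnn_vec, rnn_vec):
    """Fusion option 2: concatenation (output length is the sum of both)."""
    return list(cnn_vec) + list(rnn_vec)


def softmax(scores):
    """Eq. (6), shifted by max(scores) for numerical stability."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


cnn_features = [0.1] * 256  # placeholder for the CNN block output
rnn_features = [0.2] * 256  # placeholder for the Bi-RNN block output
added = fuse_add(cnn_features, rnn_features)
joined = fuse_concat(cnn_features, rnn_features)
print(len(added))   # 256
print(len(joined))  # 512

# Made-up outputs of the dense layer, one score per genre.
genre_scores = [2.0, 1.0, 0.5, 0.0, -1.0, 0.3, 1.5, -0.5, 0.8, 0.1]
probs = softmax(genre_scores)
predicted = probs.index(max(probs))
print(predicted, round(sum(probs), 6))
```

Element-wise addition keeps the dense layer's input at 256 values, while concatenation doubles it to 512 and leaves it to the dense layer to weigh the two blocks separately.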
Compared with the Ballroom dataset, the Extended Ballroom dataset has better audio quality and adds 5 new genres of ballroom dance music: Foxtrot, Pasodoble, Salsa, Slowwaltz and Wcswing.

4.2 Experimental Setup

Dataset pre-processing

Deep neural networks need a large amount of input data to learn robust feature representations. However, the datasets used in our experiments contain only 1000 song excerpts and 4180 music tracks, respectively. In order to increase the number of training samples, we cut each song excerpt into shorter music clips of 3 seconds with 50% overlap. The enlarged training sets help our architecture partly avoid overfitting and perform better at feature extraction. Similar to the processing in [29][15], we calculate Fast Fourier Transforms (FFTs) on frames of length 1024 at [...] kHz sampling rate with 50% overlap and use the absolute value of each FFT frame. We finally construct an STFT spectrogram with 128 frames, where each frame is a 513-dimensional vector.

4.3 Result

The music genre classification accuracy of the proposed PRCNN is reported in Table 1. For comparison, we also report other results on the GTZAN dataset presented in [15]. As shown in Table 1, we design our Bi-RNN block with a 2-layer RNN and a 1-layer RNN, respectively, and both outperform the other results on the same dataset. Nevertheless, overfitting can easily appear with the 2-layer RNN during feature learning on such a small dataset. Thus, we use only a 1-layer RNN in our Bi-RNN block to extract features from the spectrogram, and the results show that it achieves better performance than the 2-layer RNN (90.2% versus 88.8%).

Table 1: Genre classification results on GTZAN dataset

Methods                                   Features                       Accuracy
CNN+2-layer RNN                           STFT                           88.8%
CNN+1-layer RNN                           STFT                           90.2%
nnet1                                     STFT                           84.8%
nnet2                                     STFT                           87.4%
KCNN(k=5)+SVM [30]                        Mel-spectrum, SFM, SCF         83.9%
DNN(ReLU+SGD+Dropout) [29]                FFT(aggregation)               83.0%
Multilayer invariant representation [31]  STFT with log representation   82.0%

In order to validate the effectiveness of the additional paralleling RNN block, we design comparative experiments with other typical CNNs. All results in Table 2 are obtained on the GTZAN dataset. As can be seen, compared with using the CNNs alone, every CNN with a paralleling RNN improves music genre classification performance.

Table 2: Improved performance with RNN for different CNNs

CNNs       Without RNN   With RNN
Our CNN    88.0%         92.0%
Alexnet    81.4%         88.8%
Vgg        [...]%        88.7%
ResNet     [...]%        87.6%
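The pre-processing arithmetic of Section 4.2 can be checked with a short sketch that counts the clips cut from one excerpt and the FFT frames in one clip. It assumes excerpts of exactly 30 seconds, the 22050 Hz sampling rate stated for GTZAN in Section 4.1 (the rate used for the FFT step is not legible in the text), and that only complete segments are kept; the helper name `segment_starts` is illustrative.

```python
def segment_starts(total_len, seg_len, overlap=0.5):
    """Start indices of fixed-length segments with fractional overlap."""
    hop = int(seg_len * (1.0 - overlap))
    # Only segments that fit completely inside the signal are kept.
    return list(range(0, total_len - seg_len + 1, hop))


SAMPLE_RATE = 22050              # assumed: GTZAN rate from Section 4.1
excerpt_len = 30 * SAMPLE_RATE   # samples in one 30-second excerpt
clip_len = 3 * SAMPLE_RATE       # samples in one 3-second clip
frame_len = 1024                 # FFT frame length from Section 4.2

clip_starts = segment_starts(excerpt_len, clip_len)
frame_starts = segment_starts(clip_len, frame_len)
bins_per_frame = frame_len // 2 + 1  # non-negative frequencies of a real FFT

print("clips per 30 s excerpt:", len(clip_starts))
print("full frames per 3 s clip:", len(frame_starts))
print("frequency bins per frame:", bins_per_frame)
```

With these assumptions a 30-second excerpt yields 19 clips, and each clip yields 128 complete frames of 513 magnitude values, which matches the spectrogram size reported in Section 4.2.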

5 Conclusion

In this paper, we propose a hybrid architecture, PRCNN, to improve the performance of music genre classification. This end-to-end model consists of paralleling CNN and Bi-RNN blocks for feature extraction. The CNN block focuses on extracting spatial features from the spectrogram of musical signals, whereas the Bi-RNN block is designed to model temporal frame orders. Furthermore, the bidirectional architecture makes the current state depend not only on previous information but also on the future context of the sequence during supervised learning. The outputs of the two paralleling blocks are fused into a more powerful feature vector for music classification. The experiments in this paper demonstrate the effectiveness of our hybrid architecture. Moreover, compared with using CNNs alone, the experimental results show that extracting temporal frame orders from musical signals with RNNs improves music genre classification.

References

[1] G. Tzanetakis and P. Cook, Musical genre classification of audio signals, IEEE Transactions on Speech and Audio Processing, vol. 10, no. 5.
[2] J. Shawe-Taylor and A. Meng, An investigation of feature models for music genre classification using the support vector classifier.
[3] K. West and S. Cox, Finding an optimal segmentation for audio genre classification, in ISMIR.
[4] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern classification (2nd edition), En Broeck the Statistical Mechanics of Learning Rsty.
[5] Z. Fu, G. Lu, K. M. Ting, and D. Zhang, A survey of audio-based music classification and annotation, IEEE Transactions on Multimedia, vol. 13, no. 2.
[6] B. K. Baniya, D. Ghimire, and J. Lee, A novel approach of automatic music genre classification based on timbrai texture and rhythmic content features, in Advanced Communication Technology (ICACT), [...]th International Conference on, IEEE.
[7] G.-B. Huang, Q.-Y. Zhu, and C.-K. Siew, Extreme learning machine: theory and applications, Neurocomputing, vol. 70, no. 1.
[8] L. Breiman, Bagging predictors, Machine Learning, vol. 24, no. 2.
[9] A. F. Arabi and G. Lu, Enhanced polyphonic music genre classification using high level features, in Signal and Image Processing Applications (ICSIPA), 2009 IEEE International Conference on, IEEE.
[10] R. Sarkar and S. K. Saha, Music genre classification using emd and pitch based feature, in Advances in Pattern Recognition (ICAPR), 2015 Eighth International Conference on, pp. 1-6, IEEE.
[11] Y. Wei, W. Xia, M. Lin, J. Huang, B. Ni, J. Dong, Y. Zhao, and S. Yan, Hcp: A flexible cnn framework for multi-label image classification, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 9.
[12] D. C. Ciresan, U. Meier, J. Masci, L. Maria Gambardella, and J. Schmidhuber, Flexible, high performance convolutional neural networks for image classification, in IJCAI Proceedings-International Joint Conference on Artificial Intelligence, vol. 22, p. 1237, Barcelona, Spain.
[13] S. Dieleman and B. Schrauwen, End-to-end learning for music audio, in Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, IEEE.
[14] T. L. Li, A. B. Chan, and A. Chun, Automatic musical pattern feature extraction using convolutional neural network, in Proc. Int. Conf. Data Mining and Applications.
[15] W. Zhang, W. Lei, X. Xu, and X. Xing, Improved music genre classification with convolutional neural networks, in INTERSPEECH.
[16] J. L. Elman, Finding structure in time, Cognitive Science, vol. 14, no. 2.
[17] J. Pons, T. Lidy, and X. Serra, Experimenting with musically motivated convolutional neural networks, in Content-Based Multimedia Indexing (CBMI), [...]th International Workshop on, pp. 1-6, IEEE.
[18] M. Schuster and K. K. Paliwal, Bidirectional recurrent neural networks, IEEE Transactions on Signal Processing, vol. 45, no. 11.
[19] S. Lawrence, C. L. Giles, A. C. Tsoi, and A. D. Back, Face recognition: A convolutional neural-network approach, IEEE Transactions on Neural Networks, vol. 8, no. 1.
[20] K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[21] K. Choi, G. Fazekas, M. Sandler, and K. Cho, Convolutional recurrent neural networks for music classification, arXiv preprint.
[22] A. Krizhevsky, I. Sutskever, and G. E. Hinton, Imagenet classification with deep convolutional neural networks, in Advances in Neural Information Processing Systems.
[23] K. Choi, G. Fazekas, and M. Sandler, Automatic tagging using deep convolutional neural networks, arXiv preprint.
[24] V. Nair and G. E. Hinton, Rectified linear units improve restricted boltzmann machines, in Proceedings of the 27th International Conference on Machine Learning (ICML-10).
[25] K. Cho, B. Van Merriënboer, D. Bahdanau, and Y. Bengio, On the properties of neural machine translation: Encoder-decoder approaches, arXiv preprint.
[26] S. Hochreiter and J. Schmidhuber, Long short-term memory, Neural Computation, vol. 9, no. 8.
[27] U. Marchand and G. Peeters, The extended ballroom dataset.
[28] F. Gouyon, S. Dixon, E. Pampalk, and G. Widmer, Evaluating rhythmic descriptors for musical genre classification.
[29] S. Sigtia and S. Dixon, Improved music feature learning with deep neural networks, in Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, IEEE.
[30] P. Zhang, X. Zheng, W. Zhang, S. Li, S. Qian, W. He, S. Zhang, and Z. Wang, A deep neural network for modeling music, in Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, ACM.
[31] C. Zhang, G. Evangelopoulos, S. Voinea, L. Rosasco, and T. Poggio, A deep representation for invariance and music classification, in Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, IEEE.


More information

A Novel Adaptive Descriptor Algorithm for Ternary Pattern Textures

A Novel Adaptive Descriptor Algorithm for Ternary Pattern Textures A Novel Adaptve Descrptor Algorthm for Ternary Pattern Textures Fahuan Hu 1,2, Guopng Lu 1 *, Zengwen Dong 1 1.School of Mechancal & Electrcal Engneerng, Nanchang Unversty, Nanchang, 330031, Chna; 2. School

More information

Transformation Networks for Target-Oriented Sentiment Classification ACL / 25

Transformation Networks for Target-Oriented Sentiment Classification ACL / 25 Transformaton Networks for Target-Orented Sentment Classfcaton 1 Xn L 1, Ldong Bng 2, Wa Lam 1, Be Sh 1 1 The Chnese Unversty of Hong Kong 2 Tencent AI Lab ACL 2018 1 Jont work wth Tencent AI Lab Transformaton

More information

Detection of an Object by using Principal Component Analysis

Detection of an Object by using Principal Component Analysis Detecton of an Object by usng Prncpal Component Analyss 1. G. Nagaven, 2. Dr. T. Sreenvasulu Reddy 1. M.Tech, Department of EEE, SVUCE, Trupath, Inda. 2. Assoc. Professor, Department of ECE, SVUCE, Trupath,

More information

Face Recognition Based on SVM and 2DPCA

Face Recognition Based on SVM and 2DPCA Vol. 4, o. 3, September, 2011 Face Recognton Based on SVM and 2DPCA Tha Hoang Le, Len Bu Faculty of Informaton Technology, HCMC Unversty of Scence Faculty of Informaton Scences and Engneerng, Unversty

More information

Audio Event Detection and classification using extended R-FCN Approach. Kaiwu Wang, Liping Yang, Bin Yang

Audio Event Detection and classification using extended R-FCN Approach. Kaiwu Wang, Liping Yang, Bin Yang Audo Event Detecton and classfcaton usng extended R-FCN Approach Kawu Wang, Lpng Yang, Bn Yang Key Laboratory of Optoelectronc Technology and Systems(Chongqng Unversty), Mnstry of Educaton, ChongQng Unversty,

More information

Texture Feature Extraction Inspired by Natural Vision System and HMAX Algorithm

Texture Feature Extraction Inspired by Natural Vision System and HMAX Algorithm The Journal of Mathematcs and Computer Scence Avalable onlne at http://www.tjmcs.com The Journal of Mathematcs and Computer Scence Vol. 4 No.2 (2012) 197-206 Texture Feature Extracton Inspred by Natural

More information

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY 1. SSDH: Semi-supervised Deep Hashing for Large Scale Image Retrieval

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY 1. SSDH: Semi-supervised Deep Hashing for Large Scale Image Retrieval IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY SSDH: Sem-supervsed Deep Hashng for Large Scale Image Retreval Jan Zhang, and Yuxn Peng arxv:607.08477v2 [cs.cv] 8 Jun 207 Abstract Hashng

More information

Classifier Selection Based on Data Complexity Measures *

Classifier Selection Based on Data Complexity Measures * Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.

More information

ALEXNET FEATURE EXTRACTION AND MULTI-KERNEL LEARNING FOR OBJECT- ORIENTED CLASSIFICATION

ALEXNET FEATURE EXTRACTION AND MULTI-KERNEL LEARNING FOR OBJECT- ORIENTED CLASSIFICATION ALEXNET FEATURE EXTRACTION AND MULTI-KERNEL LEARNING FOR OBJECT- ORIENTED CLASSIFICATION Lng Dng 1, Hongy L 2, *, Changmao Hu 2, We Zhang 2, Shumn Wang 1 1 Insttute of Earthquake Forecastng, Chna Earthquake

More information

Scale Selective Extended Local Binary Pattern For Texture Classification

Scale Selective Extended Local Binary Pattern For Texture Classification Scale Selectve Extended Local Bnary Pattern For Texture Classfcaton Yutng Hu, Zhlng Long, and Ghassan AlRegb Multmeda & Sensors Lab (MSL) Georga Insttute of Technology 03/09/017 Outlne Texture Representaton

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

A Background Subtraction for a Vision-based User Interface *

A Background Subtraction for a Vision-based User Interface * A Background Subtracton for a Vson-based User Interface * Dongpyo Hong and Woontack Woo KJIST U-VR Lab. {dhon wwoo}@kjst.ac.kr Abstract In ths paper, we propose a robust and effcent background subtracton

More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,

More information

Performance Assessment and Fault Diagnosis for Hydraulic Pump Based on WPT and SOM

Performance Assessment and Fault Diagnosis for Hydraulic Pump Based on WPT and SOM Performance Assessment and Fault Dagnoss for Hydraulc Pump Based on WPT and SOM Be Jkun, Lu Chen and Wang Zl PERFORMANCE ASSESSMENT AND FAULT DIAGNOSIS FOR HYDRAULIC PUMP BASED ON WPT AND SOM. Be Jkun,

More information

Machine Learning 9. week

Machine Learning 9. week Machne Learnng 9. week Mappng Concept Radal Bass Functons (RBF) RBF Networks 1 Mappng It s probably the best scenaro for the classfcaton of two dataset s to separate them lnearly. As you see n the below

More information

Learning-based License Plate Detection on Edge Features

Learning-based License Plate Detection on Edge Features Learnng-based Lcense Plate Detecton on Edge Features Wng Teng Ho, Woo Hen Yap, Yong Haur Tay Computer Vson and Intellgent Systems (CVIS) Group Unverst Tunku Abdul Rahman, Malaysa wngteng_h@yahoo.com, woohen@yahoo.com,

More information

Hierarchical Image Retrieval by Multi-Feature Fusion

Hierarchical Image Retrieval by Multi-Feature Fusion Preprnts (www.preprnts.org) NOT PEER-REVIEWED Posted: 26 Aprl 207 do:0.20944/preprnts20704.074.v Artcle Herarchcal Image Retreval by Mult- Fuson Xaojun Lu, Jaojuan Wang,Yngq Hou, Me Yang, Q Wang* and Xangde

More information

High resolution 3D Tau-p transform by matching pursuit Weiping Cao* and Warren S. Ross, Shearwater GeoServices

High resolution 3D Tau-p transform by matching pursuit Weiping Cao* and Warren S. Ross, Shearwater GeoServices Hgh resoluton 3D Tau-p transform by matchng pursut Wepng Cao* and Warren S. Ross, Shearwater GeoServces Summary The 3D Tau-p transform s of vtal sgnfcance for processng sesmc data acqured wth modern wde

More information

Research Article A High-Order CFS Algorithm for Clustering Big Data

Research Article A High-Order CFS Algorithm for Clustering Big Data Moble Informaton Systems Volume 26, Artcle ID 435627, 8 pages http://dx.do.org/.55/26/435627 Research Artcle A Hgh-Order Algorthm for Clusterng Bg Data Fanyu Bu,,2 Zhku Chen, Peng L, Tong Tang, 3 andyngzhang

More information

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach Angle Estmaton and Correcton of Hand Wrtten, Textual and Large areas of Non-Textual Document Images: A Novel Approach D.R.Ramesh Babu Pyush M Kumat Mahesh D Dhannawat PES Insttute of Technology Research

More information

A Gradient Difference based Technique for Video Text Detection

A Gradient Difference based Technique for Video Text Detection A Gradent Dfference based Technque for Vdeo Text Detecton Palaahnakote Shvakumara, Trung Quy Phan and Chew Lm Tan School of Computng, Natonal Unversty of Sngapore {shva, phanquyt, tancl }@comp.nus.edu.sg

More information

A Gradient Difference based Technique for Video Text Detection

A Gradient Difference based Technique for Video Text Detection 2009 10th Internatonal Conference on Document Analyss and Recognton A Gradent Dfference based Technque for Vdeo Text Detecton Palaahnakote Shvakumara, Trung Quy Phan and Chew Lm Tan School of Computng,

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

EYE CENTER LOCALIZATION ON A FACIAL IMAGE BASED ON MULTI-BLOCK LOCAL BINARY PATTERNS

EYE CENTER LOCALIZATION ON A FACIAL IMAGE BASED ON MULTI-BLOCK LOCAL BINARY PATTERNS P.G. Demdov Yaroslavl State Unversty Anatoly Ntn, Vladmr Khryashchev, Olga Stepanova, Igor Kostern EYE CENTER LOCALIZATION ON A FACIAL IMAGE BASED ON MULTI-BLOCK LOCAL BINARY PATTERNS Yaroslavl, 2015 Eye

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET

BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET 1 BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET TZU-CHENG CHUANG School of Electrcal and Computer Engneerng, Purdue Unversty, West Lafayette, Indana 47907 SAUL B. GELFAND School

More information

CS 534: Computer Vision Model Fitting

CS 534: Computer Vision Model Fitting CS 534: Computer Vson Model Fttng Sprng 004 Ahmed Elgammal Dept of Computer Scence CS 534 Model Fttng - 1 Outlnes Model fttng s mportant Least-squares fttng Maxmum lkelhood estmaton MAP estmaton Robust

More information

Image Matching Algorithm based on Feature-point and DAISY Descriptor

Image Matching Algorithm based on Feature-point and DAISY Descriptor JOURNAL OF MULTIMEDIA, VOL. 9, NO. 6, JUNE 2014 829 Image Matchng Algorthm based on Feature-pont and DAISY Descrptor L L School of Busness, Schuan Agrcultural Unversty, Schuan Dujanyan 611830, Chna Abstract

More information

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION Paulo Quntlano 1 & Antono Santa-Rosa 1 Federal Polce Department, Brasla, Brazl. E-mals: quntlano.pqs@dpf.gov.br and

More information

High-Boost Mesh Filtering for 3-D Shape Enhancement

High-Boost Mesh Filtering for 3-D Shape Enhancement Hgh-Boost Mesh Flterng for 3-D Shape Enhancement Hrokazu Yagou Λ Alexander Belyaev y Damng We z Λ y z ; ; Shape Modelng Laboratory, Unversty of Azu, Azu-Wakamatsu 965-8580 Japan y Computer Graphcs Group,

More information

A Modified Median Filter for the Removal of Impulse Noise Based on the Support Vector Machines

A Modified Median Filter for the Removal of Impulse Noise Based on the Support Vector Machines A Modfed Medan Flter for the Removal of Impulse Nose Based on the Support Vector Machnes H. GOMEZ-MORENO, S. MALDONADO-BASCON, F. LOPEZ-FERRERAS, M. UTRILLA- MANSO AND P. GIL-JIMENEZ Departamento de Teoría

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

An Image Fusion Approach Based on Segmentation Region

An Image Fusion Approach Based on Segmentation Region Rong Wang, L-Qun Gao, Shu Yang, Yu-Hua Cha, and Yan-Chun Lu An Image Fuson Approach Based On Segmentaton Regon An Image Fuson Approach Based on Segmentaton Regon Rong Wang, L-Qun Gao, Shu Yang 3, Yu-Hua

More information

Audio Content Classification Method Research Based on Two-step Strategy

Audio Content Classification Method Research Based on Two-step Strategy (IJACSA) Internatonal Journal of Advanced Computer Scence and Applcatons, Audo Content Classfcaton Method Research Based on Two-step Strategy Sume Lang Department of Computer Scence and Technology Chongqng

More information

Computer Aided Drafting, Design and Manufacturing Volume 25, Number 2, June 2015, Page 14

Computer Aided Drafting, Design and Manufacturing Volume 25, Number 2, June 2015, Page 14 Computer Aded Draftng, Desgn and Manufacturng Volume 5, Number, June 015, Page 14 CADDM Face Recognton Algorthm Fusng Monogenc Bnary Codng and Collaboratve Representaton FU Yu-xan, PENG Lang-yu College

More information

Supplementary Material DESIRE: Distant Future Prediction in Dynamic Scenes with Interacting Agents

Supplementary Material DESIRE: Distant Future Prediction in Dynamic Scenes with Interacting Agents Supplementary Materal DESIRE: Dstant Future Predcton n Dynamc Scenes wth Interactng Agents Namhoon Lee 1, Wongun Cho 2, Paul Vernaza 2, Chrstopher B. Choy 3, Phlp H. S. Torr 1, Manmohan Chandraker 2,4

More information

COMPARISON OF ENHANCED SCHEMES FOR AUDIO CLASSIFICATION

COMPARISON OF ENHANCED SCHEMES FOR AUDIO CLASSIFICATION Volume 4, No. 1, December 13 Journal of Global Research n Computer Scence REVIEW ARTICLE Avalable Onlne at www.grcs.nfo COMPARISON OF ENHANCED SCHEMES FOR AUDIO CLASSIFICATION Dr. V. Radha *1 and G.Anuradha

More information

UB at GeoCLEF Department of Geography Abstract

UB at GeoCLEF Department of Geography   Abstract UB at GeoCLEF 2006 Mguel E. Ruz (1), Stuart Shapro (2), June Abbas (1), Slva B. Southwck (1) and Davd Mark (3) State Unversty of New York at Buffalo (1) Department of Lbrary and Informaton Studes (2) Department

More information

Histogram of Template for Pedestrian Detection

Histogram of Template for Pedestrian Detection PAPER IEICE TRANS. FUNDAMENTALS/COMMUN./ELECTRON./INF. & SYST., VOL. E85-A/B/C/D, No. xx JANUARY 20xx Hstogram of Template for Pedestran Detecton Shaopeng Tang, Non Member, Satosh Goto Fellow Summary In

More information

(a) Input data X n. (b) VersNet. (c) Output data Y n. (d) Supervsed data D n. Fg. 2 Illustraton of tranng for proposed CNN. 2. Related Work In segment

(a) Input data X n. (b) VersNet. (c) Output data Y n. (d) Supervsed data D n. Fg. 2 Illustraton of tranng for proposed CNN. 2. Related Work In segment 一般社団法人電子情報通信学会 THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS 信学技報 IEICE Techncal Report SANE2017-92 (2018-01) Deep Learnng for End-to-End Automatc Target Recognton from Synthetc

More information

Music Structure Boundaries Estimation Using Multiple. Self-Similarity Matrices as Input Depth of Convolutional Neural Networks.

Music Structure Boundaries Estimation Using Multiple. Self-Similarity Matrices as Input Depth of Convolutional Neural Networks. Musc Structure Boundares Estmaton Usng Multple Self-Smlarty Matrces as Input Depth of Convolutonal Neural Networks Alce Cohen-Hadra, Geoffroy Peeters To cte ths verson: Alce Cohen-Hadra, Geoffroy Peeters.

More information

Comparison Study of Textural Descriptors for Training Neural Network Classifiers

Comparison Study of Textural Descriptors for Training Neural Network Classifiers Comparson Study of Textural Descrptors for Tranng Neural Network Classfers G.D. MAGOULAS (1) S.A. KARKANIS (1) D.A. KARRAS () and M.N. VRAHATIS (3) (1) Department of Informatcs Unversty of Athens GR-157.84

More information

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed

More information

Automatic Control of a Digital Reverberation Effect using Hybrid Models

Automatic Control of a Digital Reverberation Effect using Hybrid Models Automatc Control of a Dgtal Reverberaton Effect usng Hybrd Models Emmanoul Theofans Chourdaks 1 and Joshua D. Ress 1 1 Queen Mary Unversty of London, Mle End Road, London E14NS, Unted Kngdom Correspondence

More information

Local Quaternary Patterns and Feature Local Quaternary Patterns

Local Quaternary Patterns and Feature Local Quaternary Patterns Local Quaternary Patterns and Feature Local Quaternary Patterns Jayu Gu and Chengjun Lu The Department of Computer Scence, New Jersey Insttute of Technology, Newark, NJ 0102, USA Abstract - Ths paper presents

More information

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems A Unfed Framework for Semantcs and Feature Based Relevance Feedback n Image Retreval Systems Ye Lu *, Chunhu Hu 2, Xngquan Zhu 3*, HongJang Zhang 2, Qang Yang * School of Computng Scence Smon Fraser Unversty

More information

A Deflected Grid-based Algorithm for Clustering Analysis

A Deflected Grid-based Algorithm for Clustering Analysis A Deflected Grd-based Algorthm for Clusterng Analyss NANCY P. LIN, CHUNG-I CHANG, HAO-EN CHUEH, HUNG-JEN CHEN, WEI-HUA HAO Department of Computer Scence and Informaton Engneerng Tamkang Unversty 5 Yng-chuan

More information

Outline. Self-Organizing Maps (SOM) US Hebbian Learning, Cntd. The learning rule is Hebbian like:

Outline. Self-Organizing Maps (SOM) US Hebbian Learning, Cntd. The learning rule is Hebbian like: Self-Organzng Maps (SOM) Turgay İBRİKÇİ, PhD. Outlne Introducton Structures of SOM SOM Archtecture Neghborhoods SOM Algorthm Examples Summary 1 2 Unsupervsed Hebban Learnng US Hebban Learnng, Cntd 3 A

More information

arxiv: v2 [cs.cv] 9 Apr 2018

arxiv: v2 [cs.cv] 9 Apr 2018 Boundary-senstve Network for Portrat Segmentaton Xanzh Du 1, Xaolong Wang 2, Dawe L 2, Jngwen Zhu 2, Serafettn Tasc 2, Cameron Uprght 2, Stephen Walsh 2, Larry Davs 1 1 Computer Vson Lab, UMIACS, Unversty

More information

Edge Detection in Noisy Images Using the Support Vector Machines

Edge Detection in Noisy Images Using the Support Vector Machines Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona

More information

Face Recognition University at Buffalo CSE666 Lecture Slides Resources:

Face Recognition University at Buffalo CSE666 Lecture Slides Resources: Face Recognton Unversty at Buffalo CSE666 Lecture Sldes Resources: http://www.face-rec.org/algorthms/ Overvew of face recognton algorthms Correlaton - Pxel based correspondence between two face mages Structural

More information

Brushlet Features for Texture Image Retrieval

Brushlet Features for Texture Image Retrieval DICTA00: Dgtal Image Computng Technques and Applcatons, 1 January 00, Melbourne, Australa 1 Brushlet Features for Texture Image Retreval Chbao Chen and Kap Luk Chan Informaton System Research Lab, School

More information

Learning Non-Linearly Separable Boolean Functions With Linear Threshold Unit Trees and Madaline-Style Networks

Learning Non-Linearly Separable Boolean Functions With Linear Threshold Unit Trees and Madaline-Style Networks In AAAI-93: Proceedngs of the 11th Natonal Conference on Artfcal Intellgence, 33-1. Menlo Park, CA: AAAI Press. Learnng Non-Lnearly Separable Boolean Functons Wth Lnear Threshold Unt Trees and Madalne-Style

More information

Efficient Relative Attribute Learning using Graph Neural Networks

Efficient Relative Attribute Learning using Graph Neural Networks Effcent Relatve Attrbute Learnng usng Graph Neural Networks Zhang Meng 1, Nagesh Adluru 1, Hyunwoo J. Km 1, Glenn Fung 2, and Vkas Sngh 1 1 Unversty of Wsconsn Madson 2 Amercan Famly Insurance zhangm@cs.wsc.edu,

More information

Hyperspectral Image Classification Based on Local Binary Patterns and PCANet

Hyperspectral Image Classification Based on Local Binary Patterns and PCANet Hyperspectral Image Classfcaton Based on Local Bnary Patterns and PCANet Huzhen Yang a, Feng Gao a, Junyu Dong a, Yang Yang b a Ocean Unversty of Chna, Department of Computer Scence and Technology b Ocean

More information

Vol. 5, No. 3 March 2014 ISSN Journal of Emerging Trends in Computing and Information Sciences CIS Journal. All rights reserved.

Vol. 5, No. 3 March 2014 ISSN Journal of Emerging Trends in Computing and Information Sciences CIS Journal. All rights reserved. Journal of Emergng Trends n Computng and Informaton Scences 009-03 CIS Journal. All rghts reserved. http://www.csjournal.org Unhealthy Detecton n Lvestock Texture Images usng Subsampled Contourlet Transform

More information

Using Neural Networks and Support Vector Machines in Data Mining

Using Neural Networks and Support Vector Machines in Data Mining Usng eural etworks and Support Vector Machnes n Data Mnng RICHARD A. WASIOWSKI Computer Scence Department Calforna State Unversty Domnguez Hlls Carson, CA 90747 USA Abstract: - Multvarate data analyss

More information

Learning a Class-Specific Dictionary for Facial Expression Recognition

Learning a Class-Specific Dictionary for Facial Expression Recognition BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 16, No 4 Sofa 016 Prnt ISSN: 1311-970; Onlne ISSN: 1314-4081 DOI: 10.1515/cat-016-0067 Learnng a Class-Specfc Dctonary for

More information

An Entropy-Based Approach to Integrated Information Needs Assessment

An Entropy-Based Approach to Integrated Information Needs Assessment Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology

More information

Discriminative Dictionary Learning with Pairwise Constraints

Discriminative Dictionary Learning with Pairwise Constraints Dscrmnatve Dctonary Learnng wth Parwse Constrants Humn Guo Zhuoln Jang LARRY S. DAVIS UNIVERSITY OF MARYLAND Nov. 6 th, Outlne Introducton/motvaton Dctonary Learnng Dscrmnatve Dctonary Learnng wth Parwse

More information

Face Recognition using 3D Directional Corner Points

Face Recognition using 3D Directional Corner Points 2014 22nd Internatonal Conference on Pattern Recognton Face Recognton usng 3D Drectonal Corner Ponts Xun Yu, Yongsheng Gao School of Engneerng Grffth Unversty Nathan, QLD, Australa xun.yu@grffthun.edu.au,

More information

Corner-Based Image Alignment using Pyramid Structure with Gradient Vector Similarity

Corner-Based Image Alignment using Pyramid Structure with Gradient Vector Similarity Journal of Sgnal and Informaton Processng, 013, 4, 114-119 do:10.436/jsp.013.43b00 Publshed Onlne August 013 (http://www.scrp.org/journal/jsp) Corner-Based Image Algnment usng Pyramd Structure wth Gradent

More information

Fusion of Deep Features and Weighted VLAD Vectors based on Multiple Features for Image Retrieval

Fusion of Deep Features and Weighted VLAD Vectors based on Multiple Features for Image Retrieval MATEC Web of Conferences, 0500 (07) DTS-07 DO: 005/matecconf/070500 Fuson of Deep Features and Weghted VLAD Vectors based on Multple Features for mage Retreval Yanhong Wang,, Ygang Cen,, Lequan Lang,*,

More information

Font Recognition in Natural Images via Transfer Learning

Font Recognition in Natural Images via Transfer Learning Font Recognton n Natural Images va Transfer Learnng Yzh Wang, Zhouhu Lan, Yngmn Tang, and Janguo Xao Insttute of Computer Scence and Technology, Pekng Unversty Abstract. Font recognton s an mportant and

More information

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 25, NO. 4, APRIL

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 25, NO. 4, APRIL IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 25, NO. 4, APRIL 2016 1713 Weakly Supervsed Fne-Graned Categorzaton Wth Part-Based Image Representaton Yu Zhang, Xu-Shen We, Janxn Wu, Member, IEEE, Janfe Ca,

More information

PERFORMANCE EVALUATION FOR SCENE MATCHING ALGORITHMS BY SVM

PERFORMANCE EVALUATION FOR SCENE MATCHING ALGORITHMS BY SVM PERFORMACE EVALUAIO FOR SCEE MACHIG ALGORIHMS BY SVM Zhaohu Yang a, b, *, Yngyng Chen a, Shaomng Zhang a a he Research Center of Remote Sensng and Geomatc, ongj Unversty, Shangha 200092, Chna - yzhac@63.com

More information

Collaboratively Regularized Nearest Points for Set Based Recognition

Collaboratively Regularized Nearest Points for Set Based Recognition Academc Center for Computng and Meda Studes, Kyoto Unversty Collaboratvely Regularzed Nearest Ponts for Set Based Recognton Yang Wu, Mchhko Mnoh, Masayuk Mukunok Kyoto Unversty 9/1/013 BMVC 013 @ Brstol,

More information

Efficient Segmentation and Classification of Remote Sensing Image Using Local Self Similarity

Efficient Segmentation and Classification of Remote Sensing Image Using Local Self Similarity ISSN(Onlne): 2320-9801 ISSN (Prnt): 2320-9798 Internatonal Journal of Innovatve Research n Computer and Communcaton Engneerng (An ISO 3297: 2007 Certfed Organzaton) Vol.2, Specal Issue 1, March 2014 Proceedngs

More information

Load Balancing for Hex-Cell Interconnection Network

Load Balancing for Hex-Cell Interconnection Network Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,

More information

Music/Voice Separation using the Similarity Matrix. Zafar Rafii & Bryan Pardo

Music/Voice Separation using the Similarity Matrix. Zafar Rafii & Bryan Pardo Musc/Voce Separaton usng the Smlarty Matrx Zafar Raf & Bryan Pardo Introducton Muscal peces are often characterzed by an underlyng repeatng structure over whch varyng elements are supermposed Propellerheads

More information

Gender Classification using Interlaced Derivative Patterns

Gender Classification using Interlaced Derivative Patterns Gender Classfcaton usng Interlaced Dervatve Patterns Author Shobernejad, Ameneh, Gao, Yongsheng Publshed 2 Conference Ttle Proceedngs of the 2th Internatonal Conference on Pattern Recognton (ICPR 2) DOI

More information

Optimizing Document Scoring for Query Retrieval

Optimizing Document Scoring for Query Retrieval Optmzng Document Scorng for Query Retreval Brent Ellwen baellwe@cs.stanford.edu Abstract The goal of ths project was to automate the process of tunng a document query engne. Specfcally, I used machne learnng

More information

Analysis of Continuous Beams in General

Analysis of Continuous Beams in General Analyss of Contnuous Beams n General Contnuous beams consdered here are prsmatc, rgdly connected to each beam segment and supported at varous ponts along the beam. onts are selected at ponts of support,

More information

Available online at Available online at Advanced in Control Engineering and Information Science

Available online at   Available online at   Advanced in Control Engineering and Information Science Avalable onlne at wwwscencedrectcom Avalable onlne at wwwscencedrectcom Proceda Proceda Engneerng Engneerng 00 (2011) 15000 000 (2011) 1642 1646 Proceda Engneerng wwwelsevercom/locate/proceda Advanced

More information

Human Face Recognition Using Generalized. Kernel Fisher Discriminant

Human Face Recognition Using Generalized. Kernel Fisher Discriminant Human Face Recognton Usng Generalzed Kernel Fsher Dscrmnant ng-yu Sun,2 De-Shuang Huang Ln Guo. Insttute of Intellgent Machnes, Chnese Academy of Scences, P.O.ox 30, Hefe, Anhu, Chna. 2. Department of

More information

12/2/2009. Announcements. Parametric / Non-parametric. Case-Based Reasoning. Nearest-Neighbor on Images. Nearest-Neighbor Classification

12/2/2009. Announcements. Parametric / Non-parametric. Case-Based Reasoning. Nearest-Neighbor on Images. Nearest-Neighbor Classification Introducton to Artfcal Intellgence V22.0472-001 Fall 2009 Lecture 24: Nearest-Neghbors & Support Vector Machnes Rob Fergus Dept of Computer Scence, Courant Insttute, NYU Sldes from Danel Yeung, John DeNero

More information

Deep Spatial-Temporal Joint Feature Representation for Video Object Detection

Deep Spatial-Temporal Joint Feature Representation for Video Object Detection sensors Artcle Deep Spatal-Temporal Jont Feature Representaton for Vdeo Object Detecton Baojun Zhao 1,2, Boya Zhao 1,2 ID, Lnbo Tang 1,2, *, Yuq Han 1,2 and Wenzheng Wang 1,2 1 School of Informaton and

More information

The Study of Remote Sensing Image Classification Based on Support Vector Machine

The Study of Remote Sensing Image Classification Based on Support Vector Machine Sensors & Transducers 03 by IFSA http://www.sensorsportal.com The Study of Remote Sensng Image Classfcaton Based on Support Vector Machne, ZHANG Jan-Hua Key Research Insttute of Yellow Rver Cvlzaton and

More information

A Statistical Model Selection Strategy Applied to Neural Networks

A Statistical Model Selection Strategy Applied to Neural Networks A Statstcal Model Selecton Strategy Appled to Neural Networks Joaquín Pzarro Elsa Guerrero Pedro L. Galndo joaqun.pzarro@uca.es elsa.guerrero@uca.es pedro.galndo@uca.es Dpto Lenguajes y Sstemas Informátcos

More information

International Conference on Applied Science and Engineering Innovation (ASEI 2015)

International Conference on Applied Science and Engineering Innovation (ASEI 2015) Internatonal Conference on Appled Scence and Engneerng Innovaton (ASEI 205) Desgn and Implementaton of Novel Agrcultural Remote Sensng Image Classfcaton Framework through Deep Neural Network and Mult-

More information

Modular PCA Face Recognition Based on Weighted Average

Modular PCA Face Recognition Based on Weighted Average odern Appled Scence odular PCA Face Recognton Based on Weghted Average Chengmao Han (Correspondng author) Department of athematcs, Lny Normal Unversty Lny 76005, Chna E-mal: hanchengmao@163.com Abstract

More information

Pruning Training Corpus to Speedup Text Classification 1

Pruning Training Corpus to Speedup Text Classification 1 Prunng Tranng Corpus to Speedup Text Classfcaton Jhong Guan and Shugeng Zhou School of Computer Scence, Wuhan Unversty, Wuhan, 430079, Chna hguan@wtusm.edu.cn State Key Lab of Software Engneerng, Wuhan

More information

Smoothing Spline ANOVA for variable screening

Smoothing Spline ANOVA for variable screening Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory

More information