Audio Event Detection and classification using extended R-FCN Approach. Kaiwu Wang, Liping Yang, Bin Yang

Size: px
Start display at page:

Download "Audio Event Detection and classification using extended R-FCN Approach. Kaiwu Wang, Liping Yang, Bin Yang"

Transcription

1 Audo Event Detecton and classfcaton usng extended R-FCN Approach Kawu Wang, Lpng Yang, Bn Yang Key Laboratory of Optoelectronc Technology and Systems(Chongqng Unversty), Mnstry of Educaton, ChongQng Unversty, Chna {amsturdy, yanglp, ABSTRACT In ths study, we present a new audo event detecton and classfcaton approach based on R-FCN a state-of-the-art fully convolutonal network framework for vsual object detecton. Spectrogram features of audo sgnals are used as the nput of the approach. The proposed approach conssts of two stages lke R-FCN network. In the frst stage, we detect whether there are audo events by sldng convolutonal kernel n tme axs, and then proposals whch possbly contan audo events are generated by RPN (Regon Proposal Networks). In the second stage, tme and frequency doman nformaton are ntegrated to classfy these proposals and refne ther boundares. Our approach can output the postons of audo events drectly whch can nput a two-dmensonal representaton of arbtrary length sound wthout any sze regularzaton. Index Terms audo event detecton, Convolutonal Neural Network, spectrogram feature 1. INTRODUCTION Intellgent survellance system s becomng ncreasngly ubqutous n our lvng envronment. At present, most of the vdeo-based survellance system s lack of robustness and relablty n practcal applcaton. For example, vdeo-based survellance doesn t work well n some specfc scenaros, such as the nght or cloudy crcumstances. Audo survellance has been alone or n combnaton wth vdeo survellance to solve ths problem. The work n [1] descrbed a framework for scene analyss n a typcal survellance scenaro through ntegratng audo and vsual nformaton. Generally, audo stream s much less onerous than vdeo stream and the audo devces are more nexpensve. Audo events detecton has been one of the mportant components to ntellgent survellance of securty. Unlke the statc or changng slowly backgrounds n vdeo survellance backgrounds, there may be some mpulsve sounds n audo backgrounds. Moreover, the audo sgnal s more versatle when audo events are supermposed on one or more backgrounds wth dfferent sgnal-to-nose rato (SNR). Thus, t s a challenge work to detect audo events correctly from an audo segment. Early works manly concentrated on extractng dfferent types of hand-crafted features, and tranng effectve classfers for recognton wth tradtonal machne learnng algorthms. The most typcal classfer s Support Vector Machnes (SVM). For nstance, a method amed at recognzng envronmental sounds for survellance and securty applcatons s presented n [2], whch apples one-class support vector machnes together wth a sophstcated measure. Another approach s to utlze a Gaussan Mxture Model (GMM) for sounds recognton. The proposed approach n [3] frst classfes a gven audo frame nto vocal and non-vocal events, and then performs further classfcaton nto normal and target events usng GMM. The non-statonary (tmefrequency) technques are appled to sound classfcaton and produce a good result. In [4], Jonathan Denns et al. extract mage feature from the SPD mage a novel two-dmensonal representaton that characterzes the spectral power dstrbuton over tme n each frequency sub-band. In [5], [6], features are extracted from the spectrogram mage of sound sgnals for automatc sound recognton. More recently, methods based on Deep Neural Networks (DNNs) have acheved good performance for sound event classfcaton and detecton. In [12], the authors outlne a sound event classfcaton framework that compares audtory mage front end features wth spectrogram mage-based front end features, usng deep neural network classfers. The work n [13] employs an ensemble learnng framework a stack of ensemble classfers named mult-resoluton stackng (MRS). A concatenaton of lower buldng blocks predctons and the expanson of the raw acoustc feature s fed nto each classfer n MRS. The lower buldng blocks descrbe a base classfer n MRS, named boosted deep neural network. In [14], the authors record a database of sounds occurrng n subway trans n real condtons of explotaton and use DNNs to classfy the sounds nto screams, shouts and other categores. In these audo events detecton methods, Deep Neural Networks (DNNs) mostly work as a classfer whch acheve better performance than other tradtonal classfers, but can t drectly detect audo events n real-tme when the audo events occur. In ths paper, we present a new audo event detecton and classfcaton approach by extendng the R-FCN framework [11] whch s a fully convolutonal neural network used n vsual object detecton. Inspred by the framework, the proposed approach also conssts of three parts (not nclude the extractng of feature maps). The frst part s RPN Network [10] for generatng a lst of proposals whch possbly contan audo events. However, unlke the proposals wth dfferent wdths and heghts n [11], the proposals n our extended R-FCN framework have the same heght and just vary n wdth. We do ths just because the sound s one dmenson sgnal. The second part refnes the boundares of above proposals. The thrd part classfes above proposals nto audo events or backgrounds. In the second and thrd parts, we not only use the nformaton n tme doman but also use the nformaton n dfferent frequency ranges for classfcaton and boundary regresson. Spectrogram features of audo sgnals are

2 Fgure 1: Overvew of the system archtecture for audo detecton and classfcaton fed nto the archtecture as nputs. Based on the capacty of Res- Net network [8] for extractng features and solvng the gradent degradaton problem when tranng a deep neural network, feature maps are produced by the modfed ResNet50 network fed wth these spectrogram feature. These feature maps are shared by above three parts. The rest of ths paper s organzed as follows. Secton 2 presents the proposed deep learnng approach for audo detecton and classfcaton. Secton 3 dscusses our experments and results. Secton 4 concludes ths work. 2. PROPOSED APPROACH Our approach follows the deep learnng framework of R- FCN whch s a fully convolutonal neural network for vsual object detecton [11]. We mprove the Regon Proposal Network (RPN) to make the framework sutable for audo detecton. To make the overall system more smply and generate real-tme result, we substtute ResNet50 for ResNet101, and modfy Res- Net50 network. The overall system archtecture essentally conssts of two stages through three parts n Fgure 1. Frstly, we detect the approxmate poston (proposal) of audo event and don t care the audo event class by our frst part. So, ths can be treated as a bnary classfcaton problem. Secondly, we classfy the proposal nto audo events or background and refne the boundares n our second and thrd parts. These proposals are also called as regons of nterest (RoIs) The Modfed ResNet50 Network Based on the capacty of ResNet50 [8] network for extractng features and solvng the degradaton problem when tranng a deep neural network, we extract the shared feature maps from spectrogram feature by the modfed ResNet50 network. Experments show that the performance wll be reduced because of the reducton of feature maps resoluton. So, we remove the last fc layer, poolng layer and three buldng blocks n ResNet50. The modfed ResNet50 begns wth a 7 7 convolutonal layer whch has 64 kernels and a strde of 2, then follows a 3 3 max poolng layer wth a strde of 2, the rest of the modfed Res- Net50 are seres of buldng blocks. The last convolutonal layer has 1024 kernels, so the generated shared feature maps has 1024 channels The Improved RPN Network For audo events detecton, the mproved RPN network outputs the start tme, the length of proposals wth scores estmatng the probablty that these proposals belong to audo events or background. We model ths process followed a fully convolutonal network [10]. The shared feature maps produced by the modfed ResNet50 are fed nto RPN network. A strp wndow sldes over the shared feature maps to generate m base regons for proposals at each poston on the tme axs of feature map. These m base regons are dfferent n length but start at the same tme of audo sgnal corresponded to the current poston of sldng wndow. The sldng wndow has a sze of n 1, n s the heght of the shared feature maps. At each poston of sldng wndow, 256-dmensonal vector representng these m base regons smultaneously s generated, whch s used to produce scores and coordnates of proposals based on these m base regons. The coordnates represent the dfference of start tme and length between a base regon and proposal. The scores estmate the probablty that these m proposals belong to audo

3 events or background. So, 2m scores and 2m coordnates are generated for each poston of sldng wndow. In another word, we generate m base regons at every few frames of audo sgnal, and extract 256-d feature vector from front part of these m base regons to represent all m base regons smultaneously. Then, the 256-d feature vector s used for generatng scores and refnng the poston. Takng nspraton from the character of convolutonal layer, the mproved RPN network s mplemented by three normal convolutonal layers. The 256-d feature vector s produced by a n 1 convolutonal layer whch has 256 kernels and works as a sldng wndow. Then the 256-d vector s smultaneously fed nto two sblng 1 1 convolutonal layers a classfcaton layer (cls layer) and a regresson layer (reg layer). The cls layer and reg layer have 2m convolutonal kernels respectvely. The cls layer outputs 2 scores that estmate the probablty that the proposal belongs to audo events or background. So the cls layer has 2m outputs. The reg layer has 2m outputs representng the dfference between base regons and proposals n start tme and length. The key deal of RPN network for audo detecton s llustrated n the Fgure 2. We use m lnes wth dfferent lengths represent the m base regons n the Fgure 2. Fgure 2: Key deal of RPN network for audo detecton We use two 2-d vectors (B s, B l ) and (G s, G l ) denote the poston of a base regon and ground truth respectvely. We are more concerned wth the tme that audo event occurs, so the 2-d poston vector denotes the start and length of an audo event. We need to defne an operaton F to produce poston vector of proposal from a base regon: ( P, P) F( B, B ) (1) s l s l Here, (P s, P l ) denote the poston vector of a proposal. When the base regon near to the ground truth, we take shft and scale transformaton nto consderaton from a base regon. The operaton F can be smply defned as: P B * t B, P B * exp( t ) (2) s l s s l l l t s,t l s the shft factor and scale factor need to be learned, and they are the reg layer s output for a base regon. So, the correspondng labels are defned as: * * ts ( Gs Bs ) / B, l tl log( Gl / Bl ) (3) When tranng the mproved RPN network, our loss functon for audo detecton followed the mult-task loss n [10] whch can solve the classfcaton and coordnates refnng smultaneously, defned as: 1 L p t L p p p L t t 2 2 * * *, cls, reg, (4) Here, s the ndex of base regons, P represents the probablty that base regon s predcted as an audo event or background. We set the balance weght λ=1. The classfcaton loss L cls s log loss over two classes (audo event or background). The groundtruth label P * s 1 f the base regon s an audo event, otherwse s 0. Ths means the regresson loss L reg (smooth L 1 loss functon, defned n [9]) s actvated only for the base regon near to ground truth. t s a two-dmensonal vector (t s,t l ) for base regon, whch s produced by reg layer. t * s a twodmensonal vector (t * s,t * l ) for base regon The classfcaton and refnng of proposals After the frst stage, we select 6000 proposals whch are most lkely to be audo events from the translatons of all base regons. We do ths by checkng ther probablty to be audo event because the frst stage s a bnary problem. In the second stage, k 2 (C+1)-channel score maps are produced by a convolutonal layer whch has k 2 (C+1) kernels and 2k 2 -channel coordnate maps are produced by a convolutonal layer whch has 2k 2 kernels as seen n Fgure 1. C s the categores of audo events (+1 for background), k equals 3, but whch s a hyper parameter related to the parameter of the followng RoI poolng layer. Gven the 6000 proposals (RoIs), each RoI s dvded nto k k parts, and k k score / coordnate features are produced through RoI poolng layer from score / coordnate maps respectvely. The detals of the RoI poolng layer defned n [11]. After the RoI poolng layer, a vote prncple (mplemented by an average poolng layer) s appled to every channel of scores / coordnates features. Then the 2-d coordnates are generated, and the (C+1)-dmensonal scores for each category are generated after a softmax layer. In the score or coordnate maps, the three red, green, blue maps represent the hgh, mddle, low frequency components respectvely, and each of these three maps n every color (red, green, blue) maps represents sequentally one thrd segment of a proposal n tme axs correspond to the color depth, the deepest color score map represents frst segment of a proposal and the lghtest color score map represents the last segment. Each component of score / coordnate features s from average poolng on only one of k k score / coordnate maps correspond to ther same color. Ths can be understood that not only components at dfferent tme perods but also the dfferent frequency components at same tme perods have dfferent response to the classfyng or refnng of a RoI. For example, there are three RoIs whch have same sze and one thrd overlap as showed n Fgure 3. However, the same one thrd part has dfferent response to the classfyng or refnng of these three RoIs showed by the lghter of every color. Moreover, dfferent frequency components have dfferent response to the classfyng or refnng of each RoI showed by the dfferent color (red, green, blue). What s more, there are always k k score / coordnate features regardless of the sze of RoI. In another word, the sound wth arbtrary length can be processed.

4 Fgure 3: Illustraton of the RoI poolng layer s functon 3. EXPERIMENTS We perform experments on the dataset of IEEE DCASE Challenge 2017 Task 2, whch conssts of solated sound events for three target class (babycry, glassbreak, gunshot) and recordngs of everyday acoustc scenes to serve as background. The background audo materal conssts of recordngs from 15 dfferent audo scenes, and s part of TUT Acoustc Scenes 2016 dataset [7]. We regularze audo sgnals to same sample frequency of 44.1 KHz and synthesze our tranng data accordng to the event-to-background rato (EBR, -6.0, 0, 6.0). Our tranng data has mxtures (5000 per target class, each mxture only contans a target class event). Then we generate the grayscale spectrogram through short-tme Fast Fourer transform. A hammng wndow wth a length of 1024 samples and an overlap of 441 samples s appled. The generated spectrogram feature has a heght of 512. We evaluate our approach on the development dataset. The manly evaluaton metrcs are Error Rate (ER) and F-Score (F) calculated usng event-based onset-only condton wth a collar of 500ms [15]. Table 1: The results for each class on development dataset Class F (%) ER DR IR Baselne babycry Our approach glassbreak gunshot All babycry glassbreak gunshot All Table 1 shows the result on Event-based overall metrcs for each class on development dataset compared wth Baselne system for DCASE Challenge 2017 Task 2. DR, IR are the Deleton Rate, Inserton Rate respectvely [15]. The mplementaton of Baselne s based on a multlayer perceptron archtecture (MLP) and uses log mel-band energes as features. The features are calculated n frames of 40 ms wth a 50% overlap, usng 40 mel bands coverng the frequency range 0 to Hz. The feature vector was constructed usng a 5-frame context, resultng n a feature vector length of 200. The MLP conssts of two dense layers of 50 hdden unts each, wth 20% dropout. For each of the target classes, there s a separate bnary classfer wth one output neuron wth sgmod actvaton, ndcatng the actvty of the target class. Our approach has an ER value of 0.16, F-Scores of 91.4% for all classes, whch outperforms the Baselne system (ER: 0.60, F: 70.0%). The ER values of babycry and gunshot are mproved by 0.67 and 0.53 respectvely. It shows great mprovement for babycry and gunshot. The result on Event-based overall metrcs for each EBR on development dataset s shown n Table 2. F-Scores and ER for all EBRs (not ncludes the mxture whch doesn t contan target event) are 91.7% and The dfference of ER value between each EBR s lower than that of Baselne system, whch proves our approach s more robust to nose. Our approach has lower IR value for each class and EBR, whch ndcates our approach s more relable. Table 2: The results for each EBR on development dataset Class F (%) ER DR IR Baselne Our approach All All On the evaluaton dataset, F-score and ER on segment-based metrcs are and 82.0%, respectvely. Our approach generally produced better result because of the two stage strategy. After the frst stage, there wll be a global summary for an audo event compared wth Baselne s strategy. The 5-frame context of Baselne system only has a local summary for an audo event. At the same tme, the usng of deeper network and the ntegraton of tme and frequency doman nformaton s another reason for the better performance. 4. CONCLUSION We present an audo event detecton and classfcaton approach whch can drectly output the postons of audo events from an audo sgnals wth arbtrary length. We mproved RPN network for generatng proposals whch possbly contan audo events. Tme and frequency doman nformaton are fully utlzed to classfy these proposals and refne ther boundares.

5 5. REFERENCES [1] Crstan M, Bcego M, Murno V, et al. Audo-Vsual Event Recognton n Survellance Vdeo Sequences[J]. IEEE Transactons on Multmeda, 2007, 9(2): [2] Rabaou A, Davy M, Rossgnol S, et al. Usng One-Class SVMs and Wavelets for Audo Survellance[J]. IEEE Transactons on Informaton Forenscs and Securty, 2008, 3(4): [3] Atrey P K, Maddage N C, Kankanhall M S, et al. Audo Based Event Detecton for Multmeda Survellance[C]. nternatonal conference on acoustcs, speech, and sgnal processng, 2006: [4] Denns J, Tran H D, Chng E S, et al. Image Feature Representaton of the Subband Power Dstrbuton for Robust Sound Event Classfcaton[J]. IEEE Transactons on Audo, Speech, and Language Processng, 2013, 21(2): [5] Sharan R V, Mor T J. Nose robust audo survellance usng reduced spectrogram mage feature and one-aganst-all SVM[J]. Neurocomputng, 2015: [6] Sharan R V, Mor T J. Subband Tme-Frequency Image Texture Features for Robust Audo Survellance[J]. IEEE Transactons on Informaton Forenscs and Securty, 2015, 10(12): [7] Mesaros A, Hettola T, Vrtanen T. TUT database for acoustc scene classfcaton and sound event detecton[c]// European Sgnal Processng Conference. 2016: [8] He K, Zhang X, Ren S, et al. Deep Resdual Learnng for Image Recognton[C]. computer vson and pattern recognton, 2015: [9] Grshck R. Fast R-CNN[C]. nternatonal conference on computer vson, 2015: [10] Ren S, He K, Grshck R, et al. Faster R-CNN: Towards Real-Tme Object Detecton wth Regon Proposal Networks[J]. IEEE Transactons on Pattern Analyss and Machne Intellgence, 2015: 1-1. [11] Da J, L Y, He K, et al. R-FCN: Object Detecton va Regon-based Fully Convolutonal Networks[C]. neural nformaton processng systems, 2016: [12] Mcloughln I V, Zhang H, Xe Z, et al. Robust sound event classfcaton usng deep neural networks[j]. IEEE Transactons on Audo, Speech, and Language Processng, 2015, 23(3): [13] Zhang X, Wang D. Boostng contextual nformaton for deep neural network based voce actvty detecton[j]. IEEE Transactons on Audo, Speech, and Language Processng, 2016, 24(2): [14] Lafftte P, Sodoyer D, Tatkeu C, et al. Deep neural networks for automatc detecton of screams and shouted speech n subway trans[c]. nternatonal conference on acoustcs, speech, and sgnal processng, 2016: [15] Mesaros A, Hettola T, Vrtanen T. Metrcs for polyphonc sound event detecton[j]. Appled Scences, 2016, 6(6): 162. [16] Wabel A, Hanazawa T, Hnton G, et al. Phoneme recognton usng tme-delay neural networks[j]. IEEE transactons on acoustcs, speech, and sgnal processng, 1989, 37(3):

Machine Learning 9. week

Machine Learning 9. week Machne Learnng 9. week Mappng Concept Radal Bass Functons (RBF) RBF Networks 1 Mappng It s probably the best scenaro for the classfcaton of two dataset s to separate them lnearly. As you see n the below

More information

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION Paulo Quntlano 1 & Antono Santa-Rosa 1 Federal Polce Department, Brasla, Brazl. E-mals: quntlano.pqs@dpf.gov.br and

More information

Classifying Acoustic Transient Signals Using Artificial Intelligence

Classifying Acoustic Transient Signals Using Artificial Intelligence Classfyng Acoustc Transent Sgnals Usng Artfcal Intellgence Steve Sutton, Unversty of North Carolna At Wlmngton (suttons@charter.net) Greg Huff, Unversty of North Carolna At Wlmngton (jgh7476@uncwl.edu)

More information

Deep learning is a good steganalysis tool when embedding key is reused for different images, even if there is a cover source-mismatch

Deep learning is a good steganalysis tool when embedding key is reused for different images, even if there is a cover source-mismatch Deep learnng s a good steganalyss tool when embeddng key s reused for dfferent mages, even f there s a cover source-msmatch Lonel PIBRE 2,3, Jérôme PASQUET 2,3, Dno IENCO 2,3, Marc CHAUMONT 1,2,3 (1) Unversty

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Lecture 5: Multilayer Perceptrons

Lecture 5: Multilayer Perceptrons Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented

More information

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,

More information

Audio Content Classification Method Research Based on Two-step Strategy

Audio Content Classification Method Research Based on Two-step Strategy (IJACSA) Internatonal Journal of Advanced Computer Scence and Applcatons, Audo Content Classfcaton Method Research Based on Two-step Strategy Sume Lang Department of Computer Scence and Technology Chongqng

More information

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1 4/14/011 Outlne Dscrmnatve classfers for mage recognton Wednesday, Aprl 13 Krsten Grauman UT-Austn Last tme: wndow-based generc obect detecton basc ppelne face detecton wth boostng as case study Today:

More information

Face Detection with Deep Learning

Face Detection with Deep Learning Face Detecton wth Deep Learnng Yu Shen Yus122@ucsd.edu A13227146 Kuan-We Chen kuc010@ucsd.edu A99045121 Yzhou Hao y3hao@ucsd.edu A98017773 Mn Hsuan Wu mhwu@ucsd.edu A92424998 Abstract The project here

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

An Image Fusion Approach Based on Segmentation Region

An Image Fusion Approach Based on Segmentation Region Rong Wang, L-Qun Gao, Shu Yang, Yu-Hua Cha, and Yan-Chun Lu An Image Fuson Approach Based On Segmentaton Regon An Image Fuson Approach Based on Segmentaton Regon Rong Wang, L-Qun Gao, Shu Yang 3, Yu-Hua

More information

Comparison Study of Textural Descriptors for Training Neural Network Classifiers

Comparison Study of Textural Descriptors for Training Neural Network Classifiers Comparson Study of Textural Descrptors for Tranng Neural Network Classfers G.D. MAGOULAS (1) S.A. KARKANIS (1) D.A. KARRAS () and M.N. VRAHATIS (3) (1) Department of Informatcs Unversty of Athens GR-157.84

More information

The Research of Support Vector Machine in Agricultural Data Classification

The Research of Support Vector Machine in Agricultural Data Classification The Research of Support Vector Machne n Agrcultural Data Classfcaton Le Sh, Qguo Duan, Xnmng Ma, Me Weng College of Informaton and Management Scence, HeNan Agrcultural Unversty, Zhengzhou 45000 Chna Zhengzhou

More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information

Hybrid Non-Blind Color Image Watermarking

Hybrid Non-Blind Color Image Watermarking Hybrd Non-Blnd Color Image Watermarkng Ms C.N.Sujatha 1, Dr. P. Satyanarayana 2 1 Assocate Professor, Dept. of ECE, SNIST, Yamnampet, Ghatkesar Hyderabad-501301, Telangana 2 Professor, Dept. of ECE, AITS,

More information

Face Recognition Based on SVM and 2DPCA

Face Recognition Based on SVM and 2DPCA Vol. 4, o. 3, September, 2011 Face Recognton Based on SVM and 2DPCA Tha Hoang Le, Len Bu Faculty of Informaton Technology, HCMC Unversty of Scence Faculty of Informaton Scences and Engneerng, Unversty

More information

Research of Image Recognition Algorithm Based on Depth Learning

Research of Image Recognition Algorithm Based on Depth Learning 208 4th World Conference on Control, Electroncs and Computer Engneerng (WCCECE 208) Research of Image Recognton Algorthm Based on Depth Learnng Zhang Jan, J Xnhao Zhejang Busness College, Hangzhou, Chna,

More information

(a) Input data X n. (b) VersNet. (c) Output data Y n. (d) Supervsed data D n. Fg. 2 Illustraton of tranng for proposed CNN. 2. Related Work In segment

(a) Input data X n. (b) VersNet. (c) Output data Y n. (d) Supervsed data D n. Fg. 2 Illustraton of tranng for proposed CNN. 2. Related Work In segment 一般社団法人電子情報通信学会 THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS 信学技報 IEICE Techncal Report SANE2017-92 (2018-01) Deep Learnng for End-to-End Automatc Target Recognton from Synthetc

More information

Detection of an Object by using Principal Component Analysis

Detection of an Object by using Principal Component Analysis Detecton of an Object by usng Prncpal Component Analyss 1. G. Nagaven, 2. Dr. T. Sreenvasulu Reddy 1. M.Tech, Department of EEE, SVUCE, Trupath, Inda. 2. Assoc. Professor, Department of ECE, SVUCE, Trupath,

More information

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching A Fast Vsual Trackng Algorthm Based on Crcle Pxels Matchng Zhqang Hou hou_zhq@sohu.com Chongzhao Han czhan@mal.xjtu.edu.cn Ln Zheng Abstract: A fast vsual trackng algorthm based on crcle pxels matchng

More information

Image Representation & Visualization Basic Imaging Algorithms Shape Representation and Analysis. outline

Image Representation & Visualization Basic Imaging Algorithms Shape Representation and Analysis. outline mage Vsualzaton mage Vsualzaton mage Representaton & Vsualzaton Basc magng Algorthms Shape Representaton and Analyss outlne mage Representaton & Vsualzaton Basc magng Algorthms Shape Representaton and

More information

Convolutional Neural Network- based Human Recognition for Vision Occupancy Sensors

Convolutional Neural Network- based Human Recognition for Vision Occupancy Sensors 10 Int'l Conf. IP, Comp. Vson, and Pattern Recognton IPCV'18 Convolutonal Neural Network- based Human Recognton for Vson Occupancy Sensors Seung Soo Lee and Manbae Km * Dept. of Computer and Communcatons

More information

arxiv: v1 [cs.sd] 22 Dec 2017

arxiv: v1 [cs.sd] 22 Dec 2017 Musc Genre Classfcaton wth Parallelng Recurrent Convolutonal Neural Network arxv:1712.08370v1 [cs.sd] 22 Dec 2017 Ln Feng, Shenlan Lu, Janng Yao December 2017 Abstract Deep learnng has been demonstrated

More information

EYE CENTER LOCALIZATION ON A FACIAL IMAGE BASED ON MULTI-BLOCK LOCAL BINARY PATTERNS

EYE CENTER LOCALIZATION ON A FACIAL IMAGE BASED ON MULTI-BLOCK LOCAL BINARY PATTERNS P.G. Demdov Yaroslavl State Unversty Anatoly Ntn, Vladmr Khryashchev, Olga Stepanova, Igor Kostern EYE CENTER LOCALIZATION ON A FACIAL IMAGE BASED ON MULTI-BLOCK LOCAL BINARY PATTERNS Yaroslavl, 2015 Eye

More information

Learning a Class-Specific Dictionary for Facial Expression Recognition

Learning a Class-Specific Dictionary for Facial Expression Recognition BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 16, No 4 Sofa 016 Prnt ISSN: 1311-970; Onlne ISSN: 1314-4081 DOI: 10.1515/cat-016-0067 Learnng a Class-Specfc Dctonary for

More information

Focal Loss in 3D Object Detection

Focal Loss in 3D Object Detection 1 Focal Loss n 3D Object Detecton eng Yun1 Le Ta2 Yuan Wang2 Chengju Lu3 Mng Lu2 Fg. 1. Upper two rows show projected 3D object detecton results from the detector traned wth bnary cross entropy. Lower

More information

Reducing Frame Rate for Object Tracking

Reducing Frame Rate for Object Tracking Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg

More information

Vol. 5, No. 3 March 2014 ISSN Journal of Emerging Trends in Computing and Information Sciences CIS Journal. All rights reserved.

Vol. 5, No. 3 March 2014 ISSN Journal of Emerging Trends in Computing and Information Sciences CIS Journal. All rights reserved. Journal of Emergng Trends n Computng and Informaton Scences 009-03 CIS Journal. All rghts reserved. http://www.csjournal.org Unhealthy Detecton n Lvestock Texture Images usng Subsampled Contourlet Transform

More information

BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET

BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET 1 BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET TZU-CHENG CHUANG School of Electrcal and Computer Engneerng, Purdue Unversty, West Lafayette, Indana 47907 SAUL B. GELFAND School

More information

Detection of hand grasping an object from complex background based on machine learning co-occurrence of local image feature

Detection of hand grasping an object from complex background based on machine learning co-occurrence of local image feature Detecton of hand graspng an object from complex background based on machne learnng co-occurrence of local mage feature Shnya Moroka, Yasuhro Hramoto, Nobutaka Shmada, Tadash Matsuo, Yoshak Shra Rtsumekan

More information

A Background Subtraction for a Vision-based User Interface *

A Background Subtraction for a Vision-based User Interface * A Background Subtracton for a Vson-based User Interface * Dongpyo Hong and Woontack Woo KJIST U-VR Lab. {dhon wwoo}@kjst.ac.kr Abstract In ths paper, we propose a robust and effcent background subtracton

More information

Edge Detection in Noisy Images Using the Support Vector Machines

Edge Detection in Noisy Images Using the Support Vector Machines Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona

More information

ACOUSTIC SCENE CLASSIFICATION BASED ON CONVOLUTIONAL NEURAL NETWORK USING DOUBLE IMAGE FEATURES. Sangwook Park Seongkyu Mun Younglo Lee Hanseok Ko

ACOUSTIC SCENE CLASSIFICATION BASED ON CONVOLUTIONAL NEURAL NETWORK USING DOUBLE IMAGE FEATURES. Sangwook Park Seongkyu Mun Younglo Lee Hanseok Ko ACOUSTIC SCENE CLASSIFICATION BASED ON CONVOLUTIONAL NEURAL NETWORK USING DOUBLE IMAGE FEATURES Sangwook Park Seongkyu Mun Younglo Lee Hanseok Ko School of Electrcal Eng., Seoul, 136-713, swpark@spl.korea.ac.kr

More information

SVM-based Learning for Multiple Model Estimation

SVM-based Learning for Multiple Model Estimation SVM-based Learnng for Multple Model Estmaton Vladmr Cherkassky and Yunqan Ma Department of Electrcal and Computer Engneerng Unversty of Mnnesota Mnneapols, MN 55455 {cherkass,myq}@ece.umn.edu Abstract:

More information

Classifier Swarms for Human Detection in Infrared Imagery

Classifier Swarms for Human Detection in Infrared Imagery Classfer Swarms for Human Detecton n Infrared Imagery Yur Owechko, Swarup Medasan, and Narayan Srnvasa HRL Laboratores, LLC 3011 Malbu Canyon Road, Malbu, CA 90265 {owechko, smedasan, nsrnvasa}@hrl.com

More information

Research Article A High-Order CFS Algorithm for Clustering Big Data

Research Article A High-Order CFS Algorithm for Clustering Big Data Moble Informaton Systems Volume 26, Artcle ID 435627, 8 pages http://dx.do.org/.55/26/435627 Research Artcle A Hgh-Order Algorthm for Clusterng Bg Data Fanyu Bu,,2 Zhku Chen, Peng L, Tong Tang, 3 andyngzhang

More information

TN348: Openlab Module - Colocalization

TN348: Openlab Module - Colocalization TN348: Openlab Module - Colocalzaton Topc The Colocalzaton module provdes the faclty to vsualze and quantfy colocalzaton between pars of mages. The Colocalzaton wndow contans a prevew of the two mages

More information

Accurate Overlay Text Extraction for Digital Video Analysis

Accurate Overlay Text Extraction for Digital Video Analysis Accurate Overlay Text Extracton for Dgtal Vdeo Analyss Dongqng Zhang, and Shh-Fu Chang Electrcal Engneerng Department, Columba Unversty, New York, NY 10027. (Emal: dqzhang, sfchang@ee.columba.edu) Abstract

More information

Real-time Joint Tracking of a Hand Manipulating an Object from RGB-D Input

Real-time Joint Tracking of a Hand Manipulating an Object from RGB-D Input Real-tme Jont Tracng of a Hand Manpulatng an Object from RGB-D Input Srnath Srdhar 1 Franzsa Mueller 1 Mchael Zollhöfer 1 Dan Casas 1 Antt Oulasvrta 2 Chrstan Theobalt 1 1 Max Planc Insttute for Informatcs

More information

Image Alignment CSC 767

Image Alignment CSC 767 Image Algnment CSC 767 Image algnment Image from http://graphcs.cs.cmu.edu/courses/15-463/2010_fall/ Image algnment: Applcatons Panorama sttchng Image algnment: Applcatons Recognton of object nstances

More information

Fast Feature Value Searching for Face Detection

Fast Feature Value Searching for Face Detection Vol., No. 2 Computer and Informaton Scence Fast Feature Value Searchng for Face Detecton Yunyang Yan Department of Computer Engneerng Huayn Insttute of Technology Hua an 22300, Chna E-mal: areyyyke@63.com

More information

Local Quaternary Patterns and Feature Local Quaternary Patterns

Local Quaternary Patterns and Feature Local Quaternary Patterns Local Quaternary Patterns and Feature Local Quaternary Patterns Jayu Gu and Chengjun Lu The Department of Computer Scence, New Jersey Insttute of Technology, Newark, NJ 0102, USA Abstract - Ths paper presents

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

Deep Spatial-Temporal Joint Feature Representation for Video Object Detection

Deep Spatial-Temporal Joint Feature Representation for Video Object Detection sensors Artcle Deep Spatal-Temporal Jont Feature Representaton for Vdeo Object Detecton Baojun Zhao 1,2, Boya Zhao 1,2 ID, Lnbo Tang 1,2, *, Yuq Han 1,2 and Wenzheng Wang 1,2 1 School of Informaton and

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

COMPARISON OF ENHANCED SCHEMES FOR AUDIO CLASSIFICATION

COMPARISON OF ENHANCED SCHEMES FOR AUDIO CLASSIFICATION Volume 4, No. 1, December 13 Journal of Global Research n Computer Scence REVIEW ARTICLE Avalable Onlne at www.grcs.nfo COMPARISON OF ENHANCED SCHEMES FOR AUDIO CLASSIFICATION Dr. V. Radha *1 and G.Anuradha

More information

Research and Application of Fingerprint Recognition Based on MATLAB

Research and Application of Fingerprint Recognition Based on MATLAB Send Orders for Reprnts to reprnts@benthamscence.ae The Open Automaton and Control Systems Journal, 205, 7, 07-07 Open Access Research and Applcaton of Fngerprnt Recognton Based on MATLAB Nng Lu* Department

More information

Spectral Subband Centroids for Robust Speaker Identification using Marginalization-based Missing Feature Theory

Spectral Subband Centroids for Robust Speaker Identification using Marginalization-based Missing Feature Theory Spectral Subband Centrods for Robust Speaer Identfcaton usng Margnalzaton-based Mssng Feature Theory Aaron Ncolson, Jac Hanson, James Lyons, Kuldp Palwal Sgnal Processng Laboratory Grffth Unversty, Brsbane,

More information

Classification of Face Images Based on Gender using Dimensionality Reduction Techniques and SVM

Classification of Face Images Based on Gender using Dimensionality Reduction Techniques and SVM Classfcaton of Face Images Based on Gender usng Dmensonalty Reducton Technques and SVM Fahm Mannan 260 266 294 School of Computer Scence McGll Unversty Abstract Ths report presents gender classfcaton based

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

A Fusion Steganographic Algorithm Based on Faster R-CNN

A Fusion Steganographic Algorithm Based on Faster R-CNN Copyrght 2018 Tech Scence Press CMC, vol.55, no.1, pp.1-16, 2018 A Fuson Steganographc Algorthm Based on Faster R-CNN Ruohan Meng 1, 2, Steven G. Rce 3, Jn Wang 4 1, 2, * and Xngmng Sun Abstract: The am

More information

Network Intrusion Detection Based on PSO-SVM

Network Intrusion Detection Based on PSO-SVM TELKOMNIKA Indonesan Journal of Electrcal Engneerng Vol.1, No., February 014, pp. 150 ~ 1508 DOI: http://dx.do.org/10.11591/telkomnka.v1.386 150 Network Intruson Detecton Based on PSO-SVM Changsheng Xang*

More information

Real-time Motion Capture System Using One Video Camera Based on Color and Edge Distribution

Real-time Motion Capture System Using One Video Camera Based on Color and Edge Distribution Real-tme Moton Capture System Usng One Vdeo Camera Based on Color and Edge Dstrbuton YOSHIAKI AKAZAWA, YOSHIHIRO OKADA, AND KOICHI NIIJIMA Graduate School of Informaton Scence and Electrcal Engneerng,

More information

A Modified Median Filter for the Removal of Impulse Noise Based on the Support Vector Machines

A Modified Median Filter for the Removal of Impulse Noise Based on the Support Vector Machines A Modfed Medan Flter for the Removal of Impulse Nose Based on the Support Vector Machnes H. GOMEZ-MORENO, S. MALDONADO-BASCON, F. LOPEZ-FERRERAS, M. UTRILLA- MANSO AND P. GIL-JIMENEZ Departamento de Teoría

More information

Feature Extractions for Iris Recognition

Feature Extractions for Iris Recognition Feature Extractons for Irs Recognton Jnwook Go, Jan Jang, Yllbyung Lee, and Chulhee Lee Department of Electrcal and Electronc Engneerng, Yonse Unversty 134 Shnchon-Dong, Seodaemoon-Gu, Seoul, KOREA Emal:

More information

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points; Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features

More information

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach Angle Estmaton and Correcton of Hand Wrtten, Textual and Large areas of Non-Textual Document Images: A Novel Approach D.R.Ramesh Babu Pyush M Kumat Mahesh D Dhannawat PES Insttute of Technology Research

More information

A Gradient Difference based Technique for Video Text Detection

A Gradient Difference based Technique for Video Text Detection A Gradent Dfference based Technque for Vdeo Text Detecton Palaahnakote Shvakumara, Trung Quy Phan and Chew Lm Tan School of Computng, Natonal Unversty of Sngapore {shva, phanquyt, tancl }@comp.nus.edu.sg

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

CLASSIFICATION OF ULTRASONIC SIGNALS

CLASSIFICATION OF ULTRASONIC SIGNALS The 8 th Internatonal Conference of the Slovenan Socety for Non-Destructve Testng»Applcaton of Contemporary Non-Destructve Testng n Engneerng«September -3, 5, Portorož, Slovena, pp. 7-33 CLASSIFICATION

More information

A New Feature of Uniformity of Image Texture Directions Coinciding with the Human Eyes Perception 1

A New Feature of Uniformity of Image Texture Directions Coinciding with the Human Eyes Perception 1 A New Feature of Unformty of Image Texture Drectons Concdng wth the Human Eyes Percepton Xng-Jan He, De-Shuang Huang, Yue Zhang, Tat-Mng Lo 2, and Mchael R. Lyu 3 Intellgent Computng Lab, Insttute of Intellgent

More information

Support Vector Machines

Support Vector Machines Support Vector Machnes Decson surface s a hyperplane (lne n 2D) n feature space (smlar to the Perceptron) Arguably, the most mportant recent dscovery n machne learnng In a nutshell: map the data to a predetermned

More information

Corner-Based Image Alignment using Pyramid Structure with Gradient Vector Similarity

Corner-Based Image Alignment using Pyramid Structure with Gradient Vector Similarity Journal of Sgnal and Informaton Processng, 013, 4, 114-119 do:10.436/jsp.013.43b00 Publshed Onlne August 013 (http://www.scrp.org/journal/jsp) Corner-Based Image Algnment usng Pyramd Structure wth Gradent

More information

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems A Unfed Framework for Semantcs and Feature Based Relevance Feedback n Image Retreval Systems Ye Lu *, Chunhu Hu 2, Xngquan Zhu 3*, HongJang Zhang 2, Qang Yang * School of Computng Scence Smon Fraser Unversty

More information

A Gradient Difference based Technique for Video Text Detection

A Gradient Difference based Technique for Video Text Detection 2009 10th Internatonal Conference on Document Analyss and Recognton A Gradent Dfference based Technque for Vdeo Text Detecton Palaahnakote Shvakumara, Trung Quy Phan and Chew Lm Tan School of Computng,

More information

Modeling Inter-cluster and Intra-cluster Discrimination Among Triphones

Modeling Inter-cluster and Intra-cluster Discrimination Among Triphones Modelng Inter-cluster and Intra-cluster Dscrmnaton Among Trphones Tom Ko, Bran Mak and Dongpeng Chen Department of Computer Scence and Engneerng The Hong Kong Unversty of Scence and Technology Clear Water

More information

Adaptive Silhouette Extraction and Human Tracking in Dynamic. Environments 1

Adaptive Silhouette Extraction and Human Tracking in Dynamic. Environments 1 Adaptve Slhouette Extracton and Human Trackng n Dynamc Envronments 1 X Chen, Zhha He, Derek Anderson, James Keller, and Marjore Skubc Department of Electrcal and Computer Engneerng Unversty of Mssour,

More information

Scale Selective Extended Local Binary Pattern For Texture Classification

Scale Selective Extended Local Binary Pattern For Texture Classification Scale Selectve Extended Local Bnary Pattern For Texture Classfcaton Yutng Hu, Zhlng Long, and Ghassan AlRegb Multmeda & Sensors Lab (MSL) Georga Insttute of Technology 03/09/017 Outlne Texture Representaton

More information

Iris recognition algorithm based on point covering of high-dimensional space and neural network

Iris recognition algorithm based on point covering of high-dimensional space and neural network Irs recognton algorthm based on pont coverng of hgh-dmensonal space and neural network Wenmng Cao,, Janhu Hu, Gang Xao, Shoujue Wang The College of Informaton Engneerng, ZheJang Unversty of Technology,

More information

What is Object Detection? Face Detection using AdaBoost. Detection as Classification. Principle of Boosting (Schapire 90)

What is Object Detection? Face Detection using AdaBoost. Detection as Classification. Principle of Boosting (Schapire 90) CIS 5543 Coputer Vson Object Detecton What s Object Detecton? Locate an object n an nput age Habn Lng Extensons Vola & Jones, 2004 Dalal & Trggs, 2005 one or ultple objects Object segentaton Object detecton

More information

Multiple Frame Motion Inference Using Belief Propagation

Multiple Frame Motion Inference Using Belief Propagation Multple Frame Moton Inference Usng Belef Propagaton Jang Gao Janbo Sh The Robotcs Insttute Department of Computer and Informaton Scence Carnege Mellon Unversty Unversty of Pennsylvana Pttsburgh, PA 53

More information

A Real-Time System for Monitoring of Cyclists and Pedestrians

A Real-Time System for Monitoring of Cyclists and Pedestrians A Real-Tme System for Montorng of Cyclsts and Pedestrans Janne Hekklä and Oll Slvén Machne Vson Group Infotech Oulu and Department of Electrcal Engneerng FIN-90014 Unversty of Oulu, Oulu, Fnland emal:

More information

Music/Voice Separation using the Similarity Matrix. Zafar Rafii & Bryan Pardo

Music/Voice Separation using the Similarity Matrix. Zafar Rafii & Bryan Pardo Musc/Voce Separaton usng the Smlarty Matrx Zafar Raf & Bryan Pardo Introducton Muscal peces are often characterzed by an underlyng repeatng structure over whch varyng elements are supermposed Propellerheads

More information

Using Neural Networks and Support Vector Machines in Data Mining

Using Neural Networks and Support Vector Machines in Data Mining Usng eural etworks and Support Vector Machnes n Data Mnng RICHARD A. WASIOWSKI Computer Scence Department Calforna State Unversty Domnguez Hlls Carson, CA 90747 USA Abstract: - Multvarate data analyss

More information

WIRELESS CAPSULE ENDOSCOPY IMAGE CLASSIFICATION BASED ON VECTOR SPARSE CODING.

WIRELESS CAPSULE ENDOSCOPY IMAGE CLASSIFICATION BASED ON VECTOR SPARSE CODING. WIRELESS CAPSULE ENDOSCOPY IMAGE CLASSIFICATION BASED ON VECTOR SPARSE CODING Tao Ma 1, Yuexan Zou 1 *, Zhqang Xang 1, Le L 1 and Y L 1 ADSPLAB/ELIP, School of ECE, Pekng Unversty, Shenzhen 518055, Chna

More information

Classifier Selection Based on Data Complexity Measures *

Classifier Selection Based on Data Complexity Measures * Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

Positive Semi-definite Programming Localization in Wireless Sensor Networks

Positive Semi-definite Programming Localization in Wireless Sensor Networks Postve Sem-defnte Programmng Localzaton n Wreless Sensor etworks Shengdong Xe 1,, Jn Wang, Aqun Hu 1, Yunl Gu, Jang Xu, 1 School of Informaton Scence and Engneerng, Southeast Unversty, 10096, anjng Computer

More information

Fuzzy Modeling of the Complexity vs. Accuracy Trade-off in a Sequential Two-Stage Multi-Classifier System

Fuzzy Modeling of the Complexity vs. Accuracy Trade-off in a Sequential Two-Stage Multi-Classifier System Fuzzy Modelng of the Complexty vs. Accuracy Trade-off n a Sequental Two-Stage Mult-Classfer System MARK LAST 1 Department of Informaton Systems Engneerng Ben-Guron Unversty of the Negev Beer-Sheva 84105

More information

High-Boost Mesh Filtering for 3-D Shape Enhancement

High-Boost Mesh Filtering for 3-D Shape Enhancement Hgh-Boost Mesh Flterng for 3-D Shape Enhancement Hrokazu Yagou Λ Alexander Belyaev y Damng We z Λ y z ; ; Shape Modelng Laboratory, Unversty of Azu, Azu-Wakamatsu 965-8580 Japan y Computer Graphcs Group,

More information

Classification Based Mode Decisions for Video over Networks

Classification Based Mode Decisions for Video over Networks Classfcaton Based Mode Decsons for Vdeo over Networks Deepak S. Turaga and Tsuhan Chen Advanced Multmeda Processng Lab Tranng data for Inter-Intra Decson Inter-Intra Decson Regons pdf 6 5 6 5 Energy 4

More information

The Codesign Challenge

The Codesign Challenge ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.

More information

International Conference on Applied Science and Engineering Innovation (ASEI 2015)

International Conference on Applied Science and Engineering Innovation (ASEI 2015) Internatonal Conference on Appled Scence and Engneerng Innovaton (ASEI 205) Desgn and Implementaton of Novel Agrcultural Remote Sensng Image Classfcaton Framework through Deep Neural Network and Mult-

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

ALEXNET FEATURE EXTRACTION AND MULTI-KERNEL LEARNING FOR OBJECT- ORIENTED CLASSIFICATION

ALEXNET FEATURE EXTRACTION AND MULTI-KERNEL LEARNING FOR OBJECT- ORIENTED CLASSIFICATION ALEXNET FEATURE EXTRACTION AND MULTI-KERNEL LEARNING FOR OBJECT- ORIENTED CLASSIFICATION Lng Dng 1, Hongy L 2, *, Changmao Hu 2, We Zhang 2, Shumn Wang 1 1 Insttute of Earthquake Forecastng, Chna Earthquake

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

Hyperspectral Image Classification Based on Local Binary Patterns and PCANet

Hyperspectral Image Classification Based on Local Binary Patterns and PCANet Hyperspectral Image Classfcaton Based on Local Bnary Patterns and PCANet Huzhen Yang a, Feng Gao a, Junyu Dong a, Yang Yang b a Ocean Unversty of Chna, Department of Computer Scence and Technology b Ocean

More information

Enhanced Watermarking Technique for Color Images using Visual Cryptography

Enhanced Watermarking Technique for Color Images using Visual Cryptography Informaton Assurance and Securty Letters 1 (2010) 024-028 Enhanced Watermarkng Technque for Color Images usng Vsual Cryptography Enas F. Al rawashdeh 1, Rawan I.Zaghloul 2 1 Balqa Appled Unversty, MIS

More information

A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION

A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION 1 THE PUBLISHING HOUSE PROCEEDINGS OF THE ROMANIAN ACADEMY, Seres A, OF THE ROMANIAN ACADEMY Volume 4, Number 2/2003, pp.000-000 A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION Tudor BARBU Insttute

More information

Joint Example-based Depth Map Super-Resolution

Joint Example-based Depth Map Super-Resolution Jont Example-based Depth Map Super-Resoluton Yanje L 1, Tanfan Xue,3, Lfeng Sun 1, Janzhuang Lu,3,4 1 Informaton Scence and Technology Department, Tsnghua Unversty, Bejng, Chna Department of Informaton

More information

Multiclass Object Recognition based on Texture Linear Genetic Programming

Multiclass Object Recognition based on Texture Linear Genetic Programming Multclass Object Recognton based on Texture Lnear Genetc Programmng Gustavo Olague 1, Eva Romero 1 Leonardo Trujllo 1, and Br Bhanu 2 1 CICESE, Km. 107 carretera Tjuana-Ensenada, Mexco, olague@ccese.mx,

More information

PERFORMANCE EVALUATION FOR SCENE MATCHING ALGORITHMS BY SVM

PERFORMANCE EVALUATION FOR SCENE MATCHING ALGORITHMS BY SVM PERFORMACE EVALUAIO FOR SCEE MACHIG ALGORIHMS BY SVM Zhaohu Yang a, b, *, Yngyng Chen a, Shaomng Zhang a a he Research Center of Remote Sensng and Geomatc, ongj Unversty, Shangha 200092, Chna - yzhac@63.com

More information

An Improved Image Segmentation Algorithm Based on the Otsu Method

An Improved Image Segmentation Algorithm Based on the Otsu Method 3th ACIS Internatonal Conference on Software Engneerng, Artfcal Intellgence, Networkng arallel/dstrbuted Computng An Improved Image Segmentaton Algorthm Based on the Otsu Method Mengxng Huang, enjao Yu,

More information

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation Intellgent Informaton Management, 013, 5, 191-195 Publshed Onlne November 013 (http://www.scrp.org/journal/m) http://dx.do.org/10.36/m.013.5601 Qualty Improvement Algorthm for Tetrahedral Mesh Based on

More information

Palmprint Feature Extraction Using 2-D Gabor Filters

Palmprint Feature Extraction Using 2-D Gabor Filters Palmprnt Feature Extracton Usng 2-D Gabor Flters Wa Kn Kong Davd Zhang and Wenxn L Bometrcs Research Centre Department of Computng The Hong Kong Polytechnc Unversty Kowloon Hong Kong Correspondng author:

More information

Large-scale Web Video Event Classification by use of Fisher Vectors

Large-scale Web Video Event Classification by use of Fisher Vectors Large-scale Web Vdeo Event Classfcaton by use of Fsher Vectors Chen Sun and Ram Nevata Unversty of Southern Calforna, Insttute for Robotcs and Intellgent Systems Los Angeles, CA 90089, USA {chensun nevata}@usc.org

More information

Novel Pattern-based Fingerprint Recognition Technique Using 2D Wavelet Decomposition

Novel Pattern-based Fingerprint Recognition Technique Using 2D Wavelet Decomposition Mathematcal Methods for Informaton Scence and Economcs Novel Pattern-based Fngerprnt Recognton Technque Usng D Wavelet Decomposton TUDOR BARBU Insttute of Computer Scence of the Romanan Academy T. Codrescu,,

More information