Language Identification for Texts Written in Transliteration
|
|
- Vernon Montgomery
- 5 years ago
- Views:
Transcription
1 Language Identification for Texts Written in Transiteration Andrey Chepovskiy, Sergey Gusev, Margarita Kurbatova Higher Schoo of Economics, Data Anaysis and Artificia Inteigence Department, Pokrovskiy bouevard, Moscow, Russia Abstract. The probem of identification of natura anguages for the texts written in transiteration is considered. We consider a method of identification of five Savic anguages for texts written with use of a Latin transiteration. We use two ways of creation modes for such texts and compare resuts of those appication. Keywords: statistica text mode; natura anguage identification; transiteration. Introduction In various type of information systems intended for automatic processing of arge amounts of texts in natura anguages various data recognition probems are actua. The requirement of automating textua data processing brings specific importance to the anguage identification probem of a text or a part of a text. At present time the sufficienty accurate anguage recognition methods for ong texts consisting of tens of sentences are known []. Modes based on frequencies of etter combinations are widey used to identify anguage of a text [2, 3]. It was noted in [3] that it is possibe to use rank methods for anguage identification of a text, but they are not suitabe for short texts. Aso in [3] it is concuded that the anguage identification probem for short texts segments is sti actua, and higher accuracy is achieved at the expense of arger mode size and sower processing. In [4] a anguage identification method for texts in natura anguage was studied. This method was appied to texts written in native aphabets for corresponding anguages and good resuts were achieved. In current work we consider using the same method for anguage identification of a text written in Latin transiteration. We use a set of five Savic anguages: Russian, Ukrainian, Byeorussian, Bugarian and Macedonian. A of these anguages use Cyriic aphabet as native, and transiteration must be used to write in Latin. The main probems reated to anguage identification process are: A wide variety of conficting transiteration tabes;
2 4 A. Chepovskiy et a. A frequent use of transiteration rues which do not meet any of standard transiteration tabes. Amount of texts in transiteration is not enough for construction of training sets that are used in anguage identification agorithms. To sove the first probem severa transiteration tabes shoud be used for each of the anguages. Training text sets were created with an automatic text generation process from Cyriic texts by using transiteration tabes. 2 Statistica mode for a natura anguage text The text anguage identification probem is a pattern recognition probem, and its soution can be based on a probabiistic mode. A Bayesian cassifier can be appied to a string of characters assuming that we know statistica characteristics of characters for texts in specific anguage or texts beonging to a given cass. Let s consider a string s which consists of characters с n (n =,, ) that beong to aphabet Σ. Further we wi use the foowing notation: s = с с 2 с - a specific vaue of the string, s i = c i a vaue of a character which is at i-th position in the string. For soving the probem this string must be assigned to one of the casses Y ( =,, K ), where Y denotes one of K anguages. We assume that every cass defines some kind of probabiity distribution on the set of a possibe strings. In that case it is possibe to appy a statistica criterion of maximum ikeihood to determine a cass which contains the string being cassified. The probabiity of a fact that string s wi appear in some anguage equas to product of probabiities that each character of this string wi appear in this anguage provided that a preceding characters wi appear in this anguage too: s... c s ) s s,..., s s 2,..., s 2 )*...* s )* ) () Let s assume that the probabiity distribution for a character at i-th position depends on probabiity distribution of not more than k preceding characters. In this case equation () can be written as foowing: P s s,..., s ) s s,..., s ) (2) ( i i i i i i i k i k i i An estimation of the conditiona probabiities is performed on the training set. For this purpose, the frequencies of a substrings of engths ess than k+2 are cacuated, and the estimated vaue of conditiona probabiity for the next character is a ratio of the frequencies of the corresponding substrings: Y, s i f ( ci m... ci ) i si m i m,..., si i ), m k f ( c... c ), (3) i m i
3 Language Identification for Texts Written in Transiteration 5 f(x) the frequency of substring X in the training set. The estimated vaue of probabiity of string s appearance in cass Y is defined as foows: Y, s) Y, s Y, s s k s k n k k,..., s,..., s 2 2 )* )*...* Y, s ) (4) The cassified string is assigned to the cass with the highest probabiity estimate. 3 Agorithm impementation First, each natura anguage text is converted to a set of words consisting of ower cased characters beonging to the native aphabet of the anguage. It forms a frequency dictionary of substrings having engths in the range [, k+] taken into account the number of word occurrences in the text. This process is executed during the construction of a string mode for a given anguage. This construction is based on training set of texts. The mode is represented as a finite state machine with states marked with preceding character sequences and transitions marked with the next character and corresponding conditiona probabiity. A singe space is appended to the end of each word of the text, then the word is passed as input to the finite state machine. The initia state of the machine corresponds to character sequence consisting of k spaces. According the formuas () (4) a probabiity of transition to the next state from the current one by each of the characters is being cacuated. A probabiity of appearance of the given word is a product of probabiities of a transitions that occurred during the machine operation. The probabiity of a text is a product of probabiities of a its words. For the anguage identification a probabiity of the text appearance for modes of every natura anguage is estimated. The anguage of the mode with the highest probabiity is assigned to the text. 4 Quaity of the text anguage identification For the anguage identification of a text fragment a numeric estimate of its correspondence to a natura anguage text mode can be cacuated. Let the text fragment s consist of characters. A probabiity of its appearance in the text written in -th anguage can be estimated with the formua (4). Then an estimation of this text fragment correspondence to the -th anguage wi be cacuated as: n( Y, s)) E ( s) onst, (5) Y, s) the probabiity of the string s appearance in the anguage Y ;
4 6 A. Chepovskiy et a. the number of characters in the string s; const a normaizing constant. The expected vaue of this estimation doesn t depend on the ength of text fragment. We choose the anguage with a maximum vaue of the estimation E (s). 5 Initia data Transiteration tabes for a of five anguages were constructed. Severa different tabes were used for each anguage from 4 to 0, depending on anguage. Cyriic text sets consisting of at east 500 thousands characters were made for each of the five anguages. These sets were automaticay transiterated into Latin aphabet with each of the transiteration tabes. Some tabes contain ambiguous transation rues (i.e. there are severa versions of Latin character combinations for one Cyriic character). When severa different rues were possibe then ony one of the rues was chosen randomy. These texts subsequenty were used as training texts for creation of text modes. Test sets consisting of at east 50 thousands characters were constructed for each anguage. Test sets are rea texts written in transiteration taken from various sources. We wi ca texts written in transiteration as transiterated texts, and texts written in native aphabets of corresponding anguages as native texts. To compare anguage identification quaity for transiterated texts with anguage identification quaity for native texts the foowing training and text sets of the same size were made: Cyriic sets for the five anguages being considered; Sets for 3 anguages which use Latin aphabet as native. 6 Experimenta resuts We evauate the quaity of our anguage identification method by cacuating precision and reca for individua anguages. When we identify the anguage of a text sampe of known anguage we can determine whether it was identified correcty or not. For a set of test sampes we know the number of correcty identified sampes as we as the number of identification errors for each anguage. For a given anguage precision can be cacuated as a ratio of the number of correcty identified text sampes in this anguage to the overa number of sampes identified to this anguage. Reca is a ratio of correcty identified text sampes in this anguage to the number of a sampes in this anguage in the test set. We use F-measure to combine precision and reca to a singe vaue, which is defined as harmonic mean of precision and reca. To evauate the quaity of the anguage identification method we created a set of anguage modes. Each mode was trained on a text from the training set and marked with the anguage of that text. Then severa sets of text sampes were created from the texts beonging to the test set. Each set incuded 000 text sampes for each anguage
5 Language Identification for Texts Written in Transiteration 7 of the same ength. The sampes were generated as text fragments starting from a randomy seected position inside text with the restriction that the position must be a beginning of a word. Every text sampe was evauated with every mode and the anguage of the mode with the highest estimate was chosen as the identified anguage of the sampe. Then the vaues of F-measure were cacuated for every anguage. The set of modes incuded 5 modes for transiterated texts of Savic anguages as we as 3 modes for native texts with anguages which use Latin as their native aphabet. The test set contained 36 texts for the same anguages. We considered two methods of anguage identification for transiterated texts. The first method uses severa text modes for each anguage. Each mode corresponds to a particuar transiteration tabe for the anguage. The mode is trained on a text which was generated using that transiteration tabe. So the set of modes contains severa different modes for each anguage. If a text sampe receives the highest estimate for any of the modes for a particuar anguage then the sampe is identified as beonging to that anguage. The second method uses exacty one text mode for each anguage. The mode is trained on a text which is a concatenation of a texts generated for the anguage by using a its transiteration tabes. Figures and 2 show dependence of the anguage identification F-measure for transiterated texts on the text sampe ength. Using severa modes for each anguage instead of one mode does not ead to significant improvement of the text anguage identification. Fig.. Dependence of the F-measure on text sampe ength when using severa modes for one anguage.
6 8 A. Chepovskiy et a. Fig. 2. Dependence of the F-measure on text sampe ength when using one mode for one anguage. For comparison figure 3 shows the anguage identification quaity for native texts. Resuts were obtained by training and testing of five anguage modes for the Cyriic sets for the same five Savic anguages. Fig. 3. Dependence of the F-measure on text sampe ength for anguage identification of native texts.
7 Language Identification for Texts Written in Transiteration 9 From figure 3 one can see that anguage identification quaity for transiterated texts is significanty ower than for native texts. The most substantia part of accuracy oss occurs when transiterated text incorrecty cassifies to mode representing transiterated text of another anguage. There are much fewer errors when transiterated text cassifies to a anguage that uses Latin aphabet. To iustrate this fact, we can measure the quaity of separation of transiterated texts from native texts written in Latin. It is much higher than anguage identification quaity for transiterated text. Figure 4 shows F-measure vaues for separation of transiterated texts for various text sampe engths. Fig. 4. F-measure vaues for separation of transiterated texts for various text sampe engths. 7 Concusion The text mode considered in this artice aows to successfuy separate texts written in anguages with Latin-based aphabets from texts written in Latin transiteration for anguages with Cyriic aphabets. The F-measure of such separation reaches 98% for short texts consisting of ony 20 characters. Identification accuracy of anguages using Cyriic aphabet for transiterated texts reaches more than 80% for texts consisting of 40 characters. It was observed that using severa different text modes for a singe anguage does not give a significant advantage over using a singe mode for a anguage.
8 20 A. Chepovskiy et a. References. Mcamee, B.P.: Language identification: a soved probem suitabe for undergraduate instruction Journa of Computing Sciences in Coeges, 20(3), P (2005) 2. Cavnar, W. B., Trenke, J. M.: -gram-based text categorization. In.: Proceedings of SDAIR-94, 3rd Annua Symposium on Document Anaysis and Information Retrieva, Las Vegas, US, P (994) 3. Vatanen, T, Väyrynen, J.J., Virpioja, S.: Language identification of short text segments with n-gram modes. In.: Proceedings of the Seventh conference on Internationa Language Resources and Evauation (LREC'0), P (200) 4. Gusev, G., Chepovskiy, A.: The mode for identification of a natura anguage of the text. Business Informatics, 3(7), P (20)
AN EVOLUTIONARY APPROACH TO OPTIMIZATION OF A LAYOUT CHART
13 AN EVOLUTIONARY APPROACH TO OPTIMIZATION OF A LAYOUT CHART Eva Vona University of Ostrava, 30th dubna st. 22, Ostrava, Czech Repubic e-mai: Eva.Vona@osu.cz Abstract: This artice presents the use of
More informationAutomatic Grouping for Social Networks CS229 Project Report
Automatic Grouping for Socia Networks CS229 Project Report Xiaoying Tian Ya Le Yangru Fang Abstract Socia networking sites aow users to manuay categorize their friends, but it is aborious to construct
More informationIntro to Programming & C Why Program? 1.2 Computer Systems: Hardware and Software. Why Learn to Program?
Intro to Programming & C++ Unit 1 Sections 1.1-3 and 2.1-10, 2.12-13, 2.15-17 CS 1428 Spring 2018 Ji Seaman 1.1 Why Program? Computer programmabe machine designed to foow instructions Program a set of
More informationAutomatic Hidden Web Database Classification
Automatic idden Web atabase Cassification Zhiguo Gong, Jingbai Zhang, and Qian Liu Facuty of Science and Technoogy niversity of Macau Macao, PRC {fstzgg,ma46597,ma46620}@umac.mo Abstract. In this paper,
More informationNearest Neighbor Learning
Nearest Neighbor Learning Cassify based on oca simiarity Ranges from simpe nearest neighbor to case-based and anaogica reasoning Use oca information near the current query instance to decide the cassification
More informationSensitivity Analysis of Hopfield Neural Network in Classifying Natural RGB Color Space
Sensitivity Anaysis of Hopfied Neura Network in Cassifying Natura RGB Coor Space Department of Computer Science University of Sharjah UAE rsammouda@sharjah.ac.ae Abstract: - This paper presents a study
More informationWATERMARKING GIS DATA FOR DIGITAL MAP COPYRIGHT PROTECTION
WATERMARKING GIS DATA FOR DIGITAL MAP COPYRIGHT PROTECTION Shen Tao Chinese Academy of Surveying and Mapping, Beijing 100039, China shentao@casm.ac.cn Xu Dehe Institute of resources and environment, North
More informationFuzzy Equivalence Relation Based Clustering and Its Use to Restructuring Websites Hyperlinks and Web Pages
Fuzzy Equivaence Reation Based Custering and Its Use to Restructuring Websites Hyperinks and Web Pages Dimitris K. Kardaras,*, Xenia J. Mamakou, and Bi Karakostas 2 Business Informatics Laboratory, Dept.
More informationA probabilistic fuzzy method for emitter identification based on genetic algorithm
A probabitic fuzzy method for emitter identification based on genetic agorithm Xia Chen, Weidong Hu, Hongwen Yang, Min Tang ATR Key Lab, Coege of Eectronic Science and Engineering Nationa University of
More informationA study of comparative evaluation of methods for image processing using color features
A study of comparative evauation of methods for image processing using coor features FLORENTINA MAGDA ENESCU,CAZACU DUMITRU Department Eectronics, Computers and Eectrica Engineering University Pitești
More informationA HYBRID FEATURE SELECTION METHOD BASED ON FISHER SCORE AND GENETIC ALGORITHM
Journa of Mathematica Sciences: Advances and Appications Voume 37, 2016, Pages 51-78 Avaiabe at http://scientificadvances.co.in DOI: http://dx.doi.org/10.18642/jmsaa_7100121627 A HYBRID FEATURE SELECTION
More informationA NEW APPROACH FOR BLOCK BASED STEGANALYSIS USING A MULTI-CLASSIFIER
Internationa Journa on Technica and Physica Probems of Engineering (IJTPE) Pubished by Internationa Organization of IOTPE ISSN 077-358 IJTPE Journa www.iotpe.com ijtpe@iotpe.com September 014 Issue 0 Voume
More informationResource Optimization to Provision a Virtual Private Network Using the Hose Model
Resource Optimization to Provision a Virtua Private Network Using the Hose Mode Monia Ghobadi, Sudhakar Ganti, Ghoamai C. Shoja University of Victoria, Victoria C, Canada V8W 3P6 e-mai: {monia, sganti,
More informationA Memory Grouping Method for Sharing Memory BIST Logic
A Memory Grouping Method for Sharing Memory BIST Logic Masahide Miyazai, Tomoazu Yoneda, and Hideo Fuiwara Graduate Schoo of Information Science, Nara Institute of Science and Technoogy (NAIST), 8916-5
More informationFast Methods for Kernel-based Text Analysis
Proceedings of the 41st Annua Meeting of the Association for Computationa Linguistics, Juy 2003, pp. 24-31. Fast Methods for Kerne-based Text Anaysis Taku Kudo and Yuji Matsumoto Graduate Schoo of Information
More informationMassively Parallel Part of Speech Tagging Using. Min-Max Modular Neural Networks.
assivey Parae Part of Speech Tagging Using in-ax oduar Neura Networks Bao-Liang Lu y, Qing a z, ichinori Ichikawa y, & Hitoshi Isahara z y Lab. for Brain-Operative Device, Brain Science Institute, RIEN
More informationA New Supervised Clustering Algorithm Based on Min-Max Modular Network with Gaussian-Zero-Crossing Functions
2006 Internationa Joint Conference on Neura Networks Sheraton Vancouver Wa Centre Hote, Vancouver, BC, Canada Juy 16-21, 2006 A New Supervised Custering Agorithm Based on Min-Max Moduar Network with Gaussian-Zero-Crossing
More informationIntro to Programming & C Why Program? 1.2 Computer Systems: Hardware and Software. Hardware Components Illustrated
Intro to Programming & C++ Unit 1 Sections 1.1-3 and 2.1-10, 2.12-13, 2.15-17 CS 1428 Fa 2017 Ji Seaman 1.1 Why Program? Computer programmabe machine designed to foow instructions Program instructions
More informationMobile App Recommendation: Maximize the Total App Downloads
Mobie App Recommendation: Maximize the Tota App Downoads Zhuohua Chen Schoo of Economics and Management Tsinghua University chenzhh3.12@sem.tsinghua.edu.cn Yinghui (Catherine) Yang Graduate Schoo of Management
More informationArithmetic Coding. Prof. Ja-Ling Wu. Department of Computer Science and Information Engineering National Taiwan University
Arithmetic Coding Prof. Ja-Ling Wu Department of Computer Science and Information Engineering Nationa Taiwan University F(X) Shannon-Fano-Eias Coding W..o.g. we can take X={,,,m}. Assume p()>0 for a. The
More informationImage Segmentation Using Semi-Supervised k-means
I J C T A, 9(34) 2016, pp. 595-601 Internationa Science Press Image Segmentation Using Semi-Supervised k-means Reza Monsefi * and Saeed Zahedi * ABSTRACT Extracting the region of interest is a very chaenging
More informationResponse Surface Model Updating for Nonlinear Structures
Response Surface Mode Updating for Noninear Structures Gonaz Shahidi a, Shamim Pakzad b a PhD Student, Department of Civi and Environmenta Engineering, Lehigh University, ATLSS Engineering Research Center,
More informationReference trajectory tracking for a multi-dof robot arm
Archives of Contro Sciences Voume 5LXI, 5 No. 4, pages 53 57 Reference trajectory tracking for a muti-dof robot arm RÓBERT KRASŇANSKÝ, PETER VALACH, DÁVID SOÓS, JAVAD ZARBAKHSH This paper presents the
More informationAs Michi Henning and Steve Vinoski showed 1, calling a remote
Reducing CORBA Ca Latency by Caching and Prefetching Bernd Brügge and Christoph Vismeier Technische Universität München Method ca atency is a major probem in approaches based on object-oriented middeware
More informationA Method for Calculating Term Similarity on Large Document Collections
$ A Method for Cacuating Term Simiarity on Large Document Coections Wofgang W Bein Schoo of Computer Science University of Nevada Las Vegas, NV 915-019 bein@csunvedu Jeffrey S Coombs and Kazem Taghva Information
More informationExtracting semistructured data from the Web: An XQuery Based Approach
EurAsia-ICT 2002, Shiraz-Iran, 29-31 Oct. Extracting semistructured data from the Web: An XQuery Based Approach Gies Nachouki Université de Nantes - Facuté des Sciences, IRIN, 2, rue de a Houssinière,
More informationSolutions to the Final Exam
CS/Math 24: Intro to Discrete Math 5//2 Instructor: Dieter van Mekebeek Soutions to the Fina Exam Probem Let D be the set of a peope. From the definition of R we see that (x, y) R if and ony if x is a
More informationEfficient Histogram-based Indexing for Video Copy Detection
Efficient Histogram-based Indexing for Video Copy Detection Chih-Yi Chiu, Jenq-Haur Wang*, and Hung-Chi Chang Institute of Information Science, Academia Sinica, Taiwan *Department of Computer Science and
More informationHiding secrete data in compressed images using histogram analysis
University of Woongong Research Onine University of Woongong in Dubai - Papers University of Woongong in Dubai 2 iding secrete data in compressed images using histogram anaysis Farhad Keissarian University
More informationUtility-based Camera Assignment in a Video Network: A Game Theoretic Framework
This artice has been accepted for pubication in a future issue of this journa, but has not been fuy edited. Content may change prior to fina pubication. Y.LI AND B.BHANU CAMERA ASSIGNMENT: A GAME-THEORETIC
More informationApplication of Intelligence Based Genetic Algorithm for Job Sequencing Problem on Parallel Mixed-Model Assembly Line
American J. of Engineering and Appied Sciences 3 (): 5-24, 200 ISSN 94-7020 200 Science Pubications Appication of Inteigence Based Genetic Agorithm for Job Sequencing Probem on Parae Mixed-Mode Assemby
More informationMultiple Medoids based Multi-view Relational Fuzzy Clustering with Minimax Optimization
Mutipe Medoids based Muti-view Reationa Fuzzy Custering with Minimax Optimization Yangtao Wang, Lihui Chen 2, Xiaoi Li Institude for Infocomm Research(I2R), A*STAR, Singapore 2 Schoo of Eectrica and Eectronic
More informationOptimization and Application of Support Vector Machine Based on SVM Algorithm Parameters
Optimization and Appication of Support Vector Machine Based on SVM Agorithm Parameters YAN Hui-feng 1, WANG Wei-feng 1, LIU Jie 2 1 ChongQing University of Posts and Teecom 400065, China 2 Schoo Of Civi
More informationTesting Whether a Set of Code Words Satisfies a Given Set of Constraints *
JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 6, 333-346 (010) Testing Whether a Set of Code Words Satisfies a Given Set of Constraints * HSIN-WEN WEI, WAN-CHEN LU, PEI-CHI HUANG, WEI-KUAN SHIH AND MING-YANG
More informationComparative Analysis of Relevance for SVM-Based Interactive Document Retrieval
Comparative Anaysis for SVM-Based Interactive Document Retrieva Paper: Comparative Anaysis of Reevance for SVM-Based Interactive Document Retrieva Hiroshi Murata, Takashi Onoda, and Seiji Yamada Centra
More informationMACHINE learning techniques can, automatically,
Proceedings of Internationa Joint Conference on Neura Networks, Daas, Texas, USA, August 4-9, 203 High Leve Data Cassification Based on Network Entropy Fiipe Aves Neto and Liang Zhao Abstract Traditiona
More informationNeural Network Enhancement of the Los Alamos Force Deployment Estimator
Missouri University of Science and Technoogy Schoars' Mine Eectrica and Computer Engineering Facuty Research & Creative Works Eectrica and Computer Engineering 1-1-1994 Neura Network Enhancement of the
More informationA Fast Block Matching Algorithm Based on the Winner-Update Strategy
In Proceedings of the Fourth Asian Conference on Computer Vision, Taipei, Taiwan, Jan. 000, Voume, pages 977 98 A Fast Bock Matching Agorithm Based on the Winner-Update Strategy Yong-Sheng Chenyz Yi-Ping
More informationA Design Method for Optimal Truss Structures with Certain Redundancy Based on Combinatorial Rigidity Theory
0 th Word Congress on Structura and Mutidiscipinary Optimization May 9 -, 03, Orando, Forida, USA A Design Method for Optima Truss Structures with Certain Redundancy Based on Combinatoria Rigidity Theory
More informationLoad Balancing by MPLS in Differentiated Services Networks
Load Baancing by MPLS in Differentiated Services Networks Riikka Susitaiva, Jorma Virtamo, and Samui Aato Networking Laboratory, Hesinki University of Technoogy P.O.Box 3000, FIN-02015 HUT, Finand {riikka.susitaiva,
More informationResearch on UAV Fixed Area Inspection based on Image Reconstruction
Research on UAV Fixed Area Inspection based on Image Reconstruction Kun Cao a, Fei Wu b Schoo of Eectronic and Eectrica Engineering, Shanghai University of Engineering Science, Abstract Shanghai 20600,
More informationDynamic Symbolic Execution of Distributed Concurrent Objects
Dynamic Symboic Execution of Distributed Concurrent Objects Andreas Griesmayer 1, Bernhard Aichernig 1,2, Einar Broch Johnsen 3, and Rudof Schatte 1,2 1 Internationa Institute for Software Technoogy, United
More informationA Petrel Plugin for Surface Modeling
A Petre Pugin for Surface Modeing R. M. Hassanpour, S. H. Derakhshan and C. V. Deutsch Structure and thickness uncertainty are important components of any uncertainty study. The exact ocations of the geoogica
More informationQuality Assessment using Tone Mapping Algorithm
Quaity Assessment using Tone Mapping Agorithm Nandiki.pushpa atha, Kuriti.Rajendra Prasad Research Schoar, Assistant Professor, Vignan s institute of engineering for women, Visakhapatnam, Andhra Pradesh,
More informationLecture Notes for Chapter 4 Part III. Introduction to Data Mining
Data Mining Cassification: Basic Concepts, Decision Trees, and Mode Evauation Lecture Notes for Chapter 4 Part III Introduction to Data Mining by Tan, Steinbach, Kumar Adapted by Qiang Yang (2010) Tan,Steinbach,
More informationSolving Large Double Digestion Problems for DNA Restriction Mapping by Using Branch-and-Bound Integer Linear Programming
The First Internationa Symposium on Optimization and Systems Bioogy (OSB 07) Beijing, China, August 8 10, 2007 Copyright 2007 ORSC & APORC pp. 267 279 Soving Large Doube Digestion Probems for DNA Restriction
More informationFREE-FORM ANISOTROPY: A NEW METHOD FOR CRACK DETECTION ON PAVEMENT SURFACE IMAGES
FREE-FORM ANISOTROPY: A NEW METHOD FOR CRACK DETECTION ON PAVEMENT SURFACE IMAGES Tien Sy Nguyen, Stéphane Begot, Forent Ducuty, Manue Avia To cite this version: Tien Sy Nguyen, Stéphane Begot, Forent
More informationInterpreting Individual Classifications of Hierarchical Networks
Interpreting Individua Cassifications of Hierarchica Networks Wi Landecker, Michae D. Thomure, Luís M. A. Bettencourt, Meanie Mitche, Garrett T. Kenyon, and Steven P. Brumby Department of Computer Science
More informationDETERMINING INTUITIONISTIC FUZZY DEGREE OF OVERLAPPING OF COMPUTATION AND COMMUNICATION IN PARALLEL APPLICATIONS USING GENERALIZED NETS
DETERMINING INTUITIONISTIC FUZZY DEGREE OF OVERLAPPING OF COMPUTATION AND COMMUNICATION IN PARALLEL APPLICATIONS USING GENERALIZED NETS Pave Tchesmedjiev, Peter Vassiev Centre for Biomedica Engineering,
More informationDistance Weighted Discrimination and Second Order Cone Programming
Distance Weighted Discrimination and Second Order Cone Programming Hanwen Huang, Xiaosun Lu, Yufeng Liu, J. S. Marron, Perry Haaand Apri 3, 2012 1 Introduction This vignette demonstrates the utiity and
More informationfastruct: model-based clustering made faster
fastruct: mode-based custering made faster Chibiao Chen Forence Forbes Oivier François INRIA Rhone-Apes, 655 Avenue de Europe, Montbonnot, 38334 St Ismier France TIMC-TIMB, Dept Math Bioogy, Facuty of
More informationBacking-up Fuzzy Control of a Truck-trailer Equipped with a Kingpin Sliding Mechanism
Backing-up Fuzzy Contro of a Truck-traier Equipped with a Kingpin Siding Mechanism G. Siamantas and S. Manesis Eectrica & Computer Engineering Dept., University of Patras, Patras, Greece gsiama@upatras.gr;stam.manesis@ece.upatras.gr
More informationIJOART. Copyright 2013 SciResPub. Shri Venkateshwara University, Institute of Computer & Information Science. India. Uttar Pradesh, India
Internationa Journa of Advancements in Research & Technoogy, Voume, Issue 6, June-04 ISSN 78-7763 89 Pattern assification for andwritten Marathi haracters using Gradient Descent of Distributed error with
More informationDevelopment of a hybrid K-means-expectation maximization clustering algorithm
Journa of Computations & Modeing, vo., no.4, 0, -3 ISSN: 79-765 (print, 79-8850 (onine Scienpress Ltd, 0 Deveopment of a hybrid K-means-expectation maximization custering agorithm Adigun Abimboa Adebisi,
More informationResearch of Classification based on Deep Neural Network
2018 Internationa Conference on Sensor Network and Computer Engineering (ICSNCE 2018) Research of Emai Cassification based on Deep Neura Network Wang Yawen Schoo of Computer Science and Engineering Xi
More informationEndoscopic Motion Compensation of High Speed Videoendoscopy
Endoscopic Motion Compensation of High Speed Videoendoscopy Bharath avuri Department of Computer Science and Engineering, University of South Caroina, Coumbia, SC - 901. ravuri@cse.sc.edu Abstract. High
More informationMULTITASK MULTIVARIATE COMMON SPARSE REPRESENTATIONS FOR ROBUST MULTIMODAL BIOMETRICS RECOGNITION. Heng Zhang, Vishal M. Patel and Rama Chellappa
MULTITASK MULTIVARIATE COMMON SPARSE REPRESENTATIONS FOR ROBUST MULTIMODAL BIOMETRICS RECOGNITION Heng Zhang, Visha M. Pate and Rama Cheappa Center for Automation Research University of Maryand, Coage
More informationBinarized support vector machines
Universidad Caros III de Madrid Repositorio instituciona e-archivo Departamento de Estadística http://e-archivo.uc3m.es DES - Working Papers. Statistics and Econometrics. WS 2007-11 Binarized support vector
More informationJoint disparity and motion eld estimation in. stereoscopic image sequences. Ioannis Patras, Nikos Alvertos and Georgios Tziritas y.
FORTH-ICS / TR-157 December 1995 Joint disparity and motion ed estimation in stereoscopic image sequences Ioannis Patras, Nikos Avertos and Georgios Tziritas y Abstract This work aims at determining four
More informationA Comparison of a Second-Order versus a Fourth- Order Laplacian Operator in the Multigrid Algorithm
A Comparison of a Second-Order versus a Fourth- Order Lapacian Operator in the Mutigrid Agorithm Kaushik Datta (kdatta@cs.berkeey.edu Math Project May 9, 003 Abstract In this paper, the mutigrid agorithm
More informationReal-Time Feature Descriptor Matching via a Multi-Resolution Exhaustive Search Method
297 Rea-Time Feature escriptor Matching via a Muti-Resoution Ehaustive Search Method Chi-Yi Tsai, An-Hung Tsao, and Chuan-Wei Wang epartment of Eectrica Engineering, Tamang University, New Taipei City,
More informationTHE PERCENTAGE OCCUPANCY HIT OR MISS TRANSFORM
17th European Signa Processing Conference (EUSIPCO 2009) Gasgow, Scotand, August 24-28, 2009 THE PERCENTAGE OCCUPANCY HIT OR MISS TRANSFORM P. Murray 1, S. Marsha 1, and E.Buinger 2 1 Dept. of Eectronic
More informationFurther Concepts in Geometry
ppendix F Further oncepts in Geometry F. Exporing ongruence and Simiarity Identifying ongruent Figures Identifying Simiar Figures Reading and Using Definitions ongruent Trianges assifying Trianges Identifying
More informationInterpreting Individual Classifications of Hierarchical Networks
Portand State University PDXSchoar Computer Science Facuty Pubications and Presentations Computer Science 2013 Interpreting Individua Cassifications of Hierarchica Networks Wi Landecker Portand State University,
More informationA Novel Method for Early Software Quality Prediction Based on Support Vector Machine
A Nove Method for Eary Software Quaity Prediction Based on Support Vector Machine Fei Xing 1,PingGuo 1;2, and Michae R. Lyu 2 1 Department of Computer Science Beijing Norma University, Beijing, 1875, China
More informationAUTOMATIC IMAGE RETARGETING USING SALIENCY BASED MESH PARAMETERIZATION
S.Sai Kumar et a. / (IJCSIT Internationa Journa of Computer Science and Information Technoogies, Vo. 1 (4, 010, 73-79 AUTOMATIC IMAGE RETARGETING USING SALIENCY BASED MESH PARAMETERIZATION 1 S.Sai Kumar,
More informationGenetic Algorithms for Parallel Code Optimization
Genetic Agorithms for Parae Code Optimization Ender Özcan Dept. of Computer Engineering Yeditepe University Kayışdağı, İstanbu, Turkey Emai: eozcan@cse.yeditepe.edu.tr Abstract- Determining the optimum
More informationProviding Hop-by-Hop Authentication and Source Privacy in Wireless Sensor Networks
The 31st Annua IEEE Internationa Conference on Computer Communications: Mini-Conference Providing Hop-by-Hop Authentication and Source Privacy in Wireess Sensor Networks Yun Li Jian Li Jian Ren Department
More information1682 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 22, NO. 6, DECEMBER Backward Fuzzy Rule Interpolation
1682 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 22, NO. 6, DECEMBER 2014 Bacward Fuzzy Rue Interpoation Shangzhu Jin, Ren Diao, Chai Que, Senior Member, IEEE, and Qiang Shen Abstract Fuzzy rue interpoation
More informationCollinearity and Coplanarity Constraints for Structure from Motion
Coinearity and Copanarity Constraints for Structure from Motion Gang Liu 1, Reinhard Kette 2, and Bodo Rosenhahn 3 1 Institute of Information Sciences and Technoogy, Massey University, New Zeaand, Department
More informationOn a Font Setting based Bayesian Model to extract mathematical expression in PDF files
On a Font Setting based Bayesian Mode to extract mathematica expression in PDF fies Xing Wang Department of Computer Science and Engineering Texas A&M University Coege Station, USA xingwang@cse.tamu.edu
More informationA Big Bang-Big Crunch Type-2 Fuzzy Logic System for Machine Vision-Based Event Detection and Summarization in Real-world Ambient Assisted Living
TFS-2015-0242 1 A Big Bang-Big Crunch Type-2 Fuzzy Logic System for Machine Vision-Based Event Detection and Summarization in Rea-word Ambient Assisted Living Bo Yao, Hani Hagras, Feow, IEEE, Daniya Aghazzawi,
More informationCollaborative Approach to Mitigating ARP Poisoning-based Man-in-the-Middle Attacks
Coaborative Approach to Mitigating ARP Poisoning-based Man-in-the-Midde Attacks Seung Yeob Nam a, Sirojiddin Djuraev a, Minho Park b a Department of Information and Communication Engineering, Yeungnam
More informationfile://j:\macmillancomputerpublishing\chapters\in073.html 3/22/01
Page 1 of 15 Chapter 9 Chapter 9: Deveoping the Logica Data Mode The information requirements and business rues provide the information to produce the entities, attributes, and reationships in ogica mode.
More informationA METHOD FOR GRIDLESS ROUTING OF PRINTED CIRCUIT BOARDS. A. C. Finch, K. J. Mackenzie, G. J. Balsdon, G. Symonds
A METHOD FOR GRIDLESS ROUTING OF PRINTED CIRCUIT BOARDS A C Finch K J Mackenzie G J Basdon G Symonds Raca-Redac Ltd Newtown Tewkesbury Gos Engand ABSTRACT The introduction of fine-ine technoogies to printed
More informationA Novel Linear-Polynomial Kernel to Construct Support Vector Machines for Speech Recognition
Journa of Computer Science 7 (7): 99-996, 20 ISSN 549-3636 20 Science Pubications A Nove Linear-Poynomia Kerne to Construct Support Vector Machines for Speech Recognition Bawant A. Sonkambe and 2 D.D.
More informationAn Exponential Time 2-Approximation Algorithm for Bandwidth
An Exponentia Time 2-Approximation Agorithm for Bandwidth Martin Fürer 1, Serge Gaspers 2, Shiva Prasad Kasiviswanathan 3 1 Computer Science and Engineering, Pennsyvania State University, furer@cse.psu.edu
More informationJOINT IMAGE REGISTRATION AND EXAMPLE-BASED SUPER-RESOLUTION ALGORITHM
JOINT IMAGE REGISTRATION AND AMPLE-BASED SUPER-RESOLUTION ALGORITHM Hyo-Song Kim, Jeyong Shin, and Rae-Hong Park Department of Eectronic Engineering, Schoo of Engineering, Sogang University 35 Baekbeom-ro,
More informationA Near-Optimal Distributed QoS Constrained Routing Algorithm for Multichannel Wireless Sensor Networks
Sensors 2013, 13, 16424-16450; doi:10.3390/s131216424 Artice OPEN ACCESS sensors ISSN 1424-8220 www.mdpi.com/journa/sensors A Near-Optima Distributed QoS Constrained Routing Agorithm for Mutichanne Wireess
More informationFast b-matching via Sufficient Selection Belief Propagation
Fast b-matching via Sufficient Seection Beief Propagation Bert Huang Computer Science Department Coumbia University New York, NY 127 bert@cs.coumbia.edu Tony Jebara Computer Science Department Coumbia
More informationFast b-matching via Sufficient Selection Belief Propagation
Bert Huang Computer Science Department Coumbia University New York, NY 127 bert@cs.coumbia.edu Tony Jebara Computer Science Department Coumbia University New York, NY 127 jebara@cs.coumbia.edu Abstract
More information5940 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 13, NO. 11, NOVEMBER 2014
5940 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 13, NO. 11, NOVEMBER 014 Topoogy-Transparent Scheduing in Mobie Ad Hoc Networks With Mutipe Packet Reception Capabiity Yiming Liu, Member, IEEE,
More informationHuman Action Recognition Using Key Points Displacement
Human Action Recognition Using Key Points Dispacement Kuan-Ting Lai,, Chaur-Heh Hsieh 3, Mao-Fu Lai 4, and Ming-Syan Chen, Research Center for Information Technoogy Innovation, Academia Sinica, Taiwan
More informationMulti-level Shape Recognition based on Wavelet-Transform. Modulus Maxima
uti-eve Shape Recognition based on Waveet-Transform oduus axima Faouzi Aaya Cheikh, Azhar Quddus and oncef Gabbouj Tampere University of Technoogy (TUT), Signa Processing aboratory, P.O. Box 553, FIN-33101
More informationTopology-aware Key Management Schemes for Wireless Multicast
Topoogy-aware Key Management Schemes for Wireess Muticast Yan Sun, Wade Trappe,andK.J.RayLiu Department of Eectrica and Computer Engineering, University of Maryand, Coege Park Emai: ysun, kjriu@gue.umd.edu
More informationConstellation Models for Recognition of Generic Objects
1 Consteation Modes for Recognition of Generic Objects Wei Zhang Schoo of Eectrica Engineering and Computer Science Oregon State University zhangwe@eecs.oregonstate.edu Abstract Recognition of generic
More informationPolygonal Approximation of Point Sets
Poygona Approximation of Point Sets Longin Jan Latecki 1, Rof Lakaemper 1, and Marc Sobe 2 1 CIS Dept., Tempe University, Phiadephia, PA 19122, USA, atecki@tempe.edu, akamper@tempe.edu 2 Statistics Dept.,
More informationQuality of Service Evaluations of Multicast Streaming Protocols *
Quaity of Service Evauations of Muticast Streaming Protocos Haonan Tan Derek L. Eager Mary. Vernon Hongfei Guo omputer Sciences Department University of Wisconsin-Madison, USA {haonan, vernon, guo}@cs.wisc.edu
More informationCERIAS Tech Report Replicated Parallel I/O without Additional Scheduling Costs by Mikhail J. Atallah Center for Education and Research
CERIAS Tech Report 2003-50 Repicated Parae I/O without Additiona Scheduing Costs by Mikhai J. Ataah Center for Education and Research Information Assurance and Security Purdue University, West Lafayette,
More informationPneumo-Mechanical Simulation of a 2 Dof Planar Manipulator
Pneumo-Mechanica Simuation of a 2 Dof Panar Manipuator Hermes GIBERTI, Simone CINQUEMANI Mechanica Engineering Department, Poitecnico di Miano, Campus Bovisa Sud, via La Masa 34, 2156, Miano, Itay ABSTRACT
More informationQuality Driven Web Services Composition
Quaity Driven Web Services Composition Liangzhao Zeng University of New South Waes Sydney, Austraia zzhao@cseunsweduau Jayant Kaagnanam IBM TJ Watson Research Center New York, USA ayant@usibmcom Bouaem
More informationFACE RECOGNITION WITH HARMONIC DE-LIGHTING. s: {lyqing, sgshan, wgao}jdl.ac.cn
FACE RECOGNITION WITH HARMONIC DE-LIGHTING Laiyun Qing 1,, Shiguang Shan, Wen Gao 1, 1 Graduate Schoo, CAS, Beijing, China, 100080 ICT-ISVISION Joint R&D Laboratory for Face Recognition, CAS, Beijing,
More informationExtended Node-Arc Formulation for the K-Edge-Disjoint Hop-Constrained Network Design Problem
Extended Node-Arc Formuation for the K-Edge-Disjoint Hop-Constrained Network Design Probem Quentin Botton Université cathoique de Louvain, Louvain Schoo of Management, (Begique) botton@poms.uc.ac.be Bernard
More informationGrating cell operator features for oriented texture segmentation
Appeared in in: Proc. of the 14th Int. Conf. on Pattern Recognition, Brisbane, Austraia, August 16-20, 1998, pp.1010-1014. Grating ce operator features for oriented texture segmentation P. Kruizinga and
More informationSemi-Supervised Learning with Sparse Distributed Representations
Semi-Supervised Learning with Sparse Distributed Representations David Zieger dzieger@stanford.edu CS 229 Fina Project 1 Introduction For many machine earning appications, abeed data may be very difficut
More informationCongestion Avoidance and Control in Sensor Network Based on Fuzzy AQM
ICSP2012 Proceedings Congestion Avoidance and Contro in Sensor Network Based on Fuzzy AQM Luo Cheng, Xie Weixin,Li Qiusheng ATR Key Laboratory of Nationa Defense Technoogy, Shenzhen University, Shenzhen
More informationCommunity-Aware Opportunistic Routing in Mobile Social Networks
IEEE TRANSACTIONS ON COMPUTERS VOL:PP NO:99 YEAR 213 Community-Aware Opportunistic Routing in Mobie Socia Networks Mingjun Xiao, Member, IEEE Jie Wu, Feow, IEEE, and Liusheng Huang, Member, IEEE Abstract
More informationAUTOMATIC gender classification based on facial images
SUBMITTED TO IEEE TRANSACTIONS ON NEURAL NETWORKS 1 Gender Cassification Using a Min-Max Moduar Support Vector Machine with Incorporating Prior Knowedge Hui-Cheng Lian and Bao-Liang Lu, Senior Member,
More informationMulti-task hidden Markov modeling of spectrogram feature from radar high-resolution range profiles
http://asp.eurasipjournas.com/content/22//86 RESEARCH Open Access Muti-task hidden Markov modeing of spectrogram feature from radar high-resoution range profies Mian Pan, Lan Du *, Penghui Wang, Hongwei
More informationSPT: Storyboard Programming Tool
SPT: Storyboard Programming Too The MIT Facuty has made this artice openy avaiabe. Pease share how this access beneits you. Your story matters. Citation As Pubished Pubisher Singh, Rishabh, and Armando
More information