The Analysis and Optimization of KNN Algorithm Space-Time Efficiency for Chinese Text Categorization


Ying Cai and Xiaofei Wang

Dept. of Computer Science and Technology, Beijing Information Science & Technology University, Beijing, P. R. China

Abstract. The performance of any text classification algorithm is reflected in two aspects: the reliability of its classification results and the efficiency of the algorithm. Following the process of the traditional KNN algorithm for Chinese text classification, we analyze the space-time efficiency of its stages while ensuring the reliability of classification. We then optimize the efficiency of the algorithm, and its feasibility in practical applications, in feature extraction, feature weighting, and similarity computing.

Keywords: KNN Algorithm, Space-Time Efficiency, Text Categorization, Feature, Feature Vector, Similarity.

1 Introduction

With the growth of Web resources and the popularization of electronic text, people have begun to pursue efficient and reliable methods of information processing in response to the knowledge explosion and other problems brought about by the rapid development of the information technology industry. Many scholars now work on text classification technology. Text classification assigns a text to one or more predefined classes based on its content [1]. Current text classification methods can be divided into rule-based and statistics-based methods; the latter include Decision Tree, K-nearest neighbor (KNN), Support Vector Machine, and Bayes [2]. The performance of any classification method is reflected in two aspects, namely the reliability and the efficiency of the classification algorithm. Chinese text classification places higher demands on reliability and efficiency because of the inherent complexity of the language, its ambiguous word meanings, its varied language forms, and other characteristics.
Text classification algorithms are often optimized for reliability alone, and such optimizations are very difficult to implement. At the same time, the time and space efficiency of the classification algorithm is easily ignored, or efficiency is optimized at only one stage, so that the advantages of the traditional classification algorithm as a whole are dispersed and weakened. Therefore, following the process of the traditional KNN algorithm, we analyze the time and space efficiency of its different stages in order to obtain a reasonable balance between space and time. This yields an optimized algorithm that is efficient and feasible in practical applications while ensuring the reliability of classification.

S. Lin and X. Huang (Eds.): CSEE 2011, Part I, CCIS 214, pp , Springer-Verlag Berlin Heidelberg 2011

2 KNN Algorithm

2.1 KNN Algorithm Overview

The traditional KNN algorithm is a simple and effective non-parametric algorithm with outstanding precision and recall. Its main problem is the high dimension of the feature space [3]. KNN is a lazy learning method: training is fast, but classification is slow, because the similarity to a large number of samples must be calculated and the classification time grows nonlinearly. The efficiency of a KNN classifier is also strongly affected by the distribution of the training data, and its computational cost can be unbearable in an ordinary computing environment [4]. The traditional KNN algorithm is nevertheless one of the most useful classifiers for Chinese text, so its performance deserves analysis and optimization.

2.2 Traditional KNN Algorithm Process

The basic idea of the traditional KNN algorithm can be expressed as follows. Following the traditional vector space model, text features are formalized as a weighted feature vector [5]. For a given text to be classified, calculate its similarity (distance) to each text in the training set. Then select the K training texts nearest to the text to be classified. Finally, determine the category of the new text according to the categories of these K texts [6]. The algorithm flow is shown in Figure 1.

Fig. 1. Traditional KNN Algorithm Flow
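The last two steps of this process, selecting the K nearest training texts and deciding the category, can be sketched in Python. This is a minimal illustration rather than the authors' implementation: the function name `knn_decide`, the input format (a list of similarity and category pairs already computed for one test text), and the sample values are assumptions made here. The decision rule sums the similarities per category, following the scoring described in Section 5.1.

```python
from collections import defaultdict


def knn_decide(scored_training, k):
    """Pick a category from (similarity, category) pairs of the training texts.

    scored_training holds one pair per training text; the similarities are
    assumed to be computed already (for example with the cosine formula).
    """
    # Keep the k training texts with the highest similarity.
    neighbours = sorted(scored_training, key=lambda p: p[0], reverse=True)[:k]
    # Accumulate the similarity of the neighbours category by category.
    scores = defaultdict(float)
    for similarity, category in neighbours:
        scores[category] += similarity
    # The category with the highest accumulated score wins.
    return max(scores, key=scores.get)


# Hypothetical similarities of one test text to five training texts.
scored = [
    (0.95, "finance"),
    (0.50, "finance"),
    (0.80, "sports"),
    (0.10, "sports"),
    (0.70, "sports"),
]
print(knn_decide(scored, 1))  # finance: only the single nearest text counts
print(knn_decide(scored, 3))  # sports: 0.80 + 0.70 outweighs 0.95
```

The two calls show how the choice of K can change the decision on the same similarities.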

2.3 The Principles of Traditional KNN Algorithm

The design and implementation of the KNN algorithm and of its optimization should follow a few principles to improve space-time efficiency.

(1) Store intermediate results in a disk file.
(2) Minimize the number of disk file accesses.
(3) Use a hash table as the basic storage structure.

3 KNN Algorithm Space-Time Efficiency Analysis

We analyze the space-time efficiency of the traditional KNN algorithm in three stages: feature extraction, feature vector computing, and similarity computing.

3.1 Feature Items Analysis

The way the traditional KNN algorithm obtains feature items has defects in both time and space. First, evaluation functions are currently in widespread use for extracting feature items. But an evaluation function improves extraction accuracy only within limits, its time cost is comparable to that of computing the similarity between plain texts, and this high time consumption adds to the burden of processing the training corpus. Second, feature extraction itself is not demanding of space resources; however, a large-scale set of feature items greatly increases the computational complexity of the subsequent stages, because in the feature vector stage each feature item is calculated as one dimension of the vector.

3.2 Feature Vectors Analysis

The basis of text classification is to convert a document, by a simple and accurate method, into a format that the computer can process [7]. The classic formalization of a text is the feature vector (W_1, W_2, W_3, ..., W_n), where W_i is the weight of the i-th feature item. The text is formalized by assigning a weight to each feature item and forming the feature vector from these numerical weights. Feature items are weighted on the basis of the following two empirical observations [1].

(1) The more often a lexical item appears in a text, the more it is related to the subject of that text.
(2) The more texts of the set a lexical item appears in, the worse it discriminates between texts.

The traditional KNN algorithm uses the tfidf (term frequency inverse document frequency) weighting formula [7]; that is, the weight of feature item w in text t is:

$$\mathrm{tfidf}(w, t) = \#(w, t) \cdot \lg\left(N / \#w\right) \qquad (1)$$
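Formula (1) can be tried on small numbers, taking the first factor as the count of the feature item in the text, N as the total number of texts, and the last term as the number of texts containing the item. The sketch below is illustrative only: it reads lg as the base-10 logarithm, which is an assumption, and the helper names `tfidf` and `weight_vector`, the precomputed document-frequency table, and the sample counts are hypothetical.

```python
import math
from collections import Counter


def tfidf(term_count, n_texts, texts_with_term):
    """Formula (1): count in the text times lg(N / texts containing the item).

    lg is taken as log base 10 here (an assumption).
    """
    return term_count * math.log10(n_texts / texts_with_term)


def weight_vector(tokens, n_texts, doc_freq):
    """Build a sparse weighted feature vector {feature item: weight} for one text.

    doc_freq maps each feature item to the number of texts that contain it
    and is assumed to be computed beforehand.
    """
    counts = Counter(tokens)
    return {w: tfidf(c, n_texts, doc_freq[w]) for w, c in counts.items()}


# An item occurring 3 times in a text and found in 5 of 500 texts.
print(tfidf(3, 500, 5))
# An item found in every text carries no weight, whatever its count.
print(tfidf(7, 500, 500))
print(weight_vector(["bank", "bank", "loan"], 100, {"bank": 10, "loan": 1}))
```

An item that appears in every text gets weight zero, which matches observation (2): such an item does not help to tell texts apart.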

In formula (1), #(w, t) is the number of times feature item w appears in text t, N is the total number of texts, and #w is the number of texts in which feature item w appears.

The dimension of the feature vectors is high, generally more than 200,000. A feature vector is commonly stored in one of two ways.

(1) Fixed-length storage of the text feature vector. It takes up a lot of storage space, but similarity is easy to calculate and seeking is fast.
(2) Variable-length storage of the text feature vector. Only the feature items that actually occur in the text are saved, so little space is needed, but similarity is harder to calculate and the seek time is long.

The tfidf weighting algorithm is as follows.

    for(text_i=first_text to N)
      for(i_tag_j=i_first_tag to text_i.length)
      begin
        #t_j++;
        for(text_k=first_text to N)
          if(search i_tag_j in text_k) #w_j++;
      end
    return #t*log(N/#w);

The time complexity of the above algorithm is O(n^3) and its space complexity is O(1). So much time is spent repeatedly opening the disk file in the inner loop that this becomes the bottleneck of the weighting algorithm. The remedy is to separate out the inner loop, or to cut its cost down to a constant level.

3.3 Similarity Analysis

We calculate the similarity between the test text and each text of the training corpus to measure how similar they are and to provide data support for classification. We use the cosine formula to calculate similarity in Chinese text categorization [8]. The similarity between texts t_i and t_j is:

$$\mathrm{sim}(t_i, t_j) = \frac{\sum_{k=1}^{M} w_{ik}\, w_{jk}}{\sqrt{\sum_{k=1}^{M} w_{ik}^{2} \, \sum_{k=1}^{M} w_{jk}^{2}}} \qquad (2)$$

where w_ik is the weight of the k-th feature item in text t_i and M is the total number of feature items. Because the traditional KNN algorithm must calculate the similarity to every training text, classification costs a lot of time even though the training process is simple [9], so the traditional KNN algorithm classifies with low efficiency. Here the test corpus is a single text.

The basic design of the similarity computation is to trade space for time. The traditional similarity algorithm is as follows.

    for(weight_i=first_weight to M)
      put into hashtable ha;
    for(text_j=first_text to N)
      for(j_weight_k=j_first_weight to M)
        if(search j_weight_k in ha) sim_j();
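The hash-table approach of this pseudocode, combined with the cosine formula (2), might look as follows in Python. This is a sketch under stated assumptions, not the authors' code: the test text is held in a dictionary (playing the role of the hash table `ha`), each training text is a variable-length list of (feature item, weight) pairs, and the function name `similarities` and the sample vectors are hypothetical.

```python
import math


def similarities(test_vec, training_vecs):
    """Cosine similarity of one test text to every training text.

    test_vec: dict {feature item: weight}, looked up like a hash table.
    training_vecs: list of variable-length lists of (feature item, weight).
    """
    test_norm = math.sqrt(sum(w * w for w in test_vec.values()))
    result = []
    for pairs in training_vecs:
        dot = 0.0
        squares = 0.0
        for feature, weight in pairs:
            squares += weight * weight
            # Constant-time lookup; items missing from the test text add 0.
            dot += weight * test_vec.get(feature, 0.0)
        denominator = test_norm * math.sqrt(squares)
        result.append(dot / denominator if denominator else 0.0)
    return result


test_text = {"a": 3.0, "b": 4.0}
training = [
    [("a", 3.0), ("b", 4.0)],  # same direction as the test text
    [("c", 1.0)],              # no shared feature item
    [("a", 1.0)],              # shares one feature item
]
print(similarities(test_text, training))
```

Only the feature items stored for each training text are visited, so the cost per training text depends on the number of items it actually contains rather than on the full vector dimension.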

The time complexity of the traditional similarity algorithm is O(n^2) and its space complexity is O(n). But if a conventional algorithm is used to calculate the similarity for a large set of texts, the time complexity increases to O(n^3), so controlling time consumption is particularly important.

4 Optimization Scheme of KNN Algorithm and Test

Based on the results of the space-time efficiency analysis, we design and test a space-time efficiency optimization scheme for each stage: feature item extraction, feature item weighting, and similarity calculation.

4.1 Features Extraction Optimization and Test

Feature item data consume the most resources in the KNN classification algorithm. Even if the evaluation function is omitted, a good-quality stop word table can also reduce the dimension of the feature vectors and ensure the time efficiency of the algorithm. At present there is no uniform standard for stop word tables [10]. Different extraction methods produce different stop word tables, which affect the space-time performance of the classification algorithm. The feature extraction program supports different filters for feature items; its design is as follows.

    if(word.length()>lower_limit && word.length()<upper_limit)
      if(word.trait==n or word.trait==v)
        tag_hash.put(word);

Different choices of word frequency and part of speech form different feature item extraction schemes. The test results, obtained on a training corpus of 500 texts, are shown in Table 1.

Table 1. Testing Result of Features

No.  Word Frequency  Part of Speech  Feature Extraction Time (ms)  Feature Storage Space (KB)  Feature Vector Dimension
1    >99             noun
2    >99             noun, verb
3    >499            noun
4    >499            noun, verb
5    >999            noun
6    >999            noun, verb
7    <1000           noun
8    <1000           noun, verb
9    <500            noun
10   <500            noun, verb

The test results are compared in Figure 2.
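The filter design of Section 4.1 can be sketched in Python as below. The sketch assumes that the text has already been segmented into words and tagged with parts of speech, which the filter itself does not do; the function name `extract_features`, its parameters, the tag letters, and the sample words are hypothetical. The length and part-of-speech tests follow the pseudocode, and the frequency bounds correspond to the word frequency column of Table 1.

```python
from collections import Counter


def extract_features(tagged_words, min_len, max_len, allowed_pos,
                     min_freq, max_freq=None, stop_words=frozenset()):
    """Return the set of feature items that pass all filters.

    tagged_words: iterable of (word, part_of_speech) pairs for the corpus.
    A word is kept if min_len < length < max_len, its tag is allowed,
    it is not a stop word, and its frequency lies within the bounds.
    """
    counts = Counter(
        word
        for word, pos in tagged_words
        if min_len < len(word) < max_len
        and pos in allowed_pos
        and word not in stop_words
    )
    return {
        word
        for word, count in counts.items()
        if count > min_freq and (max_freq is None or count < max_freq)
    }


# Tiny hypothetical corpus: "n" marks nouns, "v" verbs, "u" particles.
tagged = (
    [("银行", "n")] * 3
    + [("的", "u")] * 5
    + [("增长", "v")] * 2
    + [("市场", "n")]
)
print(extract_features(tagged, 1, 5, {"n"}, 1))       # nouns seen more than once
print(extract_features(tagged, 1, 5, {"n", "v"}, 1))  # nouns and verbs
```

Widening the allowed parts of speech or lowering the frequency bound enlarges the feature set, which is the trade-off that the schemes in Table 1 explore.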

Fig. 2. Feature Analysis

4.2 Optimization of Feature Items with Weight and Test

We design the optimized tfidf algorithm as follows.

    for(text_i=first_text to N)
      for(i_tag_j=i_first_tag to text_i.length)
        #t_j++;
    for(tag_group_i=first_tag_group to tag.group)
      for(text_j=first_text to N)
        if(search tag_group_i in text_j) #w_group_i++;

The test results, obtained on a training corpus of 500 texts, are shown in Table 2.

Table 2. Testing Result of Feature Weighting

Size of feature set    Computing time (ms)    Auxiliary storage space

Fig. 3. Analysis of Feature Weighting
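One way to read the grouped loop of the optimized tfidf algorithm is that the document frequencies of a whole group of feature items are counted during a single pass over the corpus, so that the number of passes over the disk-resident texts falls as the group grows. The Python sketch below illustrates this reading; it is not the authors' code, and the function name `document_frequencies`, the `read_texts` callable (each call stands for one pass over the corpus), and the toy corpus are assumptions.

```python
def document_frequencies(features, read_texts, group_size):
    """Count, for each feature item, the number of texts that contain it.

    features: feature items to count.
    read_texts: callable returning an iterator over the texts, each text
        given as a set of words; one call models one pass over the disk file.
    group_size: number of feature items handled per pass.
    Returns (doc_freq, passes).
    """
    features = list(features)
    doc_freq = {}
    passes = 0
    for start in range(0, len(features), group_size):
        group = features[start:start + group_size]
        counts = dict.fromkeys(group, 0)  # auxiliary storage for one group
        passes += 1
        for words in read_texts():
            for feature in group:
                if feature in words:
                    counts[feature] += 1
        doc_freq.update(counts)
    return doc_freq, passes


corpus = [{"a", "b"}, {"a"}, {"c", "a"}, {"b"}]
items = ["a", "b", "c", "d"]
print(document_frequencies(items, lambda: iter(corpus), 1))  # one pass per item
print(document_frequencies(items, lambda: iter(corpus), 4))  # a single pass
```

The counts are identical for every group size; only the number of passes and the size of the per-group counter table change, which is the time-for-space trade the section describes.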

As the group size increases, the time complexity of the optimized tfidf algorithm tends to O(n^2) and its space complexity tends to O(n). When the block size is 1, the optimized algorithm is the same as the traditional one. When the block size is greater than 100, the time cost is reduced and the space cost increases only slightly, which the system can easily bear. Therefore, the optimized algorithm is feasible and effective.

4.3 Optimization Scheme of Similarity Calculation

When the test corpus consists of a set of texts rather than a single text, we consider trading space for time: we load all the training feature vectors into memory and then compute the similarities. The time complexity of the optimized algorithm is O(n^2) and its space complexity increases to O(n^2), which the system can still withstand.

    for(train_text_i=train_first_text to N)
      for(i_weight_k=i_first_weight to M)
        put into hashtable hash_train[];
    for(test_text_j=test_first_text to N)
      for(j_weight_k=j_first_weight to M)
        if(search j_weight_k in hash_train[]) sim_j();

5 Classifier Design and Test

After the optimization, the space-time efficiency of the traditional KNN algorithm is improved and more reasonable. But this efficiency must rest on ensured reliability.

5.1 Classifier Design

Select the K Neighbors. Select the K training texts with the greatest similarity as the K neighbors of the current test text. In practice it is difficult to determine the value of K, which is often only roughly estimated from experience. Such estimation may reduce the accuracy of the KNN algorithm.

Scoring by Category. Accumulate the similarities of the K neighbors by category; the accumulated value is the score of the category [11].

Classification Proposed. According to the principle of the KNN algorithm, the test text is assigned to the class with the highest score.

5.2 Performance Indexes

The proposed classification is the direct basis for evaluating the performance of the classifier. The following performance indexes are generally used.
1) Accuracy rate (precision) = number of texts correctly assigned to a category / number of texts actually assigned to that category.
2) Recall rate = number of texts correctly assigned to a category / number of texts that should be assigned to that category.
3) The standard measure is given by formula (3), where p is the accuracy rate, r is the recall rate, and β is the relative weight of precision and recall. The F1 measure is obtained when β is 1.

$$F_{\beta} = \frac{(1 + \beta^{2})\, p\, r}{\beta^{2} p + r} \qquad (3)$$

5.3 Classifier Testing

The Sogou corpus was selected. It covers nine categories: financial business, information technology, food hygiene, sports, tourism, education exams, employment workplace, culture arts, and military weapons. Each category contains 1990 texts. The dictionary contains 275,613 words, excluding stop words. We choose 900 test texts, about 5% of the total corpus. The classifier test results are shown in Table 3.

Table 3. Testing Result of Classifier

K    Performance index    FB    IT    FH    SE    TV    EE    EW    CA    MW
     Precision
     Recall
     F
     Precision
     Recall
     F
     Precision
     Recall
     F
     Precision
     Recall
     F
     Precision
     Recall
     F

The performance results above were collected automatically by the classifier, without manual checking. They show that the value of K has little influence on the classifier. The implementation and test environment was a computer with a 2 GHz CPU and 3 GB of main memory, running Windows XP, with the program written in Java. Thus the efficiency of the traditional KNN algorithm is acceptable in a common PC environment. The average time for weighting feature vectors is 63 ms per article, the average time for similarity computation is 4043 ms per article, and the classification time of the traditional KNN classifier is 125 ms per article.
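The performance indexes of Section 5.2 can be computed per category as in the following Python sketch. It is an illustration under assumptions: the function names `precision_recall` and `f_measure`, the list-based input of predicted and true categories, and the sample labels are hypothetical and are not taken from the reported experiment.

```python
def precision_recall(predicted, actual, category):
    """Accuracy rate (precision) and recall rate for one category.

    predicted and actual are equally long lists of category labels,
    one entry per test text.
    """
    assigned = sum(1 for p in predicted if p == category)
    relevant = sum(1 for a in actual if a == category)
    correct = sum(1 for p, a in zip(predicted, actual) if p == a == category)
    precision = correct / assigned if assigned else 0.0
    recall = correct / relevant if relevant else 0.0
    return precision, recall


def f_measure(p, r, beta=1.0):
    """Formula (3): (1 + beta^2) * p * r / (beta^2 * p + r); F1 when beta is 1."""
    if p == 0.0 and r == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * p * r / (b2 * p + r)


# Four hypothetical test texts: three were labelled IT, two truly are IT.
predicted = ["IT", "IT", "FB", "IT"]
actual = ["IT", "FB", "FB", "IT"]
p, r = precision_recall(predicted, actual, "IT")
print(p, r, f_measure(p, r))
```

In this toy case the accuracy rate for IT is 2/3 and the recall rate is 1, and the F1 value lies between the two.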

6 Conclusion

In this paper we analyze the time and space efficiency of the conventional KNN algorithm in the process of Chinese text classification. While ensuring the reliability of classification, we present a detailed set of efficiency optimization schemes covering feature item extraction, feature item weighting, and similarity calculation. The test results meet expectations.

Acknowledgment. The research is supported by the General Program of the Science and Technology Development Project of Beijing Municipal Education Commission under Grant No.KM , the Funding Project for Academic Human Resources Development in Institutions of Higher Learning under the Jurisdiction of Beijing Municipality under Grant No.PHR , and the Beijing Municipal Organization Department talent project under Grant No.2010D

References

1. A Survey on Automated Text Categorization,
2. Guo, G.D., Wang, H., Bell, D., Bi, Y.X., Greer, K.: An KNN Model-based Approach and Its Application in Text Categorization. J. Computer Science, (2003)
3. Yang, Y.M., Pedersen, J.O.: A Comparative Study on Feature Selection in Text Categorization. In: 14th Int'l Conf. on Machine Learning (ICML 1997), pp , Morgan Kaufmann Publishers, San Francisco (1997)
4. Vries, A.D., Mamoulis, N., Nes, N.: Efficient KNN search on vertically decomposed data. In: 2002 ACM SIGMOD International Conference on Management of Data, pp , ACM Press, Madison (2002)
5. Sun, R.Z.: An Improved KNN Algorithm for Text Classification. J. Computer Knowledge and Technology 6(1), (2010)
6. Ma, J.B., Li, J., Teng, G.F., Wang, F., Zhao, Y.: The Comparison Studies on the Algorithm of KNN and SVM for Chinese Text Classification. Journal of Agricultural University of HeBei 31(3), (2008)
7. Wang, X.Q.: Research of KNN Classification Method based on Parallel Genetic Algorithm. Journal of Southwest China Normal University 35(2), (2010)
8. Zhu, G.H., Cheng, C.P.: An Improved k-Nearest Neighbor Classification Method. Journal of HeNan Institute of Engineering, (2008)
9. Liu, B., Yang, L., Yuan, F.: Improved KNN Method and Its Application in Chinese Text Classification. Journal of Xihua University 27(2), (2008)
10. Zhou, Q.Q., Sun, B.D., Wang, Y.: Study on New Pretreatment Method for Chinese Text Classification System. J. Application Research of Computers (2), (2005)
11. He, F., Lin, Y.L.: Summary of Improving KNN text classification algorithm. J. FuJian Computer (3), (2005)


More information

The Design of Supermarket Electronic Shopping Guide System Based on ZigBee Communication

The Design of Supermarket Electronic Shopping Guide System Based on ZigBee Communication The Design of Supermarket Electronic Shopping Guide System Based on ZigBee Communication Yujie Zhang, Liang Han, and Yuanyuan Zhang College of Electrical and Information Engineering, Shaanxi University

More information

Using Text Learning to help Web browsing

Using Text Learning to help Web browsing Using Text Learning to help Web browsing Dunja Mladenić J.Stefan Institute, Ljubljana, Slovenia Carnegie Mellon University, Pittsburgh, PA, USA Dunja.Mladenic@{ijs.si, cs.cmu.edu} Abstract Web browsing

More information

Performance Degradation Assessment and Fault Diagnosis of Bearing Based on EMD and PCA-SOM

Performance Degradation Assessment and Fault Diagnosis of Bearing Based on EMD and PCA-SOM Performance Degradation Assessment and Fault Diagnosis of Bearing Based on EMD and PCA-SOM Lu Chen and Yuan Hang PERFORMANCE DEGRADATION ASSESSMENT AND FAULT DIAGNOSIS OF BEARING BASED ON EMD AND PCA-SOM.

More information

C-NBC: Neighborhood-Based Clustering with Constraints

C-NBC: Neighborhood-Based Clustering with Constraints C-NBC: Neighborhood-Based Clustering with Constraints Piotr Lasek Chair of Computer Science, University of Rzeszów ul. Prof. St. Pigonia 1, 35-310 Rzeszów, Poland lasek@ur.edu.pl Abstract. Clustering is

More information

An Adaptive Threshold LBP Algorithm for Face Recognition

An Adaptive Threshold LBP Algorithm for Face Recognition An Adaptive Threshold LBP Algorithm for Face Recognition Xiaoping Jiang 1, Chuyu Guo 1,*, Hua Zhang 1, and Chenghua Li 1 1 College of Electronics and Information Engineering, Hubei Key Laboratory of Intelligent

More information

[Gidhane* et al., 5(7): July, 2016] ISSN: IC Value: 3.00 Impact Factor: 4.116

[Gidhane* et al., 5(7): July, 2016] ISSN: IC Value: 3.00 Impact Factor: 4.116 IJESRT INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY AN EFFICIENT APPROACH FOR TEXT MINING USING SIDE INFORMATION Kiran V. Gaidhane*, Prof. L. H. Patil, Prof. C. U. Chouhan DOI: 10.5281/zenodo.58632

More information

Design and Implementation of Real-Time Data Exchange Software of Maneuverable Command Automation System

Design and Implementation of Real-Time Data Exchange Software of Maneuverable Command Automation System Design and Implementation of Real-Time Data Exchange Software of Maneuverable Command Automation System Shi Chuan, Zhang Yang and Zhou Yuefei 1 Introduction Command automation system provides an effective

More information

Construction of Complex City Landscape with the Support of CAD Model

Construction of Complex City Landscape with the Support of CAD Model Construction of Complex City Landscape with the Support of CAD Model MinSun 1 JunChen 2 AinaiMa 1 1.Institute of RS & GIS, Peking University, Beijing, China, 100871 2.National Geomatics Center of China,

More information

Power Load Forecasting Based on ABC-SA Neural Network Model

Power Load Forecasting Based on ABC-SA Neural Network Model Power Load Forecasting Based on ABC-SA Neural Network Model Weihua Pan, Xinhui Wang College of Control and Computer Engineering, North China Electric Power University, Baoding, Hebei 071000, China. 1471647206@qq.com

More information

Learning the Three Factors of a Non-overlapping Multi-camera Network Topology

Learning the Three Factors of a Non-overlapping Multi-camera Network Topology Learning the Three Factors of a Non-overlapping Multi-camera Network Topology Xiaotang Chen, Kaiqi Huang, and Tieniu Tan National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy

More information

Applications of Machine Learning on Keyword Extraction of Large Datasets

Applications of Machine Learning on Keyword Extraction of Large Datasets Applications of Machine Learning on Keyword Extraction of Large Datasets 1 2 Meng Yan my259@stanford.edu 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37

More information

Quality Assessment of Power Dispatching Data Based on Improved Cloud Model

Quality Assessment of Power Dispatching Data Based on Improved Cloud Model Quality Assessment of Power Dispatching Based on Improved Cloud Model Zhaoyang Qu, Shaohua Zhou *. School of Information Engineering, Northeast Electric Power University, Jilin, China Abstract. This paper

More information

Video annotation based on adaptive annular spatial partition scheme

Video annotation based on adaptive annular spatial partition scheme Video annotation based on adaptive annular spatial partition scheme Guiguang Ding a), Lu Zhang, and Xiaoxu Li Key Laboratory for Information System Security, Ministry of Education, Tsinghua National Laboratory

More information

ScienceDirect. KNN with TF-IDF Based Framework for Text Categorization

ScienceDirect. KNN with TF-IDF Based Framework for Text Categorization Available online at www.sciencedirect.com ScienceDirect Procedia Engineering 69 ( 2014 ) 1356 1364 24th DAAAM International Symposium on Intelligent Manufacturing and Automation, 2013 KNN with TF-IDF Based

More information

Discovering Advertisement Links by Using URL Text

Discovering Advertisement Links by Using URL Text 017 3rd International Conference on Computational Systems and Communications (ICCSC 017) Discovering Advertisement Links by Using URL Text Jing-Shan Xu1, a, Peng Chang, b,* and Yong-Zheng Zhang, c 1 School

More information

Time Series Clustering Ensemble Algorithm Based on Locality Preserving Projection

Time Series Clustering Ensemble Algorithm Based on Locality Preserving Projection Based on Locality Preserving Projection 2 Information & Technology College, Hebei University of Economics & Business, 05006 Shijiazhuang, China E-mail: 92475577@qq.com Xiaoqing Weng Information & Technology

More information

Prediction of traffic flow based on the EMD and wavelet neural network Teng Feng 1,a,Xiaohong Wang 1,b,Yunlai He 1,c

Prediction of traffic flow based on the EMD and wavelet neural network Teng Feng 1,a,Xiaohong Wang 1,b,Yunlai He 1,c 2nd International Conference on Electrical, Computer Engineering and Electronics (ICECEE 215) Prediction of traffic flow based on the EMD and wavelet neural network Teng Feng 1,a,Xiaohong Wang 1,b,Yunlai

More information

An Optimization Algorithm of Selecting Initial Clustering Center in K means

An Optimization Algorithm of Selecting Initial Clustering Center in K means 2nd International Conference on Machinery, Electronics and Control Simulation (MECS 2017) An Optimization Algorithm of Selecting Initial Clustering Center in K means Tianhan Gao1, a, Xue Kong2, b,* 1 School

More information

Research on Applications of Data Mining in Electronic Commerce. Xiuping YANG 1, a

Research on Applications of Data Mining in Electronic Commerce. Xiuping YANG 1, a International Conference on Education Technology, Management and Humanities Science (ETMHS 2015) Research on Applications of Data Mining in Electronic Commerce Xiuping YANG 1, a 1 Computer Science Department,

More information

Mining Quantitative Association Rules on Overlapped Intervals

Mining Quantitative Association Rules on Overlapped Intervals Mining Quantitative Association Rules on Overlapped Intervals Qiang Tong 1,3, Baoping Yan 2, and Yuanchun Zhou 1,3 1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China {tongqiang,

More information

Improving Suffix Tree Clustering Algorithm for Web Documents

Improving Suffix Tree Clustering Algorithm for Web Documents International Conference on Logistics Engineering, Management and Computer Science (LEMCS 2015) Improving Suffix Tree Clustering Algorithm for Web Documents Yan Zhuang Computer Center East China Normal

More information

Clustering-Based Distributed Precomputation for Quality-of-Service Routing*

Clustering-Based Distributed Precomputation for Quality-of-Service Routing* Clustering-Based Distributed Precomputation for Quality-of-Service Routing* Yong Cui and Jianping Wu Department of Computer Science, Tsinghua University, Beijing, P.R.China, 100084 cy@csnet1.cs.tsinghua.edu.cn,

More information

ANN-Based Modeling for Load and Main Steam Pressure Characteristics of a 600MW Supercritical Power Generating Unit

ANN-Based Modeling for Load and Main Steam Pressure Characteristics of a 600MW Supercritical Power Generating Unit ANN-Based Modeling for Load and Main Steam Pressure Characteristics of a 600MW Supercritical Power Generating Unit Liangyu Ma, Zhiyuan Gao Automation Department, School of Control and Computer Engineering

More information

Support for development and test of web application: A tree-oriented model

Support for development and test of web application: A tree-oriented model J Shanghai Univ (Engl Ed), 2011, 15(5): 357 362 Digital Object Identifier(DOI): 10.1007/s11741-011-0751-1 Support for development and test of web application: A tree-oriented model CAO Min (ù ), CAO Zhen

More information

A Finite State Mobile Agent Computation Model

A Finite State Mobile Agent Computation Model A Finite State Mobile Agent Computation Model Yong Liu, Congfu Xu, Zhaohui Wu, Weidong Chen, and Yunhe Pan College of Computer Science, Zhejiang University Hangzhou 310027, PR China Abstract In this paper,

More information

A Dynamic TDMA Protocol Utilizing Channel Sense

A Dynamic TDMA Protocol Utilizing Channel Sense International Conference on Electromechanical Control Technology and Transportation (ICECTT 2015) A Dynamic TDMA Protocol Utilizing Channel Sense ZHOU De-min 1, a, LIU Yun-jiang 2,b and LI Man 3,c 1 2

More information

Graph Matching Iris Image Blocks with Local Binary Pattern

Graph Matching Iris Image Blocks with Local Binary Pattern Graph Matching Iris Image Blocs with Local Binary Pattern Zhenan Sun, Tieniu Tan, and Xianchao Qiu Center for Biometrics and Security Research, National Laboratory of Pattern Recognition, Institute of

More information

A Supervised Method for Multi-keyword Web Crawling on Web Forums

A Supervised Method for Multi-keyword Web Crawling on Web Forums Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 2, February 2014,

More information

Social Media Computing

Social Media Computing Social Media Computing Lecture 4: Introduction to Information Retrieval and Classification Lecturer: Aleksandr Farseev E-mail: farseev@u.nus.edu Slides: http://farseev.com/ainlfruct.html At the beginning,

More information

An Application of Genetic Algorithm for Auto-body Panel Die-design Case Library Based on Grid

An Application of Genetic Algorithm for Auto-body Panel Die-design Case Library Based on Grid An Application of Genetic Algorithm for Auto-body Panel Die-design Case Library Based on Grid Demin Wang 2, Hong Zhu 1, and Xin Liu 2 1 College of Computer Science and Technology, Jilin University, Changchun

More information

Fast K-nearest neighbors searching algorithms for point clouds data of 3D scanning system 1

Fast K-nearest neighbors searching algorithms for point clouds data of 3D scanning system 1 Acta Technica 62 No. 3B/2017, 141 148 c 2017 Institute of Thermomechanics CAS, v.v.i. Fast K-nearest neighbors searching algorithms for point clouds data of 3D scanning system 1 Zhang Fan 2, 3, Tan Yuegang

More information

AN OPTIMIZATION GENETIC ALGORITHM FOR IMAGE DATABASES IN AGRICULTURE

AN OPTIMIZATION GENETIC ALGORITHM FOR IMAGE DATABASES IN AGRICULTURE AN OPTIMIZATION GENETIC ALGORITHM FOR IMAGE DATABASES IN AGRICULTURE Changwu Zhu 1, Guanxiang Yan 2, Zhi Liu 3, Li Gao 1,* 1 Department of Computer Science, Hua Zhong Normal University, Wuhan 430079, China

More information

Ranking Web Pages by Associating Keywords with Locations

Ranking Web Pages by Associating Keywords with Locations Ranking Web Pages by Associating Keywords with Locations Peiquan Jin, Xiaoxiang Zhang, Qingqing Zhang, Sheng Lin, and Lihua Yue University of Science and Technology of China, 230027, Hefei, China jpq@ustc.edu.cn

More information

Chinese Microblog Entity Linking System Combining Wikipedia and Search Engine Retrieval Results

Chinese Microblog Entity Linking System Combining Wikipedia and Search Engine Retrieval Results Chinese Microblog Entity Linking System Combining Wikipedia and Search Engine Retrieval Results Zeyu Meng, Dong Yu, and Endong Xun Inter. R&D center for Chinese Education, Beijing Language and Culture

More information

A New Distance Independent Localization Algorithm in Wireless Sensor Network

A New Distance Independent Localization Algorithm in Wireless Sensor Network A New Distance Independent Localization Algorithm in Wireless Sensor Network Siwei Peng 1, Jihui Li 2, Hui Liu 3 1 School of Information Science and Engineering, Yanshan University, Qinhuangdao 2 The Key

More information

An Immune Concentration Based Virus Detection Approach Using Particle Swarm Optimization

An Immune Concentration Based Virus Detection Approach Using Particle Swarm Optimization An Immune Concentration Based Virus Detection Approach Using Particle Swarm Optimization Wei Wang 1,2, Pengtao Zhang 1,2, and Ying Tan 1,2 1 Key Laboratory of Machine Perception, Ministry of Eduction,

More information

Location-Aware Web Service Recommendation Using Personalized Collaborative Filtering

Location-Aware Web Service Recommendation Using Personalized Collaborative Filtering ISSN 2395-1621 Location-Aware Web Service Recommendation Using Personalized Collaborative Filtering #1 Shweta A. Bhalerao, #2 Prof. R. N. Phursule 1 Shweta.bhalerao75@gmail.com 2 rphursule@gmail.com #12

More information

An Agricultural Tri-dimensional Pollution Data Management Platform Based on DNDC Model

An Agricultural Tri-dimensional Pollution Data Management Platform Based on DNDC Model An Agricultural Tri-dimensional Pollution Data Management Platform Based on DNDC Model Lihua Jiang 1,2, Wensheng Wang 1,2, Xiaorong Yang 1,2, Nengfu Xie 1,2, and Youping Cheng 3 1 Agriculture Information

More information

Traffic Flow Prediction Based on the location of Big Data. Xijun Zhang, Zhanting Yuan

Traffic Flow Prediction Based on the location of Big Data. Xijun Zhang, Zhanting Yuan 5th International Conference on Civil Engineering and Transportation (ICCET 205) Traffic Flow Prediction Based on the location of Big Data Xijun Zhang, Zhanting Yuan Lanzhou Univ Technol, Coll Elect &

More information

Topic 1 Classification Alternatives

Topic 1 Classification Alternatives Topic 1 Classification Alternatives [Jiawei Han, Micheline Kamber, Jian Pei. 2011. Data Mining Concepts and Techniques. 3 rd Ed. Morgan Kaufmann. ISBN: 9380931913.] 1 Contents 2. Classification Using Frequent

More information

and Molds 1. INTRODUCTION

and Molds 1. INTRODUCTION Optimal Tool Path Generation for 2 and Molds D Milling of Dies HuiLi Automotive Components Division Ford Motor Company, Dearborn, MI, USA Zuomin Dong (zdong@me.uvic.ca) and Geoffrey W Vickers Department

More information

Inferring Variable Labels Considering Co-occurrence of Variable Labels in Data Jackets

Inferring Variable Labels Considering Co-occurrence of Variable Labels in Data Jackets 2016 IEEE 16th International Conference on Data Mining Workshops Inferring Variable Labels Considering Co-occurrence of Variable Labels in Data Jackets Teruaki Hayashi Department of Systems Innovation

More information

Video Inter-frame Forgery Identification Based on Optical Flow Consistency

Video Inter-frame Forgery Identification Based on Optical Flow Consistency Sensors & Transducers 24 by IFSA Publishing, S. L. http://www.sensorsportal.com Video Inter-frame Forgery Identification Based on Optical Flow Consistency Qi Wang, Zhaohong Li, Zhenzhen Zhang, Qinglong

More information

Open Access Self-Growing RBF Neural Network Approach for Semantic Image Retrieval

Open Access Self-Growing RBF Neural Network Approach for Semantic Image Retrieval Send Orders for Reprints to reprints@benthamscience.ae The Open Automation and Control Systems Journal, 2014, 6, 1505-1509 1505 Open Access Self-Growing RBF Neural Networ Approach for Semantic Image Retrieval

More information

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK DISTRIBUTED FRAMEWORK FOR DATA MINING AS A SERVICE ON PRIVATE CLOUD RUCHA V. JAMNEKAR

More information

Research on Quality Reliability of Rolling Bearings by Multi - Weight Method (Part Ⅱ: Experiment)

Research on Quality Reliability of Rolling Bearings by Multi - Weight Method (Part Ⅱ: Experiment) 6th International Conference on Mechatronics, Computer and Education Informationization (MCEI 2016) Research on Quality Reliability of Rolling Bearings by Multi - Weight Method (Part Ⅱ: Experiment) Xintao

More information

Adopting Data Mining Techniques on the Recommendations of Library Collections

Adopting Data Mining Techniques on the Recommendations of Library Collections Adopting Data Mining Techniques on the Recommendations of Library Collections Shu-Meng Huang a, Lu Wang b and Wan-Chih Wang c a Department of Information Management, Hsing Wu College, Taiwan (simon@mail.hwc.edu.tw)

More information

manufacturing process.

manufacturing process. Send Orders for Reprints to reprints@benthamscience.ae The Open Automation and Control Systems Journal, 2014, 6, 203-207 203 Open Access Identifying Method for Key Quality Characteristics in Series-Parallel

More information

Research on Design Reuse System of Parallel Indexing Cam Mechanism Based on Knowledge

Research on Design Reuse System of Parallel Indexing Cam Mechanism Based on Knowledge Send Orders for Reprints to reprints@benthamscience.ae 40 The Open Mechanical Engineering Journal, 2015, 9, 40-46 Open Access Research on Design Reuse System of Parallel Indexing Cam Mechanism Based on

More information

Test Analysis of Serial Communication Extension in Mobile Nodes of Participatory Sensing System Xinqiang Tang 1, Huichun Peng 2

Test Analysis of Serial Communication Extension in Mobile Nodes of Participatory Sensing System Xinqiang Tang 1, Huichun Peng 2 International Conference on Automation, Mechanical Control and Computational Engineering (AMCCE 2015) Test Analysis of Serial Communication Extension in Mobile Nodes of Participatory Sensing System Xinqiang

More information

Rotation Invariant Finger Vein Recognition *

Rotation Invariant Finger Vein Recognition * Rotation Invariant Finger Vein Recognition * Shaohua Pang, Yilong Yin **, Gongping Yang, and Yanan Li School of Computer Science and Technology, Shandong University, Jinan, China pangshaohua11271987@126.com,

More information

An Improved Pre-classification Method for Offline Handwritten Chinese Character Using Four Corner Feature

An Improved Pre-classification Method for Offline Handwritten Chinese Character Using Four Corner Feature ISBN 978-952-5726-04-6 (Print), 978-952-5726-05-3 (CD-ROM) Proceedings of the International Symposium on Intelligent Information Systems and Applications (IISA 09) Qingdao, P. R. China, Oct. 28-30, 2009,

More information

Karami, A., Zhou, B. (2015). Online Review Spam Detection by New Linguistic Features. In iconference 2015 Proceedings.

Karami, A., Zhou, B. (2015). Online Review Spam Detection by New Linguistic Features. In iconference 2015 Proceedings. Online Review Spam Detection by New Linguistic Features Amir Karam, University of Maryland Baltimore County Bin Zhou, University of Maryland Baltimore County Karami, A., Zhou, B. (2015). Online Review

More information

An Intelligent Retrieval Platform for Distributional Agriculture Science and Technology Data

An Intelligent Retrieval Platform for Distributional Agriculture Science and Technology Data An Intelligent Retrieval Platform for Distributional Agriculture Science and Technology Data Xiaorong Yang 1,2, Wensheng Wang 1,2, Qingtian Zeng 3, and Nengfu Xie 1,2 1 Agriculture Information Institute,

More information

The Analysis and Implementation of the K - Means Algorithm Based on Hadoop Platform

The Analysis and Implementation of the K - Means Algorithm Based on Hadoop Platform Computer and Information Science; Vol. 11, No. 1; 2018 ISSN 1913-8989 E-ISSN 1913-8997 Published by Canadian Center of Science and Education The Analysis and Implementation of the K - Means Algorithm Based

More information

A Database Redo Log System Based on Virtual Memory Disk*

A Database Redo Log System Based on Virtual Memory Disk* A Database Redo Log System Based on Virtual Memory Disk* Haiping Wu, Hongliang Yu, Bigang Li, Xue Wei, and Weimin Zheng Department of Computer Science and Technology, Tsinghua University, 100084, Beijing,

More information

Story Unit Segmentation with Friendly Acoustic Perception *

Story Unit Segmentation with Friendly Acoustic Perception * Story Unit Segmentation with Friendly Acoustic Perception * Longchuan Yan 1,3, Jun Du 2, Qingming Huang 3, and Shuqiang Jiang 1 1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing,

More information