Construction of Knowledge Base for Automatic Indexing and Classification Based on Chinese Library Classification

Han-qing Hou, Chun-xiang Xue
School of Information Science & Technology, Nanjing Agricultural University, China

Abstract

Class numbers, descriptors and keywords are three kinds of subject concept identifiers, among which there exist conceptual mapping relationships, i.e. compatibility. Based on this principle, we construct a CLC Knowledge Base on the basis of the Chinese Library Classification for automatic indexing and classification. We compare it with the CLC system to show its advantages for automatic information processing and concept searching. We then introduce at length some key technologies used in its construction and briefly describe their application to automatic indexing, automatic classification and concept searching.

Keywords: automatic indexing, automatic classification, knowledge base, knowledge organization system, Chinese Library Classification.

1. Introduction

Knowledge organization systems (KOS) are semantic tools of all kinds that are used to describe and interpret human knowledge and its relationships, such as library classifications, lists of subject headings, thesauri, semantic networks, maps of subject domains and ontologies. Library classifications, lists of subject headings and thesauri have played an important part in organizing traditional information resources, while semantic networks, subject maps and ontologies are designed for the second-generation, semantic Web. Some KOS in current use were constructed by improving traditional classifications or thesauri, inheriting and reusing an established knowledge system and an abundant vocabulary. These systems have some of the features and functions of semantic networks and ontologies, which can improve knowledge processing and the efficiency of information retrieval.
The knowledge base discussed in this article, called the CLC Knowledge Base, is also a KOS: an expert system for knowledge organization based on the Chinese Library Classification (hereinafter CLC). It uses statistical methods to reveal the concept mapping relationships among class numbers, descriptors and keywords in manually indexed records, and it can therefore be used for automatic indexing, automatic classification and concept searching.

2. Principles of Construction of the CLC Knowledge Base

Classification schemes, thesauri and natural language are three kinds of information language with different symbols and organizational approaches. In essence, however, they are the same: class numbers, descriptors and keywords can all be used to express a subject concept. Hidden conceptual mapping relationships, i.e. compatibility relationships, exist among them. Most libraries hold numerous manually indexed document records that simultaneously contain class numbers and descriptor strings or keyword strings. By processing these data, we can mine the concept mapping relationships among class numbers, descriptor strings and keyword strings in order to construct a knowledge base. The CLC is a library classification based on scientific classification and conceptual relations, so we can regard it as a semantic network that can be used to organize all kinds of information. We choose the CLC as the frame of the knowledge base for the following reasons:

Both classification schemes and thesauri, and indeed all KOS, use the methodology of classification. The former use explicit classification systems, while the latter use implicit ones, such as cross-reference systems, categorical indexes and hierarchical indexes. The classification scheme is the main part of the integrated vocabulary system, that is, the classification/thesaurus system, and it is much easier to accept and understand.

The CLC is a large universal classification compiled by China's own experts. It has been broadly used to classify and search books, audio-visual materials and other sorts of information. The CLC has the most comprehensive influence domestically and has numerous users; it has therefore been regarded as the national standard, though it is not officially authorized as one.

Since it was first published in 1975, the CLC has been continuously revised to meet information processing and access needs. It is currently in its 4th edition, and its electronic edition is stored in MARC format. The new edition offers a more logical knowledge organizing structure, more extensive coverage of knowledge, and faceted coordination.

The CLC is used in most collections of Chinese documents. If we want to use their indexing records to construct knowledge bases, choosing the CLC as the framework lets us avoid converting class numbers of other classification schemes into CLC class numbers.

Most experts accept the feasibility of applying the CLC to organize Internet resources. Meanwhile, digitalization, faceted coordination, combination with natural language, and hyperlinks have been added to the CLC; it can therefore be applied not only in the traditional library but also in web environments. Our automatic indexing and classification system is designed to organize both traditional documents and digital information.
The CLC's applicability in both the traditional library and web environments meets our needs.

Given these advantages, we use the CLC as the frame when constructing the knowledge base to realize concept indexing and searching.

3. Comparison between the structure of the CLC Knowledge Base and the CLC system

The CLC includes schedules, tables and indexes as well as other classifications. Following the trend toward integrating classifications with thesauri, the CLC maps its class numbers to descriptors of the Chinese Thesaurus, as DDC is mapped to LCSH, producing an integrated vocabulary named the Classified Chinese Thesaurus (hereinafter CCT), whose first edition was edited beginning in 1987. At that time the CLC, the Chinese Thesaurus and the CCT made up a KOS, named the CLC system, shown in figure 1. Although the CLC system served the traditional library very well, its disadvantages become apparent when it is applied to the automatic processing of digital information on the Web. The disadvantages are as follows.

The CLC system, both classification scheme and thesaurus, is a controlled language and lacks the flexibility of natural language.

The CLC system has a long revision cycle of about eight to nine years, so many new words and subjects are not incorporated in a timely manner.

The present classifications and thesauri are small in scale because they are published in print.

The CLC system cannot be directly applied to automatic information processing.

We choose the CLC schedule to organize the knowledge base and improve on it. Using statistics and computer technology, we can discover the compatibility relationships among the class numbers, descriptor strings and keyword strings in the knowledge base. Compared with the CLC system, the knowledge base adds new features and functions, i.e. an interface to natural language, a continuously increasing scale and timely updates, to adapt to the development of information organization on the Web.

The knowledge base comprises three parts: the knowledge base for classification, the knowledge base for subject indexing, and the supplementary knowledge base. The concordance of class numbers and keyword strings is the main part of the knowledge base for classification. The go-list, stop-list, dictionary of synonyms and semantic dictionary make up the knowledge base for subject indexing. Tables of areas, periods and document types make up the supplementary knowledge base, which is used to extract subjects concerning area, period and document type from the documents. The structure and composition of the knowledge base are shown in figure 2.

The two figures show, respectively, the frame of the CLC system and the structure of the knowledge base. Both are based on the CLC schedules and map class numbers to descriptors or keywords, so both can be used to integrate classification with subject indexing. Compared with the CLC system, however, the knowledge base is more suitable for automatic indexing and intelligent searching in its content, scale, structure and function. The reasons are as follows.

The CLC system reveals only the mapping relationships between CLC class numbers and descriptors of the Chinese Thesaurus, while the knowledge base reveals the mapping relationships among class numbers, descriptor strings and keyword strings.

The CLC system comprises only the class numbers and descriptors included in the CLC schedule and the Chinese Thesaurus, whereas the data of the knowledge base come from manually indexed records, which include a great many built class numbers, keywords and new words. The scale of the knowledge base is therefore larger than that of the CLC system. In the CLC system, one class number maps to at most 20 descriptors or strings, and to 2-3 on average. In the knowledge base, one class number maps on average to many more keyword strings, in some cases to several hundred.
The knowledge base can therefore reveal the concepts hidden within the classes.

The terms in the CLC system are updated very slowly because both the CLC and the CCT have long revision periods and are maintained by hand. The knowledge base, by contrast, is compiled and maintained by machine and can incorporate newly proposed terms in real time. A larger vocabulary, especially one that includes new words, leads to higher indexing consistency and correctness.

Because of its limited scale and vocabulary, the CLC system is applied only to indexing and classifying literature by hand. The knowledge base can ensure higher quality and correctness because of its larger scale, richer vocabulary and greater flexibility. Moreover, the knowledge base is applied not only to automatic indexing and classification but also to more intelligent information searching. It can supply descriptors and keywords as indexing terms at the same time, index facets such as areas, periods and document types separately, and use its dictionary of synonyms to add entry words. All these advantages provide users with multi-aspect, intelligent searching.

In general, a KOS and the collections of a library are separate. In our system, we use database and hyperlink technology to connect the knowledge base with the collections of literature, like the directory of a search engine on the Internet.

4. Key technologies of construction of the CLC Knowledge Base

Several key technologies are involved in the construction of the CLC Knowledge Base. We introduce them below.

4.1 Collecting source data from manually indexed records and library classification

First, we collect source data to build up the source database. There are four kinds of data source: (1) the indexes of the CLC and the class number-descriptor string parallel list of the Classified Chinese Thesaurus; (2) indexing records of large libraries, e.g. Beijing Library and Shanghai Library, which include CLC class numbers and descriptors of the Chinese Thesaurus; (3) indexing records of periodical literature in bibliographic databases, which include CLC class numbers and keyword strings, e.g. the Database for Chinese Periodicals of Science & Technology (namely VIP) and the Database for Social Newspaper and Periodicals edited by Shanghai Library; (4) a database of titles, composed of CLC class numbers and titles drawn from some well-known bibliographic databases. Next, we filter out erroneous and duplicate records to form a source database, which contains the mapping relationships between class numbers and descriptor strings or keyword strings.

4.2 Constructing the knowledge base by statistical methods

After the data collection is finished, we extract terms and class numbers from the source database, compute the frequency of terms, and measure the co-occurrence frequency of class numbers and strings to construct the knowledge base. Of all the dictionaries in the knowledge base, the class number-keyword string parallel list is the most important to construct. Here we use statistical methods to mine the conceptual mapping relationships between class numbers and keyword strings.
From three statistics, namely the frequency of the class number, the frequency of the keyword string, and the co-occurrence frequency of the class number and the keyword string, we compute two parameters often used in data mining, the support degree and the confidence degree, to discover the mapping relationships between class numbers and keyword strings. We can then generate the knowledge base for automatic classification.

The support degree is the co-occurrence frequency of a class number and a keyword string in the source database. A higher co-occurrence frequency shows that more indexers agree on the conceptual mapping relationship between the two.

$$\mathrm{Support}(\mathit{keyword} \Rightarrow \mathit{clc}) = P(\mathit{clc}, \mathit{keyword}) = \mathit{freq\_gx}$$

P(clc, keyword): the probability that the class number and the keyword string co-exist in an indexing record of the source database; it can be measured by the co-occurrence frequency.

Generally speaking, the conceptual mapping relationship between the two can be considered correct if the support is >= 2. The greater the support degree, the more reliable the mapping relationship.

The confidence degree reflects the probability of the class number, given that the keyword string has appeared.

$$\mathrm{Conf}(\mathit{clc} \mid \mathit{keyword}) = \frac{P(\mathit{clc}, \mathit{keyword})}{P(\mathit{keyword})} = \frac{\mathit{freq\_gx}}{\mathit{freq\_keyword}}$$

P(clc, keyword): the probability that the class number and the keyword string co-exist in an indexing record of the source database; it can be measured by the co-occurrence frequency.

P(keyword): the probability that the keyword string appears, i.e. the frequency of the string in the whole source data.
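The three counts and the two measures defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `mine_mappings`, the record layout (one class number plus a list of keyword strings per record), the sample records, and the confidence threshold of 0.5 are all hypothetical. Only the support threshold of 2 comes from the text.

```python
from collections import Counter


def mine_mappings(records, min_support=2, min_confidence=0.5):
    """Return {(clc, keyword): (support, confidence)} for accepted mappings.

    records: iterable of (class_number, [keyword_string, ...]) pairs
    (a hypothetical layout for one indexing record).
    """
    freq_clc = Counter()      # frequency of each class number
    freq_keyword = Counter()  # frequency of each keyword string
    freq_gx = Counter()       # co-occurrence frequency of (clc, keyword)

    for clc, keywords in records:
        freq_clc[clc] += 1
        for keyword in set(keywords):  # count a string once per record
            freq_keyword[keyword] += 1
            freq_gx[(clc, keyword)] += 1

    accepted = {}
    for (clc, keyword), support in freq_gx.items():
        confidence = support / freq_keyword[keyword]
        if support >= min_support and confidence >= min_confidence:
            accepted[(clc, keyword)] = (support, confidence)
    return accepted


# Made-up indexing records, for illustration only.
records = [
    ("TP391", ["automatic indexing", "text classification"]),
    ("TP391", ["automatic indexing"]),
    ("G254", ["automatic indexing", "library classification"]),
    ("G254", ["library classification"]),
]
result = mine_mappings(records)
```

With these sample records, only the pairs that co-occur at least twice survive. The pair ("G254", "automatic indexing") is dropped because its support is 1, even though the string itself is frequent.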
If the support degree and the confidence degree of a class number and a keyword string each reach their threshold, the conceptual mapping relationship between the two is accepted.

4.3 Measuring similarity to resolve the many-to-many relationships between class numbers and strings

In the source database, the relationship between class numbers and keyword strings is many-to-many. In our system, however, each string in the knowledge base must map to exactly one class number. There are many measures of the similarity between class numbers and strings, such as mutual information (MI), log-likelihood (LogL) and Dice. Here we use the Dice measure to find the best class number for a string.

$$\mathrm{Dice} = \frac{P(\mathit{clc}, \mathit{keyword})}{\frac{1}{2}\left[P(\mathit{clc}) + P(\mathit{keyword})\right]} = \frac{2 \cdot \mathit{freq\_gx}}{\mathit{freq\_clc} + \mathit{freq\_keyword}}$$

where:

Dice: the similarity between the class number and the keyword string, based on how often they co-exist;

P(clc): the probability of the class number existing in the source database, viz. the frequency of the class number;

P(keyword): the probability of the keyword string existing in the source database, viz. the frequency of the keyword string;

P(clc, keyword): the probability of the class number and the keyword string co-existing, viz. the co-occurrence frequency of the class number and the keyword string.
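As a rough sketch of how the Dice measure can pick a single class number for a string, the Python fragment below scores every class number that co-occurs with a keyword string and keeps the highest score. The helper names `dice` and `best_class_number` and the sample counts are hypothetical. Ties are not discussed in the text; this sketch simply keeps the first maximum it encounters.

```python
def dice(freq_gx, freq_clc, freq_keyword):
    """Dice = 2 * freq_gx / (freq_clc + freq_keyword)."""
    return 2 * freq_gx / (freq_clc + freq_keyword)


def best_class_number(keyword, freq_clc, freq_keyword, freq_gx):
    """Return (best class number, all Dice scores) for one keyword string."""
    scores = {
        clc: dice(count, freq_clc[clc], freq_keyword[keyword])
        for (clc, kw), count in freq_gx.items()
        if kw == keyword
    }
    if not scores:
        return None, scores
    return max(scores, key=scores.get), scores


# Made-up counts, for illustration only.
freq_clc = {"TP391": 2, "G254": 2}
freq_keyword = {"automatic indexing": 3}
freq_gx = {
    ("TP391", "automatic indexing"): 2,
    ("G254", "automatic indexing"): 1,
}
best, scores = best_class_number(
    "automatic indexing", freq_clc, freq_keyword, freq_gx
)
```

Here the string co-occurs with two class numbers, so the many-to-many relationship is reduced to one mapping by comparing 2·2/(2+3) for TP391 with 2·1/(2+3) for G254.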

If one string maps to multiple class numbers, the best class number is the one with the maximum Dice value.

4.4 Using Cilin, a thesaurus of Chinese words, to create a semantic dictionary for recognizing synonyms

Turning keywords into descriptors in subject indexing, and measuring the semantic similarity between indexing subjects and the terms of the knowledge base in automatic classification and concept searching, cannot be achieved without recognizing synonyms. It is therefore important to create a semantic dictionary to recognize them. Cilin is a semantically classified dictionary, organized like a semantic tree. It organizes Chinese words into three levels according to their semantic relationships: 14 major classes, 94 secondary classes and 1428 small classes. The vocabulary of Cilin is made up mostly of simple words, which are the morphemes of compounds. By using Cilin to create the semantic dictionary, we can on the one hand directly recognize synonyms in the form of morphemes, and on the other hand mine the synonymous relationships among compounds.

[Semantic code] => (major class) (secondary class) (small class) (group)

where:

major class => (capital letter)
secondary class => (capital letter) (lowercase letter)
small class => (capital letter) (lowercase letter) (number) (number)
group => (capital letter) (lowercase letter) (number) (number) (number)

For example, the semantic code of the word hotel is [Dm040901]; the corresponding codes of its major class, secondary class, small class and group are (D), (Dm), (Dm0409) and (Dm040901). (D) represents the major class Thing, (Dm) the secondary class Organization, (Dm0409) the word group Hostel under the small class Shop, and (Dm040901) the group Hotel. We can then code all the morphemes by this method to create a semantic dictionary.
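The code structure described above can be illustrated with a small Python sketch that splits a semantic code into its four levels. Two points are assumptions rather than statements from the text: each "(number)" is taken to be two digits, as inferred from the example Dm040901, and the `similarity` function (the fraction of levels two codes share) is a hypothetical stand-in for whatever semantic distance the system actually uses. The codes Dm040902 and Da010101 are made up for the example.

```python
import re

# Assumed layout, inferred from the example "Dm040901": one capital letter,
# one lowercase letter, then three two-digit numbers.
CODE_PATTERN = re.compile(r"([A-Z])([a-z])(\d{2})(\d{2})(\d{2})")


def split_code(code):
    """Return the codes of the major class, secondary class, small class and group."""
    match = CODE_PATTERN.fullmatch(code)
    if match is None:
        raise ValueError(f"not a semantic code: {code!r}")
    major, minor, n1, n2, _n3 = match.groups()
    return (major, major + minor, major + minor + n1 + n2, code)


def shared_levels(code_a, code_b):
    """Count how many levels, from the major class downward, two codes share."""
    count = 0
    for level_a, level_b in zip(split_code(code_a), split_code(code_b)):
        if level_a != level_b:
            break
        count += 1
    return count


def similarity(code_a, code_b):
    """Hypothetical similarity in [0, 1]: the fraction of the four levels shared."""
    return shared_levels(code_a, code_b) / 4


hotel_levels = split_code("Dm040901")
same_small_class = similarity("Dm040901", "Dm040902")  # made-up sibling code
same_major_class = similarity("Dm040901", "Da010101")  # made-up distant code
```

Two words whose codes agree down to the small class score higher than two words that share only the major class, which is the intuition behind treating the code as a path in a semantic tree.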
Using the semantic dictionary, we can analyze the semantics of terms to measure the semantic distance between two terms, turn keywords into descriptors, and measure the semantic similarity between two strings to realize automatic classification and concept searching.

The above introduces the key technologies for constructing the class number-keyword string parallel list and the semantic dictionary, which are the main strengths of the knowledge base. Because these technologies are particular to the construction of this knowledge base, we have introduced them at length. Other technologies used in its construction are not described in detail here.

5. Application of CLC Knowledge Base

The knowledge base uses the CLC as its framework and is built on manual indexing. Following the compatibility principle of classification schemes, thesauri and natural language, it establishes mapping relationships among class numbers, descriptor strings and keyword strings, and it includes an abundant vocabulary, synonyms, and mapping relationships between keyword strings and class numbers.
The knowledge base can be broadly applied to automatic indexing and classification, and also to concept searching.

5.1 To realize automatic indexing by word segmentation aided by the go-list and stop-list and subject control aided by the synonym dictionary

Select the title, abstract, author-supplied keywords, references and so on as the indexing sources; segment the text of the indexing sources using the maximum matching algorithm aided by the go-list and stop-list; calculate word frequency, word number and word position weight to produce ranked indexing terms; then turn them into descriptors using the dictionary of synonyms.

5.2 To realize automatic classification aided by the class number-keyword string parallel list, the synonym dictionary and the tables of areas, periods and document types

The automatic classification discussed in this article classifies documents by keyword strings and concepts. First, it classifies documents by string rather than by single word, which improves correctness and precision. Second, it classifies documents by conceptual matching. When matching the indexing terms with terms in the knowledge base, it first calculates word-form similarity; if that yields no result, it calculates semantic similarity with the aid of the dictionary of synonyms and the semantic dictionary, and works out the best CLC class number while balancing correctness and speed. Third, it is a case-based method, that is, one based on indexing experience. Every record in the knowledge base is an example; the indexing terms or strings are matched against these examples to work out the best classification result.
Fourth, the facets of area, period and document type in the text are indexed separately using the subdivisions, which avoids some shortcomings of applying the CLC system to automatic classification.

5.3 To realize concept searching and multiple-approach searching based on the synonym dictionary and the results of automatic indexing and classification

From the perspective of indexing, the results of subject indexing include two parts, i.e. keyword strings and descriptor strings, which help users search not only by keywords and descriptors but also by whole strings rather than single words; furthermore, it can add retrieval

entries with the aid of the dictionary of synonyms and realize concept searching through the semantic dictionary to improve searching efficiency. From the perspective of classification, the results of classification include the main class number and the subdivision numbers for area, period and document type, so users can search for information by subject, area, period and document type.

6. Conclusion

The knowledge base, as a KOS built on the frame of the CLC, uses dual indexing records in bibliographic databases, which simultaneously include class numbers and descriptor strings or keyword strings, and which therefore carry both literary warrant and user warrant. Professionals revise the data of the knowledge base after the statistical computation, which improves its accuracy. At the same time, the knowledge base is constructed through computer-assisted compilation from the statistics of a large corpus, so subjectivity in mapping class numbers to strings can be avoided. Although the knowledge base is based on the CLC, it has broader functions than the CLC system itself.

We think that a current KOS is the combination of indexing languages with modern computer and network technology. The CLC Knowledge Base we have constructed is one such example; it possesses an abundant vocabulary and rich semantic relationships. It combines traditional indexing languages, such as the CLC and the CCT, with modern technology, such as databases, data mining, hyperlinks and computational linguistics. In a sense, it has some features of an ontology and is suited to today's automated information processing.

However, the CLC Knowledge Base still has limitations with regard to machine understanding and intelligent reasoning. Although the knowledge base has proved of important practical use in the intelligent processing of information, it still needs further research and improvement.

References

[1] Zeng, M.L. (2004). Networked knowledge organization systems/services. New Technology of Library and Information Services, 1:2-3.
[2] Hou, H.Q., Xue, P.J. (2003). Design & construction of knowledge database for automatic classification in Chinese. Journal of the China Society for Scientific and Technical Information, 22(6).
[3] Zhang, C.Z. (2002). Web concept mining based on text layer model, automatic indexing and automatic classifying based on concept semantic network. Supervised by Han-qing Hou. Master Dissertation of Nanjing Agricultural University, 2002, 6.
[4] Xue, P.J. (2001). Research on intelligent search engine of Chinese economic information based on knowledge database. Supervised by Han-qing Hou. Master Dissertation of Nanjing Agricultural University, 2001, 6.
[5] Zhang, Q.Y. (2002). A concept and faceted coordinate system for automatic classifying. Library Journal, 6:9-10.
[6] Hou, H.Q. (1998). Construction of the indexing languages compatibility system on the basis of the Classified Chinese Thesaurus. Journal of the National Library of China, 4:35-39, 90.
[7] Mei, J.J. (1983). Cilin Thesaurus of Chinese Words. Shanghai: Shanghai Lexicon Press.


More information

Equivalence Detection Using Parse-tree Normalization for Math Search

Equivalence Detection Using Parse-tree Normalization for Math Search Equivalence Detection Using Parse-tree Normalization for Math Search Mohammed Shatnawi Department of Computer Info. Systems Jordan University of Science and Tech. Jordan-Irbid (22110)-P.O.Box (3030) mshatnawi@just.edu.jo

More information

ImgSeek: Capturing User s Intent For Internet Image Search

ImgSeek: Capturing User s Intent For Internet Image Search ImgSeek: Capturing User s Intent For Internet Image Search Abstract - Internet image search engines (e.g. Bing Image Search) frequently lean on adjacent text features. It is difficult for them to illustrate

More information

SEARCH TECHNIQUES: BASIC AND ADVANCED

SEARCH TECHNIQUES: BASIC AND ADVANCED 17 SEARCH TECHNIQUES: BASIC AND ADVANCED 17.1 INTRODUCTION Searching is the activity of looking thoroughly in order to find something. In library and information science, searching refers to looking through

More information

2 Ontology evolution algorithm based on web-pages and users behavior logs

2 Ontology evolution algorithm based on web-pages and users behavior logs ISSN 1749-3889 (print), 1749-3897 (online) International Journal of Nonlinear Science Vol.18(2014) No.1,pp.86-91 Ontology Evolution Algorithm for Topic Information Collection Jing Ma 1, Mengyong Sun 1,

More information

From Scratch to the Web: Terminological Theses at the University of Innsbruck

From Scratch to the Web: Terminological Theses at the University of Innsbruck Peter Sandrini University of Innsbruck From Scratch to the Web: Terminological Theses at the University of Innsbruck Terminology Diploma Theses (TDT) have been well established in the training of translators

More information

Text Document Clustering Using DPM with Concept and Feature Analysis

Text Document Clustering Using DPM with Concept and Feature Analysis Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 2, Issue. 10, October 2013,

More information

The application of OLAP and Data mining technology in the analysis of. book lending

The application of OLAP and Data mining technology in the analysis of. book lending 2nd International Conference on Automation, Mechanical Control and Computational Engineering (AMCCE 2017) The application of OLAP and Data mining technology in the analysis of book lending Xiao-Han Zhou1,a,

More information

Data Mining Technology Based on Bayesian Network Structure Applied in Learning

Data Mining Technology Based on Bayesian Network Structure Applied in Learning , pp.67-71 http://dx.doi.org/10.14257/astl.2016.137.12 Data Mining Technology Based on Bayesian Network Structure Applied in Learning Chunhua Wang, Dong Han College of Information Engineering, Huanghuai

More information

TCM Health-keeping Proverb English Translation Management Platform based on SQL Server Database

TCM Health-keeping Proverb English Translation Management Platform based on SQL Server Database 2019 2nd International Conference on Computer Science and Advanced Materials (CSAM 2019) TCM Health-keeping Proverb English Translation Management Platform based on SQL Server Database Qiuxia Zeng1, Jianpeng

More information

Latest development in image feature representation and extraction

Latest development in image feature representation and extraction International Journal of Advanced Research and Development ISSN: 2455-4030, Impact Factor: RJIF 5.24 www.advancedjournal.com Volume 2; Issue 1; January 2017; Page No. 05-09 Latest development in image

More information

Revealing the Modern History of Japanese Philosophy Using Digitization, Natural Language Processing, and Visualization

Revealing the Modern History of Japanese Philosophy Using Digitization, Natural Language Processing, and Visualization Revealing the Modern History of Japanese Philosophy Using Digitization, Natural Language Katsuya Masuda *, Makoto Tanji **, and Hideki Mima *** Abstract This study proposes a framework to access to the

More information

Statistical Methods to Evaluate Important Degrees of Document Features

Statistical Methods to Evaluate Important Degrees of Document Features Statistical Methods to Evaluate Important Degrees of Document Features 1 2 Computer School, Beijing Information Science and Technology University; Beijing Key Laboratory of Internet Culture and Digital

More information

The Results of Falcon-AO in the OAEI 2006 Campaign

The Results of Falcon-AO in the OAEI 2006 Campaign The Results of Falcon-AO in the OAEI 2006 Campaign Wei Hu, Gong Cheng, Dongdong Zheng, Xinyu Zhong, and Yuzhong Qu School of Computer Science and Engineering, Southeast University, Nanjing 210096, P. R.

More information

International Conference on Automation, Mechanical Control and Computational Engineering (AMCCE 2015)

International Conference on Automation, Mechanical Control and Computational Engineering (AMCCE 2015) International Conference on Automation, Mechanical Control and Computational Engineering (AMCCE 2015) Risk Management Theory Application in national information security risk control Analysis of the relationship

More information

What is Discover! Additional Resources CountryWatch MathSciNet Literature Resource Center ProQuest: Historical Newspapers PAIS

What is Discover! Additional Resources CountryWatch MathSciNet Literature Resource Center ProQuest: Historical Newspapers PAIS What is Discover! A single interface to search our library s content. This platform provides users with an easy, yet powerful means of accessing resources through a single search. What resources are searched

More information

A Dublin Core Application Profile in the Agricultural Domain

A Dublin Core Application Profile in the Agricultural Domain Proc. Int l. Conf. on Dublin Core and Metadata Applications 2001 A Dublin Core Application Profile in the Agricultural Domain DC-2001 International Conference on Dublin Core and Metadata Applications 2001

More information

Content Organization and Knowledge Management in the Digital Environment

Content Organization and Knowledge Management in the Digital Environment Content Organization and Knowledge Management in the Digital Environment Kamani Perera Regional Centre for Strategic Studies, Colombo, Sri Lanka Abstract - Knowledge Organization Systems (KOS) such as

More information

Associating Terms with Text Categories

Associating Terms with Text Categories Associating Terms with Text Categories Osmar R. Zaïane Department of Computing Science University of Alberta Edmonton, AB, Canada zaiane@cs.ualberta.ca Maria-Luiza Antonie Department of Computing Science

More information

Domain-specific Concept-based Information Retrieval System

Domain-specific Concept-based Information Retrieval System Domain-specific Concept-based Information Retrieval System L. Shen 1, Y. K. Lim 1, H. T. Loh 2 1 Design Technology Institute Ltd, National University of Singapore, Singapore 2 Department of Mechanical

More information

A Network-Based Management Information System for Animal Husbandry in Farms

A Network-Based Management Information System for Animal Husbandry in Farms A Network-Based Information System for Animal Husbandry in Farms Jing Han 1 and Xi Wang 2, 1 College of Information Technology, Heilongjiang August First Land Reclamation University, Daqing, Heilongjiang

More information

[Type text] [Type text] [Type text]

[Type text] [Type text] [Type text] [Type text] [Type text] [Type text] ISSN : 0974-7435 Volume 10 Issue 19 BioTechnology 2014 An Indian Journal FULL PAPER BTAIJ, 10(19), 2014 [11492-11497] Research on integration of library information

More information

Multi-dimensional database design and implementation of dam safety monitoring system

Multi-dimensional database design and implementation of dam safety monitoring system Water Science and Engineering, Sep. 2008, Vol. 1, No. 3, 112-120 ISSN 1674-2370, http://kkb.hhu.edu.cn, e-mail: wse@hhu.edu.cn Multi-dimensional database design and implementation of dam safety monitoring

More information

Automated Classification. Lars Marius Garshol Topic Maps

Automated Classification. Lars Marius Garshol Topic Maps Automated Classification Lars Marius Garshol Topic Maps 2007 2007-03-21 Automated classification What is it? Why do it? 2 What is automated classification? Create parts of a topic map

More information

Analysis on the technology improvement of the library network information retrieval efficiency

Analysis on the technology improvement of the library network information retrieval efficiency Available online www.jocpr.com Journal of Chemical and Pharmaceutical Research, 2014, 6(6):2198-2202 Research Article ISSN : 0975-7384 CODEN(USA) : JCPRC5 Analysis on the technology improvement of the

More information

E B S C O h o s t U s e r G u i d e P s y c I N F O

E B S C O h o s t U s e r G u i d e P s y c I N F O E B S C O h o s t U s e r G u i d e P s y c I N F O PsycINFO User Guide Last Updated: 1/11/12 Table of Contents What is PsycINFO... 3 What is EBSCOhost... 3 System Requirements...3 Choosing Databases to

More information

Metadata for Digital Collections: A How-to-Do-It Manual

Metadata for Digital Collections: A How-to-Do-It Manual Chapter 4 Supplement Resource Content and Relationship Elements Questions for Review, Study, or Discussion 1. This chapter explores information and metadata elements having to do with what aspects of digital

More information

Yunfeng Zhang 1, Huan Wang 2, Jie Zhu 1 1 Computer Science & Engineering Department, North China Institute of Aerospace

Yunfeng Zhang 1, Huan Wang 2, Jie Zhu 1 1 Computer Science & Engineering Department, North China Institute of Aerospace [Type text] [Type text] [Type text] ISSN : 0974-7435 Volume 10 Issue 20 BioTechnology 2014 An Indian Journal FULL PAPER BTAIJ, 10(20), 2014 [12526-12531] Exploration on the data mining system construction

More information

Chapter 27 Introduction to Information Retrieval and Web Search

Chapter 27 Introduction to Information Retrieval and Web Search Chapter 27 Introduction to Information Retrieval and Web Search Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 27 Outline Information Retrieval (IR) Concepts Retrieval

More information

INFS 427: AUTOMATED INFORMATION RETRIEVAL (1 st Semester, 2018/2019)

INFS 427: AUTOMATED INFORMATION RETRIEVAL (1 st Semester, 2018/2019) INFS 427: AUTOMATED INFORMATION RETRIEVAL (1 st Semester, 2018/2019) Session 05 SUBJECT ANALYSIS & REPRESENTATION Lecturer: Mrs. Florence O. Entsua-Mensah, DIS Contact Information: fentsua-mensah@ug.edu.gh

More information

Creating a Corporate Taxonomy. Internet Librarian November 2001 Betsy Farr Cogliano

Creating a Corporate Taxonomy. Internet Librarian November 2001 Betsy Farr Cogliano Creating a Corporate Taxonomy Internet Librarian 2001 7 November 2001 Betsy Farr Cogliano 2001 The MITRE Corporation Revised October 2001 2 Background MITRE is a not-for-profit corporation operating three

More information

A REASONING COMPONENT S CONSTRUCTION FOR PLANNING REGIONAL AGRICULTURAL ADVANTAGEOUS INDUSTRY DEVELOPMENT

A REASONING COMPONENT S CONSTRUCTION FOR PLANNING REGIONAL AGRICULTURAL ADVANTAGEOUS INDUSTRY DEVELOPMENT A REASONING COMPONENT S CONSTRUCTION FOR PLANNING REGIONAL AGRICULTURAL ADVANTAGEOUS INDUSTRY DEVELOPMENT Yue Fan 1, Yeping Zhu 1*, 1 Agricultural Information Institute, Chinese Academy of Agricultural

More information

is easing the creation of new ontologies by promoting the reuse of existing ones and automating, as much as possible, the entire ontology

is easing the creation of new ontologies by promoting the reuse of existing ones and automating, as much as possible, the entire ontology Preface The idea of improving software quality through reuse is not new. After all, if software works and is needed, just reuse it. What is new and evolving is the idea of relative validation through testing

More information

Searching PsycInfo & Proquest Psychology

Searching PsycInfo & Proquest Psychology Searching PsycInfo & Proquest Psychology 1) When you access PsycInfo & Proquest Psychology, you will be taken to the Advanced Search screen. Using the Advanced Search will allow you to use all of the capabilities

More information

Ontology based Model and Procedure Creation for Topic Analysis in Chinese Language

Ontology based Model and Procedure Creation for Topic Analysis in Chinese Language Ontology based Model and Procedure Creation for Topic Analysis in Chinese Language Dong Han and Kilian Stoffel Information Management Institute, University of Neuchâtel Pierre-à-Mazel 7, CH-2000 Neuchâtel,

More information

Discussion of GPON technology application in communication engineering Zhongbo Feng

Discussion of GPON technology application in communication engineering Zhongbo Feng 2nd International Conference on Electronics, Network and Computer Engineering (ICENCE 2016) Discussion of GPON technology application in communication engineering Zhongbo Feng School of Physics and Electronic

More information

DIGIT.B4 Big Data PoC

DIGIT.B4 Big Data PoC DIGIT.B4 Big Data PoC RTD Health papers D02.02 Technological Architecture Table of contents 1 Introduction... 5 2 Methodological Approach... 6 2.1 Business understanding... 7 2.2 Data linguistic understanding...

More information

Headings: Academic Libraries. Database Management. Database Searching. Electronic Information Resource Searching Evaluation. Web Portals.

Headings: Academic Libraries. Database Management. Database Searching. Electronic Information Resource Searching Evaluation. Web Portals. Erin R. Holmes. Reimagining the E-Research by Discipline Portal. A Master s Project for the M.S. in IS degree. April, 2014. 20 pages. Advisor: Emily King This project presents recommendations and wireframes

More information

CE4031 and CZ4031 Database System Principles

CE4031 and CZ4031 Database System Principles CE431 and CZ431 Database System Principles Course CE/CZ431 Course Database System Principles CE/CZ21 Algorithms; CZ27 Introduction to Databases CZ433 Advanced Data Management (not offered currently) Lectures

More information

Re-designing Online Terminology Resources for German Grammar

Re-designing Online Terminology Resources for German Grammar Re-designing Online Terminology Resources for German Grammar Project Report Karolina Suchowolec, Christian Lang, and Roman Schneider Institut für Deutsche Sprache (IDS), Mannheim, Germany {suchowolec,

More information

Text Mining. Representation of Text Documents

Text Mining. Representation of Text Documents Data Mining is typically concerned with the detection of patterns in numeric data, but very often important (e.g., critical to business) information is stored in the form of text. Unlike numeric data,

More information

RETRACTED ARTICLE. Web-Based Data Mining in System Design and Implementation. Open Access. Jianhu Gong 1* and Jianzhi Gong 2

RETRACTED ARTICLE. Web-Based Data Mining in System Design and Implementation. Open Access. Jianhu Gong 1* and Jianzhi Gong 2 Send Orders for Reprints to reprints@benthamscience.ae The Open Automation and Control Systems Journal, 2014, 6, 1907-1911 1907 Web-Based Data Mining in System Design and Implementation Open Access Jianhu

More information

Developing ArXivSI to Help Scientists to Explore the Research Papers in ArXiv

Developing ArXivSI to Help Scientists to Explore the Research Papers in ArXiv Submitted on: 19.06.2015 Developing ArXivSI to Help Scientists to Explore the Research Papers in ArXiv Zhixiong Zhang National Science Library, Chinese Academy of Sciences, Beijing, China. E-mail address:

More information

PHARM 309 Secondary Resources Terry Ann Jankowski, MLS, AHIP 9 Oct 2006

PHARM 309 Secondary Resources Terry Ann Jankowski, MLS, AHIP 9 Oct 2006 PHARM 309 Secondary Resources Terry Ann Jankowski, MLS, AHIP 9 Oct 2006 Lecture objective: By the end of this lecture (and with a little practice outside of class), you will be able to: Define secondary

More information

Development of Contents Management System Based on Light-Weight Ontology

Development of Contents Management System Based on Light-Weight Ontology Development of Contents Management System Based on Light-Weight Ontology Kouji Kozaki, Yoshinobu Kitamura, and Riichiro Mizoguchi Abstract In the Structuring Nanotechnology Knowledge project, a material-independent

More information

Falcon-AO: Aligning Ontologies with Falcon

Falcon-AO: Aligning Ontologies with Falcon Falcon-AO: Aligning Ontologies with Falcon Ningsheng Jian, Wei Hu, Gong Cheng, Yuzhong Qu Department of Computer Science and Engineering Southeast University Nanjing 210096, P. R. China {nsjian, whu, gcheng,

More information

Ontology Matching with CIDER: Evaluation Report for the OAEI 2008

Ontology Matching with CIDER: Evaluation Report for the OAEI 2008 Ontology Matching with CIDER: Evaluation Report for the OAEI 2008 Jorge Gracia, Eduardo Mena IIS Department, University of Zaragoza, Spain {jogracia,emena}@unizar.es Abstract. Ontology matching, the task

More information

NTUST Library Service & E-Resource

NTUST Library Service & E-Resource NTUST Library Service & E-Resource English Version lib@mail.ntust.ecu.tw NTUST Library System Information Section 1 NTUST Library Services Floor Locations of Materials Services 5-8th Bound Periodicals

More information

Access ERIC from the GOS-ICH Library website: hhttps://

Access ERIC from the GOS-ICH Library website: hhttps:// The ERIC (Educational Resources Information Center) database is sponsored by the U.S. Department of Education to provide access to educational-related literature. ERIC provides coverage of journal articles,

More information

Study on the feasibility of multilingual subject cataloging. at the Swiss National Library

Study on the feasibility of multilingual subject cataloging. at the Swiss National Library Study on the feasibility of multilingual subject cataloging at the Swiss National Library Working report 1 First thoughts on issue (i): Suggestions for the organization of subject cataloging as the automated

More information

The Semantics of Semantic Interoperability: A Two-Dimensional Approach for Investigating Issues of Semantic Interoperability in Digital Libraries

The Semantics of Semantic Interoperability: A Two-Dimensional Approach for Investigating Issues of Semantic Interoperability in Digital Libraries The Semantics of Semantic Interoperability: A Two-Dimensional Approach for Investigating Issues of Semantic Interoperability in Digital Libraries EunKyung Chung, eunkyung.chung@usm.edu School of Library

More information

Enhancing E-Journal Access In A Digital Work Environment

Enhancing E-Journal Access In A Digital Work Environment Enhancing e-journal access in a digital work environment Foo, S. (2006). Singapore Journal of Library & Information Management, 34, 31-40. Enhancing E-Journal Access In A Digital Work Environment Schubert

More information

Domain Specific Search Engine for Students

Domain Specific Search Engine for Students Domain Specific Search Engine for Students Domain Specific Search Engine for Students Wai Yuen Tang The Department of Computer Science City University of Hong Kong, Hong Kong wytang@cs.cityu.edu.hk Lam

More information

Content analysis and classification in mathematics

Content analysis and classification in mathematics Content analysis and classification in mathematics Wolfram Sperber (Zentralblatt Math) Patrick Ion (Math Reviews) UDC Seminar 2011 CLASSIFICATION & ontology Formal approaches and Access to Knowledge The

More information

Analysis on international yak document and information resources

Analysis on international yak document and information resources Analysis on international yak document and information resources Liu Xi and Liang Jinping 1. International Yak Document Research Institute of the Library of Gansu Agricultural University, Lanzhou 730070,

More information

Schema Quality Improving Tasks in the Schema Integration Process

Schema Quality Improving Tasks in the Schema Integration Process 468 Schema Quality Improving Tasks in the Schema Integration Process Peter Bellström Information Systems Karlstad University Karlstad, Sweden e-mail: peter.bellstrom@kau.se Christian Kop Institute for

More information

STRUCTURE-BASED QUERY EXPANSION FOR XML SEARCH ENGINE

STRUCTURE-BASED QUERY EXPANSION FOR XML SEARCH ENGINE STRUCTURE-BASED QUERY EXPANSION FOR XML SEARCH ENGINE Wei-ning Qian, Hai-lei Qian, Li Wei, Yan Wang and Ao-ying Zhou Computer Science Department Fudan University Shanghai 200433 E-mail: wnqian@fudan.edu.cn

More information

CNBKSY Platform Manual for IP Login User

CNBKSY Platform Manual for IP Login User CNBKSY Platform Manual for IP Login User Table of Contents 1. Login... 3 1.1. Login Interface... 3 1.2. Operation Steps of Login Functions:... 4 1.3. Exit... 4 2. Home Page... 6 3. General Search Function...

More information

The Research on the Method of Process-Based Knowledge Catalog and Storage and Its Application in Steel Product R&D

The Research on the Method of Process-Based Knowledge Catalog and Storage and Its Application in Steel Product R&D The Research on the Method of Process-Based Knowledge Catalog and Storage and Its Application in Steel Product R&D Xiaodong Gao 1,2 and Zhiping Fan 1 1 School of Business Administration, Northeastern University,

More information

User guide. ( Basic Search Tips

User guide. (  Basic Search Tips User guide Welcome to the new ProQuest search experience. ProQuest s all-new, powerful, comprehensive, and easyto-navigate search environment brings together resources from ProQuest, Cambridge Scientific

More information

Organizing Information. Organizing information is at the heart of information science and is important in many other

Organizing Information. Organizing information is at the heart of information science and is important in many other Dagobert Soergel College of Library and Information Services University of Maryland College Park, MD 20742 Organizing Information Organizing information is at the heart of information science and is important

More information

Taxonomies and controlled vocabularies best practices for metadata

Taxonomies and controlled vocabularies best practices for metadata Original Article Taxonomies and controlled vocabularies best practices for metadata Heather Hedden is the taxonomy manager at First Wind Energy LLC. Previously, she was a taxonomy consultant with Earley

More information

Engineering education knowledge management based on Topic Maps

Engineering education knowledge management based on Topic Maps World Transactions on Engineering and Technology Education Vol.11, No.4, 2013 2013 WIETE Engineering education knowledge management based on Topic Maps Zhu Ke Henan Normal University Xin Xiang, People

More information

WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS

WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS 1 WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS BRUCE CROFT NSF Center for Intelligent Information Retrieval, Computer Science Department, University of Massachusetts,

More information

User Guide. Basic Search Tips

User Guide. Basic Search Tips User Guide Welcome to the new ProQuest search experience. ProQuest s all-new, powerful, comprehensive, and easyto-navigate search environment brings together resources from ProQuest, Cambridge Scientific

More information

The Promotion Channel Investigation of BIM Technology Application

The Promotion Channel Investigation of BIM Technology Application 2016 International Conference on Manufacturing Construction and Energy Engineering (MCEE) ISBN: 978-1-60595-374-8 The Promotion Channel Investigation of BIM Technology Application Yong Li, Jia-Chuan Qin,

More information

Large Scale Chinese News Categorization. Peng Wang. Joint work with H. Zhang, B. Xu, H.W. Hao

Large Scale Chinese News Categorization. Peng Wang. Joint work with H. Zhang, B. Xu, H.W. Hao Large Scale Chinese News Categorization --based on Improved Feature Selection Method Peng Wang Joint work with H. Zhang, B. Xu, H.W. Hao Computational-Brain Research Center Institute of Automation, Chinese

More information

Data Mining with Oracle 10g using Clustering and Classification Algorithms Nhamo Mdzingwa September 25, 2005

Data Mining with Oracle 10g using Clustering and Classification Algorithms Nhamo Mdzingwa September 25, 2005 Data Mining with Oracle 10g using Clustering and Classification Algorithms Nhamo Mdzingwa September 25, 2005 Abstract Deciding on which algorithm to use, in terms of which is the most effective and accurate

More information

A Study of Future Internet Applications based on Semantic Web Technology Configuration Model

A Study of Future Internet Applications based on Semantic Web Technology Configuration Model Indian Journal of Science and Technology, Vol 8(20), DOI:10.17485/ijst/2015/v8i20/79311, August 2015 ISSN (Print) : 0974-6846 ISSN (Online) : 0974-5645 A Study of Future Internet Applications based on

More information

Research on Anti-collision Algorithm Optimization of RFID Tag Based on Binary Search

Research on Anti-collision Algorithm Optimization of RFID Tag Based on Binary Search Research on Anti-collision Algorithm Optimization of RFID Tag Based on Binary Search Jinyan Liu, Quanyuan Feng School of Information Science and Technology, Southwest Jiaotong University, Chengdu 610031,

More information

irnational Standard 5963

irnational Standard 5963 5 1 3 8 5 DO irnational Standard 5963 INTERNATIONAL ORGANIZATION FOR STANDARDIZATION«ME)KflyHAPOflHAn 0PrAHM3ALlHH F1O CTAHflAPTL13AU.Hl

More information

Using AgreementMaker to Align Ontologies for OAEI 2010

Using AgreementMaker to Align Ontologies for OAEI 2010 Using AgreementMaker to Align Ontologies for OAEI 2010 Isabel F. Cruz, Cosmin Stroe, Michele Caci, Federico Caimi, Matteo Palmonari, Flavio Palandri Antonelli, Ulas C. Keles ADVIS Lab, Department of Computer

More information

October 28, 2017 WELCOME SHAREPOINT SATURDAY OTTAWA. Going Meta How to use metadata in SharePoint

October 28, 2017 WELCOME SHAREPOINT SATURDAY OTTAWA. Going Meta How to use metadata in SharePoint October 28, 2017 WELCOME SHAREPOINT SATURDAY OTTAWA Going Meta How to use metadata in SharePoint Agenda What is metadata and why should we use it? Types of metadata Metadata in SharePoint Metadata and

More information

DISCOVERING INFORMATIVE KNOWLEDGE FROM HETEROGENEOUS DATA SOURCES TO DEVELOP EFFECTIVE DATA MINING

DISCOVERING INFORMATIVE KNOWLEDGE FROM HETEROGENEOUS DATA SOURCES TO DEVELOP EFFECTIVE DATA MINING DISCOVERING INFORMATIVE KNOWLEDGE FROM HETEROGENEOUS DATA SOURCES TO DEVELOP EFFECTIVE DATA MINING Ms. Pooja Bhise 1, Prof. Mrs. Vidya Bharde 2 and Prof. Manoj Patil 3 1 PG Student, 2 Professor, Department

More information