Construction of Knowledge Base for Automatic Indexing and Classification Based on Chinese Library Classification
Han-qing Hou, Chun-xiang Xue
School of Information Science & Technology, Nanjing Agricultural University, China

Abstract

Class numbers, descriptors and keywords are three kinds of subject concept identifiers, among which there exist conceptual mapping relationships, i.e. compatibility. According to this principle, we construct a CLC Knowledge Base on the basis of the Chinese Library Classification for automatic indexing and classification. We compare it with the CLC system to show its advantages for automatic information processing and concept searching. We then introduce at length some key technologies used in its construction and describe in brief their application to automatic indexing, automatic classification and concept searching.

Keywords: automatic indexing, automatic classification, knowledge base, knowledge organization system, Chinese Library Classification.

1. Introduction

Knowledge organization systems (KOS) refer to all kinds of semantic tools that are used to describe and interpret human knowledge and its relationships, such as library classifications, lists of subject headings, thesauri, semantic networks, maps of subject domains and ontologies. Library classifications, lists of subject headings and thesauri have played an important part in organizing traditional information resources, while semantic networks, subject maps and ontologies are designed for the second semantic Web. Some KOS in current use were constructed to improve traditional classifications or thesauri, inheriting and making use of an established knowledge system and abundant vocabulary. These systems have some features and functions of semantic networks and ontologies, which can improve knowledge processing and the efficiency of information retrieval.
The knowledge base discussed in this article, called the CLC Knowledge Base, is also a KOS, an expert system for knowledge organization, based on the Chinese Library Classification (hereinafter CLC). It uses statistical methods to reveal the concept mapping relationships among class numbers, descriptors and keywords in manually indexed records, and can therefore be used to realize automatic indexing and classification, and concept searching.

2. Principles of Construction of the CLC Knowledge Base

Classification schemes, thesauri and natural language are three different kinds of information language with different symbols and organizational approaches. In essence, however, they are the same: class numbers, descriptors and keywords can all be used to express a subject concept. There are hidden conceptual mapping relationships, i.e. compatibility relationships, among them. Most libraries hold numerous manually indexed document records that simultaneously contain class numbers and descriptor strings or keyword strings. By processing these data, we can mine the concept mapping relationships among class numbers, descriptor strings and keyword strings in order to construct a knowledge base. The CLC is a library classification based on scientific classification and conceptual relations, so we can regard it as a semantic network that can be used to organize all kinds of information. The reasons why we choose the CLC as the frame of the knowledge base are:
- Both classification schemes and thesauri, indeed all KOS, use the methodology of classification. The former use open classification systems, while the latter use hidden ones, such as the cross-reference system, categorical index and hierarchical index. The classification scheme is the main part of the integrated vocabulary system, that is, the classification/thesaurus system, and is much easier to accept and understand.
- The CLC is a large universal classification compiled by Chinese experts. It has been broadly used to classify and search books, audio-visual materials and other sorts of information. The CLC has the most comprehensive influence domestically and has numerous users, so it has been regarded as the national standard though it is not officially authorized.
- Since it was first published in 1975, the CLC has been continuously revised to meet information processing and access needs. It is currently in its 4th edition, and its electronic edition is stored in MARC format. The new edition offers a more logical knowledge organizing structure, more extensive coverage of knowledge, and faceted coordination.
- The CLC is used in most collections of Chinese documents. If we want to use these indexing records to construct knowledge bases, choosing the CLC as the framework avoids converting class numbers of other classification schemes into CLC class numbers.
- Most experts have approved the feasibility of applying the CLC to organize Internet resources. Meanwhile, digitalization, faceted coordination, combination with natural language and hyperlinks have been added to the CLC, so it can be applied not only in the traditional library but also in web environments.

Our automatic indexing and classification system is designed to organize both traditional documents and digital information.
The applicability of the CLC in both the traditional library and web environments meets our needs. Given these advantages, we use the CLC as the frame when constructing the knowledge base to realize concept indexing and searching.

3. Comparison between the structure of the CLC Knowledge Base and the CLC system

The CLC includes schedules, tables and indexes as well as other classifications. Following the trend toward integrating classifications with thesauri, the CLC maps its class numbers to descriptors of the Chinese Thesaurus, as DDC does to LCSH, yielding an integrated vocabulary named the Classified Chinese Thesaurus (hereinafter CCT); its first edition was edited from 1987 to [...]. At that time the CLC, the Chinese Thesaurus and the CCT made up a KOS, named the CLC system, shown in figure 1. Although the CLC system served the traditional library very well, its disadvantages are revealed when it is applied to the automatic processing of digital information on the Web. The disadvantages are as follows.

- The CLC system, both classification scheme and thesaurus, is a controlled language and lacks the elasticity of natural language.
- The CLC system has a long revision period, about eight to nine years, so many new words and subjects are not incorporated in a timely manner.
- The present classifications and thesauri are small in scale because they are printed editions.
- The CLC system cannot be directly applied to automatic information processing.

We choose the CLC schedule to organize the knowledge base and improve on it. Through statistics and computer technology, we can discover the compatibility relationships among the class numbers, descriptor strings and keyword strings in the knowledge base. Compared with the CLC system, the knowledge base adds new features and functions, i.e. an interface to natural language, continuously increasing scale and timely updates, to adapt to the development of information organization on the Web.
The knowledge base comprises three parts: the knowledge base for classification, the knowledge base for subject indexing and the supplementary knowledge base. The concordance of class numbers and keyword strings is the main part of the knowledge base for classification. The go-list, stop-list, dictionary of synonyms and semantic dictionary compose the knowledge base for subject indexing. Tables of areas, periods and document types compose the supplementary knowledge base, which is used to extract the subjects relating to area, period and type from the documents. The structure and composition of the knowledge base are shown in figure 2.

The two figures respectively show the frame of the CLC system and the structure of the knowledge base. Both are based on the CLC schedules and map their class numbers to descriptors or keywords, so both can be used to integrate classification with subject indexing. However, compared with the CLC system, the knowledge base is more suitable for automatic indexing and intelligent searching in its content, scale, structure and function. The reasons are as follows.

- The CLC system reveals only the mapping relationships between the CLC class numbers and the descriptors of the Chinese Thesaurus, while the knowledge base reveals the mapping relationships among the class numbers, descriptor strings and keyword strings.
- The CLC system comprises only the class numbers and descriptors included in the CLC schedule and the Chinese Thesaurus, whereas the data of the knowledge base come from manually indexed records, which include a great many built class numbers, keywords and new words. So the scale of the knowledge base is larger than that of the CLC system.
- In the CLC system, one class number maps to at most 20 descriptors or strings, and to 2-3 on average. In the knowledge base, by contrast, one class number maps on average to [...] keyword strings, and in some cases to several hundred. So the knowledge base can reveal the hidden concepts in the classes.
- The terms in the CLC system are updated very slowly because both the CLC and the CCT have long revision periods and are maintained by hand. The knowledge base, however, is compiled and maintained by machine and can incorporate newly proposed terms in real time. More vocabulary, especially new words, can lead to higher indexing consistency and correctness.
- Due to its limited scale and vocabulary, the CLC system is applied only to indexing and classifying literature by hand. The knowledge base can ensure higher quality and correctness because of its larger scale, richer vocabulary and flexibility. Moreover, the knowledge base is applied not only to automatic indexing and classification but also to more intelligent information searching. The knowledge base can give descriptors and keywords as indexing terms at the same time, index facets such as areas, periods and document types separately, and use its dictionary of synonyms to add entry words. All these advantages provide users with multi-aspect and intelligent searching.
- In general, a KOS and the collections of a library are separate. In our system, we use database and hyperlink technology to connect the knowledge base with the collections of literature, like the directory of a search engine on the Internet.
4. Key technologies of construction of the CLC Knowledge Base

Several key technologies are used in the construction of the CLC Knowledge Base. We introduce them in the following text.

Collecting source data from manually indexed records and library classification

First, we collect source data to build up the source database. There are four kinds of data source:

(1) The indexes of the CLC and the class number-descriptor strings parallel list of the Classified Chinese Thesaurus;
(2) Indexing records of large libraries, e.g. Beijing Library and Shanghai Library, which include the CLC class numbers and descriptors of the Chinese Thesaurus;
(3) Indexing records of periodical literature in bibliographic databases, which include the CLC class numbers and keyword strings, such as the Database for Chinese Periodicals of Science & Technology (namely VIP) and the Database for Social Newspaper and Periodicals edited by Shanghai Library;
(4) A database of titles, composed of CLC class numbers and titles drawn from some well-known bibliographic databases.

Next, we filter out erroneous and duplicate records to form a source database, which contains the mapping relationships between class numbers and descriptor strings or keyword strings.

Constructing the knowledge base by statistics method

After finishing the data collection, we extract terms and class numbers from the source database, compute the frequency of terms and measure the co-occurrence frequency of the class numbers and strings to construct the knowledge base. Of all the dictionaries of the knowledge base, the class number-keyword strings parallel list is the most important to construct. Here we use a statistical method to mine the conceptual mapping relationships between the class numbers and keyword strings.
Using three statistics, namely the frequency of the class number, the frequency of the keyword string and the co-occurrence frequency of the class number and keyword string, we compute two parameters often used in data mining, the support degree and the confidence degree, to discover the mapping relationships between the class numbers and keyword strings. From these we generate the knowledge base for automatic classification.

The support degree is the co-occurrence frequency of the class number and keyword string in the source database. A higher co-occurrence frequency shows that more indexers agree on the conceptual mapping between the two.

$$\mathrm{Support}(keyword \Rightarrow clc) = P(clc, keyword) = \mathit{freq\_gx}$$

P(clc, keyword): the probability that the class number and keyword string co-exist in an indexing record of the source database; it is measured by the co-occurrence frequency freq_gx.

Generally speaking, the conceptual mapping relationship between the two can be considered correct if the support >= 2. The greater the support, the more reliable the mapping relationship.

The confidence degree reflects the probability of the class number, given that the keyword string has appeared.

$$\mathrm{Conf}(clc \mid keyword) = \frac{P(clc, keyword)}{P(keyword)} = \frac{\mathit{freq\_gx}}{\mathit{freq\_keyword}}$$

P(keyword): the probability that the keyword string appears, i.e. the frequency of the string in the whole source data, freq_keyword.
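The two measures can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the record layout (a class number paired with a list of keyword strings), the function name `mine_mappings`, the sample records and the confidence threshold of 0.5 are all hypothetical. Only the support threshold of 2 comes from the text.

```python
from collections import Counter


def mine_mappings(records, min_support=2, min_confidence=0.5):
    """Return {(clc, keyword): (support, confidence)} for accepted mappings.

    records: iterable of (class_number, [keyword strings]) pairs, one per
    indexing record. min_confidence is an illustrative value (assumption).
    """
    freq_keyword = Counter()  # number of records containing each keyword string
    freq_gx = Counter()       # co-occurrence count of (class number, keyword string)
    for clc, keywords in records:
        for kw in set(keywords):  # count each string at most once per record
            freq_keyword[kw] += 1
            freq_gx[(clc, kw)] += 1

    accepted = {}
    for (clc, kw), support in freq_gx.items():
        confidence = support / freq_keyword[kw]  # freq_gx / freq_keyword
        if support >= min_support and confidence >= min_confidence:
            accepted[(clc, kw)] = (support, confidence)
    return accepted


# Invented sample records; class numbers and strings are placeholders.
records = [
    ("S512", ["wheat breeding"]),
    ("S512", ["wheat breeding", "drought resistance"]),
    ("S513", ["drought resistance"]),
    ("S512", ["wheat breeding"]),
]
mappings = mine_mappings(records)
```

In this sample only the pair ("S512", "wheat breeding") is kept, because the other pairs co-occur in a single record and fall below the support threshold.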
If the support and the confidence of a class number and keyword string each reach their threshold, the conceptual mapping relationship between the class number and keyword string is accepted.

Measuring the similarity to solve the many-to-many relationships between class numbers and strings

The relationship between the class numbers and keyword strings is many-to-many in the source database. In our system, each string maps to exactly one class number, so each string must be assigned a single class number in the knowledge base. There are many methods to measure the similarity between class numbers and strings, such as MI, LogL and Dice. Here we use the Dice measure to find the best class number for a string.

$$\mathrm{Dice} = \frac{P(clc, keyword)}{\frac{1}{2}\left[P(clc) + P(keyword)\right]} = \frac{2 \times \mathit{freq\_gx}}{\mathit{freq\_clc} + \mathit{freq\_keyword}}$$

where:

Dice: the degree to which the class number and keyword string co-occur;
P(clc): the probability of the class number existing in the source database, viz. the frequency of the class number, freq_clc;
P(keyword): the probability of the keyword string existing in the source database, viz. the frequency of the keyword string, freq_keyword;
P(clc, keyword): the probability of the class number and keyword string co-existing, viz. the co-occurrence frequency of the class number and keyword string, freq_gx.
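As a rough sketch of how the Dice measure, computed as 2 × freq_gx / (freq_clc + freq_keyword), could resolve a string that co-occurs with several class numbers, the Python below picks the highest-scoring class number per string. The record layout, the function names and the sample data are hypothetical, and the tie-breaking rule (keep the first pair seen) is an assumption the text does not address.

```python
from collections import Counter


def dice(freq_gx, freq_clc, freq_keyword):
    """Dice coefficient from the three frequencies defined in the text."""
    return 2 * freq_gx / (freq_clc + freq_keyword)


def best_class_numbers(records):
    """Map each keyword string to (class number, Dice score) with the highest score.

    records: iterable of (class_number, [keyword strings]) pairs (assumed layout).
    """
    freq_clc = Counter()
    freq_keyword = Counter()
    freq_gx = Counter()
    for clc, keywords in records:
        freq_clc[clc] += 1
        for kw in set(keywords):
            freq_keyword[kw] += 1
            freq_gx[(clc, kw)] += 1

    best = {}
    for (clc, kw), co in freq_gx.items():
        score = dice(co, freq_clc[clc], freq_keyword[kw])
        # Strict comparison: on a tie the first pair encountered is kept (assumption).
        if kw not in best or score > best[kw][1]:
            best[kw] = (clc, score)
    return best


# Invented sample records; class numbers and strings are placeholders.
records = [
    ("S512", ["drought resistance"]),
    ("S512", ["drought resistance"]),
    ("S512", ["wheat breeding"]),
    ("S513", ["drought resistance"]),
]
best = best_class_numbers(records)
```

Here "drought resistance" co-occurs with both class numbers, and the sketch assigns it to "S512" because that pair has the higher score.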
If one string maps to multiple class numbers, the best class number is the one with the maximum Dice value.

Using Cilin, a thesaurus of Chinese words, to create a semantic dictionary for recognizing the synonyms

Turning keywords into descriptors in subject indexing, and measuring the semantic similarity between the indexing subjects and the terms in the knowledge base in automatic classification and concept searching, cannot be achieved without recognizing synonyms. So it is important to create a semantic dictionary to recognize the synonyms. Cilin is a semantically classified dictionary, organized like a semantic tree. It divides Chinese words into three sorts according to their semantic relationships, and from there into 14 major classes, 94 secondary classes and 1428 small classes. The vocabulary of Cilin is made up mostly of pure words, which are the morphemes of compounds. By using Cilin to create the semantic dictionary, we can, on the one hand, directly recognize synonyms in the form of morphemes and, on the other hand, mine the synonymous relationships among compounds.

[Semantic code] => (major class) (secondary class) (small class) (group), where
major class => (capital letter),
secondary class => (capital letter) (lowercase),
small class => (capital letter) (lowercase) (number) (number),
group => (capital letter) (lowercase) (number) (number) (number).

For example, the semantic code of the word hotel is [Dm040901]; the corresponding codes of its major class, secondary class, small class and group are (D), (Dm), (Dm0409) and (Dm040901). (D) represents the major class Thing, (Dm) the secondary class Organization, (Dm0409) the word group Hostel under the small class Shop, and (Dm040901) the group Hotel. We can then code all the morphemes by this method to create a semantic dictionary.
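The layered code can be illustrated with a short Python sketch that splits a semantic code into the four levels shown in the hotel example (D, Dm, Dm0409, Dm040901). It assumes every code has the same eight-character shape as that example, which the text does not state explicitly. The `shared_levels` helper is a hypothetical addition that counts how many levels two codes have in common, as one simple basis for comparing terms; it is not a method described in the paper, and the second and third codes used below are made up.

```python
import re

# Assumed shape, taken from the single worked example "Dm040901":
# one capital letter, one lowercase letter, six digits.
CODE_PATTERN = re.compile(r"^[A-Z][a-z]\d{6}$")

LEVELS = ("major", "secondary", "small", "group")


def split_semantic_code(code):
    """Split a semantic code into its major, secondary, small and group codes."""
    if not CODE_PATTERN.match(code):
        raise ValueError(f"unexpected semantic code: {code!r}")
    return {
        "major": code[:1],      # e.g. "D"
        "secondary": code[:2],  # e.g. "Dm"
        "small": code[:6],      # e.g. "Dm0409"
        "group": code[:8],      # e.g. "Dm040901"
    }


def shared_levels(code_a, code_b):
    """Count the leading levels two codes share (0 to 4). Hypothetical helper."""
    a = split_semantic_code(code_a)
    b = split_semantic_code(code_b)
    depth = 0
    for level in LEVELS:
        if a[level] != b[level]:
            break
        depth += 1
    return depth


hotel = split_semantic_code("Dm040901")
```

Two codes that differ only in the last two digits share three levels, while codes in different secondary classes share only the major class.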
Through the semantic dictionary, we can analyze the semantics of terms to measure the semantic distance between two terms, then turn keywords into descriptors and measure the semantic similarity between two strings to realize automatic classification and concept searching.

The above introduces the key technologies for constructing the class number-keyword strings parallel list and the semantic dictionary, which are the main strengths of the knowledge base. Since these technologies are particular to the construction of this knowledge base, we have introduced them at length. Other technologies used in its construction are not described in detail here.

5. Application of the CLC Knowledge Base

The knowledge base takes the CLC as its framework and is based on manual indexing. Following the compatibility principle of classification schemes, thesauri and natural language, it constructs mapping relationships among class numbers, descriptor strings and keyword strings, and it includes abundant vocabulary, synonyms and mapping relationships between keyword strings and class numbers.
The knowledge base can be broadly applied to automatic indexing and classification, and even to concept searching.

Realizing automatic indexing by word segmentation aided by the go-list and stop-list, and subject control aided by the dictionary of synonyms

Select the title, abstract, author-supplied keywords, references and so on as the indexing sources; segment the text of the indexing sources using the maximum matching algorithm aided by the go-list and stop-list; calculate word frequency, word number and word position weight to produce ranked indexing terms; then turn them into descriptors using the dictionary of synonyms.

Realizing automatic classification aided by the class number-keyword strings parallel list, the dictionary of synonyms and the tables of areas, periods and document types

The automatic classification discussed in this article classifies documents by keyword strings and concepts. First, it classifies documents by string rather than by single word, which can improve accuracy and precision. Second, it classifies documents by conceptual matching. When matching the indexing terms with the terms in the knowledge base, it first calculates word-form similarity; if that yields no result, it calculates semantic similarity aided by the dictionary of synonyms and the semantic dictionary, to work out the best CLC class number while balancing accuracy and speed. Third, it is a case-based method, that is, one based on indexing experience. Every record in the knowledge base is an example; the indexing terms or strings are matched against these examples to work out the best classification results.
Fourth, the facets of area, period and document type in the text are indexed separately by the subdivisions, which avoids some shortcomings of applying the CLC system to automatic classification.

Realizing concept searching and multiple-approach searching based on the dictionary of synonyms and the results of automatic indexing and classification

From the perspective of indexing, the results of subject indexing include two parts, keyword strings and descriptor strings, which let users search not only by keywords and descriptors but also by strings rather than single words. Furthermore, the system can add retrieval entries aided by the dictionary of synonyms and realize concept searching through the semantic dictionary to improve searching efficiency. From the perspective of classification, the results of classification include the main class number and the subdivision numbers for area, period and document type, so users can search for information by subject, area, period and document type.

6. Conclusion

The knowledge base, as a KOS based on the frame of the CLC, uses dual indexing records in bibliographic databases that simultaneously include class numbers and descriptor strings or keyword strings, and therefore has both literary warrant and user warrant. Professionals revise the data of the knowledge base after the statistical computation, which improves its accuracy. At the same time, the knowledge base is compiled with computer assistance from the statistics of a large corpus, so the subjectivity of mapping class numbers to strings can be avoided. Although the knowledge base is based on the CLC, it has broader functions than the CLC system itself.

We think that the current KOS is the combination of indexing languages with modern computer and network technology. The CLC Knowledge Base we have constructed is one such example; it possesses an abundant vocabulary and rich semantic relationships. It combines traditional indexing languages, such as the CLC and CCT, with modern technology, such as databases, data mining, hyperlinks and computational linguistics. In a sense, it has some features of an ontology suitable for today's automated information processing. However, the CLC Knowledge Base has shortcomings in machine understanding and intelligent reasoning. Although the knowledge base has proved of practical use in the intelligent processing of information, it still needs further research and improvement.

References

[1] Zeng, M.L. (2004). Networked knowledge organization systems/services. New Technology of Library and Information Services, 1:2-3.
[2] Hou, H.Q., Xue, P.J. (2003). Design & construction of knowledge database for automatic classification in Chinese. Journal of the China Society for Scientific and Technical Information, 22(6):
[3] Zhang, C.Z. (2002). Web concept mining based on text layer model, automatic indexing and automatic classifying based on concept semantic network. Supervised by Han-qing Hou. Master Dissertation of Nanjing Agricultural University, 2002,6.
[4] Xue, P.J. (2001). Research on intelligent search engine of Chinese economic information based on knowledge database. Supervised by Han-qing Hou. Master Dissertation of Nanjing Agricultural University, 2001,6.
[5] Zhang, Q.Y. (2002). A concept and faceted coordinate system for automatic classifying. Library Journal, 6:9-10.
[6] Hou, H.Q. (1998). Construction of the indexing languages compatibility system on the basis of the Classified Chinese Thesaurus. Journal of the National Library of China, 4:35-39,90.
[7] Mei, J.J. (1983). Cilin Thesaurus of Chinese Words. Shanghai: Shanghai Lexicon Press.
More informationTCM Health-keeping Proverb English Translation Management Platform based on SQL Server Database
2019 2nd International Conference on Computer Science and Advanced Materials (CSAM 2019) TCM Health-keeping Proverb English Translation Management Platform based on SQL Server Database Qiuxia Zeng1, Jianpeng
More informationLatest development in image feature representation and extraction
International Journal of Advanced Research and Development ISSN: 2455-4030, Impact Factor: RJIF 5.24 www.advancedjournal.com Volume 2; Issue 1; January 2017; Page No. 05-09 Latest development in image
More informationRevealing the Modern History of Japanese Philosophy Using Digitization, Natural Language Processing, and Visualization
Revealing the Modern History of Japanese Philosophy Using Digitization, Natural Language Katsuya Masuda *, Makoto Tanji **, and Hideki Mima *** Abstract This study proposes a framework to access to the
More informationStatistical Methods to Evaluate Important Degrees of Document Features
Statistical Methods to Evaluate Important Degrees of Document Features 1 2 Computer School, Beijing Information Science and Technology University; Beijing Key Laboratory of Internet Culture and Digital
More informationThe Results of Falcon-AO in the OAEI 2006 Campaign
The Results of Falcon-AO in the OAEI 2006 Campaign Wei Hu, Gong Cheng, Dongdong Zheng, Xinyu Zhong, and Yuzhong Qu School of Computer Science and Engineering, Southeast University, Nanjing 210096, P. R.
More informationInternational Conference on Automation, Mechanical Control and Computational Engineering (AMCCE 2015)
International Conference on Automation, Mechanical Control and Computational Engineering (AMCCE 2015) Risk Management Theory Application in national information security risk control Analysis of the relationship
More informationWhat is Discover! Additional Resources CountryWatch MathSciNet Literature Resource Center ProQuest: Historical Newspapers PAIS
What is Discover! A single interface to search our library s content. This platform provides users with an easy, yet powerful means of accessing resources through a single search. What resources are searched
More informationA Dublin Core Application Profile in the Agricultural Domain
Proc. Int l. Conf. on Dublin Core and Metadata Applications 2001 A Dublin Core Application Profile in the Agricultural Domain DC-2001 International Conference on Dublin Core and Metadata Applications 2001
More informationContent Organization and Knowledge Management in the Digital Environment
Content Organization and Knowledge Management in the Digital Environment Kamani Perera Regional Centre for Strategic Studies, Colombo, Sri Lanka Abstract - Knowledge Organization Systems (KOS) such as
More informationAssociating Terms with Text Categories
Associating Terms with Text Categories Osmar R. Zaïane Department of Computing Science University of Alberta Edmonton, AB, Canada zaiane@cs.ualberta.ca Maria-Luiza Antonie Department of Computing Science
More informationDomain-specific Concept-based Information Retrieval System
Domain-specific Concept-based Information Retrieval System L. Shen 1, Y. K. Lim 1, H. T. Loh 2 1 Design Technology Institute Ltd, National University of Singapore, Singapore 2 Department of Mechanical
More informationA Network-Based Management Information System for Animal Husbandry in Farms
A Network-Based Information System for Animal Husbandry in Farms Jing Han 1 and Xi Wang 2, 1 College of Information Technology, Heilongjiang August First Land Reclamation University, Daqing, Heilongjiang
More information[Type text] [Type text] [Type text]
[Type text] [Type text] [Type text] ISSN : 0974-7435 Volume 10 Issue 19 BioTechnology 2014 An Indian Journal FULL PAPER BTAIJ, 10(19), 2014 [11492-11497] Research on integration of library information
More informationMulti-dimensional database design and implementation of dam safety monitoring system
Water Science and Engineering, Sep. 2008, Vol. 1, No. 3, 112-120 ISSN 1674-2370, http://kkb.hhu.edu.cn, e-mail: wse@hhu.edu.cn Multi-dimensional database design and implementation of dam safety monitoring
More informationAutomated Classification. Lars Marius Garshol Topic Maps
Automated Classification Lars Marius Garshol Topic Maps 2007 2007-03-21 Automated classification What is it? Why do it? 2 What is automated classification? Create parts of a topic map
More informationAnalysis on the technology improvement of the library network information retrieval efficiency
Available online www.jocpr.com Journal of Chemical and Pharmaceutical Research, 2014, 6(6):2198-2202 Research Article ISSN : 0975-7384 CODEN(USA) : JCPRC5 Analysis on the technology improvement of the
More informationE B S C O h o s t U s e r G u i d e P s y c I N F O
E B S C O h o s t U s e r G u i d e P s y c I N F O PsycINFO User Guide Last Updated: 1/11/12 Table of Contents What is PsycINFO... 3 What is EBSCOhost... 3 System Requirements...3 Choosing Databases to
More informationMetadata for Digital Collections: A How-to-Do-It Manual
Chapter 4 Supplement Resource Content and Relationship Elements Questions for Review, Study, or Discussion 1. This chapter explores information and metadata elements having to do with what aspects of digital
More informationYunfeng Zhang 1, Huan Wang 2, Jie Zhu 1 1 Computer Science & Engineering Department, North China Institute of Aerospace
[Type text] [Type text] [Type text] ISSN : 0974-7435 Volume 10 Issue 20 BioTechnology 2014 An Indian Journal FULL PAPER BTAIJ, 10(20), 2014 [12526-12531] Exploration on the data mining system construction
More informationChapter 27 Introduction to Information Retrieval and Web Search
Chapter 27 Introduction to Information Retrieval and Web Search Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 27 Outline Information Retrieval (IR) Concepts Retrieval
More informationINFS 427: AUTOMATED INFORMATION RETRIEVAL (1 st Semester, 2018/2019)
INFS 427: AUTOMATED INFORMATION RETRIEVAL (1 st Semester, 2018/2019) Session 05 SUBJECT ANALYSIS & REPRESENTATION Lecturer: Mrs. Florence O. Entsua-Mensah, DIS Contact Information: fentsua-mensah@ug.edu.gh
More informationCreating a Corporate Taxonomy. Internet Librarian November 2001 Betsy Farr Cogliano
Creating a Corporate Taxonomy Internet Librarian 2001 7 November 2001 Betsy Farr Cogliano 2001 The MITRE Corporation Revised October 2001 2 Background MITRE is a not-for-profit corporation operating three
More informationA REASONING COMPONENT S CONSTRUCTION FOR PLANNING REGIONAL AGRICULTURAL ADVANTAGEOUS INDUSTRY DEVELOPMENT
A REASONING COMPONENT S CONSTRUCTION FOR PLANNING REGIONAL AGRICULTURAL ADVANTAGEOUS INDUSTRY DEVELOPMENT Yue Fan 1, Yeping Zhu 1*, 1 Agricultural Information Institute, Chinese Academy of Agricultural
More informationis easing the creation of new ontologies by promoting the reuse of existing ones and automating, as much as possible, the entire ontology
Preface The idea of improving software quality through reuse is not new. After all, if software works and is needed, just reuse it. What is new and evolving is the idea of relative validation through testing
More informationSearching PsycInfo & Proquest Psychology
Searching PsycInfo & Proquest Psychology 1) When you access PsycInfo & Proquest Psychology, you will be taken to the Advanced Search screen. Using the Advanced Search will allow you to use all of the capabilities
More informationOntology based Model and Procedure Creation for Topic Analysis in Chinese Language
Ontology based Model and Procedure Creation for Topic Analysis in Chinese Language Dong Han and Kilian Stoffel Information Management Institute, University of Neuchâtel Pierre-à-Mazel 7, CH-2000 Neuchâtel,
More informationDiscussion of GPON technology application in communication engineering Zhongbo Feng
2nd International Conference on Electronics, Network and Computer Engineering (ICENCE 2016) Discussion of GPON technology application in communication engineering Zhongbo Feng School of Physics and Electronic
More informationDIGIT.B4 Big Data PoC
DIGIT.B4 Big Data PoC RTD Health papers D02.02 Technological Architecture Table of contents 1 Introduction... 5 2 Methodological Approach... 6 2.1 Business understanding... 7 2.2 Data linguistic understanding...
More informationHeadings: Academic Libraries. Database Management. Database Searching. Electronic Information Resource Searching Evaluation. Web Portals.
Erin R. Holmes. Reimagining the E-Research by Discipline Portal. A Master s Project for the M.S. in IS degree. April, 2014. 20 pages. Advisor: Emily King This project presents recommendations and wireframes
More informationCE4031 and CZ4031 Database System Principles
CE431 and CZ431 Database System Principles Course CE/CZ431 Course Database System Principles CE/CZ21 Algorithms; CZ27 Introduction to Databases CZ433 Advanced Data Management (not offered currently) Lectures
More informationRe-designing Online Terminology Resources for German Grammar
Re-designing Online Terminology Resources for German Grammar Project Report Karolina Suchowolec, Christian Lang, and Roman Schneider Institut für Deutsche Sprache (IDS), Mannheim, Germany {suchowolec,
More informationText Mining. Representation of Text Documents
Data Mining is typically concerned with the detection of patterns in numeric data, but very often important (e.g., critical to business) information is stored in the form of text. Unlike numeric data,
More informationRETRACTED ARTICLE. Web-Based Data Mining in System Design and Implementation. Open Access. Jianhu Gong 1* and Jianzhi Gong 2
Send Orders for Reprints to reprints@benthamscience.ae The Open Automation and Control Systems Journal, 2014, 6, 1907-1911 1907 Web-Based Data Mining in System Design and Implementation Open Access Jianhu
More informationDeveloping ArXivSI to Help Scientists to Explore the Research Papers in ArXiv
Submitted on: 19.06.2015 Developing ArXivSI to Help Scientists to Explore the Research Papers in ArXiv Zhixiong Zhang National Science Library, Chinese Academy of Sciences, Beijing, China. E-mail address:
More informationPHARM 309 Secondary Resources Terry Ann Jankowski, MLS, AHIP 9 Oct 2006
PHARM 309 Secondary Resources Terry Ann Jankowski, MLS, AHIP 9 Oct 2006 Lecture objective: By the end of this lecture (and with a little practice outside of class), you will be able to: Define secondary
More informationDevelopment of Contents Management System Based on Light-Weight Ontology
Development of Contents Management System Based on Light-Weight Ontology Kouji Kozaki, Yoshinobu Kitamura, and Riichiro Mizoguchi Abstract In the Structuring Nanotechnology Knowledge project, a material-independent
More informationFalcon-AO: Aligning Ontologies with Falcon
Falcon-AO: Aligning Ontologies with Falcon Ningsheng Jian, Wei Hu, Gong Cheng, Yuzhong Qu Department of Computer Science and Engineering Southeast University Nanjing 210096, P. R. China {nsjian, whu, gcheng,
More informationOntology Matching with CIDER: Evaluation Report for the OAEI 2008
Ontology Matching with CIDER: Evaluation Report for the OAEI 2008 Jorge Gracia, Eduardo Mena IIS Department, University of Zaragoza, Spain {jogracia,emena}@unizar.es Abstract. Ontology matching, the task
More informationNTUST Library Service & E-Resource
NTUST Library Service & E-Resource English Version lib@mail.ntust.ecu.tw NTUST Library System Information Section 1 NTUST Library Services Floor Locations of Materials Services 5-8th Bound Periodicals
More informationAccess ERIC from the GOS-ICH Library website: hhttps://
The ERIC (Educational Resources Information Center) database is sponsored by the U.S. Department of Education to provide access to educational-related literature. ERIC provides coverage of journal articles,
More informationStudy on the feasibility of multilingual subject cataloging. at the Swiss National Library
Study on the feasibility of multilingual subject cataloging at the Swiss National Library Working report 1 First thoughts on issue (i): Suggestions for the organization of subject cataloging as the automated
More informationThe Semantics of Semantic Interoperability: A Two-Dimensional Approach for Investigating Issues of Semantic Interoperability in Digital Libraries
The Semantics of Semantic Interoperability: A Two-Dimensional Approach for Investigating Issues of Semantic Interoperability in Digital Libraries EunKyung Chung, eunkyung.chung@usm.edu School of Library
More informationEnhancing E-Journal Access In A Digital Work Environment
Enhancing e-journal access in a digital work environment Foo, S. (2006). Singapore Journal of Library & Information Management, 34, 31-40. Enhancing E-Journal Access In A Digital Work Environment Schubert
More informationDomain Specific Search Engine for Students
Domain Specific Search Engine for Students Domain Specific Search Engine for Students Wai Yuen Tang The Department of Computer Science City University of Hong Kong, Hong Kong wytang@cs.cityu.edu.hk Lam
More informationContent analysis and classification in mathematics
Content analysis and classification in mathematics Wolfram Sperber (Zentralblatt Math) Patrick Ion (Math Reviews) UDC Seminar 2011 CLASSIFICATION & ontology Formal approaches and Access to Knowledge The
More informationAnalysis on international yak document and information resources
Analysis on international yak document and information resources Liu Xi and Liang Jinping 1. International Yak Document Research Institute of the Library of Gansu Agricultural University, Lanzhou 730070,
More informationSchema Quality Improving Tasks in the Schema Integration Process
468 Schema Quality Improving Tasks in the Schema Integration Process Peter Bellström Information Systems Karlstad University Karlstad, Sweden e-mail: peter.bellstrom@kau.se Christian Kop Institute for
More informationSTRUCTURE-BASED QUERY EXPANSION FOR XML SEARCH ENGINE
STRUCTURE-BASED QUERY EXPANSION FOR XML SEARCH ENGINE Wei-ning Qian, Hai-lei Qian, Li Wei, Yan Wang and Ao-ying Zhou Computer Science Department Fudan University Shanghai 200433 E-mail: wnqian@fudan.edu.cn
More informationCNBKSY Platform Manual for IP Login User
CNBKSY Platform Manual for IP Login User Table of Contents 1. Login... 3 1.1. Login Interface... 3 1.2. Operation Steps of Login Functions:... 4 1.3. Exit... 4 2. Home Page... 6 3. General Search Function...
More informationThe Research on the Method of Process-Based Knowledge Catalog and Storage and Its Application in Steel Product R&D
The Research on the Method of Process-Based Knowledge Catalog and Storage and Its Application in Steel Product R&D Xiaodong Gao 1,2 and Zhiping Fan 1 1 School of Business Administration, Northeastern University,
More informationUser guide. ( Basic Search Tips
User guide Welcome to the new ProQuest search experience. ProQuest s all-new, powerful, comprehensive, and easyto-navigate search environment brings together resources from ProQuest, Cambridge Scientific
More informationOrganizing Information. Organizing information is at the heart of information science and is important in many other
Dagobert Soergel College of Library and Information Services University of Maryland College Park, MD 20742 Organizing Information Organizing information is at the heart of information science and is important
More informationTaxonomies and controlled vocabularies best practices for metadata
Original Article Taxonomies and controlled vocabularies best practices for metadata Heather Hedden is the taxonomy manager at First Wind Energy LLC. Previously, she was a taxonomy consultant with Earley
More informationEngineering education knowledge management based on Topic Maps
World Transactions on Engineering and Technology Education Vol.11, No.4, 2013 2013 WIETE Engineering education knowledge management based on Topic Maps Zhu Ke Henan Normal University Xin Xiang, People
More informationWEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS
1 WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS BRUCE CROFT NSF Center for Intelligent Information Retrieval, Computer Science Department, University of Massachusetts,
More informationUser Guide. Basic Search Tips
User Guide Welcome to the new ProQuest search experience. ProQuest s all-new, powerful, comprehensive, and easyto-navigate search environment brings together resources from ProQuest, Cambridge Scientific
More informationThe Promotion Channel Investigation of BIM Technology Application
2016 International Conference on Manufacturing Construction and Energy Engineering (MCEE) ISBN: 978-1-60595-374-8 The Promotion Channel Investigation of BIM Technology Application Yong Li, Jia-Chuan Qin,
More informationLarge Scale Chinese News Categorization. Peng Wang. Joint work with H. Zhang, B. Xu, H.W. Hao
Large Scale Chinese News Categorization --based on Improved Feature Selection Method Peng Wang Joint work with H. Zhang, B. Xu, H.W. Hao Computational-Brain Research Center Institute of Automation, Chinese
More informationData Mining with Oracle 10g using Clustering and Classification Algorithms Nhamo Mdzingwa September 25, 2005
Data Mining with Oracle 10g using Clustering and Classification Algorithms Nhamo Mdzingwa September 25, 2005 Abstract Deciding on which algorithm to use, in terms of which is the most effective and accurate
More informationA Study of Future Internet Applications based on Semantic Web Technology Configuration Model
Indian Journal of Science and Technology, Vol 8(20), DOI:10.17485/ijst/2015/v8i20/79311, August 2015 ISSN (Print) : 0974-6846 ISSN (Online) : 0974-5645 A Study of Future Internet Applications based on
More informationResearch on Anti-collision Algorithm Optimization of RFID Tag Based on Binary Search
Research on Anti-collision Algorithm Optimization of RFID Tag Based on Binary Search Jinyan Liu, Quanyuan Feng School of Information Science and Technology, Southwest Jiaotong University, Chengdu 610031,
More informationirnational Standard 5963
5 1 3 8 5 DO irnational Standard 5963 INTERNATIONAL ORGANIZATION FOR STANDARDIZATION«ME)KflyHAPOflHAn 0PrAHM3ALlHH F1O CTAHflAPTL13AU.Hl
More informationUsing AgreementMaker to Align Ontologies for OAEI 2010
Using AgreementMaker to Align Ontologies for OAEI 2010 Isabel F. Cruz, Cosmin Stroe, Michele Caci, Federico Caimi, Matteo Palmonari, Flavio Palandri Antonelli, Ulas C. Keles ADVIS Lab, Department of Computer
More informationOctober 28, 2017 WELCOME SHAREPOINT SATURDAY OTTAWA. Going Meta How to use metadata in SharePoint
October 28, 2017 WELCOME SHAREPOINT SATURDAY OTTAWA Going Meta How to use metadata in SharePoint Agenda What is metadata and why should we use it? Types of metadata Metadata in SharePoint Metadata and
More informationDISCOVERING INFORMATIVE KNOWLEDGE FROM HETEROGENEOUS DATA SOURCES TO DEVELOP EFFECTIVE DATA MINING
DISCOVERING INFORMATIVE KNOWLEDGE FROM HETEROGENEOUS DATA SOURCES TO DEVELOP EFFECTIVE DATA MINING Ms. Pooja Bhise 1, Prof. Mrs. Vidya Bharde 2 and Prof. Manoj Patil 3 1 PG Student, 2 Professor, Department
More information