An Integrated Framework to Enhance the Web Content Mining and Knowledge Discovery


Simon Pelletier
Université de Moncton, Campus of Shippagan, BGI
New Brunswick, Canada

and

Sid-Ahmed Selouani
Université de Moncton, Campus of Shippagan, BGI
New Brunswick, Canada

ABSTRACT

This paper addresses the problem of distilling relevant information from unstructured data such as the content of Web pages. To address it, we design a system that applies automated, guided Web mining algorithms to meta-rule extraction. The proposed system can be viewed as an extensible tool to extract metadata and generate multi-format descriptions from existing Web documents. The on Canadian universities. The results show that the system readily provides meaningful visualizations and delivers powerful text extraction, supporting users in their quest to efficiently investigate and exploit available Web data sources.

Keywords: Knowledge discovery, Web content mining, Information retrieval, Metadata, Visualisation capabilities

1. INTRODUCTION

The rapid growth of largely unstructured data on the Web makes it increasingly difficult to extract potentially useful knowledge. Distilling relevant information from unstructured data, such as content from Web pages, can be both challenging and time consuming. Most crawler-based search engines, such as Google, rely on document-level ranking and retrieval and build their listings automatically. They spider the Web and then present users with a list of links to Web pages ranked by relevance to a given query. Extracting valuable information from such an ever-increasing amount of data remains a tedious task. The biggest challenge is to drive the next generation of Web search by leveraging data mining and knowledge discovery techniques for information organization, retrieval, and analysis.
These new Web search services are expected to bring increased knowledge and intelligence to users. Enhanced search functions of this kind can dig understandable information and knowledge out of unorganized and unstructured Web data.

This paper is organized as follows. Related work is reviewed in Section 2. Section 3 gives the objectives of the designed tool. The components of the proposed tool are described in Section 4 through the presentation of two case studies. Finally, Section 5 concludes the paper.

2. RELATED WORK

It is widely recognized that the huge volume of information on the Web, disseminated to users in a chaotic way, makes it very challenging to use that information systematically. To face this challenge, Web mining is one of the fast-growing technologies that aim at discovering and analyzing useful information from the Web. According to the classification proposed by Nadeem and Syed in [8], Web mining consists of Web usage mining, Web structure mining, and Web content mining. Web usage mining investigates user access patterns from Web usage logs. Web structure mining aims at discovering useful knowledge from the structure of hyperlinks. Web content mining refers to the extraction and integration of useful data, information, and knowledge from Web page contents. In this paper we are concerned with Web content mining.

To extract structured data from semi-structured Web documents, pattern-discovery-based approaches can be used. Recent variants of these approaches discover extraction patterns from Web pages without user-labeled examples by using several pattern discovery

techniques, including radix trees, multiple string alignments, and pattern matching algorithms [2]. These information extractors can be generalized over unseen pages from the same Web data source.

One of the most straightforward ways to extract Web data is to copy and paste it. Some tools make this easier, and one of them is Quotepad [10]. It lets users store notes or data directly from the Web, and it also offers an option to export the selected data and save it in Extensible Markup Language (XML) format. Excel, the spreadsheet application of the Microsoft Office suite, can also be used to extract data from the Web through its "Import from a website" option [4]. The user may subsequently use the data to build histograms or save it as a list. However, the extracted data must already be structured for the result to be clear and easy to analyze or navigate.

Tools like OutWit Hub [9] are useful to find, grab, and organize data from the Web. However, these tools are better suited to recovering structured information such as tables or lists of data. Note that they do not automatically extract the data for all (unseen) Web pages of a given site, but only the data from the Web page currently being viewed. In addition, they do not extract the data dynamically. For example, if the extracted data is saved in Excel and a histogram is built from it, the whole process has to be repeated to recreate the histogram whenever the Web page is updated.

Screen Scraper is another Web data extraction tool [11]. It is used to store extracted data in databases. Its main advantage is that it can automatically extract targeted data over a given period. It also provides various features that allow users to interface it easily with their database engines.
Figure 1. Overview of the proposed system (block labels: Data Mining Component, Queue of URLs, XHTML Parser, Natural Language Processing, Topic Identification, Database, Predicate Dictionary, Query Composer, Association Rules, Temp Text Storage, File Accessor, File Parser, XSD/XML, XSLT/FO, PDF format, Graphic format, Knowledge Base)

3. OBJECTIVES

To meet the challenge of delivering more intelligent search results to users, we propose using automated, guided Web mining algorithms for meta-rule extraction. The proposed approach combines Natural Language Processing and supervised rule-based guidance algorithms to improve the knowledge discovery process using information available on the Web. The proposed system can be viewed as an extensible tool to extract metadata and generate multi-format descriptions (including XML, database, graphics...) from existing Web documents. It provides a set of features that allow one to analyze documents from the Web without having to manually transcribe the reliable information found. The on Canadian universities.

4. TRANSFORMING UNSTRUCTURED WEB DATA INTO INTUITIVE VISUAL FORMAT

As illustrated by the block diagram of Figure 1, the proposed framework is composed of parsers, miners, and various output generators. The parsers perform the low-level processing: they receive Web documents converted from different formats, analyze their contents, and divide them into atomic units. For this task, we came up with a simple yet effective algorithm. The parser module contains two engines and a temporary storage area. The first engine is a multi-format parser that typically selects important attributes by applying natural language processing (lexical analysis). The second engine opens raw text documents, Microsoft Word documents, and PDF documents that are available for download from the fetched and queued URLs. Once parsing is done, the documents are appended to the storage area for later processing.
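The parsing step described above can be illustrated with a minimal sketch. The paper's framework is written in PHP and its parsing algorithm is not given, so the Python code below is only an assumption of what "dividing contents into atomic units" could look like: it strips markup with the standard-library `html.parser` module and treats lowercase word tokens as the atomic units. The names `TextExtractor` and `parse_to_units` are hypothetical.

```python
from html.parser import HTMLParser
import re


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the content of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside <script> or <style>
        self.chunks = []       # plays the role of the temporary text storage

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def parse_to_units(html):
    """Return the page text as a list of lowercase word tokens.

    Word tokens are an assumed definition of 'atomic units'; the paper
    does not specify the granularity it uses.
    """
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    text = " ".join(extractor.chunks)
    return re.findall(r"\w+", text.lower())
```

A miner could then consume the token list, for example to count keyword occurrences before deciding which attributes to keep.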
The miners use the parsed information to generate additional metadata properties for the documents. Examples of miners include a language identification module, a metadata extractor, and a classifier. Output generators allow users to highlight relevant information buried in unstructured content, extracted or mined from metadata, and present this information in an intuitive visual format. The findings are then presented as a consolidated view, either visual (graphics) or structured (database), built from the information discovered in the processed documents. The framework is written in PHP and SQL. To store the extracted data, we used a MySQL database. Through two practical case studies, we give details about the algorithms used in the proposed framework.

Case study 1: Acadian literature resources

This application uses the proposed framework to provide knowledge about Acadian literature, derived by the mining algorithm given in Figure 2. In the steps of this algorithm, the user enters a suitable combination of related keywords, and the system discovers the meaningful information in documents from the targeted Web sites obtained by the search support functions. A Graphical User Interface (GUI) was developed to visualize the characteristics of the obtained attributes. To operate the Web miner, it is necessary to gather Web pages selectively or entirely. When a request is made for a given feature, the miners check a text file that contains the queued URLs of these pages. Users can therefore control the behavior of the Web miner through this file. Additional selection policies and rules can be added to gather and select more relevant Web content in greater depth. For instance, such rules could be used to manage intellectual property and copyright issues when copies of gathered Web content are stored on personal servers.
Algorithm 1: (deep search & GUI)
  Fix the number of Web sites S_max that has been targeted
  Generate a set of rules and policies
  For S_max sites Do
    For each set of visible and unseen pages Do
      Search for specific items related to publications
      Evaluate the attributes
    End for
    Select and store in the database
  End for
  Output to various formats and graphics
  Discover new sites and update S_max

Figure 2. The mining algorithm used to provide Acadian literature information

The modular architecture of the proposed framework allows administrators to take into account the consistency of Web pages, such as the update time of Web content and the validity of hyperlinks to other Web pages.

Figure 3. Number of books related to the Acadian culture published per year

In this case study, the system extracts the content of both visible and unseen pages of the Web site [6] and sends it to the parser. Search patterns are then created and passed to a pattern matching procedure, which searches a string for specific patterns and stores the results in an array. To extract the content of all the Web pages, and not only the page for a single year of Acadian publication activity, we use a loop that changes the year in a dynamically built URL. The loop goes through an array that contains all the publication years, starting from 1980. The result of this extraction is stored in a MySQL database and can be further visualized in various formats. The user may also use the extracted data for later analysis by creating a histogram, as illustrated in Figure 3. The bars of the histogram are drawn from the data stored in the database. The main advantage of this histogram is that it is dynamic: if the data change in the database, the histogram changes as well. In this framework, we use a dynamic SQL statement in a loop to retrieve the number of books published each year.
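The year loop, pattern matching, and per-year count described above can be sketched as follows. This is an illustration only, not the authors' PHP/MySQL code: the URL template, the markup pattern, and the table layout are hypothetical, `sqlite3` stands in for MySQL so the example is self-contained, and a single `GROUP BY` query replaces the per-year dynamic SQL statement mentioned in the text. The `fake_fetch` function is a stand-in for a real HTTP download.

```python
import re
import sqlite3

# Hypothetical URL template and markup; the real site layout is not given.
URL_TEMPLATE = "http://example.org/books?year={year}"
TITLE_PATTERN = re.compile(r'<li class="book">(.*?)</li>', re.S)


def extract_titles(html):
    """Pattern matching step: collect every match into a list (the 'array')."""
    return [title.strip() for title in TITLE_PATTERN.findall(html)]


def mine_years(years, fetch, conn):
    """Loop over the years, rebuild the URL each time, and store the results."""
    conn.execute("CREATE TABLE IF NOT EXISTS books (year INTEGER, title TEXT)")
    for year in years:
        html = fetch(URL_TEMPLATE.format(year=year))
        rows = [(year, title) for title in extract_titles(html)]
        conn.executemany("INSERT INTO books (year, title) VALUES (?, ?)", rows)
    conn.commit()


def books_per_year(conn):
    """Histogram data: recomputed from the database on every call."""
    cursor = conn.execute(
        "SELECT year, COUNT(*) FROM books GROUP BY year ORDER BY year"
    )
    return cursor.fetchall()


def fake_fetch(url):
    """Stand-in for an HTTP request: two books for 1980, one for other years."""
    year = int(url.rsplit("=", 1)[1])
    count = 2 if year == 1980 else 1
    items = "".join(
        f'<li class="book">Book {i} ({year})</li>' for i in range(count)
    )
    return "<ul>" + items + "</ul>"
```

Because `books_per_year` queries the database each time it is called, any chart drawn from its result follows changes in the stored data, which is the "dynamic" property the text attributes to the histogram.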
Consequently, this system is convenient for extracting data from the unstructured Web because it exploits the data dynamically, unlike other tools that only support manual extraction. The major advantage of structured document formats is the possibility of producing multiple deliverables. Since there are many ways of converting unstructured data into structured formats, the appropriate deliverable should be chosen according to the type of application and the users' needs. In our application, these new formats facilitate the analysis, navigation, and browsing of Web site data. For instance, the framework makes it possible to structure the data collected from the Acadian literature Web site as a bibliographic record for each book. Because the data are saved in XML format, as illustrated in Figure 4, the system is well suited to extensive use of XSLT (Extensible Stylesheet Language Transformations) or RDF (Resource Description Framework) Schema. We can then display the XML data about each book in a user-friendly fashion, as illustrated in Figure 5.
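As a rough illustration of the bibliographic record format, the sketch below serializes book entries to XML with Python's standard `xml.etree.ElementTree` module. The element names (`bibliography`, `book`, `title`, `author`, `year`) are assumptions; the actual structure shown in the paper's Figure 4 is not reproduced in the text, and the sample values used with it would be placeholders rather than extracted data.

```python
import xml.etree.ElementTree as ET

# Assumed record fields; the real schema of Figure 4 is not available.
FIELDS = ("title", "author", "year")


def book_to_xml(book):
    """Build one <book> element from a dict holding the assumed fields."""
    record = ET.Element("book")
    for field in FIELDS:
        child = ET.SubElement(record, field)
        child.text = str(book[field])
    return record


def records_to_xml(books):
    """Wrap all book records in a single root element and serialize it."""
    root = ET.Element("bibliography")
    for book in books:
        root.append(book_to_xml(book))
    # encoding="unicode" returns a str without an XML declaration
    return ET.tostring(root, encoding="unicode")
```

A document produced this way can be handed to any XSLT processor to obtain a presentation such as the one the text describes; ElementTree escapes special characters such as `&` during serialization.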

Figure 4. XML structure of the Web content

Case Study 2: Information on Canadian Universities (Google Search)

In this application, the data that users want to extract are retrieved from selected URLs. To create this list of relevant URLs, a search procedure (the same as in the first case study) based on pattern matching of Google search results is used. The algorithm given in Figure 6 depicts the steps performed to provide enhanced knowledge from current Web search engines. A text file containing the filtered URLs is automatically created to guide the parsing procedure. Next, we use an array of patterns to extract the relevant attributes previously defined by the users. Note that XSD rules can be established in order to provide a well-formed XML file containing the final retained attributes extracted from the raw data obtained after mining the selected documents (step 6 of Algorithm 2). The framework allows users to extract only the content they want (metadata, for instance) without having to click on each link that Google provides for a given university. In this example, the metadata of Canadian university Web pages extracted from Google's results can also be stored in a database. They can then be saved in an XML file or in any other format, depending on the choice and needs of users. They can also simply be displayed in XHTML format directly from the framework.

Algorithm 2: (Search & Store metadata)
  1) Define user attributes and optional XSD rules
  2) Generate a set of templates to filter Google results: T_i
  3) Store a set of relevant URLs
  4) For U_max URLs Do
  5) Get a URL x
  6) Mine d_x: documents of x
  7) Evaluate the relevance of attributes by scoring the pattern matching with T_i
  8) if d_x ≠ Ø Goto 6
  9) Store temporarily selected attributes
  10) End For
  11) Output information to XML according to XSD rules if established

Figure 6. Algorithm providing knowledge discovery through augmented Web search results

Figure 5. XSLT result applied to the XML extracted file

Figure 7 gives an example of the deliverable obtained in step 9 of Algorithm 2, presented in Figure 6. This file contains the temporarily selected attributes corresponding to the information requested by the user. These otherwise invisible data, extracted from the Google search results on Canadian universities, are now accessible. The raw information obtained in step 9 is further structured in XML format according to the predefined XSD rules.
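The flow of Algorithm 2 can be sketched in a few lines of Python. This is a simplified reading under stated assumptions, not the authors' implementation: the templates are regular expressions applied to result URLs, the relevance score is the fraction of user-defined attributes found in a document, the `threshold` parameter is invented for illustration, and XSD validation is left out because the Python standard library does not provide it. All function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET


def filter_urls(search_hits, templates):
    """Steps 2-3: keep the URLs that match at least one template T_i."""
    return [u for u in search_hits if any(t.search(u) for t in templates)]


def score_attributes(document, attribute_patterns):
    """Step 7 (assumed scoring): fraction of requested attributes found."""
    found = {}
    for name, pattern in attribute_patterns.items():
        match = pattern.search(document)
        if match:
            found[name] = match.group(1).strip()
    score = len(found) / len(attribute_patterns) if attribute_patterns else 0.0
    return found, score


def mine(urls, fetch_documents, attribute_patterns, threshold=0.5):
    """Steps 4-11: visit each URL, mine its documents, output XML."""
    root = ET.Element("results")
    for url in urls:                           # steps 4-5
        for doc in fetch_documents(url):       # steps 6 and 8: until none left
            found, score = score_attributes(doc, attribute_patterns)
            if score >= threshold:             # step 9: keep selected attributes
                page = ET.SubElement(root, "page", url=url)
                for name, value in found.items():
                    ET.SubElement(page, name).text = value
    return ET.tostring(root, encoding="unicode")   # step 11, without XSD check
```

Here `fetch_documents` is a caller-supplied function returning the documents found at a URL, which keeps the sketch independent of any particular HTTP library or search engine interface.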

Figure 7. Excerpt of the raw data obtained in step 9 of the algorithm presented in Figure 6

5. CONCLUSION

In this paper we proposed a framework that can be used to identify valuable text-based information extracted from Web documents and transform it into multiple structured formats, facilitating the analytical process. This on Canadian universities. The algorithms developed within the framework proved effective and intuitive in overcoming some of the difficulties associated with the assimilation of unstructured data. Many uses are possible in order to provide meaningful visualizations, supporting users in their quest to efficiently investigate and exploit the data sources available on the Web.

6. REFERENCES

[1] M. Y. Chau, "Finding order in a chaotic world: A model for organized research using the World Wide Web," Internet Reference Services Quarterly, vol. 2, no. 2/3.
[2] C. Chia-Hui, H. Chun-Nan and L. Shao-Cheng, "Automatic information extraction from semi-structured Web pages by pattern discovery," Journal of Decision Support Systems, vol. 35, no. 1, Elsevier Science Publishers.
[3] P. Desikan, J. Srivastava, V. Kumar, and P.N. Tan, "Hyperlink Analysis: Techniques and Applications," Technical Report, Army High Performance Computing and Research Center.
[4] Excel, Microsoft Office, online.
[5] H. Kawano, "Web Archiving Strategies by using Web Mining Techniques," PACRIM IEEE-Communications, Computers and Signal Processing Conference.
[6] La littérature francophone en Acadie depuis 1980 (translation: "Acadian literature since 1980"), online.
[7] S. Lawrence and C. Lee Giles, "Searching the World Wide Web," Science, vol. 280, no. 3.
[8] M. Nadeem and S.H. Syed, "Guided Web Content Mining Approach for Automated Meta-Rule Extraction and Information Retrieval," Proceedings of The 2008 International Conference on Data Mining, Las Vegas, USA.
[9] Outwit Technologies, "Harvest the web," online.
[10] Quotepad, "The free notepad that can save the text selected on the screen," online.
[11] Screen-Scraper, "Web data extraction," online.

Prof. Ahmet Süerdem Istanbul Bilgi University London School of Economics

Prof. Ahmet Süerdem Istanbul Bilgi University London School of Economics Prof. Ahmet Süerdem Istanbul Bilgi University London School of Economics Media Intelligence Business intelligence (BI) Uses data mining techniques and tools for the transformation of raw data into meaningful

More information

A SURVEY- WEB MINING TOOLS AND TECHNIQUE

A SURVEY- WEB MINING TOOLS AND TECHNIQUE International Journal of Latest Trends in Engineering and Technology Vol.(7)Issue(4), pp.212-217 DOI: http://dx.doi.org/10.21172/1.74.028 e-issn:2278-621x A SURVEY- WEB MINING TOOLS AND TECHNIQUE Prof.

More information

Competitive Intelligence and Web Mining:

Competitive Intelligence and Web Mining: Competitive Intelligence and Web Mining: Domain Specific Web Spiders American University in Cairo (AUC) CSCE 590: Seminar1 Report Dr. Ahmed Rafea 2 P age Khalid Magdy Salama 3 P age Table of Contents Introduction

More information

3 Publishing Technique

3 Publishing Technique Publishing Tool 32 3 Publishing Technique As discussed in Chapter 2, annotations can be extracted from audio, text, and visual features. The extraction of text features from the audio layer is the approach

More information

ISSN: (Online) Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies

ISSN: (Online) Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies ISSN: 2321-7782 (Online) Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online

More information

An Approach To Web Content Mining

An Approach To Web Content Mining An Approach To Web Content Mining Nita Patil, Chhaya Das, Shreya Patanakar, Kshitija Pol Department of Computer Engg. Datta Meghe College of Engineering, Airoli, Navi Mumbai Abstract-With the research

More information

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK A SURVEY ON WEB CONTENT MINING DEVEN KENE 1, DR. PRADEEP K. BUTEY 2 1 Research

More information

DATA MINING II - 1DL460. Spring 2014"

DATA MINING II - 1DL460. Spring 2014 DATA MINING II - 1DL460 Spring 2014" A second course in data mining http://www.it.uu.se/edu/course/homepage/infoutv2/vt14 Kjell Orsborn Uppsala Database Laboratory Department of Information Technology,

More information

DATA MINING II - 1DL460. Spring 2017

DATA MINING II - 1DL460. Spring 2017 DATA MINING II - 1DL460 Spring 2017 A second course in data mining http://www.it.uu.se/edu/course/homepage/infoutv2/vt17 Kjell Orsborn Uppsala Database Laboratory Department of Information Technology,

More information

Support System- Pioneering approach for Web Data Mining

Support System- Pioneering approach for Web Data Mining Support System- Pioneering approach for Web Data Mining Geeta Kataria 1, Surbhi Kaushik 2, Nidhi Narang 3 and Sunny Dahiya 4 1,2,3,4 Computer Science Department Kurukshetra University Sonepat, India ABSTRACT

More information

A B2B Search Engine. Abstract. Motivation. Challenges. Technical Report

A B2B Search Engine. Abstract. Motivation. Challenges. Technical Report Technical Report A B2B Search Engine Abstract In this report, we describe a business-to-business search engine that allows searching for potential customers with highly-specific queries. Currently over

More information

IJMIE Volume 2, Issue 9 ISSN:

IJMIE Volume 2, Issue 9 ISSN: WEB USAGE MINING: LEARNER CENTRIC APPROACH FOR E-BUSINESS APPLICATIONS B. NAVEENA DEVI* Abstract Emerging of web has put forward a great deal of challenges to web researchers for web based information

More information

DATA MINING - 1DL105, 1DL111

DATA MINING - 1DL105, 1DL111 1 DATA MINING - 1DL105, 1DL111 Fall 2007 An introductory class in data mining http://user.it.uu.se/~udbl/dut-ht2007/ alt. http://www.it.uu.se/edu/course/homepage/infoutv/ht07 Kjell Orsborn Uppsala Database

More information

Semantic Web Mining and its application in Human Resource Management

Semantic Web Mining and its application in Human Resource Management International Journal of Computer Science & Management Studies, Vol. 11, Issue 02, August 2011 60 Semantic Web Mining and its application in Human Resource Management Ridhika Malik 1, Kunjana Vasudev 2

More information

AN OVERVIEW OF SEARCHING AND DISCOVERING WEB BASED INFORMATION RESOURCES

AN OVERVIEW OF SEARCHING AND DISCOVERING WEB BASED INFORMATION RESOURCES Journal of Defense Resources Management No. 1 (1) / 2010 AN OVERVIEW OF SEARCHING AND DISCOVERING Cezar VASILESCU Regional Department of Defense Resources Management Studies Abstract: The Internet becomes

More information

Secrets of Profitable Freelance Writing

Secrets of Profitable Freelance Writing Secrets of Profitable Freelance Writing Proven Strategies for Finding High Paying Writing Jobs Online Nathan Segal Cover by Nathan Segal Editing Precision Proofreading Nathan Segal 2014 Secrets of Profitable

More information

The influence of caching on web usage mining

The influence of caching on web usage mining The influence of caching on web usage mining J. Huysmans 1, B. Baesens 1,2 & J. Vanthienen 1 1 Department of Applied Economic Sciences, K.U.Leuven, Belgium 2 School of Management, University of Southampton,

More information

Chapter 50 Tracing Related Scientific Papers by a Given Seed Paper Using Parscit

Chapter 50 Tracing Related Scientific Papers by a Given Seed Paper Using Parscit Chapter 50 Tracing Related Scientific Papers by a Given Seed Paper Using Parscit Resmana Lim, Indra Ruslan, Hansin Susatya, Adi Wibowo, Andreas Handojo and Raymond Sutjiadi Abstract The project developed

More information

Deep Web Content Mining

Deep Web Content Mining Deep Web Content Mining Shohreh Ajoudanian, and Mohammad Davarpanah Jazi Abstract The rapid expansion of the web is causing the constant growth of information, leading to several problems such as increased

More information

Implementing a Knowledge Database for Scientific Control Systems. Daniel Gresh Wheatland-Chili High School LLE Advisor: Richard Kidder Summer 2006

Implementing a Knowledge Database for Scientific Control Systems. Daniel Gresh Wheatland-Chili High School LLE Advisor: Richard Kidder Summer 2006 Implementing a Knowledge Database for Scientific Control Systems Abstract Daniel Gresh Wheatland-Chili High School LLE Advisor: Richard Kidder Summer 2006 A knowledge database for scientific control systems

More information

Toward a Knowledge-Based Solution for Information Discovery in Complex and Dynamic Domains

Toward a Knowledge-Based Solution for Information Discovery in Complex and Dynamic Domains Toward a Knowledge-Based Solution for Information Discovery in Complex and Dynamic Domains Eloise Currie and Mary Parmelee SAS Institute, Cary NC About SAS: The Power to Know SAS: The Market Leader in

More information

Deep Web Crawling and Mining for Building Advanced Search Application

Deep Web Crawling and Mining for Building Advanced Search Application Deep Web Crawling and Mining for Building Advanced Search Application Zhigang Hua, Dan Hou, Yu Liu, Xin Sun, Yanbing Yu {hua, houdan, yuliu, xinsun, yyu}@cc.gatech.edu College of computing, Georgia Tech

More information

LOMGen: A Learning Object Metadata Generator Applied to Computer Science Terminology

LOMGen: A Learning Object Metadata Generator Applied to Computer Science Terminology LOMGen: A Learning Object Metadata Generator Applied to Computer Science Terminology A. Singh, H. Boley, V.C. Bhavsar National Research Council and University of New Brunswick Learning Objects Summit Fredericton,

More information

ASG WHITE PAPER DATA INTELLIGENCE. ASG s Enterprise Data Intelligence Solutions: Data Lineage Diving Deeper

ASG WHITE PAPER DATA INTELLIGENCE. ASG s Enterprise Data Intelligence Solutions: Data Lineage Diving Deeper THE NEED Knowing where data came from, how it moves through systems, and how it changes, is the most critical and most difficult task in any data management project. If that process known as tracing data

More information

TERM BASED WEIGHT MEASURE FOR INFORMATION FILTERING IN SEARCH ENGINES

TERM BASED WEIGHT MEASURE FOR INFORMATION FILTERING IN SEARCH ENGINES TERM BASED WEIGHT MEASURE FOR INFORMATION FILTERING IN SEARCH ENGINES Mu. Annalakshmi Research Scholar, Department of Computer Science, Alagappa University, Karaikudi. annalakshmi_mu@yahoo.co.in Dr. A.

More information

A NOVEL APPROACH TO INTEGRATED SEARCH INFORMATION RETRIEVAL TECHNIQUE FOR HIDDEN WEB FOR DOMAIN SPECIFIC CRAWLING

A NOVEL APPROACH TO INTEGRATED SEARCH INFORMATION RETRIEVAL TECHNIQUE FOR HIDDEN WEB FOR DOMAIN SPECIFIC CRAWLING A NOVEL APPROACH TO INTEGRATED SEARCH INFORMATION RETRIEVAL TECHNIQUE FOR HIDDEN WEB FOR DOMAIN SPECIFIC CRAWLING Manoj Kumar 1, James 2, Sachin Srivastava 3 1 Student, M. Tech. CSE, SCET Palwal - 121105,

More information

UNIT-V WEB MINING. 3/18/2012 Prof. Asha Ambhaikar, RCET Bhilai.

UNIT-V WEB MINING. 3/18/2012 Prof. Asha Ambhaikar, RCET Bhilai. UNIT-V WEB MINING 1 Mining the World-Wide Web 2 What is Web Mining? Discovering useful information from the World-Wide Web and its usage patterns. 3 Web search engines Index-based: search the Web, index

More information

WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS

WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS 1 WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS BRUCE CROFT NSF Center for Intelligent Information Retrieval, Computer Science Department, University of Massachusetts,

More information

Creating a Classifier for a Focused Web Crawler

Creating a Classifier for a Focused Web Crawler Creating a Classifier for a Focused Web Crawler Nathan Moeller December 16, 2015 1 Abstract With the increasing size of the web, it can be hard to find high quality content with traditional search engines.

More information

Proposal for Implementing Linked Open Data on Libraries Catalogue

Proposal for Implementing Linked Open Data on Libraries Catalogue Submitted on: 16.07.2018 Proposal for Implementing Linked Open Data on Libraries Catalogue Esraa Elsayed Abdelaziz Computer Science, Arab Academy for Science and Technology, Alexandria, Egypt. E-mail address:

More information

Finding Topic-centric Identified Experts based on Full Text Analysis

Finding Topic-centric Identified Experts based on Full Text Analysis Finding Topic-centric Identified Experts based on Full Text Analysis Hanmin Jung, Mikyoung Lee, In-Su Kang, Seung-Woo Lee, Won-Kyung Sung Information Service Research Lab., KISTI, Korea jhm@kisti.re.kr

More information

Development of an e-library Web Application

Development of an e-library Web Application Development of an e-library Web Application Farrukh SHAHZAD Assistant Professor al-huda University, Houston, TX USA Email: dr.farrukh@alhudauniversity.org and Fathi M. ALWOSAIBI Information Technology

More information

Developing Seamless Discovery of Scholarly and Trade Journal Resources Via OAI and RSS Chumbe, Santiago Segundo; MacLeod, Roddy

Developing Seamless Discovery of Scholarly and Trade Journal Resources Via OAI and RSS Chumbe, Santiago Segundo; MacLeod, Roddy Heriot-Watt University Heriot-Watt University Research Gateway Developing Seamless Discovery of Scholarly and Trade Journal Resources Via OAI and RSS Chumbe, Santiago Segundo; MacLeod, Roddy Publication

More information

INTRODUCTION. Chapter GENERAL

INTRODUCTION. Chapter GENERAL Chapter 1 INTRODUCTION 1.1 GENERAL The World Wide Web (WWW) [1] is a system of interlinked hypertext documents accessed via the Internet. It is an interactive world of shared information through which

More information

D2.5 Data mediation. Project: ROADIDEA

D2.5 Data mediation. Project: ROADIDEA D2.5 Data mediation Project: ROADIDEA 215455 Document Number and Title: D2.5 Data mediation How to convert data with different formats Work-Package: WP2 Deliverable Type: Report Contractual Date of Delivery:

More information

Knowledge Retrieval. Franz J. Kurfess. Computer Science Department California Polytechnic State University San Luis Obispo, CA, U.S.A.

Knowledge Retrieval. Franz J. Kurfess. Computer Science Department California Polytechnic State University San Luis Obispo, CA, U.S.A. Knowledge Retrieval Franz J. Kurfess Computer Science Department California Polytechnic State University San Luis Obispo, CA, U.S.A. 1 Acknowledgements This lecture series has been sponsored by the European

More information

Efficient Indexing and Searching Framework for Unstructured Data

Efficient Indexing and Searching Framework for Unstructured Data Efficient Indexing and Searching Framework for Unstructured Data Kyar Nyo Aye, Ni Lar Thein University of Computer Studies, Yangon kyarnyoaye@gmail.com, nilarthein@gmail.com ABSTRACT The proliferation

More information

Life Science Journal 2017;14(2) Optimized Web Content Mining

Life Science Journal 2017;14(2)   Optimized Web Content Mining Optimized Web Content Mining * K. Thirugnana Sambanthan,** Dr. S.S. Dhenakaran, Professor * Research Scholar, Dept. Computer Science, Alagappa University, Karaikudi, E-mail: shivaperuman@gmail.com ** Dept.

More information

A Supervised Method for Multi-keyword Web Crawling on Web Forums

A Supervised Method for Multi-keyword Web Crawling on Web Forums Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 2, February 2014,

More information

Evaluating the Usefulness of Sentiment Information for Focused Crawlers

Evaluating the Usefulness of Sentiment Information for Focused Crawlers Evaluating the Usefulness of Sentiment Information for Focused Crawlers Tianjun Fu 1, Ahmed Abbasi 2, Daniel Zeng 1, Hsinchun Chen 1 University of Arizona 1, University of Wisconsin-Milwaukee 2 futj@email.arizona.edu,

More information

Web Analysis in 4 Easy Steps. Rosaria Silipo, Bernd Wiswedel and Tobias Kötter

KNIME Forum Analysis Steps: 1. Get data into KNIME 2. Extract simple statistics (how many posts, response

A Comprehensive Comparison between Web Content Mining Tools: Usages, Capabilities and Limitations

Zahra Hojati 1, Rozita Jamili Oskouei 2* Department of Electrical, Computer & IT, Zanjan Branch, Islamic

ABSTRACT: INTRODUCTION: WEB CRAWLER OVERVIEW: METHOD 1: WEB CRAWLER IN SAS DATA STEP CODE. Paper CC-17

Paper CC-17 Your Friendly Neighborhood Web Crawler: A Guide to Crawling the Web with SAS Jake Bartlett, Alicia Bieringer, and James Cox PhD, SAS Institute Inc., Cary, NC ABSTRACT: The World Wide Web has

Part I: Data Mining Foundations

Table of Contents 1. Introduction 1.1. What is the World Wide Web? 1.2. A Brief History of the Web and the Internet 1.3. Web Data Mining 1.3.1. What is Data Mining? 1.3.2. What is Web Mining?

Overview of Web Mining Techniques and its Application towards Web

Prof. Pooja Mehta Abstract The World Wide Web (WWW) acts as an interactive and popular way to transfer information. Due to the enormous

CHAPTER THREE INFORMATION RETRIEVAL SYSTEM

3.1 INTRODUCTION A search engine is one of the most effective and prominent methods of finding information online. It has become an essential part of life for almost

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p.

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p. 6 What is Web Mining? p. 6 Summary of Chapters p. 8 How

Ranking Techniques in Search Engines

Rajat Chaudhari M.Tech Scholar Manav Rachna International University, Faridabad Charu Pujara Assistant professor, Dept. of Computer Science Manav Rachna International

Crawling the Hidden Web Resources: A Review

Rosy Madaan 1, Ashutosh Dixit 2 and A.K. Sharma 2 Abstract An ever-increasing amount of information on the Web today is available only through search interfaces. The users have to type in a set of keywords

WEB-BASED COLLECTION MANAGEMENT FOR LIBRARIES

Comprehensive Collections Management Systems You Can Access Anytime, Anywhere AXIELL COLLECTIONS FOR LIBRARIES Axiell Collections is a web-based CMS designed

Design and Implementation of Agricultural Information Resources Vertical Search Engine Based on Nutch

A publication of CHEMICAL ENGINEERING TRANSACTIONS VOL. 51, 2016 Guest Editors: Tichun Wang, Hongyang Zhang, Lei Tian Copyright 2016, AIDIC Servizi S.r.l., ISBN 978-88-95608-43-3; ISSN 2283-9216 The

Interactive Machine Learning (IML) Markup of OCR Generated Text by Exploiting Domain Knowledge: A Biodiversity Case Study

Several digitization projects such as Google Books are involved in scanning millions

USER'S GUIDE FOR THE ECONOMICS ELECTRONIC LIBRARY

http://www.bibeco.ulb.ac.be Table of Contents 1. Introduction... 4 2. Overview... 5 3. Search tools...

Chapter 2. Architecture of a Search Engine

Search Engine Architecture A software architecture consists of software components, the interfaces provided by those components and the relationships between them

A Comparative Study of the Search and Retrieval Features of OAI Harvesting Services

V. Indrani 1 and K. Thulasi 2 1 Information Centre for Aerospace Science and Technology, National Aerospace Laboratories,

A Survey Of Different Text Mining Techniques Varsha C. Pande 1 and Dr. A.S. Khandelwal 2

1 Department of Electronics & Comp. Sc, RTMNU, Nagpur, India 2 Department of Computer Science, Hislop College, Nagpur,

Archives in a Networked Information Society: The Problem of Sustainability in the Digital Information Environment

Shigeo Sugimoto Research Center for Knowledge Communities Graduate School of Library, Information

A Novel Interface to a Web Crawler using VB.NET Technology

IOSR Journal of Computer Engineering (IOSR-JCE) e-ISSN: 2278-0661, p-ISSN: 2278-8727, Volume 15, Issue 6 (Nov. - Dec. 2013), PP 59-63 Deepak Kumar

Metadata Framework for Resource Discovery

Submitted by: Metadata Strategy Catalytic Initiative 2006-05-01 Section 1 Metadata Framework for Resource Discovery Overview We must find new ways to organize and describe our extraordinary information

ATLAS.ti 8 WINDOWS & ATLAS.ti MAC THE NEXT LEVEL

POWERFUL DATA ANALYSIS. EASY TO USE LIKE NEVER BEFORE. www.atlasti.com UNIVERSAL EXPORT. LIFE LONG DATA ACCESS. ATLAS.ti 8 AND ATLAS.ti DATA ANALYSIS WITH ATLAS.ti

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

A PATH FOR HORIZING YOUR INNOVATIVE WORK REVIEW PAPER ON IMPLEMENTATION OF DOCUMENT ANNOTATION USING CONTENT AND QUERYING

2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media,

2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising

So You Want To Save Outlook Emails to SharePoint

Interested in using Microsoft SharePoint to store, find and share your Microsoft Outlook messages? Finding that the out-of-the-box integration of Outlook

Building Institutional Repositories: Emerging Challenges

University of Nebraska at Omaha From the SelectedWorks of Yumi Ohira 2014 Yumi Ohira, University of Nebraska at Omaha Available at: https://works.bepress.com/yumi-ohira/3/

Easy Ed: An Integration of Technologies for Multimedia Education 1

G. Ahanger and T.D.C. Little Multimedia Communications Laboratory Department of Electrical and Computer Engineering Boston University,

When Communities of Interest Collide: Harmonizing Vocabularies Across Operational Areas C. L. Connors, The MITRE Corporation

Three recent trends have had a profound impact on data standardization within

Provenance-aware Faceted Search in Drupal

Zhenning Shangguan, Jinguang Zheng, and Deborah L. McGuinness Tetherless World Constellation, Computer Science Department, Rensselaer Polytechnic Institute, 110

Data and Information Integration: Information Extraction

International OPEN ACCESS Journal Of Modern Engineering Research (IJMER) Varnica Verma 1 1 (Department of Computer Science Engineering, Guru Nanak

Database of historical places, persons, and lemmas

Natalia Korchagina Outline 1. Introduction 1.1 Swiss Law Sources Foundation as a Digital Humanities project 1.2 Data to be stored 1.3 Final goal: how

Course Introduction & Foundational Concepts

CPS 352: Database Systems Simon Miner Gordon College Last Revised: 8/30/12 Agenda Introductions Course Syllabus Databases Why What Terminology and Concepts Design

A Lime Light on the Emerging Trends of Web Mining

Udayasri.B, Sushmitha.N, Padmavathi.S Dept. of Computer Science and Engineering, Vidyavardhaka College of Engineering, Mysore, Karnataka, India E-mail

Performance Comparison of Hive, Pig & Map Reduce over Variety of Big Data

Yojna Arora, Dinesh Goyal Abstract: Big Data refers to that huge amount of data which cannot be analyzed by using traditional analytics

An FCA Framework for Knowledge Discovery in SPARQL Query Answers

Melisachew Wudage Chekol, Amedeo Napoli To cite this version: Melisachew Wudage Chekol, Amedeo Napoli. An FCA Framework for Knowledge Discovery

Review on Text Mining

Aarushi Rai #1, Aarush Gupta *2, Jabanjalin Hilda J. #3 #1 School of Computer Science and Engineering, VIT University, Tamil Nadu - India #2 School of Computer Science and Engineering,

Minghai Liu, Rui Cai, Ming Zhang, and Lei Zhang. Microsoft Research, Asia School of EECS, Peking University

Ordering Policies for Web Crawling Ordering policy To prioritize the URLs in a crawling queue

Semantic Web Search Model for Information Retrieval of the Semantic Data *

Okkyung Choi 1, SeokHyun Yoon 1, Myeongeun Oh 1, and Sangyong Han 2 Department of Computer Science & Engineering Chungang University

IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, ISSN:

IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, 2013 Improve Search Engine Relevance with Filter session Addlin Shinney R 1, Saravana Kumar T

VALLIAMMAI ENGINEERING COLLEGE SRM Nagar, Kattankulathur DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING QUESTION BANK VII SEMESTER

CS6007-INFORMATION RETRIEVAL Regulation 2013 Academic Year 2018

XFDU packaging contribution to an implementation of the OAIS reference model

Arnaud Lucas, Centre National d'Etudes Spatiales 18, avenue Edouard Belin 31401 Toulouse Cedex 9 FRANCE Arnaud.lucas@cnes.fr

Domain Specific Search Engine for Students

Wai Yuen Tang The Department of Computer Science City University of Hong Kong, Hong Kong wytang@cs.cityu.edu.hk Lam

Construction of the Library Management System Based on Data Warehouse and OLAP Maoli Xu 1, a, Xiuying Li 2,b

Applied Mechanics and Materials Online: 2013-08-30 ISSN: 1662-7482, Vols. 380-384, pp 4796-4799 doi:10.4028/www.scientific.net/amm.380-384.4796 2013 Trans Tech Publications, Switzerland Construction of

Web Crawling. Jitali Patel 1, Hardik Jethva 2 Dept. of Computer Science and Engineering, Nirma University, Ahmedabad, Gujarat, India

Abstract- A web crawler is a relatively simple automated program

Information Retrieval May 15. Web retrieval

What's so special about the Web? The Web Large Changing fast Public - No control over editing or contents Spam and Advertisement How big is the Web? Practically

In the recent past, the World Wide Web has been witnessing an explosive growth. All the leading web search engines, namely, Google,

1.1 Introduction In the recent past, the World Wide Web has been witnessing an explosive growth. All the leading web search engines, namely, Google, Yahoo, Askjeeves, etc. are vying with each other to

A Two-stage Crawler for Efficiently Harvesting Deep-Web Interfaces

Md. Nazeem Ahmed MTech(CSE) SLC's Institute of Engineering and Technology Adavelli ramesh Mtech Assoc. Prof Dep. of computer Science SLC

EXTRACTION OF REUSABLE COMPONENTS FROM LEGACY SYSTEMS

Moon-Soo Lee, Yeon-June Choi, Min-Jeong Kim, Oh-Chun, Kwon Telematics S/W Platform Team, Telematics Research Division Electronics and Telecommunications

Unit 10 Databases. Computer Concepts Unit Contents. 10 Operational and Analytical Databases. 10 Section A: Database Basics

Unit 10 Databases Computer Concepts 2016 ENHANCED EDITION 10 Unit Contents Section A: Database Basics Section B: Database Tools Section C: Database Design Section D: SQL Section E: Big Data Unit 10: Databases

IT Strategy 2017-2020

Great things happen when the world agrees ISO's mission is to bring together experts through its Members to share knowledge and to develop voluntary, consensus-based, market-relevant

EXTRACT THE TARGET LIST WITH HIGH ACCURACY FROM TOP-K WEB PAGES

B. GEETHA KUMARI M. Tech (CSE) Email-id: Geetha.bapr07@gmail.com JAGETI PADMAVTHI M. Tech (CSE) Email-id: jageti.padmavathi4@gmail.com ABSTRACT:

Automated Classification. Lars Marius Garshol Topic Maps

Topic Maps 2007 2007-03-21 Automated classification What is it? Why do it? What is automated classification? Create parts of a topic map

Information Retrieval Spring 2016. Web retrieval

The Web Large Changing fast Public - No control over editing or contents Spam and Advertisement How big is the Web? Practically infinite due to the dynamic

International Journal of Scientific & Engineering Research, Volume 7, Issue 2, February ISSN

International Journal of Scientific & Engineering Research, Volume 7, Issue 2, February-2016 An Application Programming Interface Based Architectural Design for Information Retrieval in Semantic Organization

Search Engine Optimisation Basics for Government Agencies

Prepared for State Services Commission by Catalyst IT Neil Bertram May 11, 2007 Abstract This document is intended as a guide for New Zealand government

An Efficient Approach for Color Pattern Matching Using Image Mining

* Manjot Kaur Navjot Kaur Master of Technology in Computer Science & Engineering, Sri Guru Granth Sahib World University, Fatehgarh Sahib,

Information Retrieval

Multimedia Computing: Algorithms, Systems, and Applications: Information Retrieval and Search Engine By Dr. Yu Cao Department of Computer Science The University of Massachusetts Lowell Lowell, MA 01854,

6 TOOLS FOR A COMPLETE MARKETING WORKFLOW

FROM ALEXA INTRODUCTION Marketers use countless

Role of Metadata in Knowledge Management of Multinational Organizations

Advances in Computational Sciences and Technology ISSN 0973-6107 Volume 10, Number 2 (2017) pp. 211-219 Research India Publications http://www.ripublication.com

Information Retrieval (IR) through Semantic Web (SW): An Overview

Gagandeep Singh 1, Vishal Jain 2 1 B.Tech (CSE) VI Sem, GuruTegh Bahadur Institute of Technology, GGS Indraprastha University, Delhi 2

Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. Springer

Table of Contents 1. Introduction 1.1. What is the World Wide Web? 1.2. A Brief History of the Web

<is web> Information Systems & Semantic Web University of Koblenz Landau, Germany

Semantic Search examples: Swoogle and Watson Steffen Staab credit: Tim Finin (Swoogle), Mathieu d'Aquin (Watson) and their groups 2009-07-17