LIST OF ACRONYMS & ABBREVIATIONS

ARPA      Alpha Page Rank Algorithm
CBFSE     Context based Focused Search Engine
CBR       Context based Relevance
CS        Contextual Sense
CSE       Contextual Sense Extractor
FiPRA     Filtered Page Rank Algorithm
GUI       Graphical User Interface
HITS      Hyperlink Induced Topic Search
HTML      Hypertext Markup Language
HTTP      Hypertext Transfer Protocol
HyPRA     Hybrid Page Rank Algorithm
NoRPRA    Noise Removed Page Rank Algorithm
ODP       Open Directory Project
PR        Page Rank
RBSE      Repository-Based Software Engineering
RS        Relevance Score
SE        Search Engine
TF-IDF    Term Frequency-Inverse Document Frequency
UI        User Interface
URI       Uniform Resource Identifier
URL       Uniform Resource Locator
W3        World Wide Web
W3C       World Wide Web Consortium
WePRA     Weighted Page Rank Algorithm
WP        Web Page
WWW       World Wide Web

LIST OF FIGURES

Figure    Caption  Page
2.1       Web architecture  8
2.2       Comparison of Documents Indexed by Google and Bing  10
2.3       Billions of documents indexed by Google  10
2.4       Billions of documents indexed by Bing  11
2.5       Comparison of documents indexed by Google and Yahoo  11
2.6       World Population Penetration Rates  12
2.7       World Internet Users Growth  12
2.8       Internet Users in the World Distribution  13
2.9       Asia top Internet countries  13
2.10      Meta Search Engine Basic Architecture  16
2.11      Architecture of a search engine  18
2.12      Use of the Internet  22
2.13      Popular areas of applications of the Internet  22
2.14      Too many results to browse  23
2.15      Percentage of users getting information on first page  23
2.16      Users do not search beyond third level  24
2.17      Category Taxonomy Based Focused Approach  29
3.1       Google results for keyword Student  48
3.2       Changed Result from Google for keyword Student  51
3.3       High level architecture of proposed context based focused search engine  53
3.4       Context Based Index Structure  58
3.5       Structure of the node  59
3.6       Client Side Crawl Worker Activities  62
3.7       Algorithm for Crawl_Worker  63
3.8       Algorithm for URL_Mapper  64
3.9       Algorithm for Downloader  65
3.10      Data Flow among various architectural components  66
3.11      Example for the proposed architecture  69
4.1       Block Diagram for Context Based Relevance Calculator  72
4.2       WordNet Dictionary Structure Matrix  77
4.3       WordNet Storage Example  78
4.4       Result from WordNet for keyword Student  78
4.5       Result from WordNet for keyword Spider  79
4.6       Pseudo code for extraction of various contextual senses  81
4.7       Contextual senses from WordNet dictionary  81
5.1       Ranking Module and other components  97
5.2 (a)   Ordering results by proposed context based ranking mechanism  103
5.2 (b)   Ordering results by Page Rank ranking mechanism  104
5.3       Average Precision of Results by Proposed Ranking Mechanism Compared to Page Rank Ranking Mechanism  106
6.1       Back-Links  111
6.2       Block diagram for Back-link Extraction and Relevance Evaluation  112
6.3       Data Flow between Back-link Extraction and Relevance Evaluator Processes  115
6.4       Back-link Relevance  119
6.5       Comparison when URLs and URLs + Back-Links considered  126
6.6       Result Analysis in Scenario 1  127
6.7       Result Analysis in Scenario 2  127
7.1       Prototype for Context Based Focused Search Engine  130
7.2       An Instance of Table named words  132
7.3       An Instance of table named searchresults  133
7.4       An instance of table linkresults  134
7.5       An Instance of searchresultdetails  136
7.6       An Instance of linkresultdetails  137
7.7       An Instance of table urlkeywordscorefinal  139
7.8       An instance of user interface  140
7.9       Various contextual senses of Java displayed by search module to user  141
7.10      Ranked list of matched documents for Java Island  142
7.11      The web page corresponding to first link in ranked list  142
A.1       Web pages and CS association  162
A.2       Hyperlinked Structure  164
A.3       Relation between back-link page with the page it points to  167
B.1 (a)   Context score based ranking of URLs on topic Computer Mouse  170
B.1 (b)   Page Rank based ranking of URLs on topic Computer Mouse  170
B.2 (a)   Context score based ranking of URLs on topic Mouse Rodent  171
B.2 (b)   Page Rank based ranking of URLs on topic Mouse Rodent  171
B.3 (a)   Context score based ranking of URLs on topic Crane Bird  172
B.3 (b)   Page Rank based ranking of URLs on topic Crane Bird  172
B.4 (a)   Context score based ranking of URLs on topic Crane Machine  173
B.4 (b)   Page Rank based ranking of URLs on topic Crane Machine  173
B.5 (a)   Context score based ranking of URLs on topic Java Lang.  174
B.5 (b)   Page Rank based ranking of URLs on topic Java Lang.  174
B.6 (a)   Context score based ranking of URLs on topic Java Island  175
B.6 (b)   Page Rank based ranking of URLs on topic Java Island  175
B.7 (a)   Context score based ranking of URLs on topic Java Coffee  176
B.7 (b)   Page Rank based ranking of URLs on topic Java Coffee  176
B.8 (a)   Context score based ranking of URLs on topic Lion Animal  177
B.8 (b)   Page Rank based ranking of URLs on topic Lion Animal  177
B.9 (a)   Context score based ranking of URLs on topic Java Lang.  179
B.9 (b)   Google's ranking of URLs on topic Java Lang.  179
B.10 (a)  Context score based ranking of URLs on topic Java Island  180
B.10 (b)  Google's ranking of URLs on topic Java Island  180
B.11 (a)  Context score based ranking of URLs on topic Java Coffee  181
B.11 (b)  Google's ranking of URLs on topic Java Coffee  181
B.12 (a)  Context score based ranking of URLs on topic Crane Bird  182
B.12 (b)  Google's ranking of URLs on topic Crane Bird  182
B.13 (a)  Context score based ranking of URLs on topic Crane Machine  183
B.13 (b)  Google's ranking of URLs on topic Crane Machine  183
B.14 (a)  Context score based ranking of URLs on topic Lion Animal  184
B.14 (b)  Google's ranking of URLs on topic Lion Animal  184
B.15 (a)  Context score based ranking of URLs on topic Colt Young Horse  185
B.15 (b)  Google's ranking of URLs on topic Colt Young Horse  185

LIST OF TABLES

Table     Caption  Page
2.1       Inverted Index  20
3.1       Motivating Examples  48
4.1       Words occurrences and their corresponding accumulated weights  74
4.2       Contextual Senses of Word Wood  75
4.3       Comparison for Student  79
4.4       Comparison for Spider  80
4.5       List of Keywords (http://en.wikipedia.org/wiki/mouse_computing)  85
4.6       Contextual senses definition  86
4.7       Results w.r.t sense 3  86
4.8       Result w.r.t sense 6  87
4.9       Context Score for each Contextual Sense of word Mouse  87
4.10      Keywords Contextual Senses and computed context score (CSense/WP)  87
4.11      URLs Topic and computed context score  88
4.12      Computed context score for links from Google for keyword Mouse  90
4.13      Filtered document in sense of Computer Mouse  92
5.1       Top 20 URLs and their computed context score Rank on topic Mouse  99
5.2       Top 20 ranked URLs in descending order of computed rank  100
5.3       Page Rank ordering vs Context based ordering  102
6.1       Structure of the URL table  114
6.2       RS of Back-links  119
6.3       Matched URLs for query keyword Mouse  122
6.4       Back-Links and their computed rank  123
6.5       Combined list of Matched URLs + Back-Links with computed rank  123
6.6       Top 10 high rank URLs consisting of URLs and back-links  125
7.1       Result analysis of proposed CBFSE with Page Rank  143
7.2       Precision Table  145
A.1       Hyperlinked Structure  164