VALLIAMMAI ENGINEERING COLLEGE SRM Nagar, Kattankulathur DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING QUESTION BANK VII SEMESTER


VALLIAMMAI ENGINEERING COLLEGE, SRM Nagar, Kattankulathur - 603 203
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
QUESTION BANK
VII SEMESTER
CS6007 - INFORMATION RETRIEVAL
Regulation 2013
Academic Year 2018-19
Prepared by Dr. M. Senthil Kumar, Associate Professor/CSE

SUBJECT: CS6007 INFORMATION RETRIEVAL
SEM/YEAR: VII/IV

UNIT I - INTRODUCTION
Introduction - History of IR - Components of IR - Issues - Open source Search engine Frameworks - The impact of the web on IR - The role of artificial intelligence (AI) in IR - IR Versus Web Search - Components of a Search engine - Characterizing the web.

PART-A
1. Discuss Peer to Peer Search.
2. Identify the need for Information Retrieval.
3. List and explain the components of the IR block diagram.
4. List the fundamental concepts in IR.
5. Express the need for tiered indexes.
6. Interpret the role of Artificial Intelligence (AI) in IR.
7. Differentiate data retrieval and information retrieval.
8. Give the components of a Search Engine and the performance measures.
9. What is an extractor?
10. Show the issues that affect IR. [BTL 3 Apply]
11. Give the purpose of Query Interface. [BTL 6 Create]
12. Summarize the queries of IR. [BTL 5 Evaluate]
13. Design the IR architecture diagram. [BTL 6 Create]
14. State the impact of WEB on IR.
15. Show the type of natural language technology used in information retrieval. [BTL 3 Apply]
16. Define Information Retrieval.
17. What is a search engine?
18. Compare IR vs Web Search.
19. Illustrate the function of Information Retrieval System. [BTL 3 Apply]
20. Summarize on text acquisition. [BTL 5 Evaluate]

PART-B
1. i) Summarize the history of IR. (7)
   ii) Explain the purpose of Information Retrieval System. (6)
2. Describe the various components of Information Retrieval System with a neat diagram. (13)
3. i) Define Information Retrieval system and its features. (4)
   ii) Describe the different stages of IR system. (9)
4. i) Identify the various issues in IR system. (7)
   ii) Examine the various impacts of WEB on IR. (6)
5. Discuss in detail the framework of Open Source Search engine with necessary diagrams. (13)
6. i) Compare in detail Information Retrieval and Web Search with examples. (8)
   ii) the fundamental concepts involved in IR system. (5)
7. Demonstrate the role of Artificial Intelligence in Information Retrieval Systems. (13)
   BTL levels listed after the questions above: BTL 5 Evaluate, BTL 1 Remember, BTL 1 Remember, BTL 1 Remember, BTL 2 Understand, BTL 3 Apply
8. i) Describe the various components of a Search Engine. (8)
   ii) Express the various Search Engines available in the current world. (5)
9. i) Formulate the working of Search Engine. (8) [BTL 6 Create]
   ii) Generalize the Process of Search Engine in detail. (5)
10. i) Demonstrate the working of IR architecture with a diagram. (6)
    ii) Infer how designing parsing and scoring functions works in detail. (7) [BTL 3 Apply]
11. i) Define Information Retrieval. (2)
    ii) Describe in detail the IR system, fundamental concepts, need and purpose of the system. (4+4+3)
12. Explain how to characterize the web in detail. (13)
13. Explain the different types of computer software used in computer architecture. (13)
14. i) Differentiate database and Information Retrieval with example. (4)
    ii) Summarize the functions and features of Information Retrieval Systems. (9)

PART-C
1. Create an open source search engine like Google with suitable functionalities. [BTL 6 Create]
2. Evaluate the best search engines other than Google and explain any five of them in detail. [BTL 5 Evaluate]
3. how the AI impact Search and Search Engine optimization
4. Generalize the Deep Learning and Human Learning capabilities in Future of Search engine Optimization. [BTL 6 Create]

UNIT II - INFORMATION RETRIEVAL
Boolean and vector-space retrieval models - Term weighting - TF-IDF weighting - cosine similarity - Preprocessing - Inverted indices - efficient processing with sparse vectors - Language Model based IR - Probabilistic IR - Latent Semantic Indexing - Relevance feedback and query expansion.

PART-A
1. Demonstrate probabilistic Information Retrieval. [BTL 3 Apply]
2. the Boolean model.
3. Construct the Vector space model representation. [BTL 3 Apply]
4. List the classes of retrieval model.
5. Define Retrieval model.
6. Express language modelling with example.
7. Illustrate similarity measure. [BTL 3 Apply]
8. the problems in lexical semantics.
9. Differentiate language model and naïve Bayes.
10. Formulate the Bayesian rule. [BTL 6 Create]
11. What is meant by sparse vector?
12. Design an Inverted file with an example. [BTL 6 Create]
13. Evaluate the goals of LSI. [BTL 5 Evaluate]
14. What is smoothing?
15. Give probabilistic approaches to IR.
16. What is meant by Zone Index?
17. Interpret cosine similarity measure.
18. relevance feedback
19. List the steps involved in preprocessing.
20. Generalize on why distance is not preferred compared to angle. [BTL 5 Evaluate]

PART-B
1. i) Express what is Boolean retrieval model. (4)
   ii) Discuss the Boolean retrieval in detail with diagram. (9) [BTL 2 Understand]
2. Illustrate the Vector space retrieval model with example. (13) [BTL 3 Apply]
3. Describe the basic concepts of Cosine similarity. (13)
4. Develop an example to implement term weighting (min docs = 5). (13) [BTL 6 Create]
5. i) Tabulate the common preprocessing steps. (4)
   ii) Describe the document preprocessing steps in detail. (9)
6. i) Discuss in detail term frequency and Inverse Document Frequency. (7)
   ii) Compute TF-IDF, given a document containing terms with the frequencies A(3), B(2), C(1). Assume a collection of 10,000 documents in which the document frequencies of these terms are A(50), B(1300), C(250). (6)
7. i) Explain Latent Semantic Indexing and latent semantic space with an illustration. (9)
   ii) the use of LSI in Information Retrieval. What is its need in synonymy and semantic relatedness? (4)
8. i) Examine how to form a binary term-document incidence matrix. (7)
   ii) Give an example for the above. (6)
9. Describe document preprocessing and its stages in detail. (13)
10. i) Discuss the structure of inverted indices. (7)
    ii) Discuss the searching process in inverted file. (6)
11. i) Why do we need sparse vectors? (4) [BTL 5 Evaluate]
    ii) Explain sparse vectors and their efficiency with diagram. (9)
12. i) the language model based IR and its probabilistic representation. (7)
    ii) Compare Language model vs Naive Bayes and Language model vs Vector space model. (6)
13. Differentiate the various query expansion methods with relevance feedback methods. (13)
14. i) Apply how Probabilistic approaches to Information Retrieval are done. (7)
    ii) Illustrate the following: (6) [BTL 3 Apply]
        a) Probabilistic relevance feedback
        b) Pseudo relevance feedback
        c) Indirect relevance feedback

PART-C
1. Compose the information Retrieval services of the internet with suitable design. [BTL 6 Create]
2. Assess the best Language model to computational linguistics for investigating the use of software to translate text or speech from one language to another. [BTL 5 Evaluate]
3. Contrast the uses of probabilistic IR in indexing the search in the internet.
4. Create a Relevance feedback mechanism for your college website search in the internet. [BTL 6 Create]
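The TF-IDF computation asked for in Unit II, Part B, question 6 ii) can be sketched as follows. The question does not fix a weighting variant, so this sketch assumes raw term frequency multiplied by log10(N / df); other variants (natural logarithm, log-scaled tf) give different numbers. The function name tf_idf and the variable names are illustrative only.

```python
import math

def tf_idf(tf, df, n_docs):
    """Assumed variant: raw term frequency times log10(N / df)."""
    return tf * math.log10(n_docs / df)

N = 10_000                                    # collection size from the question
term_freq = {"A": 3, "B": 2, "C": 1}          # term frequencies in the document
doc_freq = {"A": 50, "B": 1300, "C": 250}     # document frequencies in the collection

weights = {t: tf_idf(term_freq[t], doc_freq[t], N) for t in term_freq}
for term, weight in weights.items():
    print(f"{term}: {weight:.3f}")
```

Under this assumed variant the weights come out to about 6.903 for A, 1.772 for B and 1.602 for C: the rare term A dominates, while the frequent term B is discounted despite its higher count than C.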

UNIT III - WEB SEARCH ENGINE INTRODUCTION AND CRAWLING
Web search overview, web structure, the user, paid placement, search engine optimization/spam. Web size measurement - search engine optimization/spam - Web Search Architectures - crawling - meta-crawlers - Focused Crawling - web indexes - Near-duplicate detection - Index Compression - XML retrieval.

PART-A
1. Express the basics of web search with a neat diagram.
2. Define Pay for Placement.
3. What is meant by Search Engine Optimization?
4. List the need of web search engine.
5. Draw the architecture of search engine.
6. Distinguish parallel crawler and meta crawler.
7. List the SPAM Techniques.
8. Evaluate use of Full text indexing and In human indexing. [BTL 5 Evaluate]
9. State the issues in search engines.
10. Design the Politeness policies used in web crawler. [BTL 6 Create]
11. Classify the ways to identify duplication.
12. How to apply duplicate detection to web pages? [BTL 3 Apply]
13. Assess the need for keyword stuffing. [BTL 5 Evaluate]
14. What are the challenges in data traversing by search engines?
15. Show the applications of search engines. [BTL 3 Apply]
16. Point out the use of Web indexing and inversion of indexing process.
17. What is a focused crawler?
18. Illustrate the hashing technique with example. [BTL 3 Apply]
19. Classify the types of search engines.
20. Generalize on XML Retrieval. [BTL 6 Create]

PART-B
1. Discuss the Search Engine Optimization/SPAM in detail. (13)
2. i) Describe in detail XML Retrieval. (9)
   ii) What is Structured and Unstructured Retrieval? (4)
3. i) List the types of Search Engine and explain them. (7)
   ii) Describe the working of Search Engine. (6)
4. Design and develop a Web search Architecture and the components of search engine and its issues. (13) [BTL 6 Create]
5. i) What is P4P? Elaborate on Paid Placement. (7)
   ii) What is the purpose of Web indexing? (6)
6. i) Summarize the working of WEB CRAWLER with its diagram. (8)
   ii) Distinguish visual vs programmatic crawler. (5)
7. i) Differentiate meta crawler and focused crawler. (8)
   ii) on URL normalization. (5)
8. Recommend the need for Near-Duplication Detection by the ways to identify the duplication. (13) [BTL 5 Evaluate]
9. i) Examine the behavior of web crawler and the outcome of crawling policies. (5)
   ii) Illustrate the following: (8) [BTL 3 Apply]
       a) Focused Crawling
       b) Deep web
       c) Distributed crawling
       d) Site map
10. i) Explain the overview of Web search. (8)
    ii) Describe the structure of WEB and its characteristics. (5)
11. Discuss the process of index compression in detail. (13)
12. i) Explain the need for Web Search Engine. (6)
    ii) Point out the challenges in data traversing by search engine and how will you overcome it. (7)
13. Describe the following with example: (13)
    i) Bag of Words
    ii) Shingling
    iii) Hashing
    iv) Min Hash and Sim Hash
14. i) Based on the Application of Search Engines, how will you categorize them and what are the issues faced by them? (9) [BTL 3 Apply]
    ii) Demonstrate Search Engine Optimization. (4)

PART-C
1. Develop a web search structure for searching a newly hosted web domain by the naïve user with step by step procedure. [BTL 6 Create]
2. Grade the optimization techniques available for search engine and rank them by your justification. [BTL 5 Evaluate]
3. Classify the web crawling methods and illustrate the effects of different crawling policies on data collection.
4. Formulate the application of Near Duplicate Document Detection techniques and also Generalize the advantages in Plagiarism checking. [BTL 6 Create]

UNIT IV - WEB SEARCH LINK ANALYSIS AND SPECIALIZED SEARCH
Link Analysis - hubs and authorities - Page Rank and HITS algorithms - Searching and Ranking - Relevance Scoring and ranking for Web - Similarity - Hadoop & Map Reduce - Evaluation - Personalized search - Collaborative filtering and content-based recommendation of documents and products - handling invisible Web - Snippet generation, Summarization, Question Answering, Cross-Lingual Retrieval.

PART-A
1. Describe the main idea of Link Analysis.
2. Illustrate the web as a directed graph. [BTL 3 Apply]
3. List the issues of page rank algorithm and characteristics of Map reduce Strategy.
4. how citation analysis is done.
5. Quote the importance of Anchor text and indexing. [BTL 1 Remember]
6. Define Hub. [BTL 1 Remember]

7. What is meant by Query independent ordering?
8. State the aim of question answering.
9. Differentiate between citations and links.
10. Show the working of random walks in Graphs. [BTL 3 Apply]
11. Evaluate on Recommender System. [BTL 5 Evaluate]
12. Define Lossy compression mechanisms.
13. Integrate the ideas of HITS Algorithm. [BTL 6 Create]
14. Assess on the parts of Search engine. [BTL 5 Evaluate]
15. What is map reduce and snippet generation?
16. Express Recall at rank and Precision at rank.
17. Formulate the examples for boolean queries. [BTL 6 Create]
18. Categorize the modules of Hadoop Framework.
19. the Collaborative filtering and challenges.
20. Demonstrate Bayesian Inferencing. [BTL 3 Apply]

PART-B
1. i) Define Link Analysis and explain in detail. (7)
   ii) Describe in detail HUBS and Authorities. (6)
2. i) Give the concept of PAGE Ranking in detail. (6) [BTL 2 Understand]
   ii) Summarize the preprocessing and Query Processing of Page Rank along with its issues. (7)
3. Discuss in detail HITS Algorithm with necessary examples. (13)
4. Illustrate the abstract search engine and how will you speed snippet generation? Explain with algorithm. (13) [BTL 3 Apply]
5. Describe the aim and purpose of Question Answering in detail. (13)
6. i) Point out stages of summarization. (7)
   ii) how Handling Invisible Web is done. (6)
7. i) Evaluate the concept of Personalized Search. (7)
   ii) Assess the methodology used in it. (6)
8. i) content based recommendations of documents and products. (7)
   ii) the process of cross lingual retrieval. (6)
9. i) Formulate the working of HADOOP. (7)
   ii) Compose the Map Reduce in detail. (6)
10. i) Define contextual computing and discuss Personalized search. (9)
    ii) Describe how to solve privacy problems. (4)
11. i) Explain working of collaborative filtering by analyzing any two case studies. (8)
    ii) Give the challenges of Collaborative filtering. (5)
    BTL levels listed after the questions above: BTL 5 Evaluate, BTL 6 Create, BTL 1 Remember
12. Describe the Searching and Ranking process in detail with necessary examples. (13) [BTL 1 Remember]
13. i) Show the performance of TREC Systems. (7) [BTL 3 Apply]
    ii) Illustrate the CLIR Approaches. (6)
14. i) Describe in detail SNIPPET Generation along with example. (6)
    ii) Summarize in detail community-based Question Answering system in IR. (7)

PART-C
1. Generalize how Link analysis has been instrumental in the development of web search. [BTL 6 Create]
2. Summarize the impacts of in-links and link-spam in the link analysis. [BTL 5 Evaluate]
3. any five online utility tools available for searching and ranking in information retrieval.
4. Design a Plan to overcome the challenges in Cross-Lingual Retrieval. [BTL 6 Create]

UNIT V - DOCUMENT TEXT MINING
Information filtering; organization and relevance feedback - Text Mining - Text classification and clustering - Categorization algorithms: naïve Bayes; decision trees; and nearest neighbor - Clustering algorithms: agglomerative clustering; k-means; expectation maximization (EM).

PART-A
1. Distinguish IF vs IR.
2. Define the general features of Filtering.
3. Give the idea of filtering rules and attributes.
4. Compare Automatic vs Social Filtering.
5. What is the need of Filtering against spamming?
6. Give some examples of EM.
7. What is Text Mining and Dendrogram?
8. Evaluate the process of Text Mining. [BTL 5 Evaluate]
9. Formulate the estimation of Multinomial document model and Bernoulli document model. [BTL 6 Create]
10. the types of filters
11. Integrate the problems of k-means method. [BTL 6 Create]
12. State positive and negative feedback.
13. Summarize relevance feedback with example.
14. What are the types of data in clustering analysis?
15. Point out the advantages and disadvantages of Decision Tree algorithm.
16. Show the applications of text mining. [BTL 3 Apply]
17. Illustrate the advantages of Naïve Bayes. [BTL 3 Apply]
18. Assess how to measure distance of clusters. [BTL 5 Evaluate]
19. Distinguish Supervised learning and Unsupervised Learning.
20. Summarize the major clustering approaches. [BTL 3 Apply]

PART-B
1. i) List the general features of Filtering, rules and its attributes. (7)
   ii) Describe the filtering using IR in detail. (6)
2. i) Describe in detail the various types of filters, Profiling and filtering technologies. (9)
   ii) Describe in detail the Multiple-Bernoulli and the multinomial model. (4)
3. i) Give the examples of EM method. (4)
   ii) Summarize the profiling and Filtering Technologies. (9)
4. i) Express the process of Text Mining. (6)
   ii) Explain the challenges and application of Text Mining. (7)
5. the procedure involved in Expectation Maximization along with the steps involved in it. (13)
6. i) Define Topic detection and tracking, Clustering in TDT. (4)
   ii) Examine in detail Cluster Analysis in Text Clustering. (9)
7. Illustrate in detail with examples Organization and Relevance feedback. (13) [BTL 3 Apply]
8. i) Evaluate the Agglomerative Clustering and HAC in detail. (7) [BTL 5 Evaluate]
   ii) Evaluate the various classification methods of Text. (6)
9. i) Summarize on Clustering Algorithms. (6) [BTL 6 Create]
   ii) Rearrange the Types of data in cluster analysis. (7)
10. i) the working of Nearest Neighbor algorithm along with one representation. (7)
    ii) the K-Means Clustering method and the problems in it. (6)
11. about Decision Tree Algorithm with illustration. (13)
12. i) Describe in detail Text Mining. (7)
    ii) Examine the process of Mining with detailed example. (6)
13. i) Discuss in detail Text Classification. (7)
    ii) Summarize Text Clustering in detail. (6)
14. i) Apply Naïve Bayes Algorithm for an example. (7) [BTL 3 Apply]
    ii) Demonstrate its working in detail. (6)

PART-C
1. Prepare how Information Filtering has been instrumental in the development of document text mining by the massive users in internet. [BTL 6 Create]
2. Rank the impacts of Categorization and clustering of text in the mining with the suitable examples. [BTL 5 Evaluate]
3. any online utility tools available for text analytics software to transform unstructured text into structured data in text mining.
4. Design a Plan to overcome the gap in decision theoretic approach for evaluation in text mining. [BTL 6 Create]
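
As a starting point for the k-means questions in Unit V (Part A question 11, Part B question 10 ii), the sketch below shows the two alternating steps of the method: assign each point to its nearest centroid, then move each centroid to the mean of its members. The sample points, the initial centroids and all function names are assumptions chosen for illustration; in text clustering the points would typically be document vectors such as TF-IDF weights.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    """Assignment step: put each point in the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: squared_distance(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters

def kmeans(points, centroids, max_iterations=100):
    """Alternate assignment and update steps until the centroids stop moving."""
    centroids = list(centroids)
    for _ in range(max_iterations):
        clusters = assign(points, centroids)
        # Update step: mean of the members; an empty cluster keeps its centroid.
        updated = [
            tuple(sum(coords) / len(members) for coords in zip(*members))
            if members else old
            for old, members in zip(centroids, clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)

# Hypothetical 2-D data with two well-separated groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
initial = [(1.0, 1.0), (9.0, 8.0)]

final_centroids, final_clusters = kmeans(points, initial)
print(final_centroids)
print(final_clusters)
```

The result depends on the initial centroids, which is one of the known problems of the method: a poor choice can leave a cluster empty or converge to a poor local optimum, and the number of clusters has to be fixed in advance.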