International Journal of Scientific & Engineering Research Volume 7, Issue 12, December-2016 ISSN

Similar documents
Implementation of Smart Question Answering System using IoT and Cognitive Computing

Question Answering Systems

TERM BASED WEIGHT MEASURE FOR INFORMATION FILTERING IN SEARCH ENGINES

Shrey Patel B.E. Computer Engineering, Gujarat Technological University, Ahmedabad, Gujarat, India

Domain-specific Concept-based Information Retrieval System

Information Retrieval

Information Retrieval

Natural Language Processing. SoSe Question Answering

A Survey Of Different Text Mining Techniques Varsha C. Pande 1 and Dr. A.S. Khandelwal 2

Natural Language Processing SoSe Question Answering. (based on the slides of Dr. Saeedeh Momtazi)

A Comparison Study of Question Answering Systems

LIDER Survey. Overview. Number of participants: 24. Participant profile (organisation type, industry sector) Relevant use-cases

Natural Language Processing SoSe Question Answering. (based on the slides of Dr. Saeedeh Momtazi) )

TEXT MINING APPLICATION PROGRAMMING

Final Project Discussion. Adam Meyers Montclair State University

Correlation to Georgia Quality Core Curriculum

INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET) CONTEXT SENSITIVE TEXT SUMMARIZATION USING HIERARCHICAL CLUSTERING ALGORITHM

A BFS-BASED SIMILAR CONFERENCE RETRIEVAL FRAMEWORK

IMPROVING INFORMATION RETRIEVAL BASED ON QUERY CLASSIFICATION ALGORITHM

Query Difficulty Prediction for Contextual Image Retrieval

Review on Text Mining

CS6200 Information Retrieval. Jesse Anderton College of Computer and Information Science Northeastern University

Text Mining: A Burgeoning technology for knowledge extraction

Research Article. August 2017

An Approach To Web Content Mining

Question Answering Approach Using a WordNet-based Answer Type Taxonomy

ResPubliQA 2010

In the recent past, the World Wide Web has been witnessing an. explosive growth. All the leading web search engines, namely, Google,

Information Retrieval CS Lecture 01. Razvan C. Bunescu School of Electrical Engineering and Computer Science

Web-based Question Answering: Revisiting AskMSR

Karami, A., Zhou, B. (2015). Online Review Spam Detection by New Linguistic Features. In iconference 2015 Proceedings.

Ontology-Based Web Query Classification for Research Paper Searching

AUTOMATIC VISUAL CONCEPT DETECTION IN VIDEOS

QANUS A GENERIC QUESTION-ANSWERING FRAMEWORK

Introduction to Text Mining. Hongning Wang

Finding Topic-centric Identified Experts based on Full Text Analysis

An Ontology Based Question Answering System on Software Test Document Domain

CLARA: a multifunctional virtual agent for conference support and touristic information

Meta-Content framework for back index generation

Keyword Extraction by KNN considering Similarity among Features

ISSN: [Sugumar * et al., 7(4): April, 2018] Impact Factor: 5.164

Knowledge Retrieval. Franz J. Kurfess. Computer Science Department California Polytechnic State University San Luis Obispo, CA, U.S.A.

Classification and Comparative Analysis of Passage Retrieval Methods

Empirical Analysis of Single and Multi Document Summarization using Clustering Algorithms

ISSUES IN INFORMATION RETRIEVAL Brian Vickery. Presentation at ISKO meeting on June 26, 2008 At University College, London

Enhancing applications with Cognitive APIs IBM Corporation

NLP Seminar, Fall 2009, TAU. A presentation by Ariel Stolerman Instructor: Prof. Nachum Dershowitz. 1 Question Answering

String Vector based KNN for Text Categorization

Chapter 27 Introduction to Information Retrieval and Web Search

CHAPTER 5 SEARCH ENGINE USING SEMANTIC CONCEPTS

QA System: A Perspective View

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p.

Chinese Question Answering using the DLT System at NTCIR 2005

Chrome based Keyword Visualizer (under sparse text constraint) SANGHO SUH MOONSHIK KANG HOONHEE CHO

A Multilingual Social Media Linguistic Corpus

ACCESSING DATABASE USING NLP

Taming Text. How to Find, Organize, and Manipulate It MANNING GRANT S. INGERSOLL THOMAS S. MORTON ANDREW L. KARRIS. Shelter Island

On the automatic classification of app reviews

Data Science Course Content

Character Recognition from Google Street View Images

Relevance Feature Discovery for Text Mining

The Comparative Study of Machine Learning Algorithms in Text Data Classification*

Information Retrieval

HYBRIDIZED MODEL FOR EFFICIENT MATCHING AND DATA PREDICTION IN INFORMATION RETRIEVAL

Automated Tagging for Online Q&A Forums

Classifying Twitter Data in Multiple Classes Based On Sentiment Class Labels

Telling Experts from Spammers Expertise Ranking in Folksonomies

A Deep Relevance Matching Model for Ad-hoc Retrieval

Enhanced Retrieval of Web Pages using Improved Page Rank Algorithm

Building Search Applications

Classification and retrieval of biomedical literatures: SNUMedinfo at CLEF QA track BioASQ 2014

Charles University at CLEF 2007 CL-SR Track

Ontology based Chatbot (For E-commerce Website)

Binju Bentex *1, Shandry K. K 2. PG Student, Department of Computer Science, College Of Engineering, Kidangoor, Kottayam, Kerala, India

Analysis on the technology improvement of the library network information retrieval efficiency

INFORMATION RETRIEVAL SYSTEM: CONCEPT AND SCOPE

Information Retrieval CSCI

A Novel Categorized Search Strategy using Distributional Clustering Neenu Joseph. M 1, Sudheep Elayidom 2

A Review: Content Base Image Mining Technique for Image Retrieval Using Hybrid Clustering

VALLIAMMAI ENGINEERING COLLEGE SRM Nagar, Kattankulathur DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING QUESTION BANK VII SEMESTER

Annotating Spatio-Temporal Information in Documents

Department of Electronic Engineering FINAL YEAR PROJECT REPORT

MPI-INF AT THE NTCIR-11 TEMPORAL QUERY CLASSIFICATION TASK

FAQ Retrieval using Noisy Queries : English Monolingual Sub-task

Best Customer Services among the E-Commerce Websites A Predictive Analysis

Automated Online News Classification with Personalization

Fast and Effective System for Name Entity Recognition on Big Data

Question Answering Using XML-Tagged Documents

Vision Plan. For KDD- Service based Numerical Entity Searcher (KSNES) Version 2.0

Encoding Words into String Vectors for Word Categorization

Overview of Classification Subtask at NTCIR-6 Patent Retrieval Task

Keywords APSE: Advanced Preferred Search Engine, Google Android Platform, Search Engine, Click-through data, Location and Content Concepts.

Semantic Extensions to Syntactic Analysis of Queries Ben Handy, Rohini Rajaraman

A Multiclassifier based Approach for Word Sense Disambiguation using Singular Value Decomposition

Design and Implementation of Search Engine Using Vector Space Model for Personalized Search

Embedding Intelligence through Cognitive Services

Predictive Coding. A Low Nerd Factor Overview. kpmg.ch/forensic

Advances in Natural and Applied Sciences. Information Retrieval Using Collaborative Filtering and Item Based Recommendation

A Multiclassifier based Approach for Word Sense Disambiguation using Singular Value Decomposition

WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS

Transcription:

55 The Answering System Using NLP and AI Shivani Singh Student, SCS, Lingaya s University,Faridabad Nishtha Das Student, SCS, Lingaya s University,Faridabad Abstract: The Paper aims at an intelligent learning system that will take a text file as an input and gain knowledge from the given text. Thus using this knowledge our system will try to answer questions queried to it by the user. The main goal of the Answering system (QAS) is to encourage research into systems that return answers because ample number of users prefer direct answers, and bring benefits of large-scale evaluation to QA task. Rachel Michael Student, SCS, Lingaya s University,Faridabad Dr. Poonam Tanwar Associate Prof., SCS Lingaya s University, Faridabad (ii) QA response with specific answer to a specific question instead of a list of documents. 1.1. APPROCHES in QA There are three major approaches to Answering Systems: Linguistic Approach, Statistical Approach and Pattern Matching Approach. Keywords: A. Linguistic Approach Answering System (QAS), Artificial Intelligence (AI), Natural Language Processing (NLP) 1. INTRODUCTION Answering (QA) is a research area that combines research from different fields, with a common subject, which are Information Retrieval (IR), Information Extraction (IE) and Natural Language Processing (NLP). Actually, current search engine just do document retrieval, i.e. given some keywords it only returns the relevant ranked documents that contain these keywords. They do not provide a precise answer to that. Hence QAS is designed to help people find specific answers to specific questions in restricted domain. QA systems are classified into two main categories, namely open-domain QA systems and closed- domain QA systems. -domain question answering deals with questions about nearly everything such as the World Wide Web. On the other hand, closed-domain question answering deals with questions under a specific domain (music, weather forecasting etc.) The domain specific QA system considers heavy use of natural language processing systems. QA is different from the search engines in two aspects: (i) In QA, query is the question not a keyword This approach understands natural language texts, linguistic techniques such as tokenization, POS tagging and parsing.[1] These are applied to reconstruct questions into a correct query that extracts the relevant answers from a structured database. The questions handled by this approach are of Factoid type and have a deep semantic understanding. B. Statistical Approach Availability of huge amount of data on the World Wide Web increased the importance of Statistical Approach. Statistical approaches and online text repositories are not dependent on structured query languages and can formulate queries in natural language. Statistical techniques: Support Vector Machine classifier, Bayesian classifiers, etc. C. Pattern Matching Approach This approach deals with expressive power of text pattern, it replaces the sophisticated processing involved in other computing approaches. This approach uses the communicative power of text patterns. This approach best suits to small and medium sized websites. The type of questions handled by this approach are mainly factoid based,

56 definition based and it has a less semantic understanding as compared to other approaches. Most of the pattern matching QA systems uses the surface text pattern. 1.1. QAS COMPONENTS Documentation and Passage Retrieval Processing The architecture of the QA system as mentioned earlier, would consist of following three modules: 1. processing module. 2. Document processing module. 3. Answer extraction and formulation module. TheDocument Processing Module It takes in the choice of the user for a particular passage from the displayed list.then using POS Tagger tags the tokens.with the help of tags finds the verbs in the passage. Using the list of irregular verbs and the logic for regular verbs a data structure (array) is created which contains the verbs along with their tenses and ing form. Fig.1 Architecture of QAS QA systems, as explained before, have a backbone consisting of three main parts: categorization of question, information retrieval, and answer extraction. Therefore, each of these three components attracted the attention of QA researchers. Classification class WHAT WHO HOW The -Answering Module First finds the verb in the question. Matches the verb just found with the tokens created in the document processing stage. According to the selected case for type of factual question (what, when, etc) it further tries to extract and formulate the answer. The user is first asked to select the passage of his choice and then the type of question. The processing module will process the question and pass it to the Answering module which will make use of the various extractions received from the Document Processing phase, along with the Processed Documents containing the tagged format of the original input document. By applying required algorithms this module will pass it to the Formulation module for getting the desired answer. Answer 2. LITERATURE SURVEY The Processing Module: It takes in a question from the user Using StringTokenizer tokenizes it and stores it in another data structure returns it for further use in the program Answering Processing Query/Questi on Classification The questions that the system receives can be divided into two major categories: FACTUAL & EXPERT. Factual questions are those which contain words like what, where, when, who, etc. Expert questions are those which contain words like how, why etc. Document Answer Type basic-what/ what-who /whatwhen/ whatwhere Money/ No./ Definition/ Title/ NNP/ Undefined basic-how how-manyhowlong how-much howmuchhow-far how-tall how-rich how-large Location WHERE WHEN WHICH NAME WHY WHOM Person / Manner Number Time/Distance Money / Price howmuch Undefined Distance Number Undefined which-who which-where which-when which-what name-who namewhere name-what Date Person Location Date NNP Person/ORG. Location Title / NNP Reason Person s usually gurantee to predictable language patterns, and therefore are categorized based on taxonomies. Taxonomies are distinguished into two main types: flat and

hierarchical taxonomies. Flat taxonomies have only one level of classes without having sub-classes, on t,he other hand hierarchical taxonomies have multi-level classes. Lehnert [2] proposed QUALM. QA system used a flat taxonomy with seventeen classes e.g. PERSON, PLACE, DATE, NUMBER, DEFINITION, ORGANIZATION, DESCRIPTION, ABBREVIATION, KNOWNFOR, RATE, LENGTH, MONEY, REASON, DURATION, PURPOSE, NOMINAL,OTHER. Zhang and Lee [3] compared various choices for machine learning classifiers using the hierarchical taxonomy propose such as: Support Vector Machines (SVM), Nearest Neighbors (NN), Naïve Bayes (NB), Decision Trees (DT). Information Retrieval Evaluate the use of named entities and of noun, verb, and prepositional phrases as exact match phrases in a document retrieval query. Gaizauskas, and Humphreys [4] described an approach to question answering Answer Extraction BING is Microsoft s answer to google and it was launched in 2009. Bing is the default search engine in Microsoft s web Browser.it is available in 40 languages. It provides different services including image, web and vedio search along with maps. Stoyanchev et al (StoQA), 2008: Contribution In their research, they presented a document retrieval experiment on a question answering system. They used exact phrases, as constituents to search queries. The process of extracting phrases was performed with the aid of named-entity (NE) recognition, stop-word lists, and parts-of-speech taggers. Wolfram Alpha is a Computational Knowledge Engine that introduces a fundamentally new way to get knowledge and answers, not by searching the Web sites, but by dynamic computations based on a vast collection of built-in data, algorithms, and methods. [7]. Wolfram Alpha returns an answer in a form of a table where information, which is relevant to a query, is separated by categories (e.g. an answer to a query about some person usually contains such se image, timeline, notable facts, familial relationships and others). Finding the answers by exploiting surface text information using manually constructed surface patterns. In order to enhance the poor recall of the manual hand-crafting patterns, many researchers gained text patterns automatically such as Xu et al [5]. 3. PROPOSED WORK Nowadays there are many different Answering Systems. Most popular QAS and their main characteristics are described below. Answerbag is question answering website where users can get answers to their questions, whether they're looking for facts, opinions or simply entertainment. s are answered by Answerbag professional researchers and community members.[8]so Answerbag can also be considered as an expert community question answering website.. Blurtit is an online question answering community that provides the answers users are looking for and gives them free, 24/7 access to a whole world of information, and to millions of knowledgeable friends. Blurtit knowledge base contains facts, information and users opinions, enlarging with each new answer. ELIZA One of the earliest QA systems was ELIZA, developed in 1964. One successful ELIZA application was DOCTOR, a computer program that basically interacted with users through a text chat interface, answering questions and responding to the users dialog in a way that mimicked the client-centered psychotherapy between a client (the user) and their therapist. Evi (originally known as TrueKnowledge) is a knowledge Web search engine that assists people gaining what they desire and require through Evi understanding of each user and the world they live in. [6]. QUORA is available in English and Spanish launched in 2010. The QUORA community includes some well know people, such as Dustin Moskovitz, Jimmy Wales. Kangavaril, 2008: Contribution The research presented a model for improving QA systems by query reformulation and answer validation. The model depends on previously asked questions along with the user feedback. Experimental environment and results: The system worked on a closed aerologic domain for forecasting weather information. Limitations:the model was tested in a sparse experimental environment in which the domain was very particular and the number of questions relatively small. Also, depending only on the users as a single source for validating answers is a two-edged weapon. Ask.com (originally known as Ask Jeeves) is a question answering-focused web search engine. The original idea behind Ask Jeeves was to allow users to obtain answers to questions posed on daily basis, natural language, as well as 57

by traditional keyword searching. The current Ask.com still supports this, with added support for math, dictionary, and conversion questions. This system tries to understand any users query and gives three forms of answer at once: a direct answer, a list of links to webpages on related topics and a list of similar questions with answers from other question answering websites. The general characteristics of described QAS are summarized in Table 1. 58 and analyzing information; can naturally be produced as QA queries. Can be used to make online lectures more oldschool type by allowing lectures to proceed only when questions related to the previous lecture are answered correctly. The existing system can be integrated with a search engine to enhance the performance. Table 1 QA System ELIZA Domain EVI Quora Bing Stoyanchev Wolfram/Alp ha Answerbag Blurtit Description Attempt to mimic basic human interaction Q&A exchanges. Specializes in knowledge base & semantic search Knowledge based answering ability Provide diff. services like image, web and video seach Extract phases Computation search engine Web based QA system People ask query of regular user provide answer based on their opinion. Depend on Previously asked. Year 1964 Web based QA 1996 2007 2009 CONCLUSION This paper describe about the Answering System for an English Language i.e. it receives query from the user and selects most appropriate answer. QAS is approach to find the correct answer to the question asked from user. This paper also describes different QAS approaches, different types of QAS.QA system help in improving system interaction. In this paper we also concentrated on finding the solution of some problem: Answer is restricted to a precise domain, user has to follow a particular path while entering a question and Extracting correct answer. The solution consists: semantic representation for Natural Language, effective logic is to be performed on them and developing a formalism to represent the answer verification and specific answer extraction. Thus there is great potential for exploring the challenges in QA domain. Kangavari ASK.com 2009 2008 2009 2003 2006 ACKNOWLEDGMENT 2008 We sincerely thank prof. PoonamTanwar, Computer department, Lingaya s University, Faridabad for her valuable guidance and motivation for this work. 4. APPLICATION AND FUTURE SCOPE QA system handles a question in natural language and tries to provide its answer in a single statement. With certain improvements; the proposed system can be used for the following applications: Adding speech recognition abilities to the current system will enable people with reading disabilities to take advantage of this system. Query processing in database systems and many analytical tasks that involve collecting, correlating REFERENCES

[1]. Saranya R, Christopher Augustine, Schemes and Approaches in Answering System, International Journal of Advanced Research Trends in Engineering and Technology (IJARTET),Vol. II, Special Issue X, March 2015. [2]. Lehnert, W., 1977. The Process of Answering A Computer Simulation of Cognition. Yale University, ISBN 0-470-26485-3. [3]. Zhang, D. & Lee, W., 2003. Classification using Support Vector Machines.In Proceedings of the 26th Annual International ACM SIGIR Conference. [4]. Gaizauskas, R. & Humphreys, K., 2000. A Combined IR/NLP Approach to Answering Against Large Text Collections.In Proceedings of the 6th Content-based Multimedia Information Access (RIAO-2000). [5]. Xu, J., Licuanan, A. &Weischedel R., 2003. TREC2003 QA at BBN: Answering Definitional s. In Proceedings of the 12th Text Retrieval Conference. [6]. The story of Evi.http://www.evi.com/about/. [7]. http://www.wolframalpha.com/about.html. [8]. About us, Answerbag2014. http://www.answerbag.com/about-us/ 59

60