Contextual Search using Cognitive Discovery Capabilities


Blaze Melton
6 years ago

In this exercise, you will work with a sample application that uses the Watson Discovery service APIs for cognitive search use cases. Discovery service queries are used to extract and detect concepts, keywords, sentiment, entities such as people and companies, and relationships, as well as trends.

Contextual search is a powerful way to gather personalized results based on information from multiple structured and unstructured data repositories. Context is used to refine and target search results and their relevance. Without context, users often must sift through many irrelevant results before finding what they want. Contextual intelligence helps to assign confidence rankings to search results and streamlines the process of finding relevant, current data.

The Discovery service enables developers to build an automated data pipeline to ingest unstructured data. The service uses Natural Language Understanding and other cognitive services to enrich its understanding of the data. This process consists of automatically tagging the data with NLP metadata, then cleansing and normalizing it for improved data quality. Once the data is ingested and enriched, queries can be performed. In this exercise, we'll focus on using Discovery service queries, with examples of contextual search over news data sources.

Exercise:

1. Go to the Discovery News application in Bluemix at:

   Enter a company name as the search term, for instance Amazon. The app uses Discovery service APIs and queries to perform a contextual search of news articles previously ingested and enriched through the data pipeline and NLP process. It returns the following information:

   - the most frequently occurring topics (concepts), companies, and people (entities)
   - the news article search results, with links to each article's content, along with the document content sentiment score
   - a positive/negative sentiment rating of the company for content from 10 randomly selected news sources/sites
   - the sentiment trend along a timeline, based on mentions of the company along with the other companies it is most frequently mentioned with in the source articles

   Examine the contextual search results in each of the four sections:

   - Top Entities
   - Top Stories
   - Sentiment Analysis
   - Co-mentions & Trends


2. To get ready to examine and understand the Discovery service query that was used to perform this contextual search, let's review a few key concepts.

   Examples of keyword and entity queries:

   Queries can be structured with additional options, including concepts, sentiment, filtering, and aggregations, that provide deeper insights and identify patterns, clusters, and trends. The Discovery service provides a query tool that uses a simple query language for multiple query types, including boolean, filter, and aggregation queries, to discover patterns, trends, and answers.

   Aggregations are collections of occurrences of the keywords, concepts, and entities from the search results. Aggregations can be nested to extract information and get insights about other keywords, concepts, and entities that may be connected or related in the source content.

3. Examine the Discovery service query that was used to perform the news articles contextual search:

   - In the Top Entities block, click the View Query button. The query consists of the actions and results described in-line below:

       "return": "title,enrichedtitle.text,url,host,blekko.chrondate",
       "query": "\"amazon\",language:english",

     This is a simple keyword query for the company name.

       "aggregations": [
         "nested(enrichedtitle.entities).filter(enrichedtitle.entities.type:company).term(enrichedtitle.entities.text)",

     This aggregation collects the enriched data for companies: it selects entities of type company to get the company names mentioned in the news articles.

         "nested(enrichedtitle.entities).filter(enrichedtitle.entities.type:person).term(enrichedtitle.entities.text)",

     This aggregation collects the enriched data for entities of type person to get the names of people mentioned in the news articles.

         "term(enrichedtitle.concepts.text)",

     This aggregation collects the enriched data for concepts to get the names of topics mentioned in the news articles.

4. Click on Response to view the response data returned from the query. The response data consists of the most frequently occurring company names, names of people, and topics, with the number of occurrences, in sorted order.
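The query fragments shown in step 3 can be assembled programmatically. The following is a minimal Python sketch, not the application's actual code: the function name `build_top_entities_query` and the helper `entity_terms` are hypothetical, the field names are copied as they appear in this exercise, and lowercasing the company name is an assumption based on the "amazon" example. How the payload is sent to the Discovery service is deliberately left out.

```python
import json


def build_top_entities_query(company: str) -> dict:
    """Assemble the Top Entities query parameters as a plain dict.

    Hypothetical helper: field names are taken from the exercise text,
    and nothing here calls the Discovery service itself.
    """

    def entity_terms(entity_type: str) -> str:
        # nested(...) scopes the aggregation to the entities list,
        # filter(...) keeps one entity type, term(...) counts entity names.
        return (
            "nested(enrichedtitle.entities)"
            f".filter(enrichedtitle.entities.type:{entity_type})"
            ".term(enrichedtitle.entities.text)"
        )

    return {
        "return": "title,enrichedtitle.text,url,host,blekko.chrondate",
        # Assumption: the keyword is lowercased and quoted, as in the example.
        "query": f'"{company.lower()}",language:english',
        "aggregations": [
            entity_terms("company"),
            entity_terms("person"),
            "term(enrichedtitle.concepts.text)",
        ],
    }


top_entities_query = build_top_entities_query("Amazon")
print(json.dumps(top_entities_query, indent=2))
```

Building the aggregation strings from one helper keeps the company and person variants identical except for the entity type, which is the only difference between them in the query above.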

   Click on the GoBack button.

5. In the Top Stories section, click on the View Query button. Notice that the same query is used to retrieve data for the most frequently appearing stories, based on the enriched title and extracted concepts. Click on the Response Data button. The response data includes the document title, URL, enriched title, host website, and sentiment score. Note that in some cases the enriched title may differ from the document title, either to add more context or to remove irrelevant information such as URL strings in titles.

6. In the Sentiment Analysis section, click on the View Query button. Examine the query and notice the stanza that extracts content sentiment:

         "term(blekko.basedomain).term(docsentiment.type)",

   This aggregation collects the enriched data for content sentiment of the news articles.

   The next aggregations collect the enriched data for content sentiment of the news articles, including the minimum and maximum sentiment scores and the sentiment trend along a timeline for each mention of the company plus co-mentioned companies:

         "term(docsentiment.type)",
         "min(docsentiment.score)",
         "max(docsentiment.score)",
         "filter(enrichedtitle.entities.type::company).term(enrichedtitle.entities.text).timeslice(blekko.chrondate,1day).term(docsentiment.type)"
       ],
       "filter": "blekko.hostrank>20,blekko.chrondate> ,blekko.chrondate < "

   Click on the Response Data button. The response data includes two sections:

   The first section provides the count of all documents queried that have a positive, negative, or neutral sentiment score.

   The second section provides the positive/negative/neutral sentiment document count for each of the 10 randomly selected news sites the content was obtained from.
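As an illustration of how the per-site section could be read, here is a small Python sketch. It assumes the `term(blekko.basedomain).term(docsentiment.type)` aggregation comes back as nested `results` lists with `key` and `matching_results` fields, the same shape as the response excerpt shown for co-mentions in this exercise. The function name `sentiment_by_site` is hypothetical, and the sample sites and counts are invented for illustration only; they are not real results.

```python
def sentiment_by_site(aggregation: dict) -> dict:
    """Map each news site to {sentiment type: document count}.

    Assumes a term aggregation whose buckets each carry a nested
    term aggregation on docsentiment.type.
    """
    summary = {}
    for site in aggregation.get("results", []):
        counts = {}
        for nested in site.get("aggregations", []):
            if nested.get("field") != "docsentiment.type":
                continue
            for bucket in nested.get("results", []):
                counts[bucket["key"]] = bucket["matching_results"]
        summary[site["key"]] = counts
    return summary


# Invented sample data (placeholder domains, made-up counts).
sample_aggregation = {
    "type": "term",
    "field": "blekko.basedomain",
    "results": [
        {
            "key": "news-site-a.example",
            "matching_results": 5,
            "aggregations": [
                {
                    "type": "term",
                    "field": "docsentiment.type",
                    "results": [
                        {"key": "positive", "matching_results": 3},
                        {"key": "negative", "matching_results": 2},
                    ],
                }
            ],
        },
        {
            "key": "news-site-b.example",
            "matching_results": 4,
            "aggregations": [
                {
                    "type": "term",
                    "field": "docsentiment.type",
                    "results": [
                        {"key": "neutral", "matching_results": 4},
                    ],
                }
            ],
        },
    ],
}

by_site = sentiment_by_site(sample_aggregation)
for site_name, counts in by_site.items():
    print(site_name, counts)
```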

7. In the Co-mentions & Trends section, click on the Response Data button. The response data includes two sections:

   The first section provides the count of all documents in which the company name occurs that have a positive, negative, or neutral sentiment score, along with the individual document sentiment scoring detail for each.

   The second section provides, for each of the top co-mentioned companies in the documents, the number of matches and the sentiment score document counts of all the documents with the co-mentioned company, followed by the individual document sentiment scoring detail.

       "key": "Google",
       "matching_results": 3599,
       "aggregations": [
         {
           "type": "timeslice",
           "field": "blekko.chrondate",
           "interval": "1d",
           "results": [
             {
               "key_as_string": " ",
               "key": ,
               "matching_results": 8,
               "aggregations": [
                 {
                   "type": "term",
                   "field": "docsentiment.type",
                   "results": [
                     { "key": "negative", "matching_results": 6 },
                     { "key": "positive", "matching_results": 2 }
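The nested response above can be flattened into one row per day. The sketch below is a hedged Python illustration that follows the shape of the excerpt: a company bucket holding a `timeslice` aggregation, whose daily buckets each hold a `term` aggregation on `docsentiment.type`. The function name `daily_sentiment` is hypothetical. The counts come from the excerpt, but the date label and the numeric `key` are placeholders, because those values are missing from the excerpt.

```python
def daily_sentiment(company_bucket: dict) -> list:
    """Return (day label, documents that day, {sentiment: count}) tuples."""
    rows = []
    for aggregation in company_bucket.get("aggregations", []):
        if aggregation.get("type") != "timeslice":
            continue
        for day in aggregation.get("results", []):
            counts = {}
            for nested in day.get("aggregations", []):
                if nested.get("field") != "docsentiment.type":
                    continue
                for bucket in nested.get("results", []):
                    counts[bucket["key"]] = bucket["matching_results"]
            rows.append((day.get("key_as_string"), day["matching_results"], counts))
    return rows


google_bucket = {
    "key": "Google",
    "matching_results": 3599,
    "aggregations": [
        {
            "type": "timeslice",
            "field": "blekko.chrondate",
            "interval": "1d",
            "results": [
                {
                    # Placeholders: the real values are elided in the excerpt.
                    "key_as_string": "DAY-PLACEHOLDER",
                    "key": 0,
                    "matching_results": 8,
                    "aggregations": [
                        {
                            "type": "term",
                            "field": "docsentiment.type",
                            "results": [
                                {"key": "negative", "matching_results": 6},
                                {"key": "positive", "matching_results": 2},
                            ],
                        }
                    ],
                }
            ],
        }
    ],
}

trend = daily_sentiment(google_bucket)
for day_label, total, counts in trend:
    print(day_label, total, counts)
```

A flat list of rows like this is easier to feed to a charting component than the nested structure, which is one plausible way a trend timeline could be drawn from this data.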

Learn more about Watson Discovery Service

View these education modules on the Watson Discovery service to learn more:

https://youtu.be/9ks-ceg6kps
https://www.youtube.com/watch/?

Watson Discovery Service key use cases:

Additional use cases are described for financial research, supply chain, customer behavior insights, field engineer advisor, and surgical knowledge base in the Architecture Center for Cognitive Discovery, which also provides detailed information on how to work with the Watson Discovery service to create your document store, create queries, and implement or integrate the service in your application.

- Using the Watson Discovery Service: getting started documentation and query guide
- Step-by-step tutorial on using the Watson Discovery service in Bluemix and building custom queries with the Discovery query tool
- More information regarding contextual search using cognitive discovery capabilities
- Complete source code for the application used in this exercise is available on GitHub in the project repository
