Empowering People with Knowledge: The Next Frontier for Web Search
Wei-Ying Ma, Assistant Managing Director, Microsoft Research Asia


Important Trends for Web Search
  Organizing all information
  Addressing user's information need
    Intent
    Knowledge
    Semantic matching & task completion
  Searching content
  Searching apps & services
  The cloud platform and developer ecosystem

Library Card Index
  Search
    Paradigm: Query, Indexing, Documents
    Query: book title, author name, ...
    Indexing: inverted indices
    Documents: books
  Browsing
    Documents are organized into categories

The First Generation of Search Engines
  Essentially invented to replace the library card index
  Based on information retrieval techniques
  Search
    Paradigm: Query, Indexing, Documents, Ranking
    Query: any words appearing in pages
    Indexing: inverted indices
    Documents: pages, images
    Ranking: classical IR techniques + PageRank
  Browsing
    Pages are organized into categories
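The "Indexing: inverted indices" item above can be illustrated with a minimal sketch. This is a toy example, not how any production engine is built: the function names `build_inverted_index` and `search`, the whitespace tokenization, and the sample documents are all assumptions made for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive tokenization (assumption): lowercase and split on whitespace.
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query term (boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents.
docs = {
    1: "web search engines index pages",
    2: "library card index for books",
    3: "search the web for books",
}
index = build_inverted_index(docs)
print(sorted(search(index, "web search")))  # [1, 3]
```

The lookup only matches words that literally appear in a page, which is the "any words appearing in pages" query model on this slide. Ranking the matched documents is a separate step.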

Current Search Engines
  Search (has not changed much)
    Paradigm: Query, Indexing, Documents, Ranking
    Query: any words appearing in pages
    Indexing: inverted indices
    Documents: pages, images, videos, books, answers, ...
    Ranking: more signals (features) are used; machine learning; log mining; human feedback, etc.
  Browsing
    Authoritative pages are organized into categories
  Challenges: information explosion and information overload
    Index selection, index quality, and freshness
    Relevance ranking (10 blue links)

Organize all information; address user's information need
  The Web
  Pages fulfill user's information need
  Index selection problem
    The explosive growth of the Web vs. the relatively slow growth of the human population and the time people spend searching

Empower People with Knowledge
  Enable people to gain knowledge and creativity from the web by computationally understanding user intent and matching that with published content, apps and services
  Intent
    Computationally understand what the user is trying to accomplish
    Knowing user needs, attitudes, and desires enables us to help consumers enrich their lives
  Knowledge
    Computationally distill concepts and entities such as people, places, products, and businesses, and the relationships between them
    Enable people and businesses to derive insights and knowledge from the web, and take actions
  Semantic Matching and Task Completion
    Routing intent to task (not only content, but also apps & services)

Four Key Page Types
  (no query) Browse to intent page; example: video browse
  (SERP) Generic results page; example: SERP with answer
  (DTP) Domain specific results page; example: video VERP
  (DTP) Domain specific task or action page; example: video play

Infrastructure for Web-scale Data Mining and Knowledge Discovery
  Deep understanding of data
    Data -> Information -> Knowledge & Intelligence
    Queries -> Intent
    Users -> Audience Intelligence -> Personalized & Targeted
  Experiments at scale
    Offline experiments
    Online experiments
  Fast cycle of innovation
    Ideas/hypotheses formulated
    Ideas/hypotheses validated
    Ideas deployed

Search Infrastructure + Data Mining
  Search infrastructure: Fetch (Crawl), Process, Build (Index), Serve
  Knowledge Exchange
  Web-Scale Data Mining and Knowledge Discovery
    Classification / Clustering
    Information Extraction
    Document Features
    Link-based Features
    Entity/Object Extraction
    Entity Relationship Mining
    Statistics
    Vertical / Multimedia / Mobile
  Experiments at scale; knowledge accumulation; faster innovation cycle
  Infrastructure for Cloud Data Management: Collect, Store, Aggregate, Analyze, Organize, Present, Act

From Web Pages to Web Entities
  Entity search and knowledge mining
    Web-scale entity extraction, integration, and summarization
    Entity relationship mining
    Entity ranking
  Academic search as an example
    Researchers, papers, organizations, conferences, journals
    Knowledge and insights
    Visualization & exploration

DEMO: MICROSOFT ACADEMIC SEARCH

Sketch out Your Search Intent

Challenge #1: The gap between a natural image and a query sketch

Challenge #2: A scalable solution and efficient semantic indexing to support real-time search in a database of billions of images

DEMO: MINDFINDER

Important Trends for Web Search
  Organizing all information
  Addressing user's information need
    Intent
    Knowledge
    Semantic matching & task completion
  Searching content
  Searching apps & services
  The cloud platform and developer ecosystem

The Emergence of the Cloud
  Software as a Service
  Platform as a Service
  Infrastructure as a Service
  Information and Knowledge as a Service

Open Ecosystem for Search
  (Platform + Infrastructure + Information/Knowledge) as a service to developers
  1-2 developers can build a micro-vertical app and web service to help users with a specific task
  Millions of apps are easily discoverable and searchable in the apps marketplace
  The search engine routes intent to task (apps)
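The last item, routing intent to task (apps), might be sketched as follows. Everything here is hypothetical: the `APP_REGISTRY` contents, the app names, and the keyword-overlap scoring are stand-ins for whatever intent model a real engine would use. The slides do not describe the actual mechanism.

```python
# Hypothetical registry of micro-vertical apps, each declaring trigger keywords.
APP_REGISTRY = {
    "book_flight": {
        "keywords": {"flight", "flights", "airfare"},
        "app": "flight-booking-app",
    },
    "find_recipe": {
        "keywords": {"recipe", "cook", "ingredients"},
        "app": "recipe-app",
    },
    "watch_video": {
        "keywords": {"video", "watch", "trailer"},
        "app": "video-player-app",
    },
}


def route_intent(query, registry=APP_REGISTRY):
    """Return the app whose keywords overlap the query most, or None.

    Keyword overlap is a deliberately crude stand-in (assumption) for
    real intent understanding.
    """
    terms = set(query.lower().split())
    best_app, best_score = None, 0
    for entry in registry.values():
        score = len(terms & entry["keywords"])
        if score > best_score:
            best_app, best_score = entry["app"], score
    return best_app


print(route_intent("cheap flight to Beijing"))
print(route_intent("hello world"))
```

A query with no matching keywords returns `None`, at which point an engine could fall back to ordinary document results.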

Evolution of the Web
  Content & HTML Documents
  People & Profiles
  Places & Maps
  Services & Applications

Intent and Knowledge
  Explicit signals beyond a single query
  Implicit signals from the broader context (social, geo, app, camera)
  Dialog-based intent understanding
  Move from a bag of words to connecting dots (entities)
  Tasklet and Query Store authoring + entity ecosystem
  Personal content
  Knowledge construction
  Intent-knowledge matching in the social context
  Signed-in cross-screen multimodal experience
  Intent-driven ad + search + browse
  Personalization

Make Search Actionable
  Document-centric Search
    Search is mostly based on a bag of words method
    Statistical and super scalable (breadth)
  Action-centric Tasks
    Task/app has deep engagement but not as scalable as search (depth)
    Authored with schemas and entities like NLP
  Search + Tasks
    Marry search and tasks seamlessly (breadth + depth)
    Alleviate the coverage challenges in NLP
    Enhance the flow of tasks with recommendations and social connections
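The bag of words method mentioned on this slide can be sketched briefly. The example below uses cosine similarity over raw term counts, which is one common choice and an assumption here. The function names and sample sentences are illustrative only.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Represent text as term counts, discarding word order."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


q1 = bag_of_words("dog bites man")
q2 = bag_of_words("man bites dog")
q3 = bag_of_words("stock market news")

# q1 and q2 mean different things but have identical bags of words.
print(round(cosine_similarity(q1, q2), 6))
print(round(cosine_similarity(q1, q3), 6))
```

Because word order and entity roles are discarded, two queries with different meanings can look identical. This is one way to read the slide's contrast between statistical breadth and the depth of schema-based tasks.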

Re-Organizing the Web for Task Completion

What's Next for Search?

What's Next for Search?
  Organize the world's information
    Relevance ranking
    10 blue links
  Directly address user's information need
    User-centric innovation
    Answers & tasks
    Understand the query space and organize knowledge according to the query space (instead of the document space)

What's Next for Search?
  Information
    Search content
  Knowledge
    Entities (people, places, things) and concepts, and the relationships between them
    Search apps & services

What's Next for Search?
  Keyword matching
    Inverted index
  Understand intent
    Query understanding
    Natural language
    User's context and history
    Intent modeling
    Semantic indexing & matching

What's Next for Search?
  Search engine
    Get the relevant information (a website)
    Get out of SERP with a simple click
    Challenge: query-URL matching
  Decision engine
    Complete the task by fulfilling user intent
    Exploring search results by clicking & browsing
    Whole page relevance
    Search interaction model (dialog)

What's Next for Search?
  Archived Web
    Offline mining & knowledge discovery
  Real Time Web
    Analyzing streaming text data such as tweets
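As a rough illustration of analyzing streaming text such as tweets, the sketch below keeps term counts over a sliding window of the most recent messages. The class name `TrendingTerms`, the count-based window, and the sample messages are assumptions. A real system would likely window by time and run distributed.

```python
from collections import Counter, deque


class TrendingTerms:
    """Track term frequencies over the last `window_size` messages."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()    # tokenized messages currently in the window
        self.counts = Counter()  # term -> count within the window

    def add(self, text):
        terms = text.lower().split()
        self.window.append(terms)
        self.counts.update(terms)
        if len(self.window) > self.window_size:
            # Evict the oldest message and remove its contribution.
            expired = self.window.popleft()
            self.counts.subtract(expired)
            for term in set(expired):
                if self.counts[term] <= 0:
                    del self.counts[term]

    def top(self, k):
        """Return the k most frequent terms in the current window."""
        return self.counts.most_common(k)


stream = TrendingTerms(window_size=2)
for message in ["earthquake reported downtown",
                "earthquake felt here",
                "felt the earthquake too"]:
    stream.add(message)
print(stream.top(2))
```

Unlike offline mining over an archived crawl, the counts here change with every incoming message, so terms that stop appearing drop out once their messages leave the window.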

What's Next for Search?
  Software + Hardware
    More advanced index serving using hardware acceleration

What's Next for Search?
  Closed
    Internal engineers
    Experimental platform
    Experimental infrastructure
    Shared data and storage
  Open + Ecosystem
    1st and 3rd party developers
    Platform as a Service
    Infrastructure as a Service
    Information and knowledge as a Service (Web data and metadata)

What's Next for Search?
  Impression-based advertising
  Pay per click
  Transaction-based advertising
    Deeper understanding of user's intent
    Routing intent to apps or services for task completion

What's Next for Search?
  Web Graph
    Links between web pages
    PageRank
  Cloud Graph
    Information cloud
    Social cloud
    Communication cloud
    Entertainment cloud
    Productivity cloud
    Commerce cloud
    Fusion & graph mining
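For the "Web Graph" side of this slide, PageRank over links between pages can be sketched with plain power iteration. This is a textbook-style sketch under stated assumptions: the damping factor of 0.85, the fixed iteration count, the even redistribution of rank from pages without outgoing links, and the three-page sample graph are all choices made here, not details from the talk.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank by power iteration.

    links: dict mapping each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives the baseline "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page (assumption): spread its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page graph: c is linked from both a and b.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

The same iteration applies to any directed graph, so it could in principle be run over the entity or "cloud" graphs listed above, although the slides do not say that this is the intended method for fusion and graph mining.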

Summary
  Organizing all information
  Addressing user's information need
    Intent
    Knowledge
    Semantic matching & task completion
  Searching content
  Searching apps & services
  The cloud platform and developer ecosystem

THANK YOU