Mining of Implicit User Data

Size: px
Start display at page:

Download "Mining of Implicit User Data"

Transcription

1 Introduction to Search Engine Technology Mining Implicit User Data Ronny Lempel Yahoo! Labs, Haifa Mining of Implicit User Data Implicit data: data that users enter (or leave) in the course of their interaction with the Web, without necessarily being aware that it is used for purposes beyond the immediate interaction In this class: queries, clicks, timestamps of uploaded images As opposed to explicit user data: relevance feedback, ratings, reviews, comments, tags, blog entries, forum postings Assortment of examples on how mining of implicit user data may be used to improve search and media experiences 1. Automatic construction of touristic itineraries 2. Improving automatic query suggestions 3. Pseudo-anchor text generation 4. Identification when no interaction implies user satisfaction By no means comprehensive! Search Engine Technology 2 1

2 Automatic Construction of Travel Itineraries using Social Breadcrumbs Hypertext 2010 Munmun De Choudhury* Sihem Amer-Yahia Cong Yu Yahoo! Research, NY Moran Feldman* Nadav Golbandi Ronny Lempel Yahoo! Labs, Haifa * Former intern Search Engine Technology 3 In a Nutshell Goal: construct intra-city travel itineraries automatically by tapping a latent source reflecting geo-temporal breadcrumbs left by millions of tourists Flickr images, in this case Approach Extract photo streams of individual users Associate points of interest with the photos Using the timestamps of the photos, determine: Lower bounds on the amount of time spent at a POI Upper bounds on the transit time between POIs aggregate all user photo streams into a POI graph; apply the Orienteering algorithm to construct itineraries Search Engine Technology 4 2

3 Step I Timed Paths Photo-POI Mapping Photo Streams Step II Step III Identify photos of a given city Filter out residents of a city Validate photo timestamps Extract Candidate POIs o Lonely Planet/Y! Travel to extract landmarks o Yahoo! Maps API to retrieve their geo-locations Tag & geo-based POI association Segmentation of Photo Streams o Split the stream whenever the time difference between two successive photos is large Distillation of Timed Visits <POI, start time, end time> Construction of Timed Paths Search Engine Technology o A sequence of Timed Visits5 Constructing POI Graph Given the set of timed paths, our goal is to aggregate the actions of many individual travelers into coherent itineraries while taking into consideration POI popularity. Undirected POI graph, G C (V=L C, E=L C L C ) with the following predicates: T(l L C ), the visit time at each POI l. Longest visit time for each user; take the 75 th percentile among all users T(e E), the median transit time between two POIs V(l L C ), the prize or value that an itinerary gets from visiting each POI l in L C. Defined as the product of popularity and visit duration of l (eliminates very short visit durations) Search Engine Technology 6 3

4 Itineraries and Orienteering Problem An itinerary is a path in the graph G C, where a node (POI) in the path may be visited more than once. Problem Instance: Let I be an itinerary; its prize V(I) is defined as the sum of prizes of the unique set of POIs (i.e., a POI s prize is counted only once even if it is visited multiple times) along the path. The time T(I) of the itinerary is the sum of visit times to the unique set of POIs along the path, plus the transit times along all edges on the path (including those that are traversed more than once). Objective (solved using Orienteering Problem Approximation): Find an itinerary in G C from s to t of cost (=time) at most B maximizing total node prizes. Note, B is typically whole days; s and t can be provided by the user Search Engine Technology Data Preparation Five popular cities were chosen: Barcelona, London, New York City (NYC), Paris, and San Francisco City #POIs #Timed Paths Sample POIs Barcelona 74 6,087 Museu Picasso, Plaza Reial London ,052 Buckingham Palace, Churchill Museum, Tower Bridge New York City 100 3,991 Brooklyn Bridge, Ellis Island Paris ,651 Tour Eiffel, Musee du Louvre San Francisco 80 12,308 Aquarium of the Bay, Golden Gate Bridge, Lombard Street Search Engine Technology 4

5 Example Itineraries for NYC Single day itinerary Two day itinerary Search Engine Technology 9 9 Ground Truth To compare our algorithmically constructed itineraries with reasonable baseline itineraries, we obtained itineraries provided by top tour bus companies for each city and considered them as ground truth itineraries City Barcelona London New York City Paris San Francisco Ground Truth Sources Search Engine Technology 10 5

6 User Study Design Summary Side-by-side evaluation comparing our itineraries to ground-truths Independent evaluation examining our itineraries in detail Side-by-side comparison IMP GT Questions? Which itinerary is bette POIs Transit times Visit times Independen t evaluation IMP Questions? Is the itinerary reasona POIs Transit times Visit times Search Engine Technology 11 Results Side by Side Evaluation Evaluation Metric: estimate the usefulness of the itineraries from two aspects, such as the overall utility of the itineraries and appropriateness of POIs Search Engine Technology 12 6

7 Conclusions We addressed the question of automatic generation of travel itineraries for popular touristic cities from large-scale rich media repositories with UGC (user generated content) Such repositories reveal the following patterns: What are the popular touristic points of interest? How long do tourists spend at each POI? How long does it take for tourists to transit between pairs of POIs? User studies comparing our itineraries to bus tour companies itineraries indicate that high quality itineraries can be automatically constructed from Flickr data Search Engine Technology 13 The Query Flow Graph Paolo Boldi, Francesco Bonchi, Carlos Castillo, Debora Donato, Aris Gionis, Sebastiano Vigna Yahoo! Research Barcelona, University of Milano CIKM

8 Session Query-Flow Graph ebay autotrader used fox vw s barcelona hotel barcelona rent t soccer barcelona soccer... barcelona barcelona fc Search Engine Technology 15 Methodology Partition overall query stream into user streams Divide user streams into logical sessions that pursue the same information need Logical sessions might be interleaving Signals: Timestamps of queries Edit distance between successive queries Result set similarity between successive queries more Transitions weighted by observed frequency Transitions further classified by type (next slide) Search Engine Technology 16 8

9 Transition Analysis Barcelona Luxury Barcelona hotels Brcelona Correct Specialize Generalize Parallel Move Barcelona F.C. Real Madrid Barcelona hotels Cheap Barcelona hotels Boldi, Bonchi, Castillo and Vigna, From Dango to Japanese Cakes : Query Reformulation Models and Patterns, Web Intelligence, 2009 (Best paper award) Notification: 18/Sep/ Search Engine Technology 17 Applications Intent mining: what are people really looking for when asking Q Related Query suggestions Based on most popular transitions out of current query s node When coupled with click information, can improve index quality and hence observed ranking: If session looks like XQQ QC (X: delimeter, Q: query, C:click) it means that the clicked document satisfied the information need of all previous queries Can associate the text of earlier queries with the clicked document, generating pseudo anchor text, so that in the future it will show up as a result for those preliminary queries Can expedite speed of information need satisfaction Originally proposed in Queries as anchors: selection by association, Einat Amitay, Adam Darlow, David Konopnicki, Uri Weiss, Hypertext Search Engine Technology 18 9

10 When no clicks are good news Carlos Castillo, Aris Gionis, Ronny Lempel, Yoelle Maarek Yahoo! Research Barcelona & Haifa SIGIR 2010 Industrial Track Interaction Mining for Search Behavioral signals are useful for measuring performance of retrieval systems Relevant results that satisfy the information need are clicked more often, visited (dwelled) for a longer time, lead to long-term engagement, etc. However, predicting user satisfaction accurately from search behavior signals is still an open problem Search Engine Technology 20 10

11 A (not-so-) Special Case When satisfying the user by impression, we observe a lower click-through rate (and higher abandonment) Search Engine Technology 21 Satisfaction by Impression Oneboxes and Direct Displays Oneboxes 1 and Direct Displays 2 (DD) are Very specific results answering (mostly) unambiguous queries with a unique answer directly on the SERP Displayed above regular Web results, due to their high relevance, and rendered in a differentiated format. Typical example: weather <city name> 1 : Google terminology 2 :Yahoo! terminology Search Engine Technology 22 11

12 Additional Satisfaction by Impression Examples Searches for specific stocks, movie screen times, sports results, package tracking (Fedex/UPS), etc. To the extreme, what about spell checking, arithmetic operations or currency conversion? Search Engine Technology 23 The Problem: Response Interpretation How can we distinguish between user satisfaction by impression and abandonment? Application: adjust click-based metrics for user satisfaction, avoiding mis-interpretations of CTR Search Engine Technology 24 12

13 Proposed Methodology General method Pick a class of users with a distinctive behavior Study changes in their aggregate response patterns Specific method Identify users who are Tenacious (a.k.a. pitbulls ) In queries with no DDs, display very low abandonment Will either reformulate the query or click on some results, do not let go Measure their abandonment on queries with DDs Search Engine Technology 25 Modeling users Session representation Search sessions are delimited by X Actions classes: queries (Q) and clicks (C) XQCQX means start, query, click, query, stop User representation Frequency of up to first two actions at start of session Tenacity = (XQQ+XQC)/(XQQ+XQC+XQX) In words: the frequency by which the user follows up the first query in a session by either a reformulation or a click on some result Search Engine Technology 26 13

14 (Preliminary) Experiments Segment sessions into logical goals Divide queries into two groups With direct-displays triggered above position 5 (DD) Without (NO-DD) Metric Find users with Tenacity NO-DD >= 80% (min. 10 sessions) For a DD-triggering query Q, measure (on enough tenacious users only) #abandonments / #impressions Tenacious abandonment as a good signal Ground truth user survey Ask humans do you think users querying for Q will be satisfied by impression by this DD? 1=never... 5=always Search Engine Technology 27 Weather DD Change in the tenacity of tenacious users Search Engine Technology 28 14

15 GOOD GOOD BAD BAD Change in the tenacity of tenacious users Search Engine Technology 29 GOOD 63% of bad cases 83% precision BAD Change in the tenacity of tenacious users Search Engine Technology 30 15

16 GOOD BAD Reference DD Change in the Search tenacity Engine of Technology tenacious users 31 GOOD GOOD BAD BAD Change in the Search tenacity Engine of Technology tenacious users 32 16

17 GOOD 71% of bad cases 84% precision BAD Change in the tenacity of tenacious users Search Engine Technology 33 17

New Perspectives in Social Data Management

New Perspectives in Social Data Management New Perspectives in Social Data Management Sihem Amer-Yahia Research Director CNRS @ LIG Sihem.Amer-Yahia@imag.fr TSUKUBA University May 20 th, 2014 Traditional data management stack High-level specification

More information

Mining the Web 2.0 to improve Search

Mining the Web 2.0 to improve Search Mining the Web 2.0 to improve Search Ricardo Baeza-Yates VP, Yahoo! Research Agenda The Power of Data Examples Improving Image Search (Faceted Clusters) Searching the Wikipedia (Correlator) Understanding

More information

Query Sugges*ons. Debapriyo Majumdar Information Retrieval Spring 2015 Indian Statistical Institute Kolkata

Query Sugges*ons. Debapriyo Majumdar Information Retrieval Spring 2015 Indian Statistical Institute Kolkata Query Sugges*ons Debapriyo Majumdar Information Retrieval Spring 2015 Indian Statistical Institute Kolkata Search engines User needs some information search engine tries to bridge this gap ssumption: the

More information

Beyond Ten Blue Links Seven Challenges

Beyond Ten Blue Links Seven Challenges Beyond Ten Blue Links Seven Challenges Ricardo Baeza-Yates VP of Yahoo! Research for EMEA & LatAm Barcelona, Spain Thanks to Andrei Broder, Yoelle Maarek & Prabhakar Raghavan Agenda Past and Present Wisdom

More information

Personalized Models of Search Satisfaction. Ahmed Hassan and Ryen White

Personalized Models of Search Satisfaction. Ahmed Hassan and Ryen White Personalized Models of Search Satisfaction Ahmed Hassan and Ryen White Online Satisfaction Measurement Satisfying users is the main objective of any search system Measuring user satisfaction is essential

More information

How Big Data is Changing User Engagement

How Big Data is Changing User Engagement Strata Barcelona November 2014 How Big Data is Changing User Engagement Mounia Lalmas Yahoo Labs London mounia@acm.org This talk User engagement Definitions Characteristics Metrics Case studies: Absence

More information

EFFICIENT MULTIUSER ITINERARY PLANNING FOR TRAVELLING SERVICES USING FKM-CLUSTERING ALGORITHM

EFFICIENT MULTIUSER ITINERARY PLANNING FOR TRAVELLING SERVICES USING FKM-CLUSTERING ALGORITHM EFFICIENT MULTIUSER ITINERARY PLANNING FOR TRAVELLING SERVICES USING FKM-CLUSTERING ALGORITHM R.Rajeswari 1, J. Mannar Mannan 2 1 III-M.tech(IT), Department of Information technology, Regional Centre,

More information

An Introduction to Search Engines and Web Navigation

An Introduction to Search Engines and Web Navigation An Introduction to Search Engines and Web Navigation MARK LEVENE ADDISON-WESLEY Ал imprint of Pearson Education Harlow, England London New York Boston San Francisco Toronto Sydney Tokyo Singapore Hong

More information

Advanced Computer Graphics CS 525M: Crowds replace Experts: Building Better Location-based Services using Mobile Social Network Interactions

Advanced Computer Graphics CS 525M: Crowds replace Experts: Building Better Location-based Services using Mobile Social Network Interactions Advanced Computer Graphics CS 525M: Crowds replace Experts: Building Better Location-based Services using Mobile Social Network Interactions XIAOCHEN HUANG Computer Science Dept. Worcester Polytechnic

More information

Exploiting Group Recommendation Functions for Flexible Preferences

Exploiting Group Recommendation Functions for Flexible Preferences Exploiting Group Recommendation Functions for Flexible Preferences Senjuti Basu Roy, Saravanan Thirumuruganathan #, Sihem Amer-Yahia +, Gautam Das #, Cong Yu University of Washington Tacoma # University

More information

DS595/CS525: Urban Network Analysis --Urban Mobility Prof. Yanhua Li

DS595/CS525: Urban Network Analysis --Urban Mobility Prof. Yanhua Li Welcome to DS595/CS525: Urban Network Analysis --Urban Mobility Prof. Yanhua Li Time: 6:00pm 8:50pm Wednesday Location: Fuller 320 Spring 2017 2 Team assignment Finalized. (Great!) Guest Speaker 2/22 A

More information

Google My Business The Free Listing

Google My Business The Free Listing Google My Business The Free Listing Entrata compiled year-over-year data from 400+ apartment communities across the United States in this study to to help apartment marketers better understand how to measure

More information

A Task Level Metric for Measuring Web Search Satisfaction and its Application on Improving Relevance Estimation

A Task Level Metric for Measuring Web Search Satisfaction and its Application on Improving Relevance Estimation A Task Level Metric for Measuring Web Search Satisfaction and its Application on Improving Relevance Estimation Ahmed Hassan Microsoft Research Redmond, WA hassanam@microsoft.com Yang Song Microsoft Research

More information

Advanced Topics in Information Retrieval. Learning to Rank. ATIR July 14, 2016

Advanced Topics in Information Retrieval. Learning to Rank. ATIR July 14, 2016 Advanced Topics in Information Retrieval Learning to Rank Vinay Setty vsetty@mpi-inf.mpg.de Jannik Strötgen jannik.stroetgen@mpi-inf.mpg.de ATIR July 14, 2016 Before we start oral exams July 28, the full

More information

INFERENCE OF MOBILITY PATTERNS VIA SPECTRAL GRAPH WAVELETS

INFERENCE OF MOBILITY PATTERNS VIA SPECTRAL GRAPH WAVELETS INFERENCE OF MOBILITY PATTERNS VIA SPECTRAL GRAPH WAVELETS Xiaowen Dong, Antonio Ortega, Pascal Frossard and Pierre Vandergheynst Signal Processing Laboratories, EPFL, Switzerland {xiaowen.dong, pascal.frossard,

More information

KDD 10 Tutorial: Recommender Problems for Web Applications. Deepak Agarwal and Bee-Chung Chen Yahoo! Research

KDD 10 Tutorial: Recommender Problems for Web Applications. Deepak Agarwal and Bee-Chung Chen Yahoo! Research KDD 10 Tutorial: Recommender Problems for Web Applications Deepak Agarwal and Bee-Chung Chen Yahoo! Research Agenda Focus: Recommender problems for dynamic, time-sensitive applications Content Optimization

More information

Opportunities and challenges in personalization of online hotel search

Opportunities and challenges in personalization of online hotel search Opportunities and challenges in personalization of online hotel search David Zibriczky Data Science & Analytics Lead, User Profiling Introduction 2 Introduction About Mission: Helping the travelers to

More information

DS504/CS586: Big Data Analytics Data Management Prof. Yanhua Li

DS504/CS586: Big Data Analytics Data Management Prof. Yanhua Li Welcome to DS504/CS586: Big Data Analytics Data Management Prof. Yanhua Li Time: 6:00pm 8:50pm R Location: KH 116 Fall 2017 First Grading for Reading Assignment Weka v 6 weeks v https://weka.waikato.ac.nz/dataminingwithweka/preview

More information

Context based Re-ranking of Web Documents (CReWD)

Context based Re-ranking of Web Documents (CReWD) Context based Re-ranking of Web Documents (CReWD) Arijit Banerjee, Jagadish Venkatraman Graduate Students, Department of Computer Science, Stanford University arijitb@stanford.edu, jagadish@stanford.edu}

More information

An algorithm for Trajectories Classification

An algorithm for Trajectories Classification An algorithm for Trajectories Classification Fabrizio Celli 28/08/2009 INDEX ABSTRACT... 3 APPLICATION SCENARIO... 3 CONCEPTUAL MODEL... 3 THE PROBLEM... 7 THE ALGORITHM... 8 DETAILS... 9 THE ALGORITHM

More information

Understanding the use of Temporal Expressions on Persian Web Search

Understanding the use of Temporal Expressions on Persian Web Search Understanding the use of Temporal Expressions on Persian Web Search Behrooz Mansouri Mohammad Zahedi Ricardo Campos Mojgan Farhoodi Alireza Yari Ricardo Campos TempWeb 2018 @ WWW Lyon, France, Apr 23,

More information

Beyond DCG User Behavior as a Predictor of a Successful Search

Beyond DCG User Behavior as a Predictor of a Successful Search Beyond DCG User Behavior as a Predictor of a Successful Search Ahmed Hassan Department of EECS U. Michigan Ann Arbor Ann Arbor, MI hassanam@umich.edu Rosie Jones Yahoo Labs 4 Cambridge Center Cambridge,

More information

Discovering Local Attractions from Geo-Tagged Photos

Discovering Local Attractions from Geo-Tagged Photos Discovering Local Attractions from Geo-Tagged Photos Ombretta Gaggi Department of Mathematics University of Padua via Trieste, 63, 35131 Padova, Italy ombretta.gaggi@unipd.it ABSTRACT This paper presents

More information

Measuring and Recommending Time-Sensitive Routes from Location-Based Data

Measuring and Recommending Time-Sensitive Routes from Location-Based Data Measuring and Recommending Time-Sensitive Routes from Location-Based Data HSUN-PING HSIEH, CHENG-TE LI, and SHOU-DE LIN, National Taiwan University Location-based services allow users to perform geospatial

More information

Challenges on Combining Open Web and Dataset Evaluation Results: The Case of the Contextual Suggestion Track

Challenges on Combining Open Web and Dataset Evaluation Results: The Case of the Contextual Suggestion Track Challenges on Combining Open Web and Dataset Evaluation Results: The Case of the Contextual Suggestion Track Alejandro Bellogín 1,2, Thaer Samar 1, Arjen P. de Vries 1, and Alan Said 1 1 Centrum Wiskunde

More information

Deep Character-Level Click-Through Rate Prediction for Sponsored Search

Deep Character-Level Click-Through Rate Prediction for Sponsored Search Deep Character-Level Click-Through Rate Prediction for Sponsored Search Bora Edizel - Phd Student UPF Amin Mantrach - Criteo Research Xiao Bai - Oath This work was done at Yahoo and will be presented as

More information

NTU Approaches to Subtopic Mining and Document Ranking at NTCIR-9 Intent Task

NTU Approaches to Subtopic Mining and Document Ranking at NTCIR-9 Intent Task NTU Approaches to Subtopic Mining and Document Ranking at NTCIR-9 Intent Task Chieh-Jen Wang, Yung-Wei Lin, *Ming-Feng Tsai and Hsin-Hsi Chen Department of Computer Science and Information Engineering,

More information

Analysis of Trail Algorithms for User Search Behavior

Analysis of Trail Algorithms for User Search Behavior Analysis of Trail Algorithms for User Search Behavior Surabhi S. Golechha, Prof. R.R. Keole Abstract Web log data has been the basis for analyzing user query session behavior for a number of years. Web

More information

Future Trends in Search User Interfaces

Future Trends in Search User Interfaces Future Trends in Search User Interfaces Dr. Marti Hearst UC Berkeley i-know Conference Keynote Sept 1, 2010 Forecasting the Future First: What are the larger trends? In technology? In society? Next: Project

More information

Survey on Recommendation of Personalized Travel Sequence

Survey on Recommendation of Personalized Travel Sequence Survey on Recommendation of Personalized Travel Sequence Mayuri D. Aswale 1, Dr. S. C. Dharmadhikari 2 ME Student, Department of Information Technology, PICT, Pune, India 1 Head of Department, Department

More information

Extracting Information from Social Networks

Extracting Information from Social Networks Extracting Information from Social Networks Reminder: Social networks Catch-all term for social networking sites Facebook microblogging sites Twitter blog sites (for some purposes) 1 2 Ways we can use

More information

Did you know that SEO increases traffic, leads and sales? SEO = More Website Visitors More Traffic = More Leads More Leads= More Sales

Did you know that SEO increases traffic, leads and sales? SEO = More Website Visitors More Traffic = More Leads More Leads= More Sales 1 Did you know that SEO increases traffic, leads and sales? SEO = More Website Visitors More Traffic = More Leads More Leads= More Sales What is SEO? Search engine optimization is the process of improving

More information

An Empirical Analysis of Communities in Real-World Networks

An Empirical Analysis of Communities in Real-World Networks An Empirical Analysis of Communities in Real-World Networks Chuan Sheng Foo Computer Science Department Stanford University csfoo@cs.stanford.edu ABSTRACT Little work has been done on the characterization

More information

Managing your Online Reputation. Andrew Wiens International DMO Manager

Managing your Online Reputation. Andrew Wiens International DMO Manager Managing your Online Reputation Andrew Wiens International DMO Manager million unique monthly visitors * million TripAdvisor members million reviews and opinions user contributions every minute 500 million

More information

DATA MINING II - 1DL460. Spring 2014"

DATA MINING II - 1DL460. Spring 2014 DATA MINING II - 1DL460 Spring 2014" A second course in data mining http://www.it.uu.se/edu/course/homepage/infoutv2/vt14 Kjell Orsborn Uppsala Database Laboratory Department of Information Technology,

More information

Detecting Good Abandonment in Mobile Search

Detecting Good Abandonment in Mobile Search Detecting Good Abandonment in Mobile Search Kyle Williams, Julia Kiseleva, Aidan C. Crook Γ, Imed Zitouni Γ, Ahmed Hassan Awadallah Γ, Madian Khabsa Γ The Pennsylvania State University, University Park,

More information

Finding High-Probability Mobile Photography Routes Between Origin and Destination Endpoints

Finding High-Probability Mobile Photography Routes Between Origin and Destination Endpoints Finding High-Probability Mobile Photography Routes Between Origin and Destination Endpoints Thomas Phan Samsung Research America Silicon Valley San Jose, CA, USA ABSTRACT A challenge for mobile smartphone

More information

Modern Retrieval Evaluations. Hongning Wang

Modern Retrieval Evaluations. Hongning Wang Modern Retrieval Evaluations Hongning Wang CS@UVa What we have known about IR evaluations Three key elements for IR evaluation A document collection A test suite of information needs A set of relevance

More information

CURRICULUM VITÆ. Naama Kraus B.Sc. in Computer Science and Mathematics, Bar-Ilan University, Cum Laude GPA: 90.

CURRICULUM VITÆ. Naama Kraus B.Sc. in Computer Science and Mathematics, Bar-Ilan University, Cum Laude GPA: 90. CURRICULUM VITÆ Naama Kraus naamakraus@gmail.com Personal Information Home Address: 6 Trumpeldor Ave., Haifa, 32582, Israel Phone (Home): +972 4 8328216 Phone (Mobile): +972 55 6644563 Date of Birth: 29-APR-1974

More information

Lab for Media Search, National University of Singapore 1

Lab for Media Search, National University of Singapore 1 1 2 Word2Image: Towards Visual Interpretation of Words Haojie Li Introduction Motivation A picture is worth 1000 words Traditional dictionary Containing word entries accompanied by photos or drawing to

More information

ELEC6910Q Analytics and Systems for Social Media and Big Data Applications Lecture 4. Prof. James She

ELEC6910Q Analytics and Systems for Social Media and Big Data Applications Lecture 4. Prof. James She ELEC6910Q Analytics and Systems for Social Media and Big Data Applications Lecture 4 Prof. James She james.she@ust.hk 1 Selected Works of Activity 4 2 Selected Works of Activity 4 3 Last lecture 4 Mid-term

More information

Propagation of geotags based on object duplicate detection

Propagation of geotags based on object duplicate detection Propagation of geotags based on object duplicate detection Peter Vajda, Ivan Ivanov, Jong-Seok Lee, Lutz Goldmann and Touradj Ebrahimi Multimedia Signal Processing Group MMSPG Institute of Electrical Engineering

More information

Overview of iclef 2008: search log analysis for Multilingual Image Retrieval

Overview of iclef 2008: search log analysis for Multilingual Image Retrieval Overview of iclef 2008: search log analysis for Multilingual Image Retrieval Julio Gonzalo Paul Clough Jussi Karlgren UNED U. Sheffield SICS Spain United Kingdom Sweden julio@lsi.uned.es p.d.clough@sheffield.ac.uk

More information

The Push Notification Playbook for Travel Apps

The Push Notification Playbook for Travel Apps The Push Notification Playbook for Travel Apps 1 Mobile Is the Hub of All Communications Today. Smartphones enable brands to capitalize on new ways to reach their users, via personalization, location,

More information

SME Developing and managing your online presence. Presented by: Rasheed Girvan Global Directories

SME Developing and managing your online presence. Presented by: Rasheed Girvan Global Directories SME Developing and managing your online presence Presented by: Rasheed Girvan Global Directories DIGITAL MEDIA What is Digital Media Any media type in an electronic or digital format for the convenience

More information

ACM-W Athena Lecturer

ACM-W Athena Lecturer Welcome to ACM-W Athena Lecturer Athena is the Greek goddess of wisdom, courage, inspiration, civilization, law and justice, mathematics, strength, war strategy, the arts, crafts, and skill. The Athena

More information

Recommendation Projects at Yahoo!

Recommendation Projects at Yahoo! Recommendation Projects at Yahoo! Sihem Amer-Yahia, Yahoo! Labs 1 Introduction Recommendation is the task of providing content that is likely to interest users and enrich their online experience. At Yahoo!,

More information

Empowering People with Knowledge the Next Frontier for Web Search. Wei-Ying Ma Assistant Managing Director Microsoft Research Asia

Empowering People with Knowledge the Next Frontier for Web Search. Wei-Ying Ma Assistant Managing Director Microsoft Research Asia Empowering People with Knowledge the Next Frontier for Web Search Wei-Ying Ma Assistant Managing Director Microsoft Research Asia Important Trends for Web Search Organizing all information Addressing user

More information

Aggregation for searching complex information spaces. Mounia Lalmas

Aggregation for searching complex information spaces. Mounia Lalmas Aggregation for searching complex information spaces Mounia Lalmas mounia@acm.org Outline Document Retrieval Focused Retrieval Aggregated Retrieval Complexity of the information space (s) INEX - INitiative

More information

IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, ISSN:

IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, ISSN: IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, 20131 Improve Search Engine Relevance with Filter session Addlin Shinney R 1, Saravana Kumar T

More information

Semantic Web and Web2.0. Dr Nicholas Gibbins

Semantic Web and Web2.0. Dr Nicholas Gibbins Semantic Web and Web2.0 Dr Nicholas Gibbins Web 2.0 is the business revolution in the computer industry caused by the move to the internet as platform, and an attempt to understand the rules for success

More information

Comment Extraction from Blog Posts and Its Applications to Opinion Mining

Comment Extraction from Blog Posts and Its Applications to Opinion Mining Comment Extraction from Blog Posts and Its Applications to Opinion Mining Huan-An Kao, Hsin-Hsi Chen Department of Computer Science and Information Engineering National Taiwan University, Taipei, Taiwan

More information

TREC 2016 Dynamic Domain Track: Exploiting Passage Representation for Retrieval and Relevance Feedback

TREC 2016 Dynamic Domain Track: Exploiting Passage Representation for Retrieval and Relevance Feedback RMIT @ TREC 2016 Dynamic Domain Track: Exploiting Passage Representation for Retrieval and Relevance Feedback Ameer Albahem ameer.albahem@rmit.edu.au Lawrence Cavedon lawrence.cavedon@rmit.edu.au Damiano

More information

A guide to GOOGLE+LOCAL. for business. Published by. hypercube.co.nz

A guide to GOOGLE+LOCAL. for business. Published by. hypercube.co.nz A guide to GOOGLE+LOCAL for business Published by hypercube.co.nz An introduction You have probably noticed that since June 2012, changes have been taking place with the local search results appearing

More information

Session Based Click Features for Recency Ranking

Session Based Click Features for Recency Ranking Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI-10) Session Based Click Features for Recency Ranking Yoshiyuki Inagaki and Narayanan Sadagopan and Georges Dupret and Ciya

More information

Query Evaluation Strategies

Query Evaluation Strategies Introduction to Search Engine Technology Term-at-a-Time and Document-at-a-Time Evaluation Ronny Lempel Yahoo! Labs (Many of the following slides are courtesy of Aya Soffer and David Carmel, IBM Haifa Research

More information

An Empirical Evaluation of User Interfaces for Topic Management of Web Sites

An Empirical Evaluation of User Interfaces for Topic Management of Web Sites An Empirical Evaluation of User Interfaces for Topic Management of Web Sites Brian Amento AT&T Labs - Research 180 Park Avenue, P.O. Box 971 Florham Park, NJ 07932 USA brian@research.att.com ABSTRACT Topic

More information

USER SEARCH INTERFACES. Design and Application

USER SEARCH INTERFACES. Design and Application USER SEARCH INTERFACES Design and Application KEEP IT SIMPLE Search is a means towards some other end, rather than a goal in itself. Search is a mentally intensive task. Task Example: You have a friend

More information

Location-Based Social Software for Mobile Devices. Inventors: Dennis Crowley Alex Rainert New York, NY Brooklyn, NY 11231

Location-Based Social Software for Mobile Devices. Inventors: Dennis Crowley Alex Rainert New York, NY Brooklyn, NY 11231 Title: Location-Based Social Software for Mobile Devices Date: April 28, 2004 Inventors: Dennis Crowley Alex Rainert New York, NY 10002 Brooklyn, NY 11231 dens@dodgeball.com alex@dodgeball.com Abstract

More information

Personalized Web Search

Personalized Web Search Personalized Web Search Dhanraj Mavilodan (dhanrajm@stanford.edu), Kapil Jaisinghani (kjaising@stanford.edu), Radhika Bansal (radhika3@stanford.edu) Abstract: With the increase in the diversity of contents

More information

GLOBAL VIDEO INDEX Q3 2012

GLOBAL VIDEO INDEX Q3 2012 GLOBAL VIDEO INDEX Q3 2012 Table of Contents ABOUT OOYALA S GLOBAL VIDEO INDEX...3 EXECUTIVE SUMMARY...4 GOING LONG: LONG-FORM VIDEO...5 TABLET AND MOBILE VIDEO...7 LIVE VIDEO VIEWING...8 Viewer Behavior

More information

Query Modifications Patterns During Web Searching

Query Modifications Patterns During Web Searching Bernard J. Jansen The Pennsylvania State University jjansen@ist.psu.edu Query Modifications Patterns During Web Searching Amanda Spink Queensland University of Technology ah.spink@qut.edu.au Bhuva Narayan

More information

Slides based on those in:

Slides based on those in: Spyros Kontogiannis & Christos Zaroliagis Slides based on those in: http://www.mmds.org A 3.3 B 38.4 C 34.3 D 3.9 E 8.1 F 3.9 1.6 1.6 1.6 1.6 1.6 2 y 0.8 ½+0.2 ⅓ M 1/2 1/2 0 0.8 1/2 0 0 + 0.2 0 1/2 1 [1/N]

More information

Executed by Rocky Sir, tech Head Suven Consultants & Technology Pvt Ltd. seo.suven.net 1

Executed by Rocky Sir, tech Head Suven Consultants & Technology Pvt Ltd. seo.suven.net 1 Executed by Rocky Sir, tech Head Suven Consultants & Technology Pvt Ltd. seo.suven.net 1 1. Parts of a Search Engine Every search engine has the 3 basic parts: a crawler an index (or catalog) matching

More information

We hope you come away inspired to take your marketing to the next level. Adam Stambleck Senior VP of Client Experience Movable Ink

We hope you come away inspired to take your  marketing to the next level. Adam Stambleck Senior VP of Client Experience Movable Ink Inkredible5 Welcome to Inkredible 5: Spring 2016. In 2015, we saw marketers make a big shift in mindset toward a more contextual approach. Consumers now expect relevancy, and marketers must speak to them

More information

Recommendation System for Location-based Social Network CS224W Project Report

Recommendation System for Location-based Social Network CS224W Project Report Recommendation System for Location-based Social Network CS224W Project Report Group 42, Yiying Cheng, Yangru Fang, Yongqing Yuan 1 Introduction With the rapid development of mobile devices and wireless

More information

The Ultimate Digital Marketing Glossary (A-Z) what does it all mean? A-Z of Digital Marketing Translation

The Ultimate Digital Marketing Glossary (A-Z) what does it all mean? A-Z of Digital Marketing Translation The Ultimate Digital Marketing Glossary (A-Z) what does it all mean? In our experience, we find we can get over-excited when talking to clients or family or friends and sometimes we forget that not everyone

More information

How to get the most from e mail with Gmail. New York Society Library Tech Workshop Spring 2011 Julia Weist (212) x 234

How to get the most from e mail with Gmail. New York Society Library Tech Workshop Spring 2011 Julia Weist (212) x 234 How to get the most from e mail with Gmail New York Society Library Tech Workshop Spring 2011 Julia Weist jweist@nysoclib.org (212) 288 6900 x 234 1 IN THIS GUIDE Getting started About Gmail..2 Opening

More information

Part 1: Link Analysis & Page Rank

Part 1: Link Analysis & Page Rank Chapter 8: Graph Data Part 1: Link Analysis & Page Rank Based on Leskovec, Rajaraman, Ullman 214: Mining of Massive Datasets 1 Graph Data: Social Networks [Source: 4-degrees of separation, Backstrom-Boldi-Rosa-Ugander-Vigna,

More information

Big Data Analytics CSCI 4030

Big Data Analytics CSCI 4030 High dim. data Graph data Infinite data Machine learning Apps Locality sensitive hashing PageRank, SimRank Filtering data streams SVM Recommen der systems Clustering Community Detection Web advertising

More information

Functionalities and Flow Analyses of Knowledge Oriented Web Portals

Functionalities and Flow Analyses of Knowledge Oriented Web Portals Functionalities and Flow Analyses of Knowledge Oriented Web Portals Daniele Cenni, Paolo Nesi, Michela Paolucci DISIT DSI, Distributed Systems and Internet Technology Lab Dipartimento di Sistemi e Informatica,

More information

FCA CERTIFIED WEBSITE PROGRAM

FCA CERTIFIED WEBSITE PROGRAM FCA CERTIFIED WEBSITE PROGRAM RESPONSIVE CORE WEBSITE PLATFORM RESPONSIVE WEBSITE PLATFORM Conversion Optimized VLP and VDP Inventory Feed Integration Analytics Dashboard Advanced SEO Tools Meta Titles

More information

Query Suggestions Using Query-Flow Graphs

Query Suggestions Using Query-Flow Graphs Query Suggestions Using Query-Flow Graphs Paolo Boldi 1 boldi@dsi.unimi.it Debora Donato 2 debora@yahoo-inc.com Francesco Bonchi 2 bonchi@yahoo-inc.com Sebastiano Vigna 1 vigna@dsi.unimi.it 1 DSI, Università

More information

A System for Discovering Regions of Interest from Trajectory Data

A System for Discovering Regions of Interest from Trajectory Data A System for Discovering Regions of Interest from Trajectory Data Muhammad Reaz Uddin, Chinya Ravishankar, and Vassilis J. Tsotras University of California, Riverside, CA, USA {uddinm,ravi,tsotras}@cs.ucr.edu

More information

State of the American Tourist

State of the American Tourist State of the American Tourist search social mobile Presented by Kara Kramer, Senior Director of Global Enterprise kkramer@comscore.com Retail & Travel online spending trend $162 $186 $123 $130 $130 $142

More information

A Model to Estimate Intrinsic Document Relevance from the Clickthrough Logs of a Web Search Engine

A Model to Estimate Intrinsic Document Relevance from the Clickthrough Logs of a Web Search Engine A Model to Estimate Intrinsic Document Relevance from the Clickthrough Logs of a Web Search Engine Georges Dupret Yahoo! Labs gdupret@yahoo-inc.com Ciya Liao Yahoo! Labs ciyaliao@yahoo-inc.com ABSTRACT

More information

Emerging Trends in Search User Interfaces

Emerging Trends in Search User Interfaces Emerging Trends in Search User Interfaces Marti Hearst UC Berkeley Hypertext 2011 Keynote Talk June 7, 2011 Book full text freely available at: http://searchuserinterfaces.com What works well in search

More information

Sriram Srinivasan Kalpesh Jain Shankar Venkataraman

Sriram Srinivasan Kalpesh Jain Shankar Venkataraman Sriram Srinivasan Kalpesh Jain Shankar Venkataraman Introduction Business Case Functionalities Architecture Data Mining Novelty Team Contribution Agenda 2 Introduction Established team sport for hundreds

More information

Module 10 MULTIMEDIA SYNCHRONIZATION

Module 10 MULTIMEDIA SYNCHRONIZATION Module 10 MULTIMEDIA SYNCHRONIZATION Lesson 33 Basic definitions and requirements Instructional objectives At the end of this lesson, the students should be able to: 1. Define synchronization between media

More information

Google Organic Click Through Study: Comparison of Google's CTR by Position, Industry, and Query Type

Google Organic Click Through Study: Comparison of Google's CTR by Position, Industry, and Query Type Google Organic Click Through Study: Comparison of Google's CTR by Position, Industry, and Query Type 21 Corporate Drive, Suite 200 Clifton Park, NY 12065 518-270-0854 www.internetmarketingninjas.com Table

More information

Analysis of Large Graphs: TrustRank and WebSpam

Analysis of Large Graphs: TrustRank and WebSpam Note to other teachers and users of these slides: We would be delighted if you found this our material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit

More information

DB2 SQL Tuning Tips for z/os Developers

DB2 SQL Tuning Tips for z/os Developers DB2 SQL Tuning Tips for z/os Developers Tony Andrews IBM Press, Pearson pic Upper Saddle River, NJ Boston Indianapolis San Francisco New York Toronto Montreal London Munich Paris Madrid Cape Town Sydney

More information

DATA MINING - 1DL105, 1DL111

DATA MINING - 1DL105, 1DL111 1 DATA MINING - 1DL105, 1DL111 Fall 2007 An introductory class in data mining http://user.it.uu.se/~udbl/dut-ht2007/ alt. http://www.it.uu.se/edu/course/homepage/infoutv/ht07 Kjell Orsborn Uppsala Database

More information

Answering Math Queries With Search Engines

Answering Math Queries With Search Engines Answering Math Queries With Search Engines Shahab Kamali David R. Cheriton School of Computer Science University of Waterloo skamali@cs.uwaterloo.ca Johnson Apacible Microsoft Research Redmond johnsona@microsoft.com

More information

for Searching Social Media Posts

for Searching Social Media Posts Mining the Temporal Statistics of Query Terms for Searching Social Media Posts ICTIR 17 Amsterdam Oct. 1 st 2017 Jinfeng Rao Ferhan Ture Xing Niu Jimmy Lin Task: Ad-hoc Search on Social Media domain Stream

More information

Andrew Wiens International DMO Manager

Andrew Wiens International DMO Manager Andrew Wiens International DMO Manager million unique monthly visitors * million TripAdvisor members million reviews and opinions user contributions every minute 1 BILLION+ people view TripAdvisor content

More information

Overview of Web Mining Techniques and its Application towards Web

Overview of Web Mining Techniques and its Application towards Web Overview of Web Mining Techniques and its Application towards Web *Prof.Pooja Mehta Abstract The World Wide Web (WWW) acts as an interactive and popular way to transfer information. Due to the enormous

More information

Assignment 1. Assignment 2. Relevance. Performance Evaluation. Retrieval System Evaluation. Evaluate an IR system

Assignment 1. Assignment 2. Relevance. Performance Evaluation. Retrieval System Evaluation. Evaluate an IR system Retrieval System Evaluation W. Frisch Institute of Government, European Studies and Comparative Social Science University Vienna Assignment 1 How did you select the search engines? How did you find the

More information

Playbook. The Media, Publishing & Entertainment Marketer s Playbook

Playbook. The Media, Publishing & Entertainment  Marketer s Playbook Playbook The Media, Publishing & Entertainment Email Marketer s Playbook The Media, Publishing & Entertainment Consumer Email marketing has entered a new era of innovation, allowing marketers in the media,

More information

Business travel simplified.

Business travel simplified. Congrats! Now that you've registered for TripSource, your trip information is immediately accessible just tap "Trips. Though the app is very user friendly, we ve put together some tips and tricks to help

More information

MDP-based Itinerary Recommendation using Geo-Tagged Social Media

MDP-based Itinerary Recommendation using Geo-Tagged Social Media MDP-based Itinerary Recommendation using Geo-Tagged Social Media Radhika Gaonkar 1, Maryam Tavakol 2, and Ulf Brefeld 2 1 Stony Brook University, NY rgaonkar@cs.stonybrook.edu 2 Leuphana Universität Lüneburg

More information

Turning Lookers into Bookers!

Turning Lookers into Bookers! Turning Lookers into Bookers! Rachel Howes, Great National Hotels Ian Sloan, Avvio Avvio Operates Worldwide 450 4 280m Clients Offices Globally Hotel revenue processed annually Great National Hotels Acquisition

More information

Learning to Rank (part 2)

Learning to Rank (part 2) Learning to Rank (part 2) NESCAI 2008 Tutorial Filip Radlinski Cornell University Recap of Part 1 1. Learning to rank is widely used for information retrieval, and by web search engines. 2. There are many

More information

Large-Scale Validation and Analysis of Interleaved Search Evaluation

Large-Scale Validation and Analysis of Interleaved Search Evaluation Large-Scale Validation and Analysis of Interleaved Search Evaluation Olivier Chapelle, Thorsten Joachims Filip Radlinski, Yisong Yue Department of Computer Science Cornell University Decide between two

More information

COLLABORATIVE LOCATION AND ACTIVITY RECOMMENDATIONS WITH GPS HISTORY DATA

COLLABORATIVE LOCATION AND ACTIVITY RECOMMENDATIONS WITH GPS HISTORY DATA COLLABORATIVE LOCATION AND ACTIVITY RECOMMENDATIONS WITH GPS HISTORY DATA Vincent W. Zheng, Yu Zheng, Xing Xie, Qiang Yang Hong Kong University of Science and Technology Microsoft Research Asia WWW 2010

More information

Kaltura Video Package for Moodle 2.x Quick Start Guide. Version: 3.1 for Moodle

Kaltura Video Package for Moodle 2.x Quick Start Guide. Version: 3.1 for Moodle Kaltura Video Package for Moodle 2.x Quick Start Guide Version: 3.1 for Moodle 2.0-2.4 Kaltura Business Headquarters 5 Union Square West, Suite 602, New York, NY, 10003, USA Tel.: +1 800 871 5224 Copyright

More information

The Person in Personal

The Person in Personal WWW Panel: Searching Personal Content The Person in Personal (Supporting the Person in Searching Personal Content) Susan Dumais Microsoft Research http://research.microsoft.com/~sdumais Stuff I ve I Seen

More information

A Dynamic Bayesian Network Click Model for Web Search Ranking

A Dynamic Bayesian Network Click Model for Web Search Ranking A Dynamic Bayesian Network Click Model for Web Search Ranking Olivier Chapelle and Anne Ya Zhang Apr 22, 2009 18th International World Wide Web Conference Introduction Motivation Clicks provide valuable

More information

International Journal of Computer Engineering and Applications, Volume IX, Issue X, Oct. 15 ISSN

International Journal of Computer Engineering and Applications, Volume IX, Issue X, Oct. 15  ISSN DIVERSIFIED DATASET EXPLORATION BASED ON USEFULNESS SCORE Geetanjali Mohite 1, Prof. Gauri Rao 2 1 Student, Department of Computer Engineering, B.V.D.U.C.O.E, Pune, Maharashtra, India 2 Associate Professor,

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining Lecture #6: Mining Data Streams Seoul National University 1 Outline Overview Sampling From Data Stream Queries Over Sliding Window 2 Data Streams In many data mining situations,

More information