Query Heartbeat: A Strange Property of Keyword Queries on the Web

Karthik.B.R, Aditya Ramana Rachakonda, Dr. Srinath Srinivasa
International Institute of Information Technology, Bangalore
December 16, 2008

Outline of Topics
1. Motivation
2. Introduction
3. Dataset
4. Approach
5. Query HeartBeat Properties
6. Results
7. Conclusion

Motivation
Consider an example query on Google Trends.
Queries: aircondition, watersport.
The two queries show similar temporal behavior.
The shape of the temporal patterns is also the same, even though the volumes differ.

Introduction
The objective was to observe the temporal properties of keyword queries and to cluster keywords based on those properties.
However, we found that generic keyword queries with large enough volumes tend to have the same temporal shape. We call this shape the attractor distribution.
A visually similar phenomenon was also apparent when keywords were searched on Google Trends.

Example
Consider the queries yahoo and access as an example, first with their raw volumes and then with their volumes normalized.

Dataset
We used the AOL query log dataset.
The collection consists of around 20 million web queries.
The collection spans 650,000 users.
The data was collected over a period of 3 months, from 1st March 2006 to 31st May 2006.

Approach
Data extraction from query logs.
Removing the scale factor from the shape.
Encoding the temporal shape of the queries.
Computing pair-wise similarities.
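The first step, data extraction, could be sketched as follows. This is a minimal illustration, not the authors' code. It assumes the log is tab-separated with columns AnonID, Query, QueryTime, ItemRank, ClickURL, and that a query counts toward a keyword when the keyword appears as a whitespace-separated token. The function name `daily_volumes` and the sample rows are hypothetical.

```python
from collections import Counter
from datetime import datetime


def daily_volumes(lines, keyword):
    """Count, per calendar day, the queries that contain `keyword` as a token."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip malformed rows
        query, timestamp = fields[1], fields[2]
        try:
            day = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").date()
        except ValueError:
            continue  # skips the header row and unparsable timestamps
        if keyword in query.lower().split():
            counts[day] += 1
    return dict(sorted(counts.items()))


# Hypothetical sample rows in the assumed format (not real log entries).
sample = [
    "AnonID\tQuery\tQueryTime\tItemRank\tClickURL",
    "1\tgoogle maps\t2006-03-01 10:00:00\t\t",
    "2\tyahoo mail\t2006-03-01 11:00:00\t\t",
    "3\tgoogle\t2006-03-02 09:30:00\t1\thttp://www.google.com",
]
volumes = daily_volumes(sample, "google")
```

The resulting per-day counts form the time series that the later steps normalize, encode, and compare.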

Encoding Temporal Shape of Queries
Consider the quantum of change in query volume from one day to the next.
This quantum of change is projected onto a set of seven primitives, A through G.
The primitives are assigned by mapping the slope of the query graph in one time interval onto a radial region.
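One way to picture the encoding is the sketch below. The slides do not give the boundaries of the regions, so this example assumes seven equal angular sectors between -90 and +90 degrees, with A as the steepest fall and G as the steepest rise. The name `encode_shape` and the parameter `dt` are illustrative only.

```python
import math

PRIMITIVES = "ABCDEFG"  # assumed order: A = steepest fall ... G = steepest rise


def encode_shape(volumes, dt=1.0):
    """Map each day-to-day change onto one of seven angular sectors."""
    symbols = []
    for q_t, q_next in zip(volumes, volumes[1:]):
        slope = (q_next - q_t) / dt
        angle = math.atan(slope)  # lies in (-pi/2, pi/2)
        fraction = (angle + math.pi / 2) / math.pi  # rescaled to (0, 1)
        index = min(int(fraction * len(PRIMITIVES)), len(PRIMITIVES) - 1)
        symbols.append(PRIMITIVES[index])
    return "".join(symbols)


signature = encode_shape([0, 0, 1000, 0, 1])
```

A flat day falls in the middle sector (D under these assumptions), while a sharp rise or fall lands in G or A.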

Least Biased Distribution
The slope of the line depicting change is determined as:
$$\Delta q = \frac{q_{t+1} - q_t}{\Delta t} \qquad (1)$$
To minimize biases in the calibration, we need to set $\Delta t$ to a value that maximizes the entropy of the points.
The entropy in the distribution of the primitives is given as:
$$H(\Delta q) = -\sum_{\Delta q \in [A \ldots G]} p(\Delta q) \log_2 p(\Delta q) \qquad (2)$$
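The calibration idea can be sketched as a small search: encode the series for several candidate values of the time scale and keep the one whose primitives have the highest entropy. This is only an interpretation of the slide. The equal-sector encoding, the candidate list, and the names `encode`, `entropy`, and `best_dt` are assumptions, and the volumes are made-up numbers.

```python
import math
from collections import Counter


def encode(volumes, dt, n_primitives=7):
    """Assumed encoding: equal angular sectors, returned as sector indices."""
    out = []
    for a, b in zip(volumes, volumes[1:]):
        fraction = (math.atan((b - a) / dt) + math.pi / 2) / math.pi
        out.append(min(int(fraction * n_primitives), n_primitives - 1))
    return out


def entropy(symbols):
    """Shannon entropy (base 2) of the empirical symbol distribution."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def best_dt(volumes, candidates):
    """Pick the candidate time scale whose encoding has maximum entropy."""
    return max(candidates, key=lambda dt: entropy(encode(volumes, dt)))


# Made-up volumes, for illustration only.
example_volumes = [0, 10, 0, 10, 5, 5, 100, 0]
candidates = [0.01, 1.0, 10.0, 100.0]
chosen = best_dt(example_volumes, candidates)
```

With a very small time scale almost every change looks vertical and only the extreme primitives are used, so the entropy is low. A better-calibrated scale spreads the changes over more primitives.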

Normalization
Consider the queries yahoo and access as an example, first with their raw volumes and then with their volumes normalized.
The data is normalized as follows:
$$\mathrm{Val} = \frac{X - \mu}{\sigma} \qquad (3)$$
where $\mu$ and $\sigma$ are the mean and standard deviation of the query's volume series.
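Equation (3) is the usual z-score, which a few lines of Python can illustrate. The sketch assumes the population standard deviation (the slides do not say whether the population or sample form was used), and the name `normalize` is hypothetical.

```python
import statistics


def normalize(volumes):
    """Z-score a volume series: subtract the mean, divide by the std deviation."""
    mu = statistics.fmean(volumes)
    sigma = statistics.pstdev(volumes)  # population form, an assumption
    if sigma == 0:
        return [0.0 for _ in volumes]  # a constant series has no shape to scale
    return [(x - mu) / sigma for x in volumes]


low_volume = normalize([1, 2, 3])
high_volume = normalize([100, 200, 300])
```

Two series that differ only by a scale factor normalize to the same values, which is what lets shapes be compared independently of volume.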

Measure Similarity between Temporal Sequences
We use Dynamic Time Warping (DTW) to compute the distance between the respective time series.
The algorithm finds the least-weight path through a matrix whose rows represent one temporal signature and whose columns represent the other.
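A textbook dynamic-programming form of DTW is sketched below for orientation. It is not necessarily the variant used in the study. The local cost (absolute difference between numeric values) and the name `dtw_distance` are assumptions.

```python
def dtw_distance(s, t):
    """Classic O(len(s) * len(t)) DTW with absolute difference as local cost."""
    n, m = len(s), len(t)
    inf = float("inf")
    # cost[i][j] = least accumulated cost of aligning s[:i] with t[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(s[i - 1] - t[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # advance in s only
                cost[i][j - 1],      # advance in t only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
```

Because the path may repeat an element of either series, a pattern that is stretched in time still aligns at zero cost, while a constant offset accumulates cost at every step.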

Query HeartBeat Properties
We noticed that queries whose volume crosses a certain threshold tend to follow the same temporal pattern.
The queries that follow this attractor distribution are generic queries. Their volume is not influenced by external events (say, an event associated with the query).
Similar behavior of queries was observed on Google Trends.

Attractor Distribution on Google Trends
Queries: book, calculator, address, mobile.
Queries: book, california, women, hair, access.

Attractor Distribution for AOL Dataset
The query google is the most queried word in the AOL dataset.
Queries: google, yahoo, ebay, sale, lyric.
Query volume: [values missing in source].
DTW distances with respect to google: yahoo, ebay, sale, lyric [values missing in source].

Normalized Results for AOL Dataset
Queries: google, yahoo, ebay, sale, lyric.

Attractor Distribution for AOL Dataset
Queries: google, book, car, picture.
Query volume: [value missing in source], 93649, [value missing in source], [value missing in source].
DTW distances with respect to google: book- 6, car, picture [values incomplete in source].

Normalized Results for AOL Dataset
Queries: google, book, car, picture.

Attractor Distribution for AOL Dataset
Queries: google, house, school, mexico.
Query volume: [values missing in source].
DTW distances with respect to google: house, school, mexico [values missing in source].

Normalized Results for AOL Dataset
Queries: google, house, school, mexico.

Normalized Results for AOL Dataset
We plotted the normalized volume of all queries put together against that of google.
The aggregated heartbeat of all queries follows the query heartbeat, even though the volumes differ vastly.

AOL Attractor Distribution
We plotted the DTW distance of queries to google on a log-log graph of query volume vs. queries in decreasing order of volume.
We also plotted the variation of the DTW distance vs. the log of query volume.
DTW distance and the log of query volume showed a high negative correlation.
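The correlation mentioned above could be computed along these lines. The sketch assumes Pearson correlation, which the slides do not specify. The numbers are placeholders chosen to show the mechanics and are not values from the study, and `pearson` is a hypothetical helper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Placeholder data: NOT taken from the AOL results.
placeholder_volumes = [1000, 10000, 100000, 1000000]
placeholder_distances = [40.0, 30.0, 20.0, 10.0]
r = pearson([math.log10(v) for v in placeholder_volumes], placeholder_distances)
```

A value of r close to -1 would mean that higher-volume queries sit closer to the attractor distribution, which is the direction the slide reports.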

Idea of major events distorting the attractor distribution in queries
Consider the queries google and myspace on the AOL dataset.
Query volume: [values missing in source]. DTW distance: [value missing in source].
On 31st March 2006, the market share of visits to Myspace Video increased by 1242%.
On 2nd May 2006, Myspace denied access to a user who had created a profile in the name of Barack Obama, then a U.S. Senator from Illinois.
On 22nd May 2006, two teenagers were charged with illegal computer access into Myspace and attempted extortion worth $150,000.

Characteristics of attractor distribution
The primitives can be divided into three broad categories: Rise, Fall and Constant trend in query volume.
We calculated the transition probabilities of queries between these three categories:
After a Rise: P(fall | rise) = 0.68, P(rise | rise) = 0.27, P(constant | rise) = 0.05
After a Fall: P(rise | fall) = 0.60, P(fall | fall) = 0.28, P(constant | fall) = 0.12
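Transition probabilities of this kind can be estimated by counting consecutive pairs of categories, as in the sketch below. The input sequence is made up for illustration, and the name `transition_probabilities` is hypothetical. How the seven primitives are grouped into the three categories is not stated in the slides, so the example starts from category labels directly.

```python
from collections import Counter, defaultdict


def transition_probabilities(categories):
    """Estimate P(next | current) from counts of consecutive category pairs."""
    pair_counts = defaultdict(Counter)
    for current, nxt in zip(categories, categories[1:]):
        pair_counts[current][nxt] += 1
    return {
        current: {nxt: c / sum(counts.values()) for nxt, c in counts.items()}
        for current, counts in pair_counts.items()
    }


# Made-up category sequence, for illustration only.
sequence = ["rise", "fall", "rise", "fall", "fall", "rise", "rise"]
probs = transition_probabilities(sequence)
```

Each row of the result sums to 1, matching the reported figures (0.68 + 0.27 + 0.05 and 0.60 + 0.28 + 0.12).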

Conclusion and Future Work
Conclusion
We showed that disparate keyword queries on the Web exhibit strange central limit properties.
The heartbeat seems to be characteristic of generic search terms that have large enough volumes and are not affected by periodicity or external events.
The volume of a query determines how close its distribution is to the attractor distribution.
Future Work
It would be interesting to discern further characteristics of the attractor distribution.
Why this attractor distribution?
What is the generative model that gives rise to this distribution?

Thank you


More information

Ready To Go On? Skills Intervention 4-1 Graphing Relationships

Ready To Go On? Skills Intervention 4-1 Graphing Relationships Read To Go On? Skills Intervention -1 Graphing Relationships Find these vocabular words in Lesson -1 and the Multilingual Glossar. Vocabular continuous graph discrete graph Relating Graphs to Situations

More information
