Antonio Fernández Anta
1 Antonio Fernández Anta Joint work with Luis F. Chiroque, Héctor Cordobés, Rafael A. García Leiva, Philippe Morere, Lorenzo Ornella, Fernando Pérez, and Agustín Santos
2 Recommendation Engines (RE) suggest items to users. RE are becoming highly popular in many different contexts: - Shopping websites (Amazon) - Content distribution (Netflix, Spotify) - Online social networks (Facebook, Twitter, LinkedIn)
3 Some authors talk about the age of recommendation versus the age of search (Chris Anderson, The Long Tail)
4 Most modern RE are based on collaborative filtering: Recommendations are based on - Historic data - Similarity between users - Similarity between items Multiple metrics to quantify similarity: Euclidean distance, Cosine similarity, Pearson correlation similarity, etc.
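As a rough illustration of the similarity metrics named on this slide, the sketch below computes Euclidean distance, cosine similarity, and Pearson correlation between two rating vectors. The function names and the example vectors are assumptions made for illustration; the slides do not say which metric or implementation the authors used.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation, computed as the cosine of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Hypothetical ratings of four items by two users.
alice = [5, 3, 0, 1]
bob = [4, 2, 0, 1]
print(euclidean_distance(alice, bob))
print(cosine_similarity(alice, bob))
print(pearson_correlation(alice, bob))
```

Euclidean distance is a dissimilarity (smaller means closer), while the other two are similarities (larger means closer), so a ranking built on them has to sort in opposite directions.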
5 The users and items are typically connected as a bipartite graph. Considering users as nodes and items as hyperedges (or vice versa) we obtain hypergraphs (that can be transformed into graphs of users or items)
6 Graph theory and network analysis concepts can be useful (as before) - Google, PageRank - Natural Language Processing
7 We explore graph-based approaches for recommendation engines Apply them to an ecosystem of smartphone apps. - Apps are advertised in other apps via banners - Performance metric is click-through rate (CTR) Large (big) data available - More than a billion records - Millions of users - Hundreds of items (apps)
8 Recommendation engines based on - Collaborative filtering (in wide sense) - Graph theory concepts The engines were evaluated in the real world (and showed good performance)
9 Involves processing the historical data available with big data technologies: Hadoop, Elastic Map Reduce, Pig. The processing is done in a few hours thanks to these technologies
10 There are more than 100 GiB of historical data - Millions of user applications - More than 1,400 million records. The data is in multiple tables. The first process is to clean the data. This involves joining several tables
11 The output of this process is a file with records that contain - User - Application advertised - Running application (publisher app) - Action (advertisement, click) - Date and time. In MySQL we started one of the join operations (not the largest) and stopped it after 30 hours, with no more than 15% completed
12 We have used Hadoop on Amazon Elastic Map Reduce and Pig scripts to process the data. The output has more than 700 million records
13 From the clean data the graphs used by the RE have been generated This process is less time consuming since it typically involves data aggregation
14 Common users graph: the graph that has apps as nodes and undirected links weighted by the number of common users
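A minimal sketch of how such a graph could be aggregated from a user-to-apps mapping. The input layout, the function name `common_users_graph`, and the sample data are assumptions; the slides only state that the graphs were generated from the cleaned data by aggregation.

```python
from collections import Counter
from itertools import combinations


def common_users_graph(user_apps):
    """Return {(app_a, app_b): number of users that have both apps}.

    Each undirected link is stored once, with the app names in sorted order.
    """
    weights = Counter()
    for apps in user_apps.values():
        # Every pair of apps used by the same user gains one common user.
        for a, b in combinations(sorted(set(apps)), 2):
            weights[(a, b)] += 1
    return weights


# Hypothetical data: which apps each user runs.
user_apps = {
    "u1": {"chess", "maps", "news"},
    "u2": {"chess", "maps"},
    "u3": {"maps", "news"},
}
graph = common_users_graph(user_apps)
print(graph[("chess", "maps")])  # 2: u1 and u2 have both apps
```

With hundreds of apps the graph has at most a few tens of thousands of links, so this aggregation step is much lighter than the joins that precede it.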
15 Shared users: The apps with the largest number of users shared with the publisher app are the preferred recommendations
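The shared users rule can be sketched as a lookup of the publisher app's neighbours, sorted by link weight. The edge-dictionary layout, the tie-break by app name, and the parameter `k` are assumptions for illustration.

```python
def shared_users_recommend(weights, publisher, k=3):
    """Rank the apps by the number of users they share with `publisher`.

    `weights` maps an undirected link (app_a, app_b) to its number of
    common users. Ties are broken alphabetically (an arbitrary choice).
    """
    neighbours = {}
    for (a, b), w in weights.items():
        if a == publisher:
            neighbours[b] = w
        elif b == publisher:
            neighbours[a] = w
    ranked = sorted(neighbours.items(), key=lambda item: (-item[1], item[0]))
    return [app for app, _ in ranked[:k]]


# Hypothetical common users graph.
weights = {
    ("chess", "maps"): 2,
    ("chess", "news"): 1,
    ("maps", "news"): 2,
    ("maps", "puzzle"): 5,
}
print(shared_users_recommend(weights, "maps", k=2))  # ['puzzle', 'chess']
```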
16 Filtering algorithm: - Let $v$ be the binary vector of the requesting user's apps, and $M$ the adjacency matrix of the common users graph - The apps whose positions have the largest values in $v^{T} M$ are the preferred recommendations
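A small sketch of the vector-matrix product behind the filtering algorithm, using plain lists. Skipping the apps the user already has is an assumption added here, as are the names and the sample matrix; the slide only specifies ranking by the entries of the product.

```python
def filtering_scores(v, M):
    """Compute the row vector v^T M for a binary vector v and a square matrix M."""
    n = len(v)
    return [sum(v[i] * M[i][j] for i in range(n)) for j in range(n)]


def filtering_recommend(v, M, apps, k=2):
    """Rank apps by their entry in v^T M.

    Assumption: apps the user already has (v[j] == 1) are not recommended.
    """
    scores = filtering_scores(v, M)
    candidates = [(scores[j], apps[j]) for j in range(len(apps)) if v[j] == 0]
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [app for _, app in candidates[:k]]


# Hypothetical common users graph as a symmetric adjacency matrix.
apps = ["chess", "maps", "news", "puzzle"]
M = [
    [0, 2, 1, 0],
    [2, 0, 2, 5],
    [1, 2, 0, 0],
    [0, 5, 0, 0],
]
v = [1, 1, 0, 0]  # the requesting user has chess and maps

print(filtering_scores(v, M))           # [2, 2, 3, 5]
print(filtering_recommend(v, M, apps))  # ['puzzle', 'news']
```

Each score is the total number of users an app shares with all of the requesting user's apps, which is what distinguishes this engine from shared users, where only the publisher app counts.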
17 Common users graph, modulated by age (user weight decreases exponentially with age): $\mathrm{Weight}(app_1, app_2) = \sum_{u} \delta^{\mathrm{age}(u)}$, where the sum runs over the users $u$ common to both apps. Aged shared users: Same as shared users in the aged graph. Aged filtering: Same as filtering in the aged graph
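The aged weights can be sketched by replacing the unit contribution of each common user with an exponentially decaying one. The meaning of age (here, an abstract number of time units since the user was last seen), the value of the decay base `delta`, and all names are assumptions; the slides give neither the unit of age nor the value of δ.

```python
from collections import defaultdict
from itertools import combinations


def aged_common_users_graph(user_apps, user_age, delta=0.5):
    """Return {(app_a, app_b): sum over common users u of delta ** age(u)}.

    `delta` should lie in (0, 1) so that older users weigh less.
    """
    weights = defaultdict(float)
    for user, apps in user_apps.items():
        contribution = delta ** user_age[user]
        for a, b in combinations(sorted(set(apps)), 2):
            weights[(a, b)] += contribution
    return dict(weights)


# Hypothetical users, their apps, and their ages.
user_apps = {
    "u1": {"chess", "maps", "news"},
    "u2": {"chess", "maps"},
    "u3": {"maps", "news"},
}
user_age = {"u1": 0, "u2": 1, "u3": 2}

aged = aged_common_users_graph(user_apps, user_age, delta=0.5)
print(aged[("chess", "maps")])  # 1.5 = 0.5**0 + 0.5**1
```

With `delta=1` every user contributes 1 and the result is the plain common users graph, so the aged engines reduce to their non-aged counterparts.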
18 The CTR graph is a directed graph with links weighted by the frequency of clicks in the banner of the application (head) in the publisher (tail) [Figure: example CTR graph with edge labels 5/20, 4/12, 25/50, 6/10, 22/40, 7/14, 4/50]
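One way to build such a graph from the cleaned records is sketched below. It assumes each record carries the publisher app, the advertised app, and an action that is either "advertisement" (the banner was shown) or "click", and that the link weight is clicks divided by advertisements. The record layout and function name are hypothetical.

```python
from collections import defaultdict


def ctr_graph(records):
    """Return {(publisher, advertised): clicks / advertisements}.

    The link is directed from the publisher app (tail) to the advertised
    app (head). Pairs with no advertisement records are left out.
    """
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for publisher, advertised, action in records:
        edge = (publisher, advertised)
        if action == "advertisement":
            shown[edge] += 1
        elif action == "click":
            clicked[edge] += 1
    return {edge: clicked[edge] / count for edge, count in shown.items()}


# Hypothetical records: (publisher app, advertised app, action).
records = [
    ("maps", "chess", "advertisement"),
    ("maps", "chess", "advertisement"),
    ("maps", "chess", "advertisement"),
    ("maps", "chess", "advertisement"),
    ("maps", "chess", "click"),
    ("news", "chess", "advertisement"),
    ("news", "chess", "advertisement"),
    ("news", "chess", "click"),
]
graph = ctr_graph(records)
print(graph[("maps", "chess")])  # 0.25
print(graph[("news", "chess")])  # 0.5
```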
19 Maxflow algorithm: - Recommendation algorithm used to promote specific applications - The apps with the largest maxflow in the CTR graph to the promoted apps are the preferred recommendations [Figure: CTR graph with a source node and the promoted app]
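The sketch below shows one possible reading of this engine: treat the CTR of each directed link as its capacity, compute the maximum flow from every candidate app to a single promoted app, and rank the candidates by that value. The slides do not name the flow algorithm used; Edmonds-Karp is swapped in here as a standard choice, and the graph, names, and tie-break are hypothetical.

```python
from collections import defaultdict, deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on {(tail, head): capacity}."""
    if source == sink:
        return 0.0
    residual = defaultdict(lambda: defaultdict(float))
    for (u, v), c in capacity.items():
        residual[u][v] += c
        residual[v][u] += 0.0  # make sure the reverse link exists
    flow = 0.0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck along the path, then push flow through it.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


def maxflow_recommend(capacity, promoted, candidates, k=2):
    """Rank candidate apps by their maximum flow to the promoted app."""
    scored = [(max_flow(capacity, app, promoted), app)
              for app in candidates if app != promoted]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [app for _, app in scored[:k]]


# Hypothetical CTR graph; "p" is the promoted app.
ctr = {
    ("a", "p"): 0.5,
    ("b", "a"): 0.25,
    ("b", "c"): 0.5,
    ("c", "p"): 0.25,
    ("d", "c"): 0.125,
}
print(max_flow(ctr, "b", "p"))                            # 0.5
print(maxflow_recommend(ctr, "p", ["a", "b", "c", "d"]))  # ['a', 'b']
```

App "b" has no direct link to "p" but still scores as high as "a", because flow reaches "p" along two indirect paths. This is the kind of global information that the purely local shared users rule cannot see.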
20 For reference there are two basic recommendation engines: - Random: Engine that suggests random applications - Static promotion: Always returns the promoted apps
21 The different algorithms have been tested over a week in the real system
Algorithm            CTR
Random               1.57%
Shared users         1.64%
Aged shared users    1.69%
Filtering            1.51%
Aged filtering       1.71%
Static promotion     1.45%
Maxflow              1.86%
- Aging is useful - Global view - These values improve over the current CTR
22 Graph analytics can be useful in the development of recommendation engines. Big data technology allowed us to process historical data and produce graphs. The graphs generated are small. They could be processed with classical technologies. Current MapReduce technologies do not seem to be the solution for large graph analysis
23 Explore technologies that are more suited for large graph analytics: - GraphLab, GraphChi - Spark, GraphX - Stratosphere, Flink, Spargel. Devise ways to process incremental data. Design and testing of new recommendation algorithms that use larger graphs
24 Thank you!