CIS : Scalable Data Analysis


1 CIS : Scalable Data Analysis
Visualization
Dr. David Koop

2 Growth of Data

3 Usefulness of Data

4 Analyzed Data

5 Example Data Sources
- Radio Telescopes
- Twitter
- Wind Turbine Sensors
- Surveillance Cameras
- Cars & Airplanes
- Dog Collars
- Dishwashers
- Traffic Lights
- MRI Scanners
- NFL Football Players
- Farming
Image credits: [Zebra MotionWorks] [CC-SA 2.0, Stephan Trebs]

6 Large Synoptic Survey Telescope (LSST)
- Image every 15 seconds
- 100PB over 10 years

7 Large Numerical Simulations
- Millennium simulation: dark matter, 30TB raw data [V. Springel et al., 2005]

8 More Data Sources
- Awesome Public Datasets
- Kaggle Datasets
- Government Data: data.gov
- Customer Data: (see
- Internal Business Data

9 Dimensions of Data
Dimensions are grouped as objective or subjective. [Schomm et al., 2013]

Dimension | Categories | Question to be answered
Type | Web Crawler, Customizable Crawler, Search Engine, Pure Data Vendor, Complex Data Vendor, Matching Vendor, Enrichment Tagging, Enrichment Sentiment, Enrichment Analysis, Data Market Place | What is the type of the core offering?
Time Frame | Static/Factual, Up To Date | Is the data static or real-time?
Domain | All, Finance/Economy, Bio Medicine, Social Media, Geo Data, Address Data | What is the data about?
Data Origin | Internet, Self-Generated, User, Community, Government, Authority | Where does the data come from? Who is the author?
Pricing Model | Free, Freemium, Pay-Per-Use, Flat Rate | Is the offer free, pay-per-use or usable with a flat rate?
Data Access | API, Download, Specialized Software, Web Interface | What technical means are offered to access the data?
Data Output | XML, CSV/XLS, JSON, RDF, Report | In what way is the data formatted for the user?
Language | English, German, More | What is the language of the website? Does it differ from the language of the data?
Target Audience | Business, Customer | Towards whom is the product geared?
Trustworthiness | Low, Medium, High | How trustworthy is the vendor? Can the original data source be tracked or verified?
Size of Vendor | Startup, Medium, Big, Global Player | How big is the vendor?
Maturity | Research Project, Beta, Medium, High | Is the product still in beta or already established?

10 Big Data or Small Data?
- Many companies feel the need to overclaim the amount of data
- "when you take a normal tech company and sprinkle on data, you get the next Google" [C. O'Neil]
- Many large datasets are not useful
- Twitter processes 8TB, but the tweets only take about 30GB
- Wikipedia can be downloaded onto a USB drive
- All MP3s can be stored on a moderately sized disk array
- Can learn a lot from a "small" dataset, e.g. sensors from a single turbine, grocery store, Apple Watch
- Small data focused on end-user, more timely insights?

11 Jobs on a Large Analytics Cluster [R. Appuswamy et al., 2013]

12 Reading Quiz

13 Assignment 1
cis fa/assignment1.html
Boston Property Assessments
- Initial exploratory analysis
- Use a Python Notebook
- May use pandas
- Label subproblems and answers
- Show work (even if it's not your final answer)
[Google Maps]
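
A first exploratory pass of the kind the assignment asks for might look roughly like the sketch below. It uses only the Python standard library so it runs anywhere; pandas would do the same in fewer lines. The sample rows and the column names (`parcel_id`, `land_use`, `total_value`, `year_built`) are invented for illustration and are not the actual schema of the Boston Property Assessments file.

```python
import csv
import io
from collections import Counter
from statistics import mean, median

# Hypothetical sample rows; the real assessment file has its own column names.
SAMPLE = """parcel_id,land_use,total_value,year_built
1,R1,450000,1920
2,R1,510000,1935
3,CD,380000,1988
4,R2,720000,1905
5,CD,,2004
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def numeric_column(rows, name):
    """Values of one column as floats, skipping blank cells."""
    return [float(r[name]) for r in rows if r[name].strip()]


rows = load_rows(SAMPLE)
values = numeric_column(rows, "total_value")

# A labeled summary: row count, missing values, central tendency, categories.
summary = {
    "rows": len(rows),
    "missing_total_value": len(rows) - len(values),
    "mean_total_value": mean(values),
    "median_total_value": median(values),
    "land_use_counts": Counter(r["land_use"] for r in rows),
}

for key, value in summary.items():
    print(f"{key}: {value}")
```

Counting missing cells before computing statistics keeps the "show work" trail honest: the mean and median above describe four rows, not five.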

14 Big Data Visualization (Slides from Dr. Nan Cao via Dr. Ching-Yung Lin)

15 Big Data Visualization
- What is Visualization and Why Visualization?
- Big Data Visualization Challenges and Techniques
- Visualizing Big Data
- Visual Analytics and Big Data

16 Whisper: Tracing Information Diffusion in Real Time

17 Customizing Computational Methods for Visual Analytics with Big Data
J. Choo and H. Park

18 Complexities of Visual Analytics of Big Data
Human perception and large numbers of items:
- locating items
- tracking items
Limited screen space:
- clutter
- overlapping items

19 Use Computational Methods
Methods:
- Dimensionality reduction
- Clustering
- Machine learning & data mining
Issues with using these methods:
- What's going on?
- Waiting time
Goal:
- Interactive
- Faster

20 Exploiting Discrepancies
- Precision: use knowledge of screen resolution to set precision
- Convergence: don't worry about minor changes that may be imperceptible
  - Human perception
  - Screen resolution constraints
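
One way to read the precision and convergence points is as a stopping rule: if no plotted point lands on a different pixel after an iteration, further refinement cannot be seen. The sketch below illustrates that reading. It is not the method from the Choo and Park paper; the function names, the 1000x1000 plot area, and the sample coordinates are all assumptions made for the example.

```python
def pixel_tolerance(data_min, data_max, pixels):
    """Data-space width of one pixel along an axis that spans `pixels` pixels."""
    return (data_max - data_min) / pixels


def to_pixel(value, data_min, data_max, pixels):
    """Index of the pixel column (or row) that a data value lands in."""
    scaled = (value - data_min) / (data_max - data_min) * pixels
    return min(pixels - 1, max(0, int(scaled)))


def visually_converged(old_points, new_points, bounds, screen):
    """True when no point moved to a different pixel between two iterations."""
    (xmin, xmax), (ymin, ymax) = bounds
    width, height = screen
    for (x0, y0), (x1, y1) in zip(old_points, new_points):
        if to_pixel(x0, xmin, xmax, width) != to_pixel(x1, xmin, xmax, width):
            return False
        if to_pixel(y0, ymin, ymax, height) != to_pixel(y1, ymin, ymax, height):
            return False
    return True


bounds = ((0.0, 10.0), (0.0, 10.0))  # data range on each axis
screen = (1000, 1000)                # assumed plot area in pixels

before = [(2.5001, 7.2504)]
after_small = [(2.5004, 7.2507)]     # shift is well below one pixel (0.01 units)
after_large = [(2.6000, 7.2504)]     # shift of about ten pixels horizontally

small_move_converged = visually_converged(before, after_small, bounds, screen)
large_move_converged = visually_converged(before, after_large, bounds, screen)
```

With these bounds one pixel covers 0.01 data units, so computing coordinates to many more digits than that buys nothing on screen.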

21 Changes in Cluster Membership in k-means
[Figure: number of per-iteration membership changes and accuracy (%) against the final solution, plotted against the number of iterations]
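
The two quantities in this figure can be computed for any k-means run by keeping the label vector from every iteration. The following is a minimal sketch using plain Lloyd's algorithm on a tiny invented data set; the points, the deliberately poor initial centers, and the convention of counting every item as changed in the first iteration are assumptions for illustration, not values from the slide.

```python
def assign(points, centers):
    """Index of the nearest center (squared Euclidean distance) per point."""
    return [
        min(range(len(centers)),
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centers[c])))
        for point in points
    ]


def update(points, labels, centers):
    """Mean of each cluster; an empty cluster keeps its previous center."""
    new_centers = []
    for c, old in enumerate(centers):
        members = [p for p, label in zip(points, labels) if label == c]
        if members:
            new_centers.append(
                tuple(sum(dim) / len(members) for dim in zip(*members)))
        else:
            new_centers.append(old)
    return new_centers


def kmeans_history(points, centers, max_iter=100):
    """Run Lloyd's algorithm and keep the label vector from every iteration."""
    history = []
    for _ in range(max_iter):
        labels = assign(points, centers)
        history.append(labels)
        if len(history) > 1 and labels == history[-2]:
            break  # no membership changed, so the run has converged
        centers = update(points, labels, centers)
    return history


def convergence_profile(history):
    """(membership changes, accuracy % against the final labels) per iteration."""
    final = history[-1]
    n = len(final)
    rows, previous = [], None
    for labels in history:
        # Convention used here: the first iteration counts all n items as changed.
        changes = n if previous is None else sum(
            a != b for a, b in zip(previous, labels))
        accuracy = 100.0 * sum(a == b for a, b in zip(labels, final)) / n
        rows.append((changes, accuracy))
        previous = labels
    return rows


points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (5, 5), (5, 6), (6, 5), (6, 6), (3, 3)]
history = kmeans_history(points, centers=[(0, 0), (1, 1)])
profile = convergence_profile(history)

for i, (changes, accuracy) in enumerate(profile, start=1):
    print(f"iteration {i}: {changes} changes, {accuracy:.1f}% match final")
```

Even with a poor start, most labels here already match the final solution after the first iteration, which is the pattern that makes showing early iterations to the user worthwhile.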

22 Customizing Computations
- Use lower precision computation
- Use interactive visualization that shows iterations
- Refine results iteratively
- Data scale confinement

23 Iteration-level Interactive Visualization
[Figure: two panels, (a) and (b). Each shows a computational module (Subroutine 1 through Subroutine k, Iterate), its Output, Visualization/summarization, and Interaction]
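
The idea of exposing each iteration to the visualization, rather than only the final output, maps naturally onto a Python generator. The sketch below is one possible arrangement, not the design from the slide: `iterate`, `run_interactively`, `render`, and `should_stop` are hypothetical names, and a Babylonian square-root iteration stands in for a real analysis routine such as clustering or dimensionality reduction.

```python
def iterate(step, state, max_iter=1000):
    """Yield (iteration, state) after every iteration, not only at the end."""
    for i in range(1, max_iter + 1):
        state = step(state)
        yield i, state


def run_interactively(computation, render, should_stop):
    """Drive the computation, render each intermediate result, allow early stop."""
    last = None
    for i, state in computation:
        render(i, state)           # visualization/summarization of this iteration
        last = (i, state)
        if should_stop(i, state):  # stands in for the user's interaction
            break
    return last


frames = []  # what a visualization would have drawn, one entry per iteration


def render(i, state):
    frames.append(state)


def should_stop(i, state):
    # Stop once two consecutive frames differ by less than a visible amount.
    return len(frames) >= 2 and abs(frames[-1] - frames[-2]) < 1e-3


def sqrt2_step(x):
    """One Babylonian iteration toward the square root of 2 (stand-in workload)."""
    return (x + 2.0 / x) / 2.0


last = run_interactively(iterate(sqrt2_step, 1.0), render, should_stop)
print(last, len(frames))
```

Because the generator is lazy, no iteration runs beyond the one at which the consumer stops, so an early stop saves the remaining computation.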

24 Next: Progressive Visualization
Read: How Progressive Visualizations Affect Exploratory Analysis
Write:
- Critique of Paper
- < 1 paragraph summary, 2 paragraphs critique
  - Which ideas in the paper are interesting and why?
  - Which ideas do you have related to the paper?
  - Which ideas seem problematic? Can you suggest alternatives?
- Turn in via mycourses
- Due Tuesday before class

INTRODUCTION TO BIG DATA, DATA MINING, AND MACHINE LEARNING

INTRODUCTION TO BIG DATA, DATA MINING, AND MACHINE LEARNING CS 7265 BIG DATA ANALYTICS INTRODUCTION TO BIG DATA, DATA MINING, AND MACHINE LEARNING * Some contents are adapted from Dr. Hung Huang and Dr. Chengkai Li at UT Arlington Mingon Kang, PhD Computer Science,

More information

Nowcasting. D B M G Data Base and Data Mining Group of Politecnico di Torino. Big Data: Hype or Hallelujah? Big data hype?

Nowcasting. D B M G Data Base and Data Mining Group of Politecnico di Torino. Big Data: Hype or Hallelujah? Big data hype? Big data hype? Big Data: Hype or Hallelujah? Data Base and Data Mining Group of 2 Google Flu trends On the Internet February 2010 detected flu outbreak two weeks ahead of CDC data Nowcasting http://www.internetlivestats.com/

More information

Apache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context

Apache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context 1 Apache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context Generality: diverse workloads, operators, job sizes

More information

Data-Intensive Distributed Computing

Data-Intensive Distributed Computing Data-Intensive Distributed Computing CS 451/651 431/631 (Winter 2018) Part 5: Analyzing Relational Data (1/3) February 8, 2018 Jimmy Lin David R. Cheriton School of Computer Science University of Waterloo

More information

CC PROCESAMIENTO MASIVO DE DATOS OTOÑO 2018

CC PROCESAMIENTO MASIVO DE DATOS OTOÑO 2018 CC5212-1 PROCESAMIENTO MASIVO DE DATOS OTOÑO 2018 Lecture 1: Introduction Aidan Hogan aidhog@gmail.com THE VALUE OF DATA Soho, London, 1854 Cholera: What we know now Cholera: What we knew in 1854 1854:

More information

Big Data - Some Words BIG DATA 8/31/2017. Introduction

Big Data - Some Words BIG DATA 8/31/2017. Introduction BIG DATA Introduction Big Data - Some Words Connectivity Social Medias Share information Interactivity People Business Data Data mining Text mining Business Intelligence 1 What is Big Data Big Data means

More information

Topics. Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples

Topics. Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples Hadoop Introduction 1 Topics Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples 2 Big Data Analytics What is Big Data?

More information

Based on Big Data: Hype or Hallelujah? by Elena Baralis

Based on Big Data: Hype or Hallelujah? by Elena Baralis Based on Big Data: Hype or Hallelujah? by Elena Baralis http://dbdmg.polito.it/wordpress/wp-content/uploads/2010/12/bigdata_2015_2x.pdf 1 3 February 2010 Google detected flu outbreak two weeks ahead of

More information

Big Data Analytics. Izabela Moise, Evangelos Pournaras, Dirk Helbing

Big Data Analytics. Izabela Moise, Evangelos Pournaras, Dirk Helbing Big Data Analytics Izabela Moise, Evangelos Pournaras, Dirk Helbing Izabela Moise, Evangelos Pournaras, Dirk Helbing 1 Big Data "The world is crazy. But at least it s getting regular analysis." Izabela

More information

2/26/2017. Originally developed at the University of California - Berkeley's AMPLab

2/26/2017. Originally developed at the University of California - Berkeley's AMPLab Apache is a fast and general engine for large-scale data processing aims at achieving the following goals in the Big data context Generality: diverse workloads, operators, job sizes Low latency: sub-second

More information

Scalable Data Analysis (CIS )

Scalable Data Analysis (CIS ) Scalable Data Analysis (CIS 602-01) Introduction Dr. David Koop NYC Taxi Data [Analyzing 1.1 Billion NYC Taxi and Uber Trips, with a Vengeance, T. W. Schneider] 2 What are your questions about this data?

More information

Programming Technologies for Web Resource Mining

Programming Technologies for Web Resource Mining Programming Technologies for Web Resource Mining SoftLang Team, University of Koblenz-Landau Prof. Dr. Ralf Lämmel Msc. Johannes Härtel Msc. Marcel Heinz Motivation What are interesting web resources??

More information

2013 AWS Worldwide Public Sector Summit Washington, D.C.

2013 AWS Worldwide Public Sector Summit Washington, D.C. 2013 AWS Worldwide Public Sector Summit Washington, D.C. EMR for Fun and for Profit Ben Butler Sr. Manager, Big Data butlerb@amazon.com @bensbutler Overview 1. What is big data? 2. What is AWS Elastic

More information

Embedded Technosolutions

Embedded Technosolutions Hadoop Big Data An Important technology in IT Sector Hadoop - Big Data Oerie 90% of the worlds data was generated in the last few years. Due to the advent of new technologies, devices, and communication

More information

Analytics Platform for ATLAS Computing Services

Analytics Platform for ATLAS Computing Services Analytics Platform for ATLAS Computing Services Ilija Vukotic for the ATLAS collaboration ICHEP 2016, Chicago, USA Getting the most from distributed resources What we want To understand the system To understand

More information

Introduction to MapReduce (cont.)

Introduction to MapReduce (cont.) Introduction to MapReduce (cont.) Rafael Ferreira da Silva rafsilva@isi.edu http://rafaelsilva.com USC INF 553 Foundations and Applications of Data Mining (Fall 2018) 2 MapReduce: Summary USC INF 553 Foundations

More information

Data Intensive Scalable Computing. Thanks to: Randal E. Bryant Carnegie Mellon University

Data Intensive Scalable Computing. Thanks to: Randal E. Bryant Carnegie Mellon University Data Intensive Scalable Computing Thanks to: Randal E. Bryant Carnegie Mellon University http://www.cs.cmu.edu/~bryant Big Data Sources: Seismic Simulations Wave propagation during an earthquake Large-scale

More information

Big Data Analytics: What is Big Data? Stony Brook University CSE545, Fall 2016 the inaugural edition

Big Data Analytics: What is Big Data? Stony Brook University CSE545, Fall 2016 the inaugural edition Big Data Analytics: What is Big Data? Stony Brook University CSE545, Fall 2016 the inaugural edition What s the BIG deal?! 2011 2011 2008 2010 2012 What s the BIG deal?! (Gartner Hype Cycle) What s the

More information

Knowledge Discovery. URL - Spring 2018 CS - MIA 1/22

Knowledge Discovery. URL - Spring 2018 CS - MIA 1/22 Knowledge Discovery Javier Béjar cbea URL - Spring 2018 CS - MIA 1/22 Knowledge Discovery (KDD) Knowledge Discovery in Databases (KDD) Practical application of the methodologies from machine learning/statistics

More information

G(B)enchmark GraphBench: Towards a Universal Graph Benchmark. Khaled Ammar M. Tamer Özsu

G(B)enchmark GraphBench: Towards a Universal Graph Benchmark. Khaled Ammar M. Tamer Özsu G(B)enchmark GraphBench: Towards a Universal Graph Benchmark Khaled Ammar M. Tamer Özsu Bioinformatics Software Engineering Social Network Gene Co-expression Protein Structure Program Flow Big Graphs o

More information

Introduction to Data Management CSE 344

Introduction to Data Management CSE 344 Introduction to Data Management CSE 344 Lectures 23 and 24 Parallel Databases 1 Why compute in parallel? Most processors have multiple cores Can run multiple jobs simultaneously Natural extension of txn

More information

Flynax SEO Guide Flynax

Flynax SEO Guide Flynax Flynax SEO Guide Flynax 2018 1 Ì ÌFlynax SEO Guide Due to the fact that every project has its own purpose, audience and location preferences, it is almost impossible to make the script that will meet SEO

More information

Cloud Computing 2. CSCI 4850/5850 High-Performance Computing Spring 2018

Cloud Computing 2. CSCI 4850/5850 High-Performance Computing Spring 2018 Cloud Computing 2 CSCI 4850/5850 High-Performance Computing Spring 2018 Tae-Hyuk (Ted) Ahn Department of Computer Science Program of Bioinformatics and Computational Biology Saint Louis University Learning

More information

The Billion Object Platform (BOP): a system to lower barriers to support big, streaming, spatio-temporal data sources

The Billion Object Platform (BOP): a system to lower barriers to support big, streaming, spatio-temporal data sources FOSS4G 2017 Boston The Billion Object Platform (BOP): a system to lower barriers to support big, streaming, spatio-temporal data sources Devika Kakkar and Ben Lewis Harvard Center for Geographic Analysis

More information

Blurring the Line Between Developer and Data Scientist

Blurring the Line Between Developer and Data Scientist Blurring the Line Between Developer and Data Scientist Notebooks with PixieDust va barbosa va@us.ibm.com Developer Advocacy IBM Watson Data Platform WHY ARE YOU HERE? More companies making bet-the-business

More information

DATA COLLECTION. Slides by WESLEY WILLETT 13 FEB 2014

DATA COLLECTION. Slides by WESLEY WILLETT 13 FEB 2014 DATA COLLECTION Slides by WESLEY WILLETT INFO VISUAL 340 ANALYTICS D 13 FEB 2014 WHERE DOES DATA COME FROM? We tend to think of data as a thing in a database somewhere WHY DO YOU NEED DATA? (HINT: Usually,

More information

Introduction to Data Management CSE 344

Introduction to Data Management CSE 344 Introduction to Data Management CSE 344 Lecture 25: Parallel Databases CSE 344 - Winter 2013 1 Announcements Webquiz due tonight last WQ! J HW7 due on Wednesday HW8 will be posted soon Will take more hours

More information

SQLite vs. MongoDB for Big Data

SQLite vs. MongoDB for Big Data SQLite vs. MongoDB for Big Data In my latest tutorial I walked readers through a Python script designed to download tweets by a set of Twitter users and insert them into an SQLite database. In this post

More information

CSC 170 Introduction to Computers and Their Applications. Computers

CSC 170 Introduction to Computers and Their Applications. Computers CSC 170 Introduction to Computers and Their Applications Lecture #4 Digital Devices Computers At its core, a computer is a multipurpose device that accepts input, processes data, stores data, and produces

More information

Data, Data, Everywhere. We are now in the Big Data Era.

Data, Data, Everywhere. We are now in the Big Data Era. Data, Data, Everywhere. We are now in the Big Data Era. CONTENTS Background Big Data What is Generating our Big Data Physical Management of Big Data Optimisation in Data Processing Techniques for Handling

More information

CIS : Scalable Data Analysis

CIS : Scalable Data Analysis CIS 602-01: Scalable Data Analysis Cloud Workloads Dr. David Koop Scaling Up PC [Haeberlen and Ives, 2015] 2 Scaling Up PC Server [Haeberlen and Ives, 2015] 2 Scaling Up PC Server Cluster [Haeberlen and

More information

Case-based Recommendation. Peter Brusilovsky with slides of Danielle Lee

Case-based Recommendation. Peter Brusilovsky with slides of Danielle Lee Case-based Recommendation Peter Brusilovsky with slides of Danielle Lee Where we are? Search Navigation Recommendation Content-based Semantics / Metadata Social Modern E-Commerce Site The Power of Metadata

More information

Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2017)

Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2017) Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2017) Week 10: Mutable State (1/2) March 14, 2017 Jimmy Lin David R. Cheriton School of Computer Science University of Waterloo These

More information

Understanding the SAP HANA Difference. Amit Satoor, SAP Data Management

Understanding the SAP HANA Difference. Amit Satoor, SAP Data Management Understanding the SAP HANA Difference Amit Satoor, SAP Data Management Webinar Logistics Got Flash? http://get.adobe.com/flashplayer to download. The future holds many transformational opportunities Capitalize

More information

QMiner is a data analytics platform for processing large-scale real-time streams containing structured and unstructured data.

QMiner is a data analytics platform for processing large-scale real-time streams containing structured and unstructured data. Data analytics with QMiner This topic provides a practical insights on data analytics using QMiner. QMiner implements a comprehensive set of techniques for supervised, unsupervised and active learning

More information

Maximise your return in search. Mark Lilley

Maximise your return in search. Mark Lilley Maximise your return in search Mark Lilley 19.10.2017 Hello Mark Lilley Co- Founder & Director Groundswell groundswellgrowth.com Head of Ecommerce Chain Reaction Cycles 5 years Who are Groundswell? Ecommerce

More information

Popularity of Twitter Accounts: PageRank on a Social Network

Popularity of Twitter Accounts: PageRank on a Social Network Popularity of Twitter Accounts: PageRank on a Social Network A.D-A December 8, 2017 1 Problem Statement Twitter is a social networking service, where users can create and interact with 140 character messages,

More information

USERS CONFERENCE Copyright 2016 OSIsoft, LLC

USERS CONFERENCE Copyright 2016 OSIsoft, LLC Bridge IT and OT with a process data warehouse Presented by Matt Ziegler, OSIsoft Complexity Problem Complexity Drives the Need for Integrators Disparate assets or interacting one-by-one Monitoring Real-time

More information

OpenStreetMap. For the semi-uninitiated. Fraser Kirkpatrick Technical Project Manager 27 th September CGI Group Inc.

OpenStreetMap. For the semi-uninitiated. Fraser Kirkpatrick Technical Project Manager 27 th September CGI Group Inc. OpenStreetMap For the semi-uninitiated Fraser Kirkpatrick Technical Project Manager 27 th September 2018 1 Agenda Introduction to OpenStreetMap Summary of OSM data model Live Edit Data Maturity Consuming

More information

Ubiquitous Computing. Ambient Intelligence

Ubiquitous Computing. Ambient Intelligence Ubiquitous Computing Ambient Intelligence CS4031 Introduction to Digital Media 2016 Computing Evolution Ubiquitous Computing Mark Weiser, Xerox PARC 1988 Ubiquitous computing enhances computer use by making

More information

Next steps in single-pair ecosystem - consideration of extended reach. IEEE NEA Ad hoc

Next steps in single-pair ecosystem - consideration of extended reach. IEEE NEA Ad hoc Next steps in single-pair ecosystem - consideration of extended reach IEEE 802.3 NEA Ad hoc 1 Authors Harald Mueller, Endress+Hauser (Industrial Automation) David Brandt, Rockwell Automation (Industrial

More information

Website minute read. Understand the business implications, tactics, costs, and creation process of an effective website.

Website minute read. Understand the business implications, tactics, costs, and creation process of an effective website. Website 101 Understand the business implications, tactics, costs, and creation process of an effective website. 8 minute read Mediant Web Development What to Expect 1. Why a Good Website is Crucial 2.

More information

Using the Force of Python and SAS Viya on Star Wars Fan Posts

Using the Force of Python and SAS Viya on Star Wars Fan Posts SESUG Paper BB-170-2017 Using the Force of Python and SAS Viya on Star Wars Fan Posts Grace Heyne, Zencos Consulting, LLC ABSTRACT The wealth of information available on the Internet includes useful and

More information

King Fahd University of Petroleum & Minerals Computer Engineering g Dept

King Fahd University of Petroleum & Minerals Computer Engineering g Dept King Fahd University of Petroleum & Minerals Computer Engineering g Dept COE 540 Computer Networks Term 121 Dr. Ashraf S. Hasan Mahmoud Rm 22-420 Ext. 1724 Email: ashraf@kfupm.edu.sa 9/1/2012 Dr. Ashraf

More information

A REVIEW PAPER ON BIG DATA ANALYTICS

A REVIEW PAPER ON BIG DATA ANALYTICS A REVIEW PAPER ON BIG DATA ANALYTICS Kirti Bhatia 1, Lalit 2 1 HOD, Department of Computer Science, SKITM Bahadurgarh Haryana, India bhatia.kirti.it@gmail.com 2 M Tech 4th sem SKITM Bahadurgarh, Haryana,

More information

Introduction to Data Analytics. David Walling

Introduction to Data Analytics. David Walling Introduction to Data Analytics David Walling walling@tacc.utexas.edu Source: http://research.microsoft.com/en-us/collaboration/fourthparadigm/default.aspx Computational Simulation Model first, given initial

More information

The Future of High Performance Computing

The Future of High Performance Computing The Future of High Performance Computing Randal E. Bryant Carnegie Mellon University http://www.cs.cmu.edu/~bryant Comparing Two Large-Scale Systems Oakridge Titan Google Data Center 2 Monolithic supercomputer

More information

Purpose, features and functionality

Purpose, features and functionality Topic 6 Purpose, features and functionality In this topic you will look at the purpose, features, functionality and range of users that use information systems. You will learn the importance of being able

More information

/ Cloud Computing. Recitation 13 April 14 th 2015

/ Cloud Computing. Recitation 13 April 14 th 2015 15-319 / 15-619 Cloud Computing Recitation 13 April 14 th 2015 Overview Last week s reflection Project 4.1 Budget issues Tagging, 15619Project This week s schedule Unit 5 - Modules 18 Project 4.2 Demo

More information

CS 398 ACC Streaming. Prof. Robert J. Brunner. Ben Congdon Tyler Kim

CS 398 ACC Streaming. Prof. Robert J. Brunner. Ben Congdon Tyler Kim CS 398 ACC Streaming Prof. Robert J. Brunner Ben Congdon Tyler Kim MP3 How s it going? Final Autograder run: - Tonight ~9pm - Tomorrow ~3pm Due tomorrow at 11:59 pm. Latest Commit to the repo at the time

More information

Data Centers and Cloud Computing. Slides courtesy of Tim Wood

Data Centers and Cloud Computing. Slides courtesy of Tim Wood Data Centers and Cloud Computing Slides courtesy of Tim Wood 1 Data Centers Large server and storage farms 1000s of servers Many TBs or PBs of data Used by Enterprises for server applications Internet

More information

Data Mining Concepts & Tasks

Data Mining Concepts & Tasks Data Mining Concepts & Tasks Duen Horng (Polo) Chau Georgia Tech CSE6242 / CX4242 Jan 16, 2014 Partly based on materials by Professors Guy Lebanon, Jeffrey Heer, John Stasko, Christos Faloutsos Last Time

More information

27/04/2015 CC PROCESAMIENTO MASIVO DE DATOS OTOÑO Lecture 1: Introduction THE VALUE OF DATA. Aidan Hogan

27/04/2015 CC PROCESAMIENTO MASIVO DE DATOS OTOÑO Lecture 1: Introduction THE VALUE OF DATA. Aidan Hogan CC5212-1 PROCESAMIENTO MASIVO DE DATOS OTOÑO 2015 Lecture 1: Introduction Aidan Hogan aidhog@gmail.com THE VALUE OF DATA Soho, London, 1854 The mystery of cholera The Hunt for the invisible cholera Cholera:

More information

Introduction to Database Systems CSE 414

Introduction to Database Systems CSE 414 Introduction to Database Systems CSE 414 Lecture 24: Parallel Databases CSE 414 - Spring 2015 1 Announcements HW7 due Wednesday night, 11 pm Quiz 7 due next Friday(!), 11 pm HW8 will be posted middle of

More information

CEO Position starts January 2012

CEO Position starts January 2012 CEO Position starts January 2012 Peter Hirsch It is a Cell Phone (of course) It is a Video Conferencing Phone It is a Digital HD Camera (Photos and Videos) It is a MP3 Player (Music Player) It is a Digital

More information

The Fastest And Most Efficient Block Storage Software (SDS)

The Fastest And Most Efficient Block Storage Software (SDS) The Fastest And Most Efficient Block Storage Software (SDS) StorPool: Product Summary 1. Advanced Block-level Software Defined Storage, SDS (SDS 2.0) Fully distributed, scale-out, online changes of everything,

More information

Digital Marketing. Introduction of Marketing. Introductions

Digital Marketing. Introduction of Marketing. Introductions Digital Marketing Introduction of Marketing Origin of Marketing Why Marketing is important? What is Marketing? Understanding Marketing Processes Pillars of marketing Marketing is Communication Mass Communication

More information

SMARTATL. A Smart City Overview and Roadmap. Evanta CIO Executive Summit 1

SMARTATL. A Smart City Overview and Roadmap. Evanta CIO Executive Summit 1 SMARTATL A Smart City Overview and Roadmap Evanta CIO Executive Summit 1 Southeast USA Overview Evanta CIO Executive Summit 2 Metro Atlanta Overview Evanta CIO Executive Summit 3 Permits, New Units under

More information

Data Centers and Cloud Computing. Data Centers

Data Centers and Cloud Computing. Data Centers Data Centers and Cloud Computing Slides courtesy of Tim Wood 1 Data Centers Large server and storage farms 1000s of servers Many TBs or PBs of data Used by Enterprises for server applications Internet

More information

Hey Guys, My name is Piyush Mathur. By Profession I am a Digital marketing consultant.

Hey Guys, My name is Piyush Mathur. By Profession I am a Digital marketing consultant. BY PIYUSH MATHUR Hey Guys, My name is Piyush Mathur. By Profession I am a Digital marketing consultant. I work with many startups and large companies to develop intelligent and effective online strategies.

More information

NVIDIA DEEP LEARNING INSTITUTE

NVIDIA DEEP LEARNING INSTITUTE NVIDIA DEEP LEARNING INSTITUTE TRAINING CATALOG Valid Through July 31, 2018 INTRODUCTION The NVIDIA Deep Learning Institute (DLI) trains developers, data scientists, and researchers on how to use artificial

More information

IoT Impact On Storage Architecture

IoT Impact On Storage Architecture IoT Impact On Storage Architecture SDC India Girish Kumar B K NetApp 24 th May 2018 1 IoT - Agenda 1) Introduction 2) Data growth and compute model 3) Industrial needs and IoT architecture 4) Data flow

More information

/ Cloud Computing. Recitation 7 October 10, 2017

/ Cloud Computing. Recitation 7 October 10, 2017 15-319 / 15-619 Cloud Computing Recitation 7 October 10, 2017 Overview Last week s reflection Project 3.1 OLI Unit 3 - Module 10, 11, 12 Quiz 5 This week s schedule OLI Unit 3 - Module 13 Quiz 6 Project

More information

Delivering HCI with VMware vsan and Cisco UCS

Delivering HCI with VMware vsan and Cisco UCS BRKPAR-2447 Delivering HCI with VMware vsan and Cisco UCS Bhumik Patel Director, Technical Alliances, VMware bhumikp@vmware.com 2 Blistering Pace of vsan Adoption Fastest since ESX 10,000 Customers $300M

More information

Clicking on Analytics will bring you to the Overview page.

Clicking on Analytics will bring you to the Overview page. YouTube Analytics - formerly known as Insight - is an extremely powerful set of tools that can provide you with a lot of information about your videos, your audience, and your customers. Clicking on Analytics

More information

The Computation and Data Needs of Canadian Astronomy

The Computation and Data Needs of Canadian Astronomy Summary The Computation and Data Needs of Canadian Astronomy The Computation and Data Committee In this white paper, we review the role of computing in astronomy and astrophysics and present the Computation

More information

seeing through complexity CISCO SPARK

seeing through complexity CISCO SPARK seeing through complexity CISCO SPARK 1 CISCO SPARK What is Cisco Spark? Cisco Spark is a cloud based service that provides teams with powerful collaboration and communication tools via a cross platform

More information

Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2016)

Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2016) Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2016) Week 10: Mutable State (1/2) March 15, 2016 Jimmy Lin David R. Cheriton School of Computer Science University of Waterloo These

More information

Mathpak. A platform for collaborative analytic apps. Bay Area R Users Group 10/02/2013

Mathpak. A platform for collaborative analytic apps. Bay Area R Users Group 10/02/2013 Mathpak A platform for collaborative analytic apps Bay Area R Users Group 10/02/2013 Overview 2/14 Mathpak A cloud based platform for developers to build and deploy analytic apps Developers upload components

More information

Science 2.0 VU Big Science, e-science and E- Infrastructures + Bibliometric Network Analysis

Science 2.0 VU Big Science, e-science and E- Infrastructures + Bibliometric Network Analysis W I S S E N n T E C H N I K n L E I D E N S C H A F T Science 2.0 VU Big Science, e-science and E- Infrastructures + Bibliometric Network Analysis Elisabeth Lex KTI, TU Graz WS 2015/16 u www.tugraz.at

More information

Effective Information Management and Governance: Building the Business Case for Taxonomy

Effective Information Management and Governance: Building the Business Case for Taxonomy Effective Information Management and Governance: Building the Business Case for Taxonomy May 16, 2017 A WAND, Inc. White Paper By Mark Leher, COO, WAND, Inc. Executive Summary Companies are often reluctant

More information

Pre-Requisites: CS2510. NU Core Designations: AD

Pre-Requisites: CS2510. NU Core Designations: AD DS4100: Data Collection, Integration and Analysis Teaches how to collect data from multiple sources and integrate them into consistent data sets. Explains how to use semi-automated and automated classification

More information

Big Data com Hadoop. VIII Sessão - SQL Bahia. Impala, Hive e Spark. Diógenes Pires 03/03/2018

Big Data com Hadoop. VIII Sessão - SQL Bahia. Impala, Hive e Spark. Diógenes Pires 03/03/2018 Big Data com Hadoop Impala, Hive e Spark VIII Sessão - SQL Bahia 03/03/2018 Diógenes Pires Connect with PASS Sign up for a free membership today at: pass.org #sqlpass Internet Live http://www.internetlivestats.com/

More information

Science Cookbook. Practical Data. open source community experience distilled. Benjamin Bengfort. science projects in R and Python.

Science Cookbook. Practical Data. open source community experience distilled. Benjamin Bengfort. science projects in R and Python. Practical Data Science Cookbook 89 hands-on recipes to help you complete real-world data science projects in R and Python Tony Ojeda Sean Patrick Murphy Benjamin Bengfort Abhijit Dasgupta PUBLISHING open

More information

OECD PUBLISHING SERVICES JULY 2014 RELEASE

OECD PUBLISHING SERVICES JULY 2014 RELEASE OECD PUBLISHING SERVICES JULY 2014 RELEASE OECD ilibrary news what has happened since 2010? The OECD ilibrary has grown a lot since its launch in 2010. In 2011, it reached 3.5 million downloads. This number

More information

Data Analyst Nanodegree Syllabus

Data Analyst Nanodegree Syllabus Data Analyst Nanodegree Syllabus Discover Insights from Data with Python, R, SQL, and Tableau Before You Start Prerequisites : In order to succeed in this program, we recommend having experience working

More information

Data Mining Concepts & Tasks

Duen Horng (Polo) Chau Georgia Tech CSE6242 / CX4242 Sept 9, 2014 Partly based on materials by Professors Guy Lebanon, Jeffrey Heer, John Stasko, Christos Faloutsos Last Time

kjhf MIS 510 sharethisdeal - Final Project Report Team Members:- Damini Akash Krittika Karan Khurana Agrawal Patil Dhingra

5/14/2014 Introduction Shoppers often come across offers that require them to

Surveillance Dell EMC Storage with IndigoVision Control Center

Sizing Guide H14832 REV 1.1 Copyright 2016-2017 Dell Inc. or its subsidiaries. All rights reserved. Published May 2016 Dell believes the information

https://www.youtube.com/watch?v=-gj93l2qa6c Topics: Foundation of Data Analytics and Data Mining Data Volume, Velocity, & Variety Harnessing Big Data Enabling technologies: Cloud Computing 2 No single


Acknowledgements. Beyond DBMSs. Presentation Outline

Acknowledgements Beyond RDBMSs These slides are put together from a variety of sources (both papers and slides/tutorials available on the web) Sharma Chakravarthy Information Technology Laboratory Computer

Apache Hadoop 3. Balazs Gaspar Sales Engineer CEE & CIS Cloudera, Inc. All rights reserved.

balazs@cloudera.com We believe data can make what is impossible today, possible tomorrow. We empower people to transform complex data into clear

Modern Data Warehouse The New Approach to Azure BI

History On-Premise SQL Server Big Data Solutions Technical Barriers Modern Analytics Platform On-Premise SQL Server Big Data Solutions Modern Analytics

NVIDIA DLI HANDS-ON TRAINING COURSE CATALOG

Valid Through July 31, 2018 INTRODUCTION The NVIDIA Deep Learning Institute (DLI) trains developers, data scientists, and researchers on how to use artificial

PROFESSIONAL DUAL CAR CAM WITH GPS LOGGER

SKU: DualCCPro THANK YOU FOR PURCHASING THE DUAL CC PRO Please read this manual before operating the Dual CC Pro and keep it handy. The Dual Car Cam Pro is just

Exploring the Structure of Data at Scale. Rudy Agovic, PhD CEO & Chief Data Scientist at Reliancy January 16, 2019

Outline Why exploration of large datasets matters Challenges in working with large data

15-319 / 15-619 Cloud Computing. Recitation 8 October 18, 2016

Overview Administrative issues Office Hours, Piazza guidelines Last week's reflection Project 3.2, OLI Unit 3, Module 13, Quiz 6 This week

DATA ANALYTICS BOOT CAMP

The UofT SCS DATA ANALYTICS BOOT CAMP Curriculum Overview Over the past decade, the explosion of data has transformed nearly every industry known to man. Whether it's marketing, healthcare, government,

Integrating MATLAB Analytics into Business-Critical Applications Marta Wilczkowiak Senior Applications Engineer MathWorks

2015 The MathWorks, Inc. Problem statement Democratization: Is it possible to

DIGITAL SIGNAGE SOFTWARE

MODERN. POWERFUL. EFFECTIVE. The most complete solution for visual communication JADE GIVES LIFE TO YOUR DIGITAL SCREENS Jade is the modern and reliable digital signage content

03-60-569: Semantic Web (2013 Fall)

University of Windsor, September 4, 2013. Definition of the Web: The World Wide Web is a system of interlinked hypertext documents accessed via the Internet

<Insert Picture Here> Introduction to Big Data Technology

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into

3 Data, Data Mining. Chengkai Li

CSE4334/5334 Data Mining Chengkai Li Department of Computer Science and Engineering University of Texas at Arlington Fall 2018 (Slides partly courtesy of Pang-Ning Tan, Michael Steinbach

Web & Automotive. Paris, April 2012. Dave Raggett

Aims: To discuss potential for Web Apps in cars. Identify what kinds of Web standards are needed. Discuss plans for W3C Web & Automotive Workshop

Get the most value from your surveys with text analysis

SPSS Text Analysis for Surveys 3.0 Specifications. The words people use to answer a question tell you a lot about what they think and feel. That's

THE ATLAS DISTRIBUTED DATA MANAGEMENT SYSTEM & DATABASES

Vincent Garonne, Mario Lassnig, Martin Barisits, Thomas Beermann, Ralph Vigne, Cedric Serfon Vincent.Garonne@cern.ch ph-adp-ddm-lab@cern.ch XLDB

ResponseTek Listening Platform Release Notes Q4 16

Nov 23rd, 2016. Table of Contents: Release Highlights; Predictive Analytics Now Available; Text Analytics Now Supports Phrase-based Analysis

Asset and network modeling in HP ArcSight ESM and Express

Till Jäger, CISSP, CEH EMEA ArcSight Architect, HP ESP Agenda Overview Walkthrough of asset modeling in ArcSight ESM More inside info about the

Selling Virtual Tickets. A Case Study & Guide for Private Streaming

Considerations for Event Planning: One of the main considerations for selling virtual tickets is how you plan to deliver your virtual content.

INTRODUCTION TO DATA MINING

Chiara Renso KDDLab - ISTI CNR, Italy http://www-kdd.isti.cnr.it email: chiara.renso@isti.cnr.it Knowledge Discovery and Data Mining Laboratory, ISTI National Research Council,