CIS : Scalable Data Analysis
- Timothy Baker
- 5 years ago
Transcription
1 CIS : Scalable Data Analysis Visualization Dr. David Koop
2 Growth of Data
3 Usefulness of Data
4 Analyzed Data
5 Example Data Sources
- Radio Telescopes
- Twitter
- Wind Turbine Sensors
- Surveillance Cameras
- Cars & Airplanes
- Dog Collars
- Dishwashers
- Traffic Lights
- MRI Scanners
- NFL Football Players
- Farming
Image credits: [Zebra MotionWorks] [CC-SA 2.0, Stephan Trebs]
6 Large Synoptic Survey Telescope (LSST)
- Image every 15 seconds
- 100PB over 10 years
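To put the LSST figure in perspective, the total can be turned into an average daily volume. The sketch below is only back-of-the-envelope arithmetic; it assumes decimal units (1 PB = 1000 TB) and 365-day years, neither of which is stated on the slide, and the function name is made up for illustration.

```python
# Back-of-the-envelope average data rate for the figures on the slide.
# Assumptions (not from the slide): 1 PB = 1000 TB, 1 year = 365 days.
TOTAL_PB = 100
YEARS = 10


def average_tb_per_day(total_pb: float, years: float) -> float:
    """Average volume in terabytes per day over the whole period."""
    return total_pb * 1000 / (years * 365)


rate = average_tb_per_day(TOTAL_PB, YEARS)
print(f"{rate:.1f} TB/day on average")
```

The average hides the fact that data only arrives while the telescope is observing, so the rate during operation would be higher.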
7 Large Numerical Simulations
- Millennium simulation: dark matter, 30TB raw data [V. Springel et al., 2005]
8 More Data Sources
- Awesome Public Datasets
- Kaggle Datasets
- Government Data: data.gov
- Customer Data: (see
- Internal Business Data
9 Dimensions of Data [Schomm et al., 2013]
Classification of dimensions: objective, subjective
Dimension | Categories | Question to be answered
Type | Web Crawler, Customizable Crawler, Search Engine, Pure Data Vendor, Complex Data Vendor, Matching Vendor, Enrichment Tagging, Enrichment Sentiment, Enrichment Analysis, Data Market Place | What is the type of the core offering?
Time Frame | Static/Factual, Up To Date | Is the data static or real-time?
Domain | All, Finance/Economy, Bio Medicine, Social Media, Geo Data, Address Data | What is the data about?
Data Origin | Internet, Self-Generated, User, Community, Government, Authority | Where does the data come from? Who is the author?
Pricing Model | Free, Freemium, Pay-Per-Use, Flat Rate | Is the offer free, pay-per-use or usable with a flat rate?
Data Access | API, Download, Specialized Software, Web Interface | What technical means are offered to access the data?
Data Output | XML, CSV/XLS, JSON, RDF, Report | In what way is the data formatted for the user?
Language | English, German, More | What is the language of the website? Does it differ from the language of the data?
Target Audience | Business, Customer | Towards whom is the product geared?
Trustworthiness | Low, Medium, High | How trustworthy is the vendor? Can the original data source be tracked or verified?
Size of Vendor | Startup, Medium, Big, Global Player | How big is the vendor?
Maturity | Research Project, Beta, Medium, High | Is the product still in beta or already established?
10 Big Data or Small Data?
- Many companies feel the need to overclaim the amount of data
- "when you take a normal tech company and sprinkle on data, you get the next Google" [C. O'Neil]
- Many large datasets are not useful
- Twitter processes 8TB, but the tweets only take about 30GB
- Wikipedia can be downloaded onto a USB drive
- All MP3s can be stored on a moderately sized disk array
- Can learn a lot from a "small" dataset, e.g. sensors from a single turbine, grocery store, Apple Watch
- Small data focused on end-user, more timely insights?
11 Jobs on a Large Analytics Cluster [R. Appuswamy et al., 2013]
12 Reading Quiz
13 Assignment 1
cis fa/assignment1.html
Boston Property Assessments
- Initial exploratory analysis
- Use a Python Notebook
- May use pandas
- Label subproblems and answers
- Show work (even if it's not your final answer)
[Google Maps]
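As a rough idea of what an initial exploratory pass can look like, the sketch below groups records and reports simple statistics while counting missing values. It uses only the standard library so that it runs anywhere; the sample rows and the column names `neighborhood` and `assessed_value` are made up for illustration and are not the schema of the assignment's dataset.

```python
import csv
import io
import statistics
from collections import defaultdict

# Made-up sample rows; the column names are hypothetical, not the real schema.
SAMPLE_CSV = """neighborhood,assessed_value
A,300000
A,500000
B,200000
B,
B,400000
"""


def summarize(csv_text):
    """Group rows by neighborhood; return (per-group statistics, missing-value count)."""
    groups = defaultdict(list)
    missing = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        raw = row["assessed_value"].strip()
        if not raw:
            missing += 1  # count missing values instead of silently dropping them
            continue
        groups[row["neighborhood"]].append(float(raw))
    summary = {
        name: {
            "count": len(values),
            "mean": statistics.mean(values),
            "median": statistics.median(values),
        }
        for name, values in groups.items()
    }
    return summary, missing


summary, missing = summarize(SAMPLE_CSV)
for name, stats in sorted(summary.items()):
    print(name, stats)
print("rows with a missing value:", missing)
```

With pandas, the same first look is usually a combination of `DataFrame.describe()` and `DataFrame.groupby(...)`.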
14 Big Data Visualization (Slides from Dr. Nan Cao via Dr. Ching-Yung Lin)
15 Big Data Visualization
- What is Visualization and Why Visualization?
- Big Data Visualization Challenges and Techniques
- Visualizing Big Data
- Visual Analytics and Big Data
16 Whisper: Tracing Information Diffusion in Real Time
17 Customizing Computational Methods for Visual Analytics with Big Data J. Choo and H. Park
18 Complexities of Visual Analytics of Big Data
Human perception and large numbers of items:
- locating items
- tracking items
Limited screen space:
- clutter
- overlapping items
19 Use Computational Methods
Methods:
- Dimensionality reduction
- Clustering
- Machine learning & data mining
Issues with using these methods:
- What's going on?
- Waiting time
Goal:
- Interactive
- Faster
20 Exploiting Discrepancies
Precision: use knowledge of screen resolution to set precision
Convergence: don't worry about minor changes that may be imperceptible
- Human perception
- Screen resolution constraints
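One way to read the precision and convergence points above is as a stopping rule: derive a tolerance from the screen resolution and stop iterating once no value moves by more than that tolerance. The sketch below is a minimal illustration under that assumption; the function names and the choice of one full pixel as the threshold are made up here and are not taken from the slides.

```python
def data_units_per_pixel(data_min: float, data_max: float, pixels: int) -> float:
    """Smallest data difference that a single pixel along one axis can show."""
    return (data_max - data_min) / pixels


def visually_converged(previous, current, tolerance: float) -> bool:
    """True when no coordinate moved by more than `tolerance` (a sub-pixel change)."""
    return all(abs(c - p) <= tolerance for p, c in zip(previous, current))


# An axis covering 0..100 data units drawn across 1000 pixels.
tol = data_units_per_pixel(0.0, 100.0, 1000)

sub_pixel_move = visually_converged([10.00, 52.30], [10.04, 52.31], tol)
visible_move = visually_converged([10.00, 52.30], [10.50, 52.31], tol)
print(sub_pixel_move, visible_move)
```

A stricter rule could use half a pixel, since a value near a pixel boundary can change columns after a smaller move.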
21 Changes in Cluster Membership in k-means
[Figure: number of per-iteration changes and accuracy against the final solution (%), plotted against the number of iterations]
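The two quantities named on this slide can be measured on a toy problem. The sketch below runs a plain Lloyd-style k-means on synthetic 1-D points and records, for every iteration, how many points changed cluster and what share of labels already agree with the final solution. The data, the seeds, and the function names are assumptions made for illustration; it does not reproduce the experiment behind the slide.

```python
import random


def kmeans_trace(points, k, seed=0, max_iter=100):
    """Run Lloyd's k-means on 1-D points; return the label list after every iteration."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers are k of the data points
    history = []
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda j: abs(p - centers[j])) for p in points]
        history.append(new_labels)
        if new_labels == labels:  # no membership change, so the run has converged
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return history


def per_iteration_stats(history):
    """Return (membership changes, accuracy in % against the final labels) per iteration."""
    final = history[-1]
    n = len(final)
    stats = []
    previous = None
    for labels in history:
        # The first assignment counts every point as changed.
        changes = n if previous is None else sum(a != b for a, b in zip(previous, labels))
        accuracy = 100.0 * sum(a == b for a, b in zip(labels, final)) / n
        stats.append((changes, accuracy))
        previous = labels
    return stats


data_rng = random.Random(1)
points = [data_rng.gauss(0, 1) for _ in range(50)] + [data_rng.gauss(5, 1) for _ in range(50)]
history = kmeans_trace(points, k=2)
stats = per_iteration_stats(history)
for iteration, (changes, accuracy) in enumerate(stats, start=1):
    print(iteration, changes, f"{accuracy:.1f}%")
```

If most labels match the final solution after the first few iterations, the later iterations change little of what a viewer would see.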
22 Customizing Computations
- Use lower precision computation
- Use interactive visualization that shows iterations
- Refine results iteratively
- Data scale confinement
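To illustrate the lower-precision point, the sketch below rounds 64-bit values to 32-bit floats and checks whether they still land on the same pixel column of a 1920-pixel-wide axis. The values, the axis width, and the helper names are made up for illustration; this is one possible reading of "lower precision", not the method from the slides.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def to_pixel(x: float, data_min: float, data_max: float, pixels: int) -> int:
    """Map a data value to a pixel column in [0, pixels - 1]."""
    fraction = (x - data_min) / (data_max - data_min)
    return min(pixels - 1, int(fraction * pixels))


WIDTH = 1920  # hypothetical horizontal resolution
values = [i * 0.37 + 0.123456789 for i in range(1000)]
lo, hi = min(values), max(values)

full = [to_pixel(v, lo, hi, WIDTH) for v in values]
reduced = [to_pixel(to_float32(v), lo, hi, WIDTH) for v in values]

same = sum(a == b for a, b in zip(full, reduced))
max_shift = max(abs(a - b) for a, b in zip(full, reduced))
print(f"{same} of {len(values)} values keep their pixel column; largest shift: {max_shift} px")
```

Values that sit almost exactly on a pixel boundary can flip to the neighboring column, so the check reports a count rather than assuming a perfect match.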
23 Iteration-level Interactive Visualization
[Figure, panels (a) and (b): a computational module (Subroutine 1 ... Subroutine k, Iterate) produces an Output that feeds Visualization/summarization, with Interaction]
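A common way to expose intermediate results is to write the iterative computation as a generator, so that the consumer can redraw after every iteration and stop early. The sketch below assumes that reading of iteration-level interaction; Newton's method for a square root stands in for an arbitrary iterative subroutine, and all names are hypothetical.

```python
from typing import Callable, Iterator, List, Tuple


def iterate(step: Callable[[float], float], x0: float, max_iter: int = 1000) -> Iterator[Tuple[int, float]]:
    """Yield (iteration, current result) after every iteration, not only at the end."""
    x = x0
    for i in range(1, max_iter + 1):
        x = step(x)
        yield i, x


def newton_sqrt_step(target: float) -> Callable[[float], float]:
    """One Newton iteration for sqrt(target); a stand-in for any iterative subroutine."""
    return lambda x: 0.5 * (x + target / x)


def run_until_stopped(results, should_stop) -> List[Tuple[int, float]]:
    """Consume intermediate results; `should_stop` plays the role of the user interaction."""
    shown = []
    for i, x in results:
        shown.append((i, x))  # a real front end would redraw the view here
        if should_stop(i, x):
            break  # no further iterations are computed after the stop
    return shown


frames = run_until_stopped(
    iterate(newton_sqrt_step(2.0), x0=1.0),
    should_stop=lambda i, x: abs(x * x - 2.0) < 1e-3,
)
for i, x in frames:
    print(i, x)
```

Because the generator is lazy, stopping the loop also stops the computation, which is the property that makes per-iteration interaction cheap.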
24 Next: Progressive Visualization
Read: How Progressive Visualizations Affect Exploratory Analysis
Write:
- Critique of Paper
- < 1 paragraph summary, 2 paragraphs critique
- Which ideas in the paper are interesting and why?
- Which ideas do you have related to the paper?
- Which ideas seem problematic? Can you suggest alternatives?
- Turn in via mycourses
- Due Tuesday before class