Scalable Data Analysis (CIS )

1 Scalable Data Analysis (CIS ) Introduction Dr. David Koop

2 NYC Taxi Data [Analyzing 1.1 Billion NYC Taxi and Uber Trips, with a Vengeance, T. W. Schneider]

3 What are your questions about this data?

4 NYC Taxi Data [Analyzing 1.1 Billion NYC Taxi and Uber Trips, with a Vengeance, T. W. Schneider]

5 NYC Taxi Data: Day analysis [Ferreira et al., 2013]

6 NYC Taxi Data: Region analysis [Ferreira et al., 2013]
Fig. 10. Comparing movement across NYC transportation hubs. On the top, we examine trips starting at the two major airports in NYC: JFK and La Guardia. In the bottom, we refine the query to compare trips starting at the airports with trips starting at the train stations, Penn Station and Grand Central.
D. Koop, CIS , Fall 2017

7 Marine Traffic Data [Leaflet map; map data OpenStreetMap contributors]

8 Marine Traffic Data [Leaflet map; map data OpenStreetMap contributors]

9 Baseball Data [Deitrich et al., 2014]

10 Baseball Data [Deitrich et al., 2014]

11 Baseball Data [Deitrich et al., 2014]

12 Mobile Data Growth [Cisco Visual Networking Index Mobile, 2017]

13 Mobile Video Keeps Growing
Note: Figures in parentheses refer to 2016 and 2021 traffic share. [Cisco Visual Networking Index Mobile, 2017]

14 Data Science Venn Diagram [D. Conway, The Data Science Venn Diagram, 2013]

15 Questions are important!
- Having data is great, but most of the time it just sits waiting for someone to analyze it
- The reason data analysis is not completely automated is that there are so many potential questions
- Humans need to stay involved in the loop
- Interaction and visualization can be important, especially early in data analysis

16 Scalability
- Big Data
  - What is big? For whom is it big?
  - variety, velocity, volume
- Lots of data that was once considered big is no longer an issue
- Understanding the scalability of techniques is important
- There will always be larger datasets, so we want to understand
  - how methods scale
  - performance bounds
  - storage constraints

17 Real-time Analysis
- Want to have results now
- How?
  - Faster machines
  - Clusters
  - Progressive techniques
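As a rough illustration of what a progressive technique can look like, the sketch below reports a running estimate of a mean after every chunk of input instead of waiting for the full pass to finish. The function name `progressive_mean` and the `chunk_size` parameter are hypothetical, chosen for this example; the slides do not prescribe a particular method.

```python
from typing import Iterable, Iterator


def progressive_mean(values: Iterable[float], chunk_size: int = 1000) -> Iterator[float]:
    """Yield the mean of everything seen so far after each chunk of values."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    total = 0.0
    count = 0
    for value in values:
        total += value
        count += 1
        # Emit an intermediate estimate each time a chunk is complete.
        if count % chunk_size == 0:
            yield total / count
    # Emit the final, exact mean if the last chunk was only partially filled.
    if count % chunk_size != 0:
        yield total / count


# Small illustrative input: the estimates refine as more data arrives.
estimates = list(progressive_mean(range(1, 11), chunk_size=4))
print(estimates)
```

Each intermediate value is an answer the analyst can act on immediately; the last value equals the exact mean of the whole input.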

18 About Me
- Research Interests
  - Visualization
  - Computational Provenance
  - Geospatial Analysis
- Research Projects
  - VisTrails:
  - Dataflow Notebooks
  - Meta-versioning
  - Marine Traffic Data
- See my web page for more information

19 About You
- Previous topics course (CIS 602)?
- Research Papers?
- Data Science?
- Python?
- Database Experience?
- Analytics Experience?
- Cloud Computing Experience?
- Anything you want to see covered?

20 About this course
- Course web page is authoritative:
  - Schedule, Readings, Assignments will be posted online
  - Check the web site before emailing me
- Topics course
  - A current research area the professor works in
  - A chance to be on the cutting edge of research
- Requires student participation
  - Reading responses
  - Project presentations

21 About this course
- Balance of techniques and research ideas
- Some background (Python) followed by topic areas and readings
- Assignments at the beginning of course, project at end
- Two tests
- Topic areas:
  - Exploratory Data Analysis and Visualization
  - Data Acquisition
  - Data Storage and Access
  - Cloud Computing and Scalable Computation
  - Applications and specific data considerations

22 Project
- Do scalable data analysis of a large dataset
  - Questions
  - Analysis
  - Visualizations
  - Cloud/Cluster Computing
- Another option: research-related topic
- Waypoints:
  - Proposal
  - Progress Report
  - Final Presentation

23 About this course
- Course Registration:
  - Make sure you have registered in COIN for the course
  - Email me if you are not registered but are interested in taking the course
- Review of course policies:
  - Plagiarism and academic honesty
  - If you have any concerns or questions, please email me as soon as possible
- If you are not sure if this course is a good fit, please email me or talk to me

24 Data
- What is this data?
- Semantics: real-world meaning of the data
- Type: structural or mathematical interpretation
- Both often require metadata
  - Sometimes we can infer some of this information
  - Line between data and metadata isn't always clear

25 Data

26 Data Types
- Items
  - An item is an individual discrete entity
  - e.g. row in a table, node in a network
- Attributes
  - An attribute is some specific property that can be measured, observed, or logged
  - a.k.a. variable, (data) dimension
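The item/attribute distinction can be sketched in plain Python by treating a table as a list of rows. The field names and values below are invented for illustration only; they are not drawn from any dataset in these slides.

```python
# Each dict is one item (a row); each key is one attribute (a column).
# All values here are made-up illustrative data.
trips = [
    {"trip_id": 1, "passengers": 2, "fare": 12.5},
    {"trip_id": 2, "passengers": 1, "fare": 7.0},
    {"trip_id": 3, "passengers": 4, "fare": 30.25},
]

# The attributes are the properties recorded for every item.
attributes = sorted(trips[0].keys())

# Reading one attribute across all items gives a column of values.
fares = [item["fare"] for item in trips]

print(len(trips), "items with attributes", attributes)
print("fare column:", fares)
```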

27 Items & Attributes [figure: table with one item (row) and one attribute (field) labeled]

28 Data Types
- Nodes
  - Synonym for item but in the context of networks (graphs)
- Links
  - A link is a relation between two items
  - e.g. social network friends, computer network links
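One common way to hold nodes and links in code is an adjacency structure built from a list of links. The sketch below assumes undirected links and uses invented names; `build_adjacency` is a hypothetical helper written for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_adjacency(links: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Map each node to the set of nodes it shares a link with (undirected)."""
    adjacency: Dict[str, Set[str]] = defaultdict(set)
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


# Made-up friendship links: each pair is one link between two nodes (items).
links = [("ana", "ben"), ("ben", "cy"), ("ana", "cy"), ("cy", "dee")]

adjacency = build_adjacency(links)
# A node's degree is the number of links that touch it.
degree = {node: len(neighbors) for node, neighbors in adjacency.items()}
print(degree)
```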

29 Items & Links [figure: network with an item and links labeled] [Bostock, 2011]

30 Dataset Types
- Tables: Items (rows), Attributes (columns), Cell containing value
- Multidimensional Table: Value in cell
- Networks: Node (item), Link
- Trees
- Fields (Continuous): Grid of positions, Cell, Attributes (columns), Value in cell
- Geometry (Spatial): Position
[Munzner (ill. Maguire), 2014]

31 Attribute Types
- Categorical
- Ordered
  - Ordinal
  - Quantitative
- Ordering Direction
  - Sequential
  - Diverging
  - Cyclic
[Munzner (ill. Maguire), 2014]
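The attribute type determines which operations are meaningful. The sketch below, on invented data, counts a categorical attribute, ranks by an ordinal attribute using an explicitly declared order, and averages a quantitative attribute. `SIZE_ORDER` is an assumption of this example: the ordering of an ordinal attribute usually has to be supplied as metadata.

```python
from collections import Counter
from statistics import mean

# Made-up items with one attribute of each type.
shirts = [
    {"color": "red", "size": "M", "price": 20.0},
    {"color": "blue", "size": "S", "price": 15.0},
    {"color": "red", "size": "L", "price": 25.0},
]

# Categorical: values can only be compared for equality, so we count them.
color_counts = Counter(shirt["color"] for shirt in shirts)

# Ordinal: values have an order, but it must be declared (alphabetical is wrong here).
SIZE_ORDER = ["S", "M", "L"]
largest = max(shirts, key=lambda shirt: SIZE_ORDER.index(shirt["size"]))

# Quantitative: arithmetic such as averaging is meaningful.
average_price = mean(shirt["price"] for shirt in shirts)

print(color_counts, largest["size"], average_price)
```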

32 Categorical, Ordinal, and Quantitative [figure: table columns coded 1 = Quantitative, 2 = Nominal, 3 = Ordinal; annotated as quantitative, ordinal, and categorical]

33 Categorical, Ordinal, and Quantitative [figure: table columns coded 1 = Quantitative, 2 = Nominal, 3 = Ordinal; annotated as quantitative, ordinal, and categorical]

34 Semantics
- The type of data does not tell us what the data means or how it should be interpreted
- Tables have keys/values, fields have independent/dependent vars
[figure: Flat Tables, Multidimensional Tables, Fields] [Munzner (ill. Maguire), 2014]

35 Analysis
- Why?
  - Actions
    - Analyze: Consume (Discover, Present, Enjoy); Produce (Annotate, Record, Derive)
    - Search: Lookup, Locate, Browse, Explore (by target known/unknown and location known/unknown)
    - Query: Identify, Compare, Summarize
  - Targets
    - All Data: Trends, Outliers, Features
    - Attributes: One (Distribution, Extremes); Many (Dependency, Correlation, Similarity)
    - Network Data: Topology, Paths
    - Spatial Data: Shape
- What? Why? How?
[Munzner (ill. Maguire), 2014]

36 Analysis: Consume & Produce
- Consume
  - Exploration
  - Explanation
  - Enjoyment
- Produce
  - Annotation
  - Record
  - Derivation
- Leads to new directions/ideas
[figure: Analyze: Consume (Discover, Present, Enjoy); Produce (Annotate, Record, Derive)] [Munzner (ill. Maguire), 2014]

37 Analysis: Search and Query
- Search based on what a user knows
  - Target
  - Location
- Search actions:
                     Target known    Target unknown
  Location known     Lookup          Browse
  Location unknown   Locate          Explore
- Query depends on what data matters
  - One: Identify
  - Some (Often Two): Compare
  - All: Summarize
[Munzner (ill. Maguire), 2014]
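The four search actions form a two-by-two lookup on whether the target and the location are known. A minimal sketch of that classification follows; the function name `search_action` is hypothetical.

```python
def search_action(target_known: bool, location_known: bool) -> str:
    """Return the search action for a given state of user knowledge."""
    actions = {
        (True, True): "lookup",     # know what to find and where it is
        (True, False): "locate",    # know what to find, not where it is
        (False, True): "browse",    # know where to look, not what is there
        (False, False): "explore",  # know neither
    }
    return actions[(target_known, location_known)]


print(search_action(target_known=True, location_known=False))
```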

38 Targets
- All Data: Trends, Outliers, Features
- Attributes
  - One: Distribution, Extremes
  - Many: Dependency, Correlation, Similarity
- Network Data: Topology, Paths
- Spatial Data: Shape
[Munzner (ill. Maguire), 2014]

39 More Reading
- Listed on course schedule:
  - Challenges and Opportunities with Big Data, D. Agrawal et al.
  - Toward Scalable Systems for Big Data Analysis: A Technology Tutorial, H. Hu et al.
  - Big Data computing and clouds: Trends and future directions, M. D. Assuncao et al.

40 Next Class
- Introduction to/review of Python
- Download Anaconda distribution:
- I am planning to use Python 3 (3.6)


Data Preprocessing. Why Data Preprocessing? MIT-652 Data Mining Applications. Chapter 3: Data Preprocessing. Multi-Dimensional Measure of Data Quality Why Data Preprocessing? Data in the real world is dirty incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate data e.g., occupation = noisy: containing

More information

GeoTemporal Reasoning for the Social Semantic Web

GeoTemporal Reasoning for the Social Semantic Web GeoTemporal Reasoning for the Social Semantic Web Jans Aasman Franz Inc. 2201 Broadway, Suite 715, Oakland, CA 94612, USA ja@franz.com Abstract: We demonstrate a Semantic Web application that organizes

More information

Course Title: Computer Networking 2. Course Section: CNS (Winter 2018) FORMAT: Face to Face

Course Title: Computer Networking 2. Course Section: CNS (Winter 2018) FORMAT: Face to Face Course Title: Computer Networking 2 Course Section: CNS-106-50 (Winter 2018) FORMAT: Face to Face TIME FRAME: Start Date: 15 January 2018 End Date: 28 February 2018 Monday & Wednesday 1:00pm 5:00pm CREDITS:

More information

Data Visualization (CIS 468)

Data Visualization (CIS 468) Data Visualization (CIS 468) Web Programming Dr. David Koop What is Data Visualization? 2 Exploration Communication Spectrum Consecutive Starts by a Quarterback for a Single Team Exploration Confirmation

More information

MEMBERSHIP & PARTICIPATION

MEMBERSHIP & PARTICIPATION MEMBERSHIP & PARTICIPATION What types of activities can I expect to participate in? There are a variety of activities for you to participate in such as discussion boards, idea exchanges, contests, surveys,

More information

Getting Started. What is SAS/SPECTRAVIEW Software? CHAPTER 1

Getting Started. What is SAS/SPECTRAVIEW Software? CHAPTER 1 3 CHAPTER 1 Getting Started What is SAS/SPECTRAVIEW Software? 3 Using SAS/SPECTRAVIEW Software 5 Data Set Requirements 5 How the Software Displays Data 6 Spatial Data 6 Non-Spatial Data 7 Summary of Software

More information

How to use search, recommender systems and online community to help users find what they want. Rashmi Sinha

How to use search, recommender systems and online community to help users find what they want. Rashmi Sinha The Quest for the "right item": How to use search, recommender systems and online community to help users find what they want. Rashmi Sinha Summary of the talk " Users have different types of information

More information

Acquisition Description Exploration Examination Understanding what data is collected. Characterizing properties of data.

Acquisition Description Exploration Examination Understanding what data is collected. Characterizing properties of data. Summary Statistics Acquisition Description Exploration Examination what data is collected Characterizing properties of data. Exploring the data distribution(s). Identifying data quality problems. Selecting

More information

Project Collaboration

Project Collaboration Bonus Chapter 8 Project Collaboration It s quite ironic that the last bonus chapter of this book contains information that many of you will need to get your first Autodesk Revit Architecture project off

More information

Data publication and discovery with Globus

Data publication and discovery with Globus Data publication and discovery with Globus Questions and comments to outreach@globus.org The Globus data publication and discovery services make it easy for institutions and projects to establish collections,

More information

Data Statistics Population. Census Sample Correlation... Statistical & Practical Significance. Qualitative Data Discrete Data Continuous Data

Data Statistics Population. Census Sample Correlation... Statistical & Practical Significance. Qualitative Data Discrete Data Continuous Data Data Statistics Population Census Sample Correlation... Voluntary Response Sample Statistical & Practical Significance Quantitative Data Qualitative Data Discrete Data Continuous Data Fewer vs Less Ratio

More information

Chapter 8: GPS Clustering and Analytics

Chapter 8: GPS Clustering and Analytics Chapter 8: GPS Clustering and Analytics Location information is crucial for analyzing sensor data and health inferences from mobile and wearable devices. For example, let us say you monitored your stress

More information

Data Model and Management

Data Model and Management Data Model and Management Ye Zhao and Farah Kamw Outline Urban Data and Availability Urban Trajectory Data Types Data Preprocessing and Data Registration Urban Trajectory Data and Query Model Spatial Database

More information

Chapter 1, Introduction

Chapter 1, Introduction CSI 4352, Introduction to Data Mining Chapter 1, Introduction Young-Rae Cho Associate Professor Department of Computer Science Baylor University What is Data Mining? Definition Knowledge Discovery from

More information

Scientific Visualization

Scientific Visualization Scientific Visualization Topics Motivation Color InfoVis vs. SciVis VisTrails Core Techniques Advanced Techniques 1 Check Assumptions: Why Visualize? Problem: How do you apprehend 100k tuples? when your

More information

Data Mining: Exploring Data. Lecture Notes for Chapter 3

Data Mining: Exploring Data. Lecture Notes for Chapter 3 Data Mining: Exploring Data Lecture Notes for Chapter 3 1 What is data exploration? A preliminary exploration of the data to better understand its characteristics. Key motivations of data exploration include

More information

ECONOMICS 5317: CONTEMPORARY GOVERNMENT AND BUSINESS RELATIONS

ECONOMICS 5317: CONTEMPORARY GOVERNMENT AND BUSINESS RELATIONS 1 ECONOMICS 5317: CONTEMPORARY GOVERNMENT AND BUSINESS RELATIONS Fall 2011, MWF 9:05-9:55, HCB 408 INSTRUCTOR: David VanHoose OFFICE HOURS: OFFICE: 339 Hankamer MWF 8:00-9:00 & 12:15-1:15; OFFICE PHONE:

More information

D DAVID PUBLISHING. Big Data; Definition and Challenges. 1. Introduction. Shirin Abbasi

D DAVID PUBLISHING. Big Data; Definition and Challenges. 1. Introduction. Shirin Abbasi Journal of Energy and Power Engineering 10 (2016) 405-410 doi: 10.17265/1934-8975/2016.07.004 D DAVID PUBLISHING Shirin Abbasi Computer Department, Islamic Azad University-Tehran Center Branch, Tehran

More information

Advanced Visualization

Advanced Visualization 320581 Advanced Visualization Prof. Lars Linsen Fall 2011 0 Introduction 0.1 Syllabus and Organization Course Website Link in CampusNet: http://www.faculty.jacobsuniversity.de/llinsen/teaching/320581.htm

More information

Security analytics: From data to action Visual and analytical approaches to detecting modern adversaries

Security analytics: From data to action Visual and analytical approaches to detecting modern adversaries Security analytics: From data to action Visual and analytical approaches to detecting modern adversaries Chris Calvert, CISSP, CISM Director of Solutions Innovation Copyright 2013 Hewlett-Packard Development

More information

Data Mining. Ryan Benton Center for Advanced Computer Studies University of Louisiana at Lafayette Lafayette, La., USA.

Data Mining. Ryan Benton Center for Advanced Computer Studies University of Louisiana at Lafayette Lafayette, La., USA. Data Mining Ryan Benton Center for Advanced Computer Studies University of Louisiana at Lafayette Lafayette, La., USA January 13, 2011 Important Note! This presentation was obtained from Dr. Vijay Raghavan

More information