CS Information Visualization Sep. 2, 2015 John Stasko

Similar documents
CS Information Visualization Sep. 19, 2016 John Stasko

Parallel Coordinates ++

CS 4460 Intro. to Information Visualization Sep. 18, 2017 John Stasko

Multiple Dimensional Visualization

HYPERVARIATE DATA VISUALIZATION

3. Multidimensional Information Visualization II Concepts for visualizing univariate to hypervariate data

Information Visualization. Jing Yang Spring Multi-dimensional Visualization (1)

CP SC 8810 Data Visualization. Joshua Levine

Large Scale Information

Interactive Visual Exploration

CS 4460 Intro. to Information Visualization October 18, 2017 John Stasko

Interaction. CS Information Visualization. Chris Plaue Some Content from John Stasko s CS7450 Spring 2006

Few s Design Guidance

Interaction. Interaction? What do you mean by interaction? CS Information Visualization September 21, 2015 John Stasko

CS Information Visualization September 26, 2016 John Stasko

Visual Computing. Lecture 2 Visualization, Data, and Process

Data Visualization. Fall 2016

Visual Analysis of Set Relations in a Graph

Glyphs. Presentation Overview. What is a Glyph!? Cont. What is a Glyph!? Glyph Fundamentals. Goal of Paper. Presented by Bertrand Low

Parallel Coordinates CS 6630 Scientific Visualization

Graphs / Networks CSE 6242/ CX Interactive applications. Duen Horng (Polo) Chau Georgia Tech

Hierarchies and Trees 1 (Node-link) CS Information Visualization November 12, 2012 John Stasko

Clustering and Visualisation of Data

SAS Visual Analytics 8.2: Working with Report Content

Multivariate Data & Tables and Graphs. Agenda. Data and its characteristics Tables and graphs Design principles

Lecture Topic Projects

Data Mining: Exploring Data. Lecture Notes for Chapter 3

Toward a Deeper Understanding of the Role of Interaction in Information Visualization

MSA220 - Statistical Learning for Big Data

Edge Equalized Treemaps

3. Multidimensional Information Visualization I Concepts for visualizing univariate to hypervariate data

A STUDY OF PATTERN ANALYSIS TECHNIQUES OF WEB USAGE

Data Mining: Exploring Data. Lecture Notes for Data Exploration Chapter. Introduction to Data Mining

Statistical Graphs & Charts

Data Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining

Knowledge Discovery and Data Mining I

Hierarchies and Trees 1 (Node-link) CS 4460/ Information Visualization March 10, 2009 John Stasko

cs6964 February TABULAR DATA Miriah Meyer University of Utah

cs6964 March TREES & GRAPHS Miriah Meyer University of Utah

HYBRID FORCE-DIRECTED AND SPACE-FILLING ALGORITHM FOR EULER DIAGRAM DRAWING. Maki Higashihara Takayuki Itoh Ochanomizu University

SAS Visual Analytics 8.2: Getting Started with Reports

Multi-Dimensional Vis

Social Visualization

InfoVis Systems & Toolkits

Multidimensional Visualization and Clustering

InfoVis Systems & Toolkits

Information Visualization in Data Mining. S.T. Balke Department of Chemical Engineering and Applied Chemistry University of Toronto

Background. Parallel Coordinates. Basics. Good Example

Dust and Magnets. Don Beaver Tuesday, April 17, 12

DESIGN MOBILE APPS FOR ANDROID DEVICES

Lecture Topic Projects

INTERACTION IN VISUALIZATION. Petra Isenberg

Survey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9

Storytelling in InfoVis

Introduc)on to Informa)on Visualiza)on

Chapter 2: Looking at Multivariate Data

Lecture 5: DATA MAPPING & VISUALIZATION. November 3 rd, Presented by: Anum Masood (TA)

Chapter 2: From Graphics to Visualization

Infographics and Visualisation (or: Beyond the Pie Chart) LSS: ITNPBD4, 1 November 2016

Become strong in Excel (2.0) - 5 Tips To Rock A Spreadsheet!

Multivariate Data & Tables and Graphs. Agenda. Data and its characteristics Tables and graphs Design principles

TNM093 Tillämpad visualisering och virtuell verklighet. Jimmy Johansson C-Research, Linköping University

CSE 255 Lecture 6. Data Mining and Predictive Analytics. Community Detection

Tree-mapping Based App Access System for ios Platform

Courtesy of Prof. Shixia University

Interaction. Interaction? What do you mean by interaction? CS 4460/ Information Visualization Feb. 24, 2009 John Stasko

What is Interaction?

How to use FSBforecast Excel add in for regression analysis

SpringView: Cooperation of Radviz and Parallel Coordinates for View Optimization and Clutter Reduction

Perception Maneesh Agrawala CS : Visualization Fall 2013 Multidimensional Visualization

CSE Data Visualization. Multidimensional Vis. Jeffrey Heer University of Washington

Bar Charts and Frequency Distributions

Trees & Graphs. Nathalie Henry Riche, Microsoft Research

CSE 258 Lecture 6. Web Mining and Recommender Systems. Community Detection

DS504/CS586: Big Data Analytics Big Data Clustering II

ay/bi199: methods of computational science visualization jumpstart+tools+techniques santiago v lombeyda center for advanced computing research caltech

interaction path analysis

Exercise: Graphing and Least Squares Fitting in Quattro Pro

Strategic Series-7001 Introduction to Custom Screens Version 9.0

Facet: Multiple View Methods

Information Visualization - Introduction

Value and Relation Display for Interactive Exploration of High Dimensional Datasets

At the end of the chapter, you will learn to: Present data in textual form. Construct different types of table and graphs

Interactive Visualization of Fuzzy Set Operations

Optimization and least squares. Prof. Noah Snavely CS1114

Information Visualization. Jing Yang Spring Hierarchy and Tree Visualization

Data Visualization Techniques

Geometric Techniques. Part 1. Example: Scatter Plot. Basic Idea: Scatterplots. Basic Idea. House data: Price and Number of bedrooms

Network visualization techniques and evaluation

Data Mining: Exploring Data. Lecture Notes for Chapter 3

Tufte s Design Principles

GRADE 5 UNIT 5 SHAPE AND COORDINATE GEOMETRY Established Goals: Standards

Introducing Adobe Photoshop CC

Clustering and Dimensionality Reduction

Visual Encoding Design

Outlier detection using autoencoders

DS504/CS586: Big Data Analytics Big Data Clustering II

DSC 201: Data Analysis & Visualization

Data Analysis More Than Two Variables: Graphical Multivariate Analysis

The first thing we ll need is some numbers. I m going to use the set of times and drug concentration levels in a patient s bloodstream given below.

Transcription:

Multivariate Visual Representations 2 CS 7450 - Information Visualization Sep. 2, 2015 John Stasko Recap We examined a number of techniques for projecting >2 variables (modest number of dimensions) down onto the 2D plane Scatterplot matrix Table lens Parallel coordinates etc. Fall 2015 CS 7450 2 1

Varieties of Techniques Fall 2015 CS 7450 3 Can We Make a Taxonomy? D. Keim proposes a taxonomy of techniques Standard 2D/3D display Bar charts, scatterplots Geometrically transformed display Parallel coordinates Iconic display Needle icons, Chernoff faces Dense pixel display What we re about to see Stacked display Treemaps, dimensional stacking TVCG 02 Fall 2015 CS 7450 4 2

Minimum Possible? We have data cases with variables What s the smallest representation we can use? How? Fall 2015 CS 7450 5 Dense Pixel Display Represent data case or a variable as a pixel Million or more per display Seems to rely on use of color Can pack lots in Challenge: What s the layout? Fall 2015 CS 7450 6 3

One Representation Grouping arrangement One pixel per variable Each data case has its own small rectangular icon Plot out variables for data point in that icon using a grid or spiral layout Uses color scale Fall 2015 CS 7450 7 Illustration Levkowitz Vis 91 Fall 2015 CS 7450 8 4

Related Idea Pixel Bar Chart Overload typical bar chart with more information about individual elements Keim et al Information Visualization 02 Fall 2015 CS 7450 9 Idea 1 Height encodes quantity Width encodes quantity Fall 2015 CS 7450 10 5

Idea 2 Make each pixel within a bar correspond to a data point in that group represented by the bar Can do millions that way Color the pixel to represent the value of one of the data point s variables Fall 2015 CS 7450 11 Idea 3 Each pixel is a customer Color encodes amount spent by that person High-bright, Low-dark Ordered by that color attribute too Right one shows more customers Fall 2015 CS 7450 12 6

Idea 4 Product type is x-axis divider Customers ordered by y-axis: dollar amount x-axis: number of visits Color is (a) dollar amount spent, (b) number of visits, (c) sales quantity Fall 2015 CS 7450 13 Example Application 1. Product type 7 and product type 10 have the top dollar amount customers (dark colors of bar 7 and 10 in Figure 13a) 2. The dollar amount spent and the number of visits are clearly correlated, especially for product type 4 (linear increase of dark colors at the top of bar 4 in Figure 13b) 3. Product types 4 and 11 have the highest quantities sold (dark colors of bar 4 and 11 in Figure 13c) 4. Clicking on pixel A shows details for that customer Fall 2015 CS 7450 14 7

Thoughts? Do you think that would be a helpful exploratory tool? Fall 2015 CS 7450 15 High Dimensions Those techniques could show lots of data, but not so many dimensions at once Have to pick and choose Fall 2015 CS 7450 16 8

Another Idea Use the dense pixel display for showing data and dimensions, but then project into 2D plane to encode more information VaR Value and relation display Yang et al InfoVis 04 Fall 2015 CS 7450 17 Algorithm Find a correlation function for comparing dimensions Calculate distances between dimensions (similarities) Make each dimension into a dense pixel glyph Assign position for each glyph in 2D plane using multi-dimensional scaling Fall 2015 CS 7450 18 9

Individual Dimensions Fall 2015 CS 7450 19 Questions What order are the data cases in each dimension-glyph? Maybe there is a predefined order Choose one dimension as important then order data cases by their values in that dimension Important one may be the one in which many cases are similar Fall 2015 CS 7450 20 10

Alternative Instead of each glyph being a dimension, it can be a data case Fall 2015 CS 7450 21 Follow-on Work Use alternate positioning strategies other than MDS Use Jigsaw map idea (Wattenberg, InfoVis 05) to lay out the dimensions into a grid Removes overlap Limits number that can be plotted Yang et al TVCG 07 Fall 2015 CS 7450 22 11

New Layout Plot the glyphs into the grid positions Fall 2015 CS 7450 23 Very Different Metaphor Represent each data case as a small glyph Make interaction be a crucial part of the visualization Fall 2015 CS 7450 24 12

Dust & Magnet Altogether different metaphor Data cases represented as small bits of iron dust Different attributes given physical manifestation as magnets Interact with objects to explore data Yi, Melton, Stasko & Jacko Information Visualization 05 Fall 2015 CS 7450 25 Interface Fall 2015 CS 7450 26 13

Interaction Iron bits (data) are drawn toward magnets (attributes) proportional to that data element s value in that attribute Higher values attracted more strongly All magnets present on display affect position of all dust Individual power of magnets can be changed Dust s color and size can connected to attributes as well Fall 2015 CS 7450 27 Interaction Moving a magnet makes all the dust move Also command for shaking dust Different strategies for how to position magnets in order to explore the data Fall 2015 CS 7450 28 14

See It Live ftp://ftp.cc.gatech.edu/pub/people/stasko/movies/dnm.mov Video & Demo Fall 2015 CS 7450 29 Kinetica Video Stress physics metaphor Touch interaction on tablet Rzeszotarski & Kittur CHI 14 Fall 2015 CS 7450 30 15

Go Big Video Dust & Magnet on a large multitouch display Dai, Sadana, Stolper & Stasko InfoVis 15 Poster Fall 2015 CS 7450 31 Set Data & Operations Different type of problem Large set of items, each can be in one or more sets How do we visually represent the set membership? Fall 2015 CS 7450 32 16

Standard Technique Venn Diagram A B C Contains all possible zones of overlap Fall 2015 CS 7450 33 Alternately Euler Diagram Does not necessarily show all possible overlap zones http://en.wikipedia.org/wiki/file:british_isles_euler_diagram_15.svg But what s the problem? Fall 2015 CS 7450 34 17

Bubble Sets Video Collins et al TVCG (InfoVis) 09 Fall 2015 CS 7450 35 ComED & DupED Item can appear more than once Item appears once Video Riche & Dwyer TVCG (InfoVis) 10 Fall 2015 CS 7450 36 18

OnSet Represent set as a box, elements are spots in that box Use interaction to do set union, intersection Sadana, Major, Dove & Stasko TVCG (InfoVis) 14 Fall 2015 CS 7450 37 A MultiLayer OR with three sets. A MultiLayer AND of nested OR layers. Dragging and dropping a PixelLayer to create a new AND MultiLayer. OnSet shows the similarity of two sets via the thickness of a band between them. Hovering over a similarity band highlights the common elements between two sets. Demo/video Fall 2015 CS 7450 38 19

Nice Review Fall 2015 CS 7450 39 Step Back Most of the techniques we ve examined work for a modest number of data cases or variables What happens when you have lots and lots of data cases and/or variables? Fall 2015 CS 7450 40 20

Many Cases Out5d dataset (5 dimensions, 16384 data items) (courtesy of J. Yang) Fall 2015 CS 7450 41 Many Variables Fall 2015 CS 7450 42 21

Strategies How are we going to deal with such big datasets with so many variables per case? Ideas? Fall 2015 CS 7450 43 General Notion Data that is similar in most dimensions ought to be drawn together Cluster at high dimensions Need to project the data down into the plane and give it some ultra-simplified representation Or perhaps only look at certain aspects of the data at any one time Fall 2015 CS 7450 44 22

Mathematical Assistance 1 There exist many techniques for clustering high-dimensional data with respect to all those dimensions Affinity propagation k-means Expectation maximization Hierarchical clustering Fall 2015 CS 7450 45 Mathematical Assistance 2 There exist many techniques for projecting n-dimensions down to 2-D (dimensionality reduction) Multi-dimensional scaling (MDS) Principal component analysis Linear discriminant analysis Factor analysis Comput Sci & Eng courses Data & Visual Analytics, Prof. Chau Data mining Knowledge discovery Fall 2015 CS 7450 46 23

Other Techniques Other techniques exist to manage scale Sampling We only include every so many data cases or variables Aggregation We combine many data cases or variables Interaction (later) Employ user interaction rather than special renderings to help manage scale Fall 2015 CS 7450 47 Use? What kinds of questions/tasks would you want such techniques to address? Clusters of similar data cases Useless dimensions Dimensions similar to each other Outlier data cases Think about the cognitive tasks we want to accomplish Fall 2015 CS 7450 48 24

Recap We ve seen many general techniques for multivariate data these past two days Know strengths and limitations of each Know which ones are good for which circumstances We still haven t explored interaction much Fall 2015 CS 7450 49 Visualization of the Day Everyone posts one Use tumblr Overview on class webpages Details on t-square Please comment & share thoughts Part of participation grade Fall 2015 CS 7450 50 25

Project Overview Topics Last.fm example Teams Teams & Topics due Monday 14th You must meet me or TA before then Bring 3 copies Fall 2015 CS 7450 51 HW 1 Recap Fall 2015 CS 7450 52 26

Design Challenge Smart Phones sold by OS Challenge: Help someone understand the competitive landscape in this area Projections Source: Gartner Fall 2015 CS 7450 53 Upcoming Labor Day holiday Visualization Programming Tutorial Reading Murray online book InfoVis Systems & Toolkits Reading: Viegas et al, 07 Fall 2015 CS 7450 54 27