Interactive Graphics for Statistics

Similar documents
Plot and Look! Trust your Data more than your Models

What Dataminers Want

Projekt 1 Ausarbeitung

Linked Data Views. Introduction. Starting with Scatterplots. By Graham Wills

Trellis Displays. Definition. Example. Trellising: Which plot is best? Historical Development. Technical Definition

Adobe Photoshop Sh S.K. Sublania and Sh. Naresh Chand

SpRay. an R-based visual-analytics platform for large and high-dimensional datasets. J. Heinrich 1 J. Dietzsch 1 D. Bartz 2 K.

iplots extreme Next-generation interactive graphics for analysis of large data Simon Urbanek AT&T Labs Statistics Research

L E S S O N 2 Background

The Toolbox. 2.1 Introduction

Technique and Cue Selection for Graphical Presentation of Generic Hyperdimensional Data

Pen Tool, Fill Layers, Color Range, Levels Adjustments, Magic Wand tool, and shadowing techniques

Examining a Single Variable

Working with Graphics and Text

Exploratory model analysis

Graphic Design & Digital Photography. Photoshop Basics: Working With Selection.

Advances in MicroStation 3D

Exploratory Data Analysis EDA

Chapter 2 User Interface Features. networks Window. Drawing Panel

GenStat for Schools. Disappearing Rock Wren in Fiordland

Photoshop Basics A quick introduction to the basic tools in Photoshop

Chapter 2 Assignment (due Thursday, April 19)

Advanced Techniques for Greater Accuracy, Capacity, and Speed using Maxwell 11. Julius Saitz Ansoft Corporation

Education and Training CUFMEM14A. Exercise 2. Create, Manipulate and Incorporate 2D Graphics

Chapter 10. Creating 3D Objects Delmar, Cengage Learning

USO RESTRITO. AppleWorks 6. Quick Reference

Statistical graphics in analysis Multivariable data in PCP & scatter plot matrix. Paula Ahonen-Rainio Maa Visual Analysis in GIS

LAB 1 INSTRUCTIONS DESCRIBING AND DISPLAYING DATA

Spreadsheet View and Basic Statistics Concepts

Photoshop Domain 3: Setting Project Requirements. Dreamweaver Domain 3

Points Lines Connected points X-Y Scatter. X-Y Matrix Star Plot Histogram Box Plot. Bar Group Bar Stacked H-Bar Grouped H-Bar Stacked

Edupen Pro User Manual

Chapter 2 Assignment (due Thursday, October 5)

Applied Regression Modeling: A Business Approach

GGobi meets R: an extensible environment for interactive dynamic data visualization

Lesson 1 Parametric Modeling Fundamentals

JMP Chong Ho

LEARNING TO PROGRAM WITH MATLAB. Building GUI Tools. Wiley. University of Notre Dame. Craig S. Lent Department of Electrical Engineering

An introduction to plotting data

TOON BOOM HARMONY 12.1 Preferences Guide

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 2.1- #

IDL Tutorial. Contours and Surfaces. Copyright 2008 ITT Visual Information Solutions All Rights Reserved

Virtual MODELA USER'S MANUAL

Sharing Interactive Web Reports from JMP

HYPERSTUDIO TOOLS. THE GRAPHIC TOOL Use this tool to select graphics to edit. SPRAY PAINT CAN Scatter lots of tiny dots with this tool.

The process of making a selection is fundamental to all Photoshop

SWITCHING FROM SKETCHUP TO VECTORWORKS

S U N G - E U I YO O N, K A I S T R E N D E R I N G F R E E LY A VA I L A B L E O N T H E I N T E R N E T

OnPoint s Guide to MimioStudio 9

The Wireframe Update Buttons. The Frontface and Backface Buttons. The Project Designer 265

SAS Visual Analytics 8.2: Working with Report Content

Getting Started. What is SAS/SPECTRAVIEW Software? CHAPTER 1

White Paper. Solid Wireframe. February 2007 WP _v01

Adobe Fireworks is an incredible application with specific solutions to

Quick Guides for Etoys on the OLPC XO

STP 226 ELEMENTARY STATISTICS NOTES

Keyboard Shortcuts. Command Windows Macintosh

A Visual Guide To Corel Painter 7 Keyboard Shortcuts

Adobe Flash CS3 Reference Flash CS3 Application Window

Editing Objects. Introduction

GIMP WEB 2.0 BUTTONS

Visualisation of Categorical Data

ILLUSTRATOR TUTORIAL-1 workshop handout

DSC 2003 Working Papers (Draft Versions) iplots. High interaction graphics for R

IDL Tutorial. Working with Images. Copyright 2008 ITT Visual Information Solutions All Rights Reserved

hvpcp.apr user s guide: set up and tour

GeoGebra Workbook 2 More Constructions, Measurements and Sliders

Adobe Flash CS4 Part 1: Introduction to Flash

ESCHERLIKE developed by Géraud Bousquet. User s manual C03-04 STROKE WIDTH C03-05 TRANSPARENCY C04-01 SAVE YOUR WORK C04-02 OPEN A FILE

Illustrator Domains 1-4: Getting to Know Your Workspace. Dreamweaver Domain 3

Late Penalty: Late assignments will not be accepted.

To build shapes from scratch, use the tools are the far right of the top tool bar. These

iplots High Interaction Graphics for R

University of Florida CISE department Gator Engineering. Visualization

ILLUSTRATOR. Introduction to Adobe Illustrator. You will;

1Anchors - Access. Part 23-1 Copyright 2004 ARCHIdigm. Architectural Desktop Development Guide PART 23 ANCHORS 1-23 ANCHORS

03 Vector Graphics. Multimedia Systems. 2D and 3D Graphics, Transformations

Chapter 6. Task-Related Organization. Single Menus. Menu Selection, Form Fill-in and Dialog Boxes. Binary Menus Radio Buttons Button Choice

Class 5. Data Representation and Introduction to Visualization

6 Using Technology Wisely

Interface. 2. Interface Illustrator CS H O T

Think of layers as a stack of transparencies. Layers can be changed independently of other layers by clicking on its name in the layers palette.

2.4-Statistical Graphs

BME 5742 Bio-Systems Modeling and Control

Inkscape tutorial: Donate button

User Interface Guide

Tracking Handle Menu Lloyd K. Konneker Jan. 29, Abstract

Learning to use the drawing tools

Evgeny Maksakov Advantages and disadvantages: Advantages and disadvantages: Advantages and disadvantages: Advantages and disadvantages:

Construction Change Order analysis CPSC 533C Analysis Project

Step by Step GIS. Section 1

Assignment 1 Photoshop CAD Fundamentals I Due January 18 Architecture 411

Software User s Manual

Integrated Mathematics I Performance Level Descriptors

Shortcut Keys for Adobe Photoshop (Educational Support)

Road Map for Essential Studio 2010 Volume 1

3D Model Based Mechanisms Learning Software User Manual (Version 4) Freely Available for Academic Use!!! September 2015

Mesh Quality Tutorial

USING THE PHOTOSHOP TOOLBOX

Survey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9

Transcription:

Interactive Graphics for Statistics Principles & Examples martin.theus@math.uni-augsburg.de

2 Graphics in Statistics: Diagnostics vs. Analysis Continuous Data

2 Graphics in Statistics: Diagnostics vs. Analysis Continuous Data

3 Graphics in Statistics: Diagnostics vs. Analysis Categorical Data

3 Graphics in Statistics: Diagnostics vs. Analysis Categorical Data

3 Graphics in Statistics: Diagnostics vs. Analysis Categorical Data

4 Graphics for Data Analysis is not based on formal theory but has proven to be very effective is not taught in standard curricula but used by most statisticians once they left college is supporting an interactive and iterative exploration of data thus needs interactive software tools is not well supported by most statistics software but there is software that helps

5 Interactive Graphical Tools for DA: History

5 Interactive Graphical Tools for DA: History 1973 PRIM-9 Tukey et al.

5 Interactive Graphical Tools for DA: History 1983 SPLOM Becker et al. 1973 PRIM-9 Tukey et al.

5 Interactive Graphical Tools for DA: History 1983 SPLOM Becker et al. 1973 PRIM-9 Tukey et al. 1985 DataDesk Vellemann

5 Interactive Graphical Tools for DA: History 1983 SPLOM Becker et al. 1991 XGobi Buja et al. 1973 PRIM-9 Tukey et al. 1985 DataDesk Vellemann

5 Interactive Graphical Tools for DA: History 1983 SPLOM Becker et al. 1999 ggobi 1991 Swayne XGobi et al. Buja et al. 1973 PRIM-9 Tukey et al. 1985 DataDesk Vellemann

5 Interactive Graphical Tools for DA: History 1983 SPLOM Becker et al. 1999 ggobi 1991 Swayne XGobi et al. Buja et al. 1973 PRIM-9 Tukey et al. 1985 DataDesk Vellemann 1993 MANET Unwin et al.

5 Interactive Graphical Tools for DA: History 1983 SPLOM Becker et al. 1999 ggobi 1991 Swayne XGobi et al. Buja et al. 1997 Mondrian Theus 136/ 437 1973 PRIM-9 Tukey et al. 1985 DataDesk Vellemann 1993 MANET Unwin et al.

5 Interactive Graphical Tools for DA: History 1983 SPLOM Becker et al. 1999 ggobi 1991 Swayne XGobi et al. Buja et al. 1997 Mondrian Theus 136/ 437? 1973 PRIM-9 Tukey et al. 1985 DataDesk Vellemann 1993 MANET Unwin et al. 2006 StatStudio SAS

5 Interactive Graphical Tools for DA: History 1983 SPLOM Becker et al. 1999 ggobi 1991 Swayne XGobi et al. Buja et al. 1997 Mondrian Theus 136/ 437? 1973 1985 1993 2006 PRIM-9 DataDesk MANET StatStudio Tukey et al. Vellemann Unwin et al. SAS

5 Interactive Graphical Tools for DA: History 1983 SPLOM Becker et al. 1999 ggobi 1991 Swayne XGobi et al. Buja et al. 1997 Mondrian Theus 136/ 437? 1973 PRIM-9 Tukey et al. 1985 DataDesk Vellemann 1993 MANET Unwin et al. 2006 StatStudio SAS

5 Interactive Graphical Tools for DA: History 1983 SPLOM Becker et al. 1999 ggobi 1991 Swayne XGobi et al. Buja et al. 1997 Mondrian Theus 136/ 437 1973 PRIM-9 Tukey et al.? 1985 1993 2006 DataDesk MANET StatStudio Vellemann Unwin et al. SAS

5 Interactive Graphical Tools for DA: History 1983 SPLOM Becker et al. 1999 ggobi 1991 Swayne XGobi et al. Buja et al. 1997 Mondrian Theus 136/ 437 1973 PRIM-9 Tukey et al.? 1985 1993 2006 DataDesk MANET StatStudio Vellemann Unwin et al. SAS

5 Interactive Graphical Tools for DA: History 1983 SPLOM Becker et al. 1999 ggobi 1991 Swayne XGobi et al. Buja et al. 1997 Mondrian Theus 136/ 437 1973 PRIM-9 Tukey et al. 1985 1993 DataDesk MANET Vellemann Unwin et al.? 2006 StatStudio SAS people forget far faster than they learn give them time!

6 On Hammers and Nails No one single tool is universal enough to cover every problem we might want to handle in data analysis use the best suited!?

6 On Hammers and Nails No one single tool is universal enough to cover every problem we might want to handle in data analysis use the best suited!? Iris data in a Fluctuation Diagram

6 On Hammers and Nails No one single tool is universal enough to cover every problem we might want to handle in data analysis use the best suited!? Iris data in a Fluctuation Diagram Titanic data in a scatterplot

7 What is Interactive Statistical Graphics about?

7 What is Interactive Statistical Graphics about? Direct manipulation is at the core (classical computational statistics was 90% algorithm and 10% interface Interactive Statistical Graphics is 10% algorithm and 90% interface!) Selection of subgroups, i.e., conditioning Modification of plot parameters

7 What is Interactive Statistical Graphics about? Direct manipulation is at the core (classical computational statistics was 90% algorithm and 10% interface Interactive Statistical Graphics is 10% algorithm and 90% interface!) Selection of subgroups, i.e., conditioning Modification of plot parameters What makes a graphics interactive? data can be selected (selection state is actually only one possible attribute of the data) support for highlighting objects can be queried

7 What is Interactive Statistical Graphics about? Direct manipulation is at the core (classical computational statistics was 90% algorithm and 10% interface Interactive Statistical Graphics is 10% algorithm and 90% interface!) Selection of subgroups, i.e., conditioning Modification of plot parameters What makes a graphics interactive? data can be selected (selection state is actually only one possible attribute of the data) support for highlighting objects can be queried Interactive Graphics Dynamic Graphics dynamic variation of plot parameters (animation) was the first interaction to be implemented 30 years ago, but is now out-dated.

8 Where did Linking go? Linking is used to (optionally) propagate attributes of the data Example:

8 Where did Linking go? Linking is used to (optionally) propagate attributes of the data Example: Set Dataset Attributes selection color 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 Set Axes Mark: [0, 4, 0.25]

8 Where did Linking go? Linking is used to (optionally) propagate attributes of the data Example: Set Dataset Attributes selection color 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 Get Get Set Axes Mark: [0, 4, 0.25]

9 More on Linking

9 More on Linking Linking is not restricted to the highlight state

9 More on Linking Linking is not restricted to the highlight state Case linking (transient) highlighting (persistent) color brushing symbols visibility

9 More on Linking Linking is not restricted to the highlight state Case linking (transient) highlighting (persistent) color brushing symbols visibility Axes linking continuous intervals categorical orderings

9 More on Linking Linking is not restricted to the highlight state Case linking (transient) highlighting (persistent) color brushing symbols visibility Axes linking continuous intervals categorical orderings Parameter linking

9 More on Linking Linking is not restricted to the highlight state Case linking (transient) highlighting (persistent) color brushing symbols visibility Axes linking continuous intervals categorical orderings Parameter linking Linking of attributes of the data (or any other incarnation of the data) is optionally.

9 More on Linking Linking is not restricted to the highlight state Case linking (transient) highlighting (persistent) color brushing symbols visibility Axes linking continuous intervals categorical orderings Parameter linking Linking of attributes of the data (or any other incarnation of the data) is optionally. Setting and/or getting of attributes may have constraints (group selections, cross-relation selections, )

10 More on Selections Different tools can be used (point, rectangle, slice, lasso, brush, ) most of them select parallel or orthogonal to variable axes. Successive selection steps must be combinable with boolean operators (single step selections can only select trivial sets). Multiple brushes can be used to define a sequence of selections. Painting (persistent brush selection) is actually covered by a selection sequence when all brushes are in OR mode. Prefer parametric selection over arbitrary selections. (How to communicate a lasso selection, or a selection in some high-dimensional rotated space?)

11 More on Highlighting Implementation of highlighting for glyph-based plots is trivial. Highlighting for rectangle based plots is still straight forward. The highlighting in a plot not necessarily is of the same type as the plot itself, but it must be faithful to the plots definition. Highlighting is a transient attribute of the data and usually compares to the complete sample, Color brushing defines a persistent group membership, and thus the groups should compare between each other. Not all plots can be used to implement (sensible) highlighting

12 More on Queries Graphics are good at communicating qualitative information but fail to give exact quantities need queries to get exact values. Interactive graphics often display very little scale information (cf. Tufte s data-ink-ratio ). The level of detail of a query should have optional granularities, e.g. scatterplot:

12 More on Queries Graphics are good at communicating qualitative information but fail to give exact quantities need queries to get exact values. Interactive graphics often display very little scale information (cf. Tufte s data-ink-ratio ). The level of detail of a query should have optional granularities, e.g. scatterplot: orientation

12 More on Queries Graphics are good at communicating qualitative information but fail to give exact quantities need queries to get exact values. Interactive graphics often display very little scale information (cf. Tufte s data-ink-ratio ). The level of detail of a query should have optional granularities, e.g. scatterplot: orientation standard

12 More on Queries Graphics are good at communicating qualitative information but fail to give exact quantities need queries to get exact values. Interactive graphics often display very little scale information (cf. Tufte s data-ink-ratio ). The level of detail of a query should have optional granularities, e.g. scatterplot: orientation standard extended

13 Interacting with Plots Typical manipulations comprise of: Selections Change of scale axes zoom (includes logical zooming) (order) Change of order (both manually and automatically) categories in a barchart variables in parallel coordinates variables in a mosaic plot Change of plot parameters, e.g. anchor point, bin width in histogram point size in scatterplots Important: Identify general concepts!

14 On the Interface The usability of an interactive, computer based system can be strongly improved by strict conventions. All interactions (selection, queries, ordering, zoom, ) MUST be implemented consistently across all plots and summaries.

14 On the Interface The usability of an interactive, computer based system can be strongly improved by strict conventions. All interactions (selection, queries, ordering, zoom, ) MUST be implemented consistently across all plots and summaries. single mouse-click keyboard shortcut slider, mouse-drag pop-up menu floating palette pull-down menu Amount of Information & Complexity Speed & Context Preservation modal dialog command line

15 Less can be more Software should offer a clearly defined (orthogonal) set of functions, which can be combined to get a wider range of functionality. Examples: Painting Mode is a brush in OR mode and covered by selection sequences Categorizing a continuous variable via a histogram Functions are less obvious for a novice user!

15 Less can be more Software should offer a clearly defined (orthogonal) set of functions, which can be combined to get a wider range of functionality. Examples: Painting Mode is a brush in OR mode and covered by selection sequences Categorizing a continuous variable via a histogram 1918 2001 Functions are less obvious for a novice user!

15 Less can be more Software should offer a clearly defined (orthogonal) set of functions, which can be combined to get a wider range of functionality. Examples: Painting Mode is a brush in OR mode and covered by selection sequences Categorizing a continuous variable via a histogram 1918 2001 1918 2001 Functions are less obvious for a novice user!

15 Less can be more Software should offer a clearly defined (orthogonal) set of functions, which can be combined to get a wider range of functionality. Examples: Painting Mode is a brush in OR mode and covered by selection sequences Categorizing a continuous variable via a histogram 1918 2001 1918 2001 Functions are less obvious for a novice user!

15 Less can be more Software should offer a clearly defined (orthogonal) set of functions, which can be combined to get a wider range of functionality. Examples: Painting Mode is a brush in OR mode and covered by selection sequences Categorizing a continuous variable via a histogram 1918 2001 1918 2001 Functions are less obvious for a novice user!

15 Less can be more Software should offer a clearly defined (orthogonal) set of functions, which can be combined to get a wider range of functionality. Examples: Painting Mode is a brush in OR mode and covered by selection sequences Categorizing a continuous variable via a histogram 1918 2001 1918 2001 Functions are less obvious for a novice user!

15 Less can be more Software should offer a clearly defined (orthogonal) set of functions, which can be combined to get a wider range of functionality. Examples: Painting Mode is a brush in OR mode and covered by selection sequences Categorizing a continuous variable via a histogram 1918 2001 1918 2001 1 2 Functions are less obvious for a novice user! 3 4 5

15 Less can be more Software should offer a clearly defined (orthogonal) set of functions, which can be combined to get a wider range of functionality. Examples: Painting Mode is a brush in OR mode and covered by selection sequences Categorizing a continuous variable via a histogram 1918 2001 1918 2001 1 2 Functions are less obvious for a novice user! 3 4 5

16 Rendering Quality is an Issue! Most graphics systems (S, R, SAS, ) were designed at times where pen on paper was the rendering model. Drawing on a computer screen will introduce rounding problem; the size of small objects is usually far off from what it should be. Anti-aliased graphics and sub-pixel rendering are not only a matter of taste. Overplotting is not an inherent problem of computer graphics, but if software does not handle it decently, the interpretation of (glyph based) graphics can lead to wrong conclusions. (extremely misleading with color brushing!)

17 Conclusion

17 Conclusion Graphics for data analysis is still rare, as the focus of statistical education is often not on data analysis. it is not taught at universities (except for some misfits). there is not much continuity in tools yet. it still lacks possible formalization.

17 Conclusion Graphics for data analysis is still rare, as the focus of statistical education is often not on data analysis. it is not taught at universities (except for some misfits). there is not much continuity in tools yet. it still lacks possible formalization. Formalization can be done for some parts of the techniques involved (IMHO it does not make sense to go for a global model).

17 Conclusion Graphics for data analysis is still rare, as the focus of statistical education is often not on data analysis. it is not taught at universities (except for some misfits). there is not much continuity in tools yet. it still lacks possible formalization. Formalization can be done for some parts of the techniques involved (IMHO it does not make sense to go for a global model). Unfortunately statistical computing degraded to implementing yet another package for R not much hope for new software.

17 Conclusion Graphics for data analysis is still rare, as the focus of statistical education is often not on data analysis. it is not taught at universities (except for some misfits). there is not much continuity in tools yet. it still lacks possible formalization. Formalization can be done for some parts of the techniques involved (IMHO it does not make sense to go for a global model). Unfortunately statistical computing degraded to implementing yet another package for R not much hope for new software. Apart from a formal description in a mathematical sense, the way of teaching concepts and tools via case studies seems to be quite promising

18 Commercial Break