Visualizing high-dimensional data:

Applying graph theory to data visualization
Wayne Oldford
based on joint work with Catherine Hurley (Maynooth, Ireland) and Adrian Waddell (Waterloo, Canada)

Challenge
- p values on each of n individuals
- modern data: n, or p, or both, can be very large
- can have non-obvious variables, complex, unanticipated structure,...

Data Visualization
- powerful human visual system
- use a variety of cues: proximity, movement, shape, colour, texture,...
- patterns, relations, like and unlike,...
- recognition and discovery
- structure need not be anticipated

Challenge: Large p
- visually, we're constrained to small p
- locations: p < 4
- use colour, shape, texture, movement,...
- comprehension depends on only a few dimensions... at a time
Approach: large number of low dimensional views
- (p choose d) d-dimensional views, preferably highly interactive
- Which dimensions? How connected? How explored?

Axis systems
Choice of coordinate axis layout:
- Radial (PairViz R package)
- Parallel (PairViz R package)
- Orthogonal (RnavGraph R package)
Punchline: graph theory framework for exploratory data analysis looks very promising

Radial axes
7 dim: A, B, C, D, E, F, G
[Figure: equi-angular radial axes labelled A to G]
- Length of each radius is proportional to value of variate for that case.
- High dimensional data, each case represented by a star shaped glyph.
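
A minimal sketch of how such a glyph could be computed, assuming each variate has already been rescaled to [0, 1]. The function name `star_glyph` and the sample values are hypothetical and are not taken from PairViz; the optional `order` argument only illustrates that the same case gives a different shape when the axes are assigned in a different order.

```python
import math

def star_glyph(values, order=None):
    """Return the (x, y) vertices of a star glyph for one case.

    values: variate values, assumed already rescaled to [0, 1].
    order:  optional sequence of indices giving the axis order.
    """
    order = list(range(len(values))) if order is None else list(order)
    k = len(order)
    vertices = []
    for i, j in enumerate(order):
        angle = 2 * math.pi * i / k   # equi-angular radial axes
        r = values[j]                 # radius proportional to the value
        vertices.append((r * math.cos(angle), r * math.sin(angle)))
    return vertices

# Made-up case in 7 dimensions (A, B, C, D, E, F, G).
case = [1.0, 0.5, 0.2, 0.8, 0.4, 0.9, 0.3]
glyph = star_glyph(case)
reordered = star_glyph(case, order=[0, 2, 4, 6, 1, 3, 5])
```

Joining the vertices in sequence and closing the polygon draws the glyph.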

Radial axes:
[Figure: star glyphs for cases 1 to 9]
- Compare cases by shape of glyph, here 9 cases in 7 dimensions
- Visually cluster high dimensional data by shape: {7,8,9,1} {2,3} {4} {5,6}?
- What if the variables were assigned in a different order?

Radial axes: order effect
[Figure: star glyphs for cases 1 to 9 under each axis order]
- Dataset order H0: {7,8,9,1} {2,3} {4} {5,6}?
- Order H1: {7,8,9,1} {2,3} {4} {5} {6}?
- Order H2: {1,4} {2,3} {5,6} {7,8,9}?
- Order H3: {1} {2} {3} {4} {5} {6} {7,8,9}?

Radial axes: reduced order effect
Instead, have all pairs of variables appear together.
[Figure: graph with nodes A, B, C, D, E, F, G]
- Default was an arbitrary selection of pairs such that all nodes were visited once. If via a path (cycle), it is a Hamiltonian path (cycle).
- Visiting all edges would give all pairs in some order.
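
A walk that visits every edge exactly once is an Eulerian tour. The sketch below uses Hierholzer's algorithm, a standard method chosen here for illustration and not necessarily what PairViz does, on the complete graph over the 7 variables. The function name `eulerian_tour` is hypothetical. The complete graph needs an odd number of nodes so that every node has even degree.

```python
from itertools import combinations

def eulerian_tour(nodes):
    """Hierholzer's algorithm on the complete graph over `nodes`.

    Requires an odd number of nodes so that every node has even degree.
    """
    if len(nodes) % 2 == 0:
        raise ValueError("complete graph needs an odd number of nodes")
    unused = {v: set(nodes) - {v} for v in nodes}   # unvisited neighbours
    stack, tour = [nodes[0]], []
    while stack:
        v = stack[-1]
        if unused[v]:
            w = min(unused[v])        # deterministic choice of next edge
            unused[v].remove(w)
            unused[w].remove(v)
            stack.append(w)
        else:
            tour.append(stack.pop())  # no unused edge left at v: emit it
    return tour[::-1]

variables = list("ABCDEFG")
tour = eulerian_tour(variables)
# Every pair of variables appears as neighbours somewhere along the tour.
adjacent = {frozenset(p) for p in zip(tour, tour[1:])}
```

Reading the tour as an axis order places each of the 21 variable pairs side by side exactly once.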

Radial axes: reduced order effect
[Figure: the complete graph on A, B, C, D, E, F, G shown with an Eulerian tour, with Hamiltonian cycles, and as a Hamiltonian decomposition, in which the complete graph equals the sum of three Hamiltonian cycles]

Radial axes: reduced order effect
A Hamiltonian decomposition
- The Hamiltonian cycles, when assembled, form an Eulerian cycle composed of Hamiltonians.
- Could build a glyph from these cycles (21 radii instead of 7).
[Figure: star glyphs for cases 1 to 9 drawn with the Hamiltonian decomposition order H1:H2:H3]
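
One way to obtain such a decomposition is sketched below, assuming Walecki's classical zigzag construction for a complete graph on an odd number of nodes. The cycles it produces need not match the H1, H2, H3 used in the talk, and the function name `hamiltonian_decomposition` is hypothetical.

```python
from itertools import combinations

def hamiltonian_decomposition(nodes):
    """Walecki's construction: split the complete graph on an odd number
    of nodes into (n - 1) / 2 edge-disjoint Hamiltonian cycles."""
    n = len(nodes)
    if n % 2 == 0:
        raise ValueError("needs an odd number of nodes")
    hub, rim = nodes[0], nodes[1:]
    m = (n - 1) // 2
    cycles = []
    for i in range(m):
        # Zigzag over the rim: i, i+1, i-1, i+2, i-2, ..., i+m (mod 2m).
        offsets = [0]
        for step in range(1, m):
            offsets += [step, -step]
        offsets.append(m)
        path = [rim[(i + d) % (2 * m)] for d in offsets]
        cycles.append([hub] + path + [hub])
    return cycles

variables = list("ABCDEFG")
cycles = hamiltonian_decomposition(variables)
edges = [frozenset(e) for c in cycles for e in zip(c, c[1:])]
# All cycles start and end at the hub, so joining them gives one Eulerian tour.
tour = [variables[0]] + [v for c in cycles for v in c[1:]]
```

With 7 variables this yields 3 cycles of 7 edges each, which is where the 21 radii come from.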

Radial axes glyphs
A Hamiltonian decomposition (Hamiltonian decomp, H1:H2:H3)
- Visual clusters: {1,4} {2,3} {5,6} {7,8,9}
[Figure: star glyphs for cases 1 to 9 beside a clustering dendrogram (vertical axis: Height)]

Radial axes glyphs
A Greedy Eulerian (maximizing pairwise correlation)
[Figure: star glyphs in Eulerian order, a clustering dendrogram (vertical axis: Height), and a plot of the first two principal components. Labelled cars: Maserati, Duster 360, Ferrari Dino, Mercedes 450SE, Mercedes 450SL, Mercedes 450SLC, Mazda RX4, Porsche 914, Lotus Europa]
- All pairs (Greedy Eulerian or Hamiltonian decomposition) reduce the effect of variable pair patterns, making star glyphs more reliable.

Parallel Axes
[Figure: two parallel axes, X and Y]
- Each point is a curve in two dimensions

Parallel Coordinates
[Figure: four parallel axes, W, X, Y, Z]
- Each point is a curve in two dimensions
- Have as many dimensions as the display permits

Parallel Coordinates
Gaspé Irises: each flower is a curve in two dimensions, positioned by that flower's measurements on all variables.
[Figure: parallel coordinate plot with axes Sepal.Length, Sepal.Width, Petal.Length, Petal.Width; species Setosa, Versicolor, Virginica]

Parallel Coordinates
By following an Eulerian Tour on the complete graph, every pair of variables will appear side by side.
[Figure: parallel coordinate plot with axes in the order Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, Sepal.Length.1, Petal.Length.1, Sepal.Width.1, Petal.Width.1; species Setosa, Versicolor, Virginica]
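
The claim can be checked directly on the axis order shown on this slide. The sketch below assumes that the ".1" suffix only marks a repeated axis, so it is dropped before comparing names.

```python
from itertools import combinations

# Axis order from the slide, with the ".1" repeat suffixes removed (assumption).
axis_order = ["Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width",
              "Sepal.Length", "Petal.Length", "Sepal.Width", "Petal.Width"]

variables = sorted(set(axis_order))
# Pairs of variables that sit on neighbouring axes.
neighbours = {frozenset(p) for p in zip(axis_order, axis_order[1:])}
# Pairs that never appear side by side (expected to be empty).
missing = [p for p in combinations(variables, 2)
           if frozenset(p) not in neighbours]
```

All 6 pairs are covered by the 7 adjacent positions, so one pair (Sepal.Width with Petal.Length) appears twice. A repeat is unavoidable here because every node of the complete graph on 4 variables has odd degree.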

Parallel Coordinates
Greedy Eulerian: focus on most positively correlated
Sleep1 data (alr3 R package): 10 variables on 62 mammals
- Br = log brain weight
- Bd = log body weight
- SW = Slow wave non-dreaming sleep
- PS = Paradoxical dreaming sleep
- TS = Total sleep
- D = Danger index
- P = Predation index
- L = Max life span
- SE = Sleep exposure index
- GP = Gestation time
[Figure: parallel coordinate plot with axes in the order SW TS PS SW P D SE Bd Br GP Bd L Br SE P SE GP L GP D Bd P G]

Scagnostics
- Cognostics: computer aided diagnostics
- Scagnostics: scatterplot cognostics
- Wilkinson et al (2006), from an idea proposed by Tukey & Tukey (1985)

Parallel Coordinates
Eulerian on scagnostics: Outlying
- Focus on features of interest (Wilkinson et al 2006, from Tukey & Tukey 1985: cognostics, scagnostics)
- Eulerian may pick up a subset of the variables
[Figure: parallel coordinate plot with axes in the order GP L Bd SW L Br GP Bd Br SW]

Parallel Coordinates
Best Hamiltonian on scagnostics: Striated Sparse
- Focus on features of interest (Wilkinson et al 2006, from Tukey & Tukey 1985: cognostics, scagnostics)
- Hamiltonian ensures all variables appear
[Figure: parallel coordinate plot with axes in the order Bd D PS SE SW GP P L Br TS]

Parallel Coordinates
Same data, different measures
[Figure: the Eulerian on Outlying and the best Hamiltonian on Striated Sparse, shown together]

Orthogonal axes
[Figure: orthogonal axes X and Y, then X, Y and Z, then axes labelled with the iris variables Petal Width, Sepal Length, Petal Length, Sepal Width]
- What about more than 3 dimensions?

Example: Italian olive oils
Different regions of Italy:
- NORTH (Umbria, East-Liguria, West-Liguria)
- SOUTH (Calabria, Sicily, North-Apulia, South-Apulia)
- SARDINIA (Inland, Coast)
[Figure: map of Italy marking Liguria, Umbria, Sardinia, Apulia, Calabria and Sicily, with a tree splitting ALL into NORTH, SOUTH and SARDINIA and then into the regions above]

Example: Italian olive oils
Measurements:
- n = 572 different olive samples
- concentrations of p = 8 fatty acids: arachidic, eicosenoic, linoleic (l1), linolenic (l2), oleic, palmitic (p1), palmitoleic (p2), and stearic

Connecting low-d spaces
Navigation Graphs
- node = variable pair
- edges connect nodes that share a variable
- could display scatterplot at each node
- edges are 3D transitions
- high dimensional space is explored by moving from one 2D space to another through 3D (or 4D) transitions
- track/map exploration
- suggest routes
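
The node and edge rules above can be sketched in a few lines. This is an illustration under stated assumptions, not the RnavGraph implementation: the function name `transition_graph_3d` is hypothetical, and the variable names are the eight olive oil fatty acids.

```python
from itertools import combinations

def transition_graph_3d(variables):
    """Nodes are variable pairs; edges join pairs that share one variable."""
    nodes = list(combinations(variables, 2))
    edges = [(a, b) for a, b in combinations(nodes, 2)
             if len(set(a) & set(b)) == 1]
    return nodes, edges

acids = ["arachidic", "eicosenoic", "linoleic", "linolenic",
         "oleic", "palmitic", "palmitoleic", "stearic"]
nodes, edges = transition_graph_3d(acids)
# Each edge moves between two scatterplots inside one 3-variable space.
spaces = {frozenset(a) | frozenset(b) for a, b in edges}
```

For p = 8 this gives 28 nodes and 56 distinct 3-variable spaces. Under this pairwise rule each of those spaces is spanned by three edges, so the graph has 168 edges.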

Navigation Graphs
RnavGraph: R implementation

Example: Italian olive oils
Interactive 3d transition graph and interactive scatterplot
- Move back and forth by hand
- Scatterplot control panel offers interactive features
- Brushing
- Deactivate selected points
- Return to starting position
- Zoom and relocate. Note World View changes
- At least 3 groups; colour two of them
- Could also select a whole path to traverse
- Paths can be saved, annotated, viewed, and walked again
- Appears to be a third horizontal group... zoom etc. And that outlier
- Colour group orange, outlier red
- Can switch to glyphs
- Focus on a region
- Move to compare shapes
- Enlarge to compare shapes
- Identify possible orange?
- Can actually check here

Example: Italian olive oils
Continue in this way:
- bring back deactivated points
- identify groups, reassign points
- note natural hierarchical clustering
- save grouping by colour in R

Challenge: Large p => large graphs
- p: overall dimensionality (olive, p = 8)
- (p choose 2): potential 2d nodes (28)
- (p choose 3): potential 3d edges (56)

p             5    10    20     50
(p choose 2)  10   45    190    1225
(p choose 3)  10   120   1140   19600

Need to start with small, but interesting, graphs
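
The counts on this slide are binomial coefficients and can be reproduced with the standard library. The dictionary name `counts` is only illustrative.

```python
from math import comb

# For each dimensionality p: (number of variable pairs, number of triples).
counts = {p: (comb(p, 2), comb(p, 3)) for p in (5, 8, 10, 20, 50)}

for p, (pairs, triples) in counts.items():
    print(f"p={p:>2}  pairs={pairs:>4}  triples={triples:>5}")
```

The pair count grows quadratically in p and the triple count grows cubically, which is why the full graph quickly becomes too large to browse.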

Interesting node pairs
- For each scagnostic, calculate the value for every pair.
- View only those pairs with high scores (e.g. top fraction of scores).
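
A minimal sketch of the selection step, assuming the scagnostic values have already been computed by some other tool. The scores below are made up, and the function name `top_pairs` is hypothetical.

```python
def top_pairs(scores, fraction):
    """Keep the pairs whose score ranks in the top `fraction` of all pairs."""
    keep = max(1, round(fraction * len(scores)))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:keep]

# Hypothetical values of one scagnostic on every pair of 4 variables.
scores = {
    ("a", "b"): 0.91, ("a", "c"): 0.12, ("a", "d"): 0.40,
    ("b", "c"): 0.75, ("b", "d"): 0.05, ("c", "d"): 0.33,
}
interesting = top_pairs(scores, fraction=1 / 3)
```

The pairs that survive become the nodes of a smaller navigation graph, with edges again joining pairs that share a variable.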

Scagnostics: Italian olive oils
Groups coloured by regions
- 3D Monotonic
- Switch to 3D Striated
- 3D Non-Convex

Challenge: Large p => large graphs
- scagnostics work well, but when p is very large, so is (p choose 2)
- dimensionality reduction methods could be employed

Example: images
Frey: 1,965 movie frames
- each frame is a 28 x 20 array, so 560 dimensions
- explore via low dimensional spaces
- using LLE (locally linear embedding) with k = 12 neighbours, reduce to 5 dimensions
- interactive low-d view
- connect low-d views

Example: images
- Interactive panel
- Switch to images
- Back to dots
- Lots of structure... explored in 5d

Can link across NavGraph sessions
Here: LLE and ISOMAP embeddings

Multiple visualizations
Kernel density contours and 3D surface

Summary
Graph theory structure:
- organizes order of axes (e.g. radial, parallel, orthog.)
- use interesting orders (correlations, scagnostics, etc.)
- organizes ANY display order (e.g. multiple comparisons)
- graphs become maps to navigate high-dimensional space
- graph walks are low dimensional trajectories
- can focus on interesting trajectories
- capitalizes on visual ability
- graphs easily constructed; graph theory/algorithms exist

Summary
Try it yourself. R packages:
- PairViz (Hurley & Oldford)
- RnavGraph (Waddell & Oldford)

Thank you
Questions?

Papers
Hurley & Oldford:
- Graphs as navigational infrastructure for high dimensional data spaces (Comp Stats 2011)
- Pairwise display of high dimensional information via Eulerian tours and Hamiltonian decompositions (JCGS, 2010)
- Eulerian tour algorithms for data visualization and the PairViz package (Comp Stats 2011)
- PairViz R package available on CRAN
Oldford & Waddell:
- Visual clustering of high-dimensional data by navigating low-dimensional spaces (ISI Dublin, 2011)
- RnavGraph: A visualization tool for navigating through high dimensional data (ISI Dublin, 2011)
- RnavGraph R package available on CRAN
Oldford & Zhou:
- Tree Ensemble Reduced Clustering via a Graph Algebraic Framework (submitted)

Aside: 4d transitions
3d and 4d transition graphs
- a 3d transition graph; its complement is a 4d transition graph
- 4d navgraph: observe that the 4d transition is NOT a rigid rotation
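
The complement relationship can be sketched as follows, assuming nodes are variable pairs as in the navigation graphs earlier in the talk. Two nodes that share one variable are joined in the 3d transition graph; two nodes that share none involve four variables and are joined in the 4d transition graph. The five variable names are placeholders.

```python
from itertools import combinations

variables = list("ABCDE")                    # placeholder variable names
nodes = list(combinations(variables, 2))     # one node per variable pair
node_pairs = list(combinations(nodes, 2))

# 3d transitions: the two scatterplots share exactly one variable.
edges_3d = [(a, b) for a, b in node_pairs if len(set(a) & set(b)) == 1]
# 4d transitions: the two scatterplots share no variable.
edges_4d = [(a, b) for a, b in node_pairs if not set(a) & set(b)]
```

Every pair of distinct nodes falls into exactly one of the two lists, which is what makes each graph the complement of the other.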