Part 1: Geometric Techniques
Scatterplots, Parallel Coordinates, ...

Geometric Techniques
Basic Idea: Visualization of Geometric Transformations and Projections of the Data
  Scatterplots [Cleveland 1993]
  Parallel Coordinates [Inselberg 1985/1990]
  Prosection Views [Spence 95]
  Landscape [Wise, et al. 1995]
  ThemeRiver [Havre, et al 2000]
  Hyperslice [van Wijk, et al 1993]
[Keim, 2001]

Basic Idea: Scatterplots
Visualizes a Relation (Correlation) between two Variables X and Y, e.g., weight and height
Individual Data Points are Represented
  in 2D, where the axes represent the variables: X on the horizontal axis, Y on the vertical axis
  in 3D
  in...

Example: Scatter Plot
House data: Price and Number of Bedrooms
User can identify global trends, local trade-offs, and outliers.
[Figure: scatterplot with axes Number of Bedrooms (1 to 6) and Price (50K to 300K)]
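The correlation a scatterplot makes visible can also be summarized numerically. The following is a minimal sketch, assuming the Pearson coefficient as the measure of linear relation; the function name `pearson_r` and the height/weight values are illustrative inventions and are not the house data from the slide.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)


# Made-up illustrative values (assumption, not taken from the slides).
height_cm = [150, 160, 170, 180, 190]
weight_kg = [52, 58, 69, 77, 90]
r = pearson_r(height_cm, weight_kg)
print(f"r = {r:.3f}")  # close to +1: strong positive linear relation
```

A value near +1 or -1 corresponds to the "strong linear" panels, a value near 0 to the "no relationship" panel. The coefficient only captures linear relations, which is one reason to look at the plot itself.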
Examples: Scatterplots (1/3)
  No relationship
  Strong linear (positive correlation)

Examples: Scatterplots (2/3)
  Quadratic relationship
  Exponential relationship
  Exact linear (positive correlation)
  Strong linear (negative correlation)
  Sinusoidal relationship (damped)
  Outlier

Examples: Scatterplots (3/3)
  Variation of Y doesn't depend on X (homoscedastic)
  Variation of Y does depend on X (heteroscedastic)

Scatterplot - Conditioning Plot
One limitation of the scatterplot matrix is that it cannot show interaction effects with another variable
Purpose: Check pairwise relationship between two variables conditional on a third variable
Example: torque versus time, conditioned on temp
3D Data in the Box
3D Data Set of 50 Observations in the Box

Scatterplot Matrix of all pairwise Scatterplots
Example: Cars [Becker & Cleveland, 1996]
Example: Cars - Scatterplots
Example 2: Cars - Scatterplot
  m x m scatterplots
  the diagonal plots each variable against itself: m^2 - m plots remain
  plots left and right of the diagonal are the same: (m^2 - m)/2 distinct plots
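The count of distinct panels in a scatterplot matrix can be checked by enumerating the unordered variable pairs. This is a small sketch; the variable names are hypothetical stand-ins for the cars data, which the slides do not list.

```python
from itertools import combinations


def distinct_scatterplots(variables):
    """All unordered pairs of variables, one per distinct scatterplot panel."""
    return list(combinations(variables, 2))


# Hypothetical variable names (assumption; the slides do not name them).
cars = ["mpg", "cylinders", "horsepower", "weight"]
pairs = distinct_scatterplots(cars)

m = len(cars)
# m x m panels, minus the m diagonal panels, halved because (x, y) mirrors (y, x).
assert len(pairs) == (m * m - m) // 2
for x, y in pairs:
    print(f"{x} vs {y}")
```

For m = 4 this yields 6 panels; the count grows quadratically with m, so the individual panels shrink quickly as variables are added.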
3D Scatterplot plus Color
Scatterplot & SDOF (1/2)
Scatterplot & SDOF (2/2)

Basic Idea: Parallel Coordinates
  Assigns one Vertical Axis to each Variable
  Evenly spaces these axes horizontally
  Traditional Cartesian Coordinates: All axes are mutually perpendicular

Layout: k Parallel Axes
  Axes scaled to [min, max]: Scaling individually for each variable

Polygonal Line [Inselberg and Dimsdale, 1990]
  Every data item corresponds to a polygonal line that intersects each of the axes at the point corresponding to the value for the attribute
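The layout described above can be sketched as a mapping from data rows to polyline vertices. This is a minimal illustration, not the method of any particular tool: the function name `to_polylines`, the unit plot width, and the sample rows are assumptions.

```python
def to_polylines(rows, width=1.0):
    """Map each data row to the vertices of its parallel-coordinates polyline.

    Axis j is a vertical line at an evenly spaced horizontal position;
    each variable is scaled individually from [min, max] to [0, 1].
    """
    if not rows:
        return []
    k = len(rows[0])
    mins = [min(r[j] for r in rows) for j in range(k)]
    maxs = [max(r[j] for r in rows) for j in range(k)]
    # Evenly spaced horizontal positions of the k axes.
    xs = [width * j / (k - 1) if k > 1 else 0.0 for j in range(k)]
    polylines = []
    for r in rows:
        verts = []
        for j in range(k):
            span = maxs[j] - mins[j]
            # A constant variable has no range; place it mid-axis.
            y = (r[j] - mins[j]) / span if span else 0.5
            verts.append((xs[j], y))
        polylines.append(verts)
    return polylines


# Made-up rows with three variables (assumption for illustration).
rows = [(1.0, 200.0, 5.0), (2.0, 100.0, 5.0), (3.0, 150.0, 5.0)]
lines = to_polylines(rows)
for verts in lines:
    print(verts)
```

Each returned vertex list can be handed to any line-drawing routine. Scaling every axis to its own [min, max] keeps variables with very different units comparable.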
Parallel Coordinates [Inselberg and Dimsdale, 1990]

Parallel Coordinates Basic
6-dim. Point with coordinates (-5, 3, 4, -2, 0, 1)^T

Visualization of Correlation
Discover the Correlation
  one line: point in PC
  one circle:
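The "one line: point in PC" duality can be checked numerically. Assuming two adjacent axes at horizontal positions 0 and d, a Cartesian point (a, b) becomes the segment from (0, a) to (d, b); for points on a line y = m*x + c with m != 1, all such segments pass through the single point (d/(1-m), c/(1-m)). The function name and the sample line below are illustrative assumptions.

```python
def pc_segment_intersection(p, q, d=1.0):
    """Intersection of the parallel-coordinates segments of two 2D points.

    The segment for a point (a, b) joins (0, a) on the first axis to
    (d, b) on the second axis. Returns None if the segments are parallel.
    """
    (a1, b1), (a2, b2) = p, q
    slope1 = (b1 - a1) / d
    slope2 = (b2 - a2) / d
    if slope1 == slope2:
        return None  # happens for points on a Cartesian line with slope 1
    # Solve a1 + slope1 * t == a2 + slope2 * t for t.
    t = (a2 - a1) / (slope1 - slope2)
    return (t, a1 + slope1 * t)


# Three points on the Cartesian line y = -x + 2 (m = -1, c = 2).
points = [(0, 2), (1, 1), (2, 0)]
first = pc_segment_intersection(points[0], points[1])
second = pc_segment_intersection(points[0], points[2])
print(first, second)  # both segments pairs meet at d/(1-m) = 0.5, c/(1-m) = 1.0
```

With a negative slope the common point lies between the two axes, which is why negatively correlated variables show up as a crossing pattern, while positive slopes push the common point outside the axes.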
Problems with Parallel Coordinates
Color in Parallel Coordinates
Polygons Need Too Much Space
Hierarchical Parallel Coordinates
Example: Cars - Parallel Coordinates
Parallel Coordinates Demo
Programs: Parallel Coordinates Visualization Applet
http://csgrad.cs.vt.edu/~agoel/parallel_coordinates/

Benefits and Limitations
Benefits
  Represents data with more than three dimensions
  Opportunities for human pattern recognition
  Flexibility: each coordinate can be individually scaled
  Zooming in or out: effectively brushing out or eliminating portions of the data set
Limitations
  As the number of dimensions increases, the axes come closer to each other, making it more difficult to perceive patterns

Prosection Views [Spence, et al. 1995]
  Similar to Scatterplots
  m-dim Data Sets
  Operators: Projections, Selections
  Color Coding: customer's requirements (different limits)
    yes: red or green
    no: black, dark gray, light gray, and white

The Prosection Matrix [© 2001 Robert Spence]
Design of a chair seat
  A design is represented by a point in Area-Thickness space
  Various performance limits restrict the range of possible designs
[Figure: Area-Thickness plane with regions labeled too flexible, too large, too heavy, too uncomfortable]
The Prosection Matrix [© 2001 Robert Spence]
Problem: we don't know where the green area is located
Moreover, there are typically many parameters (not 2) and many performance limits (not 2)
Solution? Either iterative search (human, automated or mixed) or generation of data to visualise.
[Figure: Area-Thickness plane]

Color Coding: Parameter limits vs Performance limits
[Figure: Par 1 - Par 2 plane showing Lower Limit S1, Upper Limit S1, Lower Limit S2, Upper Limit S2, and the Tolerance Region]
  Satisfies all limits
  Fails one performance limit, but manufactured
  Fails one or more performance limits, but manufactured
  Fails one or more performance limits, not manufactured
  Satisfies all the performance limits, but outside one parameter limit = not manufactured

The Prosection Matrix [© 2001 Robert Spence]
A Prosection: Projection of a section
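The color-coding categories above can be read as a two-way test: is the design inside the parameter limits, and does it meet the performance limits? The sketch below assumes that reading, and assumes that "manufactured" means "inside all parameter limits (the tolerance region)". The function name, the limit values, and the sample design are hypothetical, and no colors are assigned because the slides do not give the mapping unambiguously.

```python
def classify_design(params, param_limits, performances, perf_limits):
    """Classify one design by parameter limits and performance limits.

    Each *_limits dict maps a name to a (lower, upper) pair.
    Assumption: a design is manufactured only if every parameter
    lies inside its limits.
    """
    manufactured = all(
        lo <= params[name] <= hi for name, (lo, hi) in param_limits.items()
    )
    failed = [
        name
        for name, (lo, hi) in perf_limits.items()
        if not (lo <= performances[name] <= hi)
    ]
    if manufactured and not failed:
        return "satisfies all limits"
    if manufactured:
        return f"fails {len(failed)} performance limit(s), but manufactured"
    if failed:
        return f"fails {len(failed)} performance limit(s), not manufactured"
    return "satisfies all performance limits, not manufactured"


# Hypothetical limits and values (assumptions for illustration only).
param_limits = {"Par 1": (1.0, 2.0), "Par 2": (10.0, 20.0)}
perf_limits = {"S1": (0.0, 5.0), "S2": (3.0, 8.0)}

label = classify_design(
    {"Par 1": 1.5, "Par 2": 25.0}, param_limits,
    {"S1": 4.0, "S2": 6.0}, perf_limits,
)
print(label)
```

Applying such a classifier to every sampled design, then plotting the results over a pair of parameters, gives one panel of the matrix.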
The Prosection Matrix [© 2001 Robert Spence]
Prosection Matrix for the lamp design

[© 2001 Robert Spence]
A difficult cognitive problem is eased by a simple perceptual task
[Figure: pipeline with labels Parameters, Model, Performances, Raw Data, Selection, Encoding, Presentation, Interaction, User, Visualization tool designer, Customer's Performance Requirements, Tolerances on parameter values]
The visualization tool (e.g., Influence Explorer) designer must take into account the need of the user to specify the model, the exploratory range of parameter values and the customer's performance specifications, as well as the selection, encoding and presentation of data.

Landscape [Wise, et al. 1995]
Data needs to be transformed into a (possibly artificial) 2D spatial representation which preserves the characteristics of the data

ThemeRiver: Visualizing Theme Changes over Time
Susan Havre, Beth Hetzler, and Lucy Nowell
Battelle Pacific Northwest Division, Washington, USA
Applications I: Document Visualization
Excursus: IEEE Symposium on Information Visualization - InfoVis 2000

InfoVis 2000: Facts'n'Figures
  October 9-10, 2000, Salt Lake City, Utah, USA
  Annual Conference/Symposium: 6th
  Parent Conference: IEEE Visualization 2000 (Annual Conference, 11th)
  Proceedings: CD-ROM, IEEE Computer Society, Los Alamitos, CA
Types of Papers at InfoVis
  Keynote Address: Jock D. Mackinlay, University of Aarhus, Denmark
    Presentation, Visualization, What's Next
    Coining the term InfoVis
    Visual Data Mining
    Readings in InfoVis
  20 Papers - 5 Sessions
  6 Papers - Late Breaking Hot Topics
  Capstone Address: Nahum Gershon, MITRE
    Visual Storytelling - Where Technology and Culture Meet

Session Topics
  Visual Querying and Data Exploration
  Graphs and Hierarchies
  Taxonomies, Frameworks, and Methodology
  Applications I: Document Visualization, Collaborative Visualization, Techniques
  Applications II: Algorithm Visualization, 3D Navigation

ThemeRiver: Idea
Visualizing Theme Changes over Time
  A Large Collection of Documents
  Themes
  Changes
  River Metaphor - helps users to identify time-related patterns, trends, and relationships across a large collection of documents
  A Prototype System
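The river metaphor can be sketched as a stacked layout centered on a horizontal axis. The sketch assumes, following the published ThemeRiver description, that the vertical width of each current encodes the strength of its theme at that time step. It omits the curve smoothing a real system would apply, and the function name and the sample strengths are illustrative assumptions.

```python
def river_layout(strengths):
    """Compute vertical bands for each theme current at each time step.

    strengths[t][i] is the strength of theme i at time step t.
    Returns bands[t][i] = (lower, upper), stacked so that the whole
    river is centered on the horizontal axis y = 0.
    """
    bands = []
    for column in strengths:
        total = sum(column)
        y = -total / 2.0  # start half the river width below the axis
        column_bands = []
        for s in column:
            column_bands.append((y, y + s))
            y += s  # next current sits directly on top of this one
        bands.append(column_bands)
    return bands


# Made-up strengths for two themes over two time steps (assumption).
strengths = [[2, 4], [6, 2]]
bands = river_layout(strengths)
for t, column_bands in enumerate(bands):
    print(t, column_bands)
```

The total river width at a time step is the sum of all theme strengths, so a widening river signals more material overall, while a widening single current signals a theme gaining weight.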
ThemeRiver™
Histograms
Data set: collection of speeches, interviews, articles, and other text associated with Fidel Castro

User Interactions
  Display Topic and Event Labels
  Display Time and Event Grid Lines
  Display the Raw Data Points
  Choose Among Drawing Algorithms for the Currents and River
  Pan and Zoom
    Other Time Periods or Parts of the River
    More Detail or Broader Context

Usability Evaluation
2 Users & Questions:
  Do users understand the metaphor?
  Can they identify themes that are more often discussed?
  Does the visualization help them raise new questions about the data?
  Do they interpret details of the visualization in ways we had not expected?
  How does their interpretation of the visualization differ from that of a histogram showing the same data?
Evaluation Results
ThemeRiver: Easy to Understand, Useful
+ / -
  + River Metaphor
  + Abstraction to the Whole Collection
  + Identifying Macro Trends
  - Identifying Minor Trends

Improvements
  Features of the Histogram
    Seeing Numeric Values (on demand)
    Total Number of Documents
    Features to Access the Documents
  User-Defined Ordering
    Reorder the Theme Currents
    Ordering by Correlation
  Parallel Rivers

Impr.: Features of the Histogram
Data set: 1990 Associated Press (AP) newswire data from the TREC5 distribution disks, a set of over 100,000 documents

Impr.: Parallel Rivers
Data set: compare 1990 Associated Press (AP) data with data from Washington, D.C. and New York from the same time period
Color Family
Tracking related themes is simplified by assigning them to the same color family. This ensures related themes appear together and are identifiable as a group.

Conclusion
  River Metaphors
  Perception Principles [Ware 2000]
  Improvements Needed
    Event Time Line - Automatically Selecting and Ordering of Theme Currents
    More Information/Data on Demand

Hyperslices [van Wijk, et al 1993]
  (m^2 - m)/2 2D slices
  Operator: Selection
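The slice count can be illustrated with a small sampler. This sketch assumes the usual reading of hyperslices: a scalar function of m variables is sampled on 2D grids through a chosen focal point, one grid per pair of axes, with all other coordinates held at the focal point. The function name, grid size, and test function are illustrative assumptions.

```python
from itertools import combinations


def hyperslices(f, focus, half_width=1.0, steps=5):
    """Sample f on one 2D grid per axis pair, all passing through `focus`.

    Returns a dict mapping (i, j) to a steps x steps grid of values of f,
    where only coordinates i and j vary. Requires steps >= 2.
    """
    m = len(focus)
    offsets = [
        -half_width + 2 * half_width * s / (steps - 1) for s in range(steps)
    ]
    slices = {}
    for i, j in combinations(range(m), 2):
        grid = []
        for di in offsets:
            row = []
            for dj in offsets:
                x = list(focus)  # all other coordinates stay at the focus
                x[i] += di
                x[j] += dj
                row.append(f(x))
            grid.append(row)
        slices[(i, j)] = grid
    return slices


def sum_of_squares(x):
    """Simple test function (assumption for illustration)."""
    return sum(v * v for v in x)


slices = hyperslices(sum_of_squares, focus=(0.0, 0.0, 0.0))
m = 3
assert len(slices) == (m * m - m) // 2  # 3 distinct 2D slices for m = 3
print(sorted(slices))
```

Moving the focal point changes every slice at once, which is the selection operator in this picture.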