Lecture 3 - Data Visualization. Module 2


A. Data is the raw material used to create information. B. Data collection gathers information measurements. C. Knowledge is measured by the number of data collected.

A. It employs a lot of people. B. It produces data visuals. C. It gathers data measurements for further analysis.

A. True B. False

A. True B. False

A. True B. False

A. Satellite images B. Monitoring networks C. Sampling

A. Data from outer space B. Data with coordinates C. Data referring to environmental resources

http://www.wordle.net/ 5 Words about Climate Change by SCI103 Community 2015/2016

"Having the data is not enough. I have to show it in ways people both enjoy and understand." (Prof. Hans Rosling) Raw data does not provide much insight unless it is processed and presented. The way data is presented has a huge impact on how meaningfully it can be analysed and interpreted. The human brain retains information contained in visuals better than information conveyed by written or spoken words. Visual tools are crucial for effective data communication.

Some of the worst data visualizations are the ones you have to stare at for several minutes before you even comprehend what they're trying to say. Turn data into something more engaging. There should be a swift "aha!" moment within seconds after someone sees your data. https://youtu.be/adszjzb-ax8

Goal of data visualization: communicate information clearly and effectively using graphics. Graphics: visual images presented on a surface such as paper or a computer screen. Data visualization: effective techniques used to communicate data or information by encoding it as visual objects that can synthesize large amounts of data. Visuals need to combine functionality and design to convey information intuitively.

HTTPS://YOUTU.BE/MKEXX7SDXAI Charts and tables: synthesize and display data (qualitative and quantitative) Charts Tables Source: Goulburn Murray Water

Conceptual diagrams and infographics: synthesis, visualisation and context. Conceptual diagram. Infographics. Source: Bureau of Meteorology http://images.wookmark.com/96600_post1_new.jpg

Satellite images and maps: geographic context. Satellite image. Map. Source: Bureau of Meteorology

Video and photographs. Photographs provide unique information. Video clips can help to tell a story by capturing motion, perspective and sound. Photograph source: L Huzzey

Infographics: graphic visual representation of complex data/information, quickly and clearly. They include traditional data visuals such as charts and tables. They need the right combination of design and content to be effective. https://doms.csu.edu.au/csu/thumbs/2aa657f5-7f7a-45a6-85b4-3610fd378eba/1/ad6a21bc-9c86-4640-b42a-dd470fa45d1a

http://images.sixrevisions.com/2009/05/09-03_coffee_drinks.jpg http://graphs.net/wp-content/uploads/2012/11/worlds-10-popular-books-sold-in-last-50-years.jpg

"Excellence in statistical graphics consists of complex ideas communicated with clarity, precision and efficiency" (Prof. Edward Tufte) Graphical excellence is especially relevant when data is used to perform analytical tasks such as making comparisons or determining causality. Bad visuals might distort the data, making it harder to understand or compare; ineffective and poorly presented information can lead to misinformation.

Understand the data you are trying to visualize, including its size and cardinality (the uniqueness of data values in a column). Determine what you are trying to visualize and what kind of information you want to communicate. Know your audience and understand how it processes visual information. Use a visual that conveys the information in the best and simplest form for your audience.
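Cardinality can be checked before choosing a chart type. The following is a minimal sketch in Python, assuming the column is available as a plain list; the function name `cardinality` and the site labels are hypothetical and only serve as an illustration.

```python
def cardinality(values):
    """Return the number of distinct values in a column."""
    return len(set(values))


# Hypothetical example column: monitoring-site labels recorded for six samples.
sites = ["A", "B", "A", "C", "B", "A"]

n_distinct = cardinality(sites)
print(f"{n_distinct} distinct values out of {len(sites)} rows")
```

A column with only a few distinct values is a candidate for a bar chart of categories, whereas a column where almost every value is unique usually needs to be grouped into intervals first.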

Guideline 1: create the simplest graph that conveys the information you want to convey
Guideline 2: consider the type of encoding object (points, lines, and bars) and attribute (point position, line length, color) used to create a plot
Guideline 3: focus on visualizing patterns or on visualizing details, depending on the purpose of the plot
Guideline 4: select meaningful axis ranges
Guideline 5: data transformations and carefully chosen graph aspect ratios can be used to emphasize rates of change for time-series data
Guideline 6: plot overlapping points in a way that density differences become apparent in scatter plots
Guideline 7: use lines when connecting sequential data in time-series plots
Guideline 8: aggregate larger datasets in meaningful ways
Guideline 9: keep axis ranges as similar as possible to compare variables
Guideline 10: select an appropriate color scheme based on the type of data

Charts summarize and describe vast amounts of information in a compact, efficient and eye-catching way (Ducklan & Martin, 2002). They are useful for data analysis, visualization and communication: they present raw data, present results of fairly complex analyses, summarise information, expose unanticipated characteristics of data, and suggest hypotheses which may be further investigated.

Bar charts Histograms Pie charts Graphs http://www.statmethods.net/graphs/images/pie2.jpg

Bar charts display categorical data, with each category independent of the others. They are used to compare a variable across a number of different groups, showing the size of each group (the length of each bar is proportional to the value it represents). There are two main types of bar chart: horizontal and vertical. Horizontal bars usually represent a single period of time, whereas vertical bars (columns) may represent similar items at different times. Bar charts can also be displayed as subdivided bar charts, where different variables are represented in the same column. Y-axis: continuous data (count, value or percent). X-axis: discrete data. There are gaps between bars. http://www.statmethods.net/graphs/images/barplot3.jpg
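The rule that bar length is proportional to the value can be sketched without any plotting library. The example below is an illustration only: the function `text_bar_chart`, the `max_width` parameter and the site values are assumptions, not part of the lecture.

```python
def text_bar_chart(data, max_width=40):
    """Return one text line per category, with bar length proportional to value."""
    largest = max(data.values())
    lines = []
    for label, value in data.items():
        # Scale every bar relative to the largest value in the dataset.
        bar = "#" * round(max_width * value / largest)
        lines.append(f"{label:<10} {bar} {value}")
    return lines


# Hypothetical counts for three independent categories.
samples_per_site = {"Site A": 10, "Site B": 20, "Site C": 40}

for line in text_bar_chart(samples_per_site):
    print(line)
```

Because each bar starts from zero and is scaled by the same factor, a bar that is twice as long represents a value that is twice as large.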

A histogram is a type of bar chart, since data is displayed using bars/columns, but the bars are placed next to each other. Histograms are used to display frequency values, that is, the number of values that fall within the same category or interval (represented on the x-axis). In statistics, histograms are a graphical representation of the distribution of data. To construct a histogram: divide the entire range of values into a series of categories; count how many values fall into each category; draw a rectangle with height proportional to the count and width equal to the category size.
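The construction steps above can be sketched in Python as follows. This is a minimal illustration, assuming equal-width categories and assuming the largest value is counted in the last category; the function name `histogram_counts` and the sample values are hypothetical.

```python
def histogram_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("all values are identical, so the range cannot be divided")
    width = (hi - lo) / n_bins  # step 1: divide the range into categories
    counts = [0] * n_bins
    for v in values:
        # step 2: find the category of each value; the maximum goes in the last one
        index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


# Hypothetical measurements.
values = [1, 2, 2, 3, 5, 6, 7, 8, 9, 10]
edges, counts = histogram_counts(values, n_bins=3)

# step 3: draw one bar per category, with height proportional to the count
for left, right, n in zip(edges, edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f} | {'#' * n}")
```

The counts always add up to the number of values, which is a quick check that no value was dropped or counted twice.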

Y-axis: continuous data (frequency values) X-axis: interval data

Pie charts are mainly used to compare proportions. To construct a pie chart, calculate the relative proportion of data in each category; the divisions of a pie (proportions or segments) add to a whole (100%). Labels are nominal or ordinal data. Proportions are ratio data. It is generally recommended to avoid pie charts for data analysis and visualization. Humans process differences in line length more easily than differences in surface area, so it is more effective to use a bar chart (which takes advantage of line length to show comparison) than a pie chart (which uses surface area to show comparison). Pie charts usually encode only a handful of numbers, and a table is usually a much more efficient way to present such information.

Graphs: a line showing the relationship between two or more variables (line graphs and scatterplots). Advantages: display of high information density, sometimes with no loss of data; rapid assimilation of the overall result; clear display of complex relationships among multivariate data. Graph interpretation: height of the line (or series of lines); patterns (seasonal pattern, trend or a combination of both).
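The proportion calculation for a pie chart can be sketched as below. The function name `proportions` and the land-use counts are hypothetical and are used only to illustrate that the segments add to 100%.

```python
def proportions(counts):
    """Convert raw counts per category into percentages of the whole."""
    total = sum(counts.values())
    return {category: 100 * n / total for category, n in counts.items()}


# Hypothetical counts per category.
land_use = {"forest": 50, "pasture": 30, "urban": 20}
shares = proportions(land_use)

for category, share in shares.items():
    print(f"{category:<8} {share:5.1f}%")
print(f"total    {sum(shares.values()):5.1f}%")
```

The same percentages could be drawn as bar lengths instead of pie segments, which follows the recommendation above.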

Line graph: shows a trend. Y-axis: discrete and continuous data; the scale should start at zero. X-axis: discrete and continuous data. http://www.statmethods.net/graphs/images/linechart1.png
Scatterplot: shows the relationship between two variables by plotting their (x,y) positions. Y-axis: continuous data. X-axis: continuous data.

Tables: columns and rows filled with data, used for summarising results and comparing data. Tables vs charts: tables are usually a better option than a chart when only a few data points need to be displayed. If exact numeric values are required, a table is best, since it can be hard to read exact values from a chart axis. In a thesis or research report, the detail and precision of tables may be more important, since they are a repository of information.

(Swires-Hennessy, 2014)

Swires-Hennessy, E. (2014). How to Communicate your Message Effectively.
Hay, I. (2012). Communicating in geography and the environmental sciences.
Thomas, J. E., Saxby, T. A., Jones, A. B., Carruthers, T. J. B., Abal, E. G. & Dennison, W. C. (2006). Communicating science effectively: a practical handbook for integrating visual elements.
Kelly, D., Jasperse, J. & Westbrooke, I. (2005). Designing science graphs for data analysis and presentation: the bad, the good and the better.
Schwabish, J. A. (2014). An Economist's Guide to Visualizing Data.

Show the data
Reveal content
Avoid distorting data
Present many numbers in a small space
Encourage comparison of datasets
Reveal data at several levels of detail
Serve a clear purpose
Be integrated with verbal and statistical descriptions of data
Consistent style and format

Concise and comprehensible (present only the information that is relevant and required to support the content: Who? What? How?)
Independent (someone who has not read the document associated with the graphic should be able to look at it and understand what it means)
Referenced

http://abacus.bates.edu/~ganderso/biology/resources/writing/graphparts2003.gif http://abacus.bates.edu/~ganderso/biology/resources/writing/population_variation_table_gif.gif

Chart axis: Axis labels should be legible, easy to find and easily associated with the axis/object depicted. Labels on the graph should be clearly offset from the data or outside the axes. When appropriate, the units of measurement should be displayed. Start the y-axis at zero when graphing numbers. http://www.owlnet.rice.edu/~labgroup/pdf/excelplot.pdf

Chart axis: The range of the axis scale should allow the full range of data to be included. Data points should be properly spaced. Tick marks should be placed at sufficiently frequent intervals for a reader to work out accurately the value of each data point. Time should be shown on the x-axis, progressing from left to right, and time intervals should be equal. http://www.owlnet.rice.edu/~labgroup/pdf/excelplot.pdf

Chart type: Two or more datasets must be easily distinguished from one another. Use no more than 4 simultaneous symbols, values or lines, and make each line or symbol sufficiently different from the others. Overlapping symbols or lines must be visually separable. Use vertical axes on the left and right sides of the graph to depict different scales when comparing datasets with different measurements.
Figure source: O.P. Yakutina, T.V. Nechaeva, N.V. Smirnova, Consequences of snowmelt erosion: Soil fertility, productivity and quality of wheat on Greyzemic Phaeozem in the south of West Siberia, Agriculture, Ecosystems & Environment, Volume 200, 1 February 2015, Pages 88-93, ISSN 0167-8809, http://dx.doi.org/10.1016/j.agee.2014.10.021.

Table number (unique number for each table to be easily identified)
Table title (self-explanatory, above the table)
Column headings (explain meaning of the data, including units of measurement)
Table notes (supplementary information, below the table)
Table source (references)

Excel Table with Wagga Wagga climate data (September 2005 to August 2006) (Bureau of Meteorology) Formatting the table to make it effective Highlighted rows show the days in September 2005 where the rainfall was greater than evaporation

Figures and tables are numbered separately (e.g. Table 1, Table 2, Figure 1, Table 3, Figure 2). Figures and tables must always be properly captioned and referred to in the text. This means that a figure or table must be mentioned in the text before the figure or table appears. Figure and table captions should be informative without being too long; if the data is sourced from elsewhere, then this should be referenced in the caption.

Thomas et al., 2006

Table captions go above the table. Figure captions go below the figure.
Figure source: Graham S. Leonard, Carol Stewart, Thomas M. Wilson, Jonathan N. Procter, Bradley J. Scott, Harry J. Keys, Gill E. Jolly, Johnny B. Wardman, Shane J. Cronin, Sara K. McBride, Integrating multidisciplinary science, modelling and impact data into evolving, syn-event volcanic hazard mapping and communication: A case study from the 2012 Tongariro eruption crisis, New Zealand, Journal of Volcanology and Geothermal Research, Volume 286, 1 October 2014, Pages 208-232, ISSN 0377-0273, http://dx.doi.org/10.1016/j.jvolgeores.2014.08.018.

A title is not always needed when the chart is used as a figure in the text, because the title is included as part of the caption (a stand-alone chart in a presentation would normally have a title). Choose the right typeface style. Avoid using different typefaces: use bold, italics, capitals, small caps or contrasting colours to create contrast and emphasis. If possible, avoid putting values on charts (if numbers are needed, use tables instead). Eliminate all redundant terms. For tables, ensure that all values for the same variable have the same number of decimal places and that decimal points are aligned.

Use only 2D charts for 2D data (i.e. to plot two variables). A 3D visual appearance distorts the data and prevents a clear interpretation.

Use colour, but use it with caution. Try to design your chart without the use of colour: if it reproduces well in black and white, it can be reproduced in any medium. Black and white design: make patterns in columns as contrasting as possible. Don't make shading too gradual.

Use the right resolution for your graphics. Resolution is measured as the number of dots per inch (dpi) or the number of pixels (given as the width and height of the image or as the total number of pixels in the image). More pixels means higher resolution and a larger file size. Figure formats (*.jpg, *.tif, *.gif, *.png, *.eps) are either resolution-dependent (image quality changes with compression) or resolution-independent (same quality even when we change size).
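The advice on decimals in tables can be sketched in Python. This is an illustration only: the function name `format_column` and the rainfall values are hypothetical, and one decimal place is an arbitrary choice.

```python
def format_column(values, decimals=1):
    """Format numbers with the same number of decimal places, right-aligned."""
    texts = [f"{v:.{decimals}f}" for v in values]
    width = max(len(t) for t in texts)
    # Right-aligning strings with equal decimals lines up the decimal points.
    return [t.rjust(width) for t in texts]


# Hypothetical values for one table column.
rainfall_mm = [3.2, 12, 0.4, 104.8]

for cell in format_column(rainfall_mm):
    print(cell)
```

With every value written to the same number of decimal places and right-aligned, the decimal points fall in the same column, so the magnitudes can be compared at a glance.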
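The relationship between dpi, physical size and pixel count can be sketched as a short calculation. The function name `pixels_needed` and the 6 x 4 inch figure at 300 dpi are assumptions chosen for illustration, not requirements from the lecture.

```python
def pixels_needed(width_in, height_in, dpi):
    """Return the pixel width and height needed for a given print size and dpi."""
    return round(width_in * dpi), round(height_in * dpi)


# Hypothetical figure: 6 x 4 inches, printed at 300 dots per inch.
width_px, height_px = pixels_needed(6, 4, 300)
total_px = width_px * height_px

print(f"{width_px} x {height_px} pixels ({total_px} pixels in total)")
```

Doubling the dpi doubles both dimensions, so the total number of pixels (and roughly the uncompressed file size) increases fourfold.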

Multivariate data refers to data that is measured for more than 2 variables (bivariate data refers to 2 variables). Multivariate charts: scatterplot matrices (continuous data); mosaic plots (categorical data). Example multivariate data

Displays the relationship among two or more categorical variables Used for representing frequency tables (i.e. the number of times a data value occurs) Example: Mortality rates aboard the Titanic vary for males and females. Among females, 67% survived (coded as 1) and 33% died (coded as 0). https://medschool.vanderbilt.edu/cqs/files/cqs/media/drtsai2_0.pdf
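The frequency table behind a mosaic plot can be sketched with a cross-tabulation. The records below are made-up toy data, not the Titanic figures quoted above, and the function names `cross_tabulate` and `row_percentages` are hypothetical.

```python
from collections import Counter


def cross_tabulate(records):
    """Count how often each (row category, column category) pair occurs."""
    return Counter(records)


def row_percentages(table):
    """Express each cell as a percentage of its row total."""
    row_totals = Counter()
    for (row, _col), n in table.items():
        row_totals[row] += n
    return {(row, col): 100 * n / row_totals[row] for (row, col), n in table.items()}


# Toy records of (sex, survived) with survived coded as 1 and died as 0.
records = [
    ("female", 1), ("female", 1), ("female", 1), ("female", 0),
    ("male", 1), ("male", 0), ("male", 0), ("male", 0),
]

table = cross_tabulate(records)
for (sex, survived), pct in sorted(row_percentages(table).items()):
    print(f"{sex:<6} survived={survived}: {pct:.0f}%")
```

In a mosaic plot, each cell of this table becomes a rectangle whose area is proportional to the cell's frequency.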

Multidimensional data refers to measurements of variables in more than 2 dimensions (or 2D, normally referring to the Cartesian plot with the x,y axis) Common multidimensional charts: 3D plots where time (t) and depth (z) are frequently chosen as the third dimension

Next lecture: What is spatial data? Spatial data collection (GPS) Go through Module 2 in your Learning Modules The information presented here is important for Assessment 2a) and 2b)