Introduction to Geospatial Analysis

Descriptive Statistics

What and Why?

So what are descriptive statistics, and why should you know about them for mapping? Descriptive statistics are quantitative descriptions of data that provide a basic summary of a data set. Cartographers use them to explore the character of the data. Descriptive statistics describe three things: the central tendency of the data, the dispersion of the data, and the shape of the data.

Central Tendency

Central tendency describes where the values of a data set are centered. The slides list six measures under this heading: maximum, minimum, range, mean, median, and mode. Strictly speaking, the mean, median, and mode locate the center of the data, while the maximum, minimum, and range describe the extent of the values around it. Let's discuss each of these measures in a little more detail.

Central Tendency

Maximum: maximum value in the data set
Minimum: minimum value in the data set
Range: maximum - minimum

The maximum simply reports the largest value in a data set. The minimum reports the smallest value in a data set. The range is the maximum value minus the minimum value of the data set.

Central Tendency

Mean: $\bar{X} = \frac{\sum x}{N}$
Median: middle point of the data
Mode: most common value

The mean is the average value of the data set. It is found by summing all values in the data set and dividing the sum by the number of values. The median reports the middle point of the data. For instance, if we had 3 observations in our data set, we would sort the observations in ascending order and then take the value at the midway point, which would be the 2nd observation in this case. If there is an even number of observations, the median is the average of the 2 most central observations. The mode reports the most common value found in the data set. If multiple values are tied as the most common value, the data set has multiple modes.
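
The six measures above can be sketched with Python's standard library. The sample values are hypothetical and only illustrate the calculations; `multimode` is used so that tied modes are all reported.

```python
from statistics import mean, median, multimode

# Hypothetical attribute values for seven map features (illustrative only).
values = [4, 8, 15, 16, 23, 42, 8]

summary = {
    "minimum": min(values),
    "maximum": max(values),
    "range": max(values) - min(values),
    "mean": mean(values),        # sum of the values / number of values
    "median": median(values),    # middle value after sorting
    "modes": multimode(values),  # a list, since several values can tie
}

for name, value in summary.items():
    print(f"{name}: {value}")
```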

Dispersion

Dispersion measures the variability of the data set. There are 2 measures of dispersion: variance and standard deviation.

Variance

$$\sigma^2 = \frac{\sum (x - \bar{X})^2}{N}$$

The variance is the sum of the squares of the deviations from the mean, divided by the number of observations. Because the deviations are squared, the variance is expressed in the square of the original units rather than in the original units themselves. For instance, if the observations were measured in feet, the variance reports the average squared deviation in square feet.

Standard Deviation

$$\sigma = \sqrt{\frac{\sum (x - \bar{X})^2}{N}}$$

The standard deviation is the square root of the variance, that is, the square root of the sum of the squared deviations divided by the number of observations. Taking the square root returns the measure to the original units of the observations, so a data set measured in feet has a standard deviation in feet. Expressing each observation as a number of standard deviations from the mean measures dispersion in a standard way, which allows 2 different variables or data sets to be compared.
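
As a minimal sketch of the two dispersion formulas, the functions below divide by the number of observations N (the population form). The function names and the sample heights are hypothetical.

```python
from math import sqrt

def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(values)
    mean_value = sum(values) / n
    return sum((x - mean_value) ** 2 for x in values) / n

def population_std(values):
    """Square root of the population variance."""
    return sqrt(population_variance(values))

# Hypothetical observations measured in feet.
heights_ft = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

variance = population_variance(heights_ft)  # in square feet
std_dev = population_std(heights_ft)        # in feet
print(variance, std_dev)
```

With these sample values the mean is 5 ft, the variance is 4 square feet, and the standard deviation is 2 ft.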

What we are looking at here is a normal distribution of data, which means that most of the observations in the data set are close to the average value while relatively few tend to one extreme or the other. The x-axis is the value in question and the y-axis is the number of observations for each value on the x-axis. The standard deviation tells us how tightly the observations are clustered around the mean of the data set. When the observations are tightly bunched together, the bell-shaped curve is steep and the standard deviation is small. When the observations are spread apart, the bell curve is relatively flat, which tells you that the standard deviation is relatively large. One standard deviation away from the mean in either direction on the horizontal axis accounts for about 68% of the observations in the data set. Two standard deviations away from the mean, which covers the 4 areas closest to the center, accounts for about 95% of the observations. Three standard deviations account for about 99.7% of all the observations under the curve.
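
The percentages quoted for 1, 2, and 3 standard deviations can be checked against a standard normal distribution. This sketch uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def share_within(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

shares = {k: share_within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.1%}")
```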

Skewness

Skewness occurs when the peak of a distribution lies to either side of the mean of the distribution. The skewness measure tells us whether the peak is to one side of the mean or the other. If a data set has a negative skew, the peak of the distribution is above the average. If a data set has a positive skew, the peak of the distribution is below the average.

Kurtosis

The final descriptive measure, which describes shape rather than central tendency, is kurtosis. Kurtosis describes the flatness or peakedness of a distribution. For a normal distribution, the kurtosis is equal to 3.0. A distribution with a kurtosis of 3.0 is a mesokurtic distribution. A value above 3.0 indicates a leptokurtic distribution. A value less than 3.0 indicates a platykurtic distribution.

Here's a visualization of the 3 types of kurtosis. The green line is the normal distribution, which has a value of 3.0 and is mesokurtic. The red line with a strong peak is the leptokurtic distribution and has a value greater than 3.0. The blue line with a flat top is the platykurtic distribution, which has a value less than 3.0.
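
Skewness and kurtosis can both be computed from central moments. The sketch below assumes the simple moment-based (population) definitions, under which a normal distribution has a kurtosis of 3.0; statistical packages often apply sample corrections or report "excess" kurtosis (kurtosis minus 3), so their output may differ. All names and sample values are hypothetical.

```python
def central_moment(values, k):
    """Average of the k-th power of the deviations from the mean."""
    n = len(values)
    mean_value = sum(values) / n
    return sum((x - mean_value) ** k for x in values) / n

def skewness(values):
    """Third central moment scaled by the variance; 0 for symmetric data."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def kurtosis(values):
    """Fourth central moment scaled by the squared variance; 3.0 for a normal distribution."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2

symmetric = [1, 2, 3, 4, 5]
long_right_tail = [1, 1, 2, 2, 3, 10]

print(skewness(symmetric))        # symmetric data: no skew
print(skewness(long_right_tail))  # positive: the peak sits below the mean
print(kurtosis(symmetric))        # below 3.0: flatter than normal (platykurtic)
```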

Data Classification

Data Classification

Data classification categorizes objects into separate bins or classes based on a set of conditions. Classification may add to or modify the attribute data for each geographic object. For example, a classification could add a new code such as "large" or "small". Classification could also recode an attribute, such as changing "urban" to "dense". One attribute can yield many different maps depending on which classification method is chosen. Different classification methods have a direct effect on how the map and data are perceived by the map user, so much care must be taken when choosing a data classification method.

Why Classify Data?

So why should you classify data? The primary reason is to simplify the data for visual display. For instance, the map on the right (labeled "Not Simple") assigns each value a unique color, which in this case means that each county has its own unique color. It is very difficult to look for patterns when every county has its own color. The map on the left (labeled "Simple") uses the same data, but applies a data classification method to group the counties and make the data easier to visualize. With the classified data, where the lighter values represent less of an item and the darker values represent more of an item, spatial patterns of distribution begin to emerge.

Example of a Classification

Here's an example of a classification showing countries with high, medium, and low values of population.

Classification Goals

There are 3 goals for data classification with regard to cartography. The 1st goal is to simplify the visualization so that spatial patterns of distribution become easier for the map reader to see. The 2nd goal is to group similar observations together. The 3rd goal is to show the difference between the groups.

Classification Methods

Binary Classification

Binary classification places objects into 2 classes. The 2 classes can be 0 and 1, true and false, or any other dichotomy that you can think of. Typically, a binary classification is used to store the results of complex operations when the operation returns a yes or no answer.

In this example, the states were classified into 2 binary classes: one class for the northern states and one class for the southern states.
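
A binary classification like the north/south example can be sketched as a single yes-or-no test per feature. The feature names, latitudes, and dividing line below are all hypothetical; the slides do not say how the states were assigned.

```python
# Hypothetical features: name -> latitude of the feature's centroid (illustrative values).
features = {
    "State A": 46.5,
    "State B": 31.2,
    "State C": 39.8,
    "State D": 34.0,
}

THRESHOLD_LAT = 37.0  # assumed dividing line between "northern" and "southern"

# 1 = northern, 0 = southern: the stored result of a yes/no question.
is_northern = {name: int(lat >= THRESHOLD_LAT) for name, lat in features.items()}
print(is_northern)
```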

Dissolve

The next classification method we can use is the dissolve method. Dissolve combines similar features within a data layer based on a shared attribute. For example, continuing the northern and southern states example, we can dissolve all the states based on whether they are a northern state or a southern state. The dissolve operation creates new geometry, in this case two polygons: one polygon that combines the extents of all of the northern states, and one polygon that combines the extents of all of the southern states. Take note that the new geometry does not carry over any of the attributes of the dissolved features.

Automatic Classification

For the rest of this section, we will discuss automatic classification methods. An automatic classification method is one where you set up rules for the computer to follow, and the computer then executes the data classification.

Jenks Natural Breaks

The Jenks natural breaks classification method aims to maximize homogeneity within classes. It uses breaks in the histogram as class breaks and assumes that grouped data are alike. The advantages of this method are that it considers the distribution of the data, minimizes in-class variance, and maximizes between-class variance. As a result, it produces a classification with high accuracy in terms of maximizing homogeneity within classes. The disadvantages are that the method is complicated and the procedure for grouping is difficult to understand.

Natural Breaks

Here's an example of the Wisconsin map classified using the natural breaks method. On the right is the histogram of the values, where the x-axis is the observation value and the y-axis is the number of observations for each value. The vertical blue lines show the extent of each class and are the class breaks. Looking at the histogram and the blue lines, you can see that natural breaks tends to place the class breaks where there are natural valleys or gaps in the data set. The map on the left displays the result of the classification visually and shows the 5 classes varying from a light brown to a dark brown in a sequential color scheme. This map is interesting to look at and displays interesting patterns in the data.
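
The goal of natural breaks, minimizing the variation inside each class, can be illustrated with a brute-force search. This is only a sketch for very small data sets: it tries every possible set of break positions and keeps the one with the smallest total within-class sum of squared deviations. GIS software uses a far more efficient procedure, and the function names and sample values here are hypothetical.

```python
from itertools import combinations

def sum_squared_dev(group):
    """Sum of squared deviations of a class from its own mean."""
    mean_value = sum(group) / len(group)
    return sum((x - mean_value) ** 2 for x in group)

def natural_breaks_bruteforce(values, n_classes):
    """Return the split of the sorted values that minimizes in-class variation."""
    data = sorted(values)
    n = len(data)
    best_cost, best_groups = None, None
    # Choose n_classes - 1 cut positions between the sorted observations.
    for cuts in combinations(range(1, n), n_classes - 1):
        bounds = (0, *cuts, n)
        groups = [data[a:b] for a, b in zip(bounds, bounds[1:])]
        cost = sum(sum_squared_dev(g) for g in groups)
        if best_cost is None or cost < best_cost:
            best_cost, best_groups = cost, groups
    return best_groups

values = [1, 2, 3, 10, 11, 12, 30, 31, 33]
classes = natural_breaks_bruteforce(values, 3)
print(classes)
```

With these values the breaks fall in the two wide gaps in the data, between 3 and 10 and between 12 and 30.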

Nested Means

The nested means automatic classification method creates classes about the arithmetic mean of the data set. Additional means can be calculated on either side of the 1st mean to create additional classes. The advantages of this method are that it is easily computed and mathematically intuitive. The disadvantages are that it is limited to 2, 4, 8, or further powers of 2 as the number of classes, and that it does not consider the distribution of the data, only where the averages of the data fall.

Nested Means

The way nested means works is that the first 2 classes are created with 1 above the mean and 1 below the mean. The 3rd and 4th classes are created by splitting each of the first 2 classes at its own mean. In this case, it creates a reasonably interesting map, where each class has about the same number of observations.
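
The repeated splitting at the mean can be sketched recursively. The function name and sample values are hypothetical, and the sketch assumes that a value exactly equal to a mean goes into the upper class.

```python
def nested_means_breaks(values, levels):
    """Return sorted class breaks; levels=1 gives 2 classes, levels=2 gives 4, and so on."""
    if levels == 0 or not values:
        return []
    mean_value = sum(values) / len(values)
    below = [x for x in values if x < mean_value]
    above = [x for x in values if x >= mean_value]
    return (
        nested_means_breaks(below, levels - 1)
        + [mean_value]
        + nested_means_breaks(above, levels - 1)
    )

values = [1, 2, 3, 4, 10, 20, 30, 50]
breaks = nested_means_breaks(values, 2)  # 3 breaks -> 4 classes
print(breaks)
```

Here the overall mean is 15, the mean of the values below it is 4, and the mean of the values above it is about 33.3, so those three numbers become the class breaks.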

Mean and Standard Deviation

The mean and standard deviation automatic classification method creates classes about the arithmetic mean, with class breaks at standard deviations above and below the mean. Since this method shows how far values diverge from the mean, you should use a diverging color ramp centered on the mean. The advantages of this method are that it is good for displaying data with a normal distribution, it considers the distribution of the data, and it produces constant class intervals, one for each standard deviation above and below the mean. The disadvantages are that most data are not normally distributed and so are not a good fit for this method, and that it requires the map reader to understand statistics to interpret the map appropriately.

Mean and Standard Deviation

In this example, a diverging color ramp is used, with colors tending towards brown for observations below the mean and colors tending towards blue for observations above the mean. The data for the state has a positive skew and is not normally distributed. This method is therefore not ideal for this data set, but it still tends to display interesting geographic patterns.
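
As a minimal sketch, each observation can be assigned a signed class number that counts whole standard deviations from the mean. The sketch assumes the population standard deviation and one-standard-deviation-wide classes; the function name and sample values are hypothetical.

```python
from math import floor
from statistics import mean, pstdev

def std_dev_class(value, mean_value, std_dev):
    """Signed class index: 0 covers [mean, mean + 1 sd), -1 covers [mean - 1 sd, mean), etc."""
    return floor((value - mean_value) / std_dev)

values = [2, 4, 4, 4, 5, 5, 7, 9]
m = mean(values)    # 5
s = pstdev(values)  # 2.0 (population standard deviation)

classes = [std_dev_class(x, m, s) for x in values]
print(classes)
```

Negative class numbers would take colors from one end of the diverging ramp and non-negative numbers from the other.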

Equal Interval

$$\text{Class Range} = \frac{\text{Max Value} - \text{Min Value}}{\text{number of classes}}$$

The equal interval automatic classification method creates classes with equal ranges. The class range is calculated by taking the maximum value of the data set, subtracting the minimum value from it, and then dividing the result by the number of classes. The advantages of this method are that it is easy to understand, simple to compute, and leaves no gaps in the legend. The disadvantages are that it does not consider the distribution of the data at all and that it may produce classes with 0 observations, which is highly undesirable.

Equal Interval

If we look at the histogram of the data, we can see that the class breaks are evenly spaced, which is the goal of the equal interval classification method. Looking at the map, some of the interesting patterns are now hidden, since the class ranges include data values that are different from each other. The map also tends towards being a little boring for this data set.
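
The equal interval calculation can be sketched as follows, assuming the range is divided by the number of classes and that the data contain at least two distinct values. The function names and sample values are hypothetical; the sample is deliberately skewed to show a class with zero observations.

```python
def equal_interval_breaks(values, n_classes):
    """Class breaks spaced (max - min) / n_classes apart."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + i * width for i in range(n_classes + 1)]

def class_counts(values, breaks):
    """Number of observations falling in each class; the maximum goes in the top class."""
    n_classes = len(breaks) - 1
    lo, width = breaks[0], breaks[1] - breaks[0]
    counts = [0] * n_classes
    for x in values:
        index = min(int((x - lo) / width), n_classes - 1)
        counts[index] += 1
    return counts

values = [0, 5, 12, 18, 40, 100]
breaks = equal_interval_breaks(values, 4)
counts = class_counts(values, breaks)
print(breaks)
print(counts)
```

With these values each class is 25 units wide, and the class from 50 to 75 contains no observations.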

Equal Frequency (a.k.a. Quantile)

The equal frequency automatic classification method is also known as the quantile classification method. Its goal is to distribute observations equally among classes, so that each class has the same number of observations. If the number of observations does not divide evenly into the number of classes, you should place the extra observations in the lower classes, thereby overloading the lower classes. The advantages of this method are that it is easily calculated, is applicable to ordinal data, and will not create a class with 0 observations. The disadvantages are that it does not consider the distribution of the data, can create gaps in the legend, and may place an observation in one class even though its value is closer to the values in a different class.

Equal Frequency

As the equal frequency automatic classification method aims to have the same number of observations in each class, there should be about the same amount of each color on the map, which leads to a balanced-looking map. Again, a major negative is that a class may contain data values that are dissimilar, which goes against the purpose of classifying data.
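
A minimal sketch of the equal frequency rule, including the placement of extra observations in the lower classes, might look like this. The function name and sample values are hypothetical.

```python
def equal_frequency_classes(values, n_classes):
    """Split the sorted values into classes of (nearly) equal size."""
    data = sorted(values)
    base, extra = divmod(len(data), n_classes)
    classes, start = [], 0
    for i in range(n_classes):
        size = base + (1 if i < extra else 0)  # extras overload the lower classes
        classes.append(data[start:start + size])
        start += size
    return classes

values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 50]
classes = equal_frequency_classes(values, 3)
print(classes)
```

Eleven observations in 3 classes gives class sizes of 4, 4, and 3. The top class groups 9 and 10 with 50, which illustrates how dissimilar values can end up in the same class.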

Arithmetic and Geometric Intervals

The arithmetic and geometric intervals automatic classification methods create class boundaries that change systematically with a mathematical progression. These methods are useful when the range of the observations is large and the observations follow some sort of mathematical progression that the classes can follow. The advantages are that they are good for data with large ranges and that the break points are determined by the rate of change in the data set. The disadvantage is that they are not appropriate for data with a small range or with linear trends.

Arithmetic and Geometric Intervals

If we look at the histogram, we can see that the class ranges slowly increase along the x-axis, as the method is trying to fit the data into classes that widen at an increasing rate. This creates a reasonably interesting map, but in this case it does seem to overfit the data and make additional patterns appear where they probably should not.
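
The idea of class boundaries that follow a mathematical progression can be sketched in its simplest form. These are textbook progressions, not the algorithm of any particular GIS package: in the arithmetic version the class widths grow as d, 2d, 3d, and so on, and in the geometric version each break is the previous break multiplied by a constant ratio (which requires a minimum greater than 0). The function names and sample ranges are hypothetical.

```python
def arithmetic_breaks(lo, hi, n_classes):
    """Class widths grow as d, 2d, 3d, ... and together cover the full range."""
    d = (hi - lo) / (n_classes * (n_classes + 1) / 2)
    breaks = [lo]
    for i in range(1, n_classes + 1):
        breaks.append(breaks[-1] + i * d)
    return breaks

def geometric_breaks(lo, hi, n_classes):
    """Each break is the previous one times a constant ratio; requires lo > 0."""
    ratio = (hi / lo) ** (1 / n_classes)
    return [lo * ratio ** i for i in range(n_classes + 1)]

arith = arithmetic_breaks(0, 60, 3)   # widths 10, 20, 30
geom = geometric_breaks(1, 1000, 3)   # ratio of about 10
print(arith)
print([round(b, 6) for b in geom])
```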

Table Operations

This section covers the concept of table operations and how they apply to attribute tables in a geospatial data set.

Table Operations

Combining tables of attributes is a powerful analysis technique. There are three common types of table operations: intersect, union, and join. Each of these operations combines tables in a different way and for a different purpose. It is important to note that for multiple tables to be inputs to an intersect or union operation, the tables must have the same fields and field types, which is commonly referred to as the schema. Let's take a look at each of these three table operations in more detail.

Table Operations - Intersect

Input Table 1
ID   Region   Salesman
1    A        Will
2    D        Adam
3    A        Billy
4    M        Alice
5    W        Claire

Input Table 2
ID   Region   Salesman
1    A        Will
5    W        Claire
11   M        Jean

Intersect Output Table
ID   Region   Salesman
1    A        Will
5    W        Claire

The intersect table operation produces an output table that contains only the records that all of the input tables have in common. That means that every attribute in the entire row is identical. As an example, we have two input tables that have the same schema, as all input tables must for the intersect operation. The output table includes only the two records that were found in both input tables: the row where the ID is 1, the region is A, and the salesman is Will, and the row where the ID is 5, the region is W, and the salesman is Claire.

Table Operations - Union

Input Table 1
ID   Region   Salesman
1    A        Will
2    D        Adam
3    A        Billy
4    M        Alice
5    W        Claire

Input Table 2
ID   Region   Salesman
1    A        Will
5    W        Claire
11   M        Jean

Union Output Table
ID   Region   Salesman
1    A        Will
2    D        Adam
3    A        Billy
4    M        Alice
5    W        Claire
11   M        Jean

The union table operation produces an output table that contains every record that exists in any of the input tables, with records shared by several tables listed only once. For example, we have the same two tables as before, and notice that the output table combines all the rows from both input tables. Again, note that both input tables have the same schema, as this is required for union.
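
Because intersect and union compare whole rows, they behave like set operations on the records. The sketch below represents each row from the example tables as a tuple of (ID, Region, Salesman); this representation is an assumption made for illustration, not how a GIS stores its tables.

```python
# Rows are (ID, Region, Salesman) tuples; both tables share the same schema.
table_1 = [(1, "A", "Will"), (2, "D", "Adam"), (3, "A", "Billy"),
           (4, "M", "Alice"), (5, "W", "Claire")]
table_2 = [(1, "A", "Will"), (5, "W", "Claire"), (11, "M", "Jean")]

# Intersect: only rows that are identical in both tables.
intersect = sorted(set(table_1) & set(table_2))

# Union: every row found in either table, with shared rows listed once.
union = sorted(set(table_1) | set(table_2))

print(intersect)
print(union)
```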

Table Operations - Join

Input Table 1
ID   Region   Salesman
1    A        Will
2    D        Adam
3    A        Billy
4    M        Alice
5    W        Claire

Input Table 2
Region   Manager   Area (sq. mi.)
A        John      20
D        Felicia   25
W        Alan      16
M        Seneca    19

Join Output Table
ID   Region   Salesman   Manager   Area (sq. mi.)
1    A        Will       John      20
2    D        Adam       Felicia   25
3    A        Billy      John      20
4    M        Alice      Seneca    19
5    W        Claire     Alan      16

The third table operation is the join operation. The join operation joins together two tables based on a shared attribute: the attributes of one input table are appended to the attributes of the other input table. For example, we have two input tables that share the common attribute of region. Based on the region attribute, the manager and area attributes of input table 2 are appended to the rows of input table 1. Matching the value of the region attribute between the two tables determines which row of input table 2 is copied over. For example, if we look at the output table, the row with ID 1 has region A, which means that the manager John and the area of 20 were appended. In the row where the ID is 2, the region is D, so the row with region D in the second input table is copied to the output table. Also note that for the record where the ID is 3, the region is A again, so once again the manager John and the area of 20 are copied to the output table for that row.
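
The join in the example can be sketched with plain Python dictionaries, using the region as the lookup key. The data come from the example tables; the variable names and the dictionary representation are assumptions made for illustration.

```python
# Input table 1: one row per salesman.
salesmen = [
    {"ID": 1, "Region": "A", "Salesman": "Will"},
    {"ID": 2, "Region": "D", "Salesman": "Adam"},
    {"ID": 3, "Region": "A", "Salesman": "Billy"},
    {"ID": 4, "Region": "M", "Salesman": "Alice"},
    {"ID": 5, "Region": "W", "Salesman": "Claire"},
]

# Input table 2: one row per region, keyed by the shared attribute.
regions = {
    "A": {"Manager": "John", "Area (sq. mi.)": 20},
    "D": {"Manager": "Felicia", "Area (sq. mi.)": 25},
    "W": {"Manager": "Alan", "Area (sq. mi.)": 16},
    "M": {"Manager": "Seneca", "Area (sq. mi.)": 19},
}

# Append the matching region attributes to every salesman row.
joined = [{**row, **regions[row["Region"]]} for row in salesmen]

for row in joined:
    print(row)
```

Rows 1 and 3 both look up region A, so both receive the manager John and the area of 20.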

Creating Graphic Statistical Models

Graphic Statistical Models

A graphic statistical model is another form of cartography that provides a visualization for analyzing data in a form other than a map. Graphic statistical models are excellent at allowing the user to identify trends in the data. Additionally, they provide insight into the distribution of the data over the entire data set. There are many different graphic statistical models available, but here we will cover only the 4 most common: scatter plots, bar graphs, vertical line graphs, and histograms.

Scatter Plots

A scatter plot is a good method of viewing relationships in data sets. It allows the reader to identify relationships between 2 variables, where each variable is on its own axis. With the variables on separate axes, a relationship is visible as a pattern in the plot, which is commonly known as a trend and is traced using a trend line. In this way a scatter plot describes correlations between data sets: if there is a trend in the data, the scatter plot will show it visually. It may also show that there is no relationship, when the data points appear randomly scattered.

This is an example of a scatter plot showing the GDP per capita of the United States in the year 2009 and the economic complexity index of the United States in the year 2008. As we can see in the scatter plot, as the GDP increases, so does the economic complexity index. This is a good example of positive correlation between the 2 variables. Also, since there seems to be a trend in this data set, you could draw a trend line starting at the bottom left and running through the center of the data points to represent the general trend of the data.
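
The strength of the relationship seen in a scatter plot, and the trend line drawn through it, can both be computed directly. This sketch assumes the Pearson correlation coefficient and an ordinary least-squares line, which are common choices but are not named in the slides. The function names and data are hypothetical and are not the GDP figures from the example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation between two variables: +1 rising trend, -1 falling trend, 0 none."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def trend_line(xs, ys):
    """Slope and intercept of the least-squares line through the points."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

# Hypothetical paired observations (one point per feature).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
slope, intercept = trend_line(xs, ys)
print(round(r, 3), round(slope, 3), round(intercept, 3))
```

A positive value of r together with a positive slope corresponds to the rising pattern described above.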

Bar Graph

A bar graph represents qualitative data in relative dimensions by drawing rectangular bars in proportion to the values they represent. This allows for comparison between categories of items in a clear visual form. On a bar graph, one axis shows the categories of the items, while the other shows a discrete value for each of those items. The categories are independent of one another, and there is no uniform distance between them. As a result, you can pick out trends between bars, but you cannot calculate a slope from the heights of the bars.

Here's an example of a bar graph showing wind-generated electricity as a percentage of each state's energy generation for the year 2011. On the vertical axis we see the name of each item, in this case the state names. On the horizontal axis we see numbers that represent percentages. For each state, a bar is drawn in proportion to the percentage of that state's energy generation that is wind generated. With all of the states' bars drawn, the graph presents a clear visual for comparing the percentages between states. It should be noted that while this bar graph is sorted in descending order from top to bottom, this is not required.

Vertical Line Graph

A vertical line graph displays a series of information represented by points and connected by lines. Vertical line graphs are excellent for visualizing a trend within a data set.

Figure: Record High Temperatures for Corpus Christi (vertical axis: temperature, 80 to 115; horizontal axis: months, January to December)

In this vertical line graph, the vertical axis shows temperature values between 80 and 115, and the horizontal axis shows the 12 months of the year. For each month, a dot is placed above that month at the height of the temperature that was recorded. This is done for all 12 months. Next, a line is drawn between the 12 dots to show continuity. Vertical line graphs are very simple to create and interpret.

Histogram

A histogram represents continuous variables (e.g., heights or temperatures) and is used to plot the density of data. On a histogram, the bar for each value or range of values is extended once for every observation that falls on that value or within that range. A histogram allows the user to see in which range of the data set the higher frequency of observations exists.

Figure: histogram describing both car reliability and durability

This histogram shows the percentages of 2000 models of automobiles at age 5 to 6 years by Consumer Reports overall reliability rating. On the vertical axis, we see the percentage value. On the horizontal axis, we see 5 qualitative measures of reliability. The height of each bar varies depending on how many values fall within that measure of reliability. As can be seen here, the majority of values fell within the average reliability rating.