Exploratory Data Analysis September 6, 2005
- Grace Wade
1 Exploratory Data Analysis September 6, 2005 Exploratory Data Analysis p. 1/16
2 Some Things to Look for with EDA
- skewness in distributions
- non-constant variability
- nonlinearity
- need for transformations
- outliers
- unknown groups or clusters
Gain insight into the data
Check assumptions for more formal statistical models
3 Graphical Views
1. Univariate: histograms, density curves, boxplots, quantile-quantile plots
2. Bivariate: scatter plots with trend lines, side-by-side boxplots
3. Several variables: scatter plot matrices, lattice or trellis plots, 3-dimensional plots, dynamic plots
4 .First() Function
To use the HH code, we need to
1. download the hh files from the course calendar link
2. download the First.R file
3. edit the First.R code to add the path for the hh files
4. install packages for R (abind, lattice, multcomp, mvtnorm): run the GUI version of R, and use the "install packages from CRAN" option
5. load the .First function
> source("First.R")
6. run the function (this session only, if you save your workspace)
> .First()
5 Creating a Dataframe in R
The hh function specifies the path for all HH files
> usair = read.table(hh("datasets/usair.dat"))
> names(usair)
[1] "V1" "V2" "V3" "V4" "V5" "V6" "V7"
> colnames(usair)=c("so2","temp","mfgfirms","popn
Notes:
1. header=FALSE (the default) indicates that the file has no header (variable name) information
2. the names function extracts the variable (column) names of a dataframe
3. colnames can be used to assign more meaningful names
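The read-then-rename pattern above can be tried without the HH files. The following is a minimal sketch, assuming a small made-up file written to a temporary location; the values and the object name toy are illustrative and are not taken from usair.dat.

```r
# Write a tiny header-less, whitespace-delimited file (made-up values).
tmp <- tempfile(fileext = ".dat")
writeLines(c("10 70.3 213",
             "13 61.0 91",
             "12 56.7 453"), tmp)

# header = FALSE is the default, so the columns are named V1, V2, ...
toy <- read.table(tmp)
default_names <- names(toy)

# Assign more meaningful names (chosen here only for illustration).
colnames(toy) <- c("so2", "temp", "mfgfirms")
str(toy)
```

Printing default_names before the renaming step shows the automatically generated names that colnames then replaces.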
6 Reading Data
read.csv     Comma separated variable format
read.fwf     Fixed width format (useful for 4.1!)
read.delim   Tab delimited files
See help(read.table) for options, such as setting the character for NAs, column separators, skipping lines, etc.
See also scan()
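A short sketch of the three readers listed above may help. It assumes small inline or temporary inputs invented for this example; the column names, the widths, and the use of -99 as a missing-value code are all assumptions.

```r
# Comma separated: treat both "NA" and the code -99 as missing values.
csv_txt <- "id,value\n1,3.5\n2,NA\n3,-99"
d_csv <- read.csv(text = csv_txt, na.strings = c("NA", "-99"))

# Tab delimited: read.delim uses "\t" as the separator and expects a header.
tab_txt <- "id\tgroup\n1\ta\n2\tb"
d_tab <- read.delim(text = tab_txt)

# Fixed width: the first 2 characters form one field, the next 3 another.
fwf_file <- tempfile()
writeLines(c("AB123", "CD456"), fwf_file)
d_fwf <- read.fwf(fwf_file, widths = c(2, 3),
                  col.names = c("code", "count"))

print(d_csv)
print(d_tab)
print(d_fwf)
```

All three functions are wrappers around read.table, so options such as na.strings and skip are available in each of them.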
7 Scatter Plots
bivariate
  plot(x,y)
  plot(y ~ x)   Note use of model formula
all-possible pairwise scatter plots
  plot(dataframe)
  pairs(dataframe)
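The calls above can be tried on data sets that ship with R. This is a minimal sketch, assuming the built-in cars and iris data as stand-ins for the course data; the lowess trend line is an illustrative addition and not part of the slide.

```r
pdf(NULL)  # draw to a null device so that no file or window is needed

# Two equivalent ways to draw a bivariate scatter plot.
plot(cars$speed, cars$dist)
plot(dist ~ speed, data = cars)   # model formula: y ~ x

# Add a smooth trend line to the current plot.
trend <- lowess(cars$speed, cars$dist)
lines(trend, col = "red")

# All pairwise scatter plots of the numeric columns.
plot(iris[1:4])    # plot() on a data frame with several columns calls pairs()
pairs(iris[1:4])

invisible(dev.off())
```

In an interactive session the pdf(NULL) and dev.off() lines can be dropped so that the plots appear on screen.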
8 pairs()
pairs(usair)
pairs(usair, panel=panel.smooth)   Add a smoother to each plot
pairs(so2 ~ ., panel=panel.smooth, data=usair)   use a model formula
Hartigan's original version of a scatterplot matrix had histograms on the diagonal. We first need to define a function panel.hist for the diagonal panels.
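9 Defining a function
panel.hist = function(x, ...) {
    # save the user coordinates and restore them when the panel is done
    usr <- par("usr"); on.exit(par(usr = usr))
    # keep the x range, rescale the y axis to 0..1.5 for the bars
    par(usr = c(usr[1:2], 0, 1.5))
    h <- hist(x, plot = FALSE)
    breaks <- h$breaks; nb <- length(breaks)
    # scale the counts so that the tallest bar has height 1
    y <- h$counts; y <- y/max(y)
    rect(breaks[-nb], 0, breaks[-1], y, col="cyan", ...)
}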
10 SPLOM with histogram
> pairs(so2 ~ temp + mfgfirms + popn + wind + precip + raindays,
        data=usair, panel=panel.smooth, diag.panel=panel.hist)
> pairs(log(so2) ~ log(temp) + log(mfgfirms) + log(popn) + log(wind) + log(precip) + log(raindays),
        data=usair, panel=panel.smooth, diag.panel=panel.hist)
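Since usair.dat is not distributed with R, the scatterplot matrix with histograms can be tried on a built-in data set instead. The sketch below assumes mtcars as a stand-in and follows the panel.hist example from the pairs help page; the choice of variables is arbitrary.

```r
# Diagonal panel: histogram scaled so that the tallest bar has height 1.
panel.hist <- function(x, ...) {
  usr <- par("usr"); on.exit(par(usr = usr))   # restore coordinates on exit
  par(usr = c(usr[1:2], 0, 1.5))               # keep x range, reset y range
  h <- hist(x, plot = FALSE)
  breaks <- h$breaks; nb <- length(breaks)
  y <- h$counts; y <- y / max(y)
  rect(breaks[-nb], 0, breaks[-1], y, col = "cyan", ...)
}

pdf(NULL)  # null device; omit in an interactive session

# Raw scale.
pairs(~ mpg + disp + hp + wt, data = mtcars,
      panel = panel.smooth, diag.panel = panel.hist)

# Log scale, to check whether skewness and curvature are reduced.
pairs(~ log(mpg) + log(disp) + log(hp) + log(wt), data = mtcars,
      panel = panel.smooth, diag.panel = panel.hist)

invisible(dev.off())
```

A one-sided formula is used here. pairs also accepts a response on the left-hand side, but it is treated as just another variable in the matrix.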
11 Trellis Plots
Trellis plots (S-Plus) and Lattice plots in R also create layouts for multiple plots.
A trellis of plots is generated as a sequence of plots that are then arranged in rows, columns and pages.
The sequence is determined by the conditioning factors in the formula
  ~ X
  Y ~ X
  Y ~ X | Z
  Y ~ X | Z*W
where Z and W are factors or shingles, Y is on the y-axis, and X is on the x-axis
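The conditioning formula can be illustrated with the lattice package. This is a sketch under the assumption that the built-in mtcars data stand in for the course data; the shingle built with equal.count is one possible choice of conditioning variable.

```r
library(lattice)

# Y ~ X | Z with a factor Z: one panel per number of cylinders.
p_factor <- xyplot(mpg ~ wt | factor(cyl), data = mtcars)

# Y ~ X | Z with a shingle Z: overlapping intervals of a numeric variable.
hp_shingle <- equal.count(mtcars$hp, number = 3, overlap = 0.1)
p_shingle <- xyplot(mpg ~ wt | hp_shingle, data = mtcars)

# Lattice functions return "trellis" objects; printing one draws it.
pdf(NULL)
print(p_factor)
print(p_shingle)
invisible(dev.off())
```

Because the plot is an object, it can be stored, updated, and printed later, which differs from base graphics where each call draws immediately.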
12 Getting started
library(lattice)
help(lattice)
help(xyplot)
example(xyplot)
13 Ladder of Powers
The ladder function of HH is built on the lattice package
> ladder(so2 ~ temp, data=usair, main="ladder of Powers for SO2 and Tempe
Explore Box-Cox power transformations of y (and x):
\[
\mathrm{power}(y, p) =
\begin{cases}
\dfrac{y^{p} - 1}{p} & (p \neq 0) \\
\log(y) & (p = 0)
\end{cases}
\]
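The power family above can be written as a few lines of R. This is a minimal sketch; the name bc_power is hypothetical and is not the HH implementation, and the function assumes strictly positive y.

```r
# Box-Cox power transformation: (y^p - 1)/p for p != 0, log(y) for p == 0.
bc_power <- function(y, p) {
  stopifnot(all(y > 0))          # the family is defined for positive values
  if (p == 0) log(y) else (y^p - 1) / p
}

y <- c(1, 2, 4)
bc_power(y, 1)     # shifted identity: y - 1
bc_power(y, 0.5)   # rescaled square root
bc_power(y, 0)     # natural log
bc_power(y, -1)    # rescaled reciprocal: 1 - 1/y
```

Subtracting 1 and dividing by p keeps the transformed values increasing in y for every p, including negative powers, and makes the log the limiting case as p approaches 0.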
14 Ladder of Powers with Boxplots and QQPl
1. create new function ladder.1d(x) from code in hh/graph/code/graph.f10.r
2. ladder.1d(usair$so2)
[Figure: two rows of panels, "Boxplot with Powers" and "Normal quantiles with Powers", each with panels y^-1, y^-0.5, y^0, y^0.5, y^1, y^2]
15 Box-Cox Function
A more formal way to find a power transformation is to use the Box-Cox function
library(MASS)   # more formal method to estimate power
boxcox(so2 ~ temp, data=usair)
boxcox(so2 ~ log(temp), data=usair)
boxcox(so2 ~ sqrt(temp), data=usair)
boxcox(so2 ~ log(temp) + log(mfgfirms) + log(popn) + log(wind) + log(precip) + log(raindays), data=usair)
Find the value of the power that maximizes the likelihood of normality
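The numbers behind the Box-Cox plot can also be extracted directly. The sketch below assumes the built-in mtcars data and the model mpg ~ wt as stand-ins for the SO2 models; the interval uses the usual chi-squared cutoff on the profile log-likelihood.

```r
library(MASS)

# Profile log-likelihood over the default grid of powers from -2 to 2.
bc <- boxcox(mpg ~ wt, data = mtcars, plotit = FALSE)

# Power on the grid with the largest log-likelihood.
lambda_hat <- bc$x[which.max(bc$y)]

# Approximate 95% interval: powers whose log-likelihood is within
# qchisq(0.95, 1)/2 of the maximum.
cutoff <- max(bc$y) - qchisq(0.95, df = 1) / 2
lambda_ci <- range(bc$x[bc$y > cutoff])

lambda_hat
lambda_ci
```

In practice a convenient power inside the interval (such as 0, 0.5, or -1) is often preferred to the exact maximizer because it is easier to interpret.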
16 SO2 data
[Figure: Box-Cox log-likelihood for the SO2 data, with the 95% interval marked]
Choose a power near the maximum or in the interval
Assumes a particular model formulation!