Advanced Statistics 1. Lab 11 - Charts for three or more variables. Systems modelling and data analysis 2016/2017
1 Preparing the data

1. Run RStudio.
2. Set your working directory using the setwd() command.
3. If you didn't do it before, clean up your workspace using the rm command.

2 Creating Charts

Creating clustered bar charts for frequencies

1. Look at the warpbreaks data set, which records how often different kinds of wool break under different levels of tension. There are three variables: the outcome variable, which is the number of breaks, and two predictor variables: the kind of wool (A or B) and the level of tension (low, medium or high).

    ?warpbreaks

2. Use the barplot function to chart breaks as a function of wool and tension. Does it work? Why or why not?

    barplot(breaks ~ wool*tension, data = warpbreaks)

3. Restructure the data using the tapply function to create a matrix of mean breaks, with wool in the rows and tension in the columns.

    data <- tapply(warpbreaks$breaks, list(warpbreaks$wool, warpbreaks$tension), mean)

4. Now create a chart of the restructured data using the barplot function.

    barplot(data, beside = TRUE, col = c("steelblue3", "thistle3"), border = NA, main = "Mean Number of Warp Breaks\nby Tension and Wool", xlab = "Tension", ylab = "Mean Number of Breaks")
5. Add a legend to the chart. With locator(1) the placement is interactive: click where you want the legend to appear. You can also specify the legend location with coordinates.

    legend(locator(1), rownames(data), fill = c("steelblue3", "thistle3"))

Creating scatter plots for grouped data

Variations of scatter plots are the most common choice for showing the relationship between several quantitative variables.

1. Load the iris data set, which is very well known in the statistics world, and look at its first five observations. Edgar Anderson's Iris Data contains measurements of sepal length and width and petal length and width for three species of irises.

    ?iris
    data(iris)
    iris[1:5, ]

2. Load the car package, the companion to applied regression (if you don't have it installed already, first run install.packages("car")).

    require(car)

3. Create a single scatter plot with the groups marked, using the sp function from car. In the formula, the variable on the left of ~ goes on the y axis and the variable on the right goes on the x axis.

    sp(Sepal.Width ~ Sepal.Length | Species, data = iris, xlab = "Sepal Length", ylab = "Sepal Width", main = "Iris Data", labels = row.names(iris))

4. Analyze the created chart.

Creating scatter plot matrices

A scatter plot matrix is one way to look at the associations among several quantitative variables.

1. Create the basic scatter plot matrix for the iris data set using the pairs function. Analyze the created plot.

    pairs(iris[1:4])
2. Create a palette with the RColorBrewer package to color the scatter plot.

    require("RColorBrewer")
    display.brewer.pal(3, "Pastel1")

3. Put histograms on the diagonal. To do this, create the function panel.hist.

    panel.hist <- function(x, ...) {
      usr <- par("usr"); on.exit(par(usr = usr))  # restore the user coordinates on exit
      par(usr = c(usr[1:2], 0, 1.5))
      h <- hist(x, plot = FALSE)
      breaks <- h$breaks; nb <- length(breaks)
      y <- h$counts; y <- y/max(y)                # rescale bar heights to at most 1
      rect(breaks[-nb], 0, breaks[-1], y, ...)
      # Removed col = "cyan" from code block; original below
      # rect(breaks[-nb], 0, breaks[-1], y, col = "cyan", ...)
    }

4. Create the scatter plot matrix again using the pairs function, but now with a smoother (panel.smooth) and with histograms on the diagonal (the panel.hist function).

    pairs(iris[1:4],
          panel = panel.smooth,  # Optional smoother
          main = "Scatterplot Matrix for Iris Data Using pairs Function",
          diag.panel = panel.hist,
          pch = 16,
          col = brewer.pal(3, "Pastel1")[unclass(iris$Species)])

5. Create a new scatter plot matrix using the scatterplotMatrix function from the car package. Compare all the created plots.

    library(car)
    scatterplotMatrix(~Petal.Length + Petal.Width + Sepal.Length + Sepal.Width | Species,
                      data = iris,
                      col = brewer.pal(3, "Dark2"),
                      main = "Scatterplot Matrix for Iris Data Using \"car\" Package")

6. Clean up the workspace.

    palette("default")  # Return to default
    detach("package:RColorBrewer", unload = TRUE)
    detach("package:car", unload = TRUE)
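The expression [unclass(iris$Species)] used to color the points in the pairs call may look cryptic, so here is a minimal sketch of what it does. It relies only on base R; the three-color vector pal is a hypothetical stand-in for the palette returned by brewer.pal, and the names codes and point_cols are introduced here for illustration only.

```r
# Hypothetical three-color palette standing in for brewer.pal(3, "Pastel1")
pal <- c("steelblue3", "thistle3", "darkseagreen3")

# A factor is stored as integer codes; unclass() exposes them
# (1 = first level, 2 = second level, 3 = third level)
codes <- unclass(iris$Species)

# Indexing the palette with the codes gives one color per observation
point_cols <- pal[codes]

# Cross-tabulate species against assigned color to check the mapping
table(iris$Species, point_cols)

# Use the per-observation colors in a scatter plot matrix
pairs(iris[1:4], pch = 16, col = point_cols)
```

Because the palette is indexed by the factor codes, the order of the colors follows the order of levels(iris$Species), so changing the level order changes which species gets which color.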
Creating 3D scatter plots

1. Load the iris data into the workspace.

    data(iris)

2. To produce a static 3D scatter plot, first install and load the scatterplot3d package. Then create the basic static 3D scatter plot from the first three variables of the iris data set with the scatterplot3d function.

    install.packages("scatterplot3d")
    require("scatterplot3d")
    scatterplot3d(iris[1:3])

3. Now create a modified static 3D scatter plot by adding coloring and vertical lines that connect each point to the floor of the plot. Save the plot to the object s3d.

    s3d <- scatterplot3d(iris[1:3], pch = 16, highlight.3d = TRUE, type = "h", main = "3D Scatter plot")

4. Calculate the regression plane and add it to the s3d object. Try to analyze this graph.

    plane <- lm(iris$Petal.Length ~ iris$Sepal.Length + iris$Sepal.Width)
    s3d$plane3d(plane)

5. Produce an interactive (here: dynamic, spinning) 3D scatter plot, which makes the pattern much easier to see because you can rotate it and see where things are located in space. To do this, install and load the rgl package ("3D visualization device system (OpenGL)"). Note that, unfortunately, rgl is not compatible with RStudio (it will cause RStudio to crash when you close the graphics window). Because of that, you need to run it from the standard console version of R.

    install.packages("rgl")
    require("rgl")
    require("RColorBrewer")
    plot3d(iris$Petal.Length,  # x variable
           iris$Petal.Width,   # y variable
           iris$Sepal.Length,  # z variable
           xlab = "Petal.Length",
           ylab = "Petal.Width",
           zlab = "Sepal.Length",
           col = brewer.pal(3, "Dark2")[unclass(iris$Species)],
           size = 8)

6. Clean up your workspace.

    detach("package:scatterplot3d", unload = TRUE)
    detach("package:rgl", unload = TRUE)
    detach("package:RColorBrewer", unload = TRUE)

3 Exercise

Creating a scatter plot matrix

Use the data from Lab 8 (searchdata.csv file). This external data set contains information about Google searches.

1. Download, extract and load into RStudio the data from the file searchdata.zip.
2. Look for five variables in this data set: nba, nfl, fifa, degree (demographic information about the percentage of adults with degrees) and age (average).
3. Graph these variables using pairs.
4. Graph these variables using scatterplotMatrix from the car package, the companion to applied regression.
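A possible starting point for the exercise is sketched below. It assumes that searchdata.csv has been extracted into the working directory, that it has a header row, and that the columns are named exactly nba, nfl, fifa, degree and age; none of this is confirmed by the lab text. The object names vars, searches and sub are illustrative. If the file is not found, the sketch falls back to random placeholder numbers (not real search data) so that the code still runs.

```r
# Column names as listed in the exercise (assumed to match the file header)
vars <- c("nba", "nfl", "fifa", "degree", "age")

if (file.exists("searchdata.csv")) {
  # Assumption: comma-separated file with a header row
  searches <- read.csv("searchdata.csv", header = TRUE)
} else {
  # Placeholder only: random numbers, NOT the real Google search data
  set.seed(1)
  searches <- as.data.frame(matrix(rnorm(50 * length(vars)),
                                   ncol = length(vars),
                                   dimnames = list(NULL, vars)))
}

# Keep only the five variables of interest
sub <- searches[, vars]

# Basic scatter plot matrix of the selected variables
pairs(sub, pch = 16, main = "Scatter plot matrix of selected search variables")
```

The same data frame sub can then be passed to the car function in step 4, for example through a one-sided formula listing the five variables.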