ICPSR Training Program McMaster University Summer, The R Statistical Computing Environment: The Basics and Beyond
John Fox
ICPSR Training Program
McMaster University
Summer, 2012

The R Statistical Computing Environment: The Basics and Beyond

The R statistical programming language and computing environment has become the de facto standard for writing statistical software among statisticians and has made substantial inroads in the social sciences. R is a free, open-source implementation of the S language, and is available for Windows, Mac OS X, and Unix/Linux systems. There is also a commercial implementation of S called S-PLUS, but it has been eclipsed by R. The basic R system is developed and maintained by the R Core group, comprising 20 members, many of them eminent in the field of statistical computing. The R Project for Statistical Computing is a project of the R Foundation, whose membership includes the R Core group and several other individuals.

A statistical package, such as SPSS or SAS, is primarily oriented toward combining instructions with rectangular case-by-variable datasets to produce (often voluminous) printouts. Such packages make routine data analysis relatively easy, but they make it relatively difficult to do things that are innovative or non-standard, or to add to the built-in capabilities of the package. In contrast, a good statistical computing environment also makes routine data analysis easy, but it additionally supports convenient programming; this means that users can extend the already impressive facilities of R. Statisticians and others have taken advantage of the extensibility of R to contribute nearly 4000 freely available packages of documented R programs and data to CRAN (the Comprehensive R Archive Network), and many others to the Bioconductor package archive. As well, R is especially capable in the area of statistical graphics, reflecting the origin of S at Bell Labs, a centre of graphical innovation.
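To make the contrast between routine analysis and convenient programming concrete, here is a minimal sketch that is not taken from the workshop materials. It fits a regression to the cars data set that ships with R, then defines a small helper function. The name coefTable and its design are hypothetical, chosen purely for illustration.

```r
# Routine analysis: a linear regression on the built-in 'cars' data
fit <- lm(dist ~ speed, data = cars)

# Extension: a small user-written function (hypothetical name) that
# collects estimates, standard errors, and t-values in one data frame
coefTable <- function(model) {
  est <- coef(model)                 # estimated coefficients
  se <- sqrt(diag(vcov(model)))      # standard errors from the covariance matrix
  data.frame(estimate = est, std.error = se, t.value = est / se)
}

coefTable(fit)
```

Because the function works with the generic extractors coef() and vcov(), it should also apply to other model objects that supply those methods, which is one small instance of the extensibility described above.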
The first day of this workshop is meant to provide a basic overview of and introduction to R, including statistical modeling in R (in effect, using R as a statistical package). The following three days pick up where the basic lectures leave off, and are intended to provide the background required to use R seriously for data analysis and presentation, including an introduction to R programming and to the design of custom statistical graphs, unlocking the power in the R statistical programming environment. Participants should bring their laptops to the workshop and should install R and RStudio in advance (see the instructions on the workshop website).

An outline of the workshop follows (with chapter references to Fox and Weisberg, An R Companion to Applied Regression, Second Edition):

Day 1. Getting started with R (Ch. 1); statistical models in R (Ch. 4, 5, & appendices)
Day 2. Data in R (Ch. 2); the basics of R programming (Ch. 8, Sec )
Day 3. R programming, beyond the basics (Ch. 8, Sec )
Day 4. R graphics (Ch. 7); building R packages

Course Web Site

Materials for the course will be deposited on the course web site, which also has active links to many of the resources described in this syllabus.

Acquiring R

More detailed instructions are on the workshop website.

Windows Users

You can download the R Windows installer from CRAN or, better, from a CRAN mirror site near you; then double-click on the installer to install R as you would any Windows software. You can subsequently download and install only those packages that you want over the Internet from CRAN, via the Packages -> Install packages from CRAN menu in the RGui console.

Mac Users

A universal binary for Mac OS X 10.5 and higher is available from CRAN or, better, from a CRAN mirror site near you. Double-click on the downloaded file to install R. You can then download and install packages over the Internet via the Packages & Data -> Packages Installer menu in the R.app or R64.app console.

Linux/Unix Users

Precompiled binaries for popular Linux systems are available from CRAN (or, better, from a CRAN mirror site near you), or users can compile R from source. See CRAN for details.

RStudio

RStudio is a free, open-source integrated development environment (IDE) for R that installs easily on Windows, Mac OS X, and Linux systems and works well out of the box. Though still under active development, RStudio in my opinion provides a better interface to R than the standard Windows and Mac OS X interfaces. Among the many services that it provides, RStudio includes a package manager that will allow you to install packages conveniently.

Installing the car Package

For this course, you'll want to install the car package associated with the R Companion to Applied Regression; use the command install.packages("car"), or install via the menus in the Windows or Mac OS X versions of R or via the Packages tab in RStudio.

Selected Bibliography

Publishers of statistical texts have been producing a steady stream of books on R. Of particular note is Springer's Use R! series of brief paperbacks on various R-related topics, several titles of which I've listed below. Recently, Chapman and Hall, which has published a number of books on R, has also announced The R Series.

Basic Texts

The principal source for this workshop is J. Fox and S. Weisberg, An R Companion to Applied Regression, Second Edition, Sage (2011). Additional materials are available on the web site for the book, including several appendices (on structural-equation models, mixed models, survival analysis, etc.). The book is associated with the car package for R. I am a member of the R Foundation.

Alternatively (or additionally), more advanced students may wish to use W. N. Venables and B. D. Ripley, Modern Applied Statistics with S as a principal source. Bill Venables is a member of the R Foundation, and Brian Ripley is a member of the R Core group.

Manuals

R is distributed with a set of manuals, which are also available at the CRAN web site. A manual for S-PLUS Trellis Graphics (also useful for the lattice package in R) is also available on the web.

Programming in R
R. A. Becker, J. M. Chambers, and A. R. Wilks, The New S Language: A Programming Environment for Data Analysis and Statistics. Pacific Grove, CA: Wadsworth. Defines S Version 2, which forms the basis of S Versions 3 and 4, as well as R. (Sometimes called the "Blue Book.")

J. M. Chambers, Programming with Data: A Guide to the S Language. New York: Springer. Describes the then-new features in S Version 4, including the newer formal object-oriented programming system (also incorporated in R), by the principal designer of the S language and a member of the R Core group of developers. Not an easy read. (The "Green Book.")

J. M. Chambers, Software for Data Analysis: Programming with R. New York: Springer. Chambers's newest book ranges quite widely, and emphasizes a deep understanding of the R language, along with object-oriented programming, and links between R and other software. Some topics are unusual, such as processing text data in R.

J. M. Chambers and T. J. Hastie, eds., Statistical Models in S. Pacific Grove, CA: Wadsworth. An edited volume describing the statistical modeling capabilities in S, Versions 3 and 4, and R, and the object-oriented programming system used in S Version 3 and R (and available, for backwards compatibility, in S Version 4). In addition, the text covers S software for particular kinds of statistical models, including linear models, nonlinear models, generalized linear models, local-polynomial regression models, and generalized additive models. (The "White Book.")

R. Gentleman, R Programming for Bioinformatics. Boca Raton: Chapman and Hall. A thorough, though at points relatively difficult, treatment of programming in R, by one of the original co-developers of R and a founder of the related Bioconductor Project (which develops computing tools for the analysis of genomic data). Don't let the title fool you: Most of the book is of general interest to R programmers.

R. Ihaka and R. Gentleman, R: A language for data analysis and graphics. Journal of Computational and Graphical Statistics, 5. The original published description of the R project, now quite out of date but still worth looking at.

W. N. Venables and B. D. Ripley, S Programming. New York: Springer. A companion volume to Modern Applied Statistics with S, and at the time of its publication the definitive treatment of writing software in the various versions of S-PLUS and R; now somewhat dated, particularly with respect to R. Brian Ripley is a member of the R Core group of developers, and Bill Venables is a member of the R Foundation.

Statistical Computing in R

The following three books treat traditional topics in statistical computing, such as optimization, simulation, probability calculations, and computational linear algebra, using R (although the coverage of particular topics in the books differs). All offer introductions to R programming. Of these books, Braun and Murdoch is the briefest and most accessible.

W. J. Braun and D. J. Murdoch, A First Course in Statistical Programming with R. Cambridge: Cambridge University Press. Duncan Murdoch is a member of the R Core group of developers.

O. Jones, R. Maillardet, and A. Robinson, Introduction to Scientific Programming and Simulation Using R. Boca Raton: Chapman and Hall.

M. L. Rizzo, Statistical Computing with R. Boca Raton: Chapman and Hall.

Graphics in R

P. Murrell, R Graphics, Second Edition. New York: Chapman and Hall. A tour de force: the definitive reference on traditional R graphics and on the grid graphics system on which lattice graphics (the R implementation of William Cleveland's Trellis graphics) is built. R code to produce the figures in the book is on Murrell's web site. Paul Murrell is a member of the R Core group of developers.

P. Murrell and R. Ihaka, An approach to providing mathematical annotation in plots. Journal of Computational and Graphical Statistics, 9. One of the unusual and very useful features of R graphics is the ability to include mathematical notation. This article explains how. Paul Murrell and Ross Ihaka are both members of the R Core group.

D. Sarkar, Lattice: Multivariate Data Visualization with R. New York: Springer. Deepayan Sarkar is the developer of the powerful lattice package in R, which implements Trellis graphics. This book provides a fine introduction to and overview of lattice graphics. Figures from the book and the R code to produce them are available on the web. Deepayan Sarkar is a member of the R Core group of developers.

H. Wickham, ggplot2: Elegant Graphics for Data Analysis. New York: Springer, 2009. A guide to Hadley Wickham's ggplot2 package, which provides an alternative graphics system for R based on an extension of Wilkinson's The Grammar of Graphics (Second Edition, Springer, 2005), which, in turn, provides a systematic basis for constructing statistical graphs.
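As a rough illustration of the mathematical annotation facility that the Murrell and Ihaka article describes, the sketch below uses only base R graphics. The particular curve, labels, and the variable name densityLabel are assumptions made for this example; they are not drawn from the article or from the workshop materials.

```r
# Evaluate the standard normal density on a grid
x <- seq(-3, 3, length.out = 200)

# expression() creates an unevaluated "plotmath" expression, which the
# graphics functions typeset as mathematics instead of printing as text
densityLabel <- expression(phi(x) == frac(1, sqrt(2 * pi)) * e^{-x^2 / 2})

# Axis labels and the title can all be plotmath expressions
plot(x, dnorm(x), type = "l",
     xlab = expression(x), ylab = expression(phi(x)),
     main = densityLabel)
```

The help page ?plotmath lists the available notation, and demo(plotmath) displays it.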
Data Management

P. Spector, Data Manipulation with R. New York: Springer. Data management is a dry subject, but the ability to carry it out is vital to the effective day-to-day use of R (or of any statistical software). Spector provides a reasonably broad and clear introduction to the subject.
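For a sense of what routine data management looks like in R, here is a small sketch in base R. The two data frames are made-up values for illustration only, and the example is not taken from Spector's book.

```r
# Two small, made-up data frames (hypothetical data for illustration only)
people <- data.frame(id = 1:4, group = c("a", "b", "a", "b"))
scores <- data.frame(id = c(1, 2, 2, 4), score = c(10, 7, 9, 3))

# Merge on the shared key; all.x = TRUE keeps people with no score (id 3),
# filling their score with NA
merged <- merge(people, scores, by = "id", all.x = TRUE)

# Mean score by group; the formula interface drops rows with missing values
groupMeans <- aggregate(score ~ group, data = merged, FUN = mean)
groupMeans
```

Keeping the unmatched case in the merged data and dropping it only at the summary step is one reasonable choice among several; which is right depends on the analysis.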
(Highly) Selected Statistical Methods Programmed in R

Also see the package listing on CRAN and the various CRAN task views.

R. S. Bivand, E. J. Pebesma, and V. Gómez-Rubio, Applied Spatial Data Analysis with R. New York: Springer. There is a strong community of researchers in spatial statistics developing R software, much of which is described in this book, including the basic sp package, which provides R classes for spatial data. Roger Bivand is a member of the R Foundation.

A. W. Bowman and A. Azzalini, Applied Smoothing Techniques for Data Analysis: The Kernel Approach with S-Plus Illustrations. Oxford: Oxford University Press. A good introduction to nonparametric density estimation and nonparametric regression, associated with the sm package (for both S-PLUS and R).

A. C. Davison and D. V. Hinkley, Bootstrap Methods and their Application. Cambridge: Cambridge University Press. A comprehensive introduction to bootstrap resampling, associated with the boot package (written by A. J. Canty). Somewhat more difficult than Efron and Tibshirani (immediately below).

B. Efron and R. J. Tibshirani, An Introduction to the Bootstrap. London: Chapman and Hall. Another extensive treatment of bootstrapping by its originator (Efron), also accompanied by an R package, bootstrap (but somewhat less usable than boot).

A. Gelman and J. Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge: Cambridge University Press. A wide-ranging yet deep treatment of hierarchical models and various related topics, predominantly but not exclusively from a Bayesian perspective, using both R and BUGS software.

F. E. Harrell, Jr., Regression Modeling Strategies, With Applications to Linear Models, Logistic Regression, and Survival Analysis. New York: Springer. Describes an interesting approach to statistical modeling, with frequent references to Harrell's Hmisc and Design packages.

T. J. Hastie and R. J. Tibshirani, Generalized Additive Models. London: Chapman and Hall. An accessible treatment of generalized additive models, as implemented in the gam package, and of nonparametric regression analysis in general. [The gam function in the mgcv package in R takes a somewhat different approach; see Wood (2000), below.]

R. Koenker, Quantile Regression. Cambridge: Cambridge University Press. Describes a variety of methods for quantile regression by the leading figure in the area. The methods are implemented in Koenker's quantreg package for R.
C. Loader, Local Likelihood and Regression. New York: Springer. Another text on nonparametric regression and density estimation, using the locfit package. Although the text is less readable than Bowman and Azzalini, the locfit software is very capable.

T. Lumley, Complex Surveys: A Guide to Analysis Using R. Hoboken, NJ: Wiley. A lucid introduction to the analysis of data from complex survey samples and to Lumley's highly capable survey package. Thomas Lumley is a member of the R Core group of developers.

G. P. Nason, Wavelet Methods in Statistics with R. New York: Springer. Describes the wavethresh package for wavelet smoothing, by one of the key figures in the development of wavelet methods in statistics.

J. C. Pinheiro and D. M. Bates, Mixed-Effects Models in S and S-PLUS. New York: Springer. An extensive treatment of linear and nonlinear mixed-effects models in S, focused on the authors' nlme package. Mixed models are appropriate for various kinds of non-independent (clustered) data, including hierarchical and longitudinal data. Does not cover Bates's newer lme4 package. Doug Bates is a member of the R Core group of developers.

T. M. Therneau and P. M. Grambsch, Modeling Survival Data: Extending the Cox Model. New York: Springer. An overview of both basic and advanced methods of survival analysis (event-history analysis), with reference to S and SAS software, the former implemented in Therneau's state-of-the-art survival package.

S. van Buuren, Flexible Imputation of Missing Data. Boca Raton, FL: CRC Press. There are several packages in R for multiple imputation of missing data; this book largely describes the mice (multiple imputation by chained equations) package.

W. N. Venables and B. D. Ripley, Modern Applied Statistics with S, Fourth Edition. New York: Springer. An influential and wide-ranging treatment of data analysis using S. Many of the facilities described in the book are programmed in the associated (and indispensable) MASS, nnet, and spatial packages, which are included in the standard R distribution. This text is more advanced and has a broader focus than the R Companion. Brian Ripley is a member of the R Core group of developers.

S. N. Wood, Generalized Additive Models: An Introduction with R. New York: Chapman and Hall. Describes the mgcv package in R, which contains a gam function for fitting generalized additive models based on smoothing splines. The initials mgcv stand for multiple generalized cross validation, the method by which Wood selects GAM smoothing parameters.

Other Sources (Some Free)

See the publications list on the R web site. The R Journal, the journal of the R Project for Statistical Computing, and its predecessor R News are also good sources of information, as is the Journal of Statistical Software, an on-line American Statistical Association journal dominated by coverage of R packages.
Hierarchical Mixture Models for Nested Data Structures Jeroen K. Vermunt 1 and Jay Magidson 2 1 Department of Methodology and Statistics, Tilburg University, PO Box 90153, 5000 LE Tilburg, Netherlands
More informationMAT128A: Numerical Analysis Lecture One: Course Logistics and What is Numerical Analysis?
MAT128A: Numerical Analysis Lecture One: Course Logistics and What is Numerical Analysis? September 26, 2018 Lecture 1 September 26, 2018 1 / 19 Course Logistics My contact information: James Bremer Email:
More informationSoftware for your own computer: R, RStudio, LaTeX, PsychoPy
Software for your own computer: R, RStudio, LaTeX, PsychoPy There are four software packages that you might want to install on your own computer. They will allow you to work on the various class exercises
More informationR for absolute beginners Duncan Golicher
R for absolute beginners Duncan Golicher 11/19/2008 Introduction Motivation for the course 1. Encourage researchers and students to begin using R 2. Draw on personal experience to flatten the learning
More informationSoftware for your own computer: R, RStudio, LaTeX, PsychoPy
Software for your own computer: R, RStudio, LaTeX, PsychoPy You do not need your own computer for this class. There are, however, four software packages that you might want to install on your own computer,
More informationIntroduction to R and RStudio IDE
Introduction to R and RStudio IDE Wan Nor Arifin Unit of Biostatistics and Research Methodology, Universiti Sains Malaysia. email: wnarifin@usm.my December 19, 2018 Wan Nor Arifin (USM) Introduction to
More informationIntroduction to R programming a SciLife Lab course
Introduction to R programming a SciLife Lab course 31 August 2016 What R is a programming language, a programming platform (=environment + interpreter), a software project driven by the core team and the
More information2 Installing and Updating R
2 Installing and Updating R Stata and R are somewhat similar in that both are modular. Each comes with a single binary executable file and a large number of individual functions or commands. These are
More informationIntroduction to R programming a SciLife Lab course
Introduction to R programming a SciLife Lab course 20 October 2017 What R really is? a programming language, a programming platform (= environment + interpreter), a software project driven by the core
More informationEPIB Four Lecture Overview of R
EPIB-613 - Four Lecture Overview of R R is a package with enormous capacity for complex statistical analysis. We will see only a small proportion of what it can do. The R component of EPIB-613 is divided
More informationPackage FCGR. October 13, 2015
Type Package Title Fatigue Crack Growth in Reliability Version 1.0-0 Date 2015-09-29 Package FCGR October 13, 2015 Author Antonio Meneses , Salvador Naya ,
More informationLavastorm Analytic Library Predictive and Statistical Analytics Node Pack FAQs
1.1 Introduction Lavastorm Analytic Library Predictive and Statistical Analytics Node Pack FAQs For brevity, the Lavastorm Analytics Library (LAL) Predictive and Statistical Analytics Node Pack will be
More informationPackage blocksdesign
Type Package Package blocksdesign June 12, 2018 Title Nested and Crossed Block Designs for Factorial, Fractional Factorial and Unstructured Treatment Sets Version 2.9 Date 2018-06-11" Author R. N. Edmondson.
More informationThe main topics that we will learn about during the class are
STA 141 Syllabus This class is about scientific and statistical computing. It is intended to provide you with a strong foundation in computing skills that are increasingly necessary for a practicing statistician
More informationIntro Intro.3
Intro.1 Intro.2 Introduction to R Much of the content here is from Appendix A of my Analysis of Categorical Data with R book (www.chrisbilder.com/ categorical). All R code is available in AppendixInitialExamples.R
More informationR Primer for Introduction to Mathematical Statistics 8th Edition Joseph W. McKean
R Primer for Introduction to Mathematical Statistics 8th Edition Joseph W. McKean Copyright 2017 by Joseph W. McKean at Western Michigan University. All rights reserved. Reproduction or translation of
More informationCategorical explanatory variables
Hutcheson, G. D. (2011). Tutorial: Categorical Explanatory Variables. Journal of Modelling in Management. 6, 2: 225 236. NOTE: this is a slightly updated version of this paper which is distributed to correct
More informationIntroduction to R programming a SciLife Lab course
Introduction to R programming a SciLife Lab course 22 March 2017 What R really is? a programming language, a programming platform (= environment + interpreter), a software project driven by the core team
More informationMultiple-imputation analysis using Stata s mi command
Multiple-imputation analysis using Stata s mi command Yulia Marchenko Senior Statistician StataCorp LP 2009 UK Stata Users Group Meeting Yulia Marchenko (StataCorp) Multiple-imputation analysis using mi
More informationDecision Making Procedure: Applications of IBM SPSS Cluster Analysis and Decision Tree
World Applied Sciences Journal 21 (8): 1207-1212, 2013 ISSN 1818-4952 IDOSI Publications, 2013 DOI: 10.5829/idosi.wasj.2013.21.8.2913 Decision Making Procedure: Applications of IBM SPSS Cluster Analysis
More informationIntroduction to R Jason Huff, QB3 CGRL UC Berkeley April 15, 2016
Introduction to R Jason Huff, QB3 CGRL UC Berkeley April 15, 2016 Installing R R is constantly updated and you should download a recent version; the version when this workshop was written was 3.2.4 I also
More informationA review of spline function selection procedures in R
Matthias Schmid Department of Medical Biometry, Informatics and Epidemiology University of Bonn joint work with Aris Perperoglou on behalf of TG2 of the STRATOS Initiative September 1, 2016 Introduction
More informationMS in Applied Statistics: Study Guide for the Data Science concentration Comprehensive Examination. 1. MAT 456 Applied Regression Analysis
MS in Applied Statistics: Study Guide for the Data Science concentration Comprehensive Examination. The Part II comprehensive examination is a three-hour closed-book exam that is offered on the second
More informationData Science Bootcamp Curriculum. NYC Data Science Academy
Data Science Bootcamp Curriculum NYC Data Science Academy 100+ hours free, self-paced online course. Access to part-time in-person courses hosted at NYC campus Machine Learning with R and Python Foundations
More informationOutline. Mixed models in R using the lme4 package Part 1: Introduction to R. Following the operations on the slides
Outline Mixed models in R using the lme4 package Part 1: Introduction to R Douglas Bates University of Wisconsin - Madison and R Development Core Team UseR!2009, Rennes, France
More informationData Wrangling in the Tidyverse
Data Wrangling in the Tidyverse 21 st Century R DS Portugal Meetup, at Farfetch, Porto, Portugal April 19, 2017 Jim Porzak Data Science for Customer Insights 4/27/2017 1 Outline 1. A very quick introduction
More informationIBM SPSS Statistics and open source: A powerful combination. Let s go
and open source: A powerful combination Let s go The purpose of this paper is to demonstrate the features and capabilities provided by the integration of IBM SPSS Statistics and open source programming
More informationConverting a large R package to S4 classes and methods
DSC 2003 Working Papers (Draft Versions) http://www.ci.tuwien.ac.at/conferences/dsc-2003/ Converting a large R package to S4 classes and methods Douglas M. Bates and Saikat DebRoy Department of Statistics
More informationAnalysis of Incomplete Multivariate Data
Analysis of Incomplete Multivariate Data J. L. Schafer Department of Statistics The Pennsylvania State University USA CHAPMAN & HALL/CRC A CR.C Press Company Boca Raton London New York Washington, D.C.
More informationWeekly Discussion Sections & Readings
Weekly Discussion Sections & Readings Teaching Fellows (TA) Name Office Email Mengting Gu Bass 437 mengting.gu (at) yale.edu Paul Muir Bass437 Paul.muir (at) yale.edu Please E-mail cbb752@gersteinlab.org
More informationSTATISTICS (STAT) Statistics (STAT) 1
Statistics (STAT) 1 STATISTICS (STAT) STAT 2013 Elementary Statistics (A) Prerequisites: MATH 1483 or MATH 1513, each with a grade of "C" or better; or an acceptable placement score (see placement.okstate.edu).
More informationUsing the DATAMINE Program
6 Using the DATAMINE Program 304 Using the DATAMINE Program This chapter serves as a user s manual for the DATAMINE program, which demonstrates the algorithms presented in this book. Each menu selection
More informationSoftware for your own computer: R, RStudio, LaTeX, PsychoPy
Software for your own computer: R, RStudio, LaTeX, PsychoPy You do not need your own computer for this class. There are, however, four software packages that you might want to install on your own computer,
More informationUsing Sunflower Plots and Classification Trees to Study Typeface Legibility
CS-BIGS 2(2): 92-98 2009 CS-BIGS http://www.bentley.edu/csbigs/vol2-2/merkle.pdf Using Sunflower Plots and Classification Trees to Study Typeface Legibility Edgar C. Merkle and Barbara S. Chaparro Wichita
More informationStatistical Modeling with Spline Functions Methodology and Theory
This is page 1 Printer: Opaque this Statistical Modeling with Spline Functions Methodology and Theory Mark H Hansen University of California at Los Angeles Jianhua Z Huang University of Pennsylvania Charles
More informationLecture 27: Review. Reading: All chapters in ISLR. STATS 202: Data mining and analysis. December 6, 2017
Lecture 27: Review Reading: All chapters in ISLR. STATS 202: Data mining and analysis December 6, 2017 1 / 16 Final exam: Announcements Tuesday, December 12, 8:30-11:30 am, in the following rooms: Last
More informationSandeep Kharidhi and WenSui Liu ChoicePoint Precision Marketing
Generalized Additive Model and Applications in Direct Marketing Sandeep Kharidhi and WenSui Liu ChoicePoint Precision Marketing Abstract Logistic regression 1 has been widely used in direct marketing applications
More informationShort Introduction to R
Short Introduction to R Paulino Pérez 1 José Crossa 2 1 ColPos-México 2 CIMMyT-México June, 2015. CIMMYT, México-SAGPDB Short Introduction to R 1/51 Contents 1 Introduction 2 Simple objects 3 User defined
More information1 Overview XploRe is an interactive computational environment for statistics. The aim of XploRe is to provide a full, high-level programming language
Teaching Statistics with XploRe Marlene Muller Institute for Statistics and Econometrics, Humboldt University Berlin Spandauer Str. 1, D{10178 Berlin, Germany marlene@wiwi.hu-berlin.de, http://www.wiwi.hu-berlin.de/marlene
More information