R For Sql Developers. Kiran Math

1 R For Sql Developers Kiran Math - kiranmath@outlook.com

2 R Figure 1: R

3 Download R
    CRAN: The Comprehensive R Archive Network ~ Open Source R
    MRAN: Microsoft R Open ~ Enhanced R Distribution

4 Microsoft R
    Microsoft R Server
    Microsoft R Client
    Microsoft R Open

5 Capability Comparison
    Capability      | R Open        | Microsoft R Open  | Microsoft R Server
    Data Size       | In-Memory     | In-Memory         | In-Memory + Disk
    Speed           | Single-Thread | Multi-Thread, MKL | Multi-Thread, Multi-Node, MKL
    Support         | Community     | Community         | Community + Commercial
    Package Support | CRAN          | CRAN + MRAN       | CRAN + MRAN + MSR Scale/Performance Packages
    License         | Open Source   | Open Source       | Commercial

6 R Development Tools
    RStudio
    R Tools for Visual Studio (RTVS)

7 Data
    NHANES: The National Health and Nutrition Examination Survey
    SEQN  GENDER  AGEYR  BMI  WAIST
          Male
          Female
          Male
          Male
          Female
          Male    NA

8 R Objects
    Vectors
    Data Frames
    Factors
    Lists
    Matrices
    Arrays

9 Data Types
    Type       Example
    Logical    TRUE, FALSE
    Numeric    14.6, 50, 100
    Integer    1L, 374L, 10L
    Character  "SQL", "Saturday"
    Complex    3 + 2i
    Raw        charToRaw("SQL")
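The type table above can be checked directly in an R session. The sketch below is illustrative only: it builds one value per basic type and asks class() what R calls it. The raw example assumes charToRaw() as the way to produce a raw value; the list name examples is made up for this sketch.

```r
# One value per basic type; class() reports the type R assigns to each.
examples <- list(
  logical   = TRUE,
  numeric   = 14.6,
  integer   = 374L,              # the L suffix marks an integer literal
  character = "SQL",
  complex   = 3 + 2i,
  raw       = charToRaw("SQL")   # assumption: raw bytes built from a string
)

# vapply returns a named character vector of class names.
type_names <- vapply(examples, class, character(1))
print(type_names)
```

Note that a number written without the L suffix (for example 50) is stored as numeric, not integer.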

10 Variables
    Variables are named memory locations that store values.
    A variable is assigned an R object.
    The data type of that R object becomes the data type of the variable.
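A minimal sketch of the point that a variable takes its type from the object assigned to it. The variable name x and the values are arbitrary choices for illustration.

```r
x <- 42L                  # x refers to an integer object
class_before <- class(x)

x <- "forty-two"          # rebinding x to a character object changes its type
class_after <- class(x)

print(c(class_before, class_after))
```

Unlike a SQL column, an R variable has no declared type of its own; the type always follows the current value.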

11 Vector A vector is a sequence of data elements of the same basic type. Members in a vector are called components.
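Because every component of a vector must share one basic type, R coerces mixed input to a common type when the vector is built. The following sketch, with made-up values, shows that coercion.

```r
# Integer and double together: the result is numeric (double).
nums <- c(1L, 2.5)
print(class(nums))

# Adding a string forces every component to character.
mixed <- c(1L, 2.5, "three")
print(class(mixed))
print(mixed)
```

If components of different types must be kept as they are, a list is the appropriate object instead of a vector.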

12 Character Vector
    GENDER <- c("Male", "Female", "Male", "Male", "Female", "Male")
    print(GENDER)
    ## [1] "Male" "Female" "Male" "Male" "Female" "Male"
    class(GENDER)
    ## [1] "character"

13 Numeric Vector
    AGEYR <- c(55L, 52L, 63L, 83L, 37L, NA)
    print(AGEYR)
    ## [1] 55 52 63 83 37 NA
    class(AGEYR)
    ## [1] "integer"

14 Numeric Vector
    BMI <- c(31.26, 25.49, 19.60, 28.32, 19.34, 16.57)
    print(BMI)
    ## [1] 31.26 25.49 19.60 28.32 19.34 16.57
    class(BMI)
    ## [1] "numeric"

15 Numeric Vector
    SEQN <- ...:21014    # first value of the range was lost in transcription
    print(SEQN)
    ## [1]
    class(SEQN)
    ## [1] "integer"

16 Factor
    Variables that take a limited number of different values (categorical variables)
    Nominal or ordinal
    Data is stored as an integer vector
    Each unique character value is stored once
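The storage model described above can be inspected directly. This sketch uses a small made-up vector; as.integer() exposes the integer codes and levels() exposes the unique character values that are stored once.

```r
gender <- c("Male", "Female", "Male")
f <- factor(gender)

codes <- as.integer(f)    # integer codes, one per element
lv    <- levels(f)        # unique values, sorted alphabetically by default

# Indexing the levels by the codes rebuilds the original character vector.
restored <- lv[codes]

print(codes)
print(lv)
print(restored)
```

This is similar in spirit to a lookup table with an integer foreign key: the codes are the keys and the levels are the lookup rows.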

17 Factor
    fac_gender <- factor(GENDER)
    str(fac_gender)
    ## Factor w/ 2 levels "Female","Male": 2 1 2 2 1 2
    print(fac_gender)
    ## [1] Male Female Male Male Female Male
    ## Levels: Female Male

18 BMI Classification
    Group        BMI Range
    Underweight  <= 18.4
    Normal       > 18.4 to 24.9
    Overweight   > 24.9 to 29.9
    Obesity      > 29.9 to 39.9
    X Obesity    > 39.9
    bgroup <- c("Under", "Normal", "Over", "Obese", "Extra")
    brange <- c(0, 18.4, 24.9, 29.9, 39.9, 100)

19 Ordered Factor
    fac_bmi <- cut(BMI, breaks = brange, labels = bgroup, ordered_result = TRUE)
    print(BMI)
    ## [1] 31.26 25.49 19.60 28.32 19.34 16.57
    print(fac_bmi)
    ## [1] Obese Over Normal Over Normal Under
    ## Levels: Under < Normal < Over < Obese < Extra

20 Data Frame
    dfbmi <- data.frame(
      id = SEQN,
      gender = fac_gender,
      age = AGEYR,
      bmi = fac_bmi
    )
    id  gender  age  bmi
        Male    55   Obese
        Female  52   Over
        Male    63   Normal
        Male    83   Over
        Female  37   Normal
        Male    NA   Under
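For readers coming from SQL, a data frame behaves much like a table: rows can be filtered and columns selected. The sketch below uses a small hypothetical data frame named people (not the NHANES data) and base R's subset(), which plays a role loosely comparable to SELECT ... WHERE.

```r
# Hypothetical sample data; the names and values are made up for illustration.
people <- data.frame(
  id     = 1:4,
  gender = factor(c("Male", "Female", "Male", "Female")),
  age    = c(55L, 52L, 63L, NA)
)

# Roughly: SELECT id, age FROM people WHERE age > 53
# subset() drops rows where the condition evaluates to NA.
older <- subset(people, age > 53, select = c(id, age))
print(older)
```

The row with a missing age is excluded, which matches how a SQL WHERE clause treats NULL comparisons.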

21 Data Science Life Cycle Figure 2: R

22 Programming Paradigm
    Imperative: Object-Oriented, Procedural
    Declarative: Functional, Logical

23 Programming Paradigm
    Algorithm = Logic + Control
    Logic: what must be done
    Control: how the desired solution is found
    Imperative: both are needed
    Declarative: only logic is needed

24 Imperative Programming
    num <- c(1:10)
    print(num)
    ## [1]  1  2  3  4  5  6  7  8  9 10
    total <- 0
    for (i in num) {
      total <- total + i
    }
    print(total)
    ## [1] 55

25 Declarative Programming
    num <- c(1:10)
    print(num)
    ## [1]  1  2  3  4  5  6  7  8  9 10
    total <- sum(num)
    print(total)
    ## [1] 55

26 Functional Programming
    R, at its heart, is a functional programming language.
    It provides many tools for the creation and manipulation of functions (first-class functions):
    Assign functions to variables
    Store functions in a list
    Pass functions as arguments to other functions
    Create functions inside functions
    Return a function as the result of a function
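Each of the first-class function capabilities listed above can be shown in a few lines. This is an illustrative sketch; all function names here (square, ops, apply_twice, make_adder, add5) are made up for the example.

```r
# Assign a function to a variable.
square <- function(x) x^2

# Store functions in a list.
ops <- list(square = square, negate = function(x) -x)

# Pass a function as an argument to another function.
apply_twice <- function(f, x) f(f(x))

# Create a function inside a function and return it (a closure over n).
make_adder <- function(n) {
  function(x) x + n
}
add5 <- make_adder(5)

results <- c(ops$negate(3), apply_twice(square, 3), add5(10))
print(results)
```

The returned function add5 remembers the value of n from the call that created it, which is what makes function factories like make_adder useful.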

27 R Function
    add <- function(x, y) {
      return(x + y)
    }

28 R Function
    total <- add(1, 2)
    print(total)
    ## [1] 3

29 Use Case Import CSV Data into R Using Open R

30 Import CSV file into R
    BMIFile <- "D:\\Presentation\\SpartanburgR2017\\Data\\BMI.c
    dfbmi <- read.csv(BMIFile)
    SEQN  GENDER  AGEYR  BMI  WAIST
          Male
          Female
          Female
          Male
          Male
          Female

31 Filter Data Using Package dplyr
    library(dplyr)
    df16plus <- dfbmi %>% filter(AGEYR >= 18 & !is.na(BMI))
    SEQN  GENDER  AGEYR  BMI  WAIST
          Male
          Male
          Female
          Male
          Male
          Female

32 Discretize BMI Function
    dbmi <- function(vbmi) {
      bgroup <- c("Under", "Normal", "Over", "Obese", "Extra")
      brange <- c(0, 18.4, 24.9, 29.9, 39.9, 100)
      fac_bmi <- cut(vbmi, breaks = brange, labels = bgroup, ordered_result = TRUE)
      return(fac_bmi)
    }
    vsample <- c(25, 40, 19)
    dbmi(vsample)
    ## [1] Over Extra Normal

33 Add New Column to Data Frame
    df <- df16plus %>% mutate(BMIGroup = dbmi(BMI))
    SEQN  GENDER  AGEYR  BMI  WAIST  BMIGroup
          Male                       Extra
          Male                       Obese
          Female                     Over
          Male                       Normal
          Male                       Over
          Female                     Normal

34 Visualize Data : Frequency
    hplot <- hist(df$BMI, xlab = "BMI", border = "blue", col = "green")
    abline(v = 24.9, col = "red", lty = 3)
    [Figure: Histogram of df$BMI; y-axis: Frequency]

35 Visualize Data : Frequency
    library(ggplot2)
    ggplot(df, aes(BMIGroup)) + geom_bar()
    [Figure: bar chart of BMIGroup; y-axis: count]

36 CSV To Sql Table : Variables
    library(RevoScaleR)
    sqldesttable <- "tblbmi"
    sqlrowsperread <-    # value lost in transcription
    sqlconnstring <- hidden
    BMIFile <- "D:\\Presentation\\SpartanburgR2017\\Data\\BMI.c
    datcol <- c(
      "SEQN" = "integer",
      "GENDER" = "factor",
      "AGEYR" = "integer",
      "BMI" = "numeric",
      "WAIST" = "numeric")

37 CSV To Sql Table
    des <- RxSqlServerData(connectionString = sqlconnstring,
                           table = sqldesttable,
                           rowsPerRead = sqlrowsperread)
    indat <- RxTextData(file = BMIFile, colClasses = datcol)
    rxDataStep(inData = indat, outFile = des, overwrite = TRUE)
    ## Rows Read: 9643, Total Rows Processed: 9643, Total Chunk
    ##
    ## Elapsed time to compute low/high values and/or factor le
    ##
    ## Rows Read: 9643, Total Rows Processed: 9643
    ## Total Rows written: 9643, Total time: ##, Total Chunk Time: seconds

38 tblbmi
    SELECT TOP 6 * FROM tblbmi;
    SEQN  GENDER  AGEYR  BMI  WAIST
          Male
          Female
          Female
          Male
          Male
          Female

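39 SQL Server Scripts With R
    (Argument names below are reconstructed from the standard sp_execute_external_script signature; the transcription kept only the "= N" fragments.)
    EXEC sp_execute_external_script
        @language = N'R',
        @script = N'<R script block>',
        @input_data_1 = N'<SQL query>',
        ... three further N'...' arguments, including the parameter definitions
    WITH RESULT SETS (( <return columns> ));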

40 Demo
    GetBMIGroup.StoredProcedure
    GetBMIGroupWithParameter.StoredProcedure
    GetBMIGroupWithParameterExternalFile.StoredProcedure

41 Thank you Figure 3: Thank you Kiran Math Greenville SC


BIOSTATS 640 Spring 2018 Introduction to R Data Description. 1. Start of Session. a. Preliminaries... b. Install Packages c. Attach Packages... BIOSTATS 640 Spring 2018 Introduction to R and R-Studio Data Description Page 1. Start of Session. a. Preliminaries... b. Install Packages c. Attach Packages... 2. Load R Data.. a. Load R data frames...

More information

Empirical Reasoning Center R Workshop (Summer 2016) Session 1. 1 Writing and executing code in R. 1.1 A few programming basics

Empirical Reasoning Center R Workshop (Summer 2016) Session 1. 1 Writing and executing code in R. 1.1 A few programming basics Empirical Reasoning Center R Workshop (Summer 2016) Session 1 This guide reviews the examples we will cover in today s workshop. It should be a helpful introduction to R, but for more details, the ERC

More information

Package statar. July 6, 2017

Package statar. July 6, 2017 Package statar July 6, 2017 Title Tools Inspired by 'Stata' to Manipulate Tabular Data Version 0.6.5 A set of tools inspired by 'Stata' to eplore data.frames ('summarize', 'tabulate', 'tile', 'pctile',

More information

Chuck Cartledge, PhD. 21 January 2018

Chuck Cartledge, PhD. 21 January 2018 Big Data: Data Analysis Boot Camp Serial vs. Parallel Processing Chuck Cartledge, PhD 21 January 2018 1/24 Table of contents (1 of 1) 1 Intro. 2 Amdahl A little math 3 BD Processing Programming paradigms

More information

Introduction to Minitab 1

Introduction to Minitab 1 Introduction to Minitab 1 We begin by first starting Minitab. You may choose to either 1. click on the Minitab icon in the corner of your screen 2. go to the lower left and hit Start, then from All Programs,

More information

Level 3 Computing Year 2 Lecturer: Phil Smith

Level 3 Computing Year 2 Lecturer: Phil Smith Level 3 Computing Year 2 Lecturer: Phil Smith Previously We started to build a GUI program using visual studio 2010 and vb.net. We have a form designed. We have started to write the code to provided the

More information

Package MonetDBLite. January 14, 2018

Package MonetDBLite. January 14, 2018 Version 0.5.1 Title In-Process Version of 'MonetDB' Package MonetDBLite January 14, 2018 Author Hannes Muehleisen [aut, cre], Mark Raasveldt [ctb], Thomas Lumley [ctb], MonetDB B.V. [cph], CWI [cph], The

More information

Surviving SPSS.

Surviving SPSS. Surviving SPSS http://dataservices.gmu.edu/workshops/spss http://dataservices.gmu.edu/software/spss Debby Kermer George Mason University Libraries Data Services Research Consultant Mason Data Services

More information

Machine Learning Chapter 2. Input

Machine Learning Chapter 2. Input Machine Learning Chapter 2. Input 2 Input: Concepts, instances, attributes Terminology What s a concept? Classification, association, clustering, numeric prediction What s in an example? Relations, flat

More information

The editor window is where we write our SAS programs which we will begin doing shortly.

The editor window is where we write our SAS programs which we will begin doing shortly. Introductions Overview of SAS Welcome to our SAS tutorials. This first tutorial will provide a basic overview of the SAS environment and SAS programming. We don t want you to try to follow along with this

More information

OLAP and Data Warehousing

OLAP and Data Warehousing OLAP and Data Warehousing Lab Exercises Part I OLAP Purpose: The purpose of this practical guide to data warehousing is to learn how online analytical processing (OLAP) methods and tools can be used to

More information

In this tutorial we will see some of the basic operations on data frames in R. We begin by first importing the data into an R object called train.

In this tutorial we will see some of the basic operations on data frames in R. We begin by first importing the data into an R object called train. Data frames in R In this tutorial we will see some of the basic operations on data frames in R Understand the structure Indexing Column names Add a column/row Delete a column/row Subset Summarize We will

More information

Variables: Objects in R

Variables: Objects in R Variables: Objects in R Basic R Functionality Introduction to R for Public Health Researchers Common new users frustations 1. Different versions of software 2. Data type problems (is that a string or a

More information

Subsetting, dplyr, magrittr Author: Lloyd Low; add:

Subsetting, dplyr, magrittr Author: Lloyd Low;  add: Subsetting, dplyr, magrittr Author: Lloyd Low; Email add: wai.low@adelaide.edu.au Introduction So you have got a table with data that might be a mixed of categorical, integer, numeric, etc variables? And

More information

Package MonetDB.R. March 21, 2016

Package MonetDB.R. March 21, 2016 Version 1.0.1 Title Connect MonetDB to R Package MonetDB.R March 21, 2016 Author Hannes Muehleisen [aut, cre], Anthony Damico [aut], Thomas Lumley [ctb] Maintainer Hannes Muehleisen Imports

More information

Figure 3.20: Visualize the Titanic Dataset

Figure 3.20: Visualize the Titanic Dataset 80 Chapter 3. Data Mining with Azure Machine Learning Studio Figure 3.20: Visualize the Titanic Dataset 3. After verifying the output, we will cast categorical values to the corresponding columns. To begin,

More information

You will learn: The structure of the Stata interface How to open files in Stata How to modify variable and value labels How to manipulate variables

You will learn: The structure of the Stata interface How to open files in Stata How to modify variable and value labels How to manipulate variables Jennie Murack You will learn: The structure of the Stata interface How to open files in Stata How to modify variable and value labels How to manipulate variables How to conduct basic descriptive statistics

More information

MATH 117 Statistical Methods for Management I Chapter Two

MATH 117 Statistical Methods for Management I Chapter Two Jubail University College MATH 117 Statistical Methods for Management I Chapter Two There are a wide variety of ways to summarize, organize, and present data: I. Tables 1. Distribution Table (Categorical

More information