R For Sql Developers. Kiran Math
Transcription
1 R For Sql Developers Kiran Math - kiranmath@outlook.com
2 R Figure 1: R
3 Download R
CRAN: The Comprehensive R Archive Network ~ Open Source R
MRAN: Microsoft R Open ~ Enhanced R Distribution
4 Microsoft R: Microsoft R Server, Microsoft R Client, Microsoft R Open
5 Capability Comparison
Capability | R Open | Microsoft R Open | Microsoft R Server
Data Size | In-Memory | In-Memory | In-Memory + Disk
Speed | Single-Thread | Multi-Thread, MKL | Multi-Thread, Multi-Node, MKL
Support | Community | Community | Community + Commercial
Package Support | CRAN | CRAN + MRAN | CRAN + MRAN + MSR Scale/Performance Packages
License | Open Source | Open Source | Commercial
6 R Development Tools: RStudio, R Tools for Visual Studio (RTVS)
7 Data NHANES : The National Health and Nutrition Examination Survey SEQN GENDER AGEYR BMI WAIST Male Female Male Male Female Male NA
8 R Objects: Vectors, Data Frames, Factors, Lists, Matrices, Arrays
9 Data Types
Type | Example
Logical | TRUE, FALSE
Numeric | 14.6, 50, 100
Integer | 1L, 374L, 10L
Character | SQL, Saturday
Complex | 3 + 2i
Raw | c (SQL)
10 Variables Variables are reserved memory locations to store values. The variables are assigned with R-Objects. The data type of the R-object becomes the data type of the variable.
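A minimal sketch of the point on this slide, using a hypothetical variable name `x` that is not in the original deck: the variable itself has no declared type, and `class()` reports the type of whatever R object is currently assigned to it.

```r
# Hypothetical variable `x`, for illustration only.
x <- 374L                  # assign an integer object
class_before <- class(x)   # type follows the object: "integer"

x <- "SQL Saturday"        # reassign the same name to a character object
class_after <- class(x)    # now "character"

print(class_before)
print(class_after)
```

Unlike a SQL column, whose type is fixed by the table definition, the same R name can hold objects of different types over its lifetime.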
11 Vector A vector is a sequence of data elements of the same basic type. Members in a vector are called components.
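As a small illustration of "same basic type", the sketch below (variable names are hypothetical) shows that when values of different types are combined with `c()`, R coerces them all to one common type rather than storing mixed types.

```r
# Hypothetical names; each vector ends up with exactly one element type.
mixed_lgl <- c(TRUE, 2L)        # logical + integer   -> integer
mixed_num <- c(1L, 2.5)         # integer + double    -> numeric
mixed_chr <- c(1L, 2.5, "a")    # anything + character -> character

print(class(mixed_lgl))
print(class(mixed_num))
print(class(mixed_chr))
print(mixed_chr)                # every component is now a string
```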
12 Character Vector
gender <- c("Male", "Female", "Male", "Male", "Female", "Male")
print(gender)
## [1] "Male"   "Female" "Male"   "Male"   "Female" "Male"
class(gender)
## [1] "character"
13 Numeric Vector
AGEYR <- c(55L, 52L, 63L, 83L, 37L, NA)
print(AGEYR)
## [1] 55 52 63 83 37 NA
class(AGEYR)
## [1] "integer"
14 Numeric Vector
bmi <- c(31.26, 25.49, 19.60, 28.32, 19.34, 16.57)
print(bmi)
## [1] 31.26 25.49 19.60 28.32 19.34 16.57
class(bmi)
## [1] "numeric"
15 Numeric Vector
SEQN < :21014
print(SEQN)
## [1]
class(SEQN)
## [1] "integer"
16 Factor
Variables that take a limited number of different values (categorical variables), either nominal or ordinal.
Data is stored as an integer vector; each unique character value is stored once.
17 Factor
fac_gender <- factor(gender)
str(fac_gender)
##  Factor w/ 2 levels "Female","Male": 2 1 2 2 1 2
print(fac_gender)
## [1] Male   Female Male   Male   Female Male
## Levels: Female Male
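The storage model described above (integer codes plus one copy of each unique label) can be inspected directly. This is a minimal sketch that rebuilds the `gender` vector from the slides; `codes` and `lv` are hypothetical names introduced here.

```r
gender <- c("Male", "Female", "Male", "Male", "Female", "Male")
fac_gender <- factor(gender)

codes <- as.integer(fac_gender)  # the integer vector actually stored
lv <- levels(fac_gender)         # each unique label, stored once, sorted

print(codes)
print(lv)
print(lv[codes])                 # looking the codes up recovers the labels
```

For a SQL developer this is close to a lookup table: the factor holds foreign keys, and `levels()` is the dimension table.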
18 BMI Classification
Group | BMI Range
Underweight | =< 18.4
Normal | > 18.4 to 24.9
Overweight | > 24.9 to 29.9
Obesity | > 29.9 to 39.9
X Obesity | > 39.9
bgroup <- c("Under", "Normal", "Over", "Obese", "Extra")
brange <- c(0, 18.4, 24.9, 29.9, 39.9, 100)
19 Ordered Factor
fac_bmi <- cut(bmi, breaks = brange, labels = bgroup, ordered_result = TRUE)
print(bmi)
## [1] 31.26 25.49 19.60 28.32 19.34 16.57
print(fac_bmi)
## [1] Obese  Over   Normal Over   Normal Under
## Levels: Under < Normal < Over < Obese < Extra
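One practical consequence of an ordered factor is that its levels can be compared. The sketch below repeats the slide's vectors so it runs on its own; `over_or_above` and `highest` are hypothetical names added for illustration.

```r
bmi <- c(31.26, 25.49, 19.60, 28.32, 19.34, 16.57)
bgroup <- c("Under", "Normal", "Over", "Obese", "Extra")
brange <- c(0, 18.4, 24.9, 29.9, 39.9, 100)
fac_bmi <- cut(bmi, breaks = brange, labels = bgroup, ordered_result = TRUE)

# Comparisons use the level order (Under < Normal < ...), not alphabetical order.
over_or_above <- fac_bmi >= "Over"
highest <- max(fac_bmi)

print(over_or_above)
print(as.character(highest))
```

With an unordered (nominal) factor such as `fac_gender`, the same `>=` comparison is not meaningful and R returns `NA` with a warning.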
20 Data Frame
dfbmi <- data.frame(
  id = SEQN,
  gender = fac_gender,
  age = AGEYR,
  bmi = fac_bmi
)
id gender age bmi Male 55 Obese Female 52 Over Male 63 Normal Male 83 Over Female 37 Normal Male NA Under
21 Data Science Life Cycle Figure 2: R
22 Programming Paradigm
Imperative: Object-Oriented, Procedural
Declarative: Functional, Logical
23 Programming Paradigm
Algorithm = Logic + Control
Logic: what must be done
Control: how the desired solution is found
Imperative: both are needed
Declarative: only logic is needed
24 Imperative Programming
num <- c(1:10)
print(num)
##  [1]  1  2  3  4  5  6  7  8  9 10
total <- 0
for (i in num) {
  total <- total + i
}
print(total)
## [1] 55
25 Declarative Programming
num <- c(1:10)
print(num)
##  [1]  1  2  3  4  5  6  7  8  9 10
total <- sum(num)
print(total)
## [1] 55
26 Functional Programming
R, at its heart, is a functional programming language. It provides many tools for the creation and manipulation of functions.
First-class functions:
Assign functions to variables
Store functions in lists
Pass functions as arguments to other functions
Create functions inside functions
Return functions as the result of a function
27 R Function add <- function(x,y) { return (x + y) }
28 R Function total <- add(1, 2) print(total) ## [1] 3
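The first-class function properties listed on the Functional Programming slide can each be shown in a line or two. This is only an illustrative sketch; every name in it (`square`, `ops`, `apply_twice`, `make_adder`, `add5`) is hypothetical and not from the original deck.

```r
# Assign a function to a variable.
square <- function(x) x^2

# Store functions in a list.
ops <- list(sq = square, neg = function(x) -x)

# Pass a function as an argument to another function.
apply_twice <- function(f, x) f(f(x))

# Create a function inside a function and return it as the result.
make_adder <- function(n) {
  function(x) x + n
}
add5 <- make_adder(5)

r_list  <- ops$sq(3)              # call a function stored in a list
r_twice <- apply_twice(square, 2) # square(square(2))
r_adder <- add5(10)               # the returned function remembers n = 5

print(c(r_list, r_twice, r_adder))
```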
29 Use Case Import CSV Data into R Using Open R
30 Import CSV file into R
BMIFile <- "D:\\Presentation\\SpartanburgR2017\\Data\\BMI.c
dfbmi <- read.csv(BMIFile)
SEQN GENDER AGEYR BMI WAIST Male Female Female Male Male Female
31 Filter Data Using Package dplyr
library(dplyr)
df16plus <- dfbmi %>% filter(ageyr >= 18 & !is.na(bmi))
SEQN GENDER AGEYR BMI WAIST Male Male Female Male Male Female
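For readers coming from SQL, the filter above plays the role of `WHERE ageyr >= 18 AND bmi IS NOT NULL`. The sketch below shows the same row selection in base R, without dplyr, on a small made-up data frame; `df_demo`, `adults`, and its values are hypothetical and are not the NHANES data.

```r
# Hypothetical stand-in data, for illustration only.
df_demo <- data.frame(
  ageyr = c(55L, 12L, 63L, 37L, 40L),
  bmi   = c(31.26, 17.80, NA, 19.34, 25.49)
)

# Base R equivalent of the dplyr filter: keep adults with a recorded BMI.
adults <- subset(df_demo, ageyr >= 18 & !is.na(bmi))

print(adults)
print(nrow(adults))
```

`is.na()` is R's counterpart to `IS NULL`; as in SQL, comparing a missing value with `==` does not return TRUE, so the explicit test is needed.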
32 Discretize BMI Function
dbmi <- function(vbmi) {
  bgroup <- c("Under", "Normal", "Over", "Obese", "Extra")
  brange <- c(0, 18.4, 24.9, 29.9, 39.9, 100)
  fac_bmi <- cut(vbmi, breaks = brange, labels = bgroup, ordered_result = TRUE)
  return(fac_bmi)
}
vsample <- c(25, 40, 19)
dbmi(vsample)
## [1] Over   Extra  Normal
33 Add New Column to Data Frame
df <- df16plus %>% mutate(bmigroup = dbmi(bmi))
SEQN GENDER AGEYR BMI WAIST BMIGroup Male Extra Male Obese Female Over Male Normal Male Over Female Normal
34 Visualize Data : Frequency
hplot <- hist(df$bmi, xlab = "bmi", border = "blue", col = "green")
abline(v = 24.9, col = "red", lty = 3)
Figure: Histogram of df$bmi (y-axis: Frequency)
35 Visualize Data : Frequency
library(ggplot2)
ggplot(df, aes(bmigroup)) + geom_bar()
Figure: bar chart of bmigroup (y-axis: count)
36 CSV To Sql Table : Variables
library(RevoScaleR)
sqldesttable <- "tblbmi"
sqlrowsperread <
sqlconnstring <- hidden
BMIFile <- "D:\\Presentation\\SpartanburgR2017\\Data\\BMI.c
datcol <- c("SEQN" = "integer", "GENDER" = "factor", "AGEYR" = "integer", "BMI" = "numeric", "WAIST" = "numeric")
37 CSV To Sql Table
des <- RxSqlServerData(connectionString = sqlconnstring, table = sqldesttable, rowsPerRead = sqlrowsperread)
indat <- RxTextData(file = BMIFile, colClasses = datcol)
rxDataStep(inData = indat, outFile = des, overwrite = TRUE)
## Rows Read: 9643, Total Rows Processed: 9643, Total Chunk
##
## Elapsed time to compute low/high values and/or factor le
##
## Rows Read: 9643, Total Rows Processed: 9643
## Total Rows written: 9643, Total time: ##, Total Chunk Time: seconds
38 tblbmi SELECT TOP 6 * FROM tblbmi; SEQN GENDER AGEYR BMI WAIST Male Female Female Male Male Female
39 SQL Server Scripts With R EXEC = N = N R = N SQL = N = N = N Script Block Parameter WITH RESULT SETS (( Return Columns ))
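The R script block in this pattern can be tried outside SQL Server. By default, `sp_execute_external_script` hands the query result to R as a data frame named `InputDataSet` and returns whatever the script assigns to `OutputDataSet`. The sketch below simulates that locally with made-up rows; the column names and values are assumptions for illustration, and it is not the code of the demo stored procedures.

```r
# Stand-in for the rows the input SQL query would supply (hypothetical values).
InputDataSet <- data.frame(SEQN = 1:3, BMI = c(17.5, 23.0, 41.2))

# Body of the script block: classify each BMI with the breaks from the slides.
bgroup <- c("Under", "Normal", "Over", "Obese", "Extra")
brange <- c(0, 18.4, 24.9, 29.9, 39.9, 100)

# The result SQL Server would return; its columns are what
# WITH RESULT SETS describes on the T-SQL side.
OutputDataSet <- data.frame(
  SEQN = InputDataSet$SEQN,
  BMIGroup = as.character(cut(InputDataSet$BMI, breaks = brange, labels = bgroup))
)

print(OutputDataSet)
```

Developing the script against a local data frame first, then pasting it into the script block, keeps the T-SQL wrapper thin and makes the R logic easier to debug.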
40 Demo GetBMIGroup.StoredProcedure GetBMIGroupWithParameter.StoredProcedure GetBMIGroupWithParameterExternalFile.StoredProcedure
41 Thank you Figure 3: Thank you Kiran Math Greenville SC