IST 1051 Computational Tools for Statistics I
DEÜ, Department of Statistics

Course Objectives
The Computational Tools for Statistics I course deepens the understanding of statistics and shows how to reduce calculation time by using statistical software packages. These packages were originally designed to be easy-to-use systems for statistical data analysis. This course gives comprehensive information on the use of several statistical packages (Minitab, SPSS, R, etc.) in statistics.
Assessment Ratio
Homework       10%
Midterm Exam   40%
Final Exam     50%

Computational Tools for Statistics
Product       Developer      Latest release   Open source   Software licence
Minitab 18    Minitab Inc.   June 2017        No            Proprietary
IBM SPSS 25   IBM            August 2017      No            Proprietary
R 3.4.1       R Foundation   June 2017        Yes           GNU GPL
Source: Comparison of statistical packages, Wikipedia

What is Minitab?
Minitab is a general-purpose statistical program that provides a wide range of basic and advanced data analysis capabilities. It is a highly interactive statistical computing system designed especially for students and researchers who have little or no previous experience with computers.

History of Minitab
Minitab was developed at the Pennsylvania State University by researchers Barbara F. Ryan, Thomas A. Ryan, Jr., and Brian L. Joiner in 1972. Today, Minitab is often used in conjunction with the implementation of Six Sigma, CMMI, and other statistics-based process improvement methods.

Some Minitab Capabilities
o Data and File Management
o Graphics
o Basic Statistics
o Analysis of Variance
o Regression Analysis
o Multivariate Analysis
o Tables
o Time Series Analysis
o Simulation and Distributions
o Nonparametrics
o Macros
o etc.

What is SPSS?
IBM SPSS is a statistical software package. It is one of the most popular of the many packages currently available for statistical analysis. SPSS was at one time an acronym for Statistical Package for the Social Sciences, but it is now treated as just a familiar array of letters.

History of SPSS
The first version of the software was released in 1968 by SPSS Inc. On July 28, 2009, SPSS Inc. announced that it was being acquired by IBM for US$1.2 billion.

Some SPSS Capabilities
Its popularity stems from the fact that the program:
o Allows for a great deal of flexibility in the format of data
o Provides the user with a comprehensive set of procedures for data transformation and file manipulation
o Offers the researcher a large number of statistical analyses commonly used in the social sciences

What is R?
R is an integrated suite of software facilities for data manipulation, calculation, and graphical display.

History of R
R is based on the computer language S, developed by John Chambers and others at Bell Laboratories in 1976. In 1993, Robert Gentleman and Ross Ihaka at the University of Auckland wanted to experiment with the language, so they developed an implementation and named it R. R is currently developed by the R Development Core Team.

Some R Capabilities
o an effective data handling and storage facility,
o a suite of operators for calculations on arrays, in particular matrices,
o a large, coherent, integrated collection of intermediate tools for data analysis,
o graphical facilities for data analysis and display, either directly at the computer or on hardcopy,
o a well-developed, simple, and effective programming language.

Starting Minitab
Minitab 14
[Screenshot: Minitab main window, with labels: Session Window, Menu Bar, Tool Bar, Worksheet (Data Window)]

Minitab Interface
Data window
o Displays your current worksheet in a spreadsheet format, with rows and columns. A worksheet can contain up to 4000 columns, 1000 constants, and up to 10,000,000 rows, depending on how much memory your computer has (Minitab 14).
In the Data window you can
o enter columns of data into the worksheet
o name, resize, and format columns
o move quickly to different cell locations
o cut, copy, or paste cells to and from the Clipboard

Minitab Interface
Session window
o Displays the text output generated by your analyses and other work.
Project Manager
o Contains folders that allow you to navigate, view, and manipulate various parts of your project.

Minitab
[Screenshot: Data window, with labels: data entry direction arrow, Columns, Rows, Cells]

Three Forms of Data: Columns, Constants, and Matrices
o Column: contains numeric, text, or date/time data; referred to as C1, C2, ... (maximum 4000)
o Constant: contains a single number or a text value; referred to as K1, K2, ... (maximum 1000)
o Matrix: contains a rectangular block of cells holding numbers; referred to as M1, M2, ... (maximum 100)

Opening a Worksheet
1. Choose File > Open Worksheet.
2. In the Sample Data folder, double-click Meet Minitab.
3. Choose ShippingData.MTW, then click Open. If you get a message box, click OK.

Examine Worksheet
Minitab accepts three types of data: numeric, text, and date/time.

Selecting Cells
[Screenshot: selecting a single cell]

Right Click Menu

Creating a New Worksheet
File > New > Minitab Worksheet

To enter data columnwise
o Click the data entry direction arrow to make it point down.
o Enter your data, pressing [Tab] or [Enter] to move the active cell.
o Press [Ctrl]+[Enter] to move the active cell to the top of the next column.

To enter data rowwise
o Click the data entry direction arrow to make it point to the right.
o Enter your data.
o Press [Ctrl]+[Enter] to move the active cell to the beginning of the next row.

To enter data within a block
o Highlight the area you want to work in.
o Enter your data. The active cell moves only within the selected area.
o To unselect the area, press an arrow key or click anywhere in the Data window.

Entering Data

Clear Cells
To...                                              Do this
clear a cell (erasing its contents)                Highlight the cells and choose Edit > Clear Cells (or press Backspace). In a numeric column, Minitab inserts the missing value symbol (*) in a cleared cell (unless it is the last cell in a column).
delete one or more cells (and move other           Select the cells, then press <Delete>.
rows in the column up)
insert one or more cells above the active cell     Select the cells, then right-click and choose Insert Cells.

Saving Worksheet
File types:
o Minitab Project File (.MPJ), which stores the
  o Session window
  o Worksheets
  o Graphs
o Worksheet File (.MTW)
o Macro File (.MTB)
Menu commands:
o File > Save Project
o File > Save Current Worksheet
o File > Save Current Worksheet As
o File > Open Project
o File > Open Worksheet

Minitab Menu
o File: Open, close, save, print, or run the various file types that Minitab can use.
o Edit: Undo, redo, clear, cut, copy, paste, ...
o Data (called Manip in Minitab 13 and earlier versions): Commands for manipulating data.
o Calc: Calculate mathematical expressions and transformations.
o Stat: Basic Statistics, Regression, ANOVA, Multivariate Analysis, Nonparametrics, Time Series, Tables, EDA, Power and Sample Size.

Minitab Menu
o Graph: Minitab provides a flexible suite of graphs to support a variety of analysis needs.
o Editor: Editor menu commands are dynamic and change depending on which window is active.
o Tools: Add new user-defined tools, display and hide tool and status bars, customize Minitab's menus, toolbars, and shortcut keys, and change Minitab's default options.
o Windows
o Help

SPSS
When the SPSS program is launched, a startup window opens. To create a new data file, close this window by clicking Cancel.
[Screenshot: SPSS startup window]

SPSS
o Data View: This view displays the actual data values or defined value labels.
o Variable View: This view displays variable definition information.

SPSS
The Variable View Display: This view displays variable definition information.
o Name: A reasonably short but descriptive name for the variable. No spaces or special characters are allowed, but underscores can be used.
o Type: Data type.
o Width: Number of digits or characters.
o Decimals: Number of decimal places.
o Label: A phrase that describes the variable.
o Values: The place where labels can be specified for each category code.
o Missing: User-defined missing values.
o Columns: Column width.
o Align: The alignment used in the Data View display.
o Measure: The scale of measurement of the variable.
o Role: Variable role (input, target, none, ...).

SPSS Entering Data
[Screenshots: Define Variables; Data Window]

SPSS File Types
o Data File. This is a spreadsheet containing the data that were collected from the cases. In the data file, the variables are represented as columns and the cases as rows. This file type uses the extension .sav.
o Output File. This file is produced when IBM SPSS has performed the requested operations. It contains the results of the procedure. This file type uses the extension .spv.
o Syntax File. This file contains the IBM SPSS computer code (syntax) that drives the analysis. This file type uses the extension .sps.

R Using the Command Line
When you open R, the main window is the R console. R can be used as a calculator.
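
As a minimal sketch of using the console as a calculator, the lines below can be typed one at a time at the prompt. The particular expressions are chosen for illustration and are not taken from the course slides.

```r
# Basic arithmetic typed at the R console (illustrative expressions)
2 + 3 * 4      # multiplication is evaluated before addition: 14
(2 + 3) * 4    # parentheses change the order of evaluation: 20
2^10           # exponentiation: 1024
sqrt(16)       # square root: 4
10 %/% 3       # integer division: 3
10 %% 3        # remainder after division: 1
```

Each line is evaluated as soon as Enter is pressed, and the result is printed with a leading [1].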

R Assignment
o You can enter commands one at a time at the command prompt (>) or run a set of commands from a source file.
o There is a wide variety of data types, including vectors (numerical, character, logical), matrices, data frames, and lists.
o The assignment operators are <- and =.
o Results of calculations can be stored in objects using the assignment operators.

R Objects
o R objects can then be used in other calculations. To print an object, just enter its name.
There are some restrictions when giving an object a name:
1. Object names cannot contain symbols like !, +, -, #.
2. A dot (.) and an underscore (_) are allowed, as is a name starting with a dot.
3. Object names can contain a number but cannot start with a number.
4. R is case sensitive: Name and name are two different objects.
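
The assignment and naming rules above can be sketched as follows. The object names and values are hypothetical and serve only to illustrate the rules.

```r
# Store results in objects with either assignment operator
x <- 5 * 2        # the arrow form
y = x + 1         # the equals form; objects can be reused in later calculations

# Entering an object's name prints its value
x
y

# Valid names: letters, digits, dots, and underscores (not starting with a digit)
total.sales <- 100
total_sales2 <- 200
.hidden <- 1      # a name may start with a dot

# R is case sensitive: Name and name are two different objects
Name <- "upper"
name <- "lower"
identical(Name, name)   # FALSE
```

A name such as 2sales or total-sales would be rejected or misread, because it starts with a digit or contains an operator symbol.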

R Data types
o R has a wide variety of data types, including vector (numerical, character, logical), matrix, array, data frame, and list.
o vector: a collection of data of the same type. Cells are accessed through indexing operations (square brackets), such as a[1].
o R has six basic vector types: logical, integer, real, complex, string (or character), and raw.

R Vectors
The concatenation function c is used to define vectors.
> a <- c(1, 3, 5, 6)
> a
[1] 1 3 5 6
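
A short sketch of the basic vector types and of indexing with square brackets follows. The object names and values are illustrative; typeof() is the base R function that reports an object's storage type, and it reports real numbers as "double".

```r
# One vector for each of several basic types (illustrative values)
v_logical   <- c(TRUE, FALSE, TRUE)
v_integer   <- c(1L, 3L, 5L)        # the L suffix requests integer storage
v_real      <- c(1.5, 3, 5)         # real numbers are stored as "double"
v_character <- c("a", "b", "c")

typeof(v_logical)     # "logical"
typeof(v_integer)     # "integer"
typeof(v_real)        # "double"
typeof(v_character)   # "character"

# Indexing with square brackets
a <- c(1, 3, 5, 6)
a[1]          # first element: 1
a[c(2, 4)]    # second and fourth elements: 3 6
a[-1]         # everything except the first element: 3 5 6
length(a)     # number of elements: 4

# A vector holds one type only, so mixed input is coerced to a common type
mixed <- c(1, "two", TRUE)
typeof(mixed) # "character"
```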

R Example
> region <- c("north", "west", "central", "south", "south", "west")
> month <- c("jan", "feb", "apr", "may", "jan", "feb")
> sales <- c(23, 36, 44, 34, 26, 23)

R Example
> region
[1] "north"   "west"    "central" "south"   "south"   "west"
> month
[1] "jan" "feb" "apr" "may" "jan" "feb"
> sales
[1] 23 36 44 34 26 23
> sales[3]
[1] 44
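
Building on the three vectors in this example, the sketch below shows a few further operations. The functions sum(), mean(), and data.frame() are base R but do not appear in the slides, and the object name shipments is hypothetical.

```r
# The vectors from the example
region <- c("north", "west", "central", "south", "south", "west")
month  <- c("jan", "feb", "apr", "may", "jan", "feb")
sales  <- c(23, 36, 44, 34, 26, 23)

# Summaries of a numeric vector
sum(sales)     # 186
mean(sales)    # 31

# A logical condition inside square brackets selects matching elements
sales[region == "south"]   # 34 26
month[sales > 30]          # "feb" "apr" "may"

# Vectors of equal length can be combined into a data frame,
# one of the data types listed earlier
shipments <- data.frame(region, month, sales)
nrow(shipments)   # 6 rows, one per observation
```

Logical indexing works because region == "south" produces a TRUE/FALSE vector of the same length as sales.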