Data Analysis Multiple Regression

Similar documents
DoE with Visual-XSel 13.0

Weibull Reliability Analyses

Applied Regression Modeling: A Business Approach

Multivariate Calibration Quick Guide

Minitab 17 commands Prepared by Jeffrey S. Simonoff

CDAA No. 4 - Part Two - Multiple Regression - Initial Data Screening

Applied Regression Modeling: A Business Approach

Graphical Analysis of Data using Microsoft Excel [2016 Version]

Data Analysis Guidelines

Year 10 General Mathematics Unit 2

Chapter 7: Linear regression

Here is Kellogg s custom menu for their core statistics class, which can be loaded by typing the do statement shown in the command window at the very

Fathom Dynamic Data TM Version 2 Specifications

Further Maths Notes. Common Mistakes. Read the bold words in the exam! Always check data entry. Write equations in terms of variables

Polymath 6. Overview

SLStats.notebook. January 12, Statistics:

Bivariate (Simple) Regression Analysis

Multiple Regression White paper

THE L.L. THURSTONE PSYCHOMETRIC LABORATORY UNIVERSITY OF NORTH CAROLINA. Forrest W. Young & Carla M. Bann

DATA SCIENCE INTRODUCTION QSHORE TECHNOLOGIES. About the Course:

Using Excel for Graphical Analysis of Data

SPSS QM II. SPSS Manual Quantitative methods II (7.5hp) SHORT INSTRUCTIONS BE CAREFUL

Building Better Parametric Cost Models

CHAPTER 3 AN OVERVIEW OF DESIGN OF EXPERIMENTS AND RESPONSE SURFACE METHODOLOGY

Excel Spreadsheets and Graphs

Math 121 Project 4: Graphs

Part I, Chapters 4 & 5. Data Tables and Data Analysis Statistics and Figures

Spreadsheet View and Basic Statistics Concepts

Project 11 Graphs (Using MS Excel Version )

Visual-XSel Introduction & SixSigma Selected statistical methods, examples and SixSigma with Visual-XSel Copyright CRGRAPH

Using Excel for Graphical Analysis of Data

General Program Description

An introduction to plotting data

Introduction. About this Document. What is SPSS. ohow to get SPSS. oopening Data

Today is the last day to register for CU Succeed account AND claim your account. Tuesday is the last day to register for my class

PSY 9556B (Feb 5) Latent Growth Modeling

Geology Geomath Estimating the coefficients of various Mathematical relationships in Geology

Resources for statistical assistance. Quantitative covariates and regression analysis. Methods for predicting continuous outcomes.

Excel Primer CH141 Fall, 2017

Microsoft Excel 2007 Lesson 7: Charts and Comments

CREATING THE ANALYSIS

Model Diagnostic tests

Error Analysis, Statistics and Graphing

RSM Split-Plot Designs & Diagnostics Solve Real-World Problems

CPSC 695. Methods for interpolation and analysis of continuing surfaces in GIS Dr. M. Gavrilova

Bivariate Linear Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017

Math 227 EXCEL / MEGASTAT Guide

The problem we have now is called variable selection or perhaps model selection. There are several objectives.

Excel 2010 Worksheet 3. Table of Contents

Points Lines Connected points X-Y Scatter. X-Y Matrix Star Plot Histogram Box Plot. Bar Group Bar Stacked H-Bar Grouped H-Bar Stacked

Fathom Tutorial. Dynamic Statistical Software. for teachers of Ontario grades 7-10 mathematics courses

Statistical Good Practice Guidelines. 1. Introduction. Contents. SSC home Using Excel for Statistics - Tips and Warnings

Excel Tips and FAQs - MS 2010

EXST 7014, Lab 1: Review of R Programming Basics and Simple Linear Regression

Brief Guide on Using SPSS 10.0

Integrated Mathematics I Performance Level Descriptors

LINEAR TOPICS Notes and Homework: DUE ON EXAM

MAGMA joint modelling options and QC read-me (v1.07a)

Lab1: Use of Word and Excel

Selected Introductory Statistical and Data Manipulation Procedures. Gordon & Johnson 2002 Minitab version 13.

Lecture 7: Linear Regression (continued)

Bluman & Mayer, Elementary Statistics, A Step by Step Approach, Canadian Edition

[1] CURVE FITTING WITH EXCEL

For Additional Information...

Data Management - 50%

Voluntary State Curriculum Algebra II

Cognalysis TM Reserving System User Manual

Exercise: Graphing and Least Squares Fitting in Quattro Pro

Robust Linear Regression (Passing- Bablok Median-Slope)

Introduction to Statistical Analyses in SAS

Data Analysis and Solver Plugins for KSpread USER S MANUAL. Tomasz Maliszewski

Tips and Guidance for Analyzing Data. Executive Summary

DISCOVERING THE NATURE OF PERIODIC DATA: II. ANALYZING DATA AND JUDGING THE GOODNESS OF FIT USING RESIDUALS

Microsoft Excel 2007 Creating a XY Scatter Chart

Two-Stage Least Squares

Chapter 3: Rate Laws Excel Tutorial on Fitting logarithmic data

How to use FSBforecast Excel add in for regression analysis

Experiment 1 CH Fall 2004 INTRODUCTION TO SPREADSHEETS

Your Name: Section: INTRODUCTION TO STATISTICAL REASONING Computer Lab #4 Scatterplots and Regression

Using Graphical Analysis This exercise will familiarize you with using Graphical Analysis in Beer s Law and kinetics.

Grade 9 Math Terminology

8. MINITAB COMMANDS WEEK-BY-WEEK

LAB 1 INSTRUCTIONS DESCRIBING AND DISPLAYING DATA

Scatterplot: The Bridge from Correlation to Regression

Minitab Study Card J ENNIFER L EWIS P RIESTLEY, PH.D.

GRETL FOR TODDLERS!! CONTENTS. 1. Access to the econometric software A new data set: An existent data set: 3

3.7. Vertex and tangent

STAT 311 (3 CREDITS) VARIANCE AND REGRESSION ANALYSIS ELECTIVE: ALL STUDENTS. CONTENT Introduction to Computer application of variance and regression

SASEG 9B Regression Assumptions

Chapter 5snow year.notebook March 15, 2018

252 APPENDIX D EXPERIMENT 1 Introduction to Computer Tools and Uncertainties

5. Key ingredients for programming a random walk

User Manual STATISTICS

Introduction to creating and working with graphs

Assignment 2. with (a) (10 pts) naive Gauss elimination, (b) (10 pts) Gauss with partial pivoting

addition + =5+C2 adds 5 to the value in cell C2 multiplication * =F6*0.12 multiplies the value in cell F6 by 0.12

Welcome to class! Put your Create Your Own Survey into the inbox. Sign into Edgenuity. Begin to work on the NC-Math I material.

Heteroskedasticity and Homoskedasticity, and Homoskedasticity-Only Standard Errors

= 3 + (5*4) + (1/2)*(4/2)^2.

Appendix 2: PREPARATION & INTERPRETATION OF GRAPHS

Transcription:

Introduction Visual-XSel 14.0 is both, a powerful software to create a DoE (Design of Experiment) as well as to evaluate the results, or historical data. After starting the software, the main guide shows the direct access to the important functionality. More information to the statistical background one can find under: www.weibull.de/com/statistics.pdf To use the System Analysis, please have a look to: www.weibull.de/com/system_analysis.pdf If you first join the program, it is recommended to use always the main guide (select the menu item File / New if the guide is not visible). Later one can use also the menu Statistics or the icons below. The Visual-XSel setup is available at: www.crgraph.com/software.htm On the following pages the most important steps are shown. First use Data Analyis from the Main-Guide or via the Menu Statistics

The results of experiments can be loaded from the file Example_MulReg. vxt in the path Examples. The response name is here Accel Use then the icon Data-analysis to open the dialog for the multiple regression. Alternatively click to Dataanalysis in the Main-Guide, shown at the first page and follow the speech bubbles. Depending from the entrance, normally the Data Analysis Guide will help you find out the right method for the used data. In this case, a normal Regression without transformation is suitable (continuously measurements). Use more parameters for Regression/ ANOVA In the dialog Multiple Regression the response and the factors (here independent parameters) must be selected with the respective buttons. Note: If in the list of Data-columns a double click is used, the names will be moved in this field, where the button is red marked. In the rubric Model the Quadratic Model has to be used analog to the experiment-definition. The Parameter-scaling should be always Standardize. For the multiple regression, this is the best way for the least-square-method. Note that the respective coefficients of the model concerns to the standardization. Other software use often Normal. For the evaluation there are two methods: MLR (lease-square) and PLS (Partial-Least-Square). PLS is to use if the date doesn t come from a DoE and have a great correlation. Use at first only the default MLR, Visual-XSel tests

the data if PLS will be better. The next step is to check the correlation. Because of the DoE the data is not critical here. There is a limit with the a red line, to decide if the MLR is suitable. This limit comes more from experience and is not a statistical factor. If this limit will be overstepped automatically a dialog-box appears to give some alternatives, how to proceed. limit for hypothesis of independent data The result of the regression is shown on the next rubric. The Coefficients are the weight of the influence of each term. The visualisation of this coef. are the green horizontal bars on the right with additional confidence ranges. The p-val (value) is the significance for the coefficients. If the defined limit of 0.05 is exceeded the recommendation is to exclude this term from the model. This will be done for all terms by stepwise regression (see button below). After excluding the non signigicant term those will be grayed, but can brought back manually (sometimes it is better to decide by technical understanding than by statistical issues). the coefficient of determination shows how much the model can explain the data use the auto button to start the stepwise regression As of version 14, there is the option of outputting the so-called VIF (Variance Inflation Factor). This is a measure of how far the model terms correlate with the others. The higher this is, the more critical the evaluation is. For more information, see the speech bubble, for each term by clicking on the respective term in the VIF column.

The (Model)-ANOVA gives enhanced information of how much trust one can have to the model. For more information about the statistical values see the statistics-doc at the beginning. The so called Box-Cox-Transformation checks, wether the Y-data (response) should be used better by converting with mathematically standard formulas. The curve with the green points shows the special Box-Cox transformation with the goal to have the best normallity of the data (must be as small as possible). The curve with the red points shows the best coefficient of determination R² (must be as large as possible). Note: Sometimes the best transformation between the two arguments is not the same. this p-value shows the significance of the whole model The optimizer calculate on the basis of the model the best-point, what you have defined (here for example the minimum of the response). This calculation finds mostly better parameter adjustments, than the best observation of the data in the table. If there more responses the optimizer try to find the best compromize. If there are restrictions of some parameters, one can fix this. So only the non fixed parameter will be adjusted. Select the Optimal points to get a mark in the Curve-diagram later. Recommendation: Use this transformation only, if there is a greate advantage by R², for example by lifetime data. Click to start after selecting the options

In the view of Adjustments one can judge, wether it is usefull to complete experiments, espacially if there are doubt about interactions. In this case it is recommended to add experiments of missing edges-points. At the end one can select charts for the represantion on the main window. The most important charts are bold marked. Note: do not forget to describe the project by a meaningful title in the top. Click to the OK-button to create the charts in the main window. You can go back to the regression dialog at any time with the Data-analysis button. The curve-diagramm shows the function of the model grafically. The steeper the slope of the curve is, the higher is the influence of the parameters. The vertical red lines represent the actual parameter adjustments with their values on the top. The horizontal red line shows the result of in the response axis together with a confidence range. Move the vertical red lines to change the parameter sets and to calculate the new result of the model.

The next diagram Model versus Observation gives one an overview where are the deviations between the regression model (function) and the measured values. The best model is, if all points lie on the straight line. In this case the coefficient of determination R² would be 1. If outliers exists they are marked in red (not including in the example data set) Model- results 10 9 8 7 6 5 4 52 50 39 29 30 51 21 22 16 231817 24 25 26 27 40 4 44 11 48 12 9 5 49 33 8 6 1410 7 1332 34 4147 20 28 38 53 54 19 35 37 36 31 42 45 3 15 2 46 43 1 3 3 4 5 6 7 8 9 10 Also a very important information about the condition of the model gives the Residuals-distribution. The methode of least square requires that the residuals are normal distributed. If there are a deviation of a group of points this is a strong indicator that there are unknown disturbing factors or too much scatter. To decide whether a deviation is critical, there is a p-value for the hypothesis of normality (if p-val<0.05 there is no normality. Outliers will be marked in red also here. FrequencyAccel 99,99 99 96 90 80 60 40 20 10 3 1 0,1 Observation _ x = 3,75E- - 16±0,0466 s = 0,17037 R ² = 0,983 p value = 0,511 AD 33 51 4713 4250 29 465 1296 24 36 19717 4138 25 53 1614454 26322402 8 27 48 52 45 18 3530 28 2321 44 0,01-0,4-0,2 0,0 0,2 0,4 0,6 Residuals 31 1 3249 1511 4334 391037 20 In addition there are a lot of further charts, which are not described here. If there are any suggestions or hints about this short introdution, please give us a feedback to info@crgraph.de