UDC : S.B.PRYKHODKO, N.V.PRYKHODKO, L.M.MAKAROVA, O.O.KUDIN, T.G.SMYKODUB Admiral Makarov National University of Shipbuilding


ВІСНИК ХНТУ 3(6), 07 р., ТОМ

UDC 004.4:59.37

S.B. PRYKHODKO, N.V. PRYKHODKO, L.M. MAKAROVA, O.O. KUDIN, T.G. SMYKODUB
Admiral Makarov National University of Shipbuilding

CONSTRUCTING NON-LINEAR REGRESSION EQUATIONS ON THE BASIS OF BIVARIATE NORMALIZING TRANSFORMATIONS

Techniques are proposed for constructing the equations, confidence intervals and prediction intervals of non-linear regressions on the basis of bivariate normalizing transformations for non-Gaussian data. Application of the techniques is considered for a bivariate non-Gaussian data set: actual effort (hours) and size (adjusted function points) from 33 maintenance and development software projects.

Keywords: non-linear regression equation, confidence interval, prediction interval, normalizing transformation, bivariate non-Gaussian data

Problem formulation

A normalizing transformation is often a good way to construct the equations, confidence intervals and prediction intervals of non-linear regressions [-5], and it is often used for that purpose in empirical software engineering, information technology, biometry, ecology, finance, etc. However, the well-known techniques for constructing the equations, confidence and prediction intervals of non-linear regressions are based on univariate normalizing transformations, which do not take into account the correlation between the random variables when bivariate non-Gaussian data are normalized. This leads to the need to use bivariate normalizing transformations, which do take that correlation into account, to construct the equations, confidence and prediction intervals of non-linear regressions.

Analysis of recent research and publications

Transformations are an extremely important part of regression analysis, but the use of transformations can be somewhat tricky []. According to [], transformations are made for essentially four purposes, two of which are: first, to obtain approximate normality for the distribution of the error term (residuals) or of the dependent random variable; second, to transform the response and/or the predictor in such a way that the linear relationship between the new (normalized) variables is stronger than the linear relationship between the dependent and independent random variables. Well-known normalizing transformations are now used to construct the equations, confidence and prediction intervals of non-linear regressions.
For that purpose, for example, the decimal logarithm transformation [-6], the Box-Cox transformation [, 4] and the Johnson translation system [7, 8] are known to be applied. However, the known techniques for constructing the equations, confidence and prediction intervals of non-linear regressions are based on univariate normalizing transformations, which do not take into account the correlation between the random variables when bivariate non-Gaussian data are normalized.

Purpose of the study

The purpose of the study is to propose techniques for constructing the equations, confidence and prediction intervals of non-linear regressions for bivariate non-Gaussian data in the general case, when it is necessary to take into account the correlation between the response and the predictor (the dependent and independent random variables) when those variables are normalized.

Presentation of the main research material

We propose techniques for constructing the equations, confidence and prediction intervals of non-linear regressions for bivariate non-Gaussian data. As in [9, 10], the techniques consist of three steps. In the first step, a set of bivariate non-Gaussian data is normalized using a bijective bivariate normalizing transformation. In the second step, the equation, confidence and prediction intervals of the linear regression for the normalized data are built. In the third step, the equations, confidence and prediction intervals of the non-linear regressions for the bivariate non-Gaussian data are constructed on the basis of the equation, confidence and prediction intervals of the linear regression for the normalized data and the normalizing transformation.

The techniques. Consider a bijective multivariate normalizing transformation of the non-Gaussian random vector $P = \{Y, X\}^T$ to the Gaussian random vector $T = \{Z_Y, Z_X\}^T$, given by

$$T = \psi(P), \tag{1}$$

and the inverse transformation for (1),

$$P = \psi^{-1}(T). \tag{2}$$

The linear regression equation for the normalized data according to (1) will have the form []

$$\hat{Z}_Y = \bar{Z}_Y + \hat{b}\,(Z_X - \bar{Z}_X), \tag{3}$$

where $\hat{Z}_Y$ is the prediction result of the linear regression equation for values of $Z_X$; $\hat{b}$ is the estimator of the linear regression equation parameter $b$; $\bar{Z}_Y$ and $\bar{Z}_X$ are the sample means of the normalized variables. The non-linear regression equation will have the form

$$\hat{Y} = \psi_Y^{-1}\left[\bar{Z}_Y + \hat{b}\,(Z_X - \bar{Z}_X)\right], \tag{4}$$

where $\psi_Y^{-1}$ denotes the component of the inverse transformation (2) that returns $Y$.

The technique for constructing the confidence interval is based on transformation (1) and the equation

$$CI_{Z_Y} = \hat{Z}_Y \pm t_{\alpha/2,\nu}\, S_{Z_Y} \left\{ \frac{1}{N} + \frac{(Z_X - \bar{Z}_X)^2}{S_{Z_X Z_X}} \right\}^{1/2}, \tag{5}$$

where $t_{\alpha/2,\nu}$ is a quantile of Student's t-distribution with $\nu$ degrees of freedom and $\alpha/2$ significance level; $S_{Z_Y}^2 = \frac{1}{\nu}\sum_{i=1}^{N}\left(Z_{Y_i} - \hat{Z}_{Y_i}\right)^2$; $\nu = N - 2$; $S_{Z_X Z_X} = \sum_{i=1}^{N}\left(Z_{X_i} - \bar{Z}_X\right)^2$. The technique consists of three steps.
In the first step, the non-Gaussian data are normalized using a bijective normalizing transformation (1), and the linear regression equation (3) is built on the basis of the normalized data. In the second step, the confidence interval for the linear regression is determined. In the third step, the confidence interval for the non-linear regression is built on the basis of the confidence interval for the linear regression and the normalizing transformation. The confidence interval for the non-linear regression will have the form

$$CI_Y = \psi_Y^{-1}\left[ \hat{Z}_Y \pm t_{\alpha/2,\nu}\, S_{Z_Y} \left\{ \frac{1}{N} + \frac{(Z_X - \bar{Z}_X)^2}{S_{Z_X Z_X}} \right\}^{1/2} \right]. \tag{6}$$

The technique for constructing the prediction interval is based on transformation (1) and the equation []

$$PI_{Z_Y} = \hat{Z}_Y \pm t_{\alpha/2,\nu}\, S_{Z_Y} \left\{ 1 + \frac{1}{N} + \frac{(Z_X - \bar{Z}_X)^2}{S_{Z_X Z_X}} \right\}^{1/2}. \tag{7}$$

As before, the technique consists of three steps, with the difference that prediction intervals are defined instead of confidence intervals. The prediction interval for the non-linear regression will have the form

$$PI_Y = \psi_Y^{-1}\left[ \hat{Z}_Y \pm t_{\alpha/2,\nu}\, S_{Z_Y} \left\{ 1 + \frac{1}{N} + \frac{(Z_X - \bar{Z}_X)^2}{S_{Z_X Z_X}} \right\}^{1/2} \right]. \tag{8}$$

Equations (4), (6) and (8) are used to construct the equations, confidence and prediction intervals of non-linear regressions for bivariate non-Gaussian data. The lines of the equation, confidence and prediction intervals of non-linear regressions can also be built by applying the inverse transformation (2) to the values of the variables $\hat{Z}_Y$, $CI_{Z_Y}$ and $PI_{Z_Y}$ from equations (3), (5) and (7) respectively.

Bivariate normalizing transformations. Some normalizing transformations have been proposed for normalizing bivariate non-Gaussian data, such as transformations based on the Box-Cox transformation, the Johnson translation system and others. However, only a few normalizing transformations are bijective. One such bijective transformation is the transformation of the $S_U$ family of the Johnson translation system. The Johnson normalizing translation is given by [11]

$$Z = \gamma + \eta\, h\left[\lambda^{-1}(X - \varphi)\right] \sim N_m(0_m, \Sigma), \tag{9}$$

where $\gamma$, $\eta$, $\varphi$ and $\lambda$ are parameters of the Johnson normalizing translation; $\gamma = (\gamma_1, \gamma_2)^T$; $\eta = \mathrm{diag}(\eta_1, \eta_2)$; $\varphi = (\varphi_1, \varphi_2)^T$; $\lambda = \mathrm{diag}(\lambda_1, \lambda_2)$; $h[(y_1, y_2)] = \{h_1(y_1), h_2(y_2)\}^T$; $h_i(\cdot)$ is one of the translation functions

$$h(y) = \begin{cases} \ln(y), & \text{for the } S_L \text{ (log normal) family;} \\ \ln[y/(1-y)], & \text{for the } S_B \text{ (bounded) family;} \\ \mathrm{Arsh}(y), & \text{for the } S_U \text{ (unbounded) family;} \\ y, & \text{for the } S_N \text{ (normal) family;} \end{cases}$$

$\Sigma$ is the covariance matrix. Here $y = (x - \varphi)/\lambda$; $\mathrm{Arsh}(y) = \ln\left(y + \sqrt{y^2 + 1}\right)$.

The inverse transformation for the Johnson normalizing translation (9) is given by [11]

$$x = \varphi + \lambda\, h^{-1}\left[\eta^{-1}(z - \gamma)\right]. \tag{10}$$

Example. We consider the example of constructing the equation, confidence and prediction intervals of non-linear regression for the bivariate non-Gaussian data set: actual effort (hours) and size (adjusted function points) from 33 maintenance and development projects [12], after the cutoff of outliers by the technique for detecting bivariate outliers on the basis of normalizing transformations for non-Gaussian data [13].

Fig. 1 shows the linear regression (solid line) and the borders of the confidence (dot-dash lines) and prediction (dotted lines) intervals ($\alpha = 0.05$) of the linear regression for the normalized data (points in the form of circles) from 33 projects.
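The three steps (normalize, build the linear regression with its intervals, transform back) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the $S_B$ translation is applied component-wise with already known parameters, the parameter values and the small data set below are hypothetical and are not the paper's data, and the Student's t quantile is passed in as a number so that the sketch needs only the standard library. Parameter estimation, which is where the bivariate covariance enters, is not shown.

```python
import math

def johnson_sb(x, gamma, eta, phi, lam):
    """Forward S_B translation: z = gamma + eta * ln(y / (1 - y)), y = (x - phi) / lam."""
    y = (x - phi) / lam  # must lie strictly inside (0, 1)
    return gamma + eta * math.log(y / (1.0 - y))

def johnson_sb_inv(z, gamma, eta, phi, lam):
    """Inverse S_B translation: x = phi + lam / (1 + exp(-(z - gamma) / eta))."""
    return phi + lam / (1.0 + math.exp(-(z - gamma) / eta))

def fit_line(zx, zy):
    """Least-squares line for normalized data in the centred form of equation (3)."""
    n = len(zx)
    mean_x, mean_y = sum(zx) / n, sum(zy) / n
    sxx = sum((a - mean_x) ** 2 for a in zx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(zx, zy))
    slope = sxy / sxx
    rss = sum((b - mean_y - slope * (a - mean_x)) ** 2 for a, b in zip(zx, zy))
    s = math.sqrt(rss / (n - 2))  # residual standard deviation, nu = N - 2
    return {"n": n, "mean_x": mean_x, "mean_y": mean_y,
            "sxx": sxx, "slope": slope, "s": s}

def intervals_normalized(fit, z_x, t_crit):
    """Point prediction, confidence and prediction intervals in normalized space."""
    centre = fit["mean_y"] + fit["slope"] * (z_x - fit["mean_x"])
    leverage = 1.0 / fit["n"] + (z_x - fit["mean_x"]) ** 2 / fit["sxx"]
    ci = t_crit * fit["s"] * math.sqrt(leverage)
    pi = t_crit * fit["s"] * math.sqrt(1.0 + leverage)
    return centre, (centre - ci, centre + ci), (centre - pi, centre + pi)

def intervals_original(fit, x, par_x, par_y, t_crit):
    """Map the normalized-space results back with the inverse translation."""
    z_x = johnson_sb(x, *par_x)
    centre, ci, pi = intervals_normalized(fit, z_x, t_crit)
    back = lambda z: johnson_sb_inv(z, *par_y)
    return back(centre), tuple(back(v) for v in ci), tuple(back(v) for v in pi)

# Hypothetical parameters (gamma, eta, phi, lambda) and toy data, for illustration only.
PAR_X = (0.5, 0.9, 0.0, 1000.0)    # size
PAR_Y = (1.0, 0.8, 0.0, 20000.0)   # effort
size = [50.0, 120.0, 200.0, 310.0, 450.0, 600.0, 780.0]
effort = [400.0, 1500.0, 1800.0, 4200.0, 5000.0, 9000.0, 12500.0]

zx = [johnson_sb(v, *PAR_X) for v in size]
zy = [johnson_sb(v, *PAR_Y) for v in effort]
fit = fit_line(zx, zy)

T_CRIT = 2.571  # Student's t quantile for alpha = 0.05 (two-sided), nu = 7 - 2 = 5
y_hat, ci, pi = intervals_original(fit, 300.0, PAR_X, PAR_Y, T_CRIT)
print(y_hat, ci, pi)
```

Because the inverse $S_B$ translation is monotonically increasing for $\eta > 0$, the interval bounds keep their order after back-transformation, and for this bounded family they stay inside $(\varphi, \varphi + \lambda)$.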

Fig. 1. Equation, confidence and prediction intervals of linear regression for normalized data from 33 projects

These data are normalized by the $S_B$ family of transformation (9). In this case the point estimates of the parameters are: .88055, .7334, 0.793776, 0.95403, 3.890, 96.5557, 768.509 and 870.7. The sample covariance matrix of the normalized data is used as the approximate moment-matching estimator of the covariance matrix

$$\hat{\Sigma} = \begin{pmatrix} 0.9948 & 0.848 \\ 0.848 & 0.9948 \end{pmatrix}.$$

Fig. 2 shows the non-linear regression (solid line) and the borders of the confidence (dot-dash lines) and prediction (dotted lines) intervals ($\alpha = 0.05$) of the non-linear regression for the non-Gaussian data (points in the form of circles) from 33 projects. This non-linear regression and its confidence and prediction intervals were built on the basis of transformations (9) and (10). The non-linear regression and its confidence and prediction intervals were also built on the basis of the decimal logarithm transformation; for that transformation, Fig. 2 also shows the borders of the prediction interval (dotted lines with short dashes). We note that in this case (the decimal logarithm transformation), at the maximum value of the independent variable, the prediction interval is 60 percent wider than the prediction interval constructed on the basis of transformation (9).

Fig. 2. Equation, confidence and prediction intervals of non-linear regression for non-Gaussian data from 33 projects

In our opinion, such a big difference is due to poor normalization of the bivariate data by the decimal logarithm transformation. We note that Mardia's multivariate kurtosis [14] equals 8 under bivariate normality. The point estimates of kurtosis equal 8.00 and 6.93, respectively, for the normalized data in Fig. 1 and for the data normalized by the decimal logarithm transformation. These values indicate that the necessary condition for bivariate normality is practically fulfilled only for the data normalized by transformation (9).
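The kurtosis check used above is easy to reproduce. The sketch below computes Mardia's multivariate kurtosis for bivariate data as the mean of the squared Mahalanobis distances raised to the second power, using the covariance matrix with divisor N. The function name and the simulated sample are illustrative assumptions and are not the paper's data; for a bivariate Gaussian sample the value should be close to $p(p+2) = 8$.

```python
import random

def mardia_kurtosis(points):
    """Mardia's kurtosis for a list of (x, y) pairs (bivariate case only)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance matrix with divisor N, as in Mardia's definition.
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    det = sxx * syy - sxy ** 2
    total = 0.0
    for x, y in points:
        dx, dy = x - mx, y - my
        # Squared Mahalanobis distance via the explicit 2x2 inverse.
        d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
        total += d2 * d2
    return total / n

# Simulated correlated Gaussian sample (illustration only, not the paper's data).
random.seed(0)
sample = []
for _ in range(20000):
    u, v = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
    sample.append((u, 0.8 * u + 0.6 * v))

print(mardia_kurtosis(sample))
```

A value well below 8, such as the 6.93 reported for the decimal logarithm transformation, suggests that the necessary condition for bivariate normality is not met.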

At the same time, the non-linear regressions built on the basis of transformation (9) and of the decimal logarithm transformation are approximately similar: the values of the coefficient of determination equal 0.5664 and 0.5759 respectively.

Conclusions

From the example we conclude that the proposed techniques for constructing the equations, confidence and prediction intervals of non-linear regressions for bivariate non-Gaussian data are promising. Application of the techniques is considered for the bivariate non-Gaussian data set: actual effort (hours) and size (adjusted function points) from 33 maintenance and development software projects. Accounting for the correlation between the random variables when normalizing these bivariate non-Gaussian data leads to a reduction of the width of the confidence and prediction intervals of the non-linear regression compared to the same intervals constructed on the basis of the decimal logarithm transformation. In the future, we intend to try other bivariate non-Gaussian data sets.

References

1. D.M. Bates and D.G. Watts. Nonlinear Regression Analysis and Its Applications. Wiley, 1988, 384 p.
2. T.P. Ryan. Modern regression methods. Wiley, 1997, 59 p.
3. G.A.F. Seber and C.J. Wild. Nonlinear Regression. John Wiley & Sons, Inc., 2003, 79 p.
4. R.A. Johnson and D.W. Wichern. Applied Multivariate Statistical Analysis. Pearson Prentice Hall, 2007, 800 p.
5. Ian Pardoe. Applied regression modelling. Wiley, 2012, 35 p.
6. S. Chatterjee and J.S. Simonoff. Handbook of Regression Analysis. John Wiley & Sons, Inc., 2013, 36 p.
7. S.B. Prykhodko and A.V. Pukhalevich, Developing PC Software Project Duration Model based on Johnson transformation, in Modern Problems of Radio Engineering, Telecommunications and Computer Science, Proceedings of the th International Conference, Lviv-Slavske, Ukraine, 5 February - March, 2014, pp. 4-6.
8. S.B. Prykhodko and A.V. Pukhalevich, Confidence interval estimation of PC software project duration regression based on Johnson transformation, Radioelectronic and Computer Systems, No, Vol. 66, pp. 04-07, 2014.
9. S.B. Prykhodko, Statistical anomaly detection techniques based on normalizing transformations for non-Gaussian data, in Computational Intelligence (Results, Problems and Perspectives), Proceedings of the International Conference, Kyiv-Cherkasy, Ukraine, May -5, 2015, pp. 86-87.
10. S.B. Prykhodko, Developing the software defect prediction models using regression analysis based on normalizing transformations, in Modern problems in testing of the applied software (PTTAS-2016), Abstracts of the Research and Practice Seminar, Poltava, Ukraine, May 5-6, 2016, pp. 6-7.
11. P.M. Stanfield, J.R. Wilson, G.A. Mirka, N.F. Glasscock, J.P. Psihogios, J.R. Davis, Multivariate input modeling with Johnson distributions, in Proceedings of the 8th Winter Simulation Conference WSC'96, December 8-, 1996, Coronado, CA, USA, ed. S. Andradóttir, K.J. Healy, D.H. Withers, and B.L. Nelson, IEEE Computer Society, Washington, DC, USA, 1996, pp. 457-464.
12. B. Kitchenham, S.L. Pfleeger, B. McColl, and S. Eagan, An empirical study of maintenance and development estimation accuracy, The Journal of Systems and Software, 64, pp. 57-77, 2002.
13. S. Prykhodko, N. Prykhodko, L. Makarova, O. Kudin, T. Smykodub and A. Prykhodko, Detecting bivariate outliers on the basis of normalizing transformations for non-Gaussian data, in Advanced Information Systems and Technologies, Proceedings of the V International Scientific Conference, Sumy, Ukraine, May 7-9, pp. 95-97, 2017.
14. K.V. Mardia, Measures of multivariate skewness and kurtosis with applications, Biometrika, 57, pp. 519-530, 1970.