Chapter 2 and 3, Data Pre-processing

CSI 4352, Introduction to Data Mining
Chapter 2 and 3, Data Pre-processing
Young-Rae Cho, Associate Professor, Department of Computer Science, Baylor University

Why Do We Need Data Pre-processing?
  - Incomplete data: missing values, or lack of attributes of interest
  - Noisy data: errors, or outliers
  - Redundant data: duplicate data, or duplicate attributes, e.g., Age = 47, Birthday = 01/07/1968
  - Inconsistent data: discrepancies in format or name, e.g., rating by 1, 2, 3 vs. rating by A, B, C
  - Huge volume of data

Importance
  - Lower-quality data, lower-quality mining results: mining quality depends on data quality as well as on mining techniques.
  - Data pre-processing comprises the majority of the work in data warehousing and data mining.

Major Tasks
  - Data cleaning: fill in missing values, smooth noisy data, remove outliers, remove redundancy, and resolve inconsistency
  - Data integration: integration of multiple databases or files
  - Data transformation: normalization and aggregation
  - Data reduction: reducing the representation in volume with similar analytical results; discretization of continuous data

Outline
  - General Data Characteristics
  - Descriptive Data Summarization
  - Data Cleaning
  - Data Integration
  - Data Transformation
  - Data Reduction

Data Types
  - Record
      - Relational records
      - Data matrix, e.g., numerical matrix, crosstabs
      - Document data, e.g., text documents
      - Transaction data
  - Ordered data
      - Sequential data, e.g., transaction sequences, biological sequences
      - Temporal data, e.g., time-series data
      - Spatial data, e.g., maps
  - Graph
      - WWW, Internet
      - Social or information networks
      - Biological networks

Attribute Types
  - Nominal: e.g., ID number, profession, zip code
  - Ordinal: e.g., ranking, grades, sizes
  - Binary: e.g., medical test (positive or negative)
  - Interval: e.g., calendar dates, temperature, height
  - Ratio: e.g., population, sales

Discrete vs. Continuous Attributes
  - Discrete attribute
      - Finite set of values
      - Sometimes represented as integer values
      - Binary attributes are a special case of discrete attributes
  - Continuous attribute
      - Real numbers as values
      - Typically represented as floating-point variables
      - In practice, shown with a finite number of digits

Characteristics of Data
  - Dimensionality: curse of dimensionality
  - Sparsity: lack of information
  - Resolution: patterns depend on the scale
  - Similarity: similarity measures for complex types of data

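Descriptive Data Mining
  - Motivation: to better understand the properties of data distributions, e.g., central tendency, spread, and variation
  - Measurements: median, max, min, quantiles, outliers, etc.
  - Analysis process
      - Folding the measures into numeric dimensions
      - Graphic analysis on the transformed dimension space

Central Tendency Measures
  - Mean
      - Weighted arithmetic mean: $\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$
      - Trimmed mean: chopping extreme values
  - Median
      - Middle value if there is an odd number of values
      - Average of the two middle values otherwise
      - Estimation by interpolation for grouped data:
        $\text{median} = L_1 + \left( \frac{N/2 - (\sum \text{freq})_{low}}{\text{freq}_{median}} \right) \times \text{width}$
        where $L_1$ is the lower boundary of the median interval, $N$ is the number of values, $(\sum \text{freq})_{low}$ is the sum of the frequencies of all intervals below the median interval, $\text{freq}_{median}$ is the frequency of the median interval, and width is the interval width
  - Mode
      - The value that occurs most frequently in the data
      - Unimodal, bimodal, trimodal distribution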
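The central tendency measures above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names are hypothetical, and `grouped_median` assumes equal-width intervals that start at `lower`.

```python
from collections import Counter


def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w_i * x_i) / sum(w_i)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


def trimmed_mean(values, fraction):
    """Mean after chopping `fraction` of the sorted values from each end."""
    ordered = sorted(values)
    k = int(len(ordered) * fraction)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def grouped_median(lower, width, freqs):
    """Interpolated median for grouped data (assumed equal-width intervals)."""
    n = sum(freqs)
    cumulative = 0  # sum of frequencies below the current interval
    for i, f in enumerate(freqs):
        if cumulative + f >= n / 2:
            l1 = lower + i * width  # lower boundary of the median interval
            return l1 + (n / 2 - cumulative) / f * width
        cumulative += f


def modes(values):
    """All values that occur most frequently (one for unimodal data)."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)
```

For example, `trimmed_mean([1, 2, 3, 4, 100], 0.2)` drops one value from each end, so the extreme value 100 no longer pulls the mean upward.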

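Central Tendency in Skewed Data
  - Symmetric data: mean, median, and mode coincide (for a unimodal distribution)
  - Skewed data: mean, median, and mode differ

Data Dispersion Measures
  - Quartiles and outliers
      - Quartiles: $Q_1$ (25th percentile), $Q_3$ (75th percentile)
      - Inter-quartile range: $IQR = Q_3 - Q_1$
      - Outliers: data with extremely low or high values; usually, values lower than $Q_1 - 1.5 \times IQR$ or higher than $Q_3 + 1.5 \times IQR$
  - Variance and standard deviation
      - $\sigma^2$, $\sigma$ in a population: $\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2 = \frac{1}{N} \sum_{i=1}^{N} x_i^2 - \mu^2$
      - $s^2$, $s$ by sampling: $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{n-1} \left[ \sum_{i=1}^{n} x_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right)^2 \right]$
  - Degrees of freedom: number of independent pieces of information (= number of independent measurements minus number of estimated parameters)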
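A short sketch of the dispersion measures, assuming the common 1.5 x IQR fence for outliers. The function names are illustrative only, and quartile values depend on the interpolation convention; here the standard library's "inclusive" method is used, which is one of several accepted choices.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Values outside [Q1 - k*IQR, Q3 + k*IQR]; k = 1.5 is the usual choice."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def population_variance(values):
    """sigma^2 = (1/N) * sum(x_i^2) - mu^2."""
    n = len(values)
    mu = sum(values) / n
    return sum(x * x for x in values) / n - mu * mu


def sample_variance(values):
    """s^2 = (1/(n-1)) * [sum(x_i^2) - (1/n) * (sum(x_i))^2]."""
    n = len(values)
    total = sum(values)
    return (sum(x * x for x in values) - total * total / n) / (n - 1)
```

The sample variance divides by n - 1 rather than n because one degree of freedom is used up by estimating the mean from the same data.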

Graphic Analysis
  - Boxplot: display of the five-number summary
  - Histogram: display of tabulated frequencies
  - Quantile-quantile (Q-Q) plot: description of the relationship between two univariate distributions
  - Scatter plot: description of the relationship between two attributes of a bivariate distribution

Boxplot Analysis
  - Five-number summary of a distribution: Minimum / Q1 / Median / Q3 / Maximum
  - Boxplot
      - Represented as a box
      - The bottom of the box is Q1
      - The top of the box is Q3
      - The median is marked by a line
      - Whiskers: two lines outside the box extend to the minimum and maximum

Histogram Analysis
  - Histogram
      - Univariate graphic method
      - Represented as a set of bars reflecting the frequencies of the discrete values
      - Data values are grouped into classes if they are continuous
  - Boxplot vs. histogram
      - Often, a histogram gives more information than a boxplot
  [Figure: two histograms of frequency against unit price, with price bins running from 40~59 to 120~139]

Quantile Plot Analysis
  - Quantile plot
      - Plots quantile information of the data (sorted in ascending order)
      - Displays all the data
  - Q-Q (quantile-quantile) plot
      - Plots the quantiles of one univariate distribution against the quantiles of the other
      - Describes the relationship between two distributions

Scatter Plot Analysis
  - Scatter plot
      - Displays the points of bivariate data
      - Describes the relationship between two attributes (variables)
      - Can reveal positively correlated data, negatively correlated data, clusters, patterns, and outliers

Missing Data
  - Data is not always available, e.g., many tuples have no recorded value for several attributes
  - Missing data may be due to
      - equipment malfunction
      - data that was inconsistent with other recorded data and thus deleted
      - data not entered due to misunderstanding
      - certain data not being considered important at the time of entry
      - history or changes of the data not being registered
  - Missing data may need to be inferred

How to Handle Missing Data
  - Ignore the missing values: not effective
  - Fill in the missing values manually: tedious, infeasible?
  - Fill in the missing values automatically with
      - "unknown": not effective
      - the attribute mean
      - the attribute mean of all samples belonging to the same class
      - the most probable value, obtained by inference or classification techniques
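Filling with the attribute mean of the same class can be sketched as follows. This is an illustrative sketch under stated assumptions: records are plain dictionaries, a missing value is represented by `None`, and the function name and the fallback to the overall mean (for a class with no observed values) are choices made here, not something the slides specify.

```python
def fill_with_class_mean(rows, attr, label):
    """Return copies of `rows` with missing `attr` values replaced by the
    mean of `attr` over all rows that share the same `label` value."""
    sums, counts = {}, {}
    for r in rows:
        if r[attr] is not None:
            sums[r[label]] = sums.get(r[label], 0.0) + r[attr]
            counts[r[label]] = counts.get(r[label], 0) + 1
    # Fallback (assumption): overall attribute mean for unseen classes.
    overall = sum(sums.values()) / sum(counts.values())
    filled = []
    for r in rows:
        if r[attr] is None:
            c = r[label]
            value = sums[c] / counts[c] if c in counts else overall
            r = {**r, attr: value}  # copy, so the input is not modified
        filled.append(r)
    return filled


rows = [
    {"income": 30, "cls": "a"},
    {"income": 50, "cls": "a"},
    {"income": None, "cls": "a"},
    {"income": 100, "cls": "b"},
    {"income": None, "cls": "b"},
]
result = fill_with_class_mean(rows, "income", "cls")
```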

Noisy Data
  - Noise: random error or variance in a measured variable
  - Incorrect data may be due to
      - faulty data collection instruments
      - data transmission problems
      - technology limitations
      - inconsistency in data conversion
  - Other data problems
      - Duplicate records
      - Incomplete data
      - Inconsistent data

How to Handle Noisy Data
  - Binning
      - Sort the data and partition it into bins
      - Smooth by bin means, by bin medians, or by bin boundaries
  - Regression: smooth by fitting the data to regression functions
  - Clustering: detect and remove outliers
  - Semi-automatic inspection: detect suspicious values and have a human check them

Partitioning for Binning
  - Equal-width (distance) partitioning
      - Divides the range into N intervals of equal width (uniform grid)
      - If A and B are the lowest and highest values of the attribute, the width of the intervals is (B - A)/N
      - Straightforward
      - Problems: (1) outliers may dominate the partitions; (2) skewed data is not handled well
  - Equal-depth (frequency) partitioning
      - Divides the range into N intervals of equal frequency, i.e., each containing approximately the same number of samples
      - Problem: not possible for categorical attributes

Data Smoothing for Binning
  - Example: sorted data of price (in dollars): 4, 8, 9, 15, 21, 21, 24, 25, 26, 28, 29, 34
  - Partition into three equal-frequency bins
      Bin 1: 4, 8, 9, 15
      Bin 2: 21, 21, 24, 25
      Bin 3: 26, 28, 29, 34
  - Smoothing by bin means (rounded)
      Bin 1: 9, 9, 9, 9
      Bin 2: 23, 23, 23, 23
      Bin 3: 29, 29, 29, 29
  - Smoothing by bin boundaries
      Bin 1: 4, 4, 4, 15
      Bin 2: 21, 21, 25, 25
      Bin 3: 26, 26, 26, 34
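The price example can be reproduced with a short sketch. The function names are hypothetical; the means are rounded to integers to match the slide, and a value that is equally close to both bin boundaries is assigned to the lower one, which is an assumption made here.

```python
def equal_width_edges(values, n_bins):
    """Interval boundaries A, A + w, ..., B with w = (B - A) / N."""
    a, b = min(values), max(values)
    width = (b - a) / n_bins
    return [a + i * width for i in range(n_bins + 1)]


def equal_depth_bins(values, n_bins):
    """Sort and split into bins of (approximately) equal frequency."""
    ordered = sorted(values)
    size = len(ordered) // n_bins
    bins = [ordered[i * size:(i + 1) * size] for i in range(n_bins - 1)]
    bins.append(ordered[(n_bins - 1) * size:])  # last bin takes any remainder
    return bins


def smooth_by_means(bins):
    """Replace every value by the (rounded) mean of its bin."""
    return [[round(sum(b) / len(b))] * len(b) for b in bins]


def smooth_by_boundaries(bins):
    """Replace every value by the closer of its bin's min and max."""
    smoothed = []
    for b in bins:
        lo, hi = b[0], b[-1]  # bins are sorted
        smoothed.append([lo if v - lo <= hi - v else hi for v in b])
    return smoothed


prices = [4, 8, 9, 15, 21, 21, 24, 25, 26, 28, 29, 34]
bins = equal_depth_bins(prices, 3)
```

With these inputs, `smooth_by_means(bins)` and `smooth_by_boundaries(bins)` give the two smoothed versions listed in the example.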

Regression
  - Linear regression
      - Modeled as a linear function of one variable, Y = w X + b
      - Often uses a least-squares method
      [Figure: data points in the X-Y plane with the fitted line y = x + 1]
  - Multiple regression
      - Modeled as a linear function of a multi-dimensional feature vector, $Y = b_0 + b_1 X_1 + b_2 X_2$
      - Many non-linear functions can be transformed into this form
  - Log-linear model
      - Approximates discrete multi-dimensional probability distributions

Clustering
  - Outlier detection
  [Figure: clusters of points, with outliers lying outside the clusters]
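Smoothing by linear regression can be sketched with the closed-form least-squares solution for Y = w X + b. The function names are illustrative, and the sample points are made up so that they lie exactly on y = x + 1.

```python
def fit_line(xs, ys):
    """Least-squares estimates of slope w and intercept b for Y = w X + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / sxx
    b = mean_y - w * mean_x
    return w, b


def smooth(xs, w, b):
    """Replace each observed y by the value on the fitted line."""
    return [w * x + b for x in xs]


w, b = fit_line([0, 1, 2, 3], [1, 2, 3, 4])
```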

Data Integration
  - Definition
      - Process to combine multiple data sources into coherent storage
      - Process to provide a uniform interface to multiple data sources
  - Process: data modeling, schema matching, data extraction
  - Data modeling
      - Creating a global schema (mediated schema)
  - Schema matching
      - Matching between two attributes of different sources
      - The most critical step of data integration
      - Schema-level matching / instance-level matching

Instance-Level Matching
  - Definition: detecting and resolving data value conflicts
  - Entity identification
      - For the same real-world entity, values from different sources might be different
      - Possible reasons:
          (1) different representations, e.g., Greg Hamerly = Gregory Hamerly
          (2) different formats, e.g., Sep 16, 2009 = 09/16/09
          (3) different scales, e.g., meters vs. inches

Schema-Level Matching
  - Definition: detecting and resolving attribute conflicts and redundant attributes
  - Object identification
      - The same attribute (or object) might have different names in different sources, e.g., transaction id = TID
      - One attribute might be a derived attribute in another table, e.g., Age can be derived from Birthday
  - Attribute redundancy analysis
      - Can be analyzed by correlation / variation measures, e.g., $\chi^2$ test, Pearson coefficient, t-test, F-test

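Pearson Coefficient
  - Evaluates the correlation between two samples
  - Given two samples $X = \{x_1, x_2, \ldots, x_n\}$ and $Y = \{y_1, y_2, \ldots, y_n\}$,
    $r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{(n-1)\, \sigma_x \sigma_y}$
    where the numerator (divided by n - 1) is the covariance between X and Y, and $\sigma_x$, $\sigma_y$ are the individual standard deviations of X and Y
  - If r > 0, X and Y are positively correlated
  - If r = 0, X and Y are uncorrelated (no linear relationship; this alone does not guarantee independence)
  - If r < 0, X and Y are negatively correlated

t-test and F-test
  - t-test (t-statistic)
      - Independent two-sample t-test: $t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2 / n_1 + s_2^2 / n_2}}$
      - Evaluates the statistical difference between the means of two samples
      - Significance is reported as a p-value
  - ANOVA (analysis of variance) / F-test (F-statistic)
      - Evaluates the statistical difference among the means of three or more samples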
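A minimal sketch of the two statistics follows. The function names are hypothetical, the standard deviations use the n - 1 (sample) form, and the t statistic uses the unequal-variance two-sample form; the slide's exact variant is not recoverable, so that choice is an assumption.

```python
import math


def pearson_r(xs, ys):
    """r = sum((x_i - mean_x)(y_i - mean_y)) / ((n - 1) * s_x * s_y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / ((n - 1) * s_x * s_y)


def t_statistic(a, b):
    """Two-sample t statistic, unequal-variance form (an assumption)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / (len(a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (len(b) - 1)
    return (mean_a - mean_b) / math.sqrt(var_a / len(a) + var_b / len(b))
```

As a caution on interpretation, `pearson_r([-1, 0, 1], [1, 0, 1])` is zero even though y is fully determined by x (y = x squared), so a zero coefficient only rules out a linear relationship.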

Chi-Square Test
  - $\chi^2$ test ($\chi^2$ statistic)
      - Evaluates whether an observed distribution in a sample differs from a theoretical distribution (i.e., a hypothesis)
      - Where $E_i$ is an expected frequency and $O_i$ is an observed frequency,
        $\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$
      - The larger $\chi^2$, the more likely the variables are related (positively or negatively)
  - Example (observed counts, with expected counts in parentheses)

                                  Play chess    Not play chess    Sum (row)
      Like science fiction        250 (90)      200 (360)         450
      Not like science fiction    50 (210)      1000 (840)        1050
      Sum (col.)                  300           1200              1500
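The example can be checked with a short sketch that derives the expected counts from the row and column totals (expected = row total x column total / grand total) and then sums the statistic. The function name is illustrative.

```python
def chi_square(observed):
    """Return (chi-square statistic, expected counts) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    expected = []
    for i, row in enumerate(observed):
        expected_row = []
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / total
            expected_row.append(e)
            statistic += (o - e) ** 2 / e
        expected.append(expected_row)
    return statistic, expected


# Rows: like / not like science fiction; columns: play / not play chess.
statistic, expected = chi_square([[250, 200], [50, 1000]])
```

The expected counts come out as 90, 360, 210, and 840, matching the values in parentheses, and the statistic is roughly 507.9, which is large enough to suggest the two attributes are strongly related in this sample.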

Data Transformation
  - Definition: process that maps the entire set of values of a given attribute onto a new set of values
  - Purpose
      - To remove noise from data
      - To change scales
  - Methods
      - Smoothing (including binning and regression)
      - Normalization

General Normalization Methods
  - Min-max normalization
      - Maps the values in the range [min, max] onto a new range [min', max']
      - $v' = \frac{v - \min}{\max - \min} (\max' - \min') + \min'$
  - z-score normalization
      - Transforms the values of an attribute A based on its mean and standard deviation
      - $v' = \frac{v - \mu_A}{\sigma_A}$
  - Decimal scaling
      - Moves the decimal point of the values
      - $v' = \frac{v}{10^j}$, where j is the number of digits of the maximal absolute value, i.e., the smallest integer such that $\max(|v'|) < 1$
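The three normalization methods can be sketched as follows. The function names are hypothetical, `z_score` uses the population standard deviation (an assumption, since the slide does not say which form is meant), and `decimal_scaling` takes j as the smallest integer for which every scaled value has absolute value below 1.

```python
import math


def min_max(values, new_min=0.0, new_max=1.0):
    """Map [min, max] linearly onto [new_min, new_max]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


def z_score(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
    return [(v - mu) / sigma for v in values]


def decimal_scaling(values):
    """Divide by 10^j, with j the smallest integer giving max(|v'|) < 1."""
    peak = max(abs(v) for v in values)
    j = 0
    while peak / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values], j
```

For instance, values ranging from -986 to 917 give j = 3, so -986 becomes -0.986 and 917 becomes 0.917.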

Quantile Normalization (1)
  - Motivation
      - In a Q-Q plot, if two distributions are the same, the plot should be a straight line
      - Can be extended to n dimensions
  - Description
      - $q_k = (q_{k1}, \ldots, q_{kn})$: a vector of the k-th quantile for all n dimensions
      - $\text{proj}_d\, q_k = \left( \frac{1}{n} \sum_{i=1}^{n} q_{ki}, \ldots, \frac{1}{n} \sum_{i=1}^{n} q_{ki} \right)$
  - Algorithm
      - Sort each column (dimension) of X to give X'
      - Assign the mean across each row of X' to every element of that row
      - Rearrange each column of X' back into the original order of X

Quantile Normalization (2)
  - Advantages
      - Efficient for high-dimensional data (popularly used for biological data pre-processing)
  - Disadvantages
      - In practice, each dimension may have a different distribution
  - References
      - Bolstad, B.M., et al., "A comparison of normalization methods for high density oligonucleotide array data based on variance and bias", Bioinformatics, Vol. 19 (2003)
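The three algorithm steps can be sketched directly. This is an illustrative version, not the reference implementation: the data is assumed to be a list of equal-length columns, the function name is made up here, and ties within a column are broken by position rather than averaged.

```python
def quantile_normalize(columns):
    """Give every column (dimension) the same distribution of values."""
    n_rows = len(columns[0])
    # Step 1: sort each column.
    sorted_cols = [sorted(col) for col in columns]
    # Step 2: mean across each row of the sorted matrix.
    row_means = [
        sum(col[k] for col in sorted_cols) / len(columns) for k in range(n_rows)
    ]
    # Step 3: put the row means back in each column's original order.
    result = []
    for col in columns:
        order = sorted(range(n_rows), key=lambda i: col[i])
        out = [0.0] * n_rows
        for rank, i in enumerate(order):
            out[i] = row_means[rank]
        result.append(out)
    return result


normalized = quantile_normalize([[5, 2, 3], [4, 1, 6]])
```

After normalization both columns contain the same set of values (1.5, 3.5, 5.5), arranged according to each column's original ranking.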

Data Reduction
  - Definition: process to obtain a reduced representation of a data set, which is much smaller in volume but produces almost the same analytical results
  - Problems
      - Data mining algorithms take a very long time to run on complete data sets
      - Data analysis methods are complex and inaccurate on high-dimensional data
  - Methods
      - Dimensionality reduction
      - Numerosity reduction

Dimensionality Reduction
  - Curse of dimensionality
      - When dimensionality increases, data becomes increasingly sparse
      - The number of possible combinations of subspaces grows exponentially
      - Density and similarity between data values become less meaningful
  - Purpose
      - To avoid the curse of dimensionality
      - To eliminate irrelevant features and reduce noise
      - To reduce the time and space required in data mining
      - To allow easier visualization
  - Methods
      - Feature extraction
      - Feature selection

Feature Extraction
  - Process
      (1) Combining a multitude of correlated features
      (2) Creating a new dimensional feature space for the combined features
  - Example
      - Principal component analysis (PCA)
          - Find the eigenvectors of the covariance matrix
          - Define a new space with the eigenvectors
      - Wavelet transformation
  - Problem
      - The new dimensional spaces might not be meaningful in the domain of the data sets

Feature Selection
  - Methods
      - Eliminating redundant or irrelevant features
      - Selecting significant (informative) features
  - Example
      - Redundant features: e.g., the purchase price of a product and the amount of sales tax paid
      - Irrelevant features: e.g., a parent's name is irrelevant for selecting student scholarship candidates
      - Informative features: e.g., student's name, student's GPA, and parent's income are informative for selecting student scholarship candidates

Heuristic Search for Feature Selection
  - Problem of feature selection
      - With d features, how many possible combinations of the features are there? $2^d$
  - Typical heuristic methods
      - Step-wise feature selection: repeatedly pick the best feature
      - Step-wise feature elimination: repeatedly remove the worst feature
      - Best combined feature selection and elimination
      - Optimal branch and bound
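The two step-wise heuristics can be sketched as greedy loops around a scoring function. Everything here is hypothetical: `score` stands in for whatever quality measure is used (for example, the validation accuracy of a model trained on the subset), and the toy weights below are invented purely to make the example runnable.

```python
def forward_selection(features, score, k):
    """Step-wise selection: repeatedly add the feature that helps most."""
    selected, remaining = [], list(features)
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


def backward_elimination(features, score, k):
    """Step-wise elimination: repeatedly drop the feature missed least."""
    selected = list(features)
    while len(selected) > k:
        worst = max(
            selected, key=lambda f: score([g for g in selected if g != f])
        )
        selected.remove(worst)
    return selected


# Toy scoring function (assumption): each feature has a fixed usefulness.
weights = {"name": 0, "gpa": 3, "tax": 1, "income": 2}


def toy_score(subset):
    return sum(weights[f] for f in subset)


features = ["name", "gpa", "tax", "income"]
```

Each greedy pass evaluates at most d candidate subsets, so selecting k features costs on the order of k x d evaluations instead of the 2^d needed for exhaustive search; the price is that the result is not guaranteed to be optimal.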

Numerosity Reduction
  - Purpose: to reduce data volume by choosing alternative, smaller forms of data representation
  - Parametric methods
      - Assume the data fits some model, estimate the model parameters, store only the parameters, and discard the data
      - e.g., regression
  - Non-parametric methods
      - Do not assume models, and use the data values
      - e.g., discretization, clustering, conceptual hierarchy generation

Discretization
  - Methods
      - Dividing the range of continuous data into intervals
      - Selecting significant (frequent) data
  - Strategy
      - Supervised vs. unsupervised
      - Splitting (top-down) vs. merging (bottom-up)
  - Examples
      - Binning: top-down, unsupervised
      - Sampling: top-down, supervised
      - Entropy-based discretization: top-down, supervised
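As an illustration of a top-down, supervised split, the sketch below picks the single cut point that minimizes the weighted class entropy of the two resulting intervals. The slides only name entropy-based discretization, so the details here (midpoint cut candidates, a single binary split, the function names) are assumptions; a full method would apply the split recursively with a stopping rule.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())


def best_split(values, labels):
    """Return (cut point, weighted entropy) of the best binary split."""
    pairs = sorted(zip(values, labels))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut between identical values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        cost = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or cost < best[1]:
            best = (cut, cost)
    return best


cut, cost = best_split([1, 2, 3, 10, 11, 12], ["a", "a", "a", "b", "b", "b"])
```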

Conceptual Hierarchy Generation
  - Ordering attributes
      - Partial/total ordering of attributes at the schema level
      - e.g., street < city < state < country
  - Hierarchy generation
      - A hierarchy for a set of values by explicit data grouping
      - e.g., {Dallas, Waco, Austin} < Texas
  - Automatic method
      - Based on the number of distinct values per attribute

        country    15 distinct values
        state      365 distinct values
        city       3567 distinct values
        street     674,339 distinct values

Questions?
  - Lecture slides are found on the course website.


More information

Fuzzy Linear Regression Analysis

Fuzzy Linear Regression Analysis 12th IFAC Coferece o Programmable Devices ad Embedded Systems The Iteratioal Federatio of Automatic Cotrol September 25-27, 2013. Fuzzy Liear Regressio Aalysis Jaa Nowaková Miroslav Pokorý VŠB-Techical

More information

Analysis of Server Resource Consumption of Meteorological Satellite Application System Based on Contour Curve

Analysis of Server Resource Consumption of Meteorological Satellite Application System Based on Contour Curve Advaces i Computer, Sigals ad Systems (2018) 2: 19-25 Clausius Scietific Press, Caada Aalysis of Server Resource Cosumptio of Meteorological Satellite Applicatio System Based o Cotour Curve Xiagag Zhao

More information

Sorting in Linear Time. Data Structures and Algorithms Andrei Bulatov

Sorting in Linear Time. Data Structures and Algorithms Andrei Bulatov Sortig i Liear Time Data Structures ad Algorithms Adrei Bulatov Algorithms Sortig i Liear Time 7-2 Compariso Sorts The oly test that all the algorithms we have cosidered so far is compariso The oly iformatio

More information

Wavelet Transform. CSE 490 G Introduction to Data Compression Winter Wavelet Transformed Barbara (Enhanced) Wavelet Transformed Barbara (Actual)

Wavelet Transform. CSE 490 G Introduction to Data Compression Winter Wavelet Transformed Barbara (Enhanced) Wavelet Transformed Barbara (Actual) Wavelet Trasform CSE 49 G Itroductio to Data Compressio Witer 6 Wavelet Trasform Codig PACW Wavelet Trasform A family of atios that filters the data ito low resolutio data plus detail data high pass filter

More information

Lecture 13: Validation

Lecture 13: Validation Lecture 3: Validatio Resampli methods Holdout Cross Validatio Radom Subsampli -Fold Cross-Validatio Leave-oe-out The Bootstrap Bias ad variace estimatio Three-way data partitioi Itroductio to Patter Recoitio

More information
