Data Preprocessing. Motivation
- Rosalyn Flynn
- 6 years ago
Transcription
Data Preprocessing
Mirek Riedewald
Some slides based on presentation by Jiawei Han and Micheline Kamber

Motivation
- Garbage-in, garbage-out
  - Cannot get good mining results from bad data
- Need to understand data properties to select the right technique and parameter values
- Data cleaning
- Data formatting to match technique
- Data manipulation to enable discovery of desired patterns

Data Records
- Data sets are made up of data records
- A data record represents an entity
- Examples:
  - Sales database: customers, store items, sales
  - Medical database: patients, treatments
  - University database: students, professors, courses
- Also called samples, examples, tuples, instances, data points, objects
- Data records are described by attributes
  - Database row = data record; column = attribute

Attributes
- Attribute (or dimension, feature, variable): a data field, representing a characteristic or feature of a data record
  - E.g., customerid, name, address
- Types:
  - Nominal (also called categorical): no ordering or meaningful distance measure
  - Ordinal: ordered domain, but no meaningful distance measure
  - Numeric: ordered domain, meaningful distance measure; continuous versus discrete

Attribute Type Examples
- Nominal: category, status, or name of thing
  - Hair_color = {black, brown, blond, red, auburn, grey, white}
  - Marital status, occupation, ID numbers, zip codes
- Binary: nominal attribute with only 2 states (0 and 1)
  - Symmetric binary: both outcomes equally important, e.g., gender
  - Asymmetric binary: outcomes not equally important, e.g., medical test (positive vs. negative)
- Ordinal
  - Values have a meaningful order (ranking), but the magnitude between successive values is not known
  - Size = {small, medium, large}, grades, army rankings

Numeric Attribute Types
- Quantity (integer or real-valued)
- Interval
  - Measured on a scale of equal-sized units
  - Values have order
  - E.g., temperature in C or F, calendar dates
  - No true zero-point
- Ratio
  - Inherent zero-point
  - We can speak of values as being an order of magnitude larger than the unit of measurement (10m is twice as high as 5m)
  - E.g., temperature in Kelvin, length, counts, monetary quantities

Discrete vs. Continuous Attributes
- Discrete attribute
  - Has only a finite or countably infinite set of values
  - Nominal, binary, ordinal attributes are usually discrete
  - Integer numeric attributes
- Continuous attribute
  - Has real numbers as attribute values
  - E.g., temperature, height, or weight
  - Practically, real values can only be measured and represented using a finite number of digits
  - Typically represented as floating-point variables

Data Preprocessing Overview
- Descriptive data summarization
- Data cleaning
- Data integration
- Data transformation
- Summary

Measuring the Central Tendency
- Sample mean:
  $$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
- Weighted arithmetic mean:
  $$\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$
  - Trimmed mean: set weights of extreme values to zero
- Median
  - Middle value if odd number of values; average of the middle two values otherwise
- Mode
  - Value that occurs most frequently in the data
  - Unimodal, bimodal, trimodal distribution

Measuring Data Dispersion: Boxplot
- Quartiles: Q1 (25th percentile), Q3 (75th percentile)
- Inter-quartile range: IQR = Q3 - Q1
- Various definitions for determining percentiles, e.g., for N records, the p-th percentile is the record at position (p/100)N + 0.5 in increasing order
  - If not integer, round to nearest integer or compute weighted average
  - E.g., for N=30, p=25 (to get Q1): 25/100*30+0.5 = 8, i.e., Q1 is the 8th of the 30 values in increasing order
  - E.g., for N=32, p=25: 25/100*32+0.5 = 8.5, i.e., Q1 is the average of the 8th and 9th values in increasing order
- Boxplot: ends of the box are the quartiles, median is marked, whiskers extend to min/max
  - Often plots outliers individually
  - Outlier: usually, a value more than 1.5 x IQR above Q3 (or below Q1)
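
The percentile position rule and the 1.5 x IQR outlier convention from the boxplot slide can be sketched as follows. This is a minimal illustration, not the deck's own code: the function names `percentile` and `iqr_outliers` are hypothetical, and the weighted-average variant (rather than rounding) is assumed for non-integer positions.

```python
import math

def percentile(values, p):
    """p-th percentile at 1-based position (p/100)*N + 0.5 in increasing order."""
    data = sorted(values)
    n = len(data)
    pos = (p / 100) * n + 0.5
    pos = min(max(pos, 1), n)  # clamp to a valid 1-based position
    lower = math.floor(pos)
    upper = math.ceil(pos)
    frac = pos - lower
    # Weighted average of the two neighbouring records
    # (both are the same record when pos is an integer).
    return data[lower - 1] * (1 - frac) + data[upper - 1] * frac

def iqr_outliers(values):
    """Values more than 1.5 x IQR above Q3 or below Q1."""
    q1 = percentile(values, 25)
    q3 = percentile(values, 75)
    iqr = q3 - q1
    return [v for v in values if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]

# N=30: position 25/100*30 + 0.5 = 8, so Q1 is the 8th value.
print(percentile(range(1, 31), 25))
# N=32: position 8.5, so Q1 is the average of the 8th and 9th values.
print(percentile(range(1, 33), 25))
```

Other percentile definitions exist and give slightly different quartiles on small samples, so the outlier set can differ between tools.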

Measuring Data Dispersion: Variance
- Sample variance (aka second central moment):
  $$m_2 = s_n^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \bar{x}^2$$
- Standard deviation = square root of variance
- Estimator of true population variance from a sample:
  $$s_{n-1}^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

Histogram
- Graph display of tabulated frequencies, shown as bars
- Shows what proportion of cases fall into each category
- Area of the bar denotes the value, not the height
  - Crucial distinction when the categories are not of uniform width!
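
The two variance definitions from the variance slide differ only in the divisor (n versus n-1). The sketch below illustrates that difference on made-up numbers; the function names are hypothetical and the data is purely illustrative.

```python
def variance_n(xs):
    """Second central moment: divide the squared deviations by n."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n

def variance_n_minus_1(xs):
    """Estimator of the population variance: divide by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]      # illustrative values, mean = 5
print(variance_n(data))              # 32 / 8
print(variance_n_minus_1(data))      # 32 / 7
```

The standard library functions `statistics.pvariance` and `statistics.variance` compute the same two quantities.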

Scatter plot
- Visualizes relationship between two attributes, even a third (if categorical)
- For each data record, plot selected attribute pair in the plane

Correlated Data
(figure only)

Not Correlated Data
(figure only)

Data Preprocessing Overview
- Descriptive data summarization
- Data cleaning
- Data integration
- Data transformation
- Summary

Why Data Cleaning?
- Data in the real world is dirty
  - Incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate data
    - E.g., occupation = "" (empty)
  - Noisy: containing errors or outliers
    - E.g., Salary = -10
  - Inconsistent: containing discrepancies in codes or names
    - E.g., Age = 4 and Birthday = 03/07/1967
    - E.g., was rating 1, 2, 3, now rating A, B, C
    - E.g., discrepancy between duplicate records

Example: Bird Observation Data
- Change of range boundaries over time, e.g., for temperature
- Different units, e.g., meters versus feet for elevation
- Addition or removal of attributes over the years
- Missing entries, especially for habitat and weather
  - People want to watch birds, not fill out long forms
- GIS data based on 30m cells or 1km cells
- Location accuracy
  - ZIP code versus GPS coordinates
  - Walk along transect but report only single location
- Inconsistent encoding of missing entries
  - 0, -9999, -3.4E+38: need context to decide
- Varying observer experience and capabilities
  - Confusion of species, e.g., Hairy vs. Downy Woodpecker
  - Missed species that was present
- Confusion about reporting protocol
  - Report max versus sum seen
  - Report only interesting species, not all

How to Handle Missing Data?
- Ignore the record
  - Usually done when class label is missing (for classification tasks)
- Fill in manually
  - Tedious and often not clear what value to fill in
- Fill in automatically with one of the following:
  - Global constant, e.g., "unknown"
    - "Unknown" could be mistaken as a new concept by the data mining algorithm
  - Attribute mean
  - Attribute mean for all records belonging to the same class
  - Most probable value: inference-based, such as Bayesian formula or decision tree
    - Some methods, e.g., trees, can do this implicitly

How to Handle Noisy Data?
- Noise = random error or variance in a measured variable
- Typical approach: smoothing
  - Adjust values of a record by taking values of other nearby records into account
  - Dozens of approaches
    - Binning, average over neighborhood
    - Regression: replace original records with records drawn from regression function
- Identify and remove outliers, possibly involving human inspection
- For this class: don't do it unless you understand the nature of the noise
  - A good data mining technique should be able to deal with noise in the data
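
Of the automatic fill-in strategies listed for missing data, "attribute mean for all records belonging to the same class" is easy to sketch. The example below is a minimal illustration under stated assumptions: records are plain dictionaries, a missing value is encoded as `None`, and the function and field names (`fill_with_class_mean`, `cls`, `x`) are hypothetical.

```python
from collections import defaultdict

def fill_with_class_mean(records, attr, label):
    """Return copies of the records with missing `attr` values replaced by
    the mean of `attr` over the records that share the same class label."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for r in records:
        if r[attr] is not None:
            sums[r[label]] += r[attr]
            counts[r[label]] += 1

    filled = []
    for r in records:
        copy = dict(r)  # leave the input records untouched
        if copy[attr] is None and counts[copy[label]] > 0:
            copy[attr] = sums[copy[label]] / counts[copy[label]]
        filled.append(copy)
    return filled

records = [
    {"cls": "a", "x": 1.0},
    {"cls": "a", "x": 3.0},
    {"cls": "a", "x": None},   # filled with the class "a" mean
    {"cls": "b", "x": 10.0},
    {"cls": "b", "x": None},   # filled with the class "b" mean
]
filled = fill_with_class_mean(records, "x", "cls")
print([r["x"] for r in filled])
```

A class with no observed values keeps its `None` entries, since there is no class mean to fall back on.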

Data Preprocessing Overview
- Descriptive data summarization
- Data cleaning
- Data integration
- Data transformation
- Summary

Data Integration
- Combines data from multiple sources into a coherent store
- Entity identification problem
  - Identify real world entities from multiple data sources, e.g., Bill Clinton = William Clinton
- Detecting and resolving data value conflicts
  - For the same real world entity, attribute values from different sources might be different
  - Possible reasons: different representations, different scales, e.g., metric vs. US units
- Schema integration: e.g., A.cust-id ≡ B.cust-#
  - Integrate metadata from different sources
  - Can identify identical or similar attributes through correlation analysis

Covariance (Numerical Data)
- Covariance computed for data samples (A_1, A_2, ..., A_n) and (B_1, B_2, ..., B_n):
  $$\mathrm{Cov}(A, B) = \frac{1}{n} \sum_{i=1}^{n} (A_i - \bar{A})(B_i - \bar{B}) = \frac{1}{n} \sum_{i=1}^{n} A_i B_i - \bar{A}\,\bar{B}$$
- If A and B are independent, then Cov(A, B) = 0, but the converse is not true
  - Two random variables may have covariance of 0, but are not independent
- If Cov(A, B) > 0, then A and B tend to rise and fall together
  - The greater, the more so
- If covariance is negative, then A tends to rise as B falls and vice versa

Covariance Example
- Suppose two stocks A and B have the following values in one week:
  - A: (2, 3, 5, 4, 6)
  - B: (5, 8, 10, 11, 14)
- AVG(A) = (2 + 3 + 5 + 4 + 6)/5 = 20/5 = 4
- AVG(B) = (5 + 8 + 10 + 11 + 14)/5 = 48/5 = 9.6
- Cov(A,B) = (2*5 + 3*8 + 5*10 + 4*11 + 6*14)/5 - 4*9.6 = 4
- Cov(A,B) > 0, therefore A and B tend to rise and fall together
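
The covariance example can be checked with a few lines of code. This is a minimal sketch that assumes the stock values A = (2, 3, 5, 4, 6) and B = (5, 8, 10, 11, 14), which are consistent with the stated averages 4 and 9.6, and a divisor of n, which is what reproduces the stated result of 4. The function name `covariance` is hypothetical.

```python
def covariance(a, b):
    """Covariance of two equally long samples, dividing by n."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n

stock_a = [2, 3, 5, 4, 6]      # assumed values, AVG = 4
stock_b = [5, 8, 10, 11, 14]   # AVG = 9.6
print(round(covariance(stock_a, stock_b), 6))  # 4.0
```

Dividing by n - 1 instead of n would give 5 for the same data, so the divisor convention matters when comparing results across tools.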

Correlation Analysis (Numerical Data)
- Pearson's product-moment correlation coefficient of random variables A and B:
  $$\rho_{A,B} = \frac{\mathrm{Cov}(A, B)}{\sigma_A \sigma_B}$$
- Computed for two attributes A and B from data samples (A_1, A_2, ..., A_n) and (B_1, B_2, ..., B_n):
  $$r_{A,B} = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{A_i - \bar{A}}{s_A}\right) \left(\frac{B_i - \bar{B}}{s_B}\right)$$
  - Where \bar{A} and \bar{B} are the sample means, and s_A and s_B are the sample standard deviations of A and B (using the variance formula for s_{n-1})
- Note: -1 ≤ r_{A,B} ≤ 1
- r_{A,B} > 0: A and B positively correlated
  - The higher, the stronger the correlation
- r_{A,B} < 0: negatively correlated

Correlation Analysis (Categorical Data)
- χ² (chi-square) test:
  $$\chi^2 = \sum \frac{(\mathrm{Observed} - \mathrm{Expected})^2}{\mathrm{Expected}}$$
- The larger the χ² value, the more likely the variables are related
- The cells that contribute the most to the χ² value are those whose actual count is very different from the expected count
- Correlation does not imply causality
  - # of hospitals and # of car-thefts in a city are correlated
  - Both are causally linked to the third variable: population
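
For the numerical case, the sample correlation coefficient can be sketched directly from its definition: standardize each attribute with its n-1 standard deviation, multiply, and average with a divisor of n-1. The function names below are hypothetical and the input values are illustrative only.

```python
import math

def sample_std(xs):
    """Sample standard deviation using the n - 1 variance formula."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))

def pearson_r(a, b):
    """Sample Pearson correlation coefficient of two equally long samples."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    s_a = sample_std(a)
    s_b = sample_std(b)
    return sum(((x - mean_a) / s_a) * ((y - mean_b) / s_b)
               for x, y in zip(a, b)) / (n - 1)

# Illustrative values: a perfectly linear pair and a noisier pair.
print(round(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]), 6))   # 1.0
print(pearson_r([2, 3, 5, 4, 6], [5, 8, 10, 11, 14]))    # close to, but below, 1
```

The function is undefined for a constant attribute, because its standard deviation is zero.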

Chi-Square Example

                            Play chess    Not play chess    Sum (row)
  Like science fiction      250 (90)      200 (360)         450
  Not like science fiction  50 (210)      1000 (840)        1050
  Sum (col.)                300           1200              1500

- Numbers in parentheses are expected counts calculated based on the data distribution in the two categories
  $$\chi^2 = \frac{(250 - 90)^2}{90} + \frac{(50 - 210)^2}{210} + \frac{(200 - 360)^2}{360} + \frac{(1000 - 840)^2}{840} \approx 507.94$$
- It shows that like_science_fiction and play_chess are correlated in the group

Data Preprocessing Overview
- Descriptive data summarization
- Data cleaning
- Data integration
- Data transformation
- Summary
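
The expected counts and the chi-square statistic in the example can be recomputed from the observed counts alone. The sketch below assumes observed counts of 250, 200, 50 and 1000, which are the values consistent with the row sums 450 and 1050 and the expected counts 90, 360, 210 and 840. The function name `chi_square` is hypothetical.

```python
def chi_square(observed):
    """Return (statistic, expected) for a table of observed counts.

    expected[i][j] = row_sum[i] * col_sum[j] / total
    """
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(col) for col in zip(*observed)]
    total = sum(row_sums)
    expected = [[r * c / total for c in col_sums] for r in row_sums]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    return statistic, expected

# Rows: like / not like science fiction; columns: play / not play chess.
observed = [[250, 200], [50, 1000]]
statistic, expected = chi_square(observed)
print(expected)             # [[90.0, 360.0], [210.0, 840.0]]
print(round(statistic, 2))  # 507.94
```

Turning the statistic into a significance decision additionally needs the degrees of freedom and a chi-square distribution table, which this sketch leaves out.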

Why Data Transformation?
- Make data more mineable
  - E.g., some patterns visible when using single time attribute (entire date-time combination), others only when making hour, day, month, year separate attributes
  - Some patterns only visible at right granularity of representation
- Some methods require normalized data
  - E.g., all attributes in range [0.0, 1.0]
- Reduce data size, both #attributes and #records

Normalization
- Min-max normalization to [new_min_A, new_max_A]:
  $$v' = \frac{v - \min_A}{\max_A - \min_A} (\mathrm{new\_max}_A - \mathrm{new\_min}_A) + \mathrm{new\_min}_A$$
  - E.g., normalize income range [$12,000, $98,000] to [0.0, 1.0]. Then $73,600 is mapped to
    $$\frac{73{,}600 - 12{,}000}{98{,}000 - 12{,}000} (1.0 - 0) + 0 \approx 0.716$$
- Z-score normalization (μ: mean, σ: standard deviation):
  $$v' = \frac{v - \mu_A}{\sigma_A}$$
  - E.g., for μ = 54,000 and σ = 16,000, $73,600 is mapped to
    $$\frac{73{,}600 - 54{,}000}{16{,}000} = 1.225$$
- Normalization by decimal scaling:
  $$v' = \frac{v}{10^j}$$
  where j is the smallest integer such that Max(|v'|) < 1
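
The three normalization schemes can be sketched as small functions. The inputs below (income range 12,000 to 98,000, value 73,600, mean 54,000, standard deviation 16,000) are assumed to match the slide's worked example as far as it can be recovered; the function names are hypothetical, and the decimal-scaling sketch only searches non-negative exponents j.

```python
def min_max(v, min_a, max_a, new_min=0.0, new_max=1.0):
    """Map v from [min_a, max_a] linearly onto [new_min, new_max]."""
    return (v - min_a) / (max_a - min_a) * (new_max - new_min) + new_min

def z_score(v, mean, std):
    """Number of standard deviations that v lies away from the mean."""
    return (v - mean) / std

def decimal_scaling(values):
    """Divide by 10**j for the smallest j >= 0 with max(|v'|) < 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values], j

print(round(min_max(73_600, 12_000, 98_000), 3))   # 0.716
print(z_score(73_600, 54_000, 16_000))             # 1.225
print(decimal_scaling([-986, 917]))                # scaled values and j = 3
```

Min-max normalization depends on the observed minimum and maximum, so a later value outside that range maps outside [new_min, new_max].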

Data Reduction
- Why data reduction?
  - Mining cost often increases rapidly with data size and number of attributes
- Goal: reduce data size, but produce (almost) the same results
- Data reduction strategies
  - Dimensionality reduction
  - Data compression
  - Numerosity reduction
  - Discretization

Dimensionality Reduction: Attribute Subset Selection
- Feature selection (i.e., attribute subset selection):
  - Select a minimum set of attributes such that the mining result is still as good as (or even better than) when using all attributes
- Heuristic methods (due to exponential number of choices):
  - Select independently based on some test
  - Step-wise forward selection
  - Step-wise backward elimination
  - Combining forward selection and backward elimination
  - Eliminate attributes that some trusted method did not use, e.g., a decision tree

Principal Component Analysis
- Find projection that captures largest amount of variation in the data
- Space defined by eigenvectors of the covariance matrix
- Compression: use only first k eigenvectors
[Figure: data in the (x1, x2) plane with eigenvector directions e1 and e2]

Data Reduction Method: Sampling
- Select a small subset of a given data set
  - Reduces mining cost
    - Mining cost usually is super-linear in data size
    - Often makes difference between in-memory processing and need for expensive I/O
- Choose a representative subset of the data
  - Simple random sampling may have poor performance in the presence of skew
- Develop adaptive sampling methods
  - Stratified sampling
    - Approximate the percentage of each class (or sub-population of interest) in the overall database
    - Used in conjunction with skewed data
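
Stratified sampling, as described on the sampling slide, can be sketched by sampling the same fraction from every class so that the class proportions of the full data set are approximately preserved. The function name `stratified_sample`, the `key` callback, and the rule of keeping at least one record per class are assumptions made for this illustration.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, fraction, seed=0):
    """Sample roughly `fraction` of the records from each stratum.

    `key` maps a record to its class (stratum). Every stratum contributes
    at least one record, so rare classes are not lost.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    strata = defaultdict(list)
    for r in records:
        strata[key(r)].append(r)

    sample = []
    for members in strata.values():
        k = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, k))  # without replacement
    return sample

# Skewed data: 90 records of a common class, 10 of a rare one.
records = [("common", i) for i in range(90)] + [("rare", i) for i in range(10)]
sample = stratified_sample(records, key=lambda r: r[0], fraction=0.1)
print(len(sample))
```

A simple random sample of the same size could miss the rare class entirely, which is the skew problem the slide points out.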

Sampling with or without Replacement
[Figure: raw data]

Sampling: Cluster or Stratified Sampling
[Figure: raw data and the resulting cluster/stratified sample]

Data Reduction: Discretization
- Applied to continuous attributes
- Reduces domain size
- Makes the attribute discrete and hence enables use of techniques that only accept categorical attributes
- Approach:
  - Divide the range of the attribute into intervals
  - Interval labels replace the original data

Data Preprocessing Overview
- Descriptive data summarization
- Data cleaning
- Data integration
- Data transformation
- Summary
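
The discretization approach (divide the range into intervals, then replace each value by its interval label) can be sketched with equal-width intervals. The slides do not say how the intervals are chosen, so equal width is an assumption here, as are the function name `equal_width_bins` and the use of the interval index as the label.

```python
def equal_width_bins(values, k):
    """Replace each value by the index (0 .. k-1) of its equal-width interval."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    labels = []
    for v in values:
        if width == 0:
            labels.append(0)  # all values identical: a single interval
        else:
            # The maximum value falls on the right edge; keep it in the last bin.
            labels.append(min(int((v - lo) / width), k - 1))
    return labels

print(equal_width_bins([0, 5, 10, 15, 20, 30], 3))  # [0, 0, 1, 1, 2, 2]
```

Equal-width intervals are sensitive to outliers, since one extreme value stretches the range and can push most records into a single interval.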

Summary
- Data preparation is a big issue for data mining
- Descriptive data summarization is used to understand data properties
- Data preparation includes
  - Data cleaning and integration
  - Data reduction and feature selection
  - Discretization
- Many techniques and commercial tools, but still major challenge and active research area
Data Analysis. Concepts and Techniques. Chapter 2. Chapter 2: Getting to Know Your Data. Data Objects and Attribute Types
Data Aalysis Cocepts ad Techiques Chapter 2 1 Chapter 2: Gettig to Kow Your Data Data Objects ad Attribute Types Basic Statistical Descriptios of Data Data Visualizatio Measurig Data Similarity ad Dissimilarity
More informationChapter 2 and 3, Data Pre-processing
CSI 4352, Itroductio to Data Miig Chapter 2 ad 3, Data Pre-processig Youg-Rae Cho Associate Professor Departmet of Computer Sciece Baylor Uiversity Why Need Data Pre-processig? Icomplete Data Missig values,
More informationData Mining: Concepts and Techniques. Chapter 2
Data Miig: Cocepts ad Techiques Chapter 2 Jiawei Ha Departmet of Computer Sciece Uiversity of Illiois at Urbaa-Champaig www.cs.uiuc.edu/~haj 2006 Jiawei Ha ad Michelie Kamber, All rights reserved Jauary
More information( n+1 2 ) , position=(7+1)/2 =4,(median is observation #4) Median=10lb
Chapter 3 Descriptive Measures Measures of Ceter (Cetral Tedecy) These measures will tell us where is the ceter of our data or where most typical value of a data set lies Mode the value that occurs most
More informationSAMPLE VERSUS POPULATION. Population - consists of all possible measurements that can be made on a particular item or procedure.
SAMPLE VERSUS POPULATION Populatio - cosists of all possible measuremets that ca be made o a particular item or procedure. Ofte a populatio has a ifiite umber of data elemets Geerally expese to determie
More informationDescriptive Statistics Summary Lists
Chapter 209 Descriptive Statistics Summary Lists Itroductio This procedure is used to summarize cotiuous data. Large volumes of such data may be easily summarized i statistical lists of meas, couts, stadard
More informationENGI 4421 Probability and Statistics Faculty of Engineering and Applied Science Problem Set 1 Descriptive Statistics
ENGI 44 Probability ad Statistics Faculty of Egieerig ad Applied Sciece Problem Set Descriptive Statistics. If, i the set of values {,, 3, 4, 5, 6, 7 } a error causes the value 5 to be replaced by 50,
More informationDescribing data with graphics and numbers
Describig data with graphics ad umbers Types of Data Categorical Variables also kow as class variables, omial variables Quatitative Variables aka umerical ariables either cotiuous or discrete. Graphig
More informationCOMP9318: Data Warehousing and Data Mining
COMP9318: Data Warehousig ad Data Miig L3: Data Preprocessig ad Data Cleaig COMP9318: Data Warehousig ad Data Miig 1 Why preprocess the data? COMP9318: Data Warehousig ad Data Miig 2 Why Data Preprocessig?
More informationDesigning a learning system
CS 75 Machie Learig Lecture Desigig a learig system Milos Hauskrecht milos@cs.pitt.edu 539 Seott Square, x-5 people.cs.pitt.edu/~milos/courses/cs75/ Admiistrivia No homework assigmet this week Please try
More informationFundamentals of Media Processing. Shin'ichi Satoh Kazuya Kodama Hiroshi Mo Duy-Dinh Le
Fudametals of Media Processig Shi'ichi Satoh Kazuya Kodama Hiroshi Mo Duy-Dih Le Today's topics Noparametric Methods Parze Widow k-nearest Neighbor Estimatio Clusterig Techiques k-meas Agglomerative Hierarchical
More informationOCR Statistics 1. Working with data. Section 3: Measures of spread
Notes ad Eamples OCR Statistics 1 Workig with data Sectio 3: Measures of spread Just as there are several differet measures of cetral tedec (averages), there are a variet of statistical measures of spread.
More informationNormal Distributions
Normal Distributios Stacey Hacock Look at these three differet data sets Each histogram is overlaid with a curve : A B C A) Weights (g) of ewly bor lab rat pups B) Mea aual temperatures ( F ) i A Arbor,
More informationUNIT 4 Section 8 Estimating Population Parameters using Confidence Intervals
UNIT 4 Sectio 8 Estimatig Populatio Parameters usig Cofidece Itervals To make ifereces about a populatio that caot be surveyed etirely, sample statistics ca be take from a SRS of the populatio ad used
More informationThe Closest Line to a Data Set in the Plane. David Gurney Southeastern Louisiana University Hammond, Louisiana
The Closest Lie to a Data Set i the Plae David Gurey Southeaster Louisiaa Uiversity Hammod, Louisiaa ABSTRACT This paper looks at three differet measures of distace betwee a lie ad a data set i the plae:
More information4.2.1 Bayesian Principal Component Analysis Weighted K Nearest Neighbor Regularized Expectation Maximization
4 DATA PREPROCESSING 4.1 Data Normalizatio 4.1.1 Mi-Max 4.1.2 Z-Score 4.1.3 Decimal Scalig 4.2 Data Imputatio 4.2.1 Bayesia Pricipal Compoet Aalysis 4.2.2 K Nearest Neighbor 4.2.3 Weighted K Nearest Neighbor
More informationImproving Template Based Spike Detection
Improvig Template Based Spike Detectio Kirk Smith, Member - IEEE Portlad State Uiversity petra@ee.pdx.edu Abstract Template matchig algorithms like SSE, Covolutio ad Maximum Likelihood are well kow for
More informationWhich movie we can suggest to Anne?
ECOLE CENTRALE SUPELEC MASTER DSBI DECISION MODELING TUTORIAL COLLABORATIVE FILTERING AS A MODEL OF GROUP DECISION-MAKING You kow that the low-tech way to get recommedatios for products, movies, or etertaiig
More informationSD vs. SD + One of the most important uses of sample statistics is to estimate the corresponding population parameters.
SD vs. SD + Oe of the most importat uses of sample statistics is to estimate the correspodig populatio parameters. The mea of a represetative sample is a good estimate of the mea of the populatio that
More informationEigenimages. Digital Image Processing: Bernd Girod, 2013 Stanford University -- Eigenimages 1
Eigeimages Uitary trasforms Karhue-Loève trasform ad eigeimages Sirovich ad Kirby method Eigefaces for geder recogitio Fisher liear discrimat aalysis Fisherimages ad varyig illumiatio Fisherfaces vs. eigefaces
More informationThe isoperimetric problem on the hypercube
The isoperimetric problem o the hypercube Prepared by: Steve Butler November 2, 2005 1 The isoperimetric problem We will cosider the -dimesioal hypercube Q Recall that the hypercube Q is a graph whose
More informationPolynomial Functions and Models. Learning Objectives. Polynomials. P (x) = a n x n + a n 1 x n a 1 x + a 0, a n 0
Polyomial Fuctios ad Models 1 Learig Objectives 1. Idetify polyomial fuctios ad their degree 2. Graph polyomial fuctios usig trasformatios 3. Idetify the real zeros of a polyomial fuctio ad their multiplicity
More informationEigenimages. Digital Image Processing: Bernd Girod, Stanford University -- Eigenimages 1
Eigeimages Uitary trasforms Karhue-Loève trasform ad eigeimages Sirovich ad Kirby method Eigefaces for geder recogitio Fisher liear discrimat aalysis Fisherimages ad varyig illumiatio Fisherfaces vs. eigefaces
More informationEM375 STATISTICS AND MEASUREMENT UNCERTAINTY LEAST SQUARES LINEAR REGRESSION ANALYSIS
EM375 STATISTICS AND MEASUREMENT UNCERTAINTY LEAST SQUARES LINEAR REGRESSION ANALYSIS I this uit of the course we ivestigate fittig a straight lie to measured (x, y) data pairs. The equatio we wat to fit
More information9.1. Sequences and Series. Sequences. What you should learn. Why you should learn it. Definition of Sequence
_9.qxd // : AM Page Chapter 9 Sequeces, Series, ad Probability 9. Sequeces ad Series What you should lear Use sequece otatio to write the terms of sequeces. Use factorial otatio. Use summatio otatio to
More informationDATA MINING II - 1DL460
DATA MINING II - 1DL460 Sprig 2017 A secod course i data miig http://www.it.uu.se/edu/course/homepage/ifoutv2/vt17/ Kjell Orsbor Uppsala Database Laboratory Departmet of Iformatio Techology, Uppsala Uiversity,
More informationIntermediate Statistics
Gait Learig Guides Itermediate Statistics Data processig & display, Cetral tedecy Author: Raghu M.D. STATISTICS DATA PROCESSING AND DISPLAY Statistics is the study of data or umerical facts of differet
More informationOctahedral Graph Scaling
Octahedral Graph Scalig Peter Russell Jauary 1, 2015 Abstract There is presetly o strog iterpretatio for the otio of -vertex graph scalig. This paper presets a ew defiitio for the term i the cotext of
More informationDesigning a learning system
CS 75 Itro to Machie Learig Lecture Desigig a learig system Milos Hauskrecht milos@pitt.edu 539 Seott Square, -5 people.cs.pitt.edu/~milos/courses/cs75/ Admiistrivia No homework assigmet this week Please
More informationImage Segmentation EEE 508
Image Segmetatio Objective: to determie (etract) object boudaries. It is a process of partitioig a image ito distict regios by groupig together eighborig piels based o some predefied similarity criterio.
More informationOur second algorithm. Comp 135 Machine Learning Computer Science Tufts University. Decision Trees. Decision Trees. Decision Trees.
Comp 135 Machie Learig Computer Sciece Tufts Uiversity Fall 2017 Roi Khardo Some of these slides were adapted from previous slides by Carla Brodley Our secod algorithm Let s look at a simple dataset for
More information3D Model Retrieval Method Based on Sample Prediction
20 Iteratioal Coferece o Computer Commuicatio ad Maagemet Proc.of CSIT vol.5 (20) (20) IACSIT Press, Sigapore 3D Model Retrieval Method Based o Sample Predictio Qigche Zhag, Ya Tag* School of Computer
More informationProtected points in ordered trees
Applied Mathematics Letters 008 56 50 www.elsevier.com/locate/aml Protected poits i ordered trees Gi-Sag Cheo a, Louis W. Shapiro b, a Departmet of Mathematics, Sugkyukwa Uiversity, Suwo 440-746, Republic
More informationNumerical Methods Lecture 6 - Curve Fitting Techniques
Numerical Methods Lecture 6 - Curve Fittig Techiques Topics motivatio iterpolatio liear regressio higher order polyomial form expoetial form Curve fittig - motivatio For root fidig, we used a give fuctio
More informationWavelet Transform. CSE 490 G Introduction to Data Compression Winter Wavelet Transformed Barbara (Enhanced) Wavelet Transformed Barbara (Actual)
Wavelet Trasform CSE 49 G Itroductio to Data Compressio Witer 6 Wavelet Trasform Codig PACW Wavelet Trasform A family of atios that filters the data ito low resolutio data plus detail data high pass filter
More informationPerformance Plus Software Parameter Definitions
Performace Plus+ Software Parameter Defiitios/ Performace Plus Software Parameter Defiitios Chapma Techical Note-TG-5 paramete.doc ev-0-03 Performace Plus+ Software Parameter Defiitios/2 Backgroud ad Defiitios
More informationAdministrative UNSUPERVISED LEARNING. Unsupervised learning. Supervised learning 11/25/13. Final project. No office hours today
Admiistrative Fial project No office hours today UNSUPERVISED LEARNING David Kauchak CS 451 Fall 2013 Supervised learig Usupervised learig label label 1 label 3 model/ predictor label 4 label 5 Supervised
More informationPattern Recognition Systems Lab 1 Least Mean Squares
Patter Recogitio Systems Lab 1 Least Mea Squares 1. Objectives This laboratory work itroduces the OpeCV-based framework used throughout the course. I this assigmet a lie is fitted to a set of poits usig
More informationData Mining: Concepts and Techniques. (3 rd ed.) Chapter 3. Chapter 3: Data Preprocessing. Major Tasks in Data Preprocessing
Data Mining: Concepts and Techniques (3 rd ed.) Chapter 3 1 Chapter 3: Data Preprocessing Data Preprocessing: An Overview Data Quality Major Tasks in Data Preprocessing Data Cleaning Data Integration Data
More informationDimensionality Reduction PCA
Dimesioality Reductio PCA Machie Learig CSE446 David Wadde (slides provided by Carlos Guestri) Uiversity of Washigto Feb 22, 2017 Carlos Guestri 2005-2017 1 Dimesioality reductio Iput data may have thousads
More informationOnes Assignment Method for Solving Traveling Salesman Problem
Joural of mathematics ad computer sciece 0 (0), 58-65 Oes Assigmet Method for Solvig Travelig Salesma Problem Hadi Basirzadeh Departmet of Mathematics, Shahid Chamra Uiversity, Ahvaz, Ira Article history:
More informationAlpha Individual Solutions MAΘ National Convention 2013
Alpha Idividual Solutios MAΘ Natioal Covetio 0 Aswers:. D. A. C 4. D 5. C 6. B 7. A 8. C 9. D 0. B. B. A. D 4. C 5. A 6. C 7. B 8. A 9. A 0. C. E. B. D 4. C 5. A 6. D 7. B 8. C 9. D 0. B TB. 570 TB. 5
More informationCopyright 2016 Ramez Elmasri and Shamkant B. Navathe
Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 18 Strategies for Query Processig Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio DBMS techiques to process a query Scaer idetifies
More informationMath 10C Long Range Plans
Math 10C Log Rage Plas Uits: Evaluatio: Homework, projects ad assigmets 10% Uit Tests. 70% Fial Examiatio.. 20% Ay Uit Test may be rewritte for a higher mark. If the retest mark is higher, that mark will
More informationLecture 5. Counting Sort / Radix Sort
Lecture 5. Coutig Sort / Radix Sort T. H. Corme, C. E. Leiserso ad R. L. Rivest Itroductio to Algorithms, 3rd Editio, MIT Press, 2009 Sugkyukwa Uiversity Hyuseug Choo choo@skku.edu Copyright 2000-2018
More information9 x and g(x) = 4. x. Find (x) 3.6. I. Combining Functions. A. From Equations. Example: Let f(x) = and its domain. Example: Let f(x) = and g(x) = x x 4
1 3.6 I. Combiig Fuctios A. From Equatios Example: Let f(x) = 9 x ad g(x) = 4 f x. Fid (x) g ad its domai. 4 Example: Let f(x) = ad g(x) = x x 4. Fid (f-g)(x) B. From Graphs: Graphical Additio. Example:
More informationAnalysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis
Itro to Algorithm Aalysis Aalysis Metrics Slides. Table of Cotets. Aalysis Metrics 3. Exact Aalysis Rules 4. Simple Summatio 5. Summatio Formulas 6. Order of Magitude 7. Big-O otatio 8. Big-O Theorems
More informationSouth Slave Divisional Education Council. Math 10C
South Slave Divisioal Educatio Coucil Math 10C Curriculum Package February 2012 12 Strad: Measuremet Geeral Outcome: Develop spatial sese ad proportioal reasoig It is expected that studets will: 1. Solve
More informationChapter 3: Introduction to Principal components analysis with MATLAB
Chapter 3: Itroductio to Pricipal compoets aalysis with MATLAB The vriety of mathematical tools are avilable ad successfully workig to i the field of image processig. The mai problem with graphical autheticatio
More informationMSC BD 5002/IT 5210: Knowledge Discovery and Data Mining
MSC BD 5002/IT 5210: Kowledge Discovery ad Data Miig Ackowledgemet: Slides modified by Dr. Lei Che based o the slides provided by Jiawei Ha, Michelie Kamber, ad Jia Pei 2012 Ha, Kamber & Pei. All rights
More informationMATHEMATICAL METHODS OF ANALYSIS AND EXPERIMENTAL DATA PROCESSING (Or Methods of Curve Fitting)
MATHEMATICAL METHODS OF ANALYSIS AND EXPERIMENTAL DATA PROCESSING (Or Methods of Curve Fittig) I this chapter, we will eamie some methods of aalysis ad data processig; data obtaied as a result of a give
More informationLearning to Shoot a Goal Lecture 8: Learning Models and Skills
Learig to Shoot a Goal Lecture 8: Learig Models ad Skills How do we acquire skill at shootig goals? CS 344R/393R: Robotics Bejami Kuipers Learig to Shoot a Goal The robot eeds to shoot the ball i the goal.
More informationEE123 Digital Signal Processing
Last Time EE Digital Sigal Processig Lecture 7 Block Covolutio, Overlap ad Add, FFT Discrete Fourier Trasform Properties of the Liear covolutio through circular Today Liear covolutio with Overlap ad add
More informationImproving Face Recognition Rate by Combining Eigenface Approach and Case-based Reasoning
Improvig Face Recogitio Rate by Combiig Eigeface Approach ad Case-based Reasoig Haris Supic, ember, IAENG Abstract There are may approaches to the face recogitio. This paper presets a approach that combies
More informationPackage popkorn. R topics documented: February 20, Type Package
Type Pacage Pacage popkor February 20, 2015 Title For iterval estimatio of mea of selected populatios Versio 0.3-0 Date 2014-07-04 Author Vi Gopal, Claudio Fuetes Maitaier Vi Gopal Depeds
More informationCapability Analysis (Variable Data)
Capability Aalysis (Variable Data) Revised: 0/0/07 Summary... Data Iput... 3 Capability Plot... 5 Aalysis Summary... 6 Aalysis Optios... 8 Capability Idices... Prefereces... 6 Tests for Normality... 7
More informationModel Enhancement in Data Mining: Calibration, ROC Analysis, Model Combination and Mimetic Models
Model Ehacemet i Miig: Calibratio, ROC Aalysis, Model Combiatio ad Mimetic Models José Herádez-Orallo Dpto. de Sistemas Iformáticos y Computació, Uiversidad Politécica de Valecia, jorallo@dsic.upv.es Rome,
More informationIMP: Superposer Integrated Morphometrics Package Superposition Tool
IMP: Superposer Itegrated Morphometrics Package Superpositio Tool Programmig by: David Lieber ( 03) Caisius College 200 Mai St. Buffalo, NY 4208 Cocept by: H. David Sheets, Dept. of Physics, Caisius College
More informationHow do we evaluate algorithms?
F2 Readig referece: chapter 2 + slides Algorithm complexity Big O ad big Ω To calculate ruig time Aalysis of recursive Algorithms Next time: Litterature: slides mostly The first Algorithm desig methods:
More informationMorgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5
Morga Kaufma Publishers 26 February, 28 COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 5 Set-Associative Cache Architecture Performace Summary Whe CPU performace icreases:
More informationcondition w i B i S maximum u i
ecture 10 Dyamic Programmig 10.1 Kapsack Problem November 1, 2004 ecturer: Kamal Jai Notes: Tobias Holgers We are give a set of items U = {a 1, a 2,..., a }. Each item has a weight w i Z + ad a utility