Data Analysis. Concepts and Techniques. Chapter 2. Chapter 2: Getting to Know Your Data. Data Objects and Attribute Types

Size: px
Start display at page:

Download "Data Analysis. Concepts and Techniques. Chapter 2. Chapter 2: Getting to Know Your Data. Data Objects and Attribute Types"

Transcription

1 Data Aalysis Cocepts ad Techiques Chapter 2 1 Chapter 2: Gettig to Kow Your Data Data Objects ad Attribute Types Basic Statistical Descriptios of Data Data Visualizatio Measurig Data Similarity ad Dissimilarity Summary 2 1

2 Types of Data Sets Record Relatioal records Data matrix, e.g., umerical matrix, crosstabs Documet data: text documets: termfrequecy vector Trasactio data Graph ad etwork World Wide Web Social or iformatio etworks Molecular Structures Ordered Video data: sequece of images Temporal data: time-series Sequetial Data: trasactio sequeces Geetic sequece data Spatial, image ad multimedia: Spatial data: maps Image data: Video data: Documet 1 Documet 2 Documet 3 team coach pla y ball score game wi lost timeout TID Items 1 Bread, Coke, Milk 2 Beer, Bread 3 Beer, Coke, Diaper, Milk 4 Beer, Bread, Diaper, Milk 5 Coke, Diaper, Milk seaso 3 Importat Characteristics of Structured Data Dimesioality Curse of dimesioality Sparsity Oly presece couts Resolutio Patters deped o the scale Distributio Cetrality ad dispersio 4 2

3 Data Objects Data sets are made up of data objects. A data object represets a etity. Examples: sales database: customers, store items, sales medical database: patiets, treatmets uiversity database: studets, professors, courses Also called samples, examples, istaces, data poits, objects, tuples. Data objects are described by attributes. Database rows -> data objects; colums ->attributes. 5 Attributes Attribute (or dimesios, features, variables): a data field, represetig a characteristic or feature of a data object. E.g., customer _ID, ame, address Types: Nomial Biary Numeric: quatitative Iterval-scaled Ratio-scaled 6 3

4 Attribute Types Nomial: categories, states, or ames of thigs Hair_color = {aubur, black, blod, brow, grey, red, white} marital status, occupatio, ID umbers, zip codes Biary Nomial attribute with oly 2 states (0 ad 1) Symmetric biary: both outcomes equally importat e.g., geder Asymmetric biary: outcomes ot equally importat. e.g., medical test (positive vs. egative) Covetio: assig 1 to most importat outcome (e.g., HIV positive) Ordial Values have a meaigful order (rakig) but magitude betwee successive values is ot kow. Size = {small, medium, large}, grades, army rakigs 7 Numeric Attribute Types Quatity (iteger or real-valued) Iterval Measured o a scale of equal-sized uits Values have order E.g., temperature i C or F, caledar dates No true zero-poit Ratio Iheret zero-poit We ca speak of values as beig a order of magitude larger tha the uit of measuremet (10 K is twice as high as 5 K ). e.g., temperature i Kelvi, legth, couts, moetary quatities 8 4

5 Discrete vs. Cotiuous Attributes Discrete Attribute Has oly a fiite or coutably ifiite set of values E.g., zip codes, professio, or the set of words i a collectio of documets Sometimes, represeted as iteger variables Note: Biary attributes are a special case of discrete attributes Cotiuous Attribute Has real umbers as attribute values E.g., temperature, height, or weight Practically, real values ca oly be measured ad represeted usig a fiite umber of digits Cotiuous attributes are typically represeted as floatigpoit variables 9 Chapter 2: Gettig to Kow Your Data Data Objects ad Attribute Types Basic Statistical Descriptios of Data Data Visualizatio Measurig Data Similarity ad Dissimilarity Summary 10 5

6 Basic Statistical Descriptios of Data Motivatio To better uderstad the data: cetral tedecy, variatio ad spread Data dispersio characteristics media, max, mi, quatiles, outliers, variace, etc. Numerical dimesios correspod to sorted itervals Data dispersio: aalyzed with multiple graularities of precisio Boxplot or quatile aalysis o sorted itervals Dispersio aalysis o computed measures Foldig measures ito umerical dimesios Boxplot or quatile aalysis o the trasformed cube 11 Measurig the Cetral Tedecy Mea (algebraic measure) (sample vs. populatio): Note: is sample size ad N is populatio size. Weighted arithmetic mea: Trimmed mea: choppig extreme values Media: Mode Middle value if odd umber of values, or average of the middle two values otherwise Estimated by iterpolatio (for grouped data): / 2 ( media L1 ( freq Value that occurs most frequetly i the data Uimodal, bimodal, trimodal Empirical formula: freq) l ) width media x 1 x i 1 i 1 Media iterval w x x i i 1 mea mode 3 ( mea media ) i w i i x N 12 6

7 Symmetric vs. Skewed Data Media, mea ad mode of symmetric, positively ad egatively skewed data symmetric positively skewed egatively skewed August 24, 2015 Data Miig: Cocepts ad Techiques 13 Measurig the Dispersio of Data Quartiles, outliers ad boxplots Quartiles: Q 1 (25 th percetile), Q 3 (75 th percetile) Iter-quartile rage: IQR = Q 3 Q 1 Five umber summary: mi, Q 1, media, Q 3, max Boxplot: eds of the box are the quartiles; media is marked; add whiskers, ad plot outliers idividually Outlier: usually, a value higher/lower tha 1.5 x IQR Variace ad stadard deviatio (sample: s, populatio: σ) Variace: (algebraic, scalable computatio) 2 1 s 1 i 1 ( x x) i 2 1 [ xi ( xi ) i 1 i 1 Stadard deviatio s (or σ) is the square root of variace s 2 ( or σ 2) ] ( xi ) xi N i 1 N i

8 Boxplot Aalysis Five-umber summary of a distributio Miimum, Q1, Media, Q3, Maximum Boxplot Data is represeted with a box The eds of the box are at the first ad third quartiles, i.e., the height of the box is IQR The media is marked by a lie withi the box Whiskers: two lies outside the box exteded to Miimum ad Maximum Outliers: poits beyod a specified outlier threshold, plotted idividually 15 Visualizatio of Data Dispersio: 3-D Boxplots August 24, 2015 Data Miig: Cocepts ad Techiques 16 8

9 Properties of Normal Distributio Curve The ormal (distributio) curve From μ σ to μ+σ: cotais about 68% of the measuremets (μ: mea, σ: stadard deviatio) From μ 2σ to μ+2σ: cotais about 95% of it From μ 3σ to μ+3σ: cotais about 99.7% of it 17 Graphic Displays of Basic Statistical Descriptios Boxplot: graphic display of five-umber summary Histogram: x-axis are values, y-axis repres. frequecies Quatile plot: each value x i is paired with f i idicatig that approximately 100 f i % of data are x i Quatile-quatile (q-q) plot: graphs the quatiles of oe uivariat distributio agaist the correspodig quatiles of aother Scatter plot: each pair of values is a pair of coordiates ad plotted as poits i the plae 18 9

10 Histogram Aalysis Histogram: Graph display of tabulated 40 frequecies, show as bars It shows what proportio of cases fall ito each of several categories Differs from a bar chart i that it is the area of the bar that deotes the value, ot the height as i bar charts, a crucial distictio whe the categories are ot of uiform width The categories are usually specified as 0 o-overlappig itervals of some variable. The categories (bars) must be adjacet Histograms Ofte Tell More tha Boxplots The two histograms show i the left may have the same boxplot represetatio The same values for: mi, Q1, media, Q3, max But they have rather differet data distributios 20 10

11 Quatile Plot Displays all of the data (allowig the user to assess both the overall behavior ad uusual occurreces) Plots quatile iformatio For a data x i data sorted i icreasig order, f i idicates that approximately 100 f i % of the data are below or equal to the value x i Data Miig: Cocepts ad Techiques 21 Quatile-Quatile (Q-Q) Plot Graphs the quatiles of oe uivariate distributio agaist the correspodig quatiles of aother View: Is there is a shift i goig from oe distributio to aother? Example shows uit price of items sold at Brach 1 vs. Brach 2 for each quatile. Uit prices of items sold at Brach 1 ted to be lower tha those at Brach

12 Scatter plot Provides a first look at bivariate data to see clusters of poits, outliers, etc Each pair of values is treated as a pair of coordiates ad plotted as poits i the plae 23 Positively ad Negatively Correlated Data The left half fragmet is positively correlated The right half is egative correlated 24 12

13 Ucorrelated Data 25 13

( n+1 2 ) , position=(7+1)/2 =4,(median is observation #4) Median=10lb

( n+1 2 ) , position=(7+1)/2 =4,(median is observation #4) Median=10lb Chapter 3 Descriptive Measures Measures of Ceter (Cetral Tedecy) These measures will tell us where is the ceter of our data or where most typical value of a data set lies Mode the value that occurs most

More information

Data Preprocessing. Motivation

Data Preprocessing. Motivation Data Preprocessig Mirek Riedewald Some slides based o presetatio by Jiawei Ha ad Michelie Kamber Motivatio Garbage-i, garbage-out Caot get good miig results from bad data Need to uderstad data properties

More information

SAMPLE VERSUS POPULATION. Population - consists of all possible measurements that can be made on a particular item or procedure.

SAMPLE VERSUS POPULATION. Population - consists of all possible measurements that can be made on a particular item or procedure. SAMPLE VERSUS POPULATION Populatio - cosists of all possible measuremets that ca be made o a particular item or procedure. Ofte a populatio has a ifiite umber of data elemets Geerally expese to determie

More information

ECLT 5810 Data Preprocessing. Prof. Wai Lam

ECLT 5810 Data Preprocessing. Prof. Wai Lam ECLT 5810 Data Preprocessing Prof. Wai Lam Why Data Preprocessing? Data in the real world is imperfect incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate

More information

Chapter 2 and 3, Data Pre-processing

Chapter 2 and 3, Data Pre-processing CSI 4352, Itroductio to Data Miig Chapter 2 ad 3, Data Pre-processig Youg-Rae Cho Associate Professor Departmet of Computer Sciece Baylor Uiversity Why Need Data Pre-processig? Icomplete Data Missig values,

More information

Describing data with graphics and numbers

Describing data with graphics and numbers Describig data with graphics ad umbers Types of Data Categorical Variables also kow as class variables, omial variables Quatitative Variables aka umerical ariables either cotiuous or discrete. Graphig

More information

SD vs. SD + One of the most important uses of sample statistics is to estimate the corresponding population parameters.

SD vs. SD + One of the most important uses of sample statistics is to estimate the corresponding population parameters. SD vs. SD + Oe of the most importat uses of sample statistics is to estimate the correspodig populatio parameters. The mea of a represetative sample is a good estimate of the mea of the populatio that

More information

Normal Distributions

Normal Distributions Normal Distributios Stacey Hacock Look at these three differet data sets Each histogram is overlaid with a curve : A B C A) Weights (g) of ewly bor lab rat pups B) Mea aual temperatures ( F ) i A Arbor,

More information

OCR Statistics 1. Working with data. Section 3: Measures of spread

OCR Statistics 1. Working with data. Section 3: Measures of spread Notes ad Eamples OCR Statistics 1 Workig with data Sectio 3: Measures of spread Just as there are several differet measures of cetral tedec (averages), there are a variet of statistical measures of spread.

More information

ENGI 4421 Probability and Statistics Faculty of Engineering and Applied Science Problem Set 1 Descriptive Statistics

ENGI 4421 Probability and Statistics Faculty of Engineering and Applied Science Problem Set 1 Descriptive Statistics ENGI 44 Probability ad Statistics Faculty of Egieerig ad Applied Sciece Problem Set Descriptive Statistics. If, i the set of values {,, 3, 4, 5, 6, 7 } a error causes the value 5 to be replaced by 50,

More information

Intermediate Statistics

Intermediate Statistics Gait Learig Guides Itermediate Statistics Data processig & display, Cetral tedecy Author: Raghu M.D. STATISTICS DATA PROCESSING AND DISPLAY Statistics is the study of data or umerical facts of differet

More information

Data Mining: Concepts and Techniques. Chapter 2

Data Mining: Concepts and Techniques. Chapter 2 Data Miig: Cocepts ad Techiques Chapter 2 Jiawei Ha Departmet of Computer Sciece Uiversity of Illiois at Urbaa-Champaig www.cs.uiuc.edu/~haj 2006 Jiawei Ha ad Michelie Kamber, All rights reserved Jauary

More information

Descriptive Statistics Summary Lists

Descriptive Statistics Summary Lists Chapter 209 Descriptive Statistics Summary Lists Itroductio This procedure is used to summarize cotiuous data. Large volumes of such data may be easily summarized i statistical lists of meas, couts, stadard

More information

DATA MINING II - 1DL460

DATA MINING II - 1DL460 DATA MINING II - 1DL460 Sprig 2017 A secod course i data miig http://www.it.uu.se/edu/course/homepage/ifoutv2/vt17/ Kjell Orsbor Uppsala Database Laboratory Departmet of Iformatio Techology, Uppsala Uiversity,

More information

Performance Plus Software Parameter Definitions

Performance Plus Software Parameter Definitions Performace Plus+ Software Parameter Defiitios/ Performace Plus Software Parameter Defiitios Chapma Techical Note-TG-5 paramete.doc ev-0-03 Performace Plus+ Software Parameter Defiitios/2 Backgroud ad Defiitios

More information

The isoperimetric problem on the hypercube

The isoperimetric problem on the hypercube The isoperimetric problem o the hypercube Prepared by: Steve Butler November 2, 2005 1 The isoperimetric problem We will cosider the -dimesioal hypercube Q Recall that the hypercube Q is a graph whose

More information

9.1. Sequences and Series. Sequences. What you should learn. Why you should learn it. Definition of Sequence

9.1. Sequences and Series. Sequences. What you should learn. Why you should learn it. Definition of Sequence _9.qxd // : AM Page Chapter 9 Sequeces, Series, ad Probability 9. Sequeces ad Series What you should lear Use sequece otatio to write the terms of sequeces. Use factorial otatio. Use summatio otatio to

More information

South Slave Divisional Education Council. Math 10C

South Slave Divisional Education Council. Math 10C South Slave Divisioal Educatio Coucil Math 10C Curriculum Package February 2012 12 Strad: Measuremet Geeral Outcome: Develop spatial sese ad proportioal reasoig It is expected that studets will: 1. Solve

More information

Administrative UNSUPERVISED LEARNING. Unsupervised learning. Supervised learning 11/25/13. Final project. No office hours today

Administrative UNSUPERVISED LEARNING. Unsupervised learning. Supervised learning 11/25/13. Final project. No office hours today Admiistrative Fial project No office hours today UNSUPERVISED LEARNING David Kauchak CS 451 Fall 2013 Supervised learig Usupervised learig label label 1 label 3 model/ predictor label 4 label 5 Supervised

More information

Data Preprocessing. S1 Teknik Informatika Fakultas Teknologi Informasi Universitas Kristen Maranatha

Data Preprocessing. S1 Teknik Informatika Fakultas Teknologi Informasi Universitas Kristen Maranatha Data Preprocessing S1 Teknik Informatika Fakultas Teknologi Informasi Universitas Kristen Maranatha 1 Why Data Preprocessing? Data in the real world is dirty incomplete: lacking attribute values, lacking

More information

UNIT 4 Section 8 Estimating Population Parameters using Confidence Intervals

UNIT 4 Section 8 Estimating Population Parameters using Confidence Intervals UNIT 4 Sectio 8 Estimatig Populatio Parameters usig Cofidece Itervals To make ifereces about a populatio that caot be surveyed etirely, sample statistics ca be take from a SRS of the populatio ad used

More information

Xbar/R Chart for x1-x3

Xbar/R Chart for x1-x3 Chapter 6 Selected roblem Solutios Sectio 6-5 6- a) X-bar ad Rage - Iitial Study Chartig roblem 6- X-bar Rage ----- ----- UCL:. sigma 7.4 UCL:. sigma 5.79 Ceterlie 5.9 Ceterlie.5 LCL: -. sigma.79 LCL:

More information

Math 10C Long Range Plans

Math 10C Long Range Plans Math 10C Log Rage Plas Uits: Evaluatio: Homework, projects ad assigmets 10% Uit Tests. 70% Fial Examiatio.. 20% Ay Uit Test may be rewritte for a higher mark. If the retest mark is higher, that mark will

More information

Wavelet Transform. CSE 490 G Introduction to Data Compression Winter Wavelet Transformed Barbara (Enhanced) Wavelet Transformed Barbara (Actual)

Wavelet Transform. CSE 490 G Introduction to Data Compression Winter Wavelet Transformed Barbara (Enhanced) Wavelet Transformed Barbara (Actual) Wavelet Trasform CSE 49 G Itroductio to Data Compressio Witer 6 Wavelet Trasform Codig PACW Wavelet Trasform A family of atios that filters the data ito low resolutio data plus detail data high pass filter

More information

Polynomial Functions and Models. Learning Objectives. Polynomials. P (x) = a n x n + a n 1 x n a 1 x + a 0, a n 0

Polynomial Functions and Models. Learning Objectives. Polynomials. P (x) = a n x n + a n 1 x n a 1 x + a 0, a n 0 Polyomial Fuctios ad Models 1 Learig Objectives 1. Idetify polyomial fuctios ad their degree 2. Graph polyomial fuctios usig trasformatios 3. Idetify the real zeros of a polyomial fuctio ad their multiplicity

More information

Alpha Individual Solutions MAΘ National Convention 2013

Alpha Individual Solutions MAΘ National Convention 2013 Alpha Idividual Solutios MAΘ Natioal Covetio 0 Aswers:. D. A. C 4. D 5. C 6. B 7. A 8. C 9. D 0. B. B. A. D 4. C 5. A 6. C 7. B 8. A 9. A 0. C. E. B. D 4. C 5. A 6. D 7. B 8. C 9. D 0. B TB. 570 TB. 5

More information

ECE4050 Data Structures and Algorithms. Lecture 6: Searching

ECE4050 Data Structures and Algorithms. Lecture 6: Searching ECE4050 Data Structures ad Algorithms Lecture 6: Searchig 1 Search Give: Distict keys k 1, k 2,, k ad collectio L of records of the form (k 1, I 1 ), (k 2, I 2 ),, (k, I ) where I j is the iformatio associated

More information

Which movie we can suggest to Anne?

Which movie we can suggest to Anne? ECOLE CENTRALE SUPELEC MASTER DSBI DECISION MODELING TUTORIAL COLLABORATIVE FILTERING AS A MODEL OF GROUP DECISION-MAKING You kow that the low-tech way to get recommedatios for products, movies, or etertaiig

More information

CS570 Introduction to Data Mining

CS570 Introduction to Data Mining CS570 Introduction to Data Mining Department of Mathematics and Computer Science Li Xiong Data Exploration and Data Preprocessing Data and attributes Data exploration Data pre-processing 2 10 What is Data?

More information

Force Network Analysis using Complementary Energy

Force Network Analysis using Complementary Energy orce Network Aalysis usig Complemetary Eergy Adrew BORGART Assistat Professor Delft Uiversity of Techology Delft, The Netherlads A.Borgart@tudelft.l Yaick LIEM Studet Delft Uiversity of Techology Delft,

More information

Name Date Hr. ALGEBRA 1-2 SPRING FINAL MULTIPLE CHOICE REVIEW #1

Name Date Hr. ALGEBRA 1-2 SPRING FINAL MULTIPLE CHOICE REVIEW #1 Name Date Hr. ALGEBRA - SPRING FINAL MULTIPLE CHOICE REVIEW #. The high temperatures for Phoeix i October of 009 are listed below. Which measure of ceter will provide the most accurate estimatio of the

More information

Sorting in Linear Time. Data Structures and Algorithms Andrei Bulatov

Sorting in Linear Time. Data Structures and Algorithms Andrei Bulatov Sortig i Liear Time Data Structures ad Algorithms Adrei Bulatov Algorithms Sortig i Liear Time 7-2 Compariso Sorts The oly test that all the algorithms we have cosidered so far is compariso The oly iformatio

More information

EM375 STATISTICS AND MEASUREMENT UNCERTAINTY LEAST SQUARES LINEAR REGRESSION ANALYSIS

EM375 STATISTICS AND MEASUREMENT UNCERTAINTY LEAST SQUARES LINEAR REGRESSION ANALYSIS EM375 STATISTICS AND MEASUREMENT UNCERTAINTY LEAST SQUARES LINEAR REGRESSION ANALYSIS I this uit of the course we ivestigate fittig a straight lie to measured (x, y) data pairs. The equatio we wat to fit

More information

Ones Assignment Method for Solving Traveling Salesman Problem

Ones Assignment Method for Solving Traveling Salesman Problem Joural of mathematics ad computer sciece 0 (0), 58-65 Oes Assigmet Method for Solvig Travelig Salesma Problem Hadi Basirzadeh Departmet of Mathematics, Shahid Chamra Uiversity, Ahvaz, Ira Article history:

More information

Tutorial on Packet Time Metrics

Tutorial on Packet Time Metrics Power Matters. Tutorial o Packet Time Metrics Lee Cosart lee.cosart@microsemi.com ITS 204 204 Microsemi Corporatio. COMPANY POPIETAY Itroductio requecy trasport Oe-way: forward & reverse packet streams

More information

Acquisition Description Exploration Examination Understanding what data is collected. Characterizing properties of data.

Acquisition Description Exploration Examination Understanding what data is collected. Characterizing properties of data. Summary Statistics Acquisition Description Exploration Examination what data is collected Characterizing properties of data. Exploring the data distribution(s). Identifying data quality problems. Selecting

More information

CS 111: Program Design I Lecture 21: Network Analysis. Robert H. Sloan & Richard Warner University of Illinois at Chicago April 10, 2018

CS 111: Program Design I Lecture 21: Network Analysis. Robert H. Sloan & Richard Warner University of Illinois at Chicago April 10, 2018 CS 111: Program Desig I Lecture 21: Network Aalysis Robert H. Sloa & Richard Warer Uiversity of Illiois at Chicago April 10, 2018 NETWORK ANALYSIS Which displays a graph i the sese of graph/etwork aalysis?

More information

CS6220: DATA MINING TECHNIQUES

CS6220: DATA MINING TECHNIQUES CS6220: DATA MINING TECHNIQUES 2: Data Pre-Processing Instructor: Yizhou Sun yzsun@ccs.neu.edu September 10, 2013 2: Data Pre-Processing Getting to know your data Basic Statistical Descriptions of Data

More information

Improving Template Based Spike Detection

Improving Template Based Spike Detection Improvig Template Based Spike Detectio Kirk Smith, Member - IEEE Portlad State Uiversity petra@ee.pdx.edu Abstract Template matchig algorithms like SSE, Covolutio ad Maximum Likelihood are well kow for

More information

The Closest Line to a Data Set in the Plane. David Gurney Southeastern Louisiana University Hammond, Louisiana

The Closest Line to a Data Set in the Plane. David Gurney Southeastern Louisiana University Hammond, Louisiana The Closest Lie to a Data Set i the Plae David Gurey Southeaster Louisiaa Uiversity Hammod, Louisiaa ABSTRACT This paper looks at three differet measures of distace betwee a lie ad a data set i the plae:

More information

9 x and g(x) = 4. x. Find (x) 3.6. I. Combining Functions. A. From Equations. Example: Let f(x) = and its domain. Example: Let f(x) = and g(x) = x x 4

9 x and g(x) = 4. x. Find (x) 3.6. I. Combining Functions. A. From Equations. Example: Let f(x) = and its domain. Example: Let f(x) = and g(x) = x x 4 1 3.6 I. Combiig Fuctios A. From Equatios Example: Let f(x) = 9 x ad g(x) = 4 f x. Fid (x) g ad its domai. 4 Example: Let f(x) = ad g(x) = x x 4. Fid (f-g)(x) B. From Graphs: Graphical Additio. Example:

More information

. Written in factored form it is easy to see that the roots are 2, 2, i,

. Written in factored form it is easy to see that the roots are 2, 2, i, CMPS A Itroductio to Programmig Programmig Assigmet 4 I this assigmet you will write a java program that determies the real roots of a polyomial that lie withi a specified rage. Recall that the roots (or

More information

The Nature of Light. Chapter 22. Geometric Optics Using a Ray Approximation. Ray Approximation

The Nature of Light. Chapter 22. Geometric Optics Using a Ray Approximation. Ray Approximation The Nature of Light Chapter Reflectio ad Refractio of Light Sectios: 5, 8 Problems: 6, 7, 4, 30, 34, 38 Particles of light are called photos Each photo has a particular eergy E = h ƒ h is Plack s costat

More information

A New Morphological 3D Shape Decomposition: Grayscale Interframe Interpolation Method

A New Morphological 3D Shape Decomposition: Grayscale Interframe Interpolation Method A ew Morphological 3D Shape Decompositio: Grayscale Iterframe Iterpolatio Method D.. Vizireau Politehica Uiversity Bucharest, Romaia ae@comm.pub.ro R. M. Udrea Politehica Uiversity Bucharest, Romaia mihea@comm.pub.ro

More information

Eigenimages. Digital Image Processing: Bernd Girod, Stanford University -- Eigenimages 1

Eigenimages. Digital Image Processing: Bernd Girod, Stanford University -- Eigenimages 1 Eigeimages Uitary trasforms Karhue-Loève trasform ad eigeimages Sirovich ad Kirby method Eigefaces for geder recogitio Fisher liear discrimat aalysis Fisherimages ad varyig illumiatio Fisherfaces vs. eigefaces

More information

Capability Analysis (Variable Data)

Capability Analysis (Variable Data) Capability Aalysis (Variable Data) Revised: 0/0/07 Summary... Data Iput... 3 Capability Plot... 5 Aalysis Summary... 6 Aalysis Optios... 8 Capability Idices... Prefereces... 6 Tests for Normality... 7

More information

STA Rev. F Learning Objectives. Learning Objectives (Cont.) Module 3 Descriptive Measures

STA Rev. F Learning Objectives. Learning Objectives (Cont.) Module 3 Descriptive Measures STA 2023 Module 3 Descriptive Measures Learning Objectives Upon completing this module, you should be able to: 1. Explain the purpose of a measure of center. 2. Obtain and interpret the mean, median, and

More information

Octahedral Graph Scaling

Octahedral Graph Scaling Octahedral Graph Scalig Peter Russell Jauary 1, 2015 Abstract There is presetly o strog iterpretatio for the otio of -vertex graph scalig. This paper presets a ew defiitio for the term i the cotext of

More information

IMP: Superposer Integrated Morphometrics Package Superposition Tool

IMP: Superposer Integrated Morphometrics Package Superposition Tool IMP: Superposer Itegrated Morphometrics Package Superpositio Tool Programmig by: David Lieber ( 03) Caisius College 200 Mai St. Buffalo, NY 4208 Cocept by: H. David Sheets, Dept. of Physics, Caisius College

More information

IAT 355 Visual Analytics. Data and Statistical Models. Lyn Bartram

IAT 355 Visual Analytics. Data and Statistical Models. Lyn Bartram IAT 355 Visual Analytics Data and Statistical Models Lyn Bartram Exploring data Example: US Census People # of people in group Year # 1850 2000 (every decade) Age # 0 90+ Sex (Gender) # Male, female Marital

More information

Our second algorithm. Comp 135 Machine Learning Computer Science Tufts University. Decision Trees. Decision Trees. Decision Trees.

Our second algorithm. Comp 135 Machine Learning Computer Science Tufts University. Decision Trees. Decision Trees. Decision Trees. Comp 135 Machie Learig Computer Sciece Tufts Uiversity Fall 2017 Roi Khardo Some of these slides were adapted from previous slides by Carla Brodley Our secod algorithm Let s look at a simple dataset for

More information

CHAPTER-13. Mining Class Comparisons: Discrimination between DifferentClasses: 13.4 Class Description: Presentation of Both Characterization and

CHAPTER-13. Mining Class Comparisons: Discrimination between DifferentClasses: 13.4 Class Description: Presentation of Both Characterization and CHAPTER-13 Mining Class Comparisons: Discrimination between DifferentClasses: 13.1 Introduction 13.2 Class Comparison Methods and Implementation 13.3 Presentation of Class Comparison Descriptions 13.4

More information

3D Model Retrieval Method Based on Sample Prediction

3D Model Retrieval Method Based on Sample Prediction 20 Iteratioal Coferece o Computer Commuicatio ad Maagemet Proc.of CSIT vol.5 (20) (20) IACSIT Press, Sigapore 3D Model Retrieval Method Based o Sample Predictio Qigche Zhag, Ya Tag* School of Computer

More information

How do we evaluate algorithms?

How do we evaluate algorithms? F2 Readig referece: chapter 2 + slides Algorithm complexity Big O ad big Ω To calculate ruig time Aalysis of recursive Algorithms Next time: Litterature: slides mostly The first Algorithm desig methods:

More information

Fast Fourier Transform (FFT) Algorithms

Fast Fourier Transform (FFT) Algorithms Fast Fourier Trasform FFT Algorithms Relatio to the z-trasform elsewhere, ozero, z x z X x [ ] 2 ~ elsewhere,, ~ e j x X x x π j e z z X X π 2 ~ The DFS X represets evely spaced samples of the z- trasform

More information

The golden search method: Question 1

The golden search method: Question 1 1. Golde Sectio Search for the Mode of a Fuctio The golde search method: Questio 1 Suppose the last pair of poits at which we have a fuctio evaluatio is x(), y(). The accordig to the method, If f(x())

More information

Intro to Scientific Computing: Solutions

Intro to Scientific Computing: Solutions Itro to Scietific Computig: Solutios Dr. David M. Goulet. How may steps does it take to separate 3 objects ito groups of 4? We start with 5 objects ad apply 3 steps of the algorithm to reduce the pile

More information

Analysis of Server Resource Consumption of Meteorological Satellite Application System Based on Contour Curve

Analysis of Server Resource Consumption of Meteorological Satellite Application System Based on Contour Curve Advaces i Computer, Sigals ad Systems (2018) 2: 19-25 Clausius Scietific Press, Caada Aalysis of Server Resource Cosumptio of Meteorological Satellite Applicatio System Based o Cotour Curve Xiagag Zhao

More information

Name Date Hr. ALGEBRA 1-2 SPRING FINAL MULTIPLE CHOICE REVIEW #2

Name Date Hr. ALGEBRA 1-2 SPRING FINAL MULTIPLE CHOICE REVIEW #2 Name Date Hr. ALGEBRA - SPRING FINAL MULTIPLE CHOICE REVIEW # 5. Which measure of ceter is most appropriate for the followig data set? {7, 7, 75, 77,, 9, 9, 90} Mea Media Stadard Deviatio Rage 5. The umber

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 26 Ehaced Data Models: Itroductio to Active, Temporal, Spatial, Multimedia, ad Deductive Databases Copyright 2016 Ramez Elmasri ad Shamkat B.

More information

Designing a learning system

Designing a learning system CS 75 Machie Learig Lecture Desigig a learig system Milos Hauskrecht milos@cs.pitt.edu 539 Seott Square, x-5 people.cs.pitt.edu/~milos/courses/cs75/ Admiistrivia No homework assigmet this week Please try

More information

Lower Bounds for Sorting

Lower Bounds for Sorting Liear Sortig Topics Covered: Lower Bouds for Sortig Coutig Sort Radix Sort Bucket Sort Lower Bouds for Sortig Compariso vs. o-compariso sortig Decisio tree model Worst case lower boud Compariso Sortig

More information

Assignment Problems with fuzzy costs using Ones Assignment Method

Assignment Problems with fuzzy costs using Ones Assignment Method IOSR Joural of Mathematics (IOSR-JM) e-issn: 8-8, p-issn: 9-6. Volume, Issue Ver. V (Sep. - Oct.06), PP 8-89 www.iosrjourals.org Assigmet Problems with fuzzy costs usig Oes Assigmet Method S.Vimala, S.Krisha

More information

Eigenimages. Digital Image Processing: Bernd Girod, 2013 Stanford University -- Eigenimages 1

Eigenimages. Digital Image Processing: Bernd Girod, 2013 Stanford University -- Eigenimages 1 Eigeimages Uitary trasforms Karhue-Loève trasform ad eigeimages Sirovich ad Kirby method Eigefaces for geder recogitio Fisher liear discrimat aalysis Fisherimages ad varyig illumiatio Fisherfaces vs. eigefaces

More information

Accuracy Improvement in Camera Calibration

Accuracy Improvement in Camera Calibration Accuracy Improvemet i Camera Calibratio FaJie L Qi Zag ad Reihard Klette CITR, Computer Sciece Departmet The Uiversity of Aucklad Tamaki Campus, Aucklad, New Zealad fli006, qza001@ec.aucklad.ac.z r.klette@aucklad.ac.z

More information

Numerical Methods Lecture 6 - Curve Fitting Techniques

Numerical Methods Lecture 6 - Curve Fitting Techniques Numerical Methods Lecture 6 - Curve Fittig Techiques Topics motivatio iterpolatio liear regressio higher order polyomial form expoetial form Curve fittig - motivatio For root fidig, we used a give fuctio

More information

Image Segmentation EEE 508

Image Segmentation EEE 508 Image Segmetatio Objective: to determie (etract) object boudaries. It is a process of partitioig a image ito distict regios by groupig together eighborig piels based o some predefied similarity criterio.

More information

Protected points in ordered trees

Protected points in ordered trees Applied Mathematics Letters 008 56 50 www.elsevier.com/locate/aml Protected poits i ordered trees Gi-Sag Cheo a, Louis W. Shapiro b, a Departmet of Mathematics, Sugkyukwa Uiversity, Suwo 440-746, Republic

More information

The first few questions on this worksheet will deal with measures of central tendency. These data types tell us where the center of the data set lies.

The first few questions on this worksheet will deal with measures of central tendency. These data types tell us where the center of the data set lies. Instructions: You are given the following data below these instructions. Your client (Courtney) wants you to statistically analyze the data to help her reach conclusions about how well she is teaching.

More information

Τεχνολογία Λογισμικού

Τεχνολογία Λογισμικού ΕΘΝΙΚΟ ΜΕΤΣΟΒΙΟ ΠΟΛΥΤΕΧΝΕΙΟ Σχολή Ηλεκτρολόγων Μηχανικών και Μηχανικών Υπολογιστών Τεχνολογία Λογισμικού, 7ο/9ο εξάμηνο 2018-2019 Τεχνολογία Λογισμικού Ν.Παπασπύρου, Αν.Καθ. ΣΗΜΜΥ, ickie@softlab.tua,gr

More information

Appendix D. Controller Implementation

Appendix D. Controller Implementation COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Appedix D Cotroller Implemetatio Cotroller Implemetatios Combiatioal logic (sigle-cycle); Fiite state machie (multi-cycle, pipelied);

More information

Fundamentals of Media Processing. Shin'ichi Satoh Kazuya Kodama Hiroshi Mo Duy-Dinh Le

Fundamentals of Media Processing. Shin'ichi Satoh Kazuya Kodama Hiroshi Mo Duy-Dinh Le Fudametals of Media Processig Shi'ichi Satoh Kazuya Kodama Hiroshi Mo Duy-Dih Le Today's topics Noparametric Methods Parze Widow k-nearest Neighbor Estimatio Clusterig Techiques k-meas Agglomerative Hierarchical

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 18 Strategies for Query Processig Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio DBMS techiques to process a query Scaer idetifies

More information

Arithmetic Sequences

Arithmetic Sequences . Arithmetic Sequeces COMMON CORE Learig Stadards HSF-IF.A. HSF-BF.A.1a HSF-BF.A. HSF-LE.A. Essetial Questio How ca you use a arithmetic sequece to describe a patter? A arithmetic sequece is a ordered

More information

Package popkorn. R topics documented: February 20, Type Package

Package popkorn. R topics documented: February 20, Type Package Type Pacage Pacage popkor February 20, 2015 Title For iterval estimatio of mea of selected populatios Versio 0.3-0 Date 2014-07-04 Author Vi Gopal, Claudio Fuetes Maitaier Vi Gopal Depeds

More information

BOOLEAN MATHEMATICS: GENERAL THEORY

BOOLEAN MATHEMATICS: GENERAL THEORY CHAPTER 3 BOOLEAN MATHEMATICS: GENERAL THEORY 3.1 ISOMORPHIC PROPERTIES The ame Boolea Arithmetic was chose because it was discovered that literal Boolea Algebra could have a isomorphic umerical aspect.

More information

Chapter 3 - Displaying and Summarizing Quantitative Data

Chapter 3 - Displaying and Summarizing Quantitative Data Chapter 3 - Displaying and Summarizing Quantitative Data 3.1 Graphs for Quantitative Data (LABEL GRAPHS) August 25, 2014 Histogram (p. 44) - Graph that uses bars to represent different frequencies or relative

More information

Area As A Limit & Sigma Notation

Area As A Limit & Sigma Notation Area As A Limit & Sigma Notatio SUGGESTED REFERENCE MATERIAL: As you work through the problems listed below, you should referece Chapter 5.4 of the recommeded textbook (or the equivalet chapter i your

More information

Chapter 2 Data Preprocessing

Chapter 2 Data Preprocessing Chapter 2 Data Preprocessing CISC4631 1 Outline General data characteristics Data cleaning Data integration and transformation Data reduction Summary CISC4631 2 1 Types of Data Sets Record Relational records

More information

MATHEMATICAL METHODS OF ANALYSIS AND EXPERIMENTAL DATA PROCESSING (Or Methods of Curve Fitting)

MATHEMATICAL METHODS OF ANALYSIS AND EXPERIMENTAL DATA PROCESSING (Or Methods of Curve Fitting) MATHEMATICAL METHODS OF ANALYSIS AND EXPERIMENTAL DATA PROCESSING (Or Methods of Curve Fittig) I this chapter, we will eamie some methods of aalysis ad data processig; data obtaied as a result of a give

More information

Vocabulary. 5-number summary Rule. Area principle. Bar chart. Boxplot. Categorical data condition. Categorical variable.

Vocabulary. 5-number summary Rule. Area principle. Bar chart. Boxplot. Categorical data condition. Categorical variable. 5-number summary 68-95-99.7 Rule Area principle Bar chart Bimodal Boxplot Case Categorical data Categorical variable Center Changing center and spread Conditional distribution Context Contingency table

More information

Spectral leakage and windowing

Spectral leakage and windowing EEL33: Discrete-Time Sigals ad Systems Spectral leakage ad widowig. Itroductio Spectral leakage ad widowig I these otes, we itroduce the idea of widowig for reducig the effects of spectral leakage, ad

More information

Lecture Notes 6 Introduction to algorithm analysis CSS 501 Data Structures and Object-Oriented Programming

Lecture Notes 6 Introduction to algorithm analysis CSS 501 Data Structures and Object-Oriented Programming Lecture Notes 6 Itroductio to algorithm aalysis CSS 501 Data Structures ad Object-Orieted Programmig Readig for this lecture: Carrao, Chapter 10 To be covered i this lecture: Itroductio to algorithm aalysis

More information

Chapter 3: Introduction to Principal components analysis with MATLAB

Chapter 3: Introduction to Principal components analysis with MATLAB Chapter 3: Itroductio to Pricipal compoets aalysis with MATLAB The vriety of mathematical tools are avilable ad successfully workig to i the field of image processig. The mai problem with graphical autheticatio

More information

What are we going to learn? CSC Data Structures Analysis of Algorithms. Overview. Algorithm, and Inputs

What are we going to learn? CSC Data Structures Analysis of Algorithms. Overview. Algorithm, and Inputs What are we goig to lear? CSC316-003 Data Structures Aalysis of Algorithms Computer Sciece North Carolia State Uiversity Need to say that some algorithms are better tha others Criteria for evaluatio Structure

More information

Learning to Shoot a Goal Lecture 8: Learning Models and Skills

Learning to Shoot a Goal Lecture 8: Learning Models and Skills Learig to Shoot a Goal Lecture 8: Learig Models ad Skills How do we acquire skill at shootig goals? CS 344R/393R: Robotics Bejami Kuipers Learig to Shoot a Goal The robot eeds to shoot the ball i the goal.

More information

Data Mining and Analytics. Introduction

Data Mining and Analytics. Introduction Data Mining and Analytics Introduction Data Mining Data mining refers to extracting or mining knowledge from large amounts of data It is also termed as Knowledge Discovery from Data (KDD) Mostly, data

More information

1.2 Binomial Coefficients and Subsets

1.2 Binomial Coefficients and Subsets 1.2. BINOMIAL COEFFICIENTS AND SUBSETS 13 1.2 Biomial Coefficiets ad Subsets 1.2-1 The loop below is part of a program to determie the umber of triagles formed by poits i the plae. for i =1 to for j =

More information

Random Graphs and Complex Networks T

Random Graphs and Complex Networks T Radom Graphs ad Complex Networks T-79.7003 Charalampos E. Tsourakakis Aalto Uiversity Lecture 3 7 September 013 Aoucemet Homework 1 is out, due i two weeks from ow. Exercises: Probabilistic iequalities

More information

The Magma Database file formats

The Magma Database file formats The Magma Database file formats Adrew Gaylard, Bret Pikey, ad Mart-Mari Breedt Johaesburg, South Africa 15th May 2006 1 Summary Magma is a ope-source object database created by Chris Muller, of Kasas City,

More information

Using The Central Limit Theorem for Belief Network Learning

Using The Central Limit Theorem for Belief Network Learning Usig The Cetral Limit Theorem for Belief Network Learig Ia Davidso, Mioo Amiia Computer Sciece Dept, SUNY Albay Albay, NY, USA,. davidso@cs.albay.edu Abstract. Learig the parameters (coditioal ad margial

More information

Image Analysis. Segmentation by Fitting a Model

Image Analysis. Segmentation by Fitting a Model Image Aalysis Segmetatio by Fittig a Model Christophoros Nikou cikou@cs.uoi.gr Images take from: D. Forsyth ad J. Poce. Computer Visio: A Moder Approach, Pretice Hall, 2003. Computer Visio course by Svetlaa

More information

Recursive Procedures. How can you model the relationship between consecutive terms of a sequence?

Recursive Procedures. How can you model the relationship between consecutive terms of a sequence? 6. Recursive Procedures I Sectio 6.1, you used fuctio otatio to write a explicit formula to determie the value of ay term i a Sometimes it is easier to calculate oe term i a sequece usig the previous terms.

More information

COMP 558 lecture 6 Sept. 27, 2010

COMP 558 lecture 6 Sept. 27, 2010 Radiometry We have discussed how light travels i straight lies through space. We would like to be able to talk about how bright differet light rays are. Imagie a thi cylidrical tube ad cosider the amout

More information

Optimal Mapped Mesh on the Circle

Optimal Mapped Mesh on the Circle Koferece ANSYS 009 Optimal Mapped Mesh o the Circle doc. Ig. Jaroslav Štigler, Ph.D. Bro Uiversity of Techology, aculty of Mechaical gieerig, ergy Istitut, Abstract: This paper brigs out some ideas ad

More information

The number n of subintervals times the length h of subintervals gives length of interval (b-a).

The number n of subintervals times the length h of subintervals gives length of interval (b-a). Simulator with MadMath Kit: Riema Sums (Teacher s pages) I your kit: 1. GeoGebra file: Ready-to-use projector sized simulator: RiemaSumMM.ggb 2. RiemaSumMM.pdf (this file) ad RiemaSumMMEd.pdf (educator's

More information

Assignment 5; Due Friday, February 10

Assignment 5; Due Friday, February 10 Assigmet 5; Due Friday, February 10 17.9b The set X is just two circles joied at a poit, ad the set X is a grid i the plae, without the iteriors of the small squares. The picture below shows that the iteriors

More information

Analysis of Documents Clustering Using Sampled Agglomerative Technique

Analysis of Documents Clustering Using Sampled Agglomerative Technique Aalysis of Documets Clusterig Usig Sampled Agglomerative Techique Omar H. Karam, Ahmed M. Hamad, ad Sheri M. Moussa Abstract I this paper a clusterig algorithm for documets is proposed that adapts a samplig-based

More information

Prepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order.

Prepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order. Chapter 2 2.1 Descriptive Statistics A stem-and-leaf graph, also called a stemplot, allows for a nice overview of quantitative data without losing information on individual observations. It can be a good

More information

What is the difference between a statistician and a mortician? Nobody's dying to see the statistician! Chapter 8 Interval Estimation

What is the difference between a statistician and a mortician? Nobody's dying to see the statistician! Chapter 8 Interval Estimation Slides Preared by JOHN S. LOUCKS St. Edward s Uiversity Slide 1 What is the differece betwee a statisticia ad a morticia? Nobody's dyig to see the statisticia! Slide Chater 8 Iterval Estimatio Poulatio

More information