Anonymisation of Public Use Data Sets
- Victoria Parks
Transcription
1 Anonymisation of Public Use Data Sets: Methods for Reducing Disclosure Risk and the Analysis of Perturbed Data
Harvey Goldstein (University of Bristol and University College London) and Natalie Shlomo (University of Manchester)
2 The problem and some solutions
Release of large (pseudonymised) datasets for analysis potentially allows statistical attack via searching for records satisfying certain constraints (e.g. age, location, medication).
The standard solution is to degrade data values to make it unlikely that an attacker could correctly identify individuals. This is typically judged using k-anonymity.
Two types of disclosure control methods under the safe data approach:
1. Non-perturbative methods reduce information content
2. Perturbative methods alter the data to increase uncertainty of identification
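As a rough illustration of how k-anonymity is judged, the sketch below counts how many records share each combination of quasi-identifying values and reports the smallest group size as k. The record fields (`age_band`, `region`, `medication`) and the function name `k_anonymity` are hypothetical, chosen only for this example.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return k: the size of the smallest group of records that share
    the same values on all quasi-identifier fields."""
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers) for record in records
    )
    return min(groups.values())


# Hypothetical toy data: age and location already grouped into bands.
records = [
    {"age_band": "30-39", "region": "SW", "medication": "A"},
    {"age_band": "30-39", "region": "SW", "medication": "B"},
    {"age_band": "40-49", "region": "NW", "medication": "A"},
    {"age_band": "40-49", "region": "NW", "medication": "C"},
    {"age_band": "40-49", "region": "NW", "medication": "A"},
]

k_coarse = k_anonymity(records, ["age_band", "region"])
k_fine = k_anonymity(records, ["age_band", "region", "medication"])
print("k on (age_band, region):", k_coarse)
print("k on (age_band, region, medication):", k_fine)
```

Adding more matching variables can only keep k the same or reduce it, which is why an attacker with more constraints is more dangerous.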
3 The problem and some solutions
Non-perturbative methods:
1. Remove cells with small counts if data are in tabular form, preserving margins
2. Delete sensitive variables
3. Group categories, or categorise continuous variables, for disclosive variables such as postcode and age
4. Sub-sample
Perturbative methods:
1. Add random noise to increase uncertainty around correct identification (this includes random misclassification for categorical variables)
2. Micro-aggregation of similar cases (effectively reduces variation)
3. Create synthetic data values while preserving data structure
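To make micro-aggregation concrete, here is a minimal univariate sketch: values are sorted, split into groups of at least k similar cases, and each value is replaced by its group mean. This is one simple variant chosen for illustration, not necessarily the one the authors have in mind; the function name `microaggregate` and the toy ages are assumptions.

```python
from collections import Counter


def microaggregate(values, k):
    """Replace each value by the mean of its group of >= k nearest-ranked values.

    Groups are formed on the sorted values; the last group absorbs any
    remainder, so every group has between k and 2k - 1 members.
    """
    if k < 1 or len(values) < k:
        raise ValueError("need at least k values and k >= 1")
    order = sorted(range(len(values)), key=lambda i: values[i])
    n_groups = len(values) // k
    result = [0.0] * len(values)
    for g in range(n_groups):
        start = g * k
        end = (g + 1) * k if g < n_groups - 1 else len(values)
        members = order[start:end]
        group_mean = sum(values[i] for i in members) / len(members)
        for i in members:
            result[i] = group_mean
    return result


ages = [23, 52, 31, 34, 58, 25, 36]  # hypothetical data
aggregated = microaggregate(ages, k=3)
print(aggregated)
```

The total (and hence the mean) is preserved, but the spread within each group is removed, which is the sense in which micro-aggregation reduces variation.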
4 Effects on statistical analysis: a key concern
1. Cell removal: may over-coarsen data and in particular remove interesting interaction effects
2. Grouping: like (1), may smooth over complex relationships
3. Addition of random noise will lead to incorrect standard errors and also biased coefficients in generalised linear models unless properly adjusted for
4. Synthetic data may lead to severely biased coefficients if analysis models do not include variables used in the synthesis
5 Synthetic data
Synthetic data relies on assumed or modelled data relationships to simulate (impute) new data that approximates the real data. This can be done for all data or a subset.
Producing multiply imputed datasets allows corrections to be made for imputation variance; alternatively, approximations are available.
Few would advocate that such data should be used for a final analysis: rather they can provide an indication for a small set of final models that can then simply be fitted (in a secure environment) to produce the required model estimates.
6 Synthetic data
This poses particular problems:
1. There is a strong reliance on producing the right structure, typically via a series of conditional models.
2. Even using synthetic data in exploratory mode can lead users astray, where their models, based upon an approximation to the true structure, become biased and lead to the selection of inappropriate final models to be estimated using the real data.
7 Adding random noise in general
Adding random noise is less extreme than synthetic data.
We suppose that the attacker has available a set of $q$ values for the variables to be used, say $y$, that she intends to match against records in the data set.
We propose to construct a new set of variables, $z$, which is what the attacker will see:
$$z = y + m,$$
where $m$ has a predefined (normal) distribution. Other distributions are possible, e.g. differential privacy techniques often use a double exponential distribution.
For simplicity, assume independence across the variables to be disturbed; alternatively we might consider correlated noise (correlated with the true values) to preserve the correlation structure and sufficient statistics.
Note that $y$ can be continuous or discrete (categories numbered $1, \ldots, p$).
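A minimal sketch of the perturbation step $z = y + m$ for continuous variables, with independent noise per variable. The function name `perturb` and its parameters are assumptions for illustration. The double exponential (Laplace) option is scaled so that its variance matches the normal case, using the fact that a Laplace distribution with scale $b$ has variance $2b^2$.

```python
import math
import random


def perturb(y, sd_m, rng, dist="normal"):
    """Return z = y + m with independent zero-mean noise of standard deviation sd_m.

    dist="normal"  : m ~ N(0, sd_m^2)
    dist="laplace" : m ~ double exponential with scale b = sd_m / sqrt(2),
                     generated as the difference of two exponential draws.
    """
    if sd_m <= 0:
        raise ValueError("sd_m must be positive")
    if dist == "normal":
        return [v + rng.gauss(0.0, sd_m) for v in y]
    if dist == "laplace":
        b = sd_m / math.sqrt(2.0)
        return [v + rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b) for v in y]
    raise ValueError("dist must be 'normal' or 'laplace'")


rng = random.Random(2024)
y = [1.70, 1.82, 1.65, 1.91]           # hypothetical true values
z = perturb(y, sd_m=math.sqrt(0.1), rng=rng)
print(z)
```

Discrete variables would need a misclassification mechanism instead of additive noise; that case is not covered by this sketch.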
8 Adding random noise in general
The value of the variance $\sigma_m^2$ will determine the strength of the resistance to attack and can be a function of the true variability of each variable.
We now form a measure of the distance between the attacker's $y$ and each $z$ and then rank these distances. A general distance measure can be written in the form
$$D_i = (z_i - y)^T W (z_i - y),$$
where, for example, $W = \Omega^{-1}$.
But, more simply, we can choose the Euclidean distance, for the correct record and for each comparison record respectively:
$$D = \sum_{j=1}^{q} (z_j - y_j)^2, \qquad D_i = \sum_{j=1}^{q} (z_{ij} - y_j)^2, \quad i = 1, \ldots, n.$$
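9 Ranking the distances
A rational attacker chooses the closest record(s) to their own as the correct one(s).
Form $R = \mathrm{Rank}(D)$ and $R_i = \mathrm{Rank}(D_i)$, and identify the value of $i$ for which $R_i = 1$, i.e. the closest record.
Define $h = R - 1$. Thus if $h = 0$ we have the correct match; otherwise, if the attacker chooses the closest record, $h$ is the number of records ranked ahead of the correct one. $h$ measures the difference in rank between the chosen and the correct record.
So choose the added noise large enough that, say, $\Pr(h < p) < \varepsilon$ (say, $p = 3$, $\varepsilon = 0.1$).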
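The ranking step can be sketched as follows, assuming squared Euclidean distance and taking $h$ as the number of perturbed records strictly closer to the attacker's values than the correct record is (so $h = R - 1$). The function names and the toy vectors are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def h_index(y, z_records, target):
    """Rank of the correct record minus one.

    y         : the attacker's values for the person sought
    z_records : the perturbed records the attacker can see
    target    : position in z_records of the record that truly belongs to y
    """
    distances = [squared_distance(z, y) for z in z_records]
    d_correct = distances[target]
    # Records strictly closer than the correct one are ranked ahead of it.
    return sum(1 for d in distances if d < d_correct)


y = (0.0, 0.0)
z_records = [(0.5, 0.5), (0.1, 0.0), (2.0, 2.0)]

h_if_first_is_correct = h_index(y, z_records, target=0)
h_if_second_is_correct = h_index(y, z_records, target=1)
print(h_if_first_is_correct, h_if_second_is_correct)
```

In the first case one other record lies closer than the correct one, so an attacker who picks the nearest record gets the wrong person; in the second case the nearest record is the correct match.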
10 A simulation
Generate $10^3$ records with 5 normal variables and $\sigma_m^2 = 0.1$. All variances are 1 and covariances [value missing in transcript].
For each true value record (the attacker's $y$) generate $D$, $D_i$.
The following table gives some estimates of disclosiveness in terms of $h$ for a range of individuals at different distances from the median.
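A simulation of this kind might be sketched as below. It is not the authors' code: the common covariance `rho=0.5` is an assumption (the value is not given here), only a random subset of records is used as attack targets to keep the pure-Python run short, and the function names are hypothetical. It estimates $\Pr(h < p)$ across targets rather than by distance from the median.

```python
import math
import random


def equicorrelated_record(q, rho, rng):
    """Draw q standard normal variables with common covariance rho (0 <= rho < 1)."""
    shared = rng.gauss(0.0, 1.0)
    return [
        math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
        for _ in range(q)
    ]


def simulate_pr_h_below(p, n=1000, q=5, rho=0.5, sigma2_m=0.1, n_targets=200, seed=1):
    """Estimate Pr(h < p) when independent N(0, sigma2_m) noise is added to every value."""
    rng = random.Random(seed)
    sd_m = math.sqrt(sigma2_m)
    y = [equicorrelated_record(q, rho, rng) for _ in range(n)]       # true records
    z = [[v + rng.gauss(0.0, sd_m) for v in rec] for rec in y]       # released records
    targets = rng.sample(range(n), n_targets)
    below = 0
    for t in targets:
        # Distance from the attacker's true values y[t] to every released record.
        d = [sum((zj - yj) ** 2 for zj, yj in zip(rec, y[t])) for rec in z]
        h = sum(1 for dk in d if dk < d[t])                          # rank of correct record minus 1
        if h < p:
            below += 1
    return below / n_targets


estimate = simulate_pr_h_below(p=3)
print("Estimated Pr(h < 3):", estimate)
```

Rerunning with a larger `sigma2_m` should lower the estimate, which is the tuning knob described in the slides.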
11 Distribution for h
[Figure: h plotted against the cumulative percentile of the D distribution]
12 More results
Lowest decile: $\Pr(h > 5)$ for combinations of $\Omega$ and $\sigma_m^2$ are shown, where $\Omega$ always has unit diagonal elements and equal off-diagonal elements (given by columns).
[Table: rows by sample size and $\sigma_m^2$; values not preserved]
We see that the procedure is readily tuned simply by changing the variance of the noise.
We are also studying the possibility of a more sophisticated attack that uses values of $y$ predicted from the perturbed dataset rather than the $z$ themselves.
13 The h-index and k-anonymisation
If we have, say, 2-anonymity this implies that an attacker is able to identify two individual records matching her own information, so choosing either of them at random means that there is a probability of 0.5 that it is the correct one.
The h-index, however, only yields a single individual as the closest, for example with a probability of about 0.5, and thus provides less information to the attacker than in the case of 2-anonymity.
14 The h-index and k-anonymisation II
For k-anonymity an attacker may be quite content that they can access 2 or perhaps even 5 records containing the one that is sought.
By contrast, with the h-index procedure, in our most favourable case, the probability of the sought-for individual being one of the two nearest is just over 60% and one of the five nearest just under 80%.
Thus it could be argued that this is sufficient to deter an attacker and hence suitable in terms of disclosiveness.
In practice careful attention needs to be paid to the amount of noise required to satisfy disclosure concerns.
15 How to remove the noise
Assume noise $\eta \sim \mathrm{iid}(0, \sigma_\eta^2)$ is added to a continuous variable.
We get unbiased totals and means but larger variance, and biases where predictors incorporate noise.
How to make correct inferences in a general modelling framework?
Assume a simple regression model with a dependent variable $y$ that has been subjected to Gaussian additive noise $\eta$ with a mean of 0 and a positive variance $\sigma_\eta^2$. The predictor variable $x$ is assumed to be error free.
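16 How to remove the noise
The model is:
$$y_i^{*} = y_i + \eta_i = \alpha + \beta x_i + \varepsilon_i + \eta_i, \quad i = 1, \ldots, n,$$
where $y_i$ denotes the true but unobserved value of the dependent variable and $y_i^{*}$ the released value.
If we regress $y^{*}$ on $x$ then
$$\hat{\beta} = \frac{\mathrm{Cov}(y^{*}, x)}{\mathrm{Var}(x)} = \frac{\mathrm{Cov}(y + \eta, x)}{\mathrm{Var}(x)} = \frac{\mathrm{Cov}(y, x) + \mathrm{Cov}(\eta, x)}{\mathrm{Var}(x)} = \frac{\mathrm{Cov}(y, x)}{\mathrm{Var}(x)},$$
since $\mathrm{Cov}(\eta, x) = 0$.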
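17 How to remove the noise
Additive noise on the dependent variable thus does not bias the slope coefficient but increases standard errors due to the increase in variance:
$$\mathrm{Var}(y^{*}) = \mathrm{Var}(y) + \mathrm{Var}(\eta).$$
Now add noise $\eta$ to the predictor variable. The model is now:
$$y_i = \alpha + \beta x_i + \varepsilon_i, \qquad x_i^{*} = x_i + \eta_i, \quad i = 1, \ldots, n,$$
where $x_i$ denotes the true but unobserved value of the predictor and $x_i^{*}$ the released value.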
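18 How to remove the noise
If we regress $y$ on $x^{*}$ then for the least squares slope coefficient:
$$\hat{\beta} = \frac{\mathrm{Cov}(y, x^{*})}{\mathrm{Var}(x^{*})} = \frac{\mathrm{Cov}(y, x + \eta)}{\mathrm{Var}(x) + \mathrm{Var}(\eta)} = \frac{\mathrm{Cov}(y, x) + \mathrm{Cov}(y, \eta)}{\mathrm{Var}(x) + \mathrm{Var}(\eta)} = \frac{\mathrm{Cov}(y, x)}{\mathrm{Var}(x) + \mathrm{Var}(\eta)},$$
since $\mathrm{Cov}(y, \eta) = 0$.
Additive noise on the predictor variable biases the slope coefficient downwards (attenuation).
Thus we need suitable methodology to deal with these measurement errors.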
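The two results (no bias from noise on the dependent variable, attenuation from noise on the predictor) can be checked numerically. The sketch below uses assumed values $\alpha = 1$, $\beta = 2$, $\sigma_x^2 = 1$ and $\sigma_\eta^2 = 0.5$, so the slope on the noisy predictor should settle near $2 / (1 + 0.5) \approx 1.33$.

```python
import math
import random


def ols_slope(x, y):
    """Least squares slope of y on x: Cov(y, x) / Var(x)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


rng = random.Random(42)
n = 20000
alpha, beta = 1.0, 2.0          # assumed true coefficients
sd_eta = math.sqrt(0.5)         # assumed noise variance of 0.5

x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [alpha + beta * xi + rng.gauss(0.0, 1.0) for xi in x]
y_star = [yi + rng.gauss(0.0, sd_eta) for yi in y]   # noise on dependent variable
x_star = [xi + rng.gauss(0.0, sd_eta) for xi in x]   # noise on predictor

slope_clean = ols_slope(x, y)
slope_noisy_y = ols_slope(x, y_star)
slope_noisy_x = ols_slope(x_star, y)

print("no noise:        ", round(slope_clean, 3))
print("noise on y:      ", round(slope_noisy_y, 3))
print("noise on x:      ", round(slope_noisy_x, 3))
```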
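19 How to remove the noise
For the least squares slope coefficient in a simple linear regression:
$$\hat{\beta} \xrightarrow{p} \frac{\mathrm{Cov}(y, x)}{\mathrm{Var}(x) + \mathrm{Var}(\eta)} = \frac{\beta \sigma_x^2}{\sigma_x^2 + \sigma_\eta^2} = \frac{\beta}{1 + \sigma_\eta^2 / \sigma_x^2}.$$
We define
$$\lambda = \frac{1}{1 + \sigma_\eta^2 / \sigma_x^2} < 1$$
as the reliability ratio.
A consistent estimate of the slope coefficient is obtained by dividing the least squares estimate by $\lambda$.
To calculate $\lambda$ we assume that $\sigma_\eta^2$ is released and known to the researcher.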
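A minimal sketch of the correction, assuming $\sigma_\eta^2$ has been released. Since $\sigma_x^2$ is not observed, the sketch estimates it by $\mathrm{Var}(x^{*}) - \sigma_\eta^2$, which is a method-of-moments assumption added here rather than something stated in the slides; the function name and simulated values are likewise hypothetical.

```python
import math
import random


def corrected_slope(x_star, y, sigma2_eta):
    """Divide the least squares slope of y on x_star by the estimated reliability ratio.

    lambda = sigma_x^2 / (sigma_x^2 + sigma_eta^2), with sigma_x^2 estimated
    as Var(x_star) - sigma2_eta.
    Returns (corrected slope, estimated lambda).
    """
    n = len(x_star)
    mean_x = sum(x_star) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x_star, y))
    s_xx = sum((a - mean_x) ** 2 for a in x_star)
    var_x_star = s_xx / (n - 1)
    lam = (var_x_star - sigma2_eta) / var_x_star
    if lam <= 0:
        raise ValueError("noise variance is not smaller than Var(x_star)")
    return (s_xy / s_xx) / lam, lam


rng = random.Random(7)
n = 20000
sigma2_eta = 0.5                                     # assumed released noise variance
x = [rng.gauss(0.0, 1.0) for _ in range(n)]          # true predictor, never released
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]
x_star = [xi + rng.gauss(0.0, math.sqrt(sigma2_eta)) for xi in x]

beta_corrected, lam_hat = corrected_slope(x_star, y, sigma2_eta)
print("estimated lambda:", round(lam_hat, 3))
print("corrected slope: ", round(beta_corrected, 3))
```

Standard errors also need adjusting after this correction; the sketch covers the point estimate only.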
20 How to remove the noise in general
Noise is random with known properties, so a measurement error model is required.
This requires that the parameters used to generate the noise are known to the researchers.
Current work (using a CLOSER grant at Bristol) is underway to develop software to show how the noise should be generated in such a way that the parameters can be released under a predetermined h-index to protect against attribute disclosure whilst preserving utility.
21 How to remove the noise
In simple linear regression, correlated noise can be added which produces unbiased estimates of slope coefficients using standard regression techniques.
Current work at Bristol is developing algorithms incorporating measurement error models that will handle generalized linear models and multilevel data of different types.
Specialisation to anonymisation with user software is currently funded through ESRC (via CLOSER) at Bristol (Boyd, Goldstein and Burton).
Can be combined with handling missing data values.
Some loss of statistical efficiency, but enables the underlying signal to be extracted and thus provides unbiased parameter estimates.
22 Further thoughts
Often, a data attacker will have no pre-existing individual data and may trawl the dataset to discover an interesting record, for example an individual with an unusual combination of values. They may then attempt to identify the real person using other variables in the data record. Our procedure is also relevant to such an attack so long as the noise has been applied to the variables in question.
How to tune the noise, and differential noise related to identifiability of variables, is an area for further research. For example we might wish to add relatively more noise to a variable such as height than hair colour.
Now, it may well be the case that, conditional on the data available to the attacker, a variable such as income can be predicted with sufficient accuracy within this dataset, and if the data structure is well approximated either by removing noise or via synthesis then income could be fairly accurately predicted and this may be sufficient for an attacker's purpose. Needs further consideration.
Provision of suitable analysis tools and training for data analysts is important: discussions are underway with Government departments and agencies through ADRN.
23 Thank you for your attention
More informationLearning the Kernel Parameters in Kernel Minimum Distance Classifier
Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department
More informationNUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS
ARPN Journal of Engneerng and Appled Scences 006-017 Asan Research Publshng Network (ARPN). All rghts reserved. NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS Igor Grgoryev, Svetlana
More information5.0 Quality Assurance
5.0 Dr. Fred Omega Garces Analytcal Chemstry 25 Natural Scence, Mramar College Bascs of s what we do to get the rght answer for our purpose QA s planned and refers to planned and systematc producton processes
More informationA Binarization Algorithm specialized on Document Images and Photos
A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a
More informationREFRACTIVE INDEX SELECTION FOR POWDER MIXTURES
REFRACTIVE INDEX SELECTION FOR POWDER MIXTURES Laser dffracton s one of the most wdely used methods for partcle sze analyss of mcron and submcron sze powders and dspersons. It s quck and easy and provdes
More informationA Similarity-Based Prognostics Approach for Remaining Useful Life Estimation of Engineered Systems
2008 INTERNATIONAL CONFERENCE ON PROGNOSTICS AND HEALTH MANAGEMENT A Smlarty-Based Prognostcs Approach for Remanng Useful Lfe Estmaton of Engneered Systems Tany Wang, Janbo Yu, Davd Segel, and Jay Lee
More informationLOOP ANALYSIS. The second systematic technique to determine all currents and voltages in a circuit
LOOP ANALYSS The second systematic technique to determine all currents and voltages in a circuit T S DUAL TO NODE ANALYSS - T FRST DETERMNES ALL CURRENTS N A CRCUT AND THEN T USES OHM S LAW TO COMPUTE
More informationSolutions to Programming Assignment Five Interpolation and Numerical Differentiation
College of Engneerng and Coputer Scence Mechancal Engneerng Departent Mechancal Engneerng 309 Nuercal Analyss of Engneerng Systes Sprng 04 Nuber: 537 Instructor: Larry Caretto Solutons to Prograng Assgnent
More informationUsing Auxiliary Data for Adjustment In Longitudinal Research. Dirk Sikkel Joop Hox Edith de Leeuw
Usng Auxlary Data for Adjustment In Longtudnal Research Drk Skkel Joop Hox Edth de Leeuw 1. Longtudnal Research The use of longtudnal research desgns, such as cohort or panel studes, has become more and
More informationThe Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique
//00 :0 AM Outlne and Readng The Greedy Method The Greedy Method Technque (secton.) Fractonal Knapsack Problem (secton..) Task Schedulng (secton..) Mnmum Spannng Trees (secton.) Change Money Problem Greedy
More informationGraph-based Clustering
Graphbased Clusterng Transform the data nto a graph representaton ertces are the data ponts to be clustered Edges are eghted based on smlarty beteen data ponts Graph parttonng Þ Each connected component
More informationPrincipal Component Inversion
Prncpal Component Inverson Dr. A. Neumann, H. Krawczyk German Aerospace Centre DLR Remote Sensng Technology Insttute Marne Remote Sensng Prncpal Components - Propertes The Lnear Inverson Algorthm Optmsaton
More informationSupport Vector Machines for Business Applications
Support Vector Machnes for Busness Applcatons Bran C. Lovell and Chrstan J Walder The Unversty of Queensland and Max Planck Insttute, Tübngen {lovell, walder}@tee.uq.edu.au Introducton Recent years have
More informationA Clustering Algorithm for Chinese Adjectives and Nouns 1
Clusterng lgorthm for Chnese dectves and ouns Yang Wen, Chunfa Yuan, Changnng Huang 2 State Key aboratory of Intellgent Technology and System Deptartment of Computer Scence & Technology, Tsnghua Unversty,
More informationFinite Population Small Area Interval Estimation
Journal of Offcal Statstcs, Vol. 23, No. 2, 2007, pp. 223 237 Fnte Populaton Small Area Interval Estmaton L-Chun Zhang 1 Small area nterval estmaton s consdered for a fnte populaton, where the small area
More informationProblem Definitions and Evaluation Criteria for Computational Expensive Optimization
Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty
More informationOutlier Detection based on Robust Parameter Estimates
Outler Detecton based on Robust Parameter Estmates Nor Azlda Aleng 1, Ny Ny Nang, Norzan Mohamed 3 and Kasyp Mokhtar 4 1,3 School of Informatcs and Appled Mathematcs, Unverst Malaysa Terengganu, 1030 Kuala
More information