Missing Data Toolbox for Air Quality Datasets


EnviroInfo 2002 (Wien): Environmental Communication in the Information Society - Proceedings of the 16th Conference

Mikko Kolehmainen 1, Heikki Junninen 1,4, Harri Niska 1, Toni Patama 1, Anna Ruuskanen 2, Kari Tuppurainen 3 and Juhani Ruuskanen 1

Departments of Environmental Sciences 1, Applied Physics 2 and Chemistry 3, University of Kuopio, P.O.Box 1627, FIN Kuopio, Finland
Institute for Environment and Sustainability 4, EC Joint Research Centre, I-21020, Ispra (VA), Italy

Abstract

The objective of the study was to find a useful missing data imputing method for air quality forecasting applications. The univariate methods studied were linear, spline and nearest neighbour interpolation. The multivariate methods studied were multivariate nearest neighbour (NN), Self-Organising Map (SOM) and Multi-Layer Perceptron (MLP). Additionally, a new approach was developed in which univariate methods were combined with multivariate methods in order to utilise the best properties of both approaches. The results in general showed that the best overall performance can be achieved by combining univariate and multivariate methods, and that the way of combining depends on the variable inspected. Based on these results, a Missing Data Toolbox (MDT) with a Graphical User Interface (GUI) was created in the Matlab environment. The MDT encapsulates the different algorithms and enables the treatment of missing data in a coherent way. The MDT and GUI were tested in Windows and Linux environments.

1. General

Effective use of environmental databases often requires that the data be uninterrupted by missing values. In many cases, however, the recorded time series have discontinuities due to insufficient sampling, measurement errors or faulty data acquisition and storage. The methods used for processing the data usually require the time series to be uninterrupted. A commonly suggested remedy is to fill in the missing values with the mean value or some other statistical parameter. This, however, corrupts the time-series properties of the data and is usually less accurate than other methods.

If the continuity of the time series is not strictly required by the algorithm, which is often the case with multivariate algorithms, a common procedure is to reject data lines having missing values and to continue with less data. In that case one must bear in mind that the information in the rejected lines is lost. Therefore, it would be advantageous to be able to fill in the missing values using the best of the information available.

2. Imputing methods

2.1. Univariate methods

The univariate methods studied were the linear (LI), spline (SPL) and univariate nearest neighbour (UNN) interpolations. Linear interpolation fits a straight line between the endpoints of a gap formed by missing values; the missing values are then calculated directly from the line equation. Spline interpolation is based on piecewise polynomials of different degrees. Univariate nearest neighbour is probably the simplest imputation scheme: the endpoints of a gap are used as estimates for all missing values within it.

2.2. Multivariate methods

The multivariate methods studied were multivariate nearest neighbour (NN), Self-Organising Map (SOM) and Multi-Layer Perceptron (MLP). The NN imputation handles a row of N variables as a co-ordinate in an N-dimensional space and takes the missing values from the nearest neighbour (row) in that space where they are available, at the same time weighting the distances proportionally to the number of missing values in each row.

The goal of the SOM is to find prototype vectors that represent the input data set and at the same time realise a continuous mapping from the input space to a lattice (Kohonen, 1997). The missing values can then be recovered using the lattice. Like the NN, the SOM uses the whole data set, so the information in incomplete rows is also utilised. During the training of the SOM, missing elements are ignored and only the values available in a vector (data row) are used for updating the weights of the map.

The idea of the MLP network is to learn a black-box-like mapping from input variables to one or several output variables.
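As a rough illustration of the two simplest univariate schemes described above (LI and UNN), the following Python sketch fills interior gaps in a single series. It is not the MDT code: the function names are hypothetical, missing values are assumed to be encoded as NaN, gaps touching either end of the series are left unfilled, and ties in the nearest-neighbour case are assumed to go to the left endpoint.

```python
import math

def _gaps(series):
    """Yield (start, end) index pairs of consecutive NaN runs, end exclusive."""
    i, n = 0, len(series)
    while i < n:
        if math.isnan(series[i]):
            j = i
            while j < n and math.isnan(series[j]):
                j += 1
            yield i, j
            i = j
        else:
            i += 1

def fill_linear(series):
    """Linear interpolation (LI): straight line between the gap endpoints."""
    out = list(series)
    for start, end in _gaps(series):
        if start == 0 or end == len(series):
            continue  # an endpoint is missing on one side: leave the gap as NaN
        left, right = series[start - 1], series[end]
        span = end - (start - 1)  # distance between the two known endpoints
        for k in range(start, end):
            out[k] = left + (right - left) * (k - (start - 1)) / span
    return out

def fill_nearest(series):
    """Univariate nearest neighbour (UNN): copy the closer gap endpoint."""
    out = list(series)
    for start, end in _gaps(series):
        if start == 0 or end == len(series):
            continue
        left, right = series[start - 1], series[end]
        for k in range(start, end):
            # Assumption: a tie is resolved in favour of the left endpoint.
            out[k] = left if (k - (start - 1)) <= (end - k) else right
    return out

nan = math.nan
print(fill_linear([1.0, nan, nan, nan, 5.0]))
print(fill_nearest([1.0, nan, nan, nan, 5.0]))
```

The UNN variant only ever copies values that already exist in the series, whereas LI generates new intermediate values.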
In this study the MLP was used as a combination of several networks: a separate network was trained for each missing data pattern. For a more thorough description of the application of the SOM and MLP in air quality forecasting, see Gardner and Dorling (1998) and Kolehmainen et al. (2001).

2.3. Hybrid methods

In our study, a new approach was developed in which univariate methods are combined with multivariate methods in order to utilise the best properties of both approaches. In this approach, short gaps are filled with univariate interpolation, which gives the best performance in that situation, and longer gaps are filled with an advanced multivariate method, which has no direct dependence on the gap length in time. What counts as a short gap depends on the variable under study.

2.4. Measuring the goodness of the imputation

To measure the degree to which the imputation is error free, the index of agreement (Willmott, 1982) was employed in this work:

d = 1 - \frac{\sum_{i=1}^{n} (P_i - O_i)^2}{\sum_{i=1}^{n} \left( |P_i - O_{ave}| + |O_i - O_{ave}| \right)^2}    (1)

where n is the number of imputations, P_i the predicted value, O_i the observed value and O_{ave} the average of the observed values.

2.5. Execution of the tests

The datasets used for the tests were extracted from the APPETISE (Air Pollution Episodes: Modelling Tools for Improved Smog Management) database. The locations and years used were Cambridge 1996 and Belfast 1996, because they contain only a small proportion of missing values (6.3 and 2.2 percent, respectively). The Cambridge datasets consisted of hourly NOx, NO2, O3 and CO concentrations together with seven meteorological parameters: wind speed, wind direction, temperature, relative humidity, precipitation, solar radiation and net radiation. Correspondingly, the Belfast datasets comprised NOx, NO2, O3, PM10, SO2 and CO concentrations together with nine meteorological parameters: wind direction, wind speed, cloud base height, visibility, mean sea level pressure, cloud base level, temperature, dew point and wet bulb. These datasets were first imputed with the Nearest Neighbour interpolation (Dixon, 1979), which is known to be a safe method because it does not introduce new values into the data.
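The index of agreement defined above can be computed in a few lines. The sketch below assumes the standard Willmott form, d = 1 - sum((P_i - O_i)^2) / sum((|P_i - O_ave| + |O_i - O_ave|)^2); the function name and the error handling are illustrative choices, not part of the MDT.

```python
def index_of_agreement(predicted, observed):
    """Willmott's index of agreement d between imputed and true values.

    d = 1 means perfect agreement; smaller values mean poorer agreement.
    """
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    o_ave = sum(observed) / len(observed)
    # Numerator: squared errors between predictions and observations.
    num = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    # Denominator: potential error, measured around the observed mean.
    den = sum((abs(p - o_ave) + abs(o - o_ave)) ** 2
              for p, o in zip(predicted, observed))
    if den == 0:
        raise ValueError("index is undefined when all values equal the mean")
    return 1.0 - num / den

observed = [1.0, 2.0, 3.0]
print(index_of_agreement([1.0, 2.0, 3.0], observed))  # perfect imputation
print(index_of_agreement([2.0, 2.0, 2.0], observed))  # mean substitution
```

In this toy case, replacing every value with the observed mean yields d = 0, which is consistent with the earlier remark that mean substitution is a poor imputation strategy.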
Subsequently, missing data patterns (continuous sequences of the multivariate dataset having values missing in at least one of the variables) were extracted from the Cambridge 1995, Cambridge 1997, Belfast 1994, Belfast 1995 and Belfast 1998 datasets. The patterns of each year were mixed randomly in order to construct seven different sets. These sets were then applied to the datasets of the same location in order to produce the final test sets with known gaps. All the test datasets (14 for Cambridge and 21 for Belfast) were imputed separately with the methods described. The results of the tests were then combined into tables of statistics giving the index-of-agreement values for them.

3. Results of the comparison

3.1. Univariate methods

The results of comparing the univariate methods are given in Table 1. One can conclude that the LI is slightly better than the UNN and that both are considerably better than the SPL method. Thus, the LI was selected as the univariate method for further evaluation in combination with the multivariate methods. The results also showed that the performance of a univariate method depends both on the length of the gap in time and on the variable under study. Therefore, the length of gaps that can be replaced with the LI should be estimated separately for each variable before the replacement.

Table 1. Comparison of univariate methods with the index-of-agreement. The values were calculated using all of the variables described in chapter 2.5. LI = Linear Interpolation, UNN = Univariate Nearest Neighbour and SPL = SPLine.

Method  Mean  Min  Max  Std
LI
UNN
SPL

3.2. Multivariate methods

The results showed that both the SOM and the MLP perform slightly better than the multivariate NN method. The advantage of the SOM compared with the other methods is that it is less dependent on the actual location of the missing values. However, if computational speed is the priority, the multivariate NN is recommended. It also has the advantage of not generating completely new values into the data. As with the univariate methods, the differences in performance between variables were large.

Table 2. Comparison of multivariate methods with the index-of-agreement. The values were calculated using all of the variables described in chapter 2.5. NN = Nearest Neighbour, SOM = Self-Organizing Map and MLP = Multi-Layer Perceptron.

Method  Mean  Min  Max  Std
NN
SOM
MLP
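To make the multivariate NN imputation discussed above more concrete, here is a minimal Python sketch. It is an assumption-laden illustration rather than the MDT implementation: the paper does not give the exact distance weighting, so the sketch rescales the squared distance by the ratio of all variables to shared observed variables; the function name `nn_impute` is hypothetical; and the variables are assumed to have been scaled to comparable ranges beforehand.

```python
import math

def nn_impute(rows):
    """Fill each NaN from the nearest row in which that variable is observed."""
    n_vars = len(rows[0])
    out = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j in range(n_vars):
            if not math.isnan(row[j]):
                continue  # value is present, nothing to impute
            best_value, best_dist = None, math.inf
            for k, other in enumerate(rows):
                if k == i or math.isnan(other[j]):
                    continue  # a donor must have the wanted variable
                shared = [(a - b) ** 2 for a, b in zip(row, other)
                          if not math.isnan(a) and not math.isnan(b)]
                if not shared:
                    continue  # no common observed variables to compare on
                # Assumption: rescale so that distances computed over
                # different numbers of shared variables are comparable.
                dist = math.sqrt(sum(shared) * n_vars / len(shared))
                if dist < best_dist:
                    best_value, best_dist = other[j], dist
            if best_value is not None:
                out[i][j] = best_value
    return out

nan = math.nan
data = [[1.0, 10.0], [2.0, 20.0], [1.1, nan]]
print(nn_impute(data))
```

Because every imputed value is copied from an existing row, this scheme never introduces values that were not already present in the data, which matches the property noted in the text.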

3.3. Hybrid methods

The results of comparing the hybrid methods are shown in Table 3. There is no large difference between the methods in the mean value of the index-of-agreement, nor in the minimum and maximum values obtained. Therefore, the conclusions given for the multivariate methods (3.2) also hold here.

Table 3. Comparison of hybrid methods with the index-of-agreement. The values were calculated using all of the variables described in chapter 2.5. HNN = Hybrid NN, HSOM = Hybrid SOM and HMLP = Hybrid MLP.

Method  Mean  Min  Max  Std
HNN
HSOM
HMLP

3.4. Summary of the results

The methods evaluated are summarised in Fig. 1. The results in general showed that the best overall performance can be achieved by combining univariate and multivariate methods, and that the way of combining depends on the variable inspected. The results have been reported in more detail in Junninen et al. (2002).

Fig. 1: Summary comparison of the methods evaluated in the study (index-of-agreement per method). The whiskers give the min and max values for the method described by the corresponding bar. LI = Linear Interpolation, UNN = Univariate Nearest Neighbour, SPL = SPLine, NN = Nearest Neighbour, SOM = Self-Organizing Map, MLP = Multi-Layer Perceptron, HNN = Hybrid NN, HSOM = Hybrid SOM and HMLP = Hybrid MLP.
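The hybrid idea summarised above, short gaps filled by linear interpolation and longer gaps by a multivariate method, can be sketched as follows. This is only an illustration under stated assumptions: `hybrid_fill`, `multivariate_estimates` and `max_linear_gap` are hypothetical names, the multivariate estimates are assumed to have been produced beforehand by any of the multivariate methods (NN, SOM or MLP), and the gap-length threshold is assumed to be chosen separately for each variable, as the results recommend.

```python
import math

def hybrid_fill(series, multivariate_estimates, max_linear_gap):
    """Fill short interior gaps linearly, all other gaps from a multivariate model.

    series: values of one variable, NaN where missing.
    multivariate_estimates: per-time-step estimates from a multivariate method.
    max_linear_gap: longest gap (in samples) still filled by interpolation.
    """
    out = list(series)
    n = len(series)
    i = 0
    while i < n:
        if not math.isnan(series[i]):
            i += 1
            continue
        j = i
        while j < n and math.isnan(series[j]):
            j += 1  # j ends one past the last missing sample of this gap
        gap_len = j - i
        has_endpoints = i > 0 and j < n
        if gap_len <= max_linear_gap and has_endpoints:
            left, right = series[i - 1], series[j]
            for k in range(i, j):
                out[k] = left + (right - left) * (k - i + 1) / (gap_len + 1)
        else:
            # Long gap, or gap at the edge of the series: use the
            # multivariate estimate, which does not depend on gap length.
            for k in range(i, j):
                out[k] = multivariate_estimates[k]
        i = j
    return out

nan = math.nan
series = [1.0, nan, 3.0, nan, nan, nan, 7.0]
estimates = [9.0] * len(series)  # stand-in for NN/SOM/MLP output
print(hybrid_fill(series, estimates, max_linear_gap=1))
```

With `max_linear_gap=1` only the single-sample gap is interpolated; raising the threshold to 3 would interpolate both gaps, which shows how the per-variable threshold controls the split between the two method families.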

4. Description of the MDT

Based on these results, a toolbox was created in the Matlab environment which encapsulates the different algorithms and enables the treatment of missing data in a coherent way. The toolbox was further enhanced with a graphical user interface (GUI). The GUI provides functions for loading and saving datasets in different file formats (Fig. 2), calculating different statistics for missing data and illustrating missing data conditions with graphs (Fig. 3 and Fig. 5). Furthermore, end users can choose the replacement method from various linear and non-linear methods such as interpolation, nearest neighbour, SOM and MLP (Fig. 4). Combinations of univariate and multivariate methods are available as hybrid methods, too.

5. Dissemination of the MDT

The MDT and GUI were tested in Windows and Linux environments. The package was found to be usable in both, but there are some restrictions on the Matlab version that can be used with it. A GUI, rather than an Internet server service, was judged the best way to bring the MDT to end users. The MDT for Matlab can be downloaded from the web; its usage is limited to academic purposes only.

Currently, work is underway to construct a standalone MS Windows application with a GUI. However, it is likely to include some restrictions regarding the methods available (most notably the MLP). This is because current versions of the Matlab C code generators cannot translate Matlab code containing object definitions, which is the case with the Neural Network Toolbox. The dissemination of the standalone software has not been decided yet, but it is likely to be commercial, at least for non-academic purposes.

Bibliography

Dixon, J.K. (1979): Pattern Recognition with Partly Missing Data, IEEE Transactions on Systems, Man, and Cybernetics, 10 (SMC-9).
Gardner, M.W., Dorling, S.R. (1998): Artificial neural networks (the multilayer perceptron): a review of applications in the atmospheric sciences, Atmospheric Environment, 32.
Junninen, H. et al. (2002): The Performance of Different Imputation Methods for Air Quality Data with Missing Values, submitted to Atmospheric Environment.
Kohonen, T. (1997): Self-Organizing Maps, Springer, Berlin.
Kolehmainen, M. et al. (2001): Neural networks and periodic components used in air quality forecasting, Atmospheric Environment, 35.
Willmott, C.J. et al. (1985): Statistics for the evaluation and comparison of models, J. Geophys. Res., 90 (C5).

Fig. 2: Main interface of the MDT.
Fig. 3: Plot interface of the MDT.
Fig. 4: Input method selection of the MDT.
Fig. 5: Validation interface of the MDT.


3 Ways to Improve Your Regression 3 Ways to Improve Your Regression Introduction This tutorial will take you through the steps demonstrated in the 3 Ways to Improve Your Regression webinar. First, you will be introduced to a dataset about

More information

Ship Energy Systems Modelling: a Gray-Box approach

Ship Energy Systems Modelling: a Gray-Box approach MOSES Workshop: Modelling and Optimization of Ship Energy Systems Ship Energy Systems Modelling: a Gray-Box approach 25 October 2017 Dr Andrea Coraddu andrea.coraddu@strath.ac.uk 30/10/2017 Modelling &

More information

Integration of Sentry Visibility Sensor into Campbell Scientific Data Logger CR1000 *

Integration of Sentry Visibility Sensor into Campbell Scientific Data Logger CR1000 * Available online at www.sciencedirect.com Procedia Environmental Sciences 12 (2012 ) 1137 1143 2011 International Conference on Environmental Science and Engineering (ICESE 2011) Integration of Sentry

More information

This is a repository copy of A Rule Chaining Architecture Using a Correlation Matrix Memory.

This is a repository copy of A Rule Chaining Architecture Using a Correlation Matrix Memory. This is a repository copy of A Rule Chaining Architecture Using a Correlation Matrix Memory. White Rose Research Online URL for this paper: http://eprints.whiterose.ac.uk/88231/ Version: Submitted Version

More information

Nonintrusive Load Monitoring using TT-Transform and Neural Networks

Nonintrusive Load Monitoring using TT-Transform and Neural Networks Nonintrusive Load Monitoring using TT-Transform and Neural Networks Khairuddin Khalid 1, Azah Mohamed 2 Department of Electrical, Electronic and Systems Engineering Faculty of Engineering and Built Environment,

More information

The role of Fisher information in primary data space for neighbourhood mapping

The role of Fisher information in primary data space for neighbourhood mapping The role of Fisher information in primary data space for neighbourhood mapping H. Ruiz 1, I. H. Jarman 2, J. D. Martín 3, P. J. Lisboa 1 1 - School of Computing and Mathematical Sciences - Department of

More information

DESIGN OF KOHONEN SELF-ORGANIZING MAP WITH REDUCED STRUCTURE

DESIGN OF KOHONEN SELF-ORGANIZING MAP WITH REDUCED STRUCTURE DESIGN OF KOHONEN SELF-ORGANIZING MAP WITH REDUCED STRUCTURE S. Kajan, M. Lajtman Institute of Control and Industrial Informatics, Faculty of Electrical Engineering and Information Technology, Slovak University

More information

Application of Multivariate Adaptive Regression Splines to Evaporation Losses in Reservoirs

Application of Multivariate Adaptive Regression Splines to Evaporation Losses in Reservoirs Open access e-journal Earth Science India, eissn: 0974 8350 Vol. 4(I), January, 20, pp.5-20 http://www.earthscienceindia.info/ Application of Multivariate Adaptive Regression Splines to Evaporation Losses

More information

Self-Organizing Maps for cyclic and unbounded graphs

Self-Organizing Maps for cyclic and unbounded graphs Self-Organizing Maps for cyclic and unbounded graphs M. Hagenbuchner 1, A. Sperduti 2, A.C. Tsoi 3 1- University of Wollongong, Wollongong, Australia. 2- University of Padova, Padova, Italy. 3- Hong Kong

More information

SINGLE IMAGE ORIENTATION USING LINEAR FEATURES AUTOMATICALLY EXTRACTED FROM DIGITAL IMAGES

SINGLE IMAGE ORIENTATION USING LINEAR FEATURES AUTOMATICALLY EXTRACTED FROM DIGITAL IMAGES SINGLE IMAGE ORIENTATION USING LINEAR FEATURES AUTOMATICALLY EXTRACTED FROM DIGITAL IMAGES Nadine Meierhold a, Armin Schmich b a Technical University of Dresden, Institute of Photogrammetry and Remote

More information

A Neural Network Model Of Insurance Customer Ratings

A Neural Network Model Of Insurance Customer Ratings A Neural Network Model Of Insurance Customer Ratings Jan Jantzen 1 Abstract Given a set of data on customers the engineering problem in this study is to model the data and classify customers

More information

Wind energy production forecasting

Wind energy production forecasting Wind energy production forecasting Floris Ouwendijk a Henk Koppelaar a Rutger ter Borg b Thijs van den Berg b a Delft University of Technology, PO box 5031, 2600 GA Delft, the Netherlands b NUON Energy

More information

A novel firing rule for training Kohonen selforganising

A novel firing rule for training Kohonen selforganising A novel firing rule for training Kohonen selforganising maps D. T. Pham & A. B. Chan Manufacturing Engineering Centre, School of Engineering, University of Wales Cardiff, P.O. Box 688, Queen's Buildings,

More information

Server room guide helps energy managers reduce server consumption

Server room guide helps energy managers reduce server consumption Server room guide helps energy managers reduce server consumption Jan Viegand Viegand Maagøe Nr. Farimagsgade 37 1364 Copenhagen K Denmark jv@viegandmaagoe.dk Keywords servers, guidelines, server rooms,

More information

3-D MRI Brain Scan Classification Using A Point Series Based Representation

3-D MRI Brain Scan Classification Using A Point Series Based Representation 3-D MRI Brain Scan Classification Using A Point Series Based Representation Akadej Udomchaiporn 1, Frans Coenen 1, Marta García-Fiñana 2, and Vanessa Sluming 3 1 Department of Computer Science, University

More information

SOM based methods in early fault detection of nuclear industry

SOM based methods in early fault detection of nuclear industry SOM based methods in early fault detection of nuclear industry Miki Sirola, Jaakko Talonen and Golan Lampi Helsinki University of Technology - Department of Information and Computer Science P.O.Box 5400,

More information

SOLWEIG1D. User Manual - Version 2015a. Date: Fredrik Lindberg Göteborg Urban Climate Group, University of Gothenburg

SOLWEIG1D. User Manual - Version 2015a. Date: Fredrik Lindberg Göteborg Urban Climate Group, University of Gothenburg Göteborg Urban Climate Group Department of Earth Sciences University of Gothenburg SOLWEIG1D User Manual - Version 2015a Date: 2015 06 17 Fredrik Lindberg Göteborg Urban Climate Group, University of Gothenburg

More information

PROCESS STATE IDENTIFICATION AND MODELING IN A FLUIDIZED BED ENERGY PLANT BY USING ARTIFICIAL NEURAL NETWORKS

PROCESS STATE IDENTIFICATION AND MODELING IN A FLUIDIZED BED ENERGY PLANT BY USING ARTIFICIAL NEURAL NETWORKS PROCESS STATE IDENTIFICATION AND MODELING IN A FLUIDIZED BED ENERGY PLANT BY USING ARTIFICIAL NEURAL NETWORKS MIKA LIUKKONEN 1,*, EERO HÄLIKKÄ 2, REIJO KUIVALAINEN 2, YRJÖ HILTUNEN 1 1 Department of Environmental

More information

Essentials for Modern Data Analysis Systems

Essentials for Modern Data Analysis Systems Essentials for Modern Data Analysis Systems Mehrdad Jahangiri, Cyrus Shahabi University of Southern California Los Angeles, CA 90089-0781 {jahangir, shahabi}@usc.edu Abstract Earth scientists need to perform

More information

Identification of Multisensor Conversion Characteristic Using Neural Networks

Identification of Multisensor Conversion Characteristic Using Neural Networks Sensors & Transducers 3 by IFSA http://www.sensorsportal.com Identification of Multisensor Conversion Characteristic Using Neural Networks Iryna TURCHENKO and Volodymyr KOCHAN Research Institute of Intelligent

More information

A Rule Chaining Architecture Using a Correlation Matrix Memory. James Austin, Stephen Hobson, Nathan Burles, and Simon O Keefe

A Rule Chaining Architecture Using a Correlation Matrix Memory. James Austin, Stephen Hobson, Nathan Burles, and Simon O Keefe A Rule Chaining Architecture Using a Correlation Matrix Memory James Austin, Stephen Hobson, Nathan Burles, and Simon O Keefe Advanced Computer Architectures Group, Department of Computer Science, University

More information

Research Article Forecasting SPEI and SPI Drought Indices Using the Integrated Artificial Neural Networks

Research Article Forecasting SPEI and SPI Drought Indices Using the Integrated Artificial Neural Networks Computational Intelligence and Neuroscience Volume 2016, Article ID 3868519, 17 pages http://dx.doi.org/10.1155/2016/3868519 Research Article Forecasting SPEI and SPI Drought Indices Using the Integrated

More information

Color Space Projection, Feature Fusion and Concurrent Neural Modules for Biometric Image Recognition

Color Space Projection, Feature Fusion and Concurrent Neural Modules for Biometric Image Recognition Proceedings of the 5th WSEAS Int. Conf. on COMPUTATIONAL INTELLIGENCE, MAN-MACHINE SYSTEMS AND CYBERNETICS, Venice, Italy, November 20-22, 2006 286 Color Space Projection, Fusion and Concurrent Neural

More information

Empirical transfer function determination by. BP 100, Universit de PARIS 6

Empirical transfer function determination by. BP 100, Universit de PARIS 6 Empirical transfer function determination by the use of Multilayer Perceptron F. Badran b, M. Crepon a, C. Mejia a, S. Thiria a and N. Tran a a Laboratoire d'oc anographie Dynamique et de Climatologie

More information

2 Michael E. Leventon and Sarah F. F. Gibson a b c d Fig. 1. (a, b) Two MR scans of a person's knee. Both images have high resolution in-plane, but ha

2 Michael E. Leventon and Sarah F. F. Gibson a b c d Fig. 1. (a, b) Two MR scans of a person's knee. Both images have high resolution in-plane, but ha Model Generation from Multiple Volumes using Constrained Elastic SurfaceNets Michael E. Leventon and Sarah F. F. Gibson 1 MIT Artificial Intelligence Laboratory, Cambridge, MA 02139, USA leventon@ai.mit.edu

More information

Analysis of Semantic Information Available in an Image Collection Augmented with Auxiliary Data

Analysis of Semantic Information Available in an Image Collection Augmented with Auxiliary Data Analysis of Semantic Information Available in an Image Collection Augmented with Auxiliary Data Mats Sjöberg, Ville Viitaniemi, Jorma Laaksonen, and Timo Honkela Adaptive Informatics Research Centre, Helsinki

More information

By choosing to view this document, you agree to all provisions of the copyright laws protecting it.

By choosing to view this document, you agree to all provisions of the copyright laws protecting it. Jussi Pakkanen and Jukka Iivarinen, A Novel Self Organizing Neural Network for Defect Image Classification. In Proceedings of the International Joint Conference on Neural Networks, pages 2553 2558, Budapest,

More information

Neural Network Based Offline Signature Recognition and Verification System

Neural Network Based Offline Signature Recognition and Verification System Abstract Research Journal of Engineering Sciences ISSN 2278 9472 Neural Network Based Offline Signature Recognition and Verification System Paigwar Shikha and Shukla Shailja Department of Electrical Engineering,

More information

Validation for Data Classification

Validation for Data Classification Validation for Data Classification HILARIO LÓPEZ and IVÁN MACHÓN and EVA FERNÁNDEZ Departamento de Ingeniería Eléctrica, Electrónica de Computadores y Sistemas Universidad de Oviedo Edificio Departamental

More information

Error Analysis, Statistics and Graphing

Error Analysis, Statistics and Graphing Error Analysis, Statistics and Graphing This semester, most of labs we require us to calculate a numerical answer based on the data we obtain. A hard question to answer in most cases is how good is your

More information

Using Decision Boundary to Analyze Classifiers

Using Decision Boundary to Analyze Classifiers Using Decision Boundary to Analyze Classifiers Zhiyong Yan Congfu Xu College of Computer Science, Zhejiang University, Hangzhou, China yanzhiyong@zju.edu.cn Abstract In this paper we propose to use decision

More information

Environmental Assessment Knowledge & Tools. Ning Liu Laboratory for architectural production

Environmental Assessment Knowledge & Tools. Ning Liu Laboratory for architectural production Environmental Assessment Knowledge & Tools Ning Liu Laboratory for architectural production 2010.03.04 lapa environment input BASICS LAPA MASTER DESIGN STUDIO INPUTS GOALS -INTRODUCE ASSESSMENT KNOWLEDGE

More information

Processing Missing Values with Self-Organized Maps

Processing Missing Values with Self-Organized Maps Processing Missing Values with Self-Organized Maps David Sommer, Tobias Grimm, Martin Golz University of Applied Sciences Schmalkalden Department of Computer Science D-98574 Schmalkalden, Germany Phone:

More information

DEVELOPMENT OF HIGH RESOLUTION 3D SOUND PROPAGATION MODEL USING LIDAR DATA AND AIR PHOTO

DEVELOPMENT OF HIGH RESOLUTION 3D SOUND PROPAGATION MODEL USING LIDAR DATA AND AIR PHOTO DEVELOPMENT OF HIGH RESOLUTION 3D SOUND PROPAGATION MODEL USING LIDAR DATA AND AIR PHOTO Susham Biswas*, Bharat Lohani Dept. of Civil Engineering, Indian Institute of Technology Kanpur, 208016 India -

More information

Calculation Methods. IES Virtual Environment 6.4 CIBSE Heat Loss & Heat Gain (ApacheCalc)

Calculation Methods. IES Virtual Environment 6.4 CIBSE Heat Loss & Heat Gain (ApacheCalc) Calculation Methods IES Virtual Environment 6.4 CIBSE Heat Loss & Heat Gain (ApacheCalc) Contents Calculation Methods...1 1 Introduction...3 2 Heat Loss...4 2.1 Heat Loss Methodology... 4 3 Heat Gain...5

More information

Machine Learning and Pervasive Computing

Machine Learning and Pervasive Computing Stephan Sigg Georg-August-University Goettingen, Computer Networks 17.12.2014 Overview and Structure 22.10.2014 Organisation 22.10.3014 Introduction (Def.: Machine learning, Supervised/Unsupervised, Examples)

More information

Guidelines for Certification - Protective Coatings Inspectors

Guidelines for Certification - Protective Coatings Inspectors Guidelines for Certification - Protective Coatings Inspectors DOCUMENT No Fifth Edition March 2015 Issued under the Authority of the Certification Board for Inspection Personnel, New Zealand (CBIP) CONTENTS

More information

Pattern Recognition ( , RIT) Exercise 1 Solution

Pattern Recognition ( , RIT) Exercise 1 Solution Pattern Recognition (4005-759, 20092 RIT) Exercise 1 Solution Instructor: Prof. Richard Zanibbi The following exercises are to help you review for the upcoming midterm examination on Thursday of Week 5

More information

Wavelet filter bank based wide-band audio coder

Wavelet filter bank based wide-band audio coder Wavelet filter bank based wide-band audio coder J. Nováček Czech Technical University, Faculty of Electrical Engineering, Technicka 2, 16627 Prague, Czech Republic novacj1@fel.cvut.cz 3317 New system for

More information

Review on Methods of Selecting Number of Hidden Nodes in Artificial Neural Network

Review on Methods of Selecting Number of Hidden Nodes in Artificial Neural Network Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 11, November 2014,

More information

Multiresolution Texture Analysis of Surface Reflection Images

Multiresolution Texture Analysis of Surface Reflection Images Multiresolution Texture Analysis of Surface Reflection Images Leena Lepistö, Iivari Kunttu, Jorma Autio, and Ari Visa Tampere University of Technology, Institute of Signal Processing P.O. Box 553, FIN-330

More information

Objective. Commercial Sensitivities. Consistent Data Analysis Process. PCWG: 3 rd Intelligence Sharing Initiative Definition Document (PCWG-Share-03)

Objective. Commercial Sensitivities. Consistent Data Analysis Process. PCWG: 3 rd Intelligence Sharing Initiative Definition Document (PCWG-Share-03) PCWG: 3 rd Intelligence Sharing Initiative Definition Document (PCWG-Share-03) Objective The goals of the 3 rd PCWG Intelligence Sharing Initiative (hereafter PCWG-Share-03) are as follows: To objectively

More information

Optimization Model of K-Means Clustering Using Artificial Neural Networks to Handle Class Imbalance Problem

Optimization Model of K-Means Clustering Using Artificial Neural Networks to Handle Class Imbalance Problem IOP Conference Series: Materials Science and Engineering PAPER OPEN ACCESS Optimization Model of K-Means Clustering Using Artificial Neural Networks to Handle Class Imbalance Problem To cite this article:

More information

Radial Basis Function Networks

Radial Basis Function Networks Radial Basis Function Networks As we have seen, one of the most common types of neural network is the multi-layer perceptron It does, however, have various disadvantages, including the slow speed in learning

More information

Neural and Neurofuzzy Techniques Applied to Modelling Settlement of Shallow Foundations on Granular Soils

Neural and Neurofuzzy Techniques Applied to Modelling Settlement of Shallow Foundations on Granular Soils Neural and Neurofuzzy Techniques Applied to Modelling Settlement of Shallow Foundations on Granular Soils M. A. Shahin a, H. R. Maier b and M. B. Jaksa b a Research Associate, School of Civil & Environmental

More information