A SOFTWARE MODEL FOR THE MULTILAYER PERCEPTRON


Roberto Lopez and Eugenio Oñate
International Center for Numerical Methods in Engineering (CIMNE)
Edificio C1, Gran Capitán s/n, 08034 Barcelona, Spain

ABSTRACT

In this work we construct a software model for the multilayer perceptron neural network. The whole process is carried out in the Unified Modeling Language (UML), which provides a formal framework for the modeling of software systems. The final implementation, called Flood, has been written in the C++ programming language and placed under the GNU Lesser General Public License.

KEYWORDS

neural networks, multilayer perceptron, software engineering, Unified Modeling Language, C++ programming language

1. INTRODUCTION

The multilayer perceptron is a central neural network model that has found a wide range of applications. Two of the main learning tasks for the multilayer perceptron are function regression and pattern recognition. Both problems can be formulated as data modeling problems.

The function regression problem can be regarded as the problem of approximating a function from an input-target data set (Bishop 1995). The targets specify what the output response to the inputs should be.

The task of pattern recognition (or classification) can be stated as the process whereby a received pattern, characterized by a distinct set of features, is assigned to one of a prescribed number of classes (Bishop 1995). A pattern recognition problem can be solved by approximating a function from input-target data, where the inputs include a set of features which characterize a pattern, and the targets specify the class that each pattern belongs to.

Here we model a software implementation for the multilayer perceptron. The modeling process is carried out in the Unified Modeling Language (UML). The final implementation, called Flood, provides a comprehensive C++ class library for the solution of function regression and pattern recognition problems (Lopez 2007). Flood has been released as open source under the GNU Lesser General Public License.

2. THE MULTILAYER PERCEPTRON

Here we present a theory of the multilayer perceptron from the perspective of functional analysis and variational calculus. Within this theory, a multilayer perceptron is described by four concepts: a neuron model (the perceptron), a network architecture (the feed-forward architecture), and the associated objective functionals and training algorithms.

A neuron model is the basic information processing unit in a neural network. The perceptron is the characteristic neuron model in the multilayer perceptron (Bishop 1995). It computes a net input signal $u$ as a function $h$ of the input signals $x$ and the free parameters $(b, w)$. The net input signal is then subjected to an activation function $g$ to produce an output signal $y$. Two of the most used activation functions are the sigmoid function, $g(u) = \tanh(u)$, and the linear function, $g(u) = u$.
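The perceptron computation described above can be sketched in a few lines of C++. This is a minimal illustration, not code from Flood: the function names are hypothetical, and the combination function $h$ is assumed to be the usual bias plus weighted sum, $u = b + \sum_i w_i x_i$, which the text does not spell out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// The two activation functions named in the text.
double sigmoid_activation(double u) { return std::tanh(u); }  // g(u) = tanh(u)
double linear_activation(double u) { return u; }              // g(u) = u

// Net input signal u = h(x; b, w).
// Assumption: h is the bias plus the weighted sum of the inputs.
double net_input(double b, const std::vector<double>& w, const std::vector<double>& x) {
    assert(w.size() == x.size());
    double u = b;
    for (std::size_t i = 0; i < x.size(); ++i) {
        u += w[i] * x[i];
    }
    return u;
}

// Output signal y = g(h(x; b, w)) for a chosen activation function g.
double perceptron_output(double (*g)(double), double b,
                         const std::vector<double>& w, const std::vector<double>& x) {
    return g(net_input(b, w, x));
}
```

For example, `perceptron_output(sigmoid_activation, b, w, x)` evaluates a sigmoid neuron, and passing `linear_activation` instead evaluates a linear one with the same free parameters.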

IADIS International Conference Applied Computing 2007

Mathematically, a perceptron neuron model spans a parameterized function space $V$ from an input $X \subseteq \mathbb{R}^n$ to an output $Y \subseteq \mathbb{R}$ (Lopez & Oñate 2006). The function space $V$ is parameterized by the free parameters of the neuron $(b, w)$, and therefore the dimension of $V$ is $1 + n$. The elements of the function space spanned by a perceptron are of the form

$$y: \mathbb{R}^n \to \mathbb{R}, \qquad x \mapsto y(x; b, w).$$

Although a single perceptron can perform certain simple tasks, the power of neural computation comes from connecting many neurons in a network architecture. The architecture of a neural network refers to the number of neurons, their arrangement and connectivity. The characteristic network architecture in the multilayer perceptron is the so-called feed-forward architecture (Bishop 1995).

A feed-forward architecture typically consists of an input layer of sensory nodes, one or more hidden layers of neurons, and an output layer of neurons. Communication proceeds layer by layer from the input layer via the hidden layers up to the output layer. In this way, a multilayer perceptron is a feed-forward network architecture of perceptron neuron models.

As with a single perceptron, a multilayer perceptron spans a parameterized function space $V$ from an input $X \subseteq \mathbb{R}^n$ to an output $Y \subseteq \mathbb{R}^m$ (Lopez & Oñate 2006). Elements of $V$ are parameterized by the free parameters in the network, which can be grouped together in an $s$-dimensional free parameter vector $\alpha$. The dimension of the function space $V$ is therefore $s$. The elements of the function space spanned by a multilayer perceptron are of the form

$$y: \mathbb{R}^n \to \mathbb{R}^m, \qquad x \mapsto y(x; \alpha).$$

A multilayer perceptron with as few as one hidden layer of sigmoid neurons and an output layer of linear neurons provides a general framework for approximating any function from one finite dimensional space to another up to any desired degree of accuracy, provided sufficiently many hidden neurons are available. In this sense, multilayer perceptron networks are a class of universal approximators (Hornik et al 1989).
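To make the mapping $x \mapsto y(x; \alpha)$ concrete, the following sketch evaluates a network with one hidden layer of sigmoid neurons and an output layer of linear neurons, and counts its free parameters. The `TinyMlp` structure and both function names are hypothetical and are not taken from Flood. Each neuron is assumed to combine its inputs as a bias plus a weighted sum. For this particular layout, with $n$ inputs, $h$ hidden neurons and $m$ outputs, the count is $s = h(n+1) + m(h+1)$.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container: n inputs, one hidden layer of tanh neurons,
// m linear output neurons. This is an illustration, not the Flood class.
struct TinyMlp {
    std::vector<std::vector<double>> hidden_w;  // [hidden neuron][input]
    std::vector<double> hidden_b;               // [hidden neuron]
    std::vector<std::vector<double>> output_w;  // [output neuron][hidden neuron]
    std::vector<double> output_b;               // [output neuron]
};

// Dimension s of the free parameter vector alpha: all biases and weights.
std::size_t parameter_count(const TinyMlp& net) {
    std::size_t s = net.hidden_b.size() + net.output_b.size();
    for (const auto& row : net.hidden_w) s += row.size();
    for (const auto& row : net.output_w) s += row.size();
    return s;
}

// Feed-forward evaluation: input layer -> hidden layer -> output layer.
std::vector<double> mlp_output(const TinyMlp& net, const std::vector<double>& x) {
    std::vector<double> hidden(net.hidden_b.size());
    for (std::size_t j = 0; j < hidden.size(); ++j) {
        assert(net.hidden_w[j].size() == x.size());
        double u = net.hidden_b[j];
        for (std::size_t i = 0; i < x.size(); ++i) u += net.hidden_w[j][i] * x[i];
        hidden[j] = std::tanh(u);  // sigmoid hidden neuron
    }
    std::vector<double> y(net.output_b.size());
    for (std::size_t k = 0; k < y.size(); ++k) {
        assert(net.output_w[k].size() == hidden.size());
        double u = net.output_b[k];
        for (std::size_t j = 0; j < hidden.size(); ++j) u += net.output_w[k][j] * hidden[j];
        y[k] = u;  // linear output neuron
    }
    return y;
}
```

A network with 2 inputs, 3 hidden neurons and 1 output built this way has $3 \cdot 3 + 1 \cdot 4 = 13$ free parameters.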
The objective functional defines the task that the network is required to accomplish and provides a measure of the quality of the representation that it is required to learn. An objective functional for the multilayer perceptron is of the form

$$F: V \to \mathbb{R}, \qquad y(x; \alpha) \mapsto F[y(x; \alpha)].$$

The learning problem for the multilayer perceptron can then be formulated in terms of the minimization of an objective functional of the function space spanned by the neural network (Lopez & Oñate 2006).

One of the most common objective functionals used in function regression and pattern recognition is the sum squared error (SSE), which is measured on an input-target data set (Bishop 1995). There are several variants of the sum squared error. Two of the most used are the mean squared error (MSE) and the root mean squared error (RMSE).

The objective functional $F[y(x; \alpha)]$ has an associated objective function $f(\alpha)$, which is defined as a function of the free parameters in the network,

$$f: \mathbb{R}^s \to \mathbb{R}, \qquad \alpha \mapsto f(\alpha).$$

The minimum or maximum value of the objective functional is achieved for a vector of free parameters at which the objective function takes on a minimum or maximum value, respectively. Therefore, the learning problem for the multilayer perceptron, formulated as a variational problem, can be reduced to a function optimization problem (Lopez & Oñate 2006).

The training algorithm is entrusted with solving the reduced function optimization problem by adjusting the free parameters in the network so as to optimize the objective function. More specifically, the training algorithm searches in an $s$-dimensional space for a parameter vector $\alpha^*$ at which the objective function $f$ takes a maximum or a minimum value. The tasks of maximization and minimization are trivially related to each other, since maximization of $f(\alpha)$ is equivalent to minimization of $-f(\alpha)$, and vice versa. A minimum, in turn, can be either a global minimum, the smallest value of the function over its entire range, or a local minimum, the smallest value of the function within some local neighborhood.

Training algorithms might require information from the objective function only, from the gradient vector of the objective function, or from the Hessian matrix of the objective function. These methods, in turn, can perform either global or local optimization.

Zero-order training algorithms make use of the objective function only. The most significant zero-order training algorithms are stochastic, that is, they involve randomness in the optimization process. Typical examples are evolutionary algorithms (Fogel 1994), which are global optimization methods.

First-order training algorithms use the objective function and its gradient vector. Examples of these are gradient descent, conjugate gradient and quasi-Newton methods (Bishop 1995). They are all local optimization methods.

Second-order training algorithms make use of the objective function, its gradient vector and its Hessian matrix. An example is Newton's method (Bishop 1995), which is a local optimization method.

3. THE SOFTWARE MODEL

The Unified Modeling Language (UML) is a general purpose visual modeling language that is used to specify, visualize, construct, and document the artifacts of a software system (Rumbaugh et al 1999). UML class diagrams show the classes of the system, their interrelationships and the attributes and operations of the classes.

In order to construct a software model for the multilayer perceptron, we follow a top-down development. This approach to the problem begins at the highest conceptual level and works down to the details. In this way, to create and evolve a conceptual class diagram for the multilayer perceptron, we iteratively model (i) classes, (ii) associations, (iii) derived classes and (iv) attributes and operations.

In object-oriented modeling, concepts are represented by means of classes. Therefore, a prime task is to identify the main concepts (or classes) of the problem domain.
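As an aside on the first-order training algorithms of Section 2, the reduction of learning to function optimization can be illustrated with a fixed-step gradient descent on a generic objective function $f(\alpha)$. This is a sketch under stated assumptions rather than the Flood implementation: the names `Objective`, `numerical_gradient` and `gradient_descent` are hypothetical, the gradient is approximated by central differences instead of being computed analytically, and the step size is held constant.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// An objective function f: R^s -> R of the free parameter vector alpha.
using Objective = std::function<double(const std::vector<double>&)>;

// Central-difference estimate of the gradient of f at alpha.
// (Assumption: a numerical stand-in for an analytical gradient.)
std::vector<double> numerical_gradient(const Objective& f, std::vector<double> alpha,
                                       double h = 1e-6) {
    std::vector<double> gradient(alpha.size());
    for (std::size_t i = 0; i < alpha.size(); ++i) {
        const double saved = alpha[i];
        alpha[i] = saved + h;
        const double f_plus = f(alpha);
        alpha[i] = saved - h;
        const double f_minus = f(alpha);
        alpha[i] = saved;
        gradient[i] = (f_plus - f_minus) / (2.0 * h);
    }
    return gradient;
}

// Fixed-step gradient descent: a local, first-order minimization of f.
std::vector<double> gradient_descent(const Objective& f, std::vector<double> alpha,
                                     double rate, std::size_t iterations) {
    assert(rate > 0.0);
    for (std::size_t k = 0; k < iterations; ++k) {
        const std::vector<double> gradient = numerical_gradient(f, alpha);
        for (std::size_t i = 0; i < alpha.size(); ++i) {
            alpha[i] -= rate * gradient[i];
        }
    }
    return alpha;
}
```

To maximize an objective instead, the same routine can be applied to $-f(\alpha)$, in line with the equivalence noted in Section 2.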
In UML class diagrams, classes are depicted as boxes (Rumbaugh et al 1999). In this work, we have seen that the multilayer perceptron is characterized by a neuron model, a network architecture, and associated objective functionals and training algorithms. The characterization in classes of these four concepts for the multilayer perceptron is as follows:

- Perceptron: The class which represents the concept of perceptron neuron model is called Perceptron.
- Multilayer perceptron: The class representing the concept of multilayer perceptron network architecture is called MultilayerPerceptron.
- Objective functional: The class which represents the concept of objective functional for a multilayer perceptron is called ObjectiveFunctional.
- Training algorithm: The class representing the concept of training algorithm for a multilayer perceptron is called TrainingAlgorithm.

Once the main concepts in the model have been identified, it is necessary to add the associations among them. An association is a relationship between two concepts that conveys some significant or interesting information (Rumbaugh et al 1999). The appropriate associations to be included in the UML class diagram of the system are identified next:

- Perceptron - Multilayer perceptron: A multilayer perceptron is built from perceptrons.
- Multilayer perceptron - Objective functional: A multilayer perceptron is assigned an objective functional.
- Objective functional - Training algorithm: An objective functional is improved by a training algorithm.

In object-oriented programming, some classes are designed only as a parent from which sub-classes may be derived, but which are not themselves suitable for instantiation. Such a class is said to be an abstract class, as opposed to a concrete class, which is suitable to be instantiated. The derived class contains all the features of the base class, but may have new features added or redefine existing features (Rumbaugh et al 1999). Associations between a base class and a derived class are of the kind "is a".
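The distinction between abstract and concrete classes can be illustrated in C++ with the neuron model as the example. The sketch below is an assumption about how such a hierarchy might look, not the actual Flood interface: the member names are hypothetical, and the net input is again assumed to be a bias plus a weighted sum. The base class cannot be instantiated because its activation function is pure virtual, while each derived class "is a" perceptron that fixes one activation function.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Abstract base class: the activation function is left unspecified
// (pure virtual), so no object of this class can be created.
class Perceptron {
public:
    Perceptron(double bias, std::vector<double> weights)
        : bias_(bias), weights_(std::move(weights)) {}
    virtual ~Perceptron() = default;

    std::size_t number_of_inputs() const { return weights_.size(); }

    // Output signal for a given set of input signals.
    double output(const std::vector<double>& inputs) const {
        assert(inputs.size() == weights_.size());
        double u = bias_;  // assumed combination: bias plus weighted sum
        for (std::size_t i = 0; i < inputs.size(); ++i) u += weights_[i] * inputs[i];
        return activation(u);
    }

protected:
    virtual double activation(double u) const = 0;

private:
    double bias_;
    std::vector<double> weights_;
};

// Concrete class: a perceptron with g(u) = tanh(u).
class SigmoidPerceptron : public Perceptron {
public:
    using Perceptron::Perceptron;
protected:
    double activation(double u) const override { return std::tanh(u); }
};

// Concrete class: a perceptron with g(u) = u.
class LinearPerceptron : public Perceptron {
public:
    using Perceptron::Perceptron;
protected:
    double activation(double u) const override { return u; }
};
```

Code that only needs "some perceptron" can hold a `const Perceptron&` and call `output`, leaving the choice of activation function to the concrete object behind the reference.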
The next task is then to establish which classes are abstract and to derive the necessary concrete classes to be added to the system. Let us then examine the classes we have so far:

- Perceptron: The class Perceptron is abstract, because it does not represent any concrete neuron model until it is assigned a specific activation function. On the other hand, a multilayer perceptron with a sigmoid hidden layer and a linear output layer meets the universal approximation theorem (Hornik et al 1989). Therefore, concrete classes for the sigmoid and linear perceptrons must be derived. These are called SigmoidPerceptron and LinearPerceptron, respectively.
- Multilayer perceptron: The class MultilayerPerceptron is a concrete class and is itself suitable for instantiation.
- Objective functional: The class ObjectiveFunctional is abstract, because it does not represent a concrete objective functional for the multilayer perceptron. Three objective functionals which are widely used in function regression and pattern recognition are the sum squared error, the mean squared error and the root mean squared error. Therefore we derive the classes SumSquaredError, MeanSquaredError and RootMeanSquaredError. New objective functionals for the multilayer perceptron can be derived at any time and included in the system.
- Training algorithm: The class TrainingAlgorithm is abstract, because it does not represent a concrete training algorithm for the multilayer perceptron. Here we derive the classes GradientDescent, ConjugateGradient, NewtonMethod, QuasiNewtonMethod and EvolutionaryAlgorithm to represent the concepts of gradient descent, conjugate gradient, Newton's method, quasi-Newton method and evolutionary algorithm, respectively. As before, any new training algorithm for the multilayer perceptron can be derived and added to the system.

An attribute is a named value or relationship that exists for all or some instances of a class. An operation is a procedure associated with a class (Rumbaugh et al 1999). In UML class diagrams, classes are depicted as boxes with three sections: the top one indicates the name of the class, the one in the middle lists the attributes of the class, and the bottom one lists the operations. Only the main attributes and operations of the most important classes in the system are identified next for inclusion in the UML class diagram:

- Perceptron: The main attribute of a perceptron is the number of inputs. The main operation it performs is to get the output signal for a given set of input signals.
- Multilayer perceptron: The main attributes of a multilayer perceptron are the number of inputs, the number of hidden neurons and the number of outputs. The main operation it performs is to get the set of outputs for a given set of inputs.
- Objective functional: The main attribute of an objective functional is a relationship to a multilayer perceptron, implemented in C++ as a pointer to a multilayer perceptron object (Stroustrup 2000). The sum squared error, mean squared error and root mean squared error classes also contain a relationship to an input-target data set, implemented in C++ as a pointer to an input-target data set object. The main operation it performs is to obtain the evaluation of a multilayer perceptron.
- Training algorithm: The main attribute of a training algorithm is a relationship to an objective functional for a multilayer perceptron. In C++ this is implemented as a pointer to an objective functional object (Stroustrup 2000). The main operation it performs is to train a multilayer perceptron.

Figure 1 shows a simplified UML class diagram for the Flood library with all the base classes, derived classes, associations, attributes and operations included (Lopez 2007).

Figure 1. The UML class diagram of Flood.
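The pointer-based relationships listed above might be laid out in C++ roughly as follows. This is a hedged sketch and not the Flood source: the class names follow the text, but every member name is hypothetical, `MultilayerPerceptron` and `InputTargetDataSet` are reduced to single-input, single-output stand-ins to keep the example short, and the error measures use the common definitions SSE $= \sum_i e_i^2$, MSE $=$ SSE$/N$ and RMSE $= \sqrt{\text{MSE}}$, which the text does not state explicitly.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins, reduced to one input and one output for brevity.
struct InputTargetDataSet {
    std::vector<double> inputs;
    std::vector<double> targets;
};

struct MultilayerPerceptron {
    double bias = 0.0;
    double weight = 0.0;
    double output(double x) const { return bias + weight * x; }
};

// Sum of squared output errors over the data set (assumed definition).
double squared_error_sum(const MultilayerPerceptron& mlp, const InputTargetDataSet& data) {
    assert(!data.inputs.empty() && data.inputs.size() == data.targets.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < data.inputs.size(); ++i) {
        const double error = mlp.output(data.inputs[i]) - data.targets[i];
        sum += error * error;
    }
    return sum;
}

// Abstract class: holds the relationship to a multilayer perceptron as a pointer.
class ObjectiveFunctional {
public:
    explicit ObjectiveFunctional(const MultilayerPerceptron* mlp) : mlp_(mlp) {}
    virtual ~ObjectiveFunctional() = default;
    virtual double evaluation() const = 0;
protected:
    const MultilayerPerceptron* mlp_;
};

// Concrete classes: each adds a pointer to an input-target data set.
class SumSquaredError : public ObjectiveFunctional {
public:
    SumSquaredError(const MultilayerPerceptron* mlp, const InputTargetDataSet* data)
        : ObjectiveFunctional(mlp), data_(data) {}
    double evaluation() const override { return squared_error_sum(*mlp_, *data_); }
private:
    const InputTargetDataSet* data_;
};

class MeanSquaredError : public ObjectiveFunctional {
public:
    MeanSquaredError(const MultilayerPerceptron* mlp, const InputTargetDataSet* data)
        : ObjectiveFunctional(mlp), data_(data) {}
    double evaluation() const override {
        return squared_error_sum(*mlp_, *data_) / static_cast<double>(data_->inputs.size());
    }
private:
    const InputTargetDataSet* data_;
};

class RootMeanSquaredError : public ObjectiveFunctional {
public:
    RootMeanSquaredError(const MultilayerPerceptron* mlp, const InputTargetDataSet* data)
        : ObjectiveFunctional(mlp), data_(data) {}
    double evaluation() const override {
        return std::sqrt(squared_error_sum(*mlp_, *data_) /
                         static_cast<double>(data_->inputs.size()));
    }
private:
    const InputTargetDataSet* data_;
};

// Abstract class: holds the relationship to an objective functional as a pointer.
class TrainingAlgorithm {
public:
    explicit TrainingAlgorithm(ObjectiveFunctional* objective) : objective_(objective) {}
    virtual ~TrainingAlgorithm() = default;
    virtual void train() = 0;
protected:
    ObjectiveFunctional* objective_;
};
```

Because a training algorithm only sees an `ObjectiveFunctional*`, a concrete algorithm written against this interface would work unchanged with any of the three error measures, which is the extensibility the derived-class design aims for.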

4. CONCLUSIONS

A software model for the multilayer perceptron has been constructed following a top-down development. The whole process has been carried out in the Unified Modeling Language (UML) and the final implementation has been written in the C++ programming language. The result is a comprehensive C++ class library for the solution of function regression and pattern recognition problems, called Flood and released under an open source license.

REFERENCES

Bishop, C., 1995. Neural Networks for Pattern Recognition. Oxford University Press.
Fogel, D.B., 1994. An introduction to simulated evolutionary optimization. In IEEE Transactions on Neural Networks, Vol. 5, No. 1, pp. 3-14.
Hornik, K. et al, 1989. Multilayer feedforward networks are universal approximators. In Neural Networks, Vol. 2, No. 5, pp. 359-366.
Lopez, R. and Oñate, E., 2006. A Variational Formulation for the Multilayer Perceptron. Proceedings of the 16th International Conference on Artificial Neural Networks. Athens, Greece, Vol. 1, pp. 159-168.
Lopez, R., 2007. Flood: An Open Source Neural Networks C++ Library, www.cimne.com/flood.
Rumbaugh, J. et al, 1999. The Unified Modeling Language Reference Manual. Addison Wesley.
Stroustrup, B., 2000. The C++ Programming Language. Addison Wesley.