Effects of Model Complexity on Generalization Performance of Convolutional Neural Networks

Tae-Jun Kim 1, Dongsu Zhang 2, and Joon Shik Kim 3

1 Seoul National University, Seoul 151-742, Korea, E-mail: tjkim@bi.snu.ac.kr
2 Yangjae High School, Seoul, Korea, E-mail: 96lives@gmail.com
3 University of Seoul, Seoul 130-743, Korea, E-mail: jskim.ozmagi@gmail.com

Abstract. Convolutional neural networks are known to be effective in learning complex image classification tasks. However, designing the architecture, and in particular choosing the complexity of the network structure, requires a more quantitative analysis. In this paper, we study the effect of model complexity on the generalization capability of convolutional neural networks on large-scale, real-life digit recognition data. We used the digit images of the MNIST dataset to train the neural networks and evaluated their performance on a test set of unobserved images. Using the LeNet software tool, we varied the number of hidden layers and the number of units in the layers to evaluate the effect of model complexity on the generalization capability of the convolutional neural networks. In our experimental settings, we observe robust generalization performance of the convolutional neural networks over a wide range of model complexities. We analyze and discuss how the convolution layer and the subsampling layer may contribute to the generalization performance.

Keywords: Convolutional neural networks, MNIST, LeNet, image classification, model complexity, Occam's razor, generalization performance

1. Introduction

Computers solve many problems well if programmed appropriately by human programmers. However, some artificial intelligence problems, such as image and speech analysis, are hard to program computers to solve. In this case, machine learning offers new possibilities since it allows a program to be built automatically from data, i.e. by repeatedly observing humans solving the problem.
An important issue in machine learning is how to control the complexity of the model: if the model is too simple, it cannot learn the data, whereas a model that is too complex may overfit the data. Convolutional neural networks are especially interesting as a machine learning model since they can learn complex patterns in real-life data sets, such as images. However, the design of the network structure, i.e. the number of layers and the number of units in the layers, remains an art. In this paper we aim to understand the relationship between the complexity of the network model and its generalization performance by exploring the architecture space experimentally. Our vision is to build a

deep neural network that can learn to recognize human faces as more human faces are observed. As the first stage of this research, in this paper we experiment with the MNIST benchmark data set. The paper is organized as follows. In Section 2 we describe the architecture and learning method of the convolutional neural network. In Section 3 we describe the data set and our experimental design. Section 4 reports the experimental results and their analysis. Section 5 concludes the work.

2. Deep Convolutional Neural Networks

Neural networks are neurobiologically inspired computational models of learning and memory. A single neuron $j$ processes information in two stages. It first computes the net input $net_j$ from the incoming activations $x_i$ of the presynaptic neurons $i \in \mathrm{Pre}$,

$$net_j = \sum_{i \in \mathrm{Pre}} w_{ji} x_i,$$

and then transfers the net input through a nonlinear activation function $\sigma(\cdot)$, such as the sigmoid function

$$\sigma(net) = \frac{1}{1 + \exp(-net)}.$$

The neurons are typically organized in a layered structure where a layer of neurons receives connections from the previous layer of neurons and there are no connections between the neurons in the same layer. One of the popular neural architectures is the multilayer perceptron, consisting of two fully-connected feedforward layers of neurons. If $I$ and $H$ denote the numbers of input neurons and hidden neurons, respectively, the $k$-th output of the multilayer perceptron is expressed as

$$f_k(x, w) = \sigma_k\left( \sum_{h=1}^{H} w^{(2)}_{kh} \, \sigma_h\left( \sum_{i=1}^{I} w^{(1)}_{hi} x_i \right) \right)$$

where $x$ is the input training pattern and $w = (w^{(1)}, w^{(2)})$ is the weight vector that determines the neural network function.

One of the most interesting features of neural networks is their learning capability: without programming, a neural network can automatically learn to solve problems from training examples, such as $D_N = \{(x_n, y_n) \mid n = 1, 2, \ldots, N\}$, where $x_n$ is the $n$-th input pattern and $y_n$ the associated target output pattern.
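As a minimal sketch of the two-stage computation above (weighted sum, then sigmoid), the following plain-Python function evaluates the multilayer perceptron output for one input pattern. The names `sigmoid` and `mlp_forward` and the list-of-rows weight layout are illustrative assumptions, not taken from the paper's software; the same sigmoid is assumed for the hidden and output layers, and bias terms are left out because the formula in the text has none.

```python
import math

def sigmoid(net):
    """Logistic sigmoid 1 / (1 + exp(-net))."""
    return 1.0 / (1.0 + math.exp(-net))

def mlp_forward(x, w1, w2):
    """Outputs f_k(x, w) of a two-layer perceptron (hypothetical helper).

    x  : list of I input activations x_i
    w1 : H rows of I weights, w1[h][i] is the input-to-hidden weight
    w2 : K rows of H weights, w2[k][h] is the hidden-to-output weight
    """
    # Stage 1 and 2 for every hidden neuron: net input, then sigmoid.
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    # Same two stages for every output neuron, fed by the hidden activations.
    return [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]

# Usage: 2 inputs, 2 hidden units, 1 output. With all-zero first-layer
# weights both hidden units output sigmoid(0) = 0.5.
outputs = mlp_forward([1.0, 2.0], [[0.0, 0.0], [0.0, 0.0]], [[1.0, 1.0]])
print(outputs)
```

With the weights in the usage lines, the single output equals the sigmoid of 1.0, since the two hidden activations of 0.5 are summed with unit weights.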
The well-known error back-propagation algorithm adapts the weight vectors iteratively, that is, by i) presenting an input pattern $x_n$ to the input layer of the neural network, ii) computing the neural network output vector $f(x_n, w)$, iii) computing the error $E_n(w)$ between the actual output $f(x_n, w)$ and the target output $y_n$,

$$E_n(w) = \frac{1}{2} \left\| y_n - f(x_n, w) \right\|^2,$$

iv) changing the weight values $w_{ji}$ by a small amount ($\eta$) in the direction of reducing the errors, $-\partial E_n / \partial w_{ji}$, with respect to the weights,

$$w_{ji} \leftarrow w_{ji} - \eta \frac{\partial E_n}{\partial w_{ji}},$$

and v) propagating the errors computed sequentially and backwards from the output layers in the direction of the input layers.

Due to their learning capability, neural networks have been successfully used for many applications in artificial intelligence, including pattern recognition, computer vision, natural language processing, and cognitive robotics. Neural networks have also been used as computational tools for science research to investigate how humans represent, process, and learn information.

Traditionally, two-layer multilayer perceptrons have been used for most applications. However, recent research shows that deep networks, i.e. learning models consisting of many layers of processing units, can outperform shallow networks. This is interesting since, theoretically, a multilayer perceptron having a single hidden layer can be proven to be a universal approximator and, thus, can learn any complex nonlinear function. Deep networks have proven especially useful if the datasets are complex, such as image and speech data.

There are two different classes of deep networks. One is the deep belief networks (DBNs), consisting of multiple layers of restricted Boltzmann machines (RBMs) [1-4]. An RBM consists of two layers of neurons which are fully connected in a feedforward fashion. A DBN is traditionally trained with contrastive divergence, which is based on Gibbs sampling [2].

Another class of deep networks is the convolutional neural networks (CNNs). In contrast to a deep belief network, a CNN consists of partially connected network layers. That is, a CNN consists of many stacks of a convolution layer and a sub-sampling layer (Fig. 1). The convolution layer contains a number of feature maps, where each unit has a partial receptive field of the input image. That is, in a CNN, only partial, local patches of the inputs, not all of them, are connected to the next sub-sampling layer.
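The local connectivity of a convolution layer can be illustrated with a short sketch in which every unit of a feature map sees only a small patch of the input. This is an illustration under stated assumptions rather than the code used in the paper: the function name `feature_map`, the square 5 x 5 kernel in the usage lines, the single shared bias and the sigmoid squashing are all hypothetical choices, and the kernel is applied without flipping (cross-correlation), as is common in CNN implementations.

```python
import math

def sigmoid(net):
    """Logistic sigmoid 1 / (1 + exp(-net))."""
    return 1.0 / (1.0 + math.exp(-net))

def feature_map(image, kernel, bias=0.0):
    """One feature map of a convolution layer (hypothetical helper).

    image  : 2-D list of pixel values
    kernel : square 2-D list of shared weights (k x k)
    Each output unit is connected only to the k x k patch under it, and
    all units of the map share the same kernel weights and bias.
    """
    k = len(kernel)
    rows = len(image) - k + 1      # "valid" positions only, no padding
    cols = len(image[0]) - k + 1
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            net = bias
            for dr in range(k):
                for dc in range(k):
                    net += kernel[dr][dc] * image[r + dr][c + dc]
            out_row.append(sigmoid(net))
        out.append(out_row)
    return out

# Usage: a blank 28 x 28 image and an assumed 5 x 5 kernel give a 24 x 24 map.
blank = [[0.0] * 28 for _ in range(28)]
kernel = [[1.0] * 5 for _ in range(5)]
fmap = feature_map(blank, kernel)
print(len(fmap), len(fmap[0]))  # 24 24
```

Because the kernel is shared across positions, the map has only k * k weights plus one bias regardless of the image size, which is why such layers have far fewer parameters than fully connected ones.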
The sub-sampling layer consists of a number of feature maps, which are scaled local averages of the respective feature maps in the previous convolutional layer. Each unit in the sub-sampling layer has one trainable coefficient for the local average and one trainable bias. The error back-propagation algorithm is used to train the convolutional neural networks [5-9].

Fig. 1. A six-layer convolutional neural network.
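The scaled local average performed by a sub-sampling layer can be sketched as follows. The function name `subsample`, the non-overlapping 2 x 2 window and the absence of a squashing function after the average are assumptions made for illustration; the text only states that each unit computes a local average with one trainable coefficient and one trainable bias.

```python
def subsample(fmap, coeff=1.0, bias=0.0, size=2):
    """Scaled local average of a feature map (hypothetical helper).

    fmap  : 2-D list holding one feature map of the previous layer
    coeff : trainable coefficient applied to the local average
    bias  : trainable bias added afterwards
    size  : side of the non-overlapping averaging window (assumed 2)
    Rows and columns that do not fill a whole window are dropped.
    """
    rows = len(fmap) // size
    cols = len(fmap[0]) // size
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            block = [fmap[r * size + dr][c * size + dc]
                     for dr in range(size) for dc in range(size)]
            out_row.append(coeff * sum(block) / len(block) + bias)
        out.append(out_row)
    return out

# Usage: a 4 x 4 map shrinks to 2 x 2; each output is the mean of one block.
demo = [[1.0, 2.0, 3.0, 4.0],
        [5.0, 6.0, 7.0, 8.0],
        [9.0, 10.0, 11.0, 12.0],
        [13.0, 14.0, 15.0, 16.0]]
print(subsample(demo))  # [[3.5, 5.5], [11.5, 13.5]]
```

Each sub-sampling map therefore adds only two trainable parameters while reducing the spatial resolution by the window size in each dimension.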

3. MNIST Digit Data Sets and Experiment Design

The MNIST data sets consist of handwritten digit images [10]. This database was distributed by the National Institute of Standards and Technology (NIST) [11]. The digits are centered and normalized in an image of fixed size (28 × 28 pixels). For the experiments, we used a training set and a test set of images of single-digit numbers in ten categories (from "0" to "9"). Some example images from the database are shown in Fig. 2.

Fig. 2. Examples of handwritten digit images from the MNIST database.

To study the effect of model complexity on the generalization performance, we designed the experiments by varying the number of layers and units. We compared the performances (the training and test errors) of CNN models of 4 layers (Fig. 3) and 6 layers (Fig. 1). We also varied the number of units in each layer for the 4-layer and 6-layer CNNs. Finally, we plot the generalization performance against the effective number of parameters (which depends on the number of layers and units).

Fig. 3. Architecture of the 4-layer CNN used in the experiments for comparison.
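As a worked check of how the number of parameters follows from the number of units, consider the fully connected comparison networks of Section 4, which have 784 input nodes, 10 output nodes and 100 to 280 hidden nodes. Assuming that only connection weights are counted and biases are not (an assumption, since the paper does not state the counting rule), a network with H hidden nodes has 784·H + H·10 = 794·H parameters. The intermediate hidden sizes 150, 200 and 250 below are inferred from the tick values on the parameter axis of Fig. 7, and the helper name `mlp_weight_count` is hypothetical.

```python
def mlp_weight_count(n_inputs, n_hidden, n_outputs):
    """Connection weights of a one-hidden-layer network, biases excluded."""
    return n_inputs * n_hidden + n_hidden * n_outputs

# Hidden sizes: 100 and 280 are given in the text, the others are inferred.
hidden_sizes = (100, 150, 200, 250, 280)
counts = [mlp_weight_count(784, h, 10) for h in hidden_sizes]
print(counts)  # [79400, 119100, 158800, 198500, 222320]
```

Under this counting rule the values coincide with the parameter counts shown for the ordinary neural networks, including the 119100-parameter setting discussed in Section 4, which corresponds to 150 hidden nodes.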

4. Experimental Results and Analysis

The experiments were performed with the Matlab code in [6]. We ran the four different kinds of experiments described in the previous section to analyze the effect of model complexity on the learning performance. The first analyzes the training and test errors vs. the number of layers. The second examines the training and test errors vs. the number of training iterations. The third is the change of training and test errors vs. the total number of adjustable parameters, which represents the effective model complexity. The last is the change of training and test errors vs. the total number of parameters when using ordinary neural networks (NNs).

Fig. 4 shows the experimental results for the 4-layer and 6-layer CNNs. The training error is measured by the mean squared error (MSE). The test error is evaluated by the percentage of false answers, i.e. the classification error rate. The training set consisted of 6,000 examples and the test set of 1,000 examples. The number of units was fixed to 8 for each layer and the number of iterations was 1,000 epochs. For this setting, we see that the deeper network achieves better performance in both training and test errors.

[Plot: vertical axis 0.00% to 35.00%; series: 4 layer, 6 layer; groups: Training error, Test error]

Fig. 4. The training and test errors vs. the number of layers in the convolutional neural networks.

Fig. 5 shows the changes of performance for various numbers of training epochs. Since a large number of training iterations is equivalent to a more complex model than a small number of iterations, increasing the number of iterations has the same effect as using more complex models. In Fig. 5 we see that the performance improves as more iterations are performed. We do not observe any overfitting behavior in this

setting of experiments. This shows an interesting property of the convolutional neural networks. We will come back to this issue in the discussion.

[Plot: vertical axis 0.00% to 90.00%; horizontal axis 10, 50, 300, 1000 epochs; series: 4 layer Training error, 4 layer Test error, 6 layer Training error, 6 layer Test error]

Fig. 5. Performance comparison for the number of epochs in training.

[Plot: vertical axis Error, 0.00% to 7.00%; horizontal axis Number of Parameters; series: Train error, Test error]

Fig. 6. Effects of the number of parameters of the convolutional neural networks (by changing the number of units as well as layers) on the training error (the mean squared error) and the test error (the classification error rate).

Fig. 6 compares the performance of the convolutional neural networks for a wide range of the number of parameters (360 to 3160). The range was generated by changing the number of units and the number of layers. The results show that both the

testing and training errors decrease as the number of parameters increases, i.e. the training error (the mean squared error) and the test error (the classification error rate) behave consistently. That is, the generalization performance over a wide range of model complexity is good except for some specific parameter settings (e.g. around 2860 parameters). This demonstrates the robustness of CNNs in generalization performance. However, this result should be interpreted carefully. The minimum test error we achieved in these experiments is above 3%, which is larger than reported optimal values such as 0.83% [4].

On the other hand, we did observe an overfitting range in the experiments with ordinary neural networks (Fig. 7). In this experiment, we fixed the input nodes (784) and output nodes (10) and varied the hidden nodes (from 100 to 280). The results show that the training errors continuously decrease as the number of parameters increases, but the testing errors increase around 119100 parameters. This means that NNs are much more susceptible to overfitting than CNNs.

[Plot: vertical axis Error, 8.00% to 11.50%; horizontal axis Number of Parameters, 79400, 119100, 158800, 198500, 222320; series: Train error, Test error]

Fig. 7. Effects of the number of parameters of the neural networks (by changing the number of units as well as layers) on the training error (the mean squared error) and the test error (the classification error rate).

5. Conclusion

We studied the generalization behavior of convolutional neural networks for various combinations of model complexities. The model complexities were varied by the number of layers, the number of units, the number of epochs, and the number of effective parameters. On the MNIST digit data set of 6,000 images we found that the

CNNs show robust generalization performance over a wide range of model complexities. The results are very interesting since the machine learning theory of model complexity says that the model should overfit if the model complexity increases. Perhaps one reason for not overfitting is that our experimental settings were, despite their variety, not wide enough to detect the overfitting ranges. It is also hard to find overfitting ranges because of the sub-sampling layers. However, considering the nature of the digit recognition problem and the relatively big data set size (6,000 images for training and 1,000 images for test), the convolutional networks seem to be able to find invariance in the images. In addition, the capability of subsampling also seems to contribute to generalization performance by building sparse models of the data.

Acknowledgement

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (NRF-2010-0017734-Videome), and supported in part by a KEIT grant funded by the Korea government (MKE) (KEIT-10035348-mLife, KEIT-10044009).

References

1. Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science, 313, 504-507 (2006)
2. Bengio, Y.: Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2, 1-127 (2009)
3. Hinton, G.E., Osindero, S., Teh, Y.-W.: A fast learning algorithm for deep belief nets. Neural Computation, 18, 1527-1554 (2006)
4. Deng, L., Yu, D., Platt, J.: Scalable stacking and learning for building deep architectures. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2133-2136. IEEE Press, New York (2012)
5. LeCun, Y., Boser, B., Denker, J.S., Henderson, D., Howard, R.E., Hubbard, W., Jackel, L.D.: Backpropagation applied to handwritten Zip code recognition. Neural Computation, 1, 541-551 (1989)
6. Palm, R.B.: Prediction as a candidate for learning deep hierarchical models of data. Technical Report, Technical University of Denmark (2012)
7.
Ciresan, D.C., Meier, U., Gambardella, L.M., Schmidhuber, J.: Deep big multilayer perceptron for digit recognition. In: Montavon, G. et al. (eds.) Neural Networks: Tricks of the Trade, 2nd edn., LNCS 7700, pp. 581-598. Springer, Heidelberg (2012)
8. Raiko, T., Valpola, H., LeCun, Y.: Deep learning made easier by linear transformations in perceptrons. In: 15th International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 924-932 (2012)
9. Bouchain, D.: Character recognition using convolutional neural networks. Technical report, Institute for Neural Information Processing (2006)

10. León, G.M., Moreno-Báez, A., Magallanes-Quintanar, R., Valdez-Cepeda, R.D.: Assessment in subsets of MNIST handwritten digits and their effect in the recognition rate. Journal of Pattern Recognition Research, 2, 244-252 (2011)
11. LeCun, Y.: The MNIST database of handwritten digits. Available: http://yann.lecun.com/exdb/mnist/index.html
12. Wilson, K.G.: The renormalization group and critical phenomena. Reviews of Modern Physics, 55, 583-600 (1983)