Image Classification using Fast Learning Convolutional Neural Networks


pp. 50-55, http://dx.doi.org/10.14257/astl.2015.113.11

Keonhee Lee(1) and Dong-Chul Park(2)

(1) Software Device Research Center, Korea Electronics Technology Institute, Seongnam-si, Gyeonggi-do, Rep. of Korea, khlee@keti.re.kr
(2) Dept. of Electronics Engineering, Myoung Ji University, YongIn, Rep. of Korea, parkd@mju.ac.kr

Abstract. In this paper, we propose an image classification method that improves the learning speed of convolutional neural networks (CNN). Although CNN are widely used on multiclass image classification datasets, their learning speed remains slow for large amounts of data. We therefore attempt to improve the learning speed by applying an extreme learning machine (ELM), and we propose a learning method that combines a CNN with an ELM. First, we use an orthogonal bipolar vector (OBV) as the training target for the network. After training the CNN for a limited number of epochs, we determine the output layer of the neural network using the ELM. To evaluate the performance, we compare the learning speed and classification rate of a conventional CNN against the proposed method. Experimental results show that the proposed algorithm has a faster learning curve than the existing algorithm.

Keywords: Image Classification, Orthogonal Bipolar Vector, Convolutional Neural Network

1 Introduction

Image classification continues to attract considerable interest in the fields of pattern recognition and computer vision. Researchers are currently focusing on deep learning in many industrial areas; in particular, deep learning has been applied to recognition and classification. The convolutional neural networks (CNN) used in deep learning have been applied to handwritten character recognition, pedestrian detection, and traffic sign recognition [1], [2]. A CNN consists of pairs of convolution and subsampling layers, and the last part of the CNN structure consists of multilayer perceptron neural networks (MLPNNs) [3]. Because of this structure, a CNN is an integrated model for both feature extraction and classification. In recent years, many researchers have studied parallel computing as a way to increase training speed [4].

To address the need for fast learning, the extreme learning machine (ELM) was developed. The generalization performance of the ELM is better than that of conventional methods.

However, the ELM is at a disadvantage when big data and a large network are used, because the system memory required for calculating the inverse of the hidden layer output matrix grows in proportion to the size of that matrix. To overcome this disadvantage, an online sequential learning method (OS-ELM) was proposed [5].

In this paper, we propose a learning method that combines a CNN with an ELM to obtain a fast-learning neural network. To combine the two algorithms, we use the orthogonal bipolar vector (OBV) [6] method to train the CNN without the usual data target. After training the CNN using an OBV, the ELM trains the last classification layer.

The rest of this paper is organized as follows: Section 2 provides a brief review of the related CNN and ELM algorithms. Section 3 introduces the OBV target method for training the hidden layer weights of the CNN. Section 4 describes the results of the performance evaluation of the proposed networks. Finally, Section 5 presents some concluding remarks regarding this study.

2 Overview of Learning Algorithm

2.1 Convolutional Neural Networks

The structure of a CNN can be divided into a feature extraction part and a classification part. The feature extraction part contains convolution layers alternating with subsampling layers. A convolution layer has many randomly generated two-dimensional shared weights in [-0.5, 0.5]. The convolution operation that this layer applies to the input data using the shared weights can be expressed as follows:

\( y_j = \sum_i x_i * w_{ij} \)    (1)

where \( x_i \) denotes the data of the previous layer, \( w_{ij} \) represents the shared weight matrix, and \( * \) denotes two-dimensional convolution. After the convolution operation, the input data are reduced according to the size of the shared weight matrix. The result is then passed through a hyperbolic tangent activation function, so every output of the convolution layer is normalized to the range -1 to 1.

The subsampling layer supports two operations, average pooling and max-pooling. Many researchers use max-pooling because it performs better than average pooling on certain datasets:

\( s_j = \max_{p \in R_j} y_p \)    (2)

where \( s_j \) denotes the output data of the subsampling layer and the maximum is taken over the pooling region \( R_j \) of the previous layer's output. The convolution- and subsampling-layer operations are repeated until the data size is 1 x 1. The final data of the feature extraction part are the inputs of the classification layer, which consists of multilayer perceptron neural networks trained using gradient descent-based learning.
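As an illustration, the following is a minimal NumPy sketch of one feature-extraction stage (Eq. 1, then the hyperbolic tangent, then Eq. 2). It is not the Matlab/C implementation used in the experiments; the input size, kernel size, and pooling width are assumptions chosen for the example.

```python
import numpy as np

def conv2d_valid(x, w):
    """'Valid' 2-D convolution of one feature map x with shared weights w (Eq. 1).
    The kernel flip is omitted, as is common in CNN implementations."""
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    y = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            y[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return y

def max_pool(y, p=2):
    """Non-overlapping p x p max-pooling (Eq. 2)."""
    h, w = y.shape[0] // p, y.shape[1] // p
    return y[:h * p, :w * p].reshape(h, p, w, p).max(axis=(1, 3))

rng = np.random.default_rng(0)
x = rng.random((28, 28))                  # e.g. one MNIST-sized gray-scale image
w = rng.uniform(-0.5, 0.5, size=(5, 5))   # randomly generated shared weights in [-0.5, 0.5]
h = np.tanh(conv2d_valid(x, w))           # convolution output normalized to (-1, 1)
s = max_pool(h)                           # 24 x 24 feature map reduced to 12 x 12
```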

2.2 Extreme Learning Machine

Traditional training algorithms require complex calculations and a lengthy learning time. The ELM was therefore proposed for single-hidden-layer feedforward networks (SLFNs): the hidden node parameters are randomly generated according to a continuous probability distribution and are not trained, so the output layer parameters of the ELM are computed with a one-time inverse operation. The hidden layer output matrix of the ELM can be expressed as follows:

\( H = \mathrm{sig}(XW) \)    (3)

where \( X \) denotes the input data matrix and \( W \) represents the weights between the input layer and the hidden layer. The sig() function refers to the sigmoid function; the input data must therefore be normalized to the range zero to 1. The output weights \( \beta \) used to train the ELM can then be expressed as follows:

\( \beta = H^{\dagger} T \)    (4)

where \( H^{\dagger} \) represents the Moore-Penrose generalized inverse of the matrix \( H \) and \( T \) is the target matrix. Several methods can be used to calculate \( H^{\dagger} \), including orthogonalization, orthogonal projection, singular value decomposition (SVD), and Gauss elimination. For a huge dataset, however, a large amount of system memory is required to calculate the inverse of \( H \). We therefore apply OS-ELM as follows:

Initial phase:

\( P_0 = (H_0^T H_0)^{-1}, \quad \beta_0 = P_0 H_0^T T_0 \)    (6)

Sequential phase:

\( P_{k+1} = P_k - P_k H_{k+1}^T (I + H_{k+1} P_k H_{k+1}^T)^{-1} H_{k+1} P_k \)

\( \beta_{k+1} = \beta_k + P_{k+1} H_{k+1}^T (T_{k+1} - H_{k+1} \beta_k) \)    (7)
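The following NumPy sketch illustrates Eqs. 3-4 and the OS-ELM recursion of Eqs. 6-7 (the latter reconstructed from [5]). It is not the authors' implementation; the layer sizes, chunk sizes, and random data are assumptions for the example. Note that the initial chunk is at least as large as the number of hidden nodes, matching the constraint noted in Section 4.2.

```python
import numpy as np

def sig(z):
    """Sigmoid activation used in Eq. 3."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
d, L, C = 64, 128, 10              # input dim, hidden nodes, classes (illustrative)
W = rng.standard_normal((d, L))    # random input-to-hidden weights, never trained

# Batch ELM (Eqs. 3-4): H = sig(XW), beta = pinv(H) T
X0, T0 = rng.random((500, d)), rng.random((500, C))
H0 = sig(X0 @ W)
beta = np.linalg.pinv(H0) @ T0     # Moore-Penrose generalized inverse of H

# OS-ELM (Eqs. 6-7): recursive least-squares update, one data chunk at a time;
# the initial chunk (500 rows) is not smaller than the number of hidden nodes (128).
P = np.linalg.inv(H0.T @ H0)       # initial phase
for _ in range(3):                 # sequential phase over newly arriving chunks
    Xk, Tk = rng.random((200, d)), rng.random((200, C))
    Hk = sig(Xk @ W)
    K = np.linalg.inv(np.eye(len(Hk)) + Hk @ P @ Hk.T)
    P = P - P @ Hk.T @ K @ Hk @ P
    beta = beta + P @ Hk.T @ (Tk - Hk @ beta)
```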

3 Orthogonal Bipolar Vector Target

An OBV-type target is characterized by the fact that its distance to every other target vector is constant. The number of components of an OBV target is \( N = 2^q n \), where \( n \) is the number of components of the seed vector. The construction proceeds as follows (a code sketch is given after the list):

1) Initialize the parameters \( n \) and \( q \).
2) Generate the seed vector \( s(1) = (1, 1, 1, 1, \ldots, 1) \) with \( n \) components.
3) Construct the first pair of vectors: \( u(1) = [s(1), s(1)] \) and \( u(2) = [s(1), -s(1)] \).
4) Construct the next vectors from the current pair: \( v(1) = [u(1), u(1)] \), \( v(2) = [u(1), -u(1)] \), \( v(3) = [u(2), u(2)] \), and \( v(4) = [u(2), -u(2)] \).
5) Repeat step 4 until \( 2^q \) vectors with \( N \) components each are generated.
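Below is a minimal sketch of this construction, assuming the standard doubling reading of steps 3-4 (each step concatenates every current vector with itself and with its negation). The parameter values are illustrative; the final assertion checks that the generated targets are mutually orthogonal.

```python
import numpy as np

def obv_targets(n, q):
    """Generate 2**q mutually orthogonal bipolar target vectors of length 2**q * n."""
    vectors = [np.ones(n)]                      # step 2: seed vector (1, 1, ..., 1)
    for _ in range(q):                          # steps 3-5: double length and count
        vectors = [w for v in vectors
                   for w in (np.concatenate([v, v]), np.concatenate([v, -v]))]
    return np.array(vectors)

targets = obv_targets(n=1, q=4)                 # 16 target vectors of 16 components
G = targets @ targets.T                         # Gram matrix of the targets
assert np.allclose(G, np.diag(np.diag(G)))      # zero off-diagonal: mutually orthogonal
```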

4 Implementation

4.1 Image Dataset

The NORB dataset consists of toys from five different shape categories. Each of the main categories contains ten different items, and the images vary slightly in lighting, angle, and azimuth. The MNIST handwritten digit dataset has a training set of 60,000 examples and a test set of 10,000 examples. It consists of images of the digits zero to 9, and each gray-scale image is 28 x 28 pixels. The CIFAR-10 dataset consists of 60,000 32 x 32 color images divided into ten classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images; the dataset is divided into five training batches and one test batch, each with 10,000 images.

Table 1. Performance of the CNN and the proposed method

Dataset     CNN                              CNN + ELM
            Training time  Recognition rate  Training time  Recognition rate
NORB        29234 s        87.54%            4537.4 s       89.65%
MNIST       10475.1 s      92.14%            3078.5 s       93.54%
CIFAR-10    8020.5 s       63.54%            1244.8 s       64.36%

4.2 Experiment

In this study, we used an Intel Core i5-6600K CPU (3.50 GHz) with 16.00 GB of DDR4 RAM. We used Matlab 2014b with an engine library written in the C language for testing the CNN. Note that the data chunk size was never less than the number of hidden nodes. We compared the proposed algorithm with the existing algorithm experimentally using the NORB, CIFAR-10, and MNIST datasets. The results of the performance evaluation, obtained by measuring the learning time and the recognition rate on the test data, are shown in Table 1. On the NORB dataset, the proposed algorithm improved the learning speed by a factor of 6.4 and the recognition rate by 2.11%. Similarly, on the MNIST dataset, the learning speed improved by a factor of 3.4 and the recognition rate by 1.4%. CIFAR-10 was also classified by the proposed algorithm, which in this case showed a slight improvement over the conventional algorithm.

5 Conclusion

In this paper, we proposed a combination of a CNN with an ELM for image classification. The orthogonal bipolar vector (OBV) target method was applied to train the networks obtained by combining both algorithms. The experimental results demonstrate that the proposed method can conduct network learning at a faster rate than a conventional CNN; in addition, it can help in avoiding the local minima and overfitting problems. When applied to an image classification problem, the proposed method shows promising results in terms of training speed: combining the two algorithms reduces the learning time by a factor of 3 to 6 compared with a conventional CNN while also improving the recognition rate.

Acknowledgments. This work was supported by the IT R&D program of MKE/KEIT (1004019, the development of automotive synchronous Ethernet combined IVN/OVN and safety control system for 1 Gbps class).

References

1. Szarvas, M., Yoshizawa, A., Yamamoto, M., Ogata, J.: Pedestrian detection with convolutional neural networks. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 886-893. IEEE, San Diego (2005)
2. Jin, J., Fu, K., Zhang, C.: Traffic Sign Recognition With Hinge Loss Trained Convolutional Neural Networks. IEEE Trans. Intelligent Transportation Systems 15(5), 1991-2000 (2014)
3. Huang, F. J., LeCun, Y.: Large-scale Learning with SVM and Convolutional Nets for Generic Object Categorization. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 284-291. IEEE, New York (2006)
4. Scherer, D., Schulz, H., Behnke, S.: Accelerating Large-Scale Convolutional Neural Networks with Parallel Graphics Multiprocessors. In: Diamantaras, Duch, Iliadis (eds.) ICANN 2010. LNCS, vol. 6354, pp. 82-91. Springer, Thessaloniki (2010)

5. Liang, N. Y., Huang, G. B., Saratchandran, P., Sundararajan, N.: A Fast and Accurate Online Sequential Learning Algorithm for Feedforward Networks. IEEE Trans. Neural Networks 17(6), 1411-1423 (2006)
6. Nomura, S., Yamanaka, K., Katai, O., Kawakami, H., Shiose, T.: Improved MLP Learning via Orthogonal Bipolar Target Vectors. JACIII 9(6), 580-589 (2005)