Research of Image Recognition Algorithm Based on Depth Learning

Similar documents
Lecture 5: Multilayer Perceptrons

Cluster Analysis of Electrical Behavior

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching

Face Detection with Deep Learning

Support Vector Machines

Outline. Self-Organizing Maps (SOM) US Hebbian Learning, Cntd. The learning rule is Hebbian like:

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration

Modular PCA Face Recognition Based on Weighted Average

ALEXNET FEATURE EXTRACTION AND MULTI-KERNEL LEARNING FOR OBJECT- ORIENTED CLASSIFICATION

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

An Improved Image Segmentation Algorithm Based on the Otsu Method

Load Balancing for Hex-Cell Interconnection Network

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Feature Reduction and Selection

The Study of Remote Sensing Image Classification Based on Support Vector Machine

The Discriminate Analysis and Dimension Reduction Methods of High Dimension

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Research and Application of Fingerprint Recognition Based on MATLAB

Face Recognition Based on SVM and 2DPCA

A Novel Adaptive Descriptor Algorithm for Ternary Pattern Textures

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A fast algorithm for color image segmentation

A Model Based on Multi-agent for Dynamic Bandwidth Allocation in Networks Guang LU, Jian-Wen QI

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1

An Optimal Algorithm for Prufer Codes *

The Research of Support Vector Machine in Agricultural Data Classification

EYE CENTER LOCALIZATION ON A FACIAL IMAGE BASED ON MULTI-BLOCK LOCAL BINARY PATTERNS

Machine Learning 9. week

Backpropagation: In Search of Performance Parameters

Parallelization of a Series of Extreme Learning Machine Algorithms Based on Spark

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

Deep learning is a good steganalysis tool when embedding key is reused for different images, even if there is a cover source-mismatch

A new segmentation algorithm for medical volume image based on K-means clustering

International Conference on Applied Science and Engineering Innovation (ASEI 2015)

Comparing Image Representations for Training a Convolutional Neural Network to Classify Gender

BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET

Classifier Selection Based on Data Complexity Measures *

Network Intrusion Detection Based on PSO-SVM

Convolutional Neural Network- based Human Recognition for Vision Occupancy Sensors

PRÉSENTATIONS DE PROJETS

Application of Clustering Algorithm in Big Data Sample Set Optimization

The Comparison of Calibration Method of Binocular Stereo Vision System Ke Zhang a *, Zhao Gao b

The Application Model of BP Neural Network for Health Big Data Shi-xin HUANG 1, Ya-ling LUO 2, *, Xue-qing ZHOU 3 and Tian-yao CHEN 4

Object-Based Techniques for Image Retrieval

Human Face Recognition Using Generalized. Kernel Fisher Discriminant

A New Feature of Uniformity of Image Texture Directions Coinciding with the Human Eyes Perception 1

12/2/2009. Announcements. Parametric / Non-parametric. Case-Based Reasoning. Nearest-Neighbor on Images. Nearest-Neighbor Classification

User Authentication Based On Behavioral Mouse Dynamics Biometrics

Image Representation & Visualization Basic Imaging Algorithms Shape Representation and Analysis. outline

A Binarization Algorithm specialized on Document Images and Photos

Edge Detection in Noisy Images Using the Support Vector Machines

Gender Classification using Interlaced Derivative Patterns

arxiv: v1 [cs.sd] 22 Dec 2017

Zhang Ning 1,2,, Zhu Jin-fu 1* 1 College of Civil Aviation of Nanjing, University of Aeronautics &

Using Neural Networks and Support Vector Machines in Data Mining

Solving two-person zero-sum game by Matlab

Design of Simulation Model on the Battlefield Environment ZHANG Jianli 1,a, ZHANG Lin 2,b *, JI Lijian 1,c, GUO Zhongwei 1,d

Mathematics 256 a course in differential equations for engineering students

Remote Sensing Image Retrieval Algorithm based on MapReduce and Characteristic Information

Hierarchical Image Retrieval by Multi-Feature Fusion

Random Kernel Perceptron on ATTiny2313 Microcontroller

Classifying Acoustic Transient Signals Using Artificial Intelligence

Professional competences training path for an e-commerce major, based on the ISM method

Virtual Machine Migration based on Trust Measurement of Computer Node

CSCI 5417 Information Retrieval Systems Jim Martin!

The Shortest Path of Touring Lines given in the Plane

Journal of Chemical and Pharmaceutical Research, 2014, 6(6): Research Article. A selective ensemble classification method on microarray data

Vol. 5, No. 3 March 2014 ISSN Journal of Emerging Trends in Computing and Information Sciences CIS Journal. All rights reserved.

An Indian Journal FULL PAPER ABSTRACT KEYWORDS. Trade Science Inc.

SUMMARY... I TABLE OF CONTENTS...II INTRODUCTION...

Chinese Word Segmentation based on the Improved Particle Swarm Optimization Neural Networks

Research Article A High-Order CFS Algorithm for Clustering Big Data

Evaluation of an Enhanced Scheme for High-level Nested Network Mobility

Face Recognition Method Based on Within-class Clustering SVM

Suppression for Luminance Difference of Stereo Image-Pair Based on Improved Histogram Equalization

A Gradient Difference based Technique for Video Text Detection

Image Matching Algorithm based on Feature-point and DAISY Descriptor

An Efficient Method for Deformable Segmentation of 3D US Prostate Images

A Clustering Algorithm Solution to the Collaborative Filtering

A Gradient Difference based Technique for Video Text Detection

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

Available online at Available online at Advanced in Control Engineering and Information Science

Hybrid Non-Blind Color Image Watermarking

The Improved K-nearest Neighbor Solder Joints Defect Detection Meiju Liu1, a, Lingyan Li1, b *and Wenbo Guo1, c

Image Emotional Semantic Retrieval Based on ELM

Research of Dynamic Access to Cloud Database Based on Improved Pheromone Algorithm

Angle-Independent 3D Reconstruction. Ji Zhang Mireille Boutin Daniel Aliaga

An Image Fusion Approach Based on Segmentation Region

Method of Wireless Sensor Network Data Fusion

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields

CAN COMPUTERS LEARN FASTER? Seyda Ertekin Computer Science & Engineering The Pennsylvania State University

High resolution 3D Tau-p transform by matching pursuit Weiping Cao* and Warren S. Ross, Shearwater GeoServices

Security Enhanced Dynamic ID based Remote User Authentication Scheme for Multi-Server Environments

RECOGNIZING GENDER THROUGH FACIAL IMAGE USING SUPPORT VECTOR MACHINE

The Classification using the Merged Imagery from SPOT and LANDSAT

Robust Shot Boundary Detection from Video Using Dynamic Texture

Type-2 Fuzzy Non-uniform Rational B-spline Model with Type-2 Fuzzy Data

Transcription:

208 4th World Conference on Control, Electroncs and Computer Engneerng (WCCECE 208) Research of Image Recognton Algorthm Based on Depth Learnng Zhang Jan, J Xnhao Zhejang Busness College, Hangzhou, Chna, 30000 Keywords: Depth Learnng, Image Recognton Algorthm, Deep eural etwork, Convoluton eural etwork Abstract: In recent years, deep learnng has become more and more wdely used n the felds of speech recognton, computer vson and onformatcs. Especally n the feld of mage recognton, deep learnng-based convolutonal neural networks show a strong techncal superorty that surpasses prevous machne vson methods. The tradtonal methods of machne vson have not been able to meet the practcalty and securty requrements of mage recognton n the g data era. At ths stage, deep learnng has become a hotspot n mage recognton technology. To ths end, the paper elaborates the background and basc deas of deep learnng, and analyzes common models of deep learnng: deep belef networks, algorthm prncples of convolutonal neural networks, and mage recognton effcency.. Introducton Wth the development of nformaton technology, the amount of mage nformaton s gettng larger and larger. Therefore, t s very mportant to extract the feature nformaton we need from the huge amount of mage nformaton and dentfy the mage accurately. year 2006. The deep belef network (DB) proposed by Hnton et al. Has become a research hotspot n the feld of mage recognton because of ts extraordnary alty of mathematcal representaton. Deep learnng can buld algorthm through deep neural network, so as to extract mage features herarchcally and get the alty of automatc learnng features through data tranng, whch has greatly promoted the development of mage recognton technology. In recent years, based on the depth of learnng mage recognton algorthm models contnue to emerge. Manly DB (Deep Belef etwork), C (Convolutonal eural etwork), R (Recurrent eural etworks). Practce has proved that convoluton neural network s the most deal structure n the current deep learnng structure, convoluton neural network n mage recognton s wdely used. Below, the text wll explan the basc dea of deep learnng. 2. Depth learnng n the feld of mage recognton applcatons Snce 2006, when G.E. Hnton et al. Proposed deep learnng, deep learnng has shown superor performance advantages n many felds and has become a research hotspot n the feld of Internet g data and artfcal ntellgence. In 202, Hnton's research team used a deep learnng model based on convolutonal neural networks to wn the Imageet Image Classfcaton Contest, reducng the error to 5.35%, compared wth the lowest error rate of 26.72% for the tradtonal computer vson method. Many researchers have begun to study deep learnng, especally for convolutonal neural network model for a large number of studes. At the same tme, deep learnng has also brought tremendous busness value. For example, some banks and enterprses started to provde onlne account openng servce and brush face payment servce, so that people can handle all knds of busness wthout leavng home, avodng the long wat. Mcrosoft developed a deep neural network to replace the tradtonal GMM (Gaussan Mxture Model) algorthm, to speech recognton has brought a new techncal framework, sgnfcantly reducng the error rate, and n 202 offcally launched the frst deep neural network-based voce search system, becomng one of the frst companes n the world to offer voce servces. Wth the advent of the era of g data, major Internet companes n the world, such as Badu and Google, are actvely studyng the applcaton of Copyrght (208) Francs Academc Press, UK 235

deep learnng. For example, the use of deep neural networks to reduce the computatonal complexty of feature numbers now enhances the performance of search advertsng systems. In addton, deep learnng also has many applcatons n LP (atural Language Processng), IR (Informaton Retreval) and vsual scene annotaton. 3. Research of Image Recognton Algorthm Based on Depth Learnng 3. Reconstructon of MIST Dgtal Images Based on Depth eural etworks MIST, a sub-database of the atonal Insttute of Standards and Technology's large data set, s a handwrtten dgtal lbrary contanng 0,000 test samples and 60,000 tranng samples consstng of 0 to 9 dgtal samples wth a resoluton of 28 * 28 These samples are not standard mage formats but are stored n the dx fle. The MIST dataset bascally does not requre data preprocessng, t can be used mmedately. As a result, many researchers have used MIST as the preferred database for studyng technques and pattern recognton methods. Shown n Fgure, the MIST data sets have been converted to the mage format part of the sample dagram. Fgure. A partal sample dagram of the MIST dataset that has been converted to mage format Reconstructon of samples n MIST dataset wth four - layer depth belef network usng RBM. Frstly, the 300-dmensonal effectve features of the mage are extracted, the parameters are adjusted, and the error of the ntal mage between the reconstructed mage and the nput network s reduced. Ths method can sgnfcantly reduce the dmensons of the mage, compress the data effcently, and reduce the space requred to store the mage. 3.2 MIST Dgtal Image Recognton Based on Convolutonal eural etwork 3.2. Transform layer The mage has a fxed feature, and part of the mage statstcs s the same as the other parts. In other words, the features learned n ths part can also be the same as the other part. Therefore, we can use the same learnng feature to handle the same mage anywhere. Convolutonal neural network s the use of ths feature of the mage to acheve weght sharng, thereby reducng the network parameters. We can regard the mage as a plane, and based on preservng the two-dmensonal characterstc of the mage, we can process the nput mage by two knds of transformatons: lnear transformaton and non-lnear transformaton. The above-mentoned non-lnear operaton refers to the so-called ncentve functon, shown n Fgure 2, where manly three knds of nonlnear. Fgure 2. onlnear functon 236

The left one s the sgmod functon. Accordng to the mage, we can see that the range of the nput value s (0, ), whch s a common nonlnear transformaton functon of neural network. It can well explan nearly fully saturated neurons. At present, less used n the applcaton, manly because the neuronal actvaton value n the vcnty of 0 or, the gradent of the regon s almost 0, when the back propagaton, the frst few layers of weght change s mnmal, f the ntal weght s too large, eurons wll soon be saturated, the network wll stop learnng. In addton, neurons handle data centers that are not zero, adversely affectng the dynamcs of the gradent declne. The mddle s a hyperbolc tangent functon, whch lmts the nput value wthn the range of (-, ), but there s also a saturaton problem. The frst one s a non-lnear correcton functon. Compared wth the sgmod functon and the hyperbolc tangent functon, the calculaton of the number of nonlnear correcton lnes s easer and accelerates the convergence of stochastc gradent descent. Therefore, t has become very popular n recent years, but t also has the dsadvantage that ReLU neurons are not actvated at any one data pont when large gradent values pass through ReLU neurons, so ReLU cells are more vulnerable. 3.2.2 Poolng layer Convoluton feature extracton dmenson s too hgh, and there s redundancy, not sutable for classfcaton, we must frst reduce the dmenson. You can aggregate features n dfferent locatons of the mage. For example, the average or maxmum value of a feature n a regon of an mage s calculated, so as to reduce the feature dmenson and make the network dffcult to overft. Ths type of aggregaton s called poolng. As shown n Fgure 3, poolng uses the local correlaton prncple of the mage to sample the feature map and reduces the amount of data to be processed whle retanng useful structural nformaton. Maxmum poolng Average poolng 4. Analyss of Algorthms Fgure 3. 2*2 two knds of pool type dagram In ths paper, C, a convolutonal neural network wth depth structure, s used to automatcally learn mage features and perform mage recognton. By comnng the feature extracton and classfer tranng n depth learnng, the readness of mage recognton can be sgnfcantly mproved. It should be ponted out that the tradtonal mage recognton algorthms need to preprocess mages to extract better mage features, and then select the approprate classfer for more features. The tradtonal mage recognton algorthm wth a lot of uncertanty, requres a wealth of experence, and vulnerable to human factors, resultng n accurate recognton s not hgh, lack of stalty. In addton, the tradtonal mage recognton algorthm also requres complex parameter adjustment, thus sgnfcantly ncreasng the tranng tme. The tradtonal mage recognton algorthms dfferent neural network C s the convoluton of the two-dmensonal mage can be drectly nput nto the network, neural network recognzes vsual patterns n the orgnal pcture, the requred preprocessng lttle affected by human factors Small, breakng the lmtatons of human desgn features. C, as an end-to-end learnng network, has an error rate of 0.84% on the recognton result and ts recognton performance s comparable to or even better than 237

the tradtonal method, whch proves the effectveness of C method. The value of the gradent n the mage recognton algorthm based on convolutonal neural network comes from the number of nput samples. Here we wll dscuss how to calculate the gradent. Wj b) Wj = b) The most prmtve method s based on the sample gradent, as shown n the above formula, f the sample s small, the program can run normally, but f the sample number s large, the runnng tme wll be very long, the computng speed s very busy, you need to consume A large number of computng resources, easly lead to nsuffcent hardware space resources. The random selecton of a sample update parameter s called SGD (Stochastc Gradent Descend), SGD wll lead to more serous cost penalty functon oscllaton, wll result n greater data error. Wj x, ) y Wj = x, ) y In ths paper, we use the Mn-batch SGD to update the parameters. For the gradent of the random sample calculator, the parameters are updated as follows: W j = = = W W j j x, y ) x, y ) The algorthm steps are as follows: for = : MaxIters (maxmum number of teratons) (a). SGD randomly samples; (b). Calculate the resduals; (c). Calculate the gradent usng the BP algorthm; (d). etwork Parameter Update: Update the network parameters wth the gradent calculated by the BP algorthm. Softmax regresson s an extenson of logstc regresson, whch s manly used to deal wth the second-class classfcaton problems, whle the softmax regresson can be used for mult-class classfcaton tasks mutually exclusve. The tranng set label can take k values and output a k-dmensonal vector for representng the probalty values that the sample belongs to k categores. 5. Deep Learnng n the feld of mage recognton Wth the ncreasng demand for mage recognton and the development of deep learnng, the applcaton of deep learnng wll become more and more wdespread and more ntellgent n the future feld of mage recognton. ow we look at the drecton of the development of deep learnng n the feld of mage recognton. 5. The model herarchy s more and more, the model structure s more complcated Depth learnng needs to model the characterstcs of the mage layer by layer. If the network model does not have enough depth, t wll lead to the exponental ncrease of the requred computng unt, and the mage recognton complexty s ncreased. By learnng the mult-layer mage features, the characterstcs of the network model are gettng more and more global, and the restored mages are more realstc. For example, Alexet acqured the Imageet Image Recognton Contest champon n 202 usng fve convolutonal layers, three pool layers, and two fully 238

connected layers. The network model used by GoogLeet to wn the ILSVRC Champonshp n 204 used 59 Convolutonal layer, 6 pool layers and 2 fully connected layers. The network model used by Mcrosoft's Reset n 205 uses 52 convolutonal layers. Ths shows that the depth and complexty of the model are ncreasng. 5.2 The scale of tranng data ncreased rapdly The complexty of depth learnng model s gettng hgher and hgher, more and more features are needed, whch requres larger tranng data to mprove the classfcaton or predcton accuracy n mage recognton. At present, the tranng data of deep learnng algorthms are usually n the scale of hundreds of thousands and mllons. For large enterprses such as Google and Badu, the tranng data scale has reached ten mllon or even hundreds of mllons of levels. So far, Imageet has collected more than 20,000 categores and a total of about 4.2 mllon mages, but stll cannot meet the rapdly ncreasng demand for tranng. 5.3 The accuracy of model recognton s gettng hgher and hgher Wth the contnuous development of the deep learnng model and contnuous mprovement, the accuracy and effcency of mage recognton have obvously mproved. Early R-C models took 3 seconds to process an mage on the GPU and 53 seconds to the CPU, yeldng an accuracy of 53.7% n recognton. The YOLO model of 206 acheves the recognton speed of 45FPS wthout losng the accuracy, and the model recognton effcency and detecton precson are greatly mproved. 6. Conclusons To sum up, wth the contnuous development of deep learnng, the development of mage recognton technology ushered n a leap-forward development. Deep learnng has effectvely promoted the development of mage recognton technology and promoted the research and applcaton of mage recognton technology by leaps and bounds. At present, all major scence and technology enterprses n the world are nvestng heavly n research on mage recognton as the man ssue. In the comng years, we can foresee that deep learnng wll enter a perod of rapd development n theory, algorthms and applcatons, and wll also have a profound mpact on the feld of mage recognton and other related felds. Ths paper frst descrbes the basc dea of deep learnng, and analyzes the method of reconstructng MIST dataset based on depth belef network n order to get the most effectve characterzaton of the mage set. By constructng a 5-layer convoluton neural network to recognze MIST mages, the performance of each mage recognton algorthm model based on depth learnng s contrasted. The deeper the network level, the more accurate the recognton of the essental features of mages. References [] Hnton G E, Osndero S, Teh Y W. A fast learnng algorthm for deep belef nets [J]. eural Computaton, 2006. [2] LeCun, Y., Bottou, L., Bengo, Y. & Haffner, P. Gradent-based learnng appled to document recognton. Proc. IEEE 86, 2278-2324 (998) [3] Yann, LeCun, Yoshua, Bengo, &, Geoffrey, Hnton. Revew of Deep Learnng [J]. ature, 205, (52): 436-444 [4] Krzhevsky, A., Sutskever, I. & Hnton, G. Imageet classfcaton wth deep convolutonal neural networks. In Proc. Advances n eural Informaton Processng Systems 25 090-098 (202) 239