2009 International Conference on Emerging Technologies
A Self Organizing Map Based Urdu Nasakh Character Recognition

Syed Afaq Hussain *, Safdar Zaman ** and Muhammad Ayub **
afaq.husain@mail.au.edu.pk, safdaraslam@yahoo.com, ayub_iiui@hotmail.com
* Department of Computer Science, Air University, Islamabad
** Department of Computer Science, International Islamic University, Islamabad

Abstract

Research on character recognition for Urdu script faces challenges that stem mainly from the characteristics of the script: its cursive nature, multiple fonts, context-dependent character shapes, and the position of characters with respect to the baseline. This paper addresses the problem of recognizing the Nasakh script of the Urdu language. The proposed system takes segmented characters as input and recognizes them in two steps. In the first step, the different shapes of each character are classified into 33 categories using a Kohonen Self-Organizing Map (SOM), which automatically clusters similar ligatures for initial classification. During the feature extraction phase, more than twenty-five different features are extracted from each character; these are further processed for final character recognition.

Keywords: Offline Character Recognition, Urdu Nasakh, Clustering, Self Organizing Map (SOM), Neural Network.

I. INTRODUCTION

Optical character recognition, often abbreviated OCR, is the branch of computer science that involves reading text and translating it into a digital form that a computer can process (for example, into UNICODE). OCR can be either offline or online, depending on the input method. Conversion of a scanned image of text into a text document is known as offline optical character recognition. Online character recognition involves automatic conversion of text at the moment it is written on a special digitizer, where a sensor picks up the pen-tip coordinates [X(t), Y(t)] as well as pen-up/pen-down switching and generates signals for computer manipulation [6].
There are various classification methodologies; one of the promising methods is neural network based pattern recognition. This methodology may be supervised or unsupervised. Supervised training is accomplished by presenting a sequence of training vectors, each associated with a target output vector. In unsupervised training, a sequence of input vectors is provided, but no target vectors are specified. Self-organizing neural nets group similar input vectors together without the use of target vectors [7], [9]. Much work has been done on the recognition of handwritten and machine-written English and Arabic characters, as in [7], [9], [10]; it covers not only separate script but also cursive (joined) script. The majority of people in Pakistan, and sizeable minorities in India, Bangladesh and the UAE, frequently use the Urdu language. In addition, Urdu uses characters and writing rules that it shares with many other languages, such as Arabic, Persian, Pushto and Sindhi.

A. OCR for Urdu Fonts

Urdu script uses different fonts for the characters of its character set. Each character has one to four shapes depending on its position in the word, as given in Table I. A font is a style of writing. More than 10 fonts are used in Urdu. Some of the most common are Nastaliq, Nasakh, Noori Nasakh, Noori Nastaliq and Koofi. Nastaleeq and Nasakh are the two most popular fonts and are discussed below.

Nastaleeq is a special calligraphic way of writing Urdu. It does not have a baseline; rather, the text is centre justified, and this style is very difficult to recognize because of its complexity.

Nasakh is another writing style of Urdu, which follows one baseline. It is simpler than Nastaleeq and is therefore easier to recognize [2].

B. Properties of Urdu Script

Direction of Writing: Urdu text is written horizontally from right to left.
Cursive Nature: Urdu text is cursive in nature, i.e. all the characters are connected to each other within a word.
Baseline: Urdu text has a baseline, a horizontal line which runs through the text, cutting all the words at the same point.
Overlapping: Urdu characters sometimes overlap vertically without touching each other.
Diacritics: Diacritics such as Hamza, Jazm, Khari Zabar and Khari Zer are special symbols which are very important for the proper pronunciation of a word.
Ligatures: A ligature is a set of connected characters. Ligatures collectively make up words. Within a word, ligatures are separated from one another by a half space, whereas a full space separates one word from another.
No Case for characters: Urdu characters have no upper or lower case.
Shapes of Context based characters: The shape of a character depends on the writing style as well as on its position in the word; e.g. in the Nasakh writing style a character can have 4 different shapes depending on its position in the word.
Strokes: Every Urdu character has one main stroke and zero or more secondary strokes. Some characters have dot(s) with this main stroke.

Table I gives a set of Urdu Nasakh characters with 4 different position-based shapes.
TABLE I. PARTIAL URDU/ARABIC CHARACTER SET

II. PREVIOUS WORK

Much research has been done on OCR systems for different languages. Syed [1] introduced a recognizer for Urdu Noori Nastaliq script. This work is ligature based instead of character based and uses a multi-tier approach: ligatures are identified in the first step and are then classified into special and basic ligatures in the second step. Shah [2] also presented an OCR for the Urdu Nastaliq font. It is a ligature-based recognizer as well, but uses a template matching technique for classification. Like Urdu, Arabic is cursive, and Maged [3] presented an automatic recognition system for handwritten Arabic characters that uses geometrical features to describe the complete skeleton of a character. Amin [4] discussed both offline and online techniques for handwritten Arabic recognition, including the IRAC (Interactive Recognition of Arabic Characters) system for online recognition. This work considers both horizontal and vertical scanning of the characters for features. A. Amin [5] concentrates on the thinning process and feature extraction; the authors also discuss the problems encountered in producing a satisfactory thinned form of text and describe solutions to those difficulties. These earlier works are rather primitive and target ligature-based and word-based recognition of the Nastaliq font for Urdu/Arabic characters. Our effort is a considerable improvement for recognition of the Urdu Nasakh font as a character-based recognition system.

III. PROPOSED SYSTEM

Our methodology, as described in Fig. 1, assumes only segmented characters and hence can be used with any system (ligature-based or word-based recognizer). Initially the system takes the segmented character as an image and passes it through steps that convert it into gray level and then into a binary image.
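The two conversions named in the previous sentence can be sketched in a few lines. This is a minimal illustration in pure Python, not the authors' implementation: the function names `to_gray` and `to_binary` are hypothetical, the luminance weights are the ones documented for MATLAB's `rgb2gray`, and the default threshold of 155 is the value the Input Phase uses.

```python
# Luminance weights documented for MATLAB's rgb2gray (an assumption about
# how the gray level was obtained; the paper only names the function).
R_WEIGHT, G_WEIGHT, B_WEIGHT = 0.2989, 0.5870, 0.1140


def to_gray(rgb_image):
    """Convert rows of (r, g, b) tuples in 0..255 to rounded gray levels."""
    return [
        [round(R_WEIGHT * r + G_WEIGHT * g + B_WEIGHT * b) for (r, g, b) in row]
        for row in rgb_image
    ]


def to_binary(gray_image, threshold=155):
    """Map gray levels >= threshold to 1 and all other levels to 0."""
    return [[1 if level >= threshold else 0 for level in row] for row in gray_image]


if __name__ == "__main__":
    image = [[(255, 255, 255), (0, 0, 0)], [(200, 200, 200), (100, 100, 100)]]
    print(to_binary(to_gray(image)))  # -> [[1, 0], [1, 0]]
```

A nested-list image keeps the sketch dependency free; an array library would be the natural choice for real scans.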
The system applies preprocessing techniques, as in [10], to the binary image: it thins the image and makes it font independent, so that it is better suited to feature extraction. A second preprocessing technique takes the thinned image and removes the spurious edges that develop during thinning and may cause features to be invalid. After these preprocessing steps, the system passes the preprocessed image to the feature extraction phase [8], in which a number of procedures find the features of the image. Once enough features have been obtained, a previously fixed weight matrix is applied to give results based on clustering. The classification phase then activates its sub-phase Classification-I, which uses a weight matrix of order [...] and gives exactly one of 33 clusters for each of the 104 segmented characters. As each cluster has more than five characters, the sub-phase Classification-II uses additional features to distinguish each character from the other characters in the same cluster.

A. System Phases

Our proposed system contains four phases, as discussed below.

1) Input Phase

The image obtained is converted to binary format, which is appropriate for feature extraction. For this, the following steps are carried out.
Figure 1. Block Diagram of Proposed System

Acquisition of image: The system takes a previously saved image in JPEG/JPG format. This image may be an RGB or a gray-level image.

Conversion from RGB to gray level: This step converts the RGB (colored) image into a gray-level image whose levels lie between 0 and 255. The function used for this purpose is grayimg=rgb2gray(rbgimg); it takes the RGB form of the input image and returns a gray-level image.

Conversion of gray level to binary: This step converts the gray-level image into binary format, which contains only two levels, 0 and 1, used for background and foreground respectively. As gray levels range from 0 to 255, a threshold of 155 is used to convert them into binary values as follows:

If graylevel >= 155 then Make it 1; Else Make it 0;

2) Preprocessing Phase

This phase deals with the preprocessing techniques discussed below.

Thinning: This step takes the binary image and makes the strokes one pixel wide. The thinned image reduces the complexity of feature extraction. For thinning, the well-known Zhang-Suen algorithm [11] has been used. (Figure: the character before and after thinning.)

Chain Code Formation: The chain code formed in the proposed system is 8-directional, i.e. the system assigns codes to the eight possible directions, as used in [12]. (Figure: 8-direction code for pixel c.)

Chain code formed for isolated Ra:
6,6,6,6,6,6,7,6,7,6,6,7,6,6,6,5,6,6,6,5,6,5,6,5,5,6,5,4,5,5,5,5,4,5,4,5,4,4,4,4,4,4,4,3.

Smoothing: On the basis of the chain code obtained in the previous preprocessing step, the proposed system applies one-pixel smoothing: if a value occurs only once in a row in the code sequence (an isolated code), it is replaced by the value just preceding it. The following are the chain codes for isolated Ra before and after the smoothing process.

Chain Code before Smoothing:
6,6,6,6,6,6,7,6,7,6,6,7,6,6,6,5,6,6,6,5,6,5,6,5,5,6,5,4,5,5,5,5,4,5,4,5,4,4,4,4,4,4..
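The one-pixel smoothing rule can be sketched as a single left-to-right pass. The sketch below is one interpretation, not the authors' code: the name `smooth_chain_code` is hypothetical, and it assumes that a code counts as isolated when it differs from the already smoothed code before it and from the code after it, with the first and last codes left untouched. Applied to the 42 codes listed before smoothing, this interpretation yields 23 sixes, 13 fives and 6 fours.

```python
def smooth_chain_code(codes):
    """Replace each isolated code with the (already smoothed) code before it.

    A code is treated as isolated when it differs from both neighbours; the
    first and last codes are never changed. This reading of the one-pixel
    smoothing rule is an assumption.
    """
    out = list(codes)
    for i in range(1, len(out) - 1):
        if out[i] != out[i - 1] and out[i] != out[i + 1]:
            out[i] = out[i - 1]
    return out


if __name__ == "__main__":
    print(smooth_chain_code([6, 7, 6, 6, 5, 6, 5, 5]))  # -> [6, 6, 6, 6, 6, 6, 5, 5]
```

Updating in place matters here: in a run such as 6,7,6,7,6 the middle 6 is no longer isolated once the 7 before it has been smoothed.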
Chain Code after Smoothing:
6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,5,5,5,5,5,5,5,5,5,5,5,5,5,4,4,4,4,4,4.

3) Feature Extraction Phase

Features Extracted: Features are the characteristics of an object and are very important for recognition [8]. In this phase different methods are used to extract the features with whose help the system recognizes the characters.

Height: set when the character under observation has a height greater than its width.
Width: set when the character under observation has a width greater than its height.
Loop: a loop is a closed path. Set for all characters that contain a closed path.
Loop_M: set only for characters that contain a loop in the middle of the character.
Loop_S: set only for characters that contain a loop at the beginning of the character.
Joint: a point from which two or three branches exit.
Joint_1: selected when the character has only one joint.
Joint_2: selected when the character has two joints.
Joint_3: selected when the character has three joints.
Cross: set for a character that contains a crossover, a point from which four different branches exit.
Curve_R: selected when the character has a curve toward the right side.
Curve_U: selected when the character has a curve toward the upper side.
Start_H: set for characters that start with a horizontal line.
Start_V: set for characters that start with a vertical line.
End_H: set for characters that end with a horizontal line.
End_V: set for characters that end with a vertical line.
Endp_1: selected when the character has only one end point.
Endp_2: selected when the character has two end points.
Endp_3: selected when the character has 3 end points.
Endp_4: selected when the character has 4 end points.

Figure 2. Graphical User Interface

Fig. 2 shows the graphical user interface of our system, displaying the result YES for the feature Loop in the segmented character WAO of End.

4) Recognition Phase

a) Classification-I

This sub-phase uses the core algorithm, Kohonen Self Organizing Maps (K-SOM). The K-SOM algorithm [13] is used for initial classification.
The algorithm is described as:
1. Initialize the parameters and the learning rate.
2. While the stopping condition is FALSE, do Steps 3-9.
3. For each input vector x, do Steps 4-6.
4. For each j = 1 to m, compute $D(j) = \sum_{i=1}^{n} (x_i - w_{ij})^2$.
5. Find the index j for which D(j) is minimum.
6. Update the weights for all i with this j: $W_j(\text{new}) = W_j(\text{old}) + \alpha\,[x - W_j(\text{old})]$, where $\alpha$ is the learning rate.
7. Update the learning rate.
8. Reduce the radius of the topological neighborhood.
9. Test the stopping condition.

The above K-SOM training algorithm uses a weight matrix of order [...] and gives exactly one of 33 clusters for each segmented character. Generally, it takes all the features obtained in the feature extraction phase. On the basis of those features it finds the cluster in which the segmented character being observed fits best. The clustering is applied on the 54 classes given in Table II. To reach the final set of clusters, we went through a step-by-step process. Initially, we trained the classifier for 30 clusters with a 6 × 5 grid topology; the result obtained after 75 epochs is shown in Table II. This result is not very satisfactory, as it produces only 23 clusters. Cluster #17 has four characters which are totally distinct as far as features are concerned.

TABLE II. CLUSTERS OF SEGMENTED CHARACTERS

We analyzed the results obtained for 100, 150 and 200 epochs using the same 6 × 5 grid topology and found them to be slightly different. We then changed the grid size from 6 × 5 to 8 × 7, repeated the training for the same numbers of epochs (75, 100, 150, 200) and observed all the results. Likewise, the same numbers of epochs were applied to the 9 × 8 grid. Finally, a grid size of [...] was used. After applying 75, 100, 150 and 200 epochs on this grid, we observed that the result for 100 epochs is quite satisfactory, as shown in Table III. This result shows that some of the clusters, e.g. clusters no. 1, 21, 27, 34 and 35, contain only one character per cluster.

TABLE III. CLUSTERED CHARACTERS

5) Classification-II

As Classification-I only finds the cluster for a character, and each cluster has more than one character, some additional features are required in addition to the basic features to distinguish a character from the other characters of the same cluster.
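The Classification-I stage, i.e. K-SOM training followed by assigning a feature vector to its best-matching unit, can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the authors' MATLAB implementation: the names `train_som` and `best_matching_unit` are hypothetical, the learning rate and neighborhood radius decay linearly (the paper does not give its schedules), the neighborhood is a Euclidean disc on the grid, and the stopping condition is a fixed number of epochs.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the index j that minimises D(j) = sum_i (x_i - w_ij)^2."""
    def sq_dist(w):
        return sum((xi - wi) ** 2 for xi, wi in zip(x, w))
    return min(range(len(weights)), key=lambda j: sq_dist(weights[j]))


def train_som(data, rows, cols, epochs=100, alpha0=0.5, init_weights=None, seed=0):
    """Train a rows x cols Kohonen map and return one weight vector per unit."""
    rng = random.Random(seed)
    dim = len(data[0])
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    if init_weights is None:
        weights = [[rng.random() for _ in range(dim)] for _ in grid]
    else:
        weights = [list(w) for w in init_weights]
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs  # assumed linear decay schedule
        alpha, radius = alpha0 * decay, radius0 * decay
        for x in data:
            win_r, win_c = grid[best_matching_unit(weights, x)]
            for k, (r, c) in enumerate(grid):
                # Update the winner and every unit inside the current radius:
                # W(new) = W(old) + alpha * (x - W(old))
                if math.hypot(r - win_r, c - win_c) <= radius:
                    weights[k] = [w + alpha * (xi - w) for w, xi in zip(weights[k], x)]
    return weights


if __name__ == "__main__":
    # Hypothetical binary feature vectors (e.g. Height, Loop, Joint_1, Endp_2).
    features = [[1, 0, 0, 1], [1, 0, 1, 1], [0, 1, 1, 0], [0, 1, 0, 0]]
    som = train_som(features, rows=6, cols=5, epochs=75)
    print([best_matching_unit(som, f) for f in features])
```

Repeating the call with other `rows`, `cols` and `epochs` values mirrors the grid-size and epoch comparison described in the text; the number of distinct winning indices is the number of clusters actually formed.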
a) Additional Features

Dots: set when the character being processed contains dots.
Dots_1: set when the number of dots in the character is one.
Dots_2: set when the number of dots in the character is two.
Dots_3: set when the number of dots in the character is three.
Dots_A: set when the dot(s) in the character are above the main region of the character.
Dots_B: set when the dot(s) in the character are below the main region of the character.
Dots_M: set when the dots in the character are in the middle of the character.

On the basis of the above features, the 104 segmented characters give 54 different classes, as in Table IV.
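As an illustration of how the dot flags above could be derived, the sketch below takes the row of each dot's centre and the top and bottom rows of the main stroke. Everything about it is an assumption made for illustration: the paper does not say how dots are located or how "above", "below" and "middle" are decided, and the name `dot_features` and its parameters are hypothetical.

```python
def dot_features(dot_rows, main_top, main_bottom):
    """Derive the Dots_* flags from dot centre rows and the main stroke extent.

    Rows follow image convention: a smaller row index is higher on the page.
    The geometric tests for above/below/middle are assumptions.
    """
    count = len(dot_rows)
    return {
        "Dots": count > 0,
        "Dots_1": count == 1,
        "Dots_2": count == 2,
        "Dots_3": count == 3,
        "Dots_A": count > 0 and all(row < main_top for row in dot_rows),
        "Dots_B": count > 0 and all(row > main_bottom for row in dot_rows),
        "Dots_M": count > 0 and all(main_top <= row <= main_bottom for row in dot_rows),
    }


if __name__ == "__main__":
    # Two dots above a main stroke that spans rows 10..30.
    flags = dot_features([2, 3], main_top=10, main_bottom=30)
    print([name for name, is_set in flags.items() if is_set])  # -> ['Dots', 'Dots_2', 'Dots_A']
```

Within a cluster returned by Classification-I, comparing such flag sets is one way to separate characters that share a main stroke but differ in dot count or dot position.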
TABLE IV. GENERAL CLASSES

IV. SYSTEM OUTPUT

The output is a fully recognized segmented character, which can be used either for building the word or for searching a dictionary or database.

V. RESULTS

Different Urdu sentences were used to obtain segmented characters. When these segmented characters were applied, the system achieved a recognition rate of more than 80%. Fig. 3 shows recognition of (CHE) in the software. Fig. 4 shows the cluster result for the segmented character (PA of Middle). The recognition rate for each segmented character is given in Table V.

Figure 3. Recognition of (CHE)

Figure 4. Character (PA of Middle)

TABLE V. RESULT RATE

VI. CONCLUSION

The proposed system resolves problems of recognition for segmented machine-written characters of the Urdu Nasakh font. The system assumes 104 different segmented characters and is fully capable of recognizing the segmented characters of any word. The list of all segmented characters is given in Table I. Classification-I and Classification-II use the Kohonen Self Organizing Map (SOM) [9] for recognizing machine-written segmented characters.

VII. FUTURE GOALS

Our system recognizes segmented machine-written characters of the Urdu Nasakh font; segmentation can therefore be added as one of its phases. The system uses only dots as diacritics, so more diacritics can be added in order to achieve more satisfactory results. The work can also be extended easily to handwritten segmented characters.

VIII. REFERENCES

[1] Syed Afaq Hussain and Syed Hassan Amin, A Multi-tier Holistic Approach for Urdu Nastaliq Recognition, Proceedings of IEEE INMIC, Karachi.
[2] Zahra A. Shah and Farah Saleem, Ligature Based Optical Character Recognition of Urdu Nastaleeq Font, Proceedings of IEEE INMIC, Karachi.
[3] Maged Mohamed Fahmy and Maged Mohamed, Automatic Recognition of Handwritten Arabic Characters Using Geometric Features.
[4] Adnan Amin, Arabic Character Recognition, Handbook of Character Recognition and Document Image Analysis, pp
[5] Mandana Kairanifar and Adnan Amin, Preprocessing and Structural Features Extraction for Multi-Fonts Arabic/Persian, Proc. 5th International Conference on Document Analysis and Recognition ICDAR 99, p. 213.
[6]
[7] A. Amin and S. Singh, Optical Character Recognition: Neural Network Analysis of Hand-printed Characters, Proc. 7th International Workshop on Statistical and Structural Pattern Recognition, Sydney, Australia, Lecture Notes in Computer Science 1451, Springer, pp (August 1-13, 1998).
[8] John Cowell and Fiaz Hussain (UAE), Extracting Features from Arabic Characters.
[9] Laurene Fausett, Fundamentals of Neural Networks, Florida Institute of Technology.
[10] J. R. Parker, Practical Computer Vision Using C, John Wiley & Sons.
[11]
[12] H. L. Beus and S. S. H. Tiu, An Improved Corner Detection Algorithm Based on Chain Coded Plane Curves, Pattern Recognition, 20: ,
[13] S. N. Sivanandam, S. Sumathi and S. N. Deepa, Introduction to Neural Networks Using MATLAB 6.0, Tata McGraw-Hill,
Journal of Computer Science 3 (7): 549-555, 2007 ISSN 1549-3636 2007 Science Publications Optical Character Recognition System for Arabic Text Using Cursive Multi-Directional Approach 1 Mansoor Al-A'ali
More informationINTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY
INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK HANDWRITTEN DEVANAGARI CHARACTERS RECOGNITION THROUGH SEGMENTATION AND ARTIFICIAL
More informationAutomatic Recognition of Offline Handwritten Urdu Digits In Unconstrained Environment Using Daubechies Wavelet Transforms
IOSR Journal of Engineering (IOSRJEN) e-issn: 2250-3021, p-issn: 2278-8719 Vol. 3, Issue 9 (September. 2013), V2 PP 50-56 Automatic Recognition of Offline Handwritten Urdu Digits In Unconstrained Environment
More informationIn this assignment, we investigated the use of neural networks for supervised classification
Paul Couchman Fabien Imbault Ronan Tigreat Gorka Urchegui Tellechea Classification assignment (group 6) Image processing MSc Embedded Systems March 2003 Classification includes a broad range of decision-theoric
More informationA New Approach to Detect and Extract Characters from Off-Line Printed Images and Text
Available online at www.sciencedirect.com Procedia Computer Science 17 (2013 ) 434 440 Information Technology and Quantitative Management (ITQM2013) A New Approach to Detect and Extract Characters from
More informationHandwritten Script Recognition at Block Level
Chapter 4 Handwritten Script Recognition at Block Level -------------------------------------------------------------------------------------------------------------------------- Optical character recognition
More informationRecognition of online captured, handwritten Tamil words on Android
Recognition of online captured, handwritten Tamil words on Android A G Ramakrishnan and Bhargava Urala K Medical Intelligence and Language Engineering (MILE) Laboratory, Dept. of Electrical Engineering,
More informationwith Profile's Amplitude Filter