Preprocessing in Handwritten Character Recognition

Handwritten Character Recognition (HCR) based on neural networks, deep learning, and other machine learning methods is an active research topic today. As a technology, however, HCR systems have existed since the 1980s. Early products incorporating HCR were launched by companies such as Pencept (product: Penpad) and GRiD Systems (GRiDPad). IBM, Microsoft, and Apple also introduced pen computing products incorporating HCR in the 1990s and 2000s, but these saw limited success, primarily because the HCR technology of the time was inaccurate and inefficient.

Today, with far greater computing power at our disposal and with computing devices shrinking in form factor (as we move from desktop PCs to tablets and smartphones), HCR technology finds more use. This is one reason it is gaining traction in electronic gadgets such as smartphones, personal digital assistants (PDAs), and tablets. In all these gadgets, characters are captured as a sequence of strokes, as shown in Figure 1. Features are then extracted from these strokes, and the strokes are recognized with the help of these features. Generally, a post-processing module then forms characters from the recognized stroke(s).

Fig. 1 - Handwritten Character Recognition (HCR) [Ref. 3]

Fundamentally, HCR can be divided into two main streams: online HCR and offline HCR. In online HCR, the user writes on an electronic surface with a special pen (or stylus), and data is captured during the writing process as X and Y coordinates. The device senses the pen-tip movements and the pen-up and pen-down switching times, and builds a digital representation of the handwriting. Smartphones, tablets, and PDAs use online HCR. Offline HCR, on the other hand, works on handwritten text on paper that has been converted into a digitized image, which the machine then analyzes for recognition.
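As a rough illustration of the online capture described above, the following Python sketch groups a stream of pen samples into strokes. The PenSample record and the split_into_strokes function are hypothetical names chosen for this example, not part of any real device API, and the sample layout (X, Y, timestamp, pen-down flag) is an assumption about what a digitizer might report.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PenSample:
    """One reading from a (hypothetical) digitizer."""
    x: float
    y: float
    t_ms: int        # time of the reading in milliseconds
    pen_down: bool   # True while the pen tip touches the surface


def split_into_strokes(samples: List[PenSample]) -> List[List[Tuple[float, float]]]:
    """Group consecutive pen-down samples into strokes; a pen-up ends the stroke."""
    strokes, current = [], []
    for s in samples:
        if s.pen_down:
            current.append((s.x, s.y))
        elif current:
            strokes.append(current)
            current = []
    if current:  # the recording ended while the pen was still down
        strokes.append(current)
    return strokes


# The letter "T" written as two strokes: a horizontal bar, then a vertical stem.
samples = [
    PenSample(0, 0, 0, True), PenSample(5, 0, 10, True), PenSample(10, 0, 20, True),
    PenSample(10, 0, 30, False),   # pen lifted
    PenSample(5, 0, 120, True), PenSample(5, 5, 130, True), PenSample(5, 10, 140, True),
    PenSample(5, 10, 150, False),  # pen lifted
]
strokes = split_into_strokes(samples)
```

Here strokes holds two coordinate lists, one per stroke, which is the form a feature extraction stage would typically consume.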
Offline HCR finds application in postal address recognition systems, bank cheque processing, signature verification, and similar tasks. Offline HCR is considerably more challenging to implement than online HCR, largely because only the finished image is available, without the stroke order and timing information captured online.

Broadly, HCR involves three steps: preprocessing, feature extraction, and classification. This article discusses the steps involved in preprocessing, shown in Figure 2. Preprocessing removes unwanted noise and irrelevant data from the input image.

Fig. 2 - Steps Involved in Preprocessing

The input image may be colored, so it is first converted into a grayscale image. For further processing by an HCR algorithm implemented in MATLAB, Python, etc., the grayscale image is then converted into a binary image. To remove unwanted data, area opening is applied, which removes connected objects containing fewer than a specified number of pixels. Edge detection is then performed to find sharp transitions in the pixel values of the image. Morphological operations such as dilation and hole filling can be performed to highlight the edges and fill holes in the image.

The preprocessed image is then used for feature extraction, and finally, in the classification step, the extracted features are matched to different classes for identification or recognition.

By: Mr. Neeraj Kumar - Assistant Professor (ECE), Chitkara University, H.P.
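The preprocessing chain described in this article (grayscale conversion, binarization, area opening, edge detection, dilation, and hole filling) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the author's implementation: all function names are hypothetical, the fixed threshold of 128 assumes dark ink on light paper, and the edge step uses simple boundary extraction as a stand-in for a gradient-based detector such as Sobel or Canny. In practice an image-processing library would normally be used instead.

```python
from collections import deque


def to_gray(rgb):
    """Rows of (r, g, b) tuples -> grayscale using ITU-R BT.601 luma weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]


def binarize(gray, threshold=128):
    """Assumes dark ink on light paper: values below the threshold become 1."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]


def _neighbors(y, x, h, w, conn8=True):
    """Yield in-bounds 8-neighbors (or 4-neighbors when conn8 is False)."""
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if (dy, dx) == (0, 0) or (not conn8 and dy and dx):
                continue
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                yield ny, nx


def _flood(img, start, value, seen, conn8):
    """Return the connected region of pixels equal to `value` that contains `start`."""
    h, w = len(img), len(img[0])
    region, queue = [start], deque([start])
    seen.add(start)
    while queue:
        y, x = queue.popleft()
        for n in _neighbors(y, x, h, w, conn8):
            if n not in seen and img[n[0]][n[1]] == value:
                seen.add(n)
                region.append(n)
                queue.append(n)
    return region


def area_open(binary, min_area):
    """Remove 8-connected foreground components with fewer than min_area pixels."""
    out, seen = [row[:] for row in binary], set()
    for y, row in enumerate(binary):
        for x, v in enumerate(row):
            if v and (y, x) not in seen:
                region = _flood(binary, (y, x), 1, seen, conn8=True)
                if len(region) < min_area:
                    for ry, rx in region:
                        out[ry][rx] = 0
    return out


def edges(binary):
    """Keep foreground pixels with a 4-neighbor that is background or off-image."""
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if binary[y][x]:
                nbrs = list(_neighbors(y, x, h, w, conn8=False))
                if len(nbrs) < 4 or any(binary[ny][nx] == 0 for ny, nx in nbrs):
                    out[y][x] = 1
    return out


def dilate(binary):
    """3x3 dilation: a pixel turns on if it or any of its 8-neighbors is on."""
    h, w = len(binary), len(binary[0])
    return [[1 if binary[y][x] or any(binary[ny][nx] for ny, nx in _neighbors(y, x, h, w))
             else 0 for x in range(w)] for y in range(h)]


def fill_holes(binary):
    """Turn on every background pixel that is not 4-connected to the image border."""
    h, w = len(binary), len(binary[0])
    seen = set()
    for y in range(h):
        for x in range(w):
            on_border = y in (0, h - 1) or x in (0, w - 1)
            if on_border and binary[y][x] == 0 and (y, x) not in seen:
                _flood(binary, (y, x), 0, seen, conn8=False)
    return [[1 if v or (y, x) not in seen else 0 for x, v in enumerate(row)]
            for y, row in enumerate(binary)]


def count(img):
    """Number of foreground pixels."""
    return sum(map(sum, img))


# Demo: a 10x10 white page with a solid 6x6 dark square and one speck of noise.
WHITE, BLACK = (255, 255, 255), (0, 0, 0)
page = [[WHITE] * 10 for _ in range(10)]
for y in range(2, 8):
    for x in range(2, 8):
        page[y][x] = BLACK
page[0][9] = BLACK  # isolated noise pixel

binary = binarize(to_gray(page))
opened = area_open(binary, min_area=4)   # drops the single-pixel speck
edge_img = edges(opened)                 # outline of the square
dilated = dilate(edge_img)               # thickened outline with a small hole left
filled = fill_holes(dilated)             # hole inside the outline closed
```

Calling count on each intermediate image is an easy way to see what every stage removes or adds. The order of the calls follows the sequence given in the text; real pipelines often tune the threshold and the minimum area to the scanner and the writing instrument.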
Disclaimer: The content of this newsletter is contributed by Chitkara University faculty and taken from resources that are believed to be reliable. The editorial team verifies the content to the best of its ability but disclaims any responsibility for validating the sources or the accuracy of the content. The objective of the newsletter is limited to spreading awareness of technology among faculty and students, not to impose on or influence the decisions of individuals.