Looking for software to read handwritten text on paper forms, surveys, and other documents?
Below, you will find a comprehensive guide to Handprint Recognition (ICR) software that will
familiarize you with the terminology and the capabilities of these systems.
Want to skip all the reading and just ask an expert? Contact Us to schedule
a consultation with one of our experienced project managers (not a salesperson!)
Terminology
When you first begin researching the field of document scanning, you will find a variety of perplexing terms and acronyms being thrown about without any explanation of what they mean. Below are a few of the more common ones you'll encounter, so we recommend getting to know these terms before proceeding:
- OCR - 'Optical Character Recognition' - Converts images of machine-printed (typed) text to an editable and/or searchable file format
- OMR - 'Optical Mark Recognition' - Captures data from form elements such as checkboxes and multiple-choice bubbles
- BCR - 'Bar Code Recognition' - Reads bar code values for the purposes of document separation, identification, and/or organization
- ICR - 'Intelligent Character Recognition' - Converts images of handprinted text to an editable and/or searchable file format
Types of Handwriting
Handwriting differs from person to person, both in neatness and style. Cursive writing, in particular,
can vary significantly and can sometimes be difficult for even a human reader to decipher. Recognition software
relies on regular shapes and distinct letter outlines. Thus, ICR software has difficulty deciphering sloppy
handwriting, neat handwriting with an unusual style, or cursive handwriting with flowing or conjoined characters.
ICR software is most accurate with disconnected, uniform letters and numbers that are easy to differentiate and
identify.
There are three main categories of handwriting recognition applications, based on the most common types of
handwritten documents:
- Structured forms and surveys - Used to automate data entry tasks where large volumes of data must be entered into a database. This is the primary focus of this guide.
- Simple unconstrained handwriting - Used primarily for document indexing tasks where a piece of handwritten information, such as an invoice number or date, needs to be recognized. The goal is to recognize a small number of data elements for the purpose of naming and/or filing the scanned document; this approach is not as effective at converting all the data on the document into an editable format. This is the handwriting equivalent of Zone OCR and can be achieved with forms processing software as well as document scanning applications like PaperVision Capture.
- Free text and cursive handwriting - Although specialized applications for this exist, they are designed more for data mining and full text searching rather than accurate data capture. ScanStore does not offer any solutions for cursive and unconstrained handwriting recognition. This guide specifically deals with ICR for data capture solutions.
Designing Forms for Accurate Handprint Recognition
When designing your forms, there are a number of different elements to consider. The most important element, of course, is the data you are trying to read from the form. In many cases, there can be several ways to represent the same type of data on a form. Constrained text fields can be handwritten or selected by a check box. Some data might be referenced in an existing database, through which it could be validated or filled in automatically. Before designing your form, you should know the different ways to represent data on a form and consider the speed, accuracy, and user friendliness of each option.
Handprint Text Fields
For the best handwriting recognition accuracy, use structured forms that require neat, separated, capital characters. The way to ensure that the person filling out the form writes this way is to mark out a space for each character on the form. This is known as "segmentation". A few common segmentation methods are listed below:
- Plain
- Simple Line
- Combed Line
- Simple Box
- Combed Box
- Joined Frames
- Separated Frames
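To see why segmentation helps, consider how software might use it: when every character has its own space, a field can be cut into one image region per character before recognition. The following is a minimal sketch of that idea, assuming a field made of equal-width cells; the function name and the pixel coordinates are hypothetical and do not reflect any particular product.

```python
def character_cells(field_left, field_top, field_width, field_height, cell_count):
    """Split a segmented field into equal-width character cells.

    Returns one (left, top, right, bottom) box per character, in pixels.
    Assumes all cells in the field have the same width.
    """
    if cell_count <= 0:
        raise ValueError("cell_count must be positive")
    cell_width = field_width / cell_count
    cells = []
    for i in range(cell_count):
        left = round(field_left + i * cell_width)
        right = round(field_left + (i + 1) * cell_width)
        cells.append((left, field_top, right, field_top + field_height))
    return cells


# A 3-character field that starts at (100, 50) and is 300 x 40 pixels
cells = character_cells(100, 50, 300, 40, 3)
```

Each box can then be handed to the recognition engine on its own, so the engine never has to guess where one letter ends and the next begins.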
Checkboxes
Although handprint recognition can pick up complex or unique information, sometimes you just need a simple selection from an existing set of choices. In such cases, leaving that information to be filled in creates needless room for error, resulting in wasted time.
When you need either a single selection from a set of mutually exclusive options (Yes or No; Male or Female; etc.) or multiple selections from an existing set, use checkboxes. This lets you limit the possible answer choices while removing the risk of misrecognition, misspelling, and misunderstanding that slow down data entry.
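As a rough illustration of how a checkbox might be read, the sketch below treats a box as checked when enough of its pixels are dark, then enforces the "one answer only" rule for mutually exclusive options. The function names, the 0/1 pixel representation, and the 20% fill threshold are assumptions chosen for the example, not settings from any specific software.

```python
def is_checked(region, fill_threshold=0.2):
    """Return True if the share of dark pixels (1s) reaches the threshold.

    region is a list of rows, each a list of 0 (light) or 1 (dark) values.
    """
    total = sum(len(row) for row in region)
    dark = sum(sum(row) for row in region)
    return total > 0 and dark / total >= fill_threshold


def single_choice(options):
    """Pick the one checked option from a mutually exclusive set.

    options maps each label to its pixel region. Returns the checked label,
    or None when zero or several boxes are checked (send to manual review).
    """
    checked = [label for label, region in options.items() if is_checked(region)]
    return checked[0] if len(checked) == 1 else None


empty = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
marked = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]  # an "X" drawn in the box
answer = single_choice({"Yes": marked, "No": empty})
```

Because the answer can only be one of the printed labels, there is nothing to misspell and very little to misread.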
Barcodes
Barcodes can store complex information in a format that computers recognize much more easily and accurately than handwriting. However, since they are not easy to create, especially when filling out a form by hand, they are best used to identify and separate document batches or sections quickly and accurately, rather than to capture data entered by the person filling out the form.
Alignment Elements
The final elements on forms designed for automated processing are anchors and line separators. These are used to align the scanned image to a template used for recognition, mitigating the effects of slipping and skewing that often occur when a page is passed through a scanner. Anchors come in two flavors, corners and squares, while line separators can be either vertical or horizontal. Corners are usually placed around the edges of the form, whereas separators can be spread throughout. With enough of these placed on a template, the software can reliably find fixed information within a document regardless of skew.
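The alignment idea can be sketched with a little geometry. Assuming the software has found the same two anchors on both the template and the scanned page, it can estimate how far the page was rotated and shifted, then use that to locate any template field on the scan. This is only an illustration: the function names are hypothetical, it ignores scaling, and real products typically use more anchors and more robust fitting.

```python
import math


def estimate_alignment(template_a, template_b, scanned_a, scanned_b):
    """Estimate the rotation (radians) and shift that map template
    coordinates onto the scanned page, from two matching anchor pairs.
    Assumes the scan is not scaled relative to the template.
    """
    template_angle = math.atan2(template_b[1] - template_a[1],
                                template_b[0] - template_a[0])
    scanned_angle = math.atan2(scanned_b[1] - scanned_a[1],
                               scanned_b[0] - scanned_a[0])
    angle = scanned_angle - template_angle
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    # Shift chosen so the rotated first template anchor lands on its scanned position
    dx = scanned_a[0] - (cos_a * template_a[0] - sin_a * template_a[1])
    dy = scanned_a[1] - (sin_a * template_a[0] + cos_a * template_a[1])
    return angle, (dx, dy)


def template_to_scan(point, angle, shift):
    """Map a template point (for example, a field corner) onto the scan."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    x, y = point
    return (cos_a * x - sin_a * y + shift[0],
            sin_a * x + cos_a * y + shift[1])


# Page shifted 5 pixels right and 5 pixels down, with no rotation
angle, shift = estimate_alignment((0, 0), (100, 0), (5, 5), (105, 5))
field_on_scan = template_to_scan((20, 30), angle, shift)
```

Once the rotation and shift are known, every field defined on the template can be looked up at its corrected position on the scanned image.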
Other Form Design Tips
For the best results, all form elements should be separated from the other elements on the page, usually by a space of at least 10mm, allowing each element to be identified correctly.
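If you lay out forms in pixels rather than millimeters, it helps to convert the spacing guideline to the scan resolution you plan to use. The sketch below does that conversion and applies a deliberately conservative spacing check between two element boxes; the function names and the (left, top, right, bottom) box format are assumptions made for the example.

```python
MM_PER_INCH = 25.4


def mm_to_pixels(mm, dpi=300):
    """Convert a length in millimeters to whole pixels at the given resolution."""
    return round(mm / MM_PER_INCH * dpi)


def well_separated(box_a, box_b, min_mm=10, dpi=300):
    """Conservative check that two (left, top, right, bottom) pixel boxes
    are at least min_mm apart horizontally or vertically.
    Boxes separated only diagonally may be rejected even if far enough apart.
    """
    h_gap = max(box_b[0] - box_a[2], box_a[0] - box_b[2], 0)
    v_gap = max(box_b[1] - box_a[3], box_a[1] - box_b[3], 0)
    return max(h_gap, v_gap) >= mm_to_pixels(min_mm, dpi)


gap_in_pixels = mm_to_pixels(10)  # 10 mm at 300 dpi is about 118 pixels
```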
Using Color Dropout is also recommended to improve accuracy. In a color dropout form, the form layout is printed in a different color, most commonly red. The scanner is then calibrated to remove the red color from the scan. This allows only the handwriting to appear in the scan, removing the need for the software to distinguish between handwritten marks and segmentation lines.
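Dropout is normally handled by the scanner or its driver, but the effect is easy to picture in software. The following is a simplified sketch that treats strongly red pixels as form layout and keeps only the remaining dark pixels as ink. The function name and the threshold values are illustrative assumptions, not calibrated settings.

```python
def drop_red(pixels, red_min=150, margin=60):
    """Emulate red color dropout on an RGB image.

    pixels is a list of rows of (r, g, b) tuples with values 0-255.
    Returns rows of 1 (ink kept) or 0 (background or dropped form color).
    """
    result = []
    for row in pixels:
        out_row = []
        for r, g, b in row:
            # "Red" means a bright red channel that clearly dominates the others
            is_red = r >= red_min and r - max(g, b) >= margin
            is_dark = (r + g + b) / 3 < 128
            out_row.append(1 if is_dark and not is_red else 0)
        result.append(out_row)
    return result


row = [(255, 255, 255), (220, 30, 30), (20, 20, 20), (20, 20, 160)]
# white paper, red comb line, black ink, blue ink
cleaned = drop_red([row])
```

In this example the red comb line disappears along with the paper background, while the black and blue ink both survive.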
Working With Existing Documents
Designing a form is the easiest way to ensure accurate recognition, but what if the documents were printed and filled out long before forms processing software was considered? It may still be possible to capture data from these documents.
If the data is constrained to numbers, dates, or multiple choice values (written or check box), it can be read accurately even if it was not designed for handprint recognition. More complicated documents and data types can also be recognized, but it is difficult to tell if this will be possible at a glance. Contact us to have one of our experts evaluate your form or suggest alternatives.
Accuracy of Element Types
Certain data types are inherently easier to recognize, with the most accurate being barcodes and checkboxes / bubbles, followed by numbers, and then text. Neat and uniform handwriting improves accuracy, which is facilitated by letter delineation methods such as frames and combs.
Here are some examples of good and bad handwriting for ICR:
Scanner Settings
No matter how well designed and how carefully filled out your form is, recognition software will not be able to do its work if the scanner is not properly digitizing the page. We recommend scanning at a resolution of 300 dpi for best results. Black & White (Bitonal) is preferred over Greyscale or Color modes, and don't forget to set up your Color Dropout, if your scanner and forms support it. Finally, although most modern scanners are fairly well configured out of the box, you may want to adjust your Brightness and Contrast settings for your particular documents.
If you do not have a scanner that has the necessary speed, quality, or other features that you require to scan your forms, you can always find a large selection of scanners right here at ScanStore!
We even have a handy scanners guide to help you find the perfect scanner for you.
Handprint Recognition Accuracy
Accuracy of forms processing software varies based on the particular recognition engine that the software uses, the design of the form, the type of elements that are to be recognized, the quality of the scan, and the neatness of the writing. As mentioned previously, static text labels, anchors, and line separators help the software locate the data elements, thus improving overall accuracy.
Even if your documents are poorly recognized, forms processing software is typically still faster than manually keying in data, thanks to the streamlined validation and data entry interface these applications provide.
Verifying the Recognition Results
Regardless of how many precautions you take, however, some errors are almost inevitable. Not to fear! Forms processing software has an extra validation step that allows you to see and correct these errors before the data is exported. To make things even easier, many applications will highlight uncertain characters and give an overall certainty percentage for each page of a document. An example of this can be found below:
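For readers curious how such highlighting might work behind the scenes, here is a minimal sketch. It assumes the recognition engine returns a confidence score between 0 and 1 for each character, flags anything below a chosen threshold for review, and reports the average as the page certainty. The function name, the 0.80 threshold, and the use of a simple average are all assumptions; individual products may calculate certainty differently.

```python
def review_page(chars, threshold=0.80):
    """Summarize recognition confidence for one page.

    chars is a list of (character, confidence) pairs, confidence in 0..1.
    Returns (page certainty as a percentage, positions needing review).
    """
    if not chars:
        return 0.0, []
    flagged = [i for i, (_, conf) in enumerate(chars) if conf < threshold]
    certainty = 100 * sum(conf for _, conf in chars) / len(chars)
    return certainty, flagged


# The second character was read as a zero, but with low confidence
page = [("J", 0.98), ("0", 0.55), ("H", 0.97), ("N", 0.90)]
certainty, flagged = review_page(page)
```

Raising the threshold sends more characters to a human for checking, while lowering it trades some accuracy for speed.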
Exporting the Results
After you have designed, filled out, scanned, recognized, and validated your form, the last step is to export the results. For structured forms and surveys, you can pull all the filled out information and save it as a spreadsheet or database of your choice. Alternatively, you can use the handwriting on a page to identify it and save the scanned image with a filename and location based on that data.
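Both export styles can be sketched in a few lines. The example below writes validated records as CSV text that a spreadsheet or database import can read, and builds a safe filename from a recognized index value such as an invoice number. The function names, field names, and sample values are hypothetical and only illustrate the idea.

```python
import csv
import io
import re


def export_csv(records, fieldnames):
    """Return validated form records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def index_filename(value, extension="pdf"):
    """Build a filename from a recognized value, replacing unsafe characters."""
    safe = re.sub(r"[^A-Za-z0-9_-]+", "_", value.strip()).strip("_")
    return f"{safe or 'unindexed'}.{extension}"


csv_text = export_csv([{"name": "JOHN", "age": "42"}], ["name", "age"])
filename = index_filename(" INV 2024/017 ")
```

Falling back to a placeholder name when nothing usable was recognized keeps such pages from being saved under an empty filename.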
Advantages of Handprint Recognition
The advantages of handprint recognition software are many. Original images can be automatically archived and associated with the recognized data, so you can always go back to verify a particular point. Because the documents are now in a digital format, data can be entered and verified anywhere there is a computer, so you do not need to have every step of the process done at a single location or cover the cost of shipping large, heavy stacks of paper. Validation is scalable, so you can be as sure as you need to be that the data you save is correct. The biggest advantage, however, is the automation, as you can save countless man hours on large scanning and archiving projects.
Find Out More
Contact Us to schedule
a consultation with one of our experienced project managers.