What is Data Extraction?
OCR stands for Optical Character Recognition and is the technology
that allows software to interpret machine-printed text on scanned images.
Data extraction software uses OCR technology to automate data entry tasks
involving machine-printed forms. When the forms all have the same format, simple
Zone OCR can be employed to convert specific regions of the page to usable data.
Advanced data extraction software is also able to locate common data elements on
forms with many different formats. The most common example of this is
Invoice Processing, but data extraction can be done with any type of document.
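As a rough illustration of the Zone OCR idea described above, the sketch below assigns recognized words to named regions of a fixed-format page. It assumes an OCR engine has already returned each word with a bounding box; the Box and Word types, the zone coordinates and the sample words are all hypothetical and are not taken from any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in page pixels (hypothetical coordinate system)."""
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


@dataclass(frozen=True)
class Word:
    """One recognized word and its bounding box, as an OCR engine might report it."""
    text: str
    box: Box


def extract_zones(words: list[Word], zones: dict[str, Box]) -> dict[str, str]:
    """Assign each word to the zone that contains its center, then join the text."""
    fields: dict[str, list[Word]] = {name: [] for name in zones}
    for word in words:
        cx = (word.box.left + word.box.right) / 2
        cy = (word.box.top + word.box.bottom) / 2
        for name, zone in zones.items():
            if zone.contains(cx, cy):
                fields[name].append(word)
                break  # a word belongs to at most one zone
    # Read each zone top-to-bottom, then left-to-right.
    return {
        name: " ".join(w.text for w in sorted(ws, key=lambda w: (w.box.top, w.box.left)))
        for name, ws in fields.items()
    }


# Hypothetical zone layout for one fixed-format form.
ZONES = {
    "invoice_number": Box(400, 50, 600, 80),
    "total": Box(400, 700, 600, 730),
}

# Hypothetical OCR output for one scanned page.
WORDS = [
    Word("Invoice", Box(50, 55, 120, 75)),
    Word("INV-1042", Box(410, 55, 500, 75)),
    Word("$1,250.00", Box(420, 705, 520, 725)),
]

result = extract_zones(WORDS, ZONES)
print(result)
```

This approach only works when every form places its fields in the same positions. Forms with varying layouts need the content-based matching that advanced data extraction software provides.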
Data extraction software can also be used to:
Who can benefit from data extraction software?
Any organization that enters data into a database from paper forms or from
electronic documents such as Word, Excel and PDF files can get a very high return on investment by
automating the data entry with data extraction software.
Depending on the type and volume of documents and data you have, the cost of the
solution could range from a few hundred dollars to tens of thousands. A simple project
could justify a software purchase by saving only a few days of data entry time. A complex
project with many different types of documents and unstructured data might need
to offset hundreds of data entry hours to justify the expense.
Organizations that have many separate departments that perform data entry
from documents can share the budget for data extraction software by
re-using it for other projects. Your current project may not be big enough
to justify the expense, but when combined with one or two others it would be.
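The break-even reasoning above can be sketched as simple arithmetic. The figures used here (a 5,000 dollar solution, 25 dollars per hour of data entry labor, and the hours saved per project) are purely hypothetical and only illustrate how combining projects can tip the balance.

```python
def break_even_hours(solution_cost: float, hourly_labor_cost: float) -> float:
    """Hours of data entry that must be saved for the savings to equal the cost."""
    if hourly_labor_cost <= 0:
        raise ValueError("hourly_labor_cost must be positive")
    return solution_cost / hourly_labor_cost


def is_justified(
    solution_cost: float,
    hourly_labor_cost: float,
    hours_saved_per_project: list[float],
) -> bool:
    """True when the combined labor savings of all projects cover the cost."""
    total_hours = sum(hours_saved_per_project)
    return total_hours * hourly_labor_cost >= solution_cost


# Hypothetical figures, for illustration only.
SOLUTION_COST = 5000.0   # total cost of the solution in dollars
HOURLY_COST = 25.0       # cost of one hour of manual data entry

needed = break_even_hours(SOLUTION_COST, HOURLY_COST)
single = is_justified(SOLUTION_COST, HOURLY_COST, [120])
combined = is_justified(SOLUTION_COST, HOURLY_COST, [120, 60, 40])

print(f"Hours needed to break even: {needed:.0f}")
print(f"One project alone justified: {single}")
print(f"Three projects combined justified: {combined}")
```

With these assumed numbers, one project saving 120 hours falls short of the 200 hours needed, while three projects sharing the same software exceed it.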
How much do Data Extraction systems cost?
As mentioned above, the biggest factor in the price of the system is the
complexity and volume of the documents and data being captured. The total
cost of a data extraction solution also includes several other items:
- Time to install and configure the software
- Creation of recognition templates for each data field
- Definition of data exports for each document template
- User and administrator training
- Labor required to verify the recognition results
- IT infrastructure and maintenance costs
If you have an IT staff that is familiar with document scanning and OCR
applications, it is possible to do most of the configuration and maintenance
in-house. If not, we highly recommend that you use our
Consulting Services to guide you through the setup process.
Contact Us to get a professional analysis of your project
requirements and a full time and cost estimate.
What is the typical data entry workflow?
The process of converting a paper document to live data you can use
is as follows:
- Paper documents are prepped for scanning (unfolded, staples removed, etc.)
- Documents are scanned on a high-speed document scanner
- Scanned images are recognized with OCR
- Matching algorithms locate data elements within the text
- Fields that fail validation checks are presented to
the operators for manual review and correction
- Once all errors are corrected, data is exported to its final destination
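The matching and validation steps in the workflow above can be sketched as follows. This is a minimal illustration, not how any particular product works: the regular expressions, the MM/DD/YYYY date format, the invoice number format and the sample OCR text are all assumptions made for the example.

```python
import re
from datetime import datetime
from decimal import Decimal, InvalidOperation

# Hypothetical matching rules: each pattern captures one data element.
PATTERNS = {
    "invoice_number": re.compile(r"\bInvoice\s*(?:No\.?|Number|#)\s*:?\s*(\S+)", re.IGNORECASE),
    "invoice_date": re.compile(r"\bDate\s*:?\s*(\d{1,2}/\d{1,2}/\d{4})", re.IGNORECASE),
    "total": re.compile(r"\bTotal\s*:?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE),
}


def valid_invoice_number(value: str) -> bool:
    # Assumed format: letters, a hyphen, then digits (e.g. INV-1042).
    return re.fullmatch(r"[A-Z]+-\d+", value) is not None


def valid_date(value: str) -> bool:
    # Assumed format: MM/DD/YYYY; strptime rejects impossible dates.
    try:
        datetime.strptime(value, "%m/%d/%Y")
    except ValueError:
        return False
    return True


def valid_total(value: str) -> bool:
    try:
        return Decimal(value.replace(",", "")) > 0
    except InvalidOperation:
        return False


VALIDATORS = {
    "invoice_number": valid_invoice_number,
    "invoice_date": valid_date,
    "total": valid_total,
}


def process(text: str) -> tuple[dict, dict]:
    """Locate each field in the OCR text and split results into accepted and review."""
    accepted, needs_review = {}, {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        value = match.group(1) if match else None
        if value is not None and VALIDATORS[name](value):
            accepted[name] = value
        else:
            # Missing or invalid fields go to an operator for manual correction.
            needs_review[name] = value
    return accepted, needs_review


# Hypothetical OCR output containing a misread date (month 13).
OCR_TEXT = """ACME Supplies
Invoice No: INV-1042
Date: 13/05/2023
Total: $1,250.00
"""

accepted, needs_review = process(OCR_TEXT)
print("Accepted:", accepted)
print("Needs review:", needs_review)
```

Here the invoice number and total pass validation, while the impossible date is routed to the review queue. Export to the final destination would happen only once that queue is empty.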
How do I find out more?