Optical Character Recognition (OCR) and Its Role in Desktop Automation

Long gone are the days of digitizing text by hand. Since the early 1970s and the advent of personal scanners, the OCR industry has grown rapidly. Today, anyone can scan a document directly from a mobile phone. Until recently, however, optical character recognition (OCR) still required human input. The rise of automation has changed that. Let’s take a look at the role OCR plays in desktop automation.

What is optical character recognition (OCR)?

OCR is a technology that converts various types of documents, such as scanned paper documents, PDF files, or photos, into searchable and editable formats such as TXT or DOCX.

The first step in running OCR is to scan the physical document. Once every page has been captured, the OCR software converts the document to a black-and-white version. The program then analyzes the resulting bitmap for light and dark areas: dark areas are treated as characters to be recognized, and light areas as background. The OCR app then processes the dark areas to identify letters and numbers.
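The black-and-white conversion described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular OCR product works: the function names `binarize` and `dark_columns` are hypothetical, and the fixed threshold of 128 is an assumption (real tools often pick the threshold adaptively for each page).

```python
def binarize(gray, threshold=128):
    """Map a grayscale bitmap (rows of 0-255 values) to 1 = dark (ink), 0 = light (background)."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


def dark_columns(binary):
    """Return the column indexes that contain at least one dark pixel."""
    return [x for x in range(len(binary[0])) if any(row[x] for row in binary)]


# A tiny 3x5 "scan": low values are ink, high values are paper.
scan = [
    [250, 30, 245, 20, 251],
    [248, 25, 240, 22, 249],
    [252, 28, 247, 18, 250],
]
binary = binarize(scan)
ink_columns = dark_columns(binary)  # columns 1 and 3 hold the dark strokes
```

Only the columns flagged as dark would be passed on to the character recognition step.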

OCR uses feature detection, an approach sometimes called intelligent character recognition (ICR), which applies rules about the features of a particular letter or number. The software evaluates the document data against rules that describe how each letter or number is formed. For example, the capital letter “A” is stored as two diagonal lines that meet at the top, joined by a horizontal line across the middle.
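To make the idea of feature rules concrete, here is a toy sketch in Python. The rule table and the `classify` function are hypothetical and heavily simplified: each character is reduced to a count of stroke types, and the strokes are assumed to have been detected already. Real engines use far richer features.

```python
# Hypothetical feature rules: each character is described by its stroke counts.
FEATURE_RULES = {
    "A": {"diagonal": 2, "horizontal": 1, "vertical": 0},
    "H": {"diagonal": 0, "horizontal": 1, "vertical": 2},
    "L": {"diagonal": 0, "horizontal": 1, "vertical": 1},
    "V": {"diagonal": 2, "horizontal": 0, "vertical": 0},
}


def classify(features):
    """Return the character whose rule matches the observed features exactly, or None."""
    for char, rule in FEATURE_RULES.items():
        if rule == features:
            return char
    return None


# Two diagonals plus one horizontal bar matches the rule for "A".
observed = {"diagonal": 2, "horizontal": 1, "vertical": 0}
letter = classify(observed)
```

A shape that matches no rule returns `None`, which a real workflow might route to a human for review.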

After OCR identifies a character, the app converts it to a character code, such as ASCII, that computer systems can use. The last step is to assemble the recognized characters into words and write them to an editable document, such as a Word file.
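The final assembly step can be sketched as follows. This is an assumption-laden illustration: the `(char, x_start, x_end)` tuple layout, the `group_into_words` name, and the `max_gap` value are all made up for the example, and a real engine would also handle multiple lines and varying font sizes.

```python
def group_into_words(glyphs, max_gap=3):
    """Join recognized glyphs from one text line into words.

    glyphs: list of (char, x_start, x_end) tuples.
    A horizontal gap wider than max_gap between neighbors starts a new word.
    """
    words, current, prev_end = [], "", None
    for char, x_start, x_end in sorted(glyphs, key=lambda g: g[1]):
        if prev_end is not None and x_start - prev_end > max_gap:
            words.append(current)
            current = ""
        current += char
        prev_end = x_end
    if current:
        words.append(current)
    return words


glyphs = [("O", 0, 8), ("C", 10, 17), ("R", 19, 27), ("o", 40, 46), ("k", 48, 54)]
words = group_into_words(glyphs)
# Each recognized character maps to a numeric code the computer can store.
codes = [ord(c) for c in words[0]]
```

Here the wide gap before “o” splits the line into two words, and `ord` shows the numeric code behind each letter of the first one.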

OCR use cases

Here’s a list of the most common OCR technology use cases:

Scanning printed documents and saving them in an editable format
Indexing printed materials for search engines
Automated processing and data entry
Transcribing documents into text that can be read aloud for visually impaired users
Archiving of historical information (newspapers, magazines) and enablement of text searches
Data extraction and transfers to accounting software (receipts, invoices)
Archiving critical legal documents in an electronic system of record
License plate recognition with speed camera and red-light camera software
Sorting letters for mail delivery
Translation of words in an image into a given language
Providing searches for scanned books

As you can see, most of the uses listed above are repetitive processes, which makes them good candidates for automation. In this post, we won’t cover every possible application of OCR in automation. Instead, let’s consider the most popular way OCR helps automate repetitive tasks in Windows.

How does OCR benefit desktop automation tools?

OCR allows companies to automate business processes that depend on scanned paperwork, such as application forms, contracts, bills, and invoices. The same workflows can also draw on other sources, such as data gathered from websites and photographs.

The three most common operations where intelligent automation needs OCR are:

Reading and searching through PDF documents. This can be essential for a business, as most PDFs come in a non-editable format.
Extracting data from images. This kind of extraction can be critical when you are analyzing website data: a lot of information sits in images, where automated bots cannot read it without OCR.
Transcribing sensitive information from printed documents. Automated transcription can reduce data transfer errors by up to 99%.
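Once OCR has turned a PDF page or an image into plain text, an automation script still has to pull out the fields it needs. The sketch below shows one possible way to do that with regular expressions. The field labels, the patterns, and the sample text are assumptions made for illustration; real documents vary and usually need patterns tuned to their layout.

```python
import re

# Hypothetical patterns for two fields on an invoice that has already been OCR'd.
INVOICE_RE = re.compile(r"Invoice\s*(?:No\.?|#)\s*:?\s*([A-Z0-9-]+)", re.IGNORECASE)
# \b keeps "Total" from matching inside "Subtotal".
TOTAL_RE = re.compile(r"\bTotal\b\s*:?\s*\$?\s*([0-9][0-9,]*\.[0-9]{2})", re.IGNORECASE)


def extract_invoice_fields(ocr_text):
    """Pull the invoice number and total out of OCR output, or None when a field is missing."""
    invoice = INVOICE_RE.search(ocr_text)
    total = TOTAL_RE.search(ocr_text)
    return {
        "invoice_number": invoice.group(1) if invoice else None,
        "total": float(total.group(1).replace(",", "")) if total else None,
    }


sample = "ACME Supplies\nInvoice No: INV-2041\nSubtotal: $1,180.00\nTotal: $1,250.50\n"
fields = extract_invoice_fields(sample)
```

Returning `None` for a missing field, rather than guessing, lets the automation flag the document for manual review.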

Through the use of OCR, automation gains access to nearly all of the content of websites and documents. The ability to recognize text in graphical objects lets you automate processes that, until recently, were the prerogative of humans. Another plus is that OCR is not difficult to configure for automatic operation. Let’s look at how optical character recognition works in practice using WinTask automation.

OCR and desktop automation

Let’s imagine that a bank has automated the process of opening a new account for a business client. What will this process consist of?

The process of opening a corporate account involves several steps. First, the customer must select the type of account they intend to open. Then, the client fills out the relevant application form and sends it to the bank. Usually, a bank clerk collects additional information to confirm the client’s profile. This can be information from public social networks, the client’s website, and photographs of the business itself, collected from sources such as Google Maps. Here’s how WinTask automation would do that without any human involvement.

A new client logs in to the bank’s website and chooses the type of account they want to open.
Once their application form is submitted, WinTask automation extracts the data and begins the research process.
First, the automation script creates client records in the bank’s CRM system.
Then, it searches for the LinkedIn profiles of the company and its employees and adds this information to the company’s CRM profile.
Then, the automation bot opens Google Maps, searches for the business address indicated on the form, and analyzes the signs and shop windows on the street to locate the name of the business. This is where OCR comes into play. By recognizing textual information, the bot can confirm the presence of the expected sign.
Suppose the desired images cannot be obtained from public sources. In that case, the automation sends an email asking the client to provide photos. These photographs are then analyzed by the bot in the same way until the expected name is found.
After the information is collected, the bot creates a PDF profile of the customer and saves it to the bank’s local system of record. After that, the automation sends a notification to the manager, who performs the final approval.
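The sign-confirmation step in this workflow comes down to comparing the business name from the application form with the text OCR found in a street photo. The Python sketch below shows one way this check could work, using approximate matching so that small OCR misreads (such as “0” for “O”) do not cause a false rejection. It is not WinTask code: the function names, the sample business name, and the 0.8 similarity threshold are assumptions, and the OCR step itself is assumed to have already produced the text lines.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and keep only letters, digits, and single spaces."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return " ".join(cleaned.split())


def sign_matches(expected_name, ocr_lines, min_ratio=0.8):
    """Return True if any OCR line is close enough to the expected business name."""
    target = normalize(expected_name)
    for line in ocr_lines:
        ratio = SequenceMatcher(None, target, normalize(line)).ratio()
        if ratio >= min_ratio:
            return True
    return False


# Text recognized in a street photo; the second line has a typical misread ("0" for "O").
ocr_lines = ["OPEN 24H", "HARB0R COFFEE CO"]
confirmed = sign_matches("Harbor Coffee Co.", ocr_lines)
```

A lower threshold tolerates more OCR noise but raises the risk of confirming the wrong business, so the value is worth tuning against real photos.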

To set up this type of automation, you’ll need to run through the process manually in WinTask Developer and create a new automation script. Then, use Runtime to run the script on each machine in your organization. Setting up a workflow of this complexity takes a couple of hours.

Easy, right? Want to try it yourself? Don’t hesitate to contact us for more details. We can schedule a free consultation and customize a demo just for you.

September 30, 2021
