Optical character recognition (OCR) is a computer vision technique for reading text from images. It’s commonly used to read printed or handwritten documents. Usually, OCR is an initial step that extracts the text for later processing or analysis.
A more advanced OCR scenario is extracting information from forms, such as purchase orders or invoices, with a semantic understanding of what each field in the form represents.
Azure Form Recognizer Service is designed for this kind of challenge.
Form Recognizer extracts text from images of:
- Handwritten text
- Invoices
- Receipts
- Business cards
- Identity documents
- Contracts
- Health Insurance cards
- Vaccination cards
- Custom model: build a custom model to extract text from your own images
Form Recognizer can also capture the structure of the text, such as key/value pairs and information in tables, like the items within an order.
This post is part of my Azure cloud scenario post, where I talk about how to implement a cloud solution for a marketing campaign contest with AI verification.
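To make the idea of structure more concrete, here is a minimal Python sketch that reads key/value pairs and rebuilds a table from an analysis result. The JSON shape (keyValuePairs, tables, cells with rowIndex and columnIndex) is my assumption of what the service returns, and the sample data is made up for illustration, so check it against the JSON you actually get.

```python
def key_value_pairs(analyze_result: dict) -> dict:
    """Collect key/value pairs as a plain dict of strings."""
    pairs = {}
    for pair in analyze_result.get("keyValuePairs", []):
        key = pair["key"]["content"]
        # A key can be detected without a value, so fall back to "".
        value = (pair.get("value") or {}).get("content", "")
        pairs[key] = value
    return pairs


def table_to_rows(table: dict) -> list:
    """Rebuild a table as a list of rows from its flat list of cells."""
    rows = [[""] * table["columnCount"] for _ in range(table["rowCount"])]
    for cell in table["cells"]:
        rows[cell["rowIndex"]][cell["columnIndex"]] = cell["content"]
    return rows


# Made-up sample in the assumed shape, not real service output.
sample_result = {
    "keyValuePairs": [
        {"key": {"content": "Order No"}, "value": {"content": "1001"}},
        {"key": {"content": "Notes"}},
    ],
    "tables": [
        {
            "rowCount": 2,
            "columnCount": 2,
            "cells": [
                {"rowIndex": 0, "columnIndex": 0, "content": "Item"},
                {"rowIndex": 0, "columnIndex": 1, "content": "Qty"},
                {"rowIndex": 1, "columnIndex": 0, "content": "Coffee"},
                {"rowIndex": 1, "columnIndex": 1, "content": "2"},
            ],
        }
    ],
}

pairs = key_value_pairs(sample_result)
rows = table_to_rows(sample_result["tables"][0])
print(pairs)
print(rows)
```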
Create Azure Resource Group
Provisioning the resources can be done in different ways:
- You can use Azure CLI and execute az scripts
- You can use an Infrastructure as Code platform like Terraform
- You can use Azure Portal
This time, I will use the Azure portal, which is the simplest way.
The first step is to create a resource group in your Azure subscription. I named it AI-Demos.
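If you prefer the Azure CLI route, the same step could look roughly like the sketch below. It only builds and prints the az group create command from Python, so it is safe to run anywhere. The region name uaenorth is my assumption for UAE North (this post picks that region for the service later on), and the run helper is a hypothetical convenience, not part of any Azure library.

```python
import shlex
import subprocess


def resource_group_command(name: str, location: str) -> list:
    """Build the argument list for `az group create`."""
    return ["az", "group", "create", "--name", name, "--location", location]


def run(argv: list) -> None:
    """Run a CLI command and raise if it exits with a non-zero status."""
    subprocess.run(argv, check=True)


# "AI-Demos" is the name used in this post; "uaenorth" is assumed to be
# the CLI name of the UAE North region.
group_command = resource_group_command("AI-Demos", "uaenorth")
print(shlex.join(group_command))

# To really create the group (after `az login`), call: run(group_command)
```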
Create Form Recognizer Resource
Go to Azure Marketplace and search for “Form Recognizer”:
Then, create a new resource:
- Choose the region: UAE North
- The name of the service
- Pricing tier: choose the free tier for the demo; in a real scenario, it should be Standard
After that, click “Next” and go to the “Network” tab:
- For the demo, you can choose all networks. In a real scenario, the service should be contained in a virtual network so that only resources within that network can access it. Another option is a private endpoint connection as the exclusive way to access the resource.
Then, in the “Identity” tab, set Status to Off for the demo. In a real scenario, you could enable a managed identity if you want to give the service access to another service, such as the storage account that holds the order images.
That’s all! After creating the resource, you will find it in your resource group.
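For reference, the portal steps above could be approximated with the Azure CLI. The sketch below only builds and prints the az cognitiveservices account create command. The resource name contest-form-recognizer is hypothetical, and I am assuming F0 is the SKU code of the free tier (S0 for Standard) and uaenorth the CLI name of UAE North. Network and identity settings are left at their defaults here.

```python
import shlex


def form_recognizer_command(
    name: str, resource_group: str, location: str, sku: str = "F0"
) -> list:
    """Build the argument list that creates a Form Recognizer resource."""
    return [
        "az", "cognitiveservices", "account", "create",
        "--name", name,
        "--resource-group", resource_group,
        "--kind", "FormRecognizer",
        "--sku", sku,  # assumed: F0 = free tier, S0 = Standard
        "--location", location,
        "--yes",  # accept the terms without an interactive prompt
    ]


# "contest-form-recognizer" is a made-up name; pick your own.
service_command = form_recognizer_command(
    "contest-form-recognizer", "AI-Demos", "uaenorth"
)
print(shlex.join(service_command))
```

Passing the printed command to a shell (after az login) would create the resource; switching sku to "S0" is the change I would expect for a real scenario.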
Use Form Recognizer Studio
After creating the Form Recognizer resource, you can use Form Recognizer Studio.
Form Recognizer Studio is an online tool for trying the features of the Form Recognizer service and integrating them into your applications. The studio helps you experiment with different Form Recognizer models and see what data they return without writing code.
It also provides code samples that show how to call the service from your code base.
So, let’s first navigate to Form Recognizer Studio: https://formrecognizer.appliedai.azure.com/
On this default page, you will see different services for document analysis, prebuilt models, and custom models.
- Prebuilt models can be used for specific document types like receipts
- Custom models can be trained for more customized cases
The prebuilt model for receipts is enough for our case.
In Prebuilt models, click “Receipts”. If you are not already logged in, you will need to log in with the same credentials as your Azure subscription.
After that, a pop-up window will appear to configure the service resource:
- Choose your Azure subscription
- Choose the resource group that contains the Form Recognizer service resource
- Click “Continue” button
Then, you will see the default page of your project:
- You can choose one of the order samples and click “Analyze”
After you click “Analyze”, you will see the result on the right
- Check the Fields
- Check the JSON result
- Check the code samples in Python, JavaScript, or C#
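The Python sample from the studio will look different, but as a rough idea of the integration, here is a minimal sketch using the azure-ai-formrecognizer package. The endpoint, key, and file path are placeholders you would supply, and the field names (MerchantName, TransactionDate, Total) are the ones I expect from the prebuilt receipt model, so verify them in the Fields tab. The demo at the bottom uses fake field objects so the sketch runs without calling the service.

```python
from types import SimpleNamespace


def summarize_receipt(fields) -> dict:
    """Keep a few receipt fields as (value, confidence) tuples."""
    summary = {}
    for name in ("MerchantName", "TransactionDate", "Total"):
        field = fields.get(name)
        if field is not None:  # a receipt may not contain every field
            summary[name] = (field.value, field.confidence)
    return summary


def analyze_receipt(endpoint: str, key: str, path: str) -> list:
    """Send a local receipt image to the prebuilt receipt model."""
    # Imported here so the demo below runs without the SDK installed.
    from azure.ai.formrecognizer import DocumentAnalysisClient
    from azure.core.credentials import AzureKeyCredential

    client = DocumentAnalysisClient(endpoint, AzureKeyCredential(key))
    with open(path, "rb") as image:
        poller = client.begin_analyze_document("prebuilt-receipt", document=image)
        result = poller.result()  # wait for the analysis to finish
    return [summarize_receipt(doc.fields) for doc in result.documents]


# Fake fields standing in for a real response; the values are made up.
sample_fields = {
    "MerchantName": SimpleNamespace(value="Contoso", confidence=0.98),
    "Total": SimpleNamespace(value=12.5, confidence=0.95),
}
summary = summarize_receipt(sample_fields)
print(summary)
```

Keeping the confidence next to each value is a deliberate choice: for a contest with AI verification, a low confidence could be a signal to send the receipt for manual review.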
So, creating the Form Recognizer service is now done, and it can be integrated into our application.
How much does it cost to use Form Recognizer Service?
According to the Azure Pricing Calculator, the prebuilt models, like receipts in our case, cost $10 per 1,000 pages. In my Marketing Campaign Contest post, the scenario targets one million end users. Assuming one receipt page per user, that is 1,000,000 / 1,000 × $10 = $10,000.
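The estimate is simple enough to check in a few lines of Python. This sketch assumes one analyzed page per end user and a flat price with no free allowance or volume discount, which may not match your actual bill.

```python
def estimate_cost(pages: int, price_per_1000: float = 10.0) -> float:
    """Estimate the cost in USD for a number of analyzed pages."""
    return pages / 1000 * price_per_1000


# One million end users, assuming one receipt page each.
cost = estimate_cost(1_000_000)
print(f"${cost:,.0f}")
```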
What if this is over the budget?
I have looked at an open source library called PaddleOCR. I haven’t used it in production before, but it might be good enough for your scenario.
It’s not simple to compare it to the Form Recognizer service. You would need load tests to know how much the computation costs when running it on Azure. It may cost the same or maybe more.
So, I can’t say whether it’s a good or bad alternative.
That’s all! I hope you enjoyed reading the blog post.