Monday, February 12, 2024

APEX with AI Services: Automating invoice entry with AI assisted key/value extraction

Since couple of years, almost everyone is talking about AI, trying to understand how this can help, both life and business. OpenAI ChatGPT made a huge impact in our lives, "You are a helpful assistant", thank you! I am using it daily. I have been also playing around with other language models like Llama 2 and actively learning ML from different resources. Hugging Face is a must join platform along with all the material on LangChain . Just these two got me as far as I could train and run my own model to classify my emails within a week, and the results were incredibly better than I could ever expect. Besides I am enjoying this a lot.

Nowadays the number of interested customers is increasing and this post is about a very basic customer use-case, a real one: invoice entry. I know, it doesn't sound interesting, at first I thought my technical consultancy for ERP days are over, but I promise this is not boring. It has new challanges for me (and for most customers) and demonstrates application of AI services to real life problems. So let's dive into it!

Requirements

"...investigating the possibilities of automating / optimizing the reading and processing of PDF documents with the help of Optical Character Recognition (OCR)..."
The moment I saw this I could imagine what they wanted. After verifying their ideal soution in our discovery meeting, we planned for a demo to proove it can work.
Here is a mock design, on the right side we display the PDF file, OCR'ed and all values extracted, and on the left side form is populated with the extracted values. Ideally operator will just click save, and will have a chance to fill in any missing information, huge time saving.

Challenges

Biggest challange is lack of skills, customer knows APEX inside out but I am not an APEX developer. I understand the APEX environment, components and how they work, installed and configured many times. Followed and demonstrated many workshops, but never developed something from scratch. Yet APEX is low code, there are many samples and I was able to complete this in 2 days.

Development

I will briefly mention the steps I've followed and highlight the important parts. Using cloud services makes it easy to start.
1 I start with creating an Autonomous database and APEX workspace. It takes minutes to start worrking on my APEX development.
2 Then I follow this LiveLab workshop as a starter application. I tweaked the table structure according to my needs, but it gave me the foundation I needed for interacting OCI Document Understanding service using object storage is a good decision.
3 Using Document Understanding service inside APEX was easy.
I have added API endpoint in application definition.

Then I made changes to saveDocument process to invoke the AI service after uploading file to Object Storage. The below code prepares the JSON body, invokes the service processorJobs, parses return message and updates the table. The execution flow and response time is not efficient for production but good enough for a POC. I am using only Key Value Extraction feature, but you can also use all features including generation of a searchable PDF file in your output bucket. The service is pretrained and capable of identifying common key/value pairs from an invoice document. My service call creates a json file, which can be located with the job_id. I've also loaded that json file into a blob, parsed and created views on top of it just to make my life a bit easier. Chris Saxon has an excellent cheat sheat for that purpose.


4 I created a new page, with 3 regions. Two side by side, left for showing/entering extracted values, right an iframe to display PDF file, last for invoice lines, as designed in the mock wireframe.
For displaying PDF inline on the right side of the page I followed instructions in this YouTube video . The only thing is I didn't have a link item on the same page, but the ID has to come as a page parameter. So I added the link on my home page where all uploaded files are listed, and passed the document_id as page parameter. Then created a new Page Load Dynamic Action to get ID and trigger PDFViewer Action to display the file.
After changing the theme to Redwood (with some modifications to make it dark) the application looks like this:

Conclusion

APEX and AI Services is a very powerful combination, that can help you boost the productivity. Please share your ideas about what you think and of course new use-cases in the comments, maybe we can build one together!


References:
1. LiveLabs: Use OCI Object Storage to Store Files in Oracle APEX Applications
2. OCI Developer Documentation: Document Understanding API
3. OCI Developer Documentation: Key Value Extraction (Invoices)
4. Oracle Blog: How to Store, Query, and Create JSON Documents in Oracle Database
5. Oracle Developer YouTube Channel: PDF Viewer in Oracle APEX
6. Banner Image Credit: Dall-e 3
7. YouTube Background Soundtrack: BenSound: Royalty Free Music for Videos
8. Postman Collections: OCI REST API Postman Collection

No comments:

Post a Comment

Featured

Putting it altogether: How to deploy scalable and secure APEX on OCI

Oracle APEX is very popular, and it is one of the most common usecases that I see with my customers. Oracle Architecture Center offers a re...