Automated document processing for efficient insurance operations
In the insurance sector, processing high volumes of handwritten documents while maintaining accuracy remains a critical operational challenge. Agilytic partnered with a leading health insurer to implement automated document processing that could handle thousands of documents daily with near-perfect precision.
To protect confidentiality, we may alter specific details while preserving the accuracy of our core contribution.
Context & objectives
A leading health insurer processed thousands of documents manually each day. These handwritten documents contained numerous yes/no questions, but processing them efficiently while maintaining quality posed a significant challenge.
Seeking to streamline operations, the insurer aimed to automate the extraction of information from these documents. Despite gradual improvements to their tools and processes over time, they needed more sophisticated technology to achieve fully automated document processing.
Our key challenge was maximizing document coverage while maintaining near 100% precision. This was a critical requirement given the sensitive nature of subsequent decisions. The solution needed to:
Handle multiple document types with varying templates and checkboxes
Manage PDFs that contain missing, out-of-order, or extra pages
Approach
System architecture and technology stack
The system takes scanned documents containing checkboxes as input and produces a summarized list of answers to the questions within these documents. We built the solution using:
Docker
AWS (cloud development)
PyTorch
OpenCV
Using carefully fine-tuned Natural Language Processing (NLP) techniques, we achieved precise text classification for automated document processing.
Document recognition and processing workflow
The pipeline follows these steps:
Recognize the document's template automatically
Compare the document with the reference templates and select the one with the highest correlation
Match each page of the current PDF with its reference page, accounting for unordered or missing pages
Detect checkboxes on each page automatically
Use a Convolutional Neural Network (CNN) to classify boxes as checked, unchecked, or unclear
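The template and page matching steps above can be illustrated with a minimal sketch. It assumes each page has already been reduced to a small, flattened grayscale thumbnail of the same size, scores every scanned page against every reference page with a Pearson correlation, and pairs them greedily from the best score down. The names `correlation`, `match_pages`, and `min_score`, the 0.8 cutoff, and the toy pixel values are all hypothetical and chosen for illustration; the case study does not describe how the production system computes or thresholds its correlations.

```python
from math import sqrt


def correlation(a, b):
    """Pearson correlation between two equally sized, flattened grayscale pages."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a uniform (blank) page carries no template signal
    return cov / sqrt(var_a * var_b)


def match_pages(scanned, reference, min_score=0.8):
    """Pair scanned pages with reference pages, best-scoring pairs first.

    Returns (mapping, missing, extra):
      mapping: reference page index -> scanned page index
      missing: reference pages with no acceptable scanned page
      extra:   scanned pages that match no reference page
    min_score is an illustrative cutoff, not a value from the case study.
    """
    scores = sorted(
        ((correlation(s, r), si, ri)
         for si, s in enumerate(scanned)
         for ri, r in enumerate(reference)),
        reverse=True,
    )
    mapping, used = {}, set()
    for score, si, ri in scores:
        if score < min_score:
            break  # every remaining pair scores lower still
        if ri in mapping or si in used:
            continue  # one of the two pages is already paired
        mapping[ri] = si
        used.add(si)
    missing = [ri for ri in range(len(reference)) if ri not in mapping]
    extra = [si for si in range(len(scanned)) if si not in used]
    return mapping, missing, extra


# Toy 8-pixel "pages": three reference pages with distinct layouts.
reference = [
    [255, 255, 0, 0, 255, 255, 0, 0],
    [255, 0, 255, 0, 255, 0, 255, 0],
    [255, 255, 255, 255, 0, 0, 0, 0],
]
# The scan arrives out of order, lacks reference page 1, and ends with a blank page.
scanned = [
    [240, 250, 245, 235, 10, 5, 20, 0],          # noisy copy of reference page 2
    [250, 240, 15, 5, 245, 255, 0, 10],          # noisy copy of reference page 0
    [128, 128, 128, 128, 128, 128, 128, 128],    # blank extra page
]
mapping, missing, extra = match_pages(scanned, reference)
```

In this toy run, reference page 0 is paired with scanned page 1 and reference page 2 with scanned page 0, while reference page 1 is reported as missing and scanned page 2 as extra. Greedy pairing is the simplest option; an optimal assignment method could replace it if pages within a template resemble each other closely.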
Results
Our integrated automated document processing application helped the insurer save time and resources by:
Providing more than 50% document coverage at 100% precision for decision-making
Processing hundreds of documents per day
Accommodating diverse document types and datasets
Offering an intuitive interface suitable for all skill levels
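The relationship between coverage and precision in these results can be made concrete with a small sketch. It assumes one plausible reading of the two terms: coverage is the share of checkboxes the system decides on its own, and precision is the share of those automatic decisions that are correct, with everything else labeled unclear and left for manual review. The functions `decide` and `coverage_and_precision`, the 0.95 threshold, and the sample probabilities are hypothetical; the case study does not state how either metric was computed.

```python
def decide(p_checked, threshold=0.95):
    """Turn a model's probability that a box is checked into one of three labels."""
    if p_checked >= threshold:
        return "checked"
    if p_checked <= 1 - threshold:
        return "unchecked"
    return "unclear"  # abstain and leave the box for manual review


def coverage_and_precision(probabilities, truths, threshold=0.95):
    """Coverage: share of boxes decided automatically.
    Precision (as assumed here): share of those decisions that are correct."""
    decisions = [decide(p, threshold) for p in probabilities]
    decided = [(d, t) for d, t in zip(decisions, truths) if d != "unclear"]
    coverage = len(decided) / len(decisions)
    correct = sum(d == t for d, t in decided)
    precision = correct / len(decided) if decided else None
    return coverage, precision


# Illustrative model outputs and manually verified labels (made-up values).
probabilities = [0.99, 0.02, 0.60, 0.97, 0.40, 0.01, 0.98, 0.55]
truths = ["checked", "unchecked", "unchecked", "checked",
          "checked", "unchecked", "checked", "checked"]

strict = coverage_and_precision(probabilities, truths, threshold=0.95)
lenient = coverage_and_precision(probabilities, truths, threshold=0.5)
```

With the strict threshold, the sketch decides five of the eight boxes and gets all five right (coverage 0.625, precision 1.0). With the lenient threshold it decides every box but gets two wrong (coverage 1.0, precision 0.75). Raising the threshold trades coverage for precision, which is why a system held to near 100% precision may cover only part of the document volume.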
The modular solution also supports future scalability and flexibility through its hybrid cloud deployment capabilities. The client can expand the document processing pipeline, add performance monitoring with automated alerts, and increase coverage across different process stages.
