[HN Gopher] Battle of document info extraction services: GCP vs....
___________________________________________________________________
Battle of document info extraction services: GCP vs. AWS vs. Azure
Author : ctk_brian
Score : 38 points
Date : 2021-04-22 18:40 UTC (4 hours ago)
(HTM) web link (www.crosstab.io)
(TXT) w3m dump (www.crosstab.io)
| ajcp wrote:
| Very interesting review, thanks for the read! If I may, I've had
| some different experiences.
|
| I work for one of the biggest supermarket chains in the US as
| part of the team implementing an invoice processing capability
| for the enterprise to utilize. We literally take in thousands of
| paper/non-digitized invoices a day, and in our testing have found
| Azure's Form Recognizer (AFR) to be very dependable and
| consistently accurate. I have also professionally used Google
| Form Parser and ABBYY's OCR engine, though not ABBYY's cloud
| offering.
|
| > it's also the only service fast enough to be part of a
| synchronous pipeline.
|
| I assume what you're talking about here is exposing the
| processing capability _and_ its response as part of a tool used
| directly by a person. Aside from the occasional one-off edge
| case, we've never seen the value in building for this. In form
| processing, the real goal of any enterprise is to get the
| invoice data into its system of record, where it can be
| validated, addressed, and maintained. That does not require a
| "man-in-the-middle" approach wherein the user submits the
| invoice and then expects the results to be returned immediately
| so that they may...what, put them in the system of record
| themselves? We've found that the "time to effect" of the
| workflow is the same whether the data is hand-keyed or
| programmatically submitted to the system from an AFR response.
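| A minimal sketch of that asynchronous pattern (the queue, the
| post_to_system_of_record function, and the record shape are all
| hypothetical stand-ins, not AFR's actual API):

```python
import queue

# Sketch: invoices are processed asynchronously; a worker drains
# completed extraction results straight into the system of record,
# with no user waiting on a synchronous response.

def post_to_system_of_record(record, store):
    """Stand-in for the enterprise system-of-record API."""
    store.append(record)

def drain_results(results_queue, store):
    """Programmatically submit every completed extraction result."""
    while True:
        try:
            record = results_queue.get_nowait()
        except queue.Empty:
            return
        post_to_system_of_record(record, store)

# Example: two extraction results awaiting submission
results = queue.Queue()
results.put({"invoice_id": "A-1001", "total": "512.40"})
results.put({"invoice_id": "A-1002", "total": "89.99"})
store = []
drain_results(results, store)
```

| The point of the sketch is that no human sits between extraction
| and the system of record.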
|
| > requires a custom model to be trained before extracting data
|
| This is simply not true. AFR provides quite a few pre-built
| models[1] that we have found to return confidence scores
| consistently above 70%. To put that in perspective, humans
| average about 66% accuracy when performing data entry of this
| type[2]. Sure, the pre-built models don't necessarily handle
| invoice line items (which require much more complex key-value
| arrays and matrices), but they can be used to capture metadata
| on an invoice that then informs how and where it moves along in
| the "processing" flow.
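| As a sketch of that confidence-based routing (the field names,
| the response shape, and the 0.70 threshold are illustrative
| assumptions, not AFR's actual API), in Python:

```python
# Sketch: route extracted invoice fields by confidence score.
# Fields below the threshold go to manual review; the rest flow
# on to the processing pipeline automatically. The input is a
# simplified {field_name: (value, confidence)} dict.

CONFIDENCE_THRESHOLD = 0.70  # mirrors the scores mentioned above

def route_invoice(fields):
    """Split extracted fields into auto-accepted vs. manual review."""
    accepted, review = {}, {}
    for name, (value, confidence) in fields.items():
        if confidence >= CONFIDENCE_THRESHOLD:
            accepted[name] = value
        else:
            review[name] = value
    return accepted, review

# Example: metadata fields from a hypothetical extraction response
fields = {
    "VendorName": ("Acme Produce Co.", 0.93),
    "InvoiceTotal": ("1042.17", 0.88),
    "InvoiceDate": ("2021-04-01", 0.54),  # low confidence -> review
}
accepted, review = route_invoice(fields)
```

| Only the low-confidence date field would be routed to a human
| here; the rest moves along the flow untouched.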
|
| We've also found that a single "monolithic [custom] model",
| able to address our specific vendor invoices with more finely
| tuned value returns, has been fairly easy to build and
| maintain.
|
| 1. https://docs.microsoft.com/en-us/azure/cognitive-
| services/fo...
|
| 2. https://www.sciencedirect.com/science/article/abs/pii/S07475...
| tims33 wrote:
| There is so much potential in these technologies, but even
| autogenerated documents have so many embedded semantics that it
| is hard to train these tools for a wide variety of document
| formats.
___________________________________________________________________
(page generated 2021-04-22 23:01 UTC)