
Best field name practices for OCR scanning?


Former Community Member

I have a PDF form that many people are printing and mailing to me instead of filling it out and submitting it via XML.

I want to use OCR scanning with Adobe tools to get the data back into a text format I can manipulate.

I have already found problems with the way I named the fields for this.

For instance, each of my displayed text field names has a colon at the end of it, which OCR usually converts to an l, 1, or i.

So now I'm looking at the old-fashioned method of putting a cover sheet, with text blocks cut out of it, over my filled form before scanning it.

Does anyone know what would be best to use as field names on the cover sheet so that OCR is most likely to read them correctly?

Is bold better than non-bold?

Is all caps better than mixed case?

Is larger type size better?

I'm already using bold, all caps, and larger-than-normal type (12 in place of 9), and avoiding punctuation and easily confused characters such as 0, O, and I.

Any suggestions or a reference for best chances of successful OCR would be appreciated.
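
Whatever the labels end up looking like, one way to tolerate a misread character (such as the trailing colon turning into an l, 1, or i) is to match each OCR'd label against the list of field names you already know, instead of requiring an exact match. The following is only a rough sketch using Python's standard `difflib` module; the field names and the 0.8 cutoff are made up for illustration and are not anything an Adobe tool provides.

```python
import difflib

# Hypothetical field names; substitute the labels printed on your own form.
KNOWN_FIELDS = ["FULL NAME", "STREET ADDRESS", "PHONE NUMBER"]

def match_label(ocr_text, known=KNOWN_FIELDS, cutoff=0.8):
    """Return the known field name closest to an OCR'd label, or None."""
    # Normalize case and whitespace before comparing.
    cleaned = " ".join(ocr_text.upper().split())
    matches = difflib.get_close_matches(cleaned, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None

# A label whose colon was read as "l" still resolves to the right field.
print(match_label("Full Namel"))
```

This only works when the field names are clearly different from one another, so short, distinct labels help here as well.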

1 Reply


Former Community Member

Instead of OCR, why not put a dynamic barcode on the form? That way you can simply scan the barcode and get the XML data that was filled in on the form.
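
As a rough illustration of what happens after the scan, here is a minimal sketch of reading field values out of the decoded XML. It assumes the barcode reader hands the XML back as a string and that the data is a flat set of child elements under one root; the element names in the sample are hypothetical, and the real layout depends on how the form and barcode are set up.

```python
import xml.etree.ElementTree as ET

def fields_from_barcode_xml(payload):
    """Return a dict of element tag -> text for the children of the root."""
    root = ET.fromstring(payload)
    return {child.tag: (child.text or "").strip() for child in root}

# Hypothetical payload, standing in for whatever the barcode decodes to.
sample = "<form1><FullName>Jane Doe</FullName><Phone>555 0100</Phone></form1>"
print(fields_from_barcode_xml(sample))
```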
