Best field name practices for OCR scanning?

Former Community Member

I have a PDF form that many people are printing and mailing to me instead of filling out and submitting via XML.

I want to use OCR scanning into Adobe tools to get the data back into a manipulable text format.

I have already found problems with the way I named the fields for this.

For instance, each of my displayed text field names ends with a colon, which OCR usually converts to an l, 1, or i.
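One way to tolerate that kind of misread after scanning is to match whatever the OCR returns against the list of labels you know are on the form. The following is only a minimal sketch: the label list, the function name `match_label`, and the 0.8 cutoff are illustrative assumptions, not anything from an Adobe tool.

```python
import difflib

# Hypothetical labels as printed on the form, without the trailing colon.
KNOWN_LABELS = ["NAME", "ADDRESS", "PHONE", "EMAIL"]

def match_label(ocr_text, known=KNOWN_LABELS, cutoff=0.8):
    """Return the known label closest to the OCR output, or None if nothing is close."""
    # Normalize case and surrounding whitespace before comparing.
    cleaned = ocr_text.strip().upper()
    # get_close_matches ranks candidates by similarity ratio, best first.
    matches = difflib.get_close_matches(cleaned, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None

# A trailing colon misread as "l", "1", or "i" still resolves to the intended label.
print(match_label("NAMEl"))
print(match_label("ADDRESS1"))
print(match_label("PHONEi"))
```

This only helps when the labels are fairly distinct from one another; very short or similar labels would need a stricter cutoff.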

So now I'm looking at the old-fashioned method of putting a cover sheet, with text blocks cut out of it, over my filled form before scanning it.

Does anyone know what would be best to use as field names on the cover sheet so that OCR is most likely to read them correctly?

Is bold better than non-bold?

Is all caps better than mixed case?

Is larger type size better?

I'm already using bold, all caps, and larger than normal type (12 in place of 9), and I'm avoiding punctuation, 0, O, I, and other characters that are easily confused.
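A quick check like the one below could flag candidate labels that still contain characters from that avoid list. It is a rough sketch: the `CONFUSABLE` set is just the characters mentioned in this post plus ASCII punctuation, and `risky_chars` is a made-up helper name.

```python
import string

# Assumed avoid list: the characters named in this post plus all ASCII punctuation.
CONFUSABLE = set("0OIl1i") | set(string.punctuation)

def risky_chars(label):
    """Return the sorted, de-duplicated characters in label that are on the avoid list."""
    return sorted({ch for ch in label if ch in CONFUSABLE})

for candidate in ["NAME", "PHONE:", "ZIP CODE"]:
    print(candidate, risky_chars(candidate))
```

An empty result means the label uses none of the listed characters; it does not guarantee the OCR engine will read it correctly.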

Any suggestions or a reference for best chances of successful OCR would be appreciated.

1 Reply

Former Community Member

Instead of OCR, why not put a dynamic barcode on the form? That way you can simply scan the barcode and get the XML data that was filled in on the form.

Paul
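To illustrate the barcode suggestion: once a scanner or decoder has turned the barcode into a text string, reading the field values out of the XML is straightforward. The sketch below assumes the decoded payload is a flat XML document with one element per field. The element names, the sample payload, and the function `fields_from_barcode_xml` are hypothetical, and the decoding step itself is not shown.

```python
import xml.etree.ElementTree as ET

def fields_from_barcode_xml(payload):
    """Map each child element of the root to its text value."""
    root = ET.fromstring(payload)
    # Empty elements have text None, so fall back to an empty string.
    return {child.tag: (child.text or "").strip() for child in root}

# Hypothetical payload, standing in for the string a barcode decoder would return.
sample = "<form1><NAME>Jane Doe</NAME><PHONE>555 0100</PHONE><EMAIL/></form1>"
print(fields_from_barcode_xml(sample))
```

The real structure depends on how the barcode field is configured on the form, so the parsing would need to be adjusted to match the actual payload.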