Best field name practices for OCR scanning?

I have a PDF form that, instead of filling out and submitting via XML, many people are printing and mailing to me.

I want to use OCR scanning into Adobe tools to get the data back into a manipulatable text format.

I have already found problems with the way I named the fields for this.

For instance, each of my displayed text field names ends with a colon, which OCR usually converts to an l, 1, or i.

So now I'm looking at the old-fashioned method of putting a cover sheet with text blocks cut out of it over my filled form before scanning it.

Does anyone know what would be best to use as field names on the cover sheet to make it most likely that OCR will read them correctly?

Is bold better than non-bold?

Is all caps better than mixed case?

Is larger type size better?

I'm already using bold, all caps, and larger than normal type (12 pt in place of 9 pt), and avoiding punctuation and 0, O, I, and other characters that are easily confused.

Any suggestions or a reference for best chances of successful OCR would be appreciated.
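Since the set of field names on the cover sheet is known in advance, one option is to correct the OCR output afterwards by snapping each recognized label to the closest known one. This is only a minimal sketch in Python using the standard `difflib` module; the label list and the `match_label` helper are hypothetical and would need to be replaced with the labels actually printed on the sheet.

```python
import difflib

# Hypothetical cover-sheet labels; replace with the ones actually printed.
KNOWN_LABELS = ["NAME", "STREET", "PHONE", "EMAIL"]


def match_label(ocr_text, labels=KNOWN_LABELS, cutoff=0.6):
    """Return the known label most similar to the OCR output, or None.

    cutoff is the minimum difflib similarity ratio (0.0 to 1.0) accepted
    as a match; 0.6 is difflib's default, not a tuned value.
    """
    cleaned = ocr_text.strip().upper()
    matches = difflib.get_close_matches(cleaned, labels, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# A trailing colon misread as "l" or "1", and an O misread as 0,
# still resolve to the intended label.
print(match_label("NAMEl"))
print(match_label("STREET1"))
print(match_label("PH0NE"))
```

This works best when the labels are clearly different from one another, so a single misread character cannot turn one label into another.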

1 Reply

Instead of OCR, why not put a dynamic barcode on the form? That way you can simply scan the barcode and get the XML data that was filled in on the form.

Paul
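Once the barcode has been decoded to text, getting the values back out is ordinary XML parsing. The following is a rough sketch in Python, assuming the decoded payload is a flat XML document with one child element per field; the actual layout depends on how the barcode is configured on the form, and the element names and the `fields_from_barcode_xml` helper here are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical decoded barcode payload; real element names come from the form.
SAMPLE_PAYLOAD = "<form1><Name>Jane Doe</Name><Phone>555-0100</Phone></form1>"


def fields_from_barcode_xml(payload):
    """Map each direct child element of the root to its text value.

    Assumes a flat structure (no nested subforms); empty elements
    become empty strings.
    """
    root = ET.fromstring(payload)
    return {child.tag: (child.text or "").strip() for child in root}


fields = fields_from_barcode_xml(SAMPLE_PAYLOAD)
print(fields["Name"], fields["Phone"])
```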