There is NO 100% reliability for matching raw strings extracted by OCR.

Asked by srinivasulu

For example, the word “system” might be incorrectly recognized as “systen”.

Could you please let us know which text fonts are supported by Sikuli?
How can we get 100% accuracy?

Question information

Language:
English
Status:
Answered
For:
SikuliX
Assignee:
No assignee
RaiMan (raimund-hocke) said :
#1

SikuliX internally uses the Tesseract OCR package via its C++ API.
It uses the standard eng.traineddata, and the restrictions documented for Tesseract apply.
There is no information about supported fonts.

If the accuracy of the SikuliX text feature is not good enough for you, you have only 3 choices:
1. live with it
2. use the Tesseract package standalone (maybe integrated with SikuliX via internal command execution)
3. create traineddata files for your situation and incorporate them into the SikuliX tessdata folder
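
As a rough illustration of choices 1 and 2, here is a minimal Python sketch, not anything SikuliX ships. It assumes the standalone `tesseract` command-line program is installed and on the PATH. The function names, the example file name and the 0.8 threshold are made up for this example. The first part tolerates small misreads such as "systen" for "system" by comparing strings approximately instead of exactly; the second part runs Tesseract as an external command on a saved screenshot.

```python
import difflib
import subprocess


def similarity(expected, recognized):
    """Return a ratio between 0 and 1; 1.0 means the strings are identical."""
    matcher = difflib.SequenceMatcher(None, expected.lower(), recognized.lower())
    return matcher.ratio()


def fuzzy_match(expected, recognized, threshold=0.8):
    """Accept OCR output that is close enough to the expected text.

    The threshold is an assumption; tune it for your own texts.
    """
    return similarity(expected, recognized) >= threshold


def tesseract_command(image_path, lang="eng"):
    """Build the command line for the standalone Tesseract program.

    'stdout' tells Tesseract to print the text instead of writing a file,
    and '-l' selects the traineddata language to use.
    """
    return ["tesseract", image_path, "stdout", "-l", lang]


def ocr_with_tesseract(image_path, lang="eng"):
    """Run Tesseract on an image file and return the recognized text."""
    output = subprocess.check_output(tesseract_command(image_path, lang))
    return output.decode("utf-8").strip()


if __name__ == "__main__":
    # "systen" differs from "system" in one character and is still accepted.
    print(fuzzy_match("system", "systen"))
    # Needs a real image file and an installed tesseract binary:
    # print(ocr_with_tesseract("screenshot.png"))
```

Approximate matching does not make the OCR itself any more accurate; it only keeps a script from failing on a single misread character. A threshold that is too low will start accepting wrong words.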
