SikuliX

Bug #710586
Comment #6

Tsung-Hsiang Chang (vgod) wrote on 2011-10-04:

Let me briefly summarize the progress on the OCR research we are doing for Sikuli.

1. I recently implemented a new OCR algorithm designed for small screen text, taken from the paper "Recognition of Screen-Rendered Text" (ICPR '06). However, the algorithm does not perform as well as the authors claim in the paper. It is even worse than Tesseract OCR, so for now we will continue using Tesseract as Sikuli's OCR engine.

2. We are migrating from Tesseract 2 to Tesseract 3. One significant advantage of Tesseract 3 is that it supports many more languages, including Chinese and Japanese. We are also working on a simple OCR trainer so that Sikuli users can train the OCR engine on the fonts installed on their systems.

3. Improving OCR performance is very tricky. There are many parameters to tune and many preprocessing steps that could be applied. We have put a collection of screenshots with labeled ground truth in our source repo, so anyone can try to improve the OCR algorithm and simply run the tests to see whether a change makes it better or worse. You are welcome to fork our code and try any possible improvements, or to contribute more labeled screenshots to make our data set more diverse.
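
To illustrate the kind of check described in point 3, here is a minimal Python sketch of scoring OCR output against labeled ground truth. It is not the test code in the Sikuli repo. The function names (`edit_distance`, `char_accuracy`, `evaluate`), the sample format, and the choice of edit-distance-based character accuracy as the metric are all assumptions made for illustration. The `ocr` argument is a placeholder for whatever engine is being tested.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_accuracy(truth, recognized):
    """Score in [0, 1]: 1 minus the edit distance normalized by truth length."""
    if not truth:
        return 1.0 if not recognized else 0.0
    return max(0.0, 1.0 - edit_distance(truth, recognized) / len(truth))


def evaluate(samples, ocr):
    """Mean character accuracy over labeled samples.

    samples: list of (image, ground_truth_text) pairs (hypothetical format).
    ocr:     callable mapping an image to recognized text (placeholder).
    """
    scores = [char_accuracy(truth, ocr(image)) for image, truth in samples]
    return sum(scores) / len(scores) if scores else 0.0
```

Running `evaluate` on the same labeled set before and after a change gives a single number to compare, which is one simple way to tell whether a parameter or preprocessing tweak helped or hurt.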