SikuliX

Inconsistency in Match.text() - OCR related

Asked by Nanni Sunil on 2012-02-18

I tried this as a follow up to https://answers.launchpad.net/sikuli/+question/188070

The scenario is to extract text from an image stored on the file system.

I tried the following:

f = Finder("/Users/nanni/test-sikuli/source.png")
f.find("/Users/nanni/test-sikuli/to-extract.png")

while f.hasNext():
    m = f.next()
    # Prints the match itself, not the text inside the matched image.
    print m  # Match[31,83 533x54 score=1.00 target=center]

# The call below looks for text on the screen, but I want the text from the image.
# It treats the match coordinates as SCREEN coordinates and extracts text from the SCREEN instead of my image.
print m.text()

I guess REGION always considers SCREEN co-ordinates to extract the text.

Is that so?
Is there a way to extract text from an image stored on the file system?

Question information

Language: English

Status: Solved

For: SikuliX

Assignee: No assignee

Solved by: Nanni Sunil

Solved: 2012-02-19

Last query: 2012-02-19

Last reply: 2012-02-18

RaiMan (raimund-hocke) said on 2012-02-18:

No, this is not so; this is a bug.

Use instead:

print Region(m).text()

Nanni Sunil (sunil-jayaprakash) said on 2012-02-18:

Actually, both m.text() and Region(m).text() produce the same result, and both act on the SCREEN instead of the match.

Here is the sample script.
========================
print "starting"

f = Finder("/Users/nanni/test-sikuli/source.png")
f.find("/Users/nanni/test-sikuli/to-extract.png")

while f.hasNext():
    m = f.next()
    # Try OCR via a Region built from the match.
    print "From Region"
    print Region(m).text()
    # Try OCR directly on the match.
    print "From Match"
    print m.text()
    print "exists"

print "done"
========================

Output:
========================
starting

Found. Trying to extract Text.

From Region
Question#188121
~

From Match
Question#188121
~
done
========================
In both cases, Question#188121 was present on the SCREEN but not in "/Users/nanni/test-sikuli/source.png".

Is this expected? If not, I could file a bug.

I am still unable to find a way to extract text from an image on the file system.

RaiMan (raimund-hocke) said on 2012-02-18:

Sorry, you are absolutely right with your first finding:
I guess REGION always considers SCREEN co-ordinates to extract the text.

The same holds for Match (a Region subclass) and Location. Sorry for the quick but wrong answer.

If you want a solution for that, you must step down to the Java API and work directly on buffered images. This is possible from the Sikuli script level, since it is Jython (Java API: http://sikuli.org/doc/java-x/).

If you want to file a feature request, you could ask for the possibility of using a stored/buffered image as a "virtual screen".
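
To illustrate the point about coordinates, here is a minimal sketch in plain Python (not Sikuli code). It assumes an image is just a list of text rows, and Rect and crop are hypothetical helpers, not part of the Sikuli API. A match found by Finder is relative to the stored image, so its rectangle has to be cut out of that image; applying the same rectangle to the screen reads whatever happens to be displayed there. From Jython, the real counterpart of crop would probably be java.awt.image.BufferedImage.getSubimage(x, y, w, h) on an image loaded with javax.imageio.ImageIO.read(). The OCR step itself is left out, because this thread does not show which call was used for it.

```python
from collections import namedtuple

# Hypothetical stand-in for a match rectangle (x, y, width, height).
Rect = namedtuple("Rect", "x y w h")


def crop(rows, rect):
    """Return the part of `rows` (a list of equal-length strings) covered by `rect`."""
    if rect.w <= 0 or rect.h <= 0 or rect.x < 0 or rect.y < 0:
        raise ValueError("rectangle must have positive size and non-negative origin")
    if rect.y + rect.h > len(rows) or rect.x + rect.w > len(rows[0]):
        raise ValueError("rectangle lies outside the image")
    return [row[rect.x:rect.x + rect.w] for row in rows[rect.y:rect.y + rect.h]]


# Toy "stored image" and toy "screen" with different content at the same position.
source_image = [
    "......",
    ".TEXT.",
    "......",
]
screen = [
    "######",
    "#SCRN#",
    "######",
]

# Coordinates as Finder would report them: relative to source_image.
match = Rect(x=1, y=1, w=4, h=1)

from_image = crop(source_image, match)   # what the question asks for
from_screen = crop(screen, match)        # what Region/Match.text() effectively reads

print(from_image)
print(from_screen)
```

With a real BufferedImage, the cropped sub-image would then be passed to whatever OCR routine is available.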

Nanni Sunil (sunil-jayaprakash) said on 2012-02-19:

RaiMan, thanks for the input. I have raised a bug for Virtual Screen support. Also, as you said, I was able to achieve the functionality through the Java API using BufferedImage.

RaiMan (raimund-hocke) said on 2012-02-19:

Fine. Could you post an example code snippet showing how to do this with BufferedImage?
It would help me and maybe others.

Nanni Sunil (sunil-jayaprakash) said on 2012-02-21:

RaiMan, https://github.com/suniljayaprakash/sikuli-ocr

RaiMan (raimund-hocke) said on 2012-02-21:

Really great. Thanks.

JonyGreen (jonygreen) said on 2015-11-10:

You can try this free online OCR, http://www.online-code.net/ocr.html, to extract text from an image.
