How to implement text reading with SikuliX IDE 1.1.3 or 1.1.4

Asked by Mikle on 2018-09-06

I've been racking my brain over this. I do not understand how to read text using Sikuli IDE 1.1.3 or 1.1.4 (the latest version, which should definitely be able to do it).
The task is to grab the text from the Gusev app. It seems this is possible through Region.text(), but I did not understand how to use it in practice (OCR is included).

find("1536235557278.png")
click()
doubleClick("1536219181729.png")
click("1536219340303.png")
paste("Request")
click("1536219429561.png")
## there should be a command for getting the text from the picture (the answer to the previous query)

Question information

Language: English
Status: Solved
For: Sikuli
Assignee: No assignee
Solved by: RaiMan
Solved: 2018-09-25
Last query: 2018-09-25
Last reply: 2018-09-07

This question was reopened

Tim B (tangob) said : #1

There are a lot of ways to capture text, especially within the 1.1.4 version and the updated Tesseract functionality. I think perhaps you are not calling the function relative to the object? Here is a simple sample that should work, based on what you are looking for:

----------------------
# Designate a new region... used your pattern as a reference
new_reg = find("1536235557278.png")

# Print the text to the console from that region
print new_reg.text()
----------------------

Hope that helps!

Mikle (povidlov.mikle) said : #2

I did it as written, but this error appears, and the Sikuli IDE hangs and does not respond again:

----------------
[error] ScriptingSuport: getRunner: no runner found for: text/plain
[error] IDE: runCurrentScript: Could not load a script runner for: text/python

RaiMan (raimund-hocke) said : #3

With 1.1.4, no Jython is available.

see: bug 1790592

Mikle (povidlov.mikle) said : #4

Thanks Tim B, that solved my question.

Mikle (povidlov.mikle) said : #5

Russian letters are not recognized; look-alike letters from the Latin alphabet are returned instead. Is that how it should be, or is there some way to fix it?

RaiMan (raimund-hocke) said : #6

The default language for recognition is English.

You can install the language sets for other languages and then tell SikuliX to use one of them for recognition.

Since the docs are not yet ready for these OCR configuration options, these are the steps:

1. find the folder SikulixTesseract/tessdata in your SikuliX <app-data> folder (see docs)

2. download the languages needed from https://github.com/tesseract-ocr/tessdata/tree/3.04.00 (only the files with .traineddata)

3. put the .traineddata files into the tessdata folder (step 1.)

4. in your script say before using OCR features:

tr = TextRecognizer.start()
tr.setLanguage("xxx")

where xxx is the shorthand for the wanted language (the letters in the filename (step 3.) before the .traineddata)
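
As a small illustration of the naming rule in step 4, here is a sketch in plain Python (not part of SikuliX) that derives the language shorthand from the .traineddata filenames in a folder. The function names language_code and installed_languages are made up for this example, and the folder you pass in is whatever your tessdata folder from step 1 turns out to be.

```python
import os


def language_code(filename):
    """Return the part of a .traineddata filename before the extension, or None."""
    suffix = ".traineddata"
    if filename.endswith(suffix) and len(filename) > len(suffix):
        return filename[:-len(suffix)]
    return None


def installed_languages(tessdata_dir):
    """List the language codes of all .traineddata files in tessdata_dir."""
    codes = [language_code(name) for name in os.listdir(tessdata_dir)]
    # Drop entries that are not .traineddata files and sort for stable output.
    return sorted(code for code in codes if code)
```

For a folder containing eng.traineddata and rus.traineddata, installed_languages would return the codes eng and rus, and one of those is the value to pass to setLanguage.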

Mikle (povidlov.mikle) said : #7

21 tr = TextRecognizer.start()
22 tr.setLanguage("xxx")

-------------------------------------
[error] script [test] stopped with error in line 21
[error] NameError (name 'TextRecognizer' is not defined)

Mikle (povidlov.mikle) said : #8

Thanks Tim B, that solved my question.

Mikle (povidlov.mikle) said : #9

RaiMan (raimund-hocke) said : #10

Oops, my fault. This is not really complete yet.

As a quick workaround use this:

top of script:
import org.sikuli.script.TextRecognizer as TextOCR

then later:
ocr = TextOCR.start()
ocr.setLanguage("xxx")

... I will fix it, so that with the next build the import will no longer be needed.

Mikle (povidlov.mikle) said : #11

[error] AttributeError (type object 'org.sikuli.script.TextRecognizer' has no attribute 'start')

RaiMan (raimund-hocke) said : #12

I cannot see your relevant script snippet.

... but anyway: meanwhile, the latest build works like this:

ocr = TextOCR.start()
ocr.setLanguage("xxx")

or shorter (if this is the only usage):
TextOCR.start().setLanguage("xxx")
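
To make the order of the calls explicit, here is a minimal sketch that sets the language first and then reads a region with Region.text(). It assumes a build in which TextRecognizer.start() exists, as described above. The helper name read_text_in_language, the image name answer_area.png and the language code rus are only illustrative and do not come from SikuliX or from the original script.

```python
def read_text_in_language(region, recognizer_class, language):
    """Set the OCR language, then return the text read from the region."""
    ocr = recognizer_class.start()   # same call as TextOCR.start() above
    ocr.setLanguage(language)        # shorthand of the installed .traineddata file
    return region.text()             # the region's text() call then performs the OCR


# Inside a SikuliX script, after
#   import org.sikuli.script.TextRecognizer as TextOCR
# a call might look like this (answer_area.png is a hypothetical image):
#   text = read_text_in_language(find("answer_area.png"), TextOCR, "rus")
```

Passing the recognizer class in as a parameter keeps the helper independent of how TextRecognizer was imported, which differs between the builds discussed in this thread.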

Mikle (povidlov.mikle) said : #13

... probably because I use version 1.1.3.
Version 1.1.4 starts on my machine, but when I run the script it produces this error:

------------------
[error] ScriptingSuport: getRunner: no runner found for: text/plain
[error] IDE: runCurrentScript: Could not load a script runner for: text/python
------------------
I don't know how to make it disappear.

RaiMan (raimund-hocke) said : #14 (best answer)

for 1.1.4:
see my comment #3

Mikle (povidlov.mikle) said : #15

Thanks RaiMan, that solved my question.