How to implement text reading with SikuliX IDE 1.1.3 or 1.1.4

Asked by Mikle

I'm stuck. I do not understand how to read text using SikuliX IDE 1.1.3 or 1.1.4 (the latest version, which should definitely support it).
The task is to grab text from the Gusev app. It seems this should be possible through Region.text(), but I could not work out how to use it in practice (OCR is included).

find("1536235557278.png")
click()
doubleClick("1536219181729.png")
click("1536219340303.png")
paste("Request")
click("1536219429561.png")
## there should be a command for getting the text from the picture (the answer to the previous query)

Question information

Language:
English
Status:
Solved
For:
SikuliX
Assignee:
No assignee
Solved by:
RaiMan

This question was reopened

Tim B (tangob) said :
#1

There are a lot of ways to capture text, especially within the 1.1.4 version and the updated Tesseract functionality. I think perhaps you are not calling the function relative to the object? Here is a simple sample that should work, based on what you are looking for:

----------------------
# Designate a new region... used your pattern as a reference
new_reg = find("1536235557278.png")

# Print the text to the console from that region
print new_reg.text()
----------------------

Hope that helps!
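For the script in the question, where the reply shows up next to a known element, a minimal sketch could look like the following. It assumes the answer text sits directly to the right of the matched image; `read_text_right_of` and the 300-pixel width are hypothetical choices for illustration, while `right()` and `text()` are the Region methods.

```python
def read_text_right_of(match, width=300):
    """Return the OCR text found in the area to the right of `match`.

    `match` is expected to be a SikuliX Region/Match, e.g. the result of find().
    The width (in pixels) is an illustrative guess; adjust it to the application.
    """
    answer_area = match.right(width)   # new Region extending to the right of the match
    return answer_area.text().strip()  # run OCR on that Region and trim whitespace

# Inside a SikuliX script (image name taken from the question):
# print(read_text_right_of(find("1536219429561.png")))
```

If the answer appears below the element instead, `below()` can be used in place of `right()` in the same way.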

Mikle (povidlov.mikle) said :
#2

I did it as written, but this error appears; the Sikuli IDE hangs and does not give control back after the script is launched.

----------------
[error] ScriptingSuport: getRunner: no runner found for: text/plain
[error] IDE: runCurrentScript: Could not load a script runner for: text/python

RaiMan (raimund-hocke) said :
#3

With 1.1.4, no Jython is available, so Python scripts cannot be run.

see: bug 1790592

Mikle (povidlov.mikle) said :
#4

Thanks Tim B, that solved my question.

Mikle (povidlov.mikle) said :
#5

Russian letters are not recognized; OCR returns look-alike letters from the Latin alphabet instead. Is that the way it should be, or is there some way to fix it?

RaiMan (raimund-hocke) said :
#6

English is the default language for recognition.

You can install the language sets for other languages and then tell SikuliX to use one of them for recognition.

Since the docs do not yet cover these OCR configuration options, here are the steps:

1. Find the folder SikulixTesseract/tessdata in your SikuliX <app-data> folder (see docs).

2. Download the languages you need from https://github.com/tesseract-ocr/tessdata/tree/3.04.00 (only the files ending in .traineddata).

3. Put the .traineddata files into the tessdata folder (step 1).

4. In your script, before using OCR features, add:

tr = TextRecognizer.start()
tr.setLanguage("xxx")

where xxx is the shorthand for the wanted language (the part of the filename from step 3 before .traineddata).
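To see which values are valid for xxx, the codes can be read off the file names in the tessdata folder. A small sketch using only the standard library; `available_languages` is a hypothetical helper name, not part of SikuliX, and the path in the usage comment keeps the <app-data> placeholder from step 1.

```python
import os

def available_languages(tessdata_dir):
    """List the language codes of the .traineddata files in tessdata_dir."""
    suffix = ".traineddata"
    codes = []
    for name in os.listdir(tessdata_dir):
        if name.endswith(suffix):
            # "rus.traineddata" -> "rus"
            codes.append(name[:-len(suffix)])
    return sorted(codes)

# Example (replace <app-data> with the real SikuliX app-data folder):
# print(available_languages("<app-data>/SikulixTesseract/tessdata"))
```

Any code this returns should be usable as the argument to setLanguage(), for example "rus" once rus.traineddata has been copied into the folder.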

Mikle (povidlov.mikle) said :
#7

21 tr = TextRecognizer.start()
22 tr.setLanguage("xxx")

-------------------------------------
[error] script [test] stopped with error in line 21
[error] NameError (name 'TextRecognizer' is not defined)

Mikle (povidlov.mikle) said :
#8

Thanks Tim B, that solved my question.

Mikle (povidlov.mikle) said :
#9
RaiMan (raimund-hocke) said :
#10

Oops, my fault. This is not really complete yet.

As a quick workaround use this:

top of script:
import org.sikuli.script.TextRecognizer as TextOCR

then later:
ocr = TextOCR.start()
ocr.setLanguage("xxx")

... I will fix it, so that with the next build the import will no longer be needed.

Mikle (povidlov.mikle) said :
#11

[error] AttributeError (type object 'org.sikuli.script.TextRecognizer' has no attribute 'start')

RaiMan (raimund-hocke) said :
#12

I cannot see your relevant script snippet.

... but anyway: meanwhile, the latest build works like this:

ocr = TextOCR.start()
ocr.setLanguage("xxx")

or shorter (if this is the only usage):
TextOCR.start().setLanguage("xxx")
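The AttributeError in #11 suggests that the build in use does not offer start(). A hedged sketch that wraps the two calls from this thread and fails with a clearer message in that case; `start_ocr` is a hypothetical helper, and only start() and setLanguage() are taken from the posts above.

```python
def start_ocr(language, recognizer=None):
    """Start the text recognizer and switch it to `language` (e.g. "rus")."""
    if recognizer is None:
        # Resolvable only inside SikuliX (Jython); same import as in #10.
        import org.sikuli.script.TextRecognizer as recognizer
    if not hasattr(recognizer, "start"):
        # Matches the situation reported in #11.
        raise RuntimeError("This SikuliX build has no TextRecognizer.start(); "
                           "a newer build may be needed")
    ocr = recognizer.start()
    ocr.setLanguage(language)
    return ocr

# Inside a SikuliX script:
# ocr = start_ocr("rus")
```

The optional `recognizer` argument is only there so the helper can be tried outside SikuliX with a stand-in class.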

Mikle (povidlov.mikle) said :
#13

... probably because I use version 1.1.3.
Version 1.1.4 does start for me, but running the script produces this error:

------------------
[error] ScriptingSuport: getRunner: no runner found for: text/plain
[error] IDE: runCurrentScript: Could not load a script runner for: text/python
------------------
I don't know how to make it disappear.

Best RaiMan (raimund-hocke) said :
#14

for 1.1.4:
see my comment #3

Mikle (povidlov.mikle) said :
#15

Thanks RaiMan, that solved my question.