update FAQ #1110 for Sikuli X

Asked by janet

I've been using Sikuli to create tests for Miro and MiroVideoConverter, using FAQ #1110 as my main reference, as well as https://answers.launchpad.net/sikuli/+question/111193 and https://answers.launchpad.net/sikuli/+question/100436 for Windows.

I believe that the new text recognition feature will completely rock my world. Will the instructions and the patch (http://people.csail.mit.edu/vgod/sikuli/sikuli-mod.patch) be updated to work with Sikuli X?

Gary Lee (carrot1192001) said :
#1

Sikuli X 1.0 RC1 has been released now. How do I use the new feature region.text()? Could you give a simple example? Thanks so much. Another question: what is the role of the patch, and how do I install it?

RaiMan (raimund-hocke) said :
#2

--- the general approach around FAQ #1110 might no longer be valid or needed with Sikuli X
I am currently investigating and documenting what is needed to use the Sikuli features without ever touching the IDE (e.g. just using Jython to get things running).
The major improvements that make things much easier with Sikuli X are the support for import and for an image path.

--- the patch is no longer needed, since the __main__ bug was fixed with 10.2 at the latest

RaiMan (raimund-hocke) said :
#3

@ Gary Lee regarding Region.text()

It does just what it says:

it tries to "read" the text contained in the region using an OCR engine and returns it as a string for further processing.

Since it is still experimental, do not expect miracles (e.g. with languages other than English, special fonts, special layouts, or interspersed symbols/graphics).

Just experiment with it yourself (and report what you find) and work out your own uses.

example:
# suppose you have one open Firefox window
myApp = "mozilla firefox"
App.focus(myApp) # bring it to front
regWinFox = App(myApp).window() # get the window as region
textWinTitle = Region(regWinFox.x+25, regWinFox.y, regWinFox.w-125, 25).text()
print textWinTitle # should print the window title as text (do not expect miracles ;-)

Gary Lee (carrot1192001) said :
#4

OK, thank you. The example you posted works well. But how do I get what I want on the web page (such as a button or a line of text) as a region? I looked at the Region class API, and it seems I need to know the coordinates of the button or the text on the web page:
Region( x, y, w, h ). Am I right? And how do I determine the x, y, w, h of the button or the text on my screen?

RaiMan (raimund-hocke) said :
#5

Absolutely right - you need a region to get the text from it.

There are several possibilities with Sikuli, but all are based on the same principle:
you need to know where some key visual objects are located relative to each other.

You start with a known region (e.g. an app window) and use the spatial operators (above, below, left, right) to restrict the next operation.

If you want to do this more often, it helps to write some helper functions that calculate another region based on known coordinates (e.g. the corners of the region and its width and height).

Another helpful tool is the target offset setter in the preview panel, which lets you measure distances and locations. You can also use patterns with a targetOffset to get a match that defines some other relative location.

And last but not least, use the new text search feature to find objects like buttons that carry short text labels.

And as an overall rule: always restrict the region of interest as narrowly as possible - this gives speed and precision.

But be aware that the region.text() feature is still in an experimental state (see my comment before).
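
As a rough illustration of the geometry behind the spatial operators, here is a minimal sketch in plain Python. It assumes that right(n) and below(n) yield a strip of n pixels directly next to the region, keeping the other dimension unchanged. Rect, right_of and below_of are hypothetical stand-ins, not part of the Sikuli API, and the label coordinates are made up.

```python
from collections import namedtuple

# Hypothetical stand-in for a Sikuli Region: geometry only, no screen access.
Rect = namedtuple("Rect", "x y w h")

def right_of(r, width):
    # Strip of the given width directly to the right of r, same y and height.
    return Rect(r.x + r.w, r.y, width, r.h)

def below_of(r, height):
    # Strip of the given height directly below r, same x and width.
    return Rect(r.x, r.y + r.h, r.w, height)

# Assumed match of a field label (made-up coordinates).
label = Rect(40, 200, 80, 20)

# The input field is expected right next to the label, about 150 px wide.
field = right_of(label, 150)
print(field)  # Rect(x=120, y=200, w=150, h=20)
```

In a real script the label would come from a find() on an image, and the narrowed region would then be the one to call text() on.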

Gary Lee (carrot1192001) said :
#6

It is amazing that region.text() can read the text in an image. I used find().left() to focus on the specific region and then called .text() to read out the text. But the .text() method cannot read Chinese characters, which is a pity.

Gary Lee (carrot1192001) said :
#7

RaiMan:
Region(regWinFox.x+25, regWinFox.y, regWinFox.w-125, 25).text(): How did you find the exact x, y, w, h coordinates, and how would I know values like x+25 and w-125?

Gary Lee (carrot1192001) said :
#8

See my comment above

RaiMan (raimund-hocke) said :
#9

This was a "quick and dirty" example, but the principle is that you get new regions either by combining an existing region's x, y, w, h with values you have measured or guessed, or by using the spatial operators like right, below, ...

To measure/guess I use either the IDE preview -> target offset feature or the Mac screenshot tool, which shows coordinates.

I have defined some helper functions that I import, e.g. to get the region inside another region below some visual object ...

So in the example above, to get the text from the title bar of the FF window, I could have written:

titleBar = regWinFox.above(1).below(25)

This gives me the whole title bar. For text reading this is not optimal, because of the graphics on the left and right.
So let's make a def:
def regInside(reg, left, right, top=0, bottom=0):
    # shrink reg by the given margins on each side
    return Region(reg.x+left, reg.y+top, reg.w-left-right, reg.h-top-bottom)

I decided that the most frequent use would be to ignore parts on the left and right, so I set the defaults of top and bottom to 0; in the standard usage with text() I can leave them out.

Now we can get the text out of the title bar:
# has to be adjusted to your situation (once - forever ;-)
winTitleBarHeight = 25
winTitleBarIgnoreLeft = 25
winTitleBarIgnoreRight = 100

titleBar = regWinFox.above(1).below(winTitleBarHeight)
TitleBarText = regInside(titleBar, winTitleBarIgnoreLeft, winTitleBarIgnoreRight).text()
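
To check the arithmetic without a screen, the same steps can be replayed in plain Python. This is only a sketch under assumptions: Rect and the functions above, below and reg_inside are hypothetical stand-ins that mimic the geometry described in this post, and the window position (100, 50, 800, 600) is made up.

```python
from collections import namedtuple

# Hypothetical stand-in for a Sikuli Region: geometry only.
Rect = namedtuple("Rect", "x y w h")

def above(r, height):
    # Strip of the given height directly above r (assumed operator geometry).
    return Rect(r.x, r.y - height, r.w, height)

def below(r, height):
    # Strip of the given height directly below r (assumed operator geometry).
    return Rect(r.x, r.y + r.h, r.w, height)

def reg_inside(r, left, right, top=0, bottom=0):
    # Same margin arithmetic as regInside in the post.
    return Rect(r.x + left, r.y + top, r.w - left - right, r.h - top - bottom)

window = Rect(100, 50, 800, 600)            # made-up window position

# above(1) is the 1 px line over the window; below(25) of that line
# starts at the window's top edge again and is 25 px high.
title_bar = below(above(window, 1), 25)     # Rect(100, 50, 800, 25)

# Ignore 25 px on the left and 100 px on the right of the title bar.
text_area = reg_inside(title_bar, 25, 100)  # Rect(125, 50, 675, 25)
print(text_area)
```

With these numbers the region handed to text() starts 25 px in from the window's left edge and is 675 px wide, which is the part of the title bar between the graphics on either side.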
