Locating text in a window

Asked by Faisal

Hi,

I started using Sikuli very recently and have come across the situation explained below.
Sikuli can record an image and search for it anywhere on the screen, with no need to specify coordinates for the image comparison. I am wondering how I could do the same for text. I am doing something on the command line and restarting the application, so I expect a few things to be displayed in the browser. I can predict what text will be displayed, but I cannot predict its location (coordinates) or keep a pre-recorded image to look for on the screen. Could anybody please help me solve this? Thank you very much for your time and help.

Faisal

Question information

Language: English
Status: Answered
For: SikuliX
Damon (damon-9) said :
#1

I too have a need for this. I am looking at an OCR solution using Python. If this works I will happily share with the community.
Essentially I will take the text and convert it to an image using ImageMagick. From there I should be able to do a find() or exists().

Another option is to do the reverse: take a screenshot and use OCR to "find" the text on the screen when I attempt a validation.

Cheers Mate!

RaiMan (raimund-hocke) said :
#2

Generally Sikuli can find text in a region or the whole screen:
some_region.find("some text")

See the docs for more information.

But this feature only works reliably with medium-sized text in the most popular fonts, and all the usual rules for making searches robust still apply.

If we are talking about text in a standard HTML web app, the first choice should be Selenium, which can easily be combined with Sikuli at the Jython level and at the Java level as well.

@ Damon
The idea of creating searchable images from text is charming, but for the sake of performance this should be done with the normal image features of Java: create an image, use the font to write the text snippet with the appropriate spacing parameters, and finally use the buffered image to search on the screen or in the region. That way there is no intermediate file saving/loading and no external program to call. Many of the standard ImageMagick functions on images can be done on buffered images and graphics contexts directly in Java (and hence in Jython as well). It is even possible to "learn" the font metrics from an example image of some text with some intelligent micro searches.

Damon (damon-9) said :
#3

Thanks RaiMan. I am new to Sikuli and am very familiar with Selenium. Unfortunately I am testing a legacy app written in VB6. This app has a grid control that you cannot latch onto using any of the standard automation methods. Hence using Sikuli.

Yesterday I wrote a set of functions to run in an XML-RPC server. Using these functions I am able to quickly build test cases that perform any action I need on the app.

It is the verification side of things I need to use text for. I had not gotten to working with regions yet, but will give what you suggested a shot. As for the ImageMagick process, what you described is what I thought I would be doing.

If the some_region.find("some text") works then I won't need to worry about doing any of the "tricks" with converting text to image and image to text.

Thanks again!

Damon (damon-9) said :
#4

@RaiMan:

I've done some testing with screen_region.find('some text') and found it is intermittent at best.

Examples:

jackovir not found
mARTIN not found
sa found
Test_01 found
TestAZ found
g_e found
g_eis not found
g_eiskifullreport not found

So as you can see, this is an inconsistent solution. I will continue testing to see whether it has anything to do with the combination of upper- and lower-case letters.

RaiMan (raimund-hocke) said :
#5

This is the problem with the current implementation of find("some_text").

You might get better results by restricting the region to a grid cell.

And you can try to turn it around by using

text = grid_cell_region.text()

and then compare the returned text string with your expected value.
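
For the comparison step, exact string equality is likely to be too strict for OCR output, which often differs in case, whitespace, or a single misread character. Below is a minimal sketch of a tolerant comparison, assuming the region object offers the text() method mentioned above. The names normalize, text_matches and verify_cell and the 0.85 threshold are hypothetical and not part of Sikuli; only the standard difflib module is used.

```python
import difflib


def normalize(text):
    """Collapse whitespace runs and ignore case, since OCR output is noisy."""
    return " ".join(text.split()).lower()


def text_matches(actual, expected, threshold=0.85):
    """Return True when the OCR result is close enough to the expected text.

    threshold is a hypothetical tolerance; tune it against real captures.
    """
    a, b = normalize(actual), normalize(expected)
    if a == b:
        return True
    # ratio() is 1.0 for identical strings and drops as they diverge.
    return difflib.SequenceMatcher(None, a, b).ratio() >= threshold


def verify_cell(region, expected, threshold=0.85):
    """region is any object with a text() method, e.g. a Sikuli Region."""
    return text_matches(region.text(), expected, threshold)
```

In a Sikuli script this might be called as verify_cell(grid_cell_region, "Test_01"). A lower threshold accepts more OCR mistakes but also more false positives, so short values such as "sa" are probably better checked with a threshold of 1.0.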

Damon (damon-9) said :
#6

RaiMan.
Well, I have had "some" success, but not exactly what I am after. To solve part of the problem, I took screenshots of all the controls on a form and then use the right and left methods to get to some of the controls I need.
Next, on the stupid grids with static content, I took specific screenshots of all values. I did try the list method, but that still doesn't work too well. The OCR is really bad.

To get to the dynamic data I am working on a method to access grid cells via a coordinate method based off of the first cell. I am looking at every grid layout we have and painstakingly looking at the size of each cell. Then I can use a formula for each grid to create a region on the fly for the specific cell.
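
The per-grid formula could look roughly like the sketch below. It assumes every row has the same height, that column widths are measured once per grid layout, and that grid lines are already included in those measurements. The function name cell_rect and all parameter names are hypothetical.

```python
def cell_rect(origin_x, origin_y, col_widths, row_height, row, col):
    """Return (x, y, w, h) for the cell at zero-based (row, col).

    origin_x, origin_y: top-left corner of the first cell, in screen pixels.
    col_widths: width of each column in pixels, left to right.
    row_height: height shared by every row (an assumption of this sketch).
    """
    if row < 0 or not 0 <= col < len(col_widths):
        raise IndexError("cell (%d, %d) is outside the grid" % (row, col))
    # Skip over all columns to the left and all rows above the target cell.
    x = origin_x + sum(col_widths[:col])
    y = origin_y + row * row_height
    return (x, y, col_widths[col], row_height)
```

In a Sikuli script the tuple can then be turned into a region on the fly, for example Region(*cell_rect(100, 200, [80, 120, 60], 20, 2, 1)), and only the first cell has to be located on the screen.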

If you have another suggestion I am all ears.

In the meantime I am going to look into a better OCR solution for Sikuli and see if I can implement something. Maybe using the ABBYY FineReader API? Anyway, thanks for your input.

Cheers!
