Questions about SikuliFirefoxDriver

Asked by Dan

I just discovered Sikuli and I am having some trouble getting the SikuliFirefoxDriver to work the way I would like. I am trying to use Sikuli (and Selenium) to click on buttons in a Flash program running in my browser. If I do it manually by grabbing a screenshot of the desktop, it works:

ScreenRegion s = new DesktopScreenRegion();
Target target = new ImageTarget(new File("2button.png"));
ScreenRegion r = s.find(target);
Mouse mouse = new DesktopMouse();
mouse.click(r.getCenter());

But if I try to use the SikuliFirefoxDriver:

ImageElement image = driver.findImageElement(new File("2button.png").toURI().toURL());
image.click();

I get a casting error when it finds the image:

Exception in thread "Thread-4" java.lang.ClassCastException: java.lang.String cannot be cast to org.openqa.selenium.WebElement
        at org.sikuli.webdriver.SikuliFirefoxDriver.findElementByLocation(SikuliFirefoxDriver.java:35)
        at org.sikuli.webdriver.SikuliFirefoxDriver.findImageElement(SikuliFirefoxDriver.java:53)

I don't understand why a String or a WebElement would show up there.

Also, how do I turn on the OCR feature in SikuliFirefoxDriver, so I can try to recognize sections of text and click on them? Is there a simple way to grab the screenshot from the Selenium browser and turn it into a Sikuli screen that can be clicked on? I want to have multiple instances running at the same time, and I assume that when you grab the desktop, it only captures what is visible in front.

Question information

Language: English
Status: Answered
For: SikuliX
Assignee: No assignee

RaiMan (raimund-hocke) said :
#1

Sorry, what you are using is not based on SikuliX (http://sikulix.com), but on the Sikuli Java API (https://code.google.com/p/sikuli-api).

I guess no one here can help you with that.

BTW: this stuff does not have an OCR feature.

Dan (mendels99) said :
#2

Yes, I am using the Java API. Is there a more appropriate forum to ask that question? The Java API has a Region.text() that acts as OCR, but it seems to be turned off by default.

RaiMan (raimund-hocke) said :
#3

I meant that this code
ScreenRegion s = new DesktopScreenRegion();
Target target = new ImageTarget(new File("2button.png"));
ScreenRegion r = s.find(target);
Mouse mouse = new DesktopMouse();
mouse.click(r.getCenter());

is not for the SikuliX we are talking about here. There is no forum for that, only the mentioned website.

Give me a pointer to the SikuliFirefoxDriver stuff, so I can check what it is.

Region.text() is indeed a feature of SikuliX, and yes, the OCR feature is switched off by default. It has to be switched on, which requires an appropriately set up sikulixapi.jar (see http://sikulix.com).

Dan (mendels99) said :
#4

SikuliFirefoxDriver can be found here:

https://code.google.com/p/sikuli-api/wiki/SikuliWebDriver

It looks like you turn on the OCR with:

Settings.OcrTextSearch
Settings.OcrTextRead

But my jar files don't seem to have the

org.sikuli.script.Settings

or there may be a conflict with something else I am using. I was hoping there was a switch for the SikuliFirefoxDriver directly.

RaiMan (raimund-hocke) said :
#5

OK, I will try again to sort this out:

There are two branches that have evolved over the last 5 years:

- SikuliX
represented now by http://sikulix.com (the original developers left, I took over)
this is the follow-up to the original Sikuli versions 0.9.x - 0.10.x, finally ending up with SikuliX1.0rc3
pointers to the historic places are available on the above page
the current version is 1.1.0 (pre-final)

- Sikuli Java API
represented by http://lab.sikuli.org/about/ (one of the original developers of Sikuli)
latest versions are from 2013

*** Both branches are to some extent feature compatible, but NOT compatible at the usage level or at any available API level.

SikuliFirefoxDriver uses Sikuli Java API, but the mentioned OCR/text feature is from SikuliX.

So you have to decide which branch you want to use.

IMHO: there is no need to use SikuliFirefoxDriver. There are many examples on the net that show how to combine Selenium and SikuliX at various usage and API levels.

peter huang (huangsheng2) said :
#6

I would like to know more about using Selenium and SikuliX together at various usage and API levels. RaiMan, can you share some good resources that you consider useful as references, besides the SikuliX documentation?

I have not found many real and reasonably complete samples.

RaiMan (raimund-hocke) said :
#7

Selenium and SikuliX can easily be used side by side, since WebDriver implementations are available for Java and Python/Jython.
For both worlds there are many examples available on the net.
There is no real overlap in features, since Selenium works with the internals of the web page and Sikuli with its visual representation.

So if you understand Selenium and understand SikuliX, you have all you need to use them both within one scripting or Java environment.

Things only get a bit trickier if higher-level frameworks come into play, e.g. RobotFramework. But then we are talking about aspects other than using Selenium AND SikuliX side by side.

But since I myself have just started to make the integration of SikuliX a bit smoother, maybe I will find some sources on the net.
Just look at the relevant doc sections from time to time.
You can get notified of changes if you watch the commits in the doc base at:
https://github.com/RaiMan/SikuliX-2014-Docs
I will be a bit more verbose when making changes ;-)

Dan (mendels99) said :
#8

I have been trying for a week or so, and I can't seem to get it to do what I want. The SikuliWebDriver doesn't find images on the Selenium browser page (this seems like a bug to me, but no one in that world responds to messages). I can find the images using just the Sikuli Java API, but that uses a screenshot of the desktop to find the image and click the mouse. I want to create a Screen from the Selenium WebDriver screenshot instead. That way I can run multiple instances of Selenium at a time, and it does not depend on the browser being visible on the screen. Do you have some example code that does that? I can get a BufferedImage of the screen from Selenium; I just don't see how to create a Screen from that.
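
To make the idea concrete, here is a minimal sketch in plain Java of searching a BufferedImage (such as the one obtained from the Selenium screenshot) for a smaller image, without touching the desktop. It is not Sikuli or Selenium code: ScreenshotMatcher and findExact are hypothetical names, and the comparison is a naive exact pixel match, whereas Sikuli uses tolerant image matching. The page image in main is a synthetic stand-in for a real screenshot.

```java
import java.awt.Point;
import java.awt.image.BufferedImage;

// Hypothetical helper for illustration only; not part of Sikuli or Selenium.
class ScreenshotMatcher {

    // Returns the top-left corner of the first exact occurrence of needle
    // inside haystack (scanning row by row), or null if there is none.
    static Point findExact(BufferedImage haystack, BufferedImage needle) {
        int maxX = haystack.getWidth() - needle.getWidth();
        int maxY = haystack.getHeight() - needle.getHeight();
        for (int y = 0; y <= maxY; y++) {
            for (int x = 0; x <= maxX; x++) {
                if (matchesAt(haystack, needle, x, y)) {
                    return new Point(x, y);
                }
            }
        }
        return null;
    }

    // Compares every pixel of needle with haystack at offset (ox, oy).
    private static boolean matchesAt(BufferedImage haystack, BufferedImage needle,
                                     int ox, int oy) {
        for (int y = 0; y < needle.getHeight(); y++) {
            for (int x = 0; x < needle.getWidth(); x++) {
                if (haystack.getRGB(ox + x, oy + y) != needle.getRGB(x, y)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Synthetic stand-in for a browser screenshot: black, 40x30 pixels.
        BufferedImage page = new BufferedImage(40, 30, BufferedImage.TYPE_INT_RGB);
        // Draw a 5x4 red "button" with its top-left corner at (12, 7).
        for (int y = 7; y < 11; y++) {
            for (int x = 12; x < 17; x++) {
                page.setRGB(x, y, 0xFF0000);
            }
        }
        BufferedImage button = page.getSubimage(12, 7, 5, 4);
        Point hit = findExact(page, button);
        System.out.println(hit.x + "," + hit.y); // prints 12,7
    }
}
```

The returned point is in screenshot coordinates, so the click itself would still have to be sent through Selenium rather than through the desktop mouse. An exact match like this is likely to fail on real screenshots with anti-aliasing or scaling, which is where a tolerant matcher is needed.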
