[1.1.0] Is there a way to integrate a (background) observer with wheel()?

Asked by Glowing Crystalline Entity

Is there a way to integrate a (background) observer with wheel(), so that when an object/image comes into view, the scrolling will stop?

Question information

Language:
English
Status:
Solved
For:
SikuliX
Assignee:
No assignee
Solved by:
RaiMan
Best RaiMan (raimund-hocke) said :
#1

This works with 1.1.0 (tested on OS X 10.11):

switchApp("myApp") # bring the app to front
r = App.focusedWindow() # the search area
imgStop = "imgStop.png" # the stop image to wait for

stopAppeared = r.onAppear(imgStop) # no handler needed, since observing stops with first appearance
# save a reference to the event
r.observeInBackground(FOREVER) # start observation
while not r.isObserving(): wait(0.3) # hack currently needed, to use isObserving()

r.hover() # move mouse to a point, where wheel is accepted
while r.isObserving(): # until observing is stopped
  wheel(WHEEL_UP, 10) # wheel (might be WHEEL_DOWN on Windows)
  wait(1)

... and this as a goody on top ;-)
m = r.getEvent(stopAppeared).getMatch()
hover(m)
m.highlight(2)
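
The loop above keeps wheeling for as long as the observer runs, so it never ends if the stop image never appears. The following is a minimal sketch of a bounded variant, not part of the original answer. The function name wheel_until_stopped and its parameters are made up for illustration; it only assumes a region object that offers isObserving() and stopObserver(), as a SikuliX Region does.

```python
import time


def wheel_until_stopped(region, wheel_step, max_rounds=50, pause=1.0,
                        sleep=time.sleep):
    """Wheel while the region's background observer is still running.

    region     -- object with isObserving() and stopObserver(); in a script
                  this would be a Region on which onAppear() and
                  observeInBackground() have already been called
    wheel_step -- zero-argument callable doing one burst of wheeling
    Returns True if the observer stopped by itself (the image appeared),
    False if max_rounds was used up first.
    """
    for _ in range(max_rounds):
        if not region.isObserving():
            return True
        wheel_step()
        sleep(pause)  # give the background search time to see the new content
    if region.isObserving():
        region.stopObserver()  # give up, do not leave the observer running
        return False
    return True


# Assumed usage inside a SikuliX script (names taken from the answer above):
#   stopAppeared = r.onAppear(imgStop)
#   r.observeInBackground(FOREVER)
#   r.hover()
#   found = wheel_until_stopped(r, lambda: wheel(WHEEL_UP, 10))
```

The sleep function is passed in only so the loop can be tried out without real waiting.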

Glowing Crystalline Entity (glowingcrystallineentity) said :
#2

This seems to be working, cool. :)

I realize I was actually thinking about some sort of "interrupt" functionality. That is, when the background observer sees the image, it would generate an interrupt to stop the mouse wheel scrolling in the foreground thread; rather than the foreground thread polling for the event. I know something like that could be done in Java... not sure about Python (which I'm using at the moment).

Glowing Crystalline Entity (glowingcrystallineentity) said :
#3

Thanks RaiMan, that solved my question.

RaiMan (raimund-hocke) said :
#4

SikuliX is basically always searching for some image on the screen (the visual approach), and the workflow waits for this to happen (you might call this polling).
observe only means that the search loop is detached to a thread, which makes the communication a bit more complicated.
But the main workflow still waits for some image to appear and, in this case, has to take care of changing the screen in such a way that the hidden image appears.

Maybe reading this is helpful:
https://sikulix.wordpress.com/2015/09/18/sikulix-how-does-it-find-images-on-the-screen/

Glowing Crystalline Entity (glowingcrystallineentity) said :
#5

I was specifically referring to this loop as "polling":

while r.isObserving(): # until observing is stopped
  wheel(WHEEL_UP, 10) # wheel (might be WHEEL_DOWN on Windows)
  wait(1)

In this context, it's fine as it is (and indeed, your suggestion above has been working well for me :) ), but when I can, I try to do multi-threaded coding in such a way that I don't have to "ask" another thread whether it's done, but rather am notified ("interrupted") when it is. Sometimes that resolves to some lower-level polling anyway, depending on the architecture available, sometimes not. It can also have other difficulties, again depending on the architecture and the specific facilities available. It's also a little bit academic for the moment. ;-)

RaiMan (raimund-hocke) said :
#6

Yes, rather academic, and from my point of view not really applicable to this case.

For interrupting to make sense, you should be busy with something that gets interrupted by some event. At that time you decide to do something else based on the type of interruption, and then continue what you were doing before.

observeInBackground together with a handler is in principle something like that:
- your main workflow runs
- the observe runs in parallel
- the event happens and the callback is processed (see comment)
- the observe might be continued
- ... while your workflow is still running

comment:
From my point of view, the handler is the point of interruption (you might implement something here that interrupts your main workflow ... but for what?).

If in this case there had been a compound feature like
wheelDownUntilAppear("someImage")

which simply did what it says, you would not have noticed the polling.
What else would you have wanted to do in this workflow besides waiting for the image to become visible?

... yes, rather academical ;-)

Glowing Crystalline Entity (glowingcrystallineentity) said :
#7

Well, in this case, you scroll/wheel 10 steps, then wait for a second (or even a tenth), then scroll/wheel again... and it's very jerky. That's maybe just aesthetic, but if you were actually doing a calculation for example, you'd be slowing down your computation speed. (... or a data transfer)

So you're right, that for interrupting to make sense, you need to be busy with something else, and in this case you're busy wheeling the mouse. So what I had initially implicitly imagined, was that I'd start one thread that would just wheel away without end, and another that would watch for the condition by which I wanted to stop, and then signal the first thread to do so -- at which point it could terminate or continue with other work.

Of course this sort of architecture only makes sense in a multi-processor (as opposed to merely multi-threaded) environment... but almost every computer these days is such. But the possibility, as you mention, of polling being hidden behind an API is what I meant by "... resolves to some lower-level polling anyway...".

A handler isn't really a point of interruption -- it necessarily can't run on a thread that's busy with other work, or in general know to interrupt/stop such a thread -- but it is an opportunity for the API user to create their own interrupt, if that's what they want to do. The reasons to do so might be those above, and I'm sure there are others.
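
A minimal sketch of that idea in plain Python, under stated assumptions: the handler does nothing but set a threading.Event, and the wheeling loop sleeps on that event, so it wakes up as soon as the signal arrives instead of asking the observer whether it is still running. The names wheel_until_signalled, make_handler and simulate are made up for illustration. A timer stands in for the SikuliX observer so the block runs without SikuliX.

```python
import threading


def make_handler(stop_event):
    """Build an observer callback that only signals the wheeling loop."""
    def handler(event):
        stop_event.set()
    return handler


def wheel_until_signalled(wheel_step, stop_event, pause=0.05, max_steps=10000):
    """Call wheel_step() until stop_event is set; return the number of steps."""
    steps = 0
    while steps < max_steps and not stop_event.is_set():
        wheel_step()
        steps += 1
        stop_event.wait(pause)  # sleeps, but returns at once when signalled
    return steps


def simulate(delay=0.1):
    """Dry run: a timer plays the observer and fires the handler after delay."""
    stop = threading.Event()
    wheeled = []
    timer = threading.Timer(delay, make_handler(stop), args=[None])
    timer.start()
    steps = wheel_until_signalled(lambda: wheeled.append(1), stop, pause=0.01)
    timer.join()
    return steps, len(wheeled), stop.is_set()


# Assumed usage inside a SikuliX script (not tested here):
#   stop = threading.Event()
#   r.onAppear(imgStop, make_handler(stop))
#   r.observeInBackground(FOREVER)
#   wheel_until_signalled(lambda: wheel(WHEEL_UP, 1), stop)
# Whether the handler must also stop the observer explicitly may depend on
# the SikuliX version.
```

The loop still looks at the event between wheel steps, so this is notification with a very cheap check rather than a true preemptive interrupt.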

RaiMan (raimund-hocke) said :
#8

I understand what you mean, but you have to always keep in mind how SikuliX works:
1. make a screenshot
2. search the image in the screenshot
3. stop the search if found, else continue with 1 until the max wait time is reached (then stop)

Step 2 is the time-consuming part here.
So if you do not somehow synchronize the wheeling and the search, in the "independent" solution the wheeling might overtake the searching and the target might be missed.
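
The three steps can be sketched as a generic loop. This is only an illustration of the principle, not SikuliX's actual implementation; the name wait_for_target and the capture and search callables are placeholders for whatever takes the screenshot and does the matching.

```python
import time


def wait_for_target(capture, search, max_wait, scan_pause=0.0,
                    clock=time.time, sleep=time.sleep):
    """Repeat screenshot + search until a match is found or time runs out.

    capture -- zero-argument callable returning the current screen content
    search  -- callable taking that content, returning a match or None
    Returns the match, or None once max_wait seconds have passed.
    """
    deadline = clock() + max_wait
    while True:
        shot = capture()         # step 1: make a screenshot
        match = search(shot)     # step 2: search the image (the slow part)
        if match is not None:    # step 3: stop if found ...
            return match
        if clock() >= deadline:  # ... or when the max wait time is reached
            return None
        sleep(scan_pause)
```

Anything that scrolls past between two calls of capture() is never seen by search(), which is why unsynchronized wheeling can miss the target.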

Looking from this perspective, the solution we have agreed on is a bit risky, since even here the wheeling is in fact independent of the search. The wheeling slowdown in the polling loop somehow "guarantees" that the search gets a chance to grab all parts of the web page. Hence, the faster the wheeling, the greater the chance of missing the target.

The mentioned wheelDownUntilAppear("someImage") would have to be implemented in a safe way anyway:

- find how much one wheel step moves the page
- calculate how many wheel steps it takes to move the page to the next part
- start loop
  - search
  - if found then stop
  - wheel down on sub-page

Only this approach ensures that the target is matched under all circumstances when it becomes visible.
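
A minimal sketch of such a synchronized loop, assuming the number of wheel steps per sub-page is already known. wheelDownUntilAppear does not exist in SikuliX; the function below, its name and its parameters are hypothetical. The only thing it expects from the region is exists(image, 0), which in SikuliX searches once without waiting and returns a match or None.

```python
import time


def wheel_down_until_appear(region, image, wheel_down, steps_per_page,
                            max_pages=100, settle=0.3, sleep=time.sleep):
    """Search, and only wheel on to the next sub-page if nothing was found.

    region         -- object with exists(image, seconds), e.g. a Region
    wheel_down     -- callable taking the number of wheel steps to perform
    steps_per_page -- wheel steps that move the page by one sub-page
    Returns the match, or None if max_pages sub-pages were searched in vain.
    """
    for _ in range(max_pages):
        match = region.exists(image, 0)  # search the part visible right now
        if match is not None:
            return match                 # found: stop without wheeling on
        wheel_down(steps_per_page)       # move to the next sub-page
        sleep(settle)                    # let the page finish redrawing
    return region.exists(image, 0)       # check the last sub-page as well


# Assumed usage inside a SikuliX script:
#   r.hover()
#   m = wheel_down_until_appear(r, "someImage.png",
#                               lambda n: wheel(WHEEL_DOWN, n), 5)
```

Because the wheeling never runs while a search is in progress, the target cannot be overtaken, at the cost of a visibly stepwise scroll.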

Another possibility would be to capture the complete page and calculate the amount of wheeling needed to make the target visible. But this is only an option for web pages, using appropriate libraries, and hence not an option for SikuliX
(it would not be WhatYouSeeIsWhatYouScript).

Glowing Crystalline Entity (glowingcrystallineentity) said :
#9

Yes indeed -- and even the pause-move of the above loop is, as you say, a bit risky: my current project's code still sometimes overshoots. It helps though to make the search area small, and other tricks that speed up step 2.

There is one special case, though, where that wouldn't be an issue -- namely, when you're scrolling to the very end of a page. The scrolling would just stop at that point even if the wheel'ing continued for a while, until the image that indicated the end got recognized. It's a special case, albeit one that is in my current project. And when that's the case, it's nice to make the scrolling as smooth and fast as possible, but still stop shortly after it reaches the bottom (or top). That's when it's nice to have a background thread that just does step 2 repeatedly.

Can you really find out how much a wheel step moves a page, for all platforms and all programs? I don't know, but I'd be kind of surprised. I see enough variation between programs on just my Win7, that there must be multiple factors that affect it. I'd think that if you take that single-threaded polling approach, you'd want to give the user some control over how often the screen shots ("polls") are taken.

There's also the issue that the search area might be made so small that a single wheel step moves the target image through or out of it. It would be good to know the wheel step size for that reason as well, and provide a warning -- again, if possible.

(... and btw, to emphasize something I at least implied earlier -- I mentioned the "interrupt driven" approach more to explain the wording of my original question/comment, than to suggest that it would be a high priority or significant improvement to actually do. Again, OCR is much more important. ;-) But these discussions can be interesting nonetheless.)

RaiMan (raimund-hocke) said :
#10

--- That's when it's nice to have a background thread that just does step 2 repeatedly.
... exactly what observeInBackground is doing. I forgot to mention the ObserveScanRate.

--- Can you really find out how much a wheel step moves a page?
... yes you can:
- just capture a distinctive but smaller part of the screen as an image
- save the match where it is now
- do a wheel with 1 step
- find the same image again
- evaluate the step width by comparing the two matches.
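
These steps could be sketched as follows. The function name measure_wheel_step is made up; it assumes a region with find(image) whose result offers getX() and getY(), as a SikuliX Match does, and a caller-supplied callable that wheels exactly one step. In SikuliX, find() raises FindFailed if the image is not visible, which this sketch does not catch.

```python
import time


def measure_wheel_step(region, image, wheel_one_step, settle=0.5,
                       sleep=time.sleep):
    """Return (dx, dy): how far one wheel step moves the image, in pixels.

    region         -- object with find(image) returning a match
    wheel_one_step -- zero-argument callable that wheels exactly one step
    A negative dy means the content moved up on the screen.
    """
    before = region.find(image)               # where the image is now
    x0, y0 = before.getX(), before.getY()
    wheel_one_step()                           # wheel with 1 step
    sleep(settle)                              # wait for the redraw
    after = region.find(image)                 # find the same image again
    return after.getX() - x0, after.getY() - y0


# Assumed usage inside a SikuliX script:
#   r.hover()   # the mouse must be where the wheel is accepted
#   dx, dy = measure_wheel_step(r, "marker.png",
#                               lambda: wheel(WHEEL_DOWN, 1))
```

The marker image should be distinctive and small enough to stay inside the region after one step.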

--- that the search area might be made so small ...
... yes, you have to ensure that the target image can be found in the search area.
Using the class Image with 1.1.0 allows you to get the image size after having created the image in memory, but before doing the search. This allows you to adjust both the search area size and the wheel stepping.
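
Once the image height, the search area height and the pixels moved per wheel step are known, the stepping can be worked out with simple arithmetic. The sketch below is an assumption about how one might do it, with a made-up function name, and takes plain numbers rather than calling the Image class. The reasoning: the target is completely inside the search area at some stop only if each scroll moves the page by at most (area height - image height) pixels.

```python
def plan_wheel_steps(region_height, image_height, pixels_per_step):
    """Return the largest safe number of wheel steps between two searches.

    Raises ValueError if even a single wheel step could carry the target
    through the search area without it ever being completely visible.
    """
    step = abs(int(pixels_per_step))
    if step == 0:
        raise ValueError("pixels_per_step must not be zero")
    usable = int(region_height) - int(image_height)
    if usable < step:
        raise ValueError(
            "search area too small: %d usable pixels, but one wheel step "
            "moves %d pixels" % (usable, step))
    return usable // step


# Example: a 600 px high search area, a 100 px high target image and
# 48 px per wheel step leave 500 usable pixels, i.e. 10 steps per search.
print(plan_wheel_steps(600, 100, 48))
```

The ValueError doubles as the warning for a search area that is too small for the wheel step size.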

--- But these discussions can be interesting nonetheless
I like these discussions because they help me to understand what people want to do with SikuliX and what their expectations about features are. And they help me to decide where to put my effort in.

Glowing Crystalline Entity (glowingcrystallineentity) said :
#11

Ahhh, clever! :)

And in fact... for the case of scrolling, if you know how much a wheel step moves the page, and you can figure out how far in total you need to move the page (not always, but at least sometimes possible), then you just figure out a priori how many steps you need to take, and you don't even need to worry about background observers and matching and such.

Matching a target could just be used as a validation then, and would go more into error handling than more scrolling if it failed.
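
A rough sketch of that approach, with made-up names and under the assumption that both the total distance and the pixels per wheel step are known: compute the step count up front, wheel once, and use the match only as a check afterwards.

```python
def steps_for_distance(total_pixels, pixels_per_step):
    """Return the number of whole wheel steps needed to cover total_pixels."""
    total = abs(int(total_pixels))
    step = abs(int(pixels_per_step))
    if step == 0:
        raise ValueError("pixels_per_step must not be zero")
    return (total + step - 1) // step  # integer division, rounded up


def scroll_and_validate(region, image, wheel_down, total_pixels,
                        pixels_per_step, timeout=3):
    """Wheel the precomputed number of steps, then validate with one match.

    region     -- object with exists(image, seconds), e.g. a SikuliX Region
    wheel_down -- callable taking the number of wheel steps to perform
    Returns (steps, match); match is None if the validation failed, which
    the caller would treat as an error rather than scroll further.
    """
    steps = steps_for_distance(total_pixels, pixels_per_step)
    if steps:
        wheel_down(steps)
    return steps, region.exists(image, timeout)
```

Rounding up means the page may move slightly further than total_pixels, so the validation image should tolerate a small offset.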

(... and I'm also having a problem with false matches, which this approach might also help alleviate.)

--- where to put my effort in

OCR! OCR! OCR! ;-)