Bug #1434434 “[request] want a workflow recording tool with supp...” : Bugs : SikuliX

RaiMan (raimund-hocke) on 2015-03-20

Changed in sikuli:
status:	New → In Progress
importance:	Undecided → High
assignee:	nobody → RaiMan (raimund-hocke)
milestone:	none → 2.0.0

RaiMan (raimund-hocke) wrote on 2015-03-24:

#1

--- from the related question
by https://launchpad.net/~emaslov1

Hi all,
I have used a recorder for several years that captures the images and puts the actions into the workflow, but I finally found that typing scripts by hand in Sikuli is more efficient, because:
1. Buttons and other screen elements almost always get highlighted and look different when the mouse is over them. Moreover, in most remote desktops the mouse pointer is drawn into the screenshots. Therefore the recorded images are too often useless for playback, which searches for the images on the screen.
2. Even if not highlighted, the images are captured without any meaningful structure, named mostly by time, and placed into the scripts by those names. When you have several dozen test scripts, the system becomes unmaintainable: when programmers change the icon on one button, you have to update every script manually. It is often even easier to record them again than to update them. The problem can be solved if you have a GUI map of common images and ensure that the scripts refer to the same images when you click them during recording, but that requires image recognition capability in the recorder, which can really become a large task. If you write the script manually instead, you can easily arrange it so that you change only one image when something changes in the tested application.
3. Manually you write clever scripts that find screen contents by relative positions with left(), above() etc., so that the scripts survive future GUI changes during software development. With an automatic recorder it is quite difficult to make it understand which spatial relations you mean. It is still possible to record the actions and then edit the script manually, inserting the necessary logic, but in this case the recording itself does not save significant time.

So I suggest that writing a practically useful recorder may require much more work than it initially seems.

RaiMan (raimund-hocke) wrote on 2015-03-24:

#2

--- from the related question
by https://launchpad.net/~denis-work-acc

Hi Eugene!

I agree with all the points you mentioned, but let me share why I found the script writer extremely useful. For now, I use it simply to take pictures. It has significantly reduced the time I need to capture the required images. The panel is always on top of my screen, and when I need to take some screenshots I just press the Record button and start cutting. When I press Stop, I get all my clicks written to a script and all my images stored in one particular place. Then I copy the commands to my main script and continue coding.
I save time by not using a snipping tool (on Linux in particular there is no tool as handy as the one on Windows), not thinking about image names, not choosing a directory to save the images in, and not writing the click commands.

RaiMan (raimund-hocke) wrote on 2015-03-24:

#3

@Denis & Eugene
Thanks for your valuable input.
An additional aspect: there are surely different requirements for use in a visual testing environment than for automation.

For automation purposes you need reliable scripts that even handle special situations without crashing.
In testing it is OK to just let a test fail if something is not as it should be.

Automation scripts usually are tailored to solve a specific problem, usually for a fixed environment.
In testing today, the tests have to cover many system and rendering (mainly with browsers) environments.

In automation you want to be sure that after a workflow step is processed, the intended operational result is achieved.
In testing the sequence usually is: when I do this, the GUI should look like that afterwards. GUI testing is also often separated from operational, data-related testing (for which testers will surely not use a tool like SikuliX).

So a SikuliX recorder should surely support sophisticated image capturing (naming, organising, optimisation, variants, ...).
With a little effort, this is already possible today:
- use the capture hotkey, which in most cases lets you capture GUI elements the way they are shown at runtime
- switch on manual naming, so you can apply a naming convention from the beginning
- separate code from images (code in one .sikuli; images, patterns and other image-related information in another .sikuli), so you can decide at runtime which one to use via import

In my opinion it would be a step in the right direction to force the user to first decide on the element's name and some attributes (image only and/or an element that might be acted on by mouse and/or keyboard, a possible click point relative to other images already in stock, ...) and then capture the image and store it in an organised way (image groups).

Eugene Maslov (emaslov1) wrote on 2015-03-24:

#4

@Raimund and Denis,

May I share some thoughts about the Recorder that I have collected over the last years?
I deal mostly with automated GUI tests, where maintainability of many tests is one of the main factors, so I always have to store common things in common places, not in the test scripts themselves.

So this is the recording scenario that I imagine.

- The user has all existing screenshots of buttons etc. in, e.g., a "guimap.sikuli" folder.
- If multiple image search is used, the folder also contains a guimap.py file that describes the arrays of images as variables, like this:
    menu_file="menu_file.png"
    button_ok=["button_ok.png","button_ok_highlighted.png"] #(referred below as "image group")
- The user activates the recording mode.
- Clicks, key presses, etc. are recorded to the script as Denis already described
- When the user clicks on some control, the Recorder:
    --Gets some area around the mouse
    --Takes, one by one, all images from the guimap.sikuli folder and checks whether one of them exists in the area around the mouse
    --As soon as an image is found, the name of the image (or, even better, the name of the variable containing this image) is put into the script, like this:
        click(button_ok) # or click(self.button_ok), depending on whether the tests are structured as objects or not
    --If none of the images fit, the Recorder asks the user to crop and name a new image and to add it to some "image group", then stores the image in guimap.sikuli and, desirably, adds the name to guimap.py
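
The lookup step of this scenario can be sketched in plain Python. This is only an illustration under assumptions: GUI_MAP, find_gui_name and record_click are hypothetical names, and the matches callback stands in for whatever image search the Recorder would run on the area around the mouse.

```python
# Hypothetical GUI map, mirroring guimap.py: a variable name maps to one image
# or to an "image group" (several renderings of the same control).
GUI_MAP = {
    "menu_file": "menu_file.png",
    "button_ok": ["button_ok.png", "button_ok_highlighted.png"],
}


def find_gui_name(gui_map, matches):
    """Return the first variable name whose image (or any image of its group)
    is reported as present by matches(image_name), else None."""
    for name, images in gui_map.items():
        if isinstance(images, str):
            images = [images]
        if any(matches(image) for image in images):
            return name
    return None


def record_click(gui_map, matches):
    """Return the script line for a click, or None if a new image is needed."""
    name = find_gui_name(gui_map, matches)
    if name is None:
        return None  # the Recorder would now ask the user to crop and name an image
    return "click(%s)" % name


# Stand-in for the image search: pretend only the highlighted OK button is visible.
visible = {"button_ok_highlighted.png"}
line = record_click(GUI_MAP, lambda image: image in visible)
print(line)
```

Because the script line refers to the variable name and not to a file, replacing one image in the GUI map would update every script that uses it.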

It would be much better if the Recorder took the image not from the current screen, which contains the highlighted element, but from (or also from) the screen as it was a few seconds before, when the element was still untouched. For that, it only needs to remember the screens of the last few seconds and select the image from before the last change of the screen.
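
The idea of remembering the last few screens can be sketched with a small ring buffer. This is a sketch under assumptions: ScreenHistory is a hypothetical class, and the frames here are plain strings standing in for screenshots, which a real recorder would have to compare with some tolerance.

```python
from collections import deque


class ScreenHistory:
    """Keeps the last few screen frames so that the frame from before the
    latest visual change can be recovered."""

    def __init__(self, max_frames=5):
        # Older frames drop out automatically once max_frames is reached.
        self.frames = deque(maxlen=max_frames)

    def add(self, frame):
        self.frames.append(frame)

    def before_last_change(self):
        """Return the newest frame that differs from the current one,
        or None if the screen did not change within the history."""
        if not self.frames:
            return None
        current = self.frames[-1]
        for frame in reversed(self.frames):
            if frame != current:
                return frame
        return None


history = ScreenHistory(max_frames=4)
for frame in ["idle", "idle", "hover", "hover"]:
    history.add(frame)
print(history.before_last_change())
```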

To get a basis for setting spatial relations, it would be very desirable to take screenshots not only on clicks etc. but also on mouse hovers without clicks, inserting find() into the script. For that, the user could press some key instead of the mouse button, for example F12. With this trick, the user can create an anchor, e.g. find a label in order to find the check box to the left of it. For now, the meaning of the anchor can be set manually by editing the script.

Best regards
Eugene

RaiMan (raimund-hocke) wrote on 2015-03-24:

#5

@Eugene
yep, that is the main line to follow.

My idea is to generally use keys to tell the recorder what to do:
e.g. ctrl-alt-action, where action could be
c for click
d for double click
r for right click
m for a complex mouse action (configured on the fly)
t for type
p for paste
s for a configurable wheel action (scroll)
w for a wait on this image
v to wait for the image to vanish
….
This is accompanied by some floating palettes with situation-dependent content,
so the user would just move the mouse to the next place and press the relevant key.
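
The key-to-action mapping described above could be held in a simple dispatch table. The following is a minimal sketch, not the planned implementation: ACTIONS, on_hotkey and the step format are hypothetical, and the configurable actions (m and s) are left out for brevity.

```python
# Key letters as proposed above; the action labels on the right are hypothetical.
ACTIONS = {
    "c": "click",
    "d": "doubleClick",
    "r": "rightClick",
    "t": "type",
    "p": "paste",
    "w": "wait",
    "v": "waitVanish",
}


def on_hotkey(key, mouse_pos, log):
    """Append a recorded step for ctrl-alt-<key> pressed at mouse_pos.
    Returns False for keys that have no action assigned."""
    action = ACTIONS.get(key)
    if action is None:
        return False
    log.append({"action": action, "at": mouse_pos})
    return True


steps = []
on_hotkey("c", (120, 45), steps)
on_hotkey("w", (300, 200), steps)
on_hotkey("x", (0, 0), steps)  # not assigned, so nothing is recorded
print(steps)
```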

I am watching a promising Java library that allows intercepting keyboard and mouse actions at the system level, so this would not need any hotkey usage.

Going this way, we will not have any problems with visually changing elements, since the recorder can:
- move the mouse away and capture the inactive state
- move the mouse back and capture the active state
if the user signals that it is a "vivid" (visually changing) element (or it might even be checked automatically - only some 10 milliseconds)

The recorder's output will be some metadata (I guess I will use YAML) that can be edited and saved.

You might get it as Python, Ruby, JavaScript or Java code (other translators can be added/contributed) to use in your script or, if sufficient, just run it as is. (This is similar to what Selenium offers.)
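
How such translators might work can be sketched as follows. Everything here is an assumption for illustration: the step format, STEPS, to_python, to_java and TRANSLATORS are hypothetical, and the steps are given as plain Python data where the recorder would load them from its metadata file. String escaping is not handled.

```python
# Hypothetical recorded steps, as they might look after loading the metadata.
STEPS = [
    {"action": "click", "image": "button_ok.png"},
    {"action": "type", "text": "Hello"},
    {"action": "wait", "image": "done.png", "timeout": 10},
]


def _arguments(step):
    """Build the argument list shared by both output languages."""
    args = []
    if "image" in step:
        args.append('"%s"' % step["image"])
    if "text" in step:
        args.append('"%s"' % step["text"])
    if "timeout" in step:
        args.append(str(step["timeout"]))
    return ", ".join(args)


def to_python(steps):
    """Render the steps as script lines in Python syntax."""
    return ["%s(%s)" % (s["action"], _arguments(s)) for s in steps]


def to_java(steps, receiver="s"):
    """Render the steps as Java statements called on a screen object."""
    return ["%s.%s(%s);" % (receiver, s["action"], _arguments(s)) for s in steps]


# Further translators could be registered here.
TRANSLATORS = {"python": to_python, "java": to_java}

for line in TRANSLATORS["python"](STEPS):
    print(line)
```

Keeping the recorded workflow as data and generating code only on export is what would make adding another target language a matter of adding one function.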

After a workflow snippet is recorded, one might run it in the recorder to automatically add some timing parameters such as maximum waiting times.

Brian Redmond (bredmond) wrote on 2015-03-24: Re: [Bug 1434434] Re: [request] want a workflow recording tool with support for runtime adjustments to the workflow

#6

Hi everyone, this is simple, but might be helpful.  I wrote this to speed
up image capture for myself.  This uses Sikuli itself to prompt for a
region of interest on the screen and then automatically capture a new image
every time the region changes (think of it as motion detected screen
capture).  Then you have a folder of timestamped images you can rename and
use.  Great for automating things like installers.

from os import makedirs, rename
from time import strftime

from sikuli.Sikuli import *

# Captured images are stored in the script's bundle folder.
script_capture_path = getBundlePath()

try:
    makedirs(script_capture_path)
except OSError:
    pass  # the folder already exists


def save_capture(region):
    # capture() returns the path of a temporary file; move it to a timestamped name
    image_name = "%s/%s.png" % (script_capture_path, strftime("%a %Hh%Mm%Ss"))
    rename(capture(region), image_name)
    return image_name

#watch_area = SCREEN
watch_area = selectRegion()
permanent_image_name = save_capture(watch_area)

while True:
    try:
        # still matching: the watched region has not changed (search only that region)
        watch_area.wait(Pattern(permanent_image_name).similar(.99), 1)
    except FindFailed:
        # the region changed: capture its new state
        permanent_image_name = save_capture(watch_area)

On Tue, Mar 24, 2015 at 10:32 AM, RaiMan <rmhdevelop@me.com> wrote:

> @Eugene
> yep, that is the main line to follow.
>
> My idea is to generally use keys to tell the recorder what to do:
> e.g. ctrl-alt-action, where action could be
> c for click
> d for double click
> r for right click
> m for a complex mouse action (configured on the fly)
> t for type
> p for paste
> s for a configurable wheel action (scroll)
> w for a wait on this image
> v to wait for the image to vanish
> ….
> this is accompanied by some floating palettes with situation dependent
> content.
> so the user would just move the mouse to the next place and press the
> relevant key.
>
> I am watching a promising Java library, that allows to intercept
> keyboard and mouse actions at the system level, so this would not need
> any hotkey usage.
>
> Going this way, we will not have any problems with visually changing
> elements, since the recorder can:
> - move the mouse away and capture the inactive state
> - move the mouse back and capture the active state
> if the user signals, that it is a vivid element (or it might even be
> checked automatically - only some 10 milliseconds)
>
> the recorder outcome will be some meta (I guess I will use YAML), that
> can be edited and saved.
>
> You might get it as Python, Ruby, JavaScript or Java code (other
> translators can be added/contributed), to use it in your script or if
> sufficient, just run it as is. (This is similar to what Selenium
> offers).
>
> After a workflow snippet is recorded, one might run it in the recorder
> to automatically add some timing parameters like max waiting times.
>
> --
> You received this bug notification because you are subscribed to Sikuli.
> https://bugs.launchpad.net/bugs/1434434
>
> Title:
>   [request] want a workflow recording tool with support for runtime
>   adjustments to the workflow
>
> Status in Sikuli:
>   In Progress
>
> Bug description:
>   Hi guys!
>
>   @RaiMan, could you kindly advise me if you have done a script
>   recording tool? If you haven't, would you be interested in mine? It is
>   some kind of floating panel with two buttons, Record and Stop, that can
>   capture pictures and write your typing. The output is simply a set of
>   commands, for example:
>
>   s.click("c:\\tmp\\myimgs\\PREF_20150320130001.PNG");
>   s.type("H");
>   s.type("e");
>   s.type("l");
>   s.type("l");
>   s.type("o");
>   s.type(" ");
>   s.type("g");
>   s.type("u");
>   s.type("y");
>   s.type("s");
>   s.type("!");
>   s.type(Key.ENTER);
>
>   You can copy them and paste them into your code. Probably I or someone
>   else could develop a plugin for Eclipse in the future that will insert
>   the commands directly into your code. For now, it will be an AWT
>   application. The main advantage of the tool is that it reduces the need
>   to take pictures manually and link them to the commands; the tool does
>   it for you.
>
>   I could also make a YouTube video. Let me know if you need one.
>
>   Thanks!
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/sikuli/+bug/1434434/+subscriptions
>

RaiMan (raimund-hocke) wrote on 2015-03-25:

#7

@Brian
Thanks for the reminder ;-)
I always wanted to implement something that avoids this (ugly) coding.

From tomorrow's build 1.1.0 on, this will do what you have done
(no need to import anything):

r = selectRegion()
img = r.saveCapture(getBundlePath()) # the initial capture
while True:
  while r.exists(Pattern(img).similar(0.99), 0):
    wait(1)
    if Mouse.at().x < 10:
      exit()
    wait(1)
  img = r.saveCapture(getBundlePath()) # the follow-up capture after a change

I added a termination solution:
if you move the mouse to the screen's left edge, the script will terminate.

absolute-path-of-imagefile = r.saveCapture( [path [ , name ] ] )

will save timestamped .png files as:
nothing given: <temp>/Sikulix/sikuliximage-1427279736087.png
path given: <path>/sikuliximage-1427279736087.png
path and name given: <path>/<name>-1427279736087.png

and returns the absolute path of the created .png, or "" if it failed for some reason.
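
The naming pattern listed above can be illustrated in plain Python, without calling SikuliX. This is only a sketch: capture_file_name and capture_time are hypothetical helpers, "<temp>/Sikulix" is kept as a literal placeholder, and reading the number as milliseconds since the Unix epoch is an assumption (the sample value is consistent with the date of this comment).

```python
import os
import time
from datetime import datetime, timezone


def capture_file_name(path=None, name=None, millis=None):
    """Compose a file name following the pattern listed above."""
    if millis is None:
        millis = int(time.time() * 1000)
    folder = path if path else "<temp>/Sikulix"
    base = name if name else "sikuliximage"
    return "%s/%s-%d.png" % (folder, base, millis)


def capture_time(file_name):
    """Recover the capture time from the numeric suffix of such a file name
    (assumes the suffix is milliseconds since the Unix epoch)."""
    stem = os.path.splitext(os.path.basename(file_name))[0]
    millis = int(stem.rsplit("-", 1)[1])
    return datetime.fromtimestamp(millis / 1000.0, tz=timezone.utc)


print(capture_file_name(millis=1427279736087))
print(capture_file_name("/tmp/shots", millis=1427279736087))
print(capture_file_name("/tmp/shots", "installer", millis=1427279736087))
print(capture_time("/tmp/shots/installer-1427279736087.png").date())
```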

Brian Redmond (bredmond) wrote on 2015-03-25:

#8

That's great, thanks Raiman!

RaiMan (raimund-hocke) on 2019-11-18

Changed in sikuli:
milestone:	2.0.0 → 2.1.0

RaiMan (raimund-hocke) on 2019-11-18

Changed in sikuli:
importance:	High → Medium

SikuliX

[request] want a workflow recording tool with support for runtime adjustments to the workflow
