OpenCV match score on Android isn't equal to SikuliX's on PC

Asked by WeiHun Huang

I tried to use OpenCV matchTemplate to reproduce the result of the SikuliX Finder.
The OpenCV version is 2.4.10 (OpenCV-2.4.10-android-sdk.zip).
The code is:

        // Load the base image (to search in) and the probe image (to search for) in color
        Mat img = Highgui.imread("/data/local/tmp/dotahome.png", 1);
        Mat img2 = Highgui.imread("/data/local/tmp/war.png", 1);
        Mat res = new Mat();
        // TM_CCOEFF_NORMED: the best match is at the maximum of the result
        Imgproc.matchTemplate(img, img2, res, Imgproc.TM_CCOEFF_NORMED);
        Core.MinMaxLocResult m = Core.minMaxLoc(res);
        Log.i(TAG, "maxVal = " + m.maxVal);
        Log.i(TAG, "maxLoc = " + m.maxLoc.x + "," + m.maxLoc.y);

and the output is
maxVal = 0.9522626996040344
maxLoc = 648.0,583.0

The following code was used with SikuliX nightly version 2015-01-16_01:10:

f = Finder("dotahome.png")
f.find("war.png")
if f.hasNext():
    m = f.next()
print m
print m.getScore()
print m.x, m.y

The result is
M[648,582 88x34]@S(S(0)[0,0 1920x1080]) S:0.92 C:692,599 [0/0 msec]
0.919825851917
648 582

The score and the location differ between OpenCV on Android and SikuliX.
Could you please help me resolve the difference?
Is my OpenCV usage the same as SikuliX's?
Thanks.

RaiMan (raimund-hocke) said: #1

yes, this might well be the case ;-)

I have not touched the Vision C++ implementation since 2012.
Currently on Windows/Mac the library ...VisionProxy... is built, I think ;-), with some OpenCV 2.4.x.

I have already implemented a Java-only version using the OpenCV Java API, but the latest tests with it were more than a year ago (it is internally switched off).

The big difference against what you are doing on Android:
To gain speed, the search starts with base and probe image both scaled down by a ratio that is determined by the minimum size a probe must have (12 x 12 pixels by default).
If we get a match in this scaled-down situation, it is taken as a success and the match coordinates are evaluated using the ratio, which might lead to rounding differences and different scores than with the actual size.
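
To picture the approach, here is a minimal sketch of such a scaled-down search with the OpenCV 2.4 Java API (the class and method names are mine, not the actual SikuliX code; the 12 px minimum edge is the value mentioned above):

    import org.opencv.core.Core;
    import org.opencv.core.Mat;
    import org.opencv.core.Point;
    import org.opencv.core.Size;
    import org.opencv.imgproc.Imgproc;

    public class ScaledSearch {
        static final double MIN_PROBE_EDGE = 12.0;

        // Returns the approximate match location in original base coordinates.
        static Point findScaled(Mat base, Mat probe) {
            // Scale both images so the smaller probe edge becomes about 12 px
            double factor = MIN_PROBE_EDGE / Math.min(probe.rows(), probe.cols());
            Mat smallBase = new Mat();
            Mat smallProbe = new Mat();
            Imgproc.resize(base, smallBase, new Size(), factor, factor, Imgproc.INTER_AREA);
            Imgproc.resize(probe, smallProbe, new Size(), factor, factor, Imgproc.INTER_AREA);

            Mat res = new Mat();
            Imgproc.matchTemplate(smallBase, smallProbe, res, Imgproc.TM_CCOEFF_NORMED);
            Core.MinMaxLocResult m = Core.minMaxLoc(res);

            // Map back to original coordinates; this division is exactly where
            // the rounding differences mentioned above can creep in.
            return new Point(Math.round(m.maxLoc.x / factor), Math.round(m.maxLoc.y / factor));
        }
    }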
Furthermore, internally (believe me, it is a mess, and thanks to the god of bits&bytes it works ;-) the different Finder usages are not consistent.

In my Java-only implementation I evaluate the final position and score by an additional matchTemplate in the match area calculated from the scaled-down situation, with an additional margin of a few pixels.

... but I have not made any systematic tests until now.

conclusion: with SikuliX up to version 1.1.x you have to live with such experiences (I will not touch any C++ code!).

... Version 2 will be better ;-)

WeiHun Huang (weihun-huang) said: #2

Thanks, RaiMan.

I traced the VisionProxy part, but so far I cannot find how the OpenCV library is used.
It always ends at the files in the natives directory.
Would you give me a hand with it?

I am not sure that I understand the meaning of the paragraph "The big difference against....".
Do you mean SikuliX uses some tricks to speed up template matching?
That would be very useful and important for Android, since some Android devices do not have powerful processors.
Please tell me where to find more information.

I would like to contribute to version 2. I hope I can help with testing or development.

RaiMan (raimund-hocke) said: #3

--- I am not sure that I understand the meaning of the paragraph "The big difference against...."
... no problem, it took me nearly a week to get behind the logic that the former developer implemented in the C++ code (it is a mess, as already mentioned, since obsolete parts and the interface to the text feature (Tesseract) are mixed with living code and spread over 3 modules).

-- some theory about the time cost of matchTemplate:
(for details see the matchTemplate doc in OpenCV)

Supposing you know how it works internally, you get the following:
base image: 1,000 x 600 (the image to search in)
probe image: 100 x 100 (the image to be searched for)

we have (1,000 - 100 + 1) x (600 - 100 + 1) = 901 x 501, roughly 450,000 pixels to check (the right and bottom borders of the base need not be visited, since the remaining area is smaller than the probe).

So for roughly 450,000 pixels we have to evaluate the score at that point being the top-left corner of a possible match.
For each pixel the score formula has to be evaluated (the type TM_CCOEFF_NORMED in your case) with 10,000 base and probe pixel pairs (this effort only depends on the probe size; let's name it a score-check).

So the score-check has to be made roughly 450,000 times, and if you have an RGB image it has to be made 3 times (once for each channel, finally merged into a single-value vector of scores).

Finally you have this result containing a score value for each of these pixels, which you can ask with the minMaxLoc() function for the minimum or maximum value (which one is relevant depends on the type used; for TM_CCOEFF_NORMED it is the maximum).
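
To connect this with the code from your question, a sketch (assuming base and probe are already loaded Mats, with the same imports as in your Android snippet):

    // The result has one score per candidate top-left corner.
    // With base 1,000 x 600 and probe 100 x 100:
    //   res.cols() == 1000 - 100 + 1 == 901
    //   res.rows() ==  600 - 100 + 1 == 501
    Mat res = new Mat();
    Imgproc.matchTemplate(base, probe, res, Imgproc.TM_CCOEFF_NORMED);
    Core.MinMaxLocResult m = Core.minMaxLoc(res);
    double score = m.maxVal; // the maximum is relevant for TM_CCOEFF_NORMED
    Point best = m.maxLoc;   // for TM_SQDIFF(_NORMED) you would use minVal/minLoc instead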

--- So how to gain speed?
Looking at the above timing: the smaller the base, the faster; and the closer base and probe are in size, the faster.
If base and probe are equal in size you only have 1 score value to evaluate, which means that comparing 2 equal-sized images is very fast, and faster than searching for the same probe in a larger image.
So the recommendation for SikuliX users is: keep the search region (base) as small as possible.
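
This is also why a plain similarity check between two images of identical size is cheap; a tiny sketch (imgA and imgB assumed to be equal-sized Mats):

    // Equal sizes: the result matrix is 1 x 1, so only one score-check is done
    Mat res = new Mat();
    Imgproc.matchTemplate(imgA, imgB, res, Imgproc.TM_CCOEFF_NORMED);
    double similarity = Core.minMaxLoc(res).maxVal;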

But there is another possibility to get faster, even if the user does not follow the recommendation and always searches the whole screen: resize base and probe to smaller images and do the search there.
It makes sense to set a resize limit, so that the small probe does not get smaller than about 100 - 200 pixels in area (an experience value, to keep the uniqueness).

So to make it simple:
in our case the smallest sensible factor would be about 1/8 (12 / 100), which would result in these images:
base: 125 x 75 (9,375 pixels)
probe: 12 x 12 (144 pixels)

This search costs you a few milliseconds (instead of some hundred milliseconds).
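
The back-of-the-envelope numbers behind this (a sketch; the 1/8 factor and the sizes are the ones from this example):

    int baseW = 1000, baseH = 600, probeW = 100, probeH = 100;
    double factor = 1.0 / 8; // chosen so the probe shrinks to about 12 x 12

    // Score-checks (candidate top-left corners) at full size:
    int full = (baseW - probeW + 1) * (baseH - probeH + 1); // 901 * 501 = 451,401

    // ... and after scaling down (truncating casts, matching the sizes above):
    int sBaseW = (int) (baseW * factor);  // 125
    int sBaseH = (int) (baseH * factor);  // 75
    int sProbe = (int) (probeW * factor); // 12
    int small = (sBaseW - sProbe + 1) * (sBaseH - sProbe + 1); // 114 * 64 = 7,296

    // On top of that, each score-check now compares 144 instead of 10,000
    // pixel pairs, which together explains milliseconds vs. hundreds of ms.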

If you now take a match in this small situation and calculate the position in the original base image using the resize factor, it might come out with rounding differences against what you get with a search on the original images.
And you might be left with a different score value.

As already mentioned: I have already implemented the SikuliX find method in Java only (classes ImageFinder and ImageFind, currently switched off), where I use this approach, but at the end evaluate the true result by doing a search in the calculated area with some outer margin.

So if you want a Java-only example, look at ImageFind.doFind().
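
A rough illustration of that refinement idea (this is my sketch, not the actual ImageFind.doFind() code; the margin value is an assumption, and approx would come from a scaled-down search like the one sketched in answer #1):

    import org.opencv.core.Core;
    import org.opencv.core.Mat;
    import org.opencv.core.Point;
    import org.opencv.core.Rect;
    import org.opencv.imgproc.Imgproc;

    public class RefineMatch {
        static final int MARGIN = 4; // a few pixels; assumed, not SikuliX's value

        // Re-run matchTemplate at full resolution, but only inside the area
        // predicted by the scaled-down search plus a small margin.
        // Assumes the approximate match lies well inside the base image.
        static Core.MinMaxLocResult refine(Mat base, Mat probe, Point approx) {
            int x = (int) Math.max(0, approx.x - MARGIN);
            int y = (int) Math.max(0, approx.y - MARGIN);
            int w = Math.min(base.cols() - x, probe.cols() + 2 * MARGIN);
            int h = Math.min(base.rows() - y, probe.rows() + 2 * MARGIN);

            Mat region = base.submat(new Rect(x, y, w, h)); // a view, no copy
            Mat res = new Mat();
            Imgproc.matchTemplate(region, probe, res, Imgproc.TM_CCOEFF_NORMED);
            Core.MinMaxLocResult m = Core.minMaxLoc(res);
            m.maxLoc.x += x; // translate back into full-base coordinates
            m.maxLoc.y += y;
            return m;
        }
    }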

Any suggestions or even contributions are always welcome.
