Java multithreading is not parallel

Asked by Jaroslav Novotny on 2020-10-18

I have this runnable class:

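public class PatternFinder implements Runnable {
    private final Pattern pattern;
    private final Region region;

    public PatternFinder(Pattern pattern, Region region){
        this.pattern = pattern;
        this.region = region;
    }

    @Override
    public void run(){
        // Assumed body: the post only says each thread performs region.has(pattern, 0)
        region.has(pattern, 0);
    }
}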

Then I create a bunch of threads and give each one its own PatternFinder with a different region and a different pattern. I start() them all and then join() them from the main thread. But I get the same performance as if I didn't use any threading at all and ran all the some_region.has(some_pattern, 0); calls sequentially in the main thread.
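
The start-all-then-join-all pattern described here can be sketched without SikuliX as below. This is only an illustration under stated assumptions: StartJoinSketch and runAll are hypothetical names, and a Thread.sleep() stands in for region.has(pattern, 0), so it shows the threading mechanics rather than the real search.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class StartJoinSketch {
  // Starts one thread per task, waits for all of them, returns elapsed milliseconds.
  static long runAll(List<Runnable> tasks) {
    List<Thread> threads = new ArrayList<>();
    for (Runnable task : tasks) {
      threads.add(new Thread(task));
    }
    long start = System.nanoTime();
    for (Thread t : threads) {
      t.start(); // start() runs the task on a new thread; calling run() directly would not
    }
    for (Thread t : threads) {
      try {
        t.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // keep the interrupt flag for the caller
      }
    }
    return (System.nanoTime() - start) / 1_000_000;
  }

  public static void main(String[] args) {
    AtomicInteger done = new AtomicInteger();
    List<Runnable> tasks = new ArrayList<>();
    for (int i = 0; i < 8; i++) {
      tasks.add(() -> {
        try {
          Thread.sleep(100); // stand-in for region.has(pattern, 0)
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
        done.incrementAndGet();
      });
    }
    long elapsed = runAll(tasks);
    System.out.println("tasks finished: " + done.get() + ", elapsed ms: " + elapsed);
  }
}
```

Sleeping tasks do not compete for any shared resource, so here the elapsed time should stay close to a single task's duration. If the real searches do not scale this way, the tasks themselves are probably contending for something.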

I am using JDK 8u251, win10, sikulixapi-2.0.4


Question information

Solved by:
Jaroslav Novotny
RaiMan (raimund-hocke) said : #1

This is my test program (latest Java 8 on macOS 10.15 (iMac: 3 GHz 6-Core Intel Core i5))

package com.sikulix.testAPI;

import org.sikuli.script.*;

import java.util.ArrayList;
import java.util.Date;
import java.util.List;

public class Run {
  public static void main(String[] args) {
    Screen screen = new Screen();

    int regw = 300;
    int regh = 300;
    Region reg = new Region(0, 0, regw, regh);
    //to run the image search global init (adds to the first search)
    Image img = Image.create("img");

    System.out.println(String.format("***** region: %dx%d", regw, regh));
    doit(regw, regh, 1, img);
    doit(regw, regh, 10, img);
    doit(regw, regh, 100, img);

    regw = screen.w;
    regh = screen.h;
    System.out.println(String.format("***** region: %dx%d", regw, regh));
    doit(regw, regh, 1, img);
    doit(regw, regh, 10, img);
    doit(regw, regh, 100, img);
  }

  private static void doit(int regw, int regh, int nmax, Image img) {
    // Sequential baseline: nmax searches one after another on the main thread
    long start = new Date().getTime();
    for (int n = 0; n < nmax; n++) {
      Region reg = new Region(0, 0, regw, regh);
      reg.has(img, 0); // assumed search call, lost from the post (the thread discusses has())
    }
    long duration = new Date().getTime() - start;
    System.out.println("nmax: " + nmax);
    System.out.println("duration: " + duration);

    // Threaded variant: one thread per search
    List<Thread> threads = new ArrayList<>();
    for (int n = 0; n < nmax; n++) {
      threads.add(new Thread(() -> {
        Region regt = new Region(0, 0, regw, regh);
        regt.has(img, 0); // assumed search call, as above
      }));
    }
    start = new Date().getTime();
    for (Thread thx : threads) {
      thx.start();
    }
    for (Thread thx : threads) {
      try {
        thx.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
    duration = new Date().getTime() - start;
    System.out.println("duration threads: " + duration);
  }
}

This is the output:
***** region: 300x300
nmax: 1
duration: 16
duration threads: 17
nmax: 10
duration: 160
duration threads: 104
nmax: 100
duration: 1539
duration threads: 908
***** region: 2048x1152
nmax: 1
duration: 324
duration threads: 333
nmax: 10
duration: 3103
duration threads: 1675
nmax: 100
duration: 29967
duration threads: 14638

This matches my experience with the implementation of Region.findAny() (which internally uses a similar construct with threads): getting the elapsed time down to about 50% is the best you can get.
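
To make the 50% figure concrete, the sketch below expresses each threaded duration from the output above as a percentage of its sequential duration. The class and method names are hypothetical; the numbers are copied from the posted output (the nmax: 1 rows are left out because a single thread cannot show any gain).

```java
class ThreadedRatio {
  // Threaded elapsed time as a percentage of the sequential elapsed time.
  static double percentOfSequential(long sequentialMs, long threadedMs) {
    return 100.0 * threadedMs / sequentialMs;
  }

  public static void main(String[] args) {
    // {sequential ms, threaded ms} pairs taken from the output above
    long[][] runs = {
      {160, 104},      // 300x300, nmax 10
      {1539, 908},     // 300x300, nmax 100
      {3103, 1675},    // 2048x1152, nmax 10
      {29967, 14638}   // 2048x1152, nmax 100
    };
    for (long[] r : runs) {
      System.out.printf("%5d ms sequential, %5d ms threaded: %.1f%%%n",
          r[0], r[1], percentOfSequential(r[0], r[1]));
    }
  }
}
```

The four ratios come out at roughly 65%, 59%, 54% and 49%, so only the largest run reaches the 50% mark.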

I guess this is due to some internal resource-locking, probably in the AWT-Robot when capturing the screen.
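
If the resource-locking guess is right, the part of each search that holds the lock runs one thread at a time, and Amdahl's law bounds the achievable speedup. The sketch below is only a model: the serial fraction of 0.5 is a hypothetical value chosen because it would produce a floor near 50% elapsed time, not something measured in SikuliX.

```java
class AmdahlSketch {
  // Amdahl's law: speedup with n workers when a fraction `serial` of the work cannot overlap.
  static double speedup(double serial, int n) {
    return 1.0 / (serial + (1.0 - serial) / n);
  }

  public static void main(String[] args) {
    double serial = 0.5; // hypothetical share of each search spent under the lock
    int[] workers = {1, 2, 6, 100};
    for (int n : workers) {
      double s = speedup(serial, n);
      System.out.printf("workers: %3d  speedup: %.2f  elapsed: %.0f%% of sequential%n",
          n, s, 100.0 / s);
    }
  }
}
```

With half of the work serialized, the speedup can never reach 2, no matter how many threads or cores are available.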

I will keep an eye on this on the way to the final 2.0.5.

Jaroslav Novotny said : #2

I distilled my code to the smallest working example and got the output I'd expect:

Test 1 took: 432 ms
Test 2 took: 118 ms
Test 3 took: 168 ms

I'll keep investigating, but it looks like the problem is somewhere else, most likely between keyboard and chair, so I'll close this as solved.

Jaroslav Novotny said : #3

I did some more testing and found the performance scaling to be pretty much in line with both tests above.

The test uses an array of 8 patterns, all the same size, and tries to find them in regions of different sizes. I ran each test 100 times per region size and averaged the results, because I wasn't sure whether the staircase effect was real.

The clear winner seems to be region.findAnyList(), which scales nicely. However, UI elements are often scattered around the screen, so it would have to be used with a big region to batch-find the state of all of them. The single-threaded and multi-threaded approaches, on the other hand, allow a different region for each pattern, so each item in the pattern array can be searched for in only a small portion of the screen, because UI elements usually stay where they are. But something is bottlenecking both approaches in regions < 500px^2.

RaiMan (raimund-hocke) said : #4

region.findAny() works on one region with many patterns: the region is captured only once at the beginning, and the search then runs threaded on the captured image.

Multiple region.has() calls, if run in threads, capture the region in every thread, hence multiple times in parallel.

This is probably why threaded has() is slower than findAny(). As mentioned above, I guess it comes down to some internal resource locking, probably in the AWT Robot when capturing the screen.
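
The capture-once, search-in-threads idea can be sketched with plain JDK classes as follows. This is not SikuliX's implementation: CaptureOnceSketch, containsColor and searchAll are hypothetical names, an in-memory BufferedImage stands in for the screenshot, and a single-pixel colour test stands in for pattern matching. In real use the image would presumably come from one screen capture, for example java.awt.Robot.createScreenCapture().

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class CaptureOnceSketch {
  // Hypothetical stand-in for a pattern search: true if any pixel in `area` has colour `rgb`.
  static boolean containsColor(BufferedImage shot, Rectangle area, int rgb) {
    for (int y = area.y; y < area.y + area.height; y++) {
      for (int x = area.x; x < area.x + area.width; x++) {
        if ((shot.getRGB(x, y) & 0xFFFFFF) == (rgb & 0xFFFFFF)) {
          return true;
        }
      }
    }
    return false;
  }

  // Searches every area of the same, already captured image in parallel.
  static List<Boolean> searchAll(BufferedImage shot, List<Rectangle> areas, int rgb) {
    ExecutorService pool =
        Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
    try {
      List<Callable<Boolean>> jobs = new ArrayList<>();
      for (Rectangle area : areas) {
        jobs.add(() -> containsColor(shot, area, rgb));
      }
      List<Boolean> results = new ArrayList<>();
      for (Future<Boolean> f : pool.invokeAll(jobs)) {
        results.add(f.get()); // results come back in the same order as `areas`
      }
      return results;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new RuntimeException(e);
    } catch (ExecutionException e) {
      throw new RuntimeException(e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    // One "capture", shared read-only by all worker threads
    BufferedImage shot = new BufferedImage(200, 100, BufferedImage.TYPE_INT_RGB);
    shot.setRGB(150, 50, 0xFF0000); // a single red pixel in the right half
    List<Rectangle> areas = Arrays.asList(
        new Rectangle(0, 0, 100, 100),
        new Rectangle(100, 0, 100, 100));
    System.out.println(searchAll(shot, areas, 0xFF0000));
  }
}
```

Because the workers only read the shared image, they need no lock of their own. The per-region flexibility of has() is kept by passing a different Rectangle to each job.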

As mentioned, I will keep an eye on this.