Region.text() - number Recognition

Asked by Javier Gonzales Rodriguez on 2020-12-01

Hello there ...

I am having trouble recognizing numbers using "Region.text()".

The odd part is that it recognizes 3-digit numbers and sometimes 2-digit numbers (specifically, from 20 upwards).

I tried with this code:

MyRegion = Region (288, 321, 32, 32)

MyNumbers = MyRegion.text()
print(MyNumbers)
uprint(MyNumbers)

Also:

MyRegion = Region (288, 321, 32, 32)
MyNumbers = int(MyRegion.text())
print(MyNumbers)
uprint(MyNumbers)

The numbers look like this:
https://ibb.co/LQH3d7h

When the number is anywhere from 0 to 19, it doesn't recognize anything.

Using region.find() or region.exists() works perfectly, but I need to get the numbers and compare them.
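
An empty OCR result is also what would make the int(MyRegion.text()) variant fail, since int("") raises a ValueError. A minimal sketch of a more forgiving conversion, assuming the raw string comes from Region.text(); the helper name parse_ocr_number is made up for illustration and is not part of SikuliX:

```python
import re

def parse_ocr_number(raw):
    """Return the digits found in an OCR result as an int, or None if there are none."""
    if raw is None:
        return None
    # Drop whitespace, newlines and any stray non-digit characters.
    digits = re.sub(r"[^0-9]", "", raw)
    return int(digits) if digits else None

print(parse_ocr_number(" 20\n"))  # 20
print(parse_ocr_number(""))       # None
```

In a script this could be called as parse_ocr_number(MyRegion.text()), with a check for None before comparing. Note that stripping every non-digit also merges separated groups ("1 2" becomes 12), which may or may not be what you want.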

Any recommendations?

Thanks, regards!!

I am using:
Linux Mint 20
Tesseract 4.1
SikuliX 2.0.4
Java 14

Question information

Language: English
Status: Solved
For: Sikuli
Assignee: No assignee
Solved by: Javier Gonzales Rodriguez
Solved: 2020-12-07
Last query: 2020-12-07
Last reply: 2020-12-02

RaiMan (raimund-hocke) said : #1

Try with the new OCR features, which give more options.

https://sikulix-2014.readthedocs.io/en/latest/textandocr.html

Javier Gonzales Rodriguez said : #2

Hi RaiMan, thanks for your reply!

I had seen that documentation before, and I'm really sorry, but it is not clear to me how to set the ocr.options() or use the new functions.

I am using the SikuliX IDE, and if I try to use readText, readWord, etc., it says "name 'readText' is not defined", so I assume I don't have the class with the new functions.

This is what I have in my Sikuli path, /home/user/.Sikulix/Lib/sikuli:

'Env$py.class' __init__.py 'Sikuli$py.class'
 Env.py 'Region$py.class' 'SikuliImporter$py.class'
 Finder.py Region.py SikuliImporter.py
'__init__$py.class' Screen.py Sikuli.py

and this in "/home/user/.Sikulix/SikulixTesseract/tessdata/configs":

api_config bazaar digits hocr nodict pdf quiet tsv txt unlv

Can someone point me in the right direction?

Regards!

RaiMan (raimund-hocke) said : #3

Please paste the relevant part of your code.

matteoa (matteoa) said : #4

Hello Javier,
I've had the same problem and I managed to mitigate it with these settings:
        OCR.globalOptions().variable("tessedit_char_whitelist", "PCBU0123456789ABCDEF")
        OCR.globalOptions().variable("tessedit_char_blacklist", "abcdefGgHhIiLlMmNnOopQqRrSsTtuVvZzJjYyKkWw-!|")
        OCR.globalOptions().configs("bazaar")
        OCR.globalOptions().variable("load_system_dawg", "F")
        OCR.globalOptions().variable("load_freq_dawg", "F")
        OCR.globalOptions().psm(6)
        OCR.globalOptions().variable("user_patterns_file", "C:\\Sikulix\\Util\\OCR.Pattern")
and the content of the file (strictly tailored to my needs) is:
P\n\n\n\n
C\n\n\n\n
B\n\n\n\n
U\n\n\n\n
Here is some info I used to get better OCR results:
https://stackoverflow.com/questions/17209919/tesseract-user-patterns
https://github.com/tesseract-ocr/tesseract/blob/master/doc/tesseract.1.asc#config-files-and-augmenting-with-user-data
I said "mitigate" since the problem got better but was not completely solved.
Hope this helps.
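
For anyone generating such a pattern file from a script, here is a small sketch. As far as Tesseract's pattern syntax is documented, \n in a user pattern stands for one alphanumeric character (and \d for one digit), so each line above reads as a letter followed by four such characters. The function name write_user_patterns and the temporary output path are assumptions for illustration only; the point is that the patterns need raw strings in Python, otherwise \n turns into a real line break:

```python
import io
import os
import tempfile

def write_user_patterns(path, patterns):
    """Write one Tesseract user pattern per line, UTF-8 encoded."""
    with io.open(path, "w", encoding="utf-8") as handle:
        for pattern in patterns:
            handle.write(u"%s\n" % pattern)

# Raw strings keep backslash + n as two characters instead of a newline.
PATTERNS = [r"P\n\n\n\n", r"C\n\n\n\n", r"B\n\n\n\n", r"U\n\n\n\n"]

# Demo location only; point this at wherever user_patterns_file is configured to look.
demo_path = os.path.join(tempfile.mkdtemp(), "OCR.Pattern")
write_user_patterns(demo_path, PATTERNS)
```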

Javier Gonzales Rodriguez said : #5

Hi RaiMan,

This is my code:

#Regions to get information
FirstReg = Region(1308,139,37,15)
SecondReg = Region(1309,155,42,14)

#Variables
MaxFirst = "1600"
MinFirst = "1000"
MinSecond = "35"

def MyNumbers():
    try:
        if (FirstReg.text() <= MaxFirst and SecondReg.text() >= MinSecond):
            type(Key.F1)
        elif (SecondReg.text() <= MinSecond):
            print("waiting to get a 35 number")
        elif (FirstReg.text() <= MinFirst):
            type(Key.F7)
        pass
    except:
        pass

MyNumbers()

I also tried "collectWordsText()" and the result is the same. I found that it wasn't recognizing the numbers using the first basic code that I already posted.

Regards!
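
Separately from the OCR problem, the script above compares the recognized text as strings, and string comparison is character by character, so "200" <= "1600" is False and "9" >= "35" is True. A minimal sketch of the same branching with integers, assuming the raw strings come from FirstReg.text() and SecondReg.text(); the names to_int and choose_action and the returned labels are made up for illustration:

```python
def to_int(raw):
    """Convert OCR text to int; return None when it is not a plain number."""
    try:
        return int(raw.strip())
    except (AttributeError, ValueError):
        return None

def choose_action(first_raw, second_raw, max_first=1600, min_first=1000, min_second=35):
    """Mirror the if/elif order of the posted script, but compare numbers."""
    first, second = to_int(first_raw), to_int(second_raw)
    if first is None or second is None:
        return "unreadable"  # OCR returned nothing usable
    if first <= max_first and second >= min_second:
        return "F1"
    if second <= min_second:
        return "wait"
    if first <= min_first:
        return "F7"
    return "none"

print("200" <= "1600")             # False (string comparison)
print(200 <= 1600)                 # True
print(choose_action("200", "40"))  # F1
```

The caller would then map "F1" and "F7" to type(Key.F1) and type(Key.F7). With numeric values the F7 branch cannot be reached in this order, because any reading that gets that far is already above 1600, so the thresholds or the branch order may need another look. Returning "unreadable" instead of swallowing everything in a bare except also makes it visible when the OCR comes back empty.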

Javier Gonzales Rodriguez said : #6

Hi Matteoa,

Thanks for the information, man, I appreciate it.

I will try with this:
OCR.globalOptions().configs("digits")
OCR.globalOptions().variable("tessedit_char_whitelist", "0123456789")
OCR.globalOptions().psm(10)

I guess I will change the PSM mode and check which one works better. I will run some tests.

Thanks, regards!
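
For reference, in Tesseract's page segmentation modes 6 is a single uniform block of text, 7 a single text line, 8 a single word and 10 a single character, so for multi-digit numbers 7 or 8 may be worth trying alongside 10. A rough sketch of how the modes could be compared side by side; compare_psm_modes is a hypothetical helper, and the stand-in reader returns invented placeholder strings (not real OCR results) so the sketch runs without SikuliX:

```python
def compare_psm_modes(read_with_psm, modes=(6, 7, 8, 10)):
    """Call read_with_psm(mode) for each mode and collect the stripped results."""
    results = {}
    for mode in modes:
        try:
            results[mode] = read_with_psm(mode).strip()
        except Exception as error:  # keep going so one failing mode does not stop the run
            results[mode] = "error: %s" % error
    return results

# Inside a SikuliX script the reader might look like this (assumption, not run here):
# def read_with_psm(mode):
#     OCR.globalOptions().psm(mode)
#     return Region(288, 321, 32, 32).text()

# Stand-in reader with placeholder values, only to make the sketch executable.
def fake_reader(mode):
    return {6: "", 7: "20\n", 8: "20", 10: "2"}[mode]

for mode, text in sorted(compare_psm_modes(fake_reader).items()):
    print("psm %2d -> %r" % (mode, text))
```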

Javier Gonzales Rodriguez said : #7

Hi Matteoa,

It is working now, thanks for your help!