TESSDATA_PREFIX accented character issue

Asked by Aurelien

Hello,

I'm using the SikuliX IDE on 64-bit Windows 8.
I've been trying to use the Tesseract OCR feature, but I'm running into a problem.

As other users have reported before, I get an error when I launch my Python script, which calls the text() function:

Error opening data file C:/Users/Aur├®lien/AppData/Roaming/Sikulix/SikulixTesseract/tessdata/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
[debug] TextRecognizer: init OK: using as data folder:
C:\Users\Aurélien\AppData\Roaming\Sikulix\SikulixTesseract
#
# A fatal error has been detected by the Java Runtime Environment:
#
# EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x68b30191, pid=2440, tid=0x000023c8
#
# JRE version: Java(TM) SE Runtime Environment (8.0_121-b13) (build 1.8.0_121-b13)
# Java VM: Java HotSpot(TM) Client VM (25.121-b13 mixed mode, sharing windows-x86 )
# Problematic frame:
# C [libtesseract-3.dll+0x130191]
#
# Failed to write core dump. Minidumps are not enabled by default on client versions of Windows
#
# An error report file with more information is saved as:
# C:\Users\AurÚlien\hs_err_pid2440.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

I deleted the Lib folder and enabled the TextSearch and OCR options in the user preferences menu, but the result is the same.

I also tried setting the following in my script:
Settings.OcrDataPath = "C:\Users\Aurélien\AppData\Roaming\Sikulix\SikulixTesseract"

but SikuliX doesn't seem to take it into account, since the path actually used is (see above): C:/Users/Aur├®lien/AppData/Roaming/Sikulix/SikulixTesseract/tessdata
This path stays the same whatever I put in the OCR settings. As you can see, for some reason the "é" in it shows up as strange characters. What is also odd is that the line after the "TextRecognizer" initialisation refers to the proper path, with the "é" character: C:\Users\Aurélien\AppData\Roaming\Sikulix\SikulixTesseract.
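The two garbled spellings in the log ("Aur├®lien" and "AurÚlien") are consistent with an encoding mismatch rather than a wrong folder name. The sketch below reproduces both. It assumes the console displays bytes using code page 850, which the log does not state; the variable names are only for illustration.

```python
# -*- coding: utf-8 -*-
# Sketch: reproduce the two garbled spellings seen in the log.
# Assumption (not confirmed by the log): the console uses code page 850.

name = u"Aur\u00e9lien"  # "Aurélien", written with an escape to stay encoding-neutral

# UTF-8 bytes of the name, displayed as if they were CP850 text
utf8_as_cp850 = name.encode("utf-8").decode("cp850")

# Windows-1252 ("ANSI") bytes of the name, displayed as if they were CP850 text
cp1252_as_cp850 = name.encode("cp1252").decode("cp850")

# repr() avoids console encoding errors when printing
print(repr(utf8_as_cp850))
print(repr(cp1252_as_cp850))
```

If that assumption holds, the first value matches the path in the Tesseract error message, which suggests the native library received the path as UTF-8 bytes, and the second matches the path of the hs_err log file.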

I have also tested with doubled slashes, with the same result.

There is also a "Failed to write core dump. Minidumps are not enabled by default on client versions of Windows" message afterwards. Is it related to the path issue?

Thanks for your help.

Question information

Language:
English
Status:
Expired
For:
SikuliX
Assignee:
No assignee

Pierre (pho-gfi) said :
#1

Hi,

You can try deleting the folder "C:\Users\Aurélien\AppData\Roaming\Sikulix"; SikuliX will regenerate all of its files there.

Aurelien (auredor) said :
#2

Hello,

I've tried this solution, but it doesn't help: the system hangs, waiting for data from the web on port 50001. I've disabled the firewall and all other protections, with no change.

However, when I start SikuliX from the admin account, it works like a charm. So it looks like the issue comes from the accented character in my user name. For your information, the installed Python version is not the latest (2.7.x) but 2.5.x, because I'm not able to run SikuliX with 2.7.x. I've read that in some cases this can be caused by UTF-8/Unicode character tables.
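A small check along these lines can tell whether a data folder path contains characters outside ASCII, which is the difference between the two accounts described above. This is only a sketch: `has_non_ascii`, `pick_data_folder`, and the fallback folder `C:\SikulixTesseract` are hypothetical, and nothing in this thread confirms that SikuliX would honour a different folder set through Settings.OcrDataPath.

```python
# -*- coding: utf-8 -*-
# Sketch: detect non-ASCII characters in a data folder path.
# All helper names and the fallback folder are hypothetical.

def has_non_ascii(path):
    """Return True if any character in path is outside the ASCII range."""
    return any(ord(ch) > 127 for ch in path)


def pick_data_folder(default_folder, fallback_folder):
    """Prefer the default folder unless its path contains non-ASCII characters."""
    if has_non_ascii(default_folder):
        return fallback_folder
    return default_folder


# Path from the log, with the accented character written as an escape
user_folder = u"C:\\Users\\Aur\u00e9lien\\AppData\\Roaming\\Sikulix\\SikulixTesseract"
fallback = u"C:\\SikulixTesseract"  # hypothetical ASCII-only location

chosen = pick_data_folder(user_folder, fallback)
print(chosen)
```

The chosen folder would still need to contain a copy of the tessdata directory, and whether assigning it to Settings.OcrDataPath changes the path passed to Tesseract is an open question here.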

Thanks.

Launchpad Janitor (janitor) said :
#3

This question was expired because it remained in the 'Open' state without activity for the last 15 days.