The same letter recognized by ICR

Asked by Lagadec on 2016-06-01

Hi,
I need your help.
I want to analyze a handwritten form, so I chose to test queXF for that.
The form was generated with LimeSurvey and 10 people filled it in. I imported the forms a first time, then reimported them after running Train ICR.
But when I verify: https://framapic.org/Zo6zNgjw7MVW/iRGTV27F5iih.PNG
"X" appears for each letter.

In the future, 15 people will use the application, so it will be very difficult to get 400 instances of each letter.

There is probably something I did not understand.
Thank you for your help.
Quentin

Question information

Language: English
Status: Answered
For: queXF
Assignee: No assignee
Last query: 2016-06-16
Last reply: 2016-06-17

Adam Zammit (adamzammit) said : #1

Hello Quentin,

For training, queXF requires many instances of a letter. There is an example training set under doc/icr_database that you could test - see if that works (you import them then assign them to your questionnaires).

For such a small set it may be best just to do manual verification of characters using the verifier interface.

Adam

Lagadec (q-lagadec) said : #2

Hello
Thanks for your answer.
I tested it and it works better.

But I have another problem: after training ICR with the handwriting of a few people, I tried to export the KB, and the generated XML file contains no <ocrkbdata>: https://framapic.org/6wOu2IqoOXlS/wk9WwpBQOrvb.PNG

So my personal ICR KB doesn't work when I import form answers.

Here is what I did: Train ICR > Select my form > Continue Training > Start training process in background
Then, in Import and Export ICR KB, I selected my ICR KB.

Adam Zammit (adamzammit) said : #3

Hi Quentin,

Can you please check if there is any data in the ocrtrain table or the ocrkbdata table?
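
One way to run that check is a short PHP script. This is only a sketch: the table names ocrtrain and ocrkbdata are the ones mentioned above, but the function name, DSN, database name, user and password are placeholders, not values taken from queXF.

```php
<?php
// Count the rows in the two ICR tables named in this thread.
// count_icr_rows() is a hypothetical helper, not part of queXF.
function count_icr_rows(PDO $pdo)
{
    $counts = array();
    // The table names are fixed here, so no user input reaches the SQL string.
    foreach (array('ocrtrain', 'ocrkbdata') as $table) {
        $counts[$table] = (int) $pdo->query("SELECT COUNT(*) FROM $table")->fetchColumn();
    }
    return $counts;
}

// Example usage (placeholder credentials, adjust to your installation):
// $pdo = new PDO('mysql:host=localhost;dbname=quexf', 'quexf_user', 'secret');
// $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// print_r(count_icr_rows($pdo));
```

If both counts are 0 after training, the training process most likely did not write anything.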

Also please check on the admin page the "Monitor ICR training process" link to see if there are any messages.

It is possible the training process has not executed.

Adam

Lagadec (q-lagadec) said : #4

Hi Adam,

No data had been added to the ocrtrain or ocrkbdata tables 5 minutes after clicking "Start training process in background".

In "Monitor ICR training process" there is just this message:
 "Process 1 running...

Outcome of last process run (if any)"

And when I look at "Train ICR" : https://framapic.org/3FGGcV1Sjj2c/yHj8JkP7U57f.PNG
No change...

Thanks
Quentin

Adam Zammit (adamzammit) said : #5

Hi Quentin,

It looks like the training process hasn't run.

Before clicking on "Start training process in background" you must make sure the checkboxes are ticked for those characters you wish to train.

If this doesn't help, could you please try the "Manual training" links? Keep the boxes you wish to train "green", and click on the box of any character that doesn't match to make it go red; those will not be trained. Then click on the train button at the bottom of the page to train those particular characters.

Adam

Lagadec (q-lagadec) said : #6

Hi Adam,

The checkboxes are ticked but the file is still empty.

Manual training works, but when I select several ICR KBs in "Assign ICR KB to questionnaire" and reimport the form answers, I am back to the original problem: the same letter is recognized by ICR.

Quentin

Adam Zammit (adamzammit) said : #7

Can you please send some examples of the ICR KB generated (the XML files?) from the Manual training?

I wouldn't expect very good recognition with a small training set.

It looks like the background training process is not running. Can you please confirm what version of queXF you are using and what operating system it is running on?

Lagadec (q-lagadec) said : #8

Hi
Some examples of the ICR KB from the manual training: https://drive.google.com/open?id=0B_EB1PzjT-TEcEpBb01DbFRBQUU
queXF 1.18.1 with UwAmp 3.1.0 on Windows 8.1

Thanks

Quentin

Lagadec (q-lagadec) said : #9

More details for UwAmp :
Apache version : Apache/2.4.18 (Win32) OpenSSL/1.0.2f PHP/5.6.18

Adam Zammit (adamzammit) said : #10

Hi Quentin,

Assigning all those KBs to a form should result in detection of some characters (although with the small training sample I wouldn't expect it to be very accurate).

It looks like the automatic training process is not executing properly. This is most likely due to the way PHP runs on Windows, as the code is trying to execute a CLI version of PHP. Please have a look at this directive in the configuration file:

define('WINDOWS_PHP_EXEC', "start /b php");

And change it to suit your system (you may need to add the full path before php).
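
For example, something along these lines. The path here is only a placeholder, not a real UwAmp location; replace it with wherever php.exe lives on your system:

```php
// Configuration file: point WINDOWS_PHP_EXEC at the CLI binary using a full path.
// C:\path\to\php.exe is a placeholder (an assumption), not an actual install path.
// Single quotes keep the backslashes literal; in double quotes "\t" would become a tab.
define('WINDOWS_PHP_EXEC', 'start /b C:\path\to\php.exe');
```

A path containing spaces would need extra care with "start", so a location without spaces is the simplest choice.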

Adam

Lagadec (q-lagadec) said : #11

Hi Adam,
That's it. UwAmp's php.ini file is named php_uwamp.ini, so when PHP was executed from the CLI it could not find the php.ini file...
I renamed php_uwamp.ini to php.ini and I also added php to the system path.
The process now starts correctly and the XML file is filled in!

But when training is over the process does not stop. Is that normal?

Thank you very much,
Quentin

Lagadec (q-lagadec) said : #12

Hi Adam,
I have a question: when I train the ICR with a form to get an ICR KB, and then reimport the same form with that ICR KB checked,
why are the same letters not detected? I cannot understand...

Currently, I have an ICR KB with 65 instances, but the results of the analysis still have too many errors.
Would 400 occurrences of each letter solve the problem?

Adam Zammit (adamzammit) said : #13

The ICR system is based on a research paper which suggests 400 instances of each character is optimal.

Even then the detection will not be perfect as not all variation can be taken into account.

The ICR system uses a statistical model to determine the most likely character detected out of all possible characters selected in the ICR KB.
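
To give a rough idea of what "most likely character" means, here is a generic illustration in PHP. It is not queXF's code and not necessarily the model from the paper: it assumes each character in a made-up knowledge base is described by a mean and variance per feature, and picks the character with the highest Gaussian log-likelihood. The function name, features and numbers are all invented for the example.

```php
<?php
// Illustration only: NOT queXF's implementation.
// $kb maps each character to hypothetical per-feature means and variances.
function most_likely_character(array $features, array $kb)
{
    $best = null;
    $bestScore = -INF;
    foreach ($kb as $char => $model) {
        $score = 0.0;
        foreach ($features as $i => $x) {
            $mean = $model['mean'][$i];
            $var = $model['var'][$i];
            // Gaussian log-likelihood of one feature (features assumed independent).
            $score += -0.5 * log(2 * M_PI * $var) - pow($x - $mean, 2) / (2 * $var);
        }
        // Keep the character with the highest total score so far.
        if ($score > $bestScore) {
            $bestScore = $score;
            $best = $char;
        }
    }
    return $best;
}

// Two made-up characters with two made-up features each.
$kb = array(
    'X' => array('mean' => array(0.8, 0.2), 'var' => array(0.05, 0.05)),
    'O' => array('mean' => array(0.2, 0.9), 'var' => array(0.05, 0.05)),
);
echo most_likely_character(array(0.25, 0.85), $kb), "\n"; // prints "O"
```

In a model of this kind, the means and variances are estimated from the training instances, so with only a few instances per character those estimates tend to be unreliable.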
