Proof of Concept

Asked by Don Gould

I'm currently a Remark user and I'm wanting a better solution.

I want to know if queXF will deal with this:

I have a bunch of different forms/tests.
Currently each form has a bar code on it that identifies the type of form/test it is (this tells Remark where the fields are).
Once we know what form is being processed, we then read a bar code that tells us who the student is that this test relates to and which page of the test we're looking at.
Each page of each test is in its own PDF file.
We then copy the page into a new location and record the results in a database (Remark puts it in a CSV which we then post process).

So let's look at an example....

random001.pdf - page 2 of test 1 for student 10001
random002.pdf - random blank page
random003.pdf - random page that some idiot just put in the scanning pile by mistake
random004.pdf - page 1 of test 1 for student 10001
random005.pdf - page 3 of test 1 for student 10002

All these files are in: /data/uploaded/unprocessed

In my batch I can have different pages from different tests for different students.

I need to end up with this:

cp /data/uploaded/unprocessed/$orginaluploadedfile /data/$student/$test/$student$test$page.pdf
mv /data/uploaded/unprocessed/$orginaluploadedfile /data/processedorginals/$DateProcessed/*.pdf (ie a copy of the original file ends up here)

To do this I need queXF to read the PDF and figure out what student and test the page relates to.

I also need queXF to read the fields on the test and put the results in an output data file for me.

If queXF can't figure out what the page is then it puts it into a rejected queue, e.g. /data/rejects/$dateprocessed/*.pdf
It would be ideal if blank pages could just be discarded.
I need all this to be able to be done in batches without an operator doing anything.
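
The file handling described above can be sketched in a few lines of Python. This is only an illustration of the routing rules in the question, not queXF code: the function name `route_page` and its parameters are hypothetical, and it assumes the student, test and page values have already been read from the barcodes (or are None when the page could not be identified). The folder names follow the paths given in the question.

```python
import shutil
from datetime import date
from pathlib import Path
from typing import Optional


def route_page(src: Path, root: Path, student: Optional[str],
               test: Optional[str], page: Optional[int],
               processed_on: Optional[date] = None) -> Path:
    """Copy an identified page to its student/test folder, then archive the original.

    If any of student/test/page is missing, the original goes to the rejects
    folder instead. Returns the final location of the original upload.
    """
    stamp = (processed_on or date.today()).isoformat()
    if student is None or test is None or page is None:
        # Page could not be identified: send it to the dated rejects folder.
        archive_dir = root / "rejects" / stamp
    else:
        # Identified page: copy it under /$student/$test/ with the combined name.
        target_dir = root / student / test
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target_dir / f"{student}{test}{page}.pdf")
        archive_dir = root / "processedorginals" / stamp
    archive_dir.mkdir(parents=True, exist_ok=True)
    # Move the original upload out of the unprocessed folder.
    return Path(shutil.move(str(src), str(archive_dir / src.name)))
```

Called once per file in /data/uploaded/unprocessed, this leaves the inbox empty after a batch, which is what makes an unattended run possible.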

I use different scanners to scan my forms. I use different printers to print my forms and sometimes my operators manage to left and right offset my pages slightly. This means I need registration marks (which I see queXF supports).

Does queXF scale the PDF if required? For example, if the printer has printed the page a fraction undersize and the scanner scans it back a fraction undersize, then currently Remark just rejects it.

In some cases I also need to test for the presence of something. A signature for example.
In some cases I need to capture an image region into a jpg (or whatever image file format) so I can use it later.

Can queXF assist me?

Cheers Don

Question information

Language: English
Status: Answered
For: queXF
Assignee: No assignee
Launchpad Janitor (janitor) said :
#1

This question was expired because it remained in the 'Open' state without activity for the last 15 days.

Adam Zammit (adamzammit) said :
#2

Response upcoming

Adam Zammit (adamzammit) said :
#3

Dear Don,

My apologies for the very long delay in responding.

queXF can do some of the things you have asked about, and should be able to handle others with some small modifications.

queXF expects each form to be a single multi-page PDF. It is able to identify a page by a barcode or text for OCR but would need to be modified to piece together a multipage form from multiple individual pages based on the student barcode. Have a look at functions/functions.import.php
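
As a rough illustration of the "piecing together" step, the grouping logic might look like the sketch below. It is written in Python purely to show the idea and is not taken from functions/functions.import.php; the `ScannedPage` type and `group_pages` function are hypothetical names, and the sketch assumes the barcodes on each single-page PDF have already been decoded into student, test and page values (None where nothing could be read).

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Optional, Tuple


class ScannedPage(NamedTuple):
    filename: str
    student: Optional[str]  # None if the student barcode was unreadable
    test: Optional[str]     # None if the form-type barcode was unreadable
    page: Optional[int]     # None if the page number was unreadable


def group_pages(
    pages: Iterable[ScannedPage],
) -> Tuple[Dict[Tuple[str, str], List[str]], List[str]]:
    """Group single-page files into forms keyed by (student, test).

    Returns the files of each form in page order, plus the list of files
    that could not be identified (blank or stray pages).
    """
    forms = defaultdict(list)
    unidentified = []
    for p in pages:
        if p.student is None or p.test is None or p.page is None:
            unidentified.append(p.filename)
        else:
            forms[(p.student, p.test)].append(p)
    ordered = {
        key: [p.filename for p in sorted(group, key=lambda p: p.page)]
        for key, group in forms.items()
    }
    return ordered, unidentified
```

With the batch from the question, random004.pdf and random001.pdf would end up together, in that order, as pages 1 and 2 of test 1 for student 10001, while the blank and stray pages land in the unidentified list.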

queXF does do blank page detection and ignores blank pages by default (unless they are part of a multi-page form and it couldn't detect an expected page - in that case it will try and see if it is that "missing" page)

queXF stores the outcome of a page import in the database (doesn't move the files by default) but with a small modification to functions/functions.import.php you could get it to move the file to the appropriate location

If registration marks exist and they have been confirmed using the "page setup" function - then queXF can scale/rotate/offset a page without difficulty.
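
To show why registration marks are enough to correct for scale, rotation and offset, here is a small generic sketch in Python. It is not how queXF implements the correction; it only assumes that two marks have known positions on the template and have been located on the scan. The function names are hypothetical. Treating points as complex numbers, the mapping from template to scan is z -> a*z + b, where abs(a) is the scale and the angle of a is the rotation.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def similarity_from_marks(expected: Sequence[Point],
                          detected: Sequence[Point]) -> Tuple[complex, complex]:
    """Solve z -> a*z + b from two registration marks.

    expected: the two mark positions on the template.
    detected: the same two marks as found on the scanned page.
    """
    p1, p2 = (complex(*pt) for pt in expected)
    q1, q2 = (complex(*pt) for pt in detected)
    if p1 == p2:
        raise ValueError("the two template marks must be distinct")
    a = (q2 - q1) / (p2 - p1)  # combined scale and rotation
    b = q1 - a * p1            # offset
    return a, b


def to_scan(point: Point, a: complex, b: complex) -> Point:
    """Map a template coordinate (e.g. a box corner) onto the scanned page."""
    z = a * complex(*point) + b
    return z.real, z.imag
```

For example, if marks expected at (100, 100) and (900, 100) are found at (103, 95) and (887, 95), the page is at 98% scale with an offset of (5, -3), and every field position on the template can be mapped through `to_scan` before it is read.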

queXF doesn't test for the presence of something but this could be implemented by banding the area as a "single choice box" and then modifying the code to return how "filled" the area is instead of it being an explicit mark or not.
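
The "how filled is the area" idea can be sketched as follows. This is a generic Python illustration rather than queXF's code: `fill_ratio` and `has_mark` are hypothetical names, the region is assumed to be available as rows of 0-255 greyscale values, and both thresholds are placeholder values that would need tuning against real scans.

```python
from typing import Sequence


def fill_ratio(region: Sequence[Sequence[int]], dark_threshold: int = 128) -> float:
    """Fraction of pixels in the region darker than dark_threshold.

    region: rows of greyscale values, 0 (black) to 255 (white).
    """
    total = sum(len(row) for row in region)
    if total == 0:
        raise ValueError("region is empty")
    dark = sum(1 for row in region for px in row if px < dark_threshold)
    return dark / total


def has_mark(region: Sequence[Sequence[int]], min_ratio: float = 0.02,
             dark_threshold: int = 128) -> bool:
    """Treat the area as 'something present' once enough of it is dark."""
    return fill_ratio(region, dark_threshold) >= min_ratio
```

Returning the ratio rather than a yes/no answer lets the cut-off for "a signature is present" be chosen per field, since a signature box will be far less filled than a ticked checkbox.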

queXF also doesn't export to an image file by default but there is some code in functions.ocr.php that you could use as an example for exporting to images.

Regards,
Adam Zammit
