Why are certain file types not available for non-OCR'able languages?
A library partner noticed that two of their items did not show all of the file types available for their other items:
http://
http://
The items seemed to have derived okay, but when I looked at the files in each item I noticed that the .djvu and _djvu.txt files were not there. I asked Paul about this, and he told me: "The reason these are missing the djvu files is because the language is not currently OCR'able. Only items which are OCR'd get those derivatives." The items are listed as having only Hebrew as a language.
I told the library partner this on November 18 and they sent an email today with the following:
"I would just like to make sure I fully understand the response that file types other
than 'read online" or PDF or aren't available for Hebrew texts since they include
OCR. I'm a little confused because I thought (perhaps incorrectly) since I'm able
to search the contents of the PDF and online file types (though not in Hebrew),
these are also OCR'd. If so, then so I can explain this to my boss when she asks,
why would a Hebrew-only volume not be just viewable (though not searchable) on a
Daisy or Kindle reader or through the very nifty DJVu software...what am I missing?"
I would like to provide them with an answer so that they feel comfortable that they understand why those file types aren't available.
Thanks!
Question information
- Language:
- English
- Status:
- Solved
- Assignee:
- No assignee
- Solved by:
- Hank Bromley