[1.1.0] How to get better results when working with OCR

Asked by carl

First of all, congratulations on the project, it's great!

My problem is getting OCR to work. Maybe I should apply what this FAQ describes:
https://answers.launchpad.net/sikuli/+faq/2436

But first I want to take this opportunity to ask how to add support for a new font
(I would start with the digits and move on to the letters afterwards).
I am hoping that OCR can handle different font sizes automatically.

Question information

Language:
English
Status:
Solved
For:
SikuliX
Assignee:
No assignee
Solved by:
RaiMan

RaiMan (raimund-hocke) said :
#1

Thanks for the kind feedback and your willingness to contribute by answering questions.

OCR is still in bad shape (it has not really been revised since the first implementation). Improving it and making it more configurable is planned only for version 2.

Currently you can do anything that relies on additional content and option files in the tessdata folder, as described in the Tesseract 3 documentation.
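As one illustration of an option file: a Tesseract 3 config file is a plain text file with one "variable value" pair per line, looked up by name in tessdata/configs. The sketch below writes such a file restricting recognition to digits. The helper name, the file name digits_only and the tessdata path are assumptions made for illustration, and whether the SikuliX built-in OCR picks up the file is not confirmed in this thread; with a standalone Tesseract the config name is passed as the last command-line argument.

```python
import io
import os


def write_tesseract_config(tessdata_dir, name, variables):
    """Write a Tesseract config file: one 'variable value' pair per line.

    tessdata_dir, name and this helper are hypothetical; adjust them to
    wherever your Tesseract installation keeps its tessdata folder.
    """
    config_dir = os.path.join(tessdata_dir, "configs")
    if not os.path.isdir(config_dir):
        os.makedirs(config_dir)
    config_path = os.path.join(config_dir, name)
    with io.open(config_path, "w", encoding="utf-8") as handle:
        for key in sorted(variables):
            handle.write(u"%s %s\n" % (key, variables[key]))
    return config_path


# Example call (the path is a placeholder):
# write_tesseract_config("/path/to/tessdata", "digits_only",
#                        {"tessedit_char_whitelist": "0123456789"})
```

tessedit_char_whitelist is the Tesseract variable that limits the characters the engine may output, which fits the plan of starting with numbers.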

The other option (if the task is not time critical) is always to install your own Tesseract and call it from a SikuliX workflow via the command line:
- create an image containing the text
- optionally optimize the image for OCR with an image processing package (like ImageMagick)
- run the Tesseract command with the appropriate options
- read the resulting text file
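The last two steps could be sketched roughly as follows, assuming a tesseract executable is on the PATH and the image has already been saved to a file. The function names and defaults are made up for this example. The code sticks to modules that exist in Python 2.7 so that it has a chance of running from the Jython-based SikuliX IDE, but it has only been reasoned through for standard Python.

```python
import io
import os
import shutil
import subprocess
import tempfile


def build_tesseract_command(image_path, output_base, lang="eng", psm=None, configs=()):
    """Build the argument list: tesseract <image> <outputbase> [options] [configs]."""
    cmd = ["tesseract", image_path, output_base, "-l", lang]
    if psm is not None:
        # Tesseract 3 spelling; Tesseract 4 and later use --psm instead.
        cmd += ["-psm", str(psm)]
    cmd += list(configs)  # config file names go last
    return cmd


def ocr_with_tesseract(image_path, **options):
    """Run Tesseract on image_path and return the recognized text."""
    workdir = tempfile.mkdtemp()
    try:
        output_base = os.path.join(workdir, "ocr_result")
        subprocess.check_call(build_tesseract_command(image_path, output_base, **options))
        # Tesseract appends .txt to the output base name.
        with io.open(output_base + ".txt", encoding="utf-8") as handle:
            return handle.read()
    finally:
        shutil.rmtree(workdir, ignore_errors=True)
```

For a single line of digits one might call ocr_with_tesseract(path, psm=7, configs=("digits",)), where digits is a config file shipped with Tesseract 3 and page segmentation mode 7 treats the image as one text line.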

--- version 1.1.0+ basic Image processing to get better OCR results
You can do the following to bring your captured image into the best condition for Tesseract OCR:
img = capture(someRegion) # get the image from the screen (in memory)
img1 = Image.create(img) # create an image with the new Image class (still in memory)
imgGrey = img1.convertImageToGrayscale(img1.get()) # does what it says (still in memory)
imgGreyResized = imgGrey.resize(factor) # resize the image to about 300dpi (usually 3 as factor is sufficient) (still in memory)

... then directly try
text = imgGreyResized.text() # using the SikuliX builtin OCR implementation

... or use Tesseract from the command line if special options are needed
imgSaved = imgGreyResized.asFile() # write to temp and get the filename
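The resize factor of about 3 comes from simple arithmetic. The sketch below assumes screen content is rendered at roughly 96 dpi, which is a common default but not a value stated in this thread; the real figure depends on the display and the OS scaling. The function name is made up for the example.

```python
def resize_factor(screen_dpi=96.0, target_dpi=300.0):
    """Scale factor that brings a screen capture to roughly target_dpi."""
    return float(target_dpi) / float(screen_dpi)


# With the assumed 96 dpi screen: 300 / 96 = 3.125, so a factor of 3 is close enough.
factor = resize_factor()
```

On a display with a lower effective dpi the factor grows accordingly, for example resize_factor(72.0) gives a little over 4.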

Sorry, but this is not in the docs.

carl (maibannato) said :
#2

OCR is a key part of my applications.

----- version 1.1.0+ basic Image processing to get better OCR results
--you can do the following, to bring your captured image to a condition best for Tesseract OCR:
--img = capture(someRegion) # get the image from the screen (in memory)
--img1 = Image.create(img) # create an with the new Image class (still in memory)
--imgGrey = img1.convertImageToGrayscale(img1.get()) # does what it says (still in memory)
--imgGreyResized = imgGrey.resize(factor) # resize the image to about 300dpi (usually 3 as factor is sufficient) (still in memory)

I work in Java. This is what I have written so far
(I need to understand how to finish it):

Screen screen = new Screen();
Region region = screen.selectRegion();
BufferedImage img;
img=screen.capture(region).getImage();
Image img2=new Image(img);
BufferedImage imgBN= Image.convertImageToGrayscale(img);

System.out.println(imgBN.text()); // does not work: how do I get text() from a BufferedImage?

RaiMan (raimund-hocke) said :
#3

- correct transcription to Java:

Image imgForOCR= Image.create(screen.capture(new Screen().selectRegion())).convertImageToGrayscale(img).resize(factor);

System.out.println(imgForOCR.text());

carl (maibannato) said :
#4

--Image imgForOCR= Image.create(screen.capture(new Screen().selectRegion())).convertImageToGrayscale(img).resize(factor);

Image.create does not exist.

I also tried to look at:
https://docs.oracle.com/javase/8/docs/api/java/awt/Image.html

I have spent the past two days looking for an equivalent function but could not find a solution.
Is there an equivalent function that can replace Image.create(screenImage)?

RaiMan (raimund-hocke) said :
#5

Sorry, but I am talking about the class
org.sikuli.script.Image

carl (maibannato) said :
#6

I cannot get past this problem, sorry RaiMan and community!

MY CODE:

import java.awt.image.BufferedImage;
import org.sikuli.script.Image;
import org.sikuli.script.Region;
import org.sikuli.script.Screen;

public class due {
 public static void main(String[] args) {
  Screen screen = new Screen();
  Image imgForOCR= Image.create(screen.capture(new Screen().selectRegion())).convertImageToGrayscale(img).resize(200);
 }
}

MY ERROR:

1) The method create(Image) in the type Image is not applicable for the arguments (ScreenImage)
                                   //import org.sikuli.script.Image;
2) img cannot be resolved to a variable

SOME INFO:
A) I use sikulixsetup-1.1.0
B) I use http://doc.sikuli.org/javadoc/ // I can't find the Image class there, so I cannot find my way around the API

I hope to see the OCR system grow // <3

Best RaiMan (raimund-hocke) said :
#7

--- javadocs
according to the 1.1.0 nightly build page the javadocs are here:
http://nightly.sikuli.de/docs/index.html

This should work (sorry for the misleading hint about Image.create(ScreenImage)):

Image img = new Image(screen.capture(screen.selectRegion()));
Image imgForOCR = img.convertImageToGrayscale(img.get()).resize(3);

carl (maibannato) said :
#8

Thanks RaiMan, that solved my question.