Gimp-Forum.net
Text to OCR - offline - Printable Version




Text to OCR - offline - Krikor - 12-12-2019

Hi,

Could anyone recommend free Windows software that converts text (txt, pdf, etc.) to OCR offline?

So far I have only found options for online or mobile use.

Are any compatible with GIMP?

Thank you.


RE: Text to OCR - offline - Ofnuts - 12-13-2019

What kind of OCR? There are OCR fonts, there is a Gnu barcode generator...


RE: Text to OCR - offline - Krikor - 12-13-2019

(12-13-2019, 10:55 AM)Ofnuts Wrote: What kind of OCR? There are OCR fonts, there is a Gnu barcode generator...
I just want to extract the text from images (scanned books as jpg, tiff, etc.) and translate it (paste it into a translator).

On mobile I have this feature, but for the Windows desktop I can only find online options, many of them unreliable.


RE: Text to OCR - offline - Ofnuts - 12-14-2019

The FOSS software that seems the most used for this in the Linux world is called "Tesseract". A version for Windows can be found here.


RE: Text to OCR - offline - rich2005 - 12-14-2019

The problem with plain Tesseract is that it is a command-line tool. That is obviously the best way to OCR a whole book. One drawback is loss of formatting: you tend to get long lines of text with no breaks, no headings, etc.
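For the whole-book case, the command-line route can be scripted. The following is only a rough sketch, assuming `tesseract` is installed and on the PATH and that the scanned pages sit in one folder as image files. The function names, the suffix list, and the `eng` default are illustrative choices, not anything from the posts above. It relies on the basic `tesseract IMAGE OUTPUTBASE -l LANG` invocation, where Tesseract itself appends `.txt` to the output base.

```python
import shutil
import subprocess
from pathlib import Path

# Assumed set of scan formats; adjust to whatever the scanner produced.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def build_command(image, out_base, lang="eng"):
    """Return the argument list for one `tesseract IMAGE OUTPUTBASE -l LANG` call."""
    # Tesseract adds the ".txt" extension to the output base on its own.
    return ["tesseract", str(image), str(out_base), "-l", lang]


def ocr_folder(folder, lang="eng"):
    """OCR every image in `folder`, one page per call, in file-name order."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract was not found on PATH")
    pages = sorted(
        p for p in Path(folder).iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
    )
    for page in pages:
        # page_001.jpg -> page_001.txt, written next to the image.
        subprocess.run(build_command(page, page.with_suffix(""), lang), check=True)
    return [page.with_suffix(".txt") for page in pages]
```

Calling `ocr_folder("scans")` (a hypothetical folder name) would leave one text file per page, ready to paste into a translator. The language code has to match a language data file that is actually installed, and the loss of formatting described above still applies to the output.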

I use it on Linux for small screen-captured text images through a GUI (I prefer YAGF, but it does not work in 'buntu 18.04, so I use gImageReader).

A screen capture always needs some pre-processing in GIMP: scaling up 200%-300%, cleaning the background, etc.
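To make those two pre-processing steps concrete, here is a minimal pure-Python sketch of what they do to the pixels. It is not how GIMP implements them: it assumes a grayscale image held as a list of rows of 0-255 values, uses plain nearest-neighbour enlargement where GIMP would normally interpolate, and stands in a simple threshold for "clean background". The function names and the default threshold of 128 are made up for the illustration.

```python
def scale_up(pixels, factor):
    """Enlarge a grayscale image (rows of 0-255 values) by an integer factor.

    Nearest-neighbour: every pixel becomes a factor x factor block,
    so factor=2 or factor=3 corresponds to scaling to 200% or 300%.
    """
    scaled = []
    for row in pixels:
        # Repeat each value horizontally...
        wide_row = [value for value in row for _ in range(factor)]
        # ...then repeat the widened row vertically.
        scaled.extend(list(wide_row) for _ in range(factor))
    return scaled


def clean_background(pixels, threshold=128):
    """Turn values at or above `threshold` white (255) and the rest black (0)."""
    return [[255 if value >= threshold else 0 for value in row] for row in pixels]


# Tiny 2x2 example: light background pixels and dark "ink" pixels.
tiny = [[200, 40], [90, 230]]
prepared = clean_background(scale_up(tiny, 3))
```

In practice the same effect comes from Image > Scale Image and a threshold or levels adjustment in GIMP. The point is only that OCR tends to work better on larger glyphs against a uniform background.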

There is a Tesseract for Windows with GUI here: https://ocr.space/blog/p/free-ocr-windows.html

A quick try-out in a Win10 VM (https://i.imgur.com/H7fvKCu.jpg) gave a typical result: some post-OCR corrections are needed. Still better than typing out the whole thing ;)


RE: Text to OCR - offline - Krikor - 12-14-2019

(12-14-2019, 12:33 AM)Ofnuts Wrote: A version for Windows can be found here.

(12-14-2019, 10:26 AM)rich2005 Wrote: There is a Tesseract for Windows with GUI here:  https://ocr.space/blog/p/free-ocr-windows.html
Ofnuts and Rich2005,

I have already downloaded and installed both (Tesseract and a9t9).
I will take my time and try them out later.
Thanks a lot for the help!