Text to OCR - offline
#1
Hi,

Could anyone recommend free software for Windows that runs OCR offline on files (txt, pdf, etc.)?

So far I have only found options for online or mobile use.

Is there anything compatible with Gimp?

Thank you.
#2
What kind of OCR? There are OCR fonts, there is a Gnu barcode generator...
#3
(Yesterday, 10:55 AM)Ofnuts Wrote: What kind of OCR? There are OCR fonts, there is a Gnu barcode generator...
I just want to extract the text from images (scanned books as jpg, tiff, etc.) and translate it (paste it into a translator).

On mobile I have this feature, but for desktop Windows I can only find online options, and many of them are unreliable.
#4
The FOSS software that seems the most used for this in the Linux world is called "Tesseract". A version for Windows can be found here.
#5
The problem with basic Tesseract is that it is command-line only. That is obviously the best way when OCR-ing a whole book. One drawback is loss of formatting: you tend to get long lines of text with no breaks, no headings, etc.
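For the whole-book case, a rough Python sketch of driving the command line in a loop might look like this. It assumes the tesseract executable is installed and on the PATH, that the book is a folder of page images, and that English ("eng") is the language wanted. The function names and the folder names "scans" and "text" are made up for illustration.

```python
import subprocess
from pathlib import Path

# Image types to treat as scanned pages (an assumption; adjust as needed).
IMAGE_EXTS = {".jpg", ".jpeg", ".tif", ".tiff", ".png"}


def build_ocr_commands(image_paths, out_dir, lang="eng"):
    """Return one tesseract command (as an argument list) per page image."""
    out_dir = Path(out_dir)
    commands = []
    for image in sorted(Path(p) for p in image_paths):
        if image.suffix.lower() not in IMAGE_EXTS:
            continue  # skip anything that is not a page image
        # tesseract takes an output base name and appends ".txt" itself
        out_base = out_dir / image.stem
        commands.append(["tesseract", str(image), str(out_base), "-l", lang])
    return commands


def ocr_book(image_dir, out_dir, lang="eng"):
    """Run tesseract once per page image found in image_dir."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    pages = [p for p in Path(image_dir).iterdir() if p.is_file()]
    for command in build_ocr_commands(pages, out_dir, lang):
        subprocess.run(command, check=True)  # raises if tesseract fails


# Hypothetical usage: ocr_book("scans", "text")
```

Each page ends up as its own .txt file, so the loss of formatting mentioned above still applies; the loop only saves the typing.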

I use it in Linux for small 'screen captured' text images using a GUI (I prefer YAGF, but it is not working in 'buntu 18.04, so I use gImageReader).

A screen capture always needs some pre-processing in Gimp: scaling up 200% - 300%, cleaning the background, etc.
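To show what those two pre-processing steps do to the pixels, here is a toy sketch in plain Python on a grayscale grid (0 = black, 255 = white). It is not how Gimp does it internally. The function names, the nearest-neighbour scaling and the threshold of 128 are assumptions for illustration, and it assumes dark text on a light background.

```python
def upscale(pixels, factor):
    """Enlarge a grayscale grid by an integer factor (nearest neighbour)."""
    scaled = []
    for row in pixels:
        # repeat every pixel 'factor' times horizontally
        wide = [value for value in row for _ in range(factor)]
        # then repeat the widened row 'factor' times vertically
        scaled.extend(list(wide) for _ in range(factor))
    return scaled


def clean_background(pixels, threshold=128):
    """Push light pixels to pure white and dark pixels to pure black."""
    return [[255 if value >= threshold else 0 for value in row] for row in pixels]


# A tiny 2x3 "capture": grey-ish background with two dark text pixels.
capture = [
    [200, 40, 210],
    [190, 30, 220],
]
prepared = clean_background(upscale(capture, 3))  # 300% scale, then clean
```

A factor of 2 or 3 corresponds to the 200% - 300% scaling mentioned above. A real image editor would use smoother interpolation than this.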

There is a Tesseract for Windows with GUI here: https://ocr.space/blog/p/free-ocr-windows.html

A quick try-out in a Win10 VM (https://i.imgur.com/H7fvKCu.jpg) gave a typical result: some post-OCR corrections are needed. Still better than typing out the whole thing ;)