ofn-remove-grid
#1
ofn-remove-grid removes table borders from scanned text. It is found here.

It was spawned from this discussion.

As always, comments, suggestions and bug reports welcome.

Enjoy.
#2
Using the script I'm not getting good results for the image presented on this page:

What's the way to remove all lines and borders in image (keep texts) programmatically?
https://stackoverflow.com/questions/3394...ogrammatic

I tried values for the area threshold from 6500 to 9500, but most of the page gets deleted.

PS: When I manually run the steps given in post #4 - https://www.gimp-forum.net/Thread-Erase-...8#pid20168 the result is great!

[Image: G1uQx.png]
#3
(10-02-2020, 04:39 PM)Krikor Wrote: Using the script I'm not getting good results for the image presented on this page:

What's the way to remove all lines and borders in image (keep texts) programmatically?
https://stackoverflow.com/questions/3394...ogrammatic

I tried values for the area threshold from 6500 to 9500, but most of the page gets deleted.

PS: When I manually run the steps given in post #4 - https://www.gimp-forum.net/Thread-Erase-...8#pid20168 the result is great!

[Image: G1uQx.png]

Quoting the script doc:

Quote:Assumptions and algorithm

The script assumes that the area of grid squares is larger than the area of any visually isolated bits of text, that are normally individual characters. So it won’t work well if the image includes a very large text and a tiny table.

... which is pretty much the case with the image you show. That said, eyeballing the smallest grid cell (which appears to be the one at (1890,1020)), making a selection just inside it (which gives an area threshold of around 5400px), and then running the script removes this:

   

Which isn't too bad given the circumstances.

The one difference between ofn-remove-grid and ofn-path-filter-strokes is that ofn-remove-grid takes a shortcut to compute areas: it actually computes the area of the bounding box, whereas ofn-path-filter-strokes computes the area of the polygon (but with a formula that is only valid for convex polygons, so not for typical letters...).
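
To give an idea of how far apart the two measures can be, here is a minimal Python sketch. It is not code from either script: the function names are made up for illustration, and the polygon area uses the standard shoelace formula as a stand-in, which is not necessarily the formula ofn-path-filter-strokes uses.

```python
def bounding_box_area(points):
    """Area of the axis-aligned bounding box of a list of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


def shoelace_area(points):
    """Area of a simple (non self-intersecting) polygon, shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# A hypothetical L-shaped outline, roughly what a letter could look like
l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]

print(bounding_box_area(l_shape))  # 16
print(shoelace_area(l_shape))      # 7.0
```

For a concave shape like this one, the bounding box is more than twice the polygon area, so a threshold tuned for one measure won't carry over to the other.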
#4
Uploaded a new version that can also filter on width and height, for those cases where there are only horizontal or vertical dividers.
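
A rough Python sketch of what this kind of filtering could look like, for illustration only. This is not the script's code: `looks_like_grid` and its parameters are hypothetical names, and the rule that a stroke counts as grid as soon as any enabled threshold is reached is an assumption.

```python
def looks_like_grid(points, min_area=None, min_width=None, min_height=None):
    """Return True if the bounding box of the stroke reaches any enabled threshold.

    points: list of (x, y) tuples describing one stroke.
    A threshold left as None is ignored (assumed behavior, not the script's).
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    if min_area is not None and width * height >= min_area:
        return True
    if min_width is not None and width >= min_width:
        return True
    if min_height is not None and height >= min_height:
        return True
    return False


# Hypothetical strokes, given as bounding-box corners
character = [(0, 0), (10, 0), (10, 12), (0, 12)]   # 10 x 12, a letter
divider = [(0, 0), (500, 0), (500, 2), (0, 2)]     # 500 x 2, a horizontal rule
cell = [(0, 0), (100, 0), (100, 60), (0, 60)]      # 100 x 60, a grid cell

for stroke in (character, divider, cell):
    print(looks_like_grid(stroke, min_area=5400, min_width=200))
```

The horizontal rule has a tiny bounding-box area (1000px), so an area threshold alone would miss it, while a width threshold catches it.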