JPEG to GIMP to PDF Quality Question
#1
Hi,
I need to scan a massive load of textbooks, and am trying to figure out the best way to do it.

I have made some test scans to JPEG, 300 dpi, colour. They look kind of OK, though obviously pixelated when zoomed in.

But I imported those same JPEGs to GIMP, and exported as a PDF, and it made them look a load better, smoother.

The 4 JPEGS are 1.95 MB together

But the PDF from GIMP is 13.2 MB

I gather this might have something to do with GIMP converting the JPEGs into Bitmap (is it?).

I just tried scanning one of the pages as a 300 dpi BMP and it looks all pixelated like the JPEGs do.

I want the nice smooth PDF but without the much bigger file size. Is that possible?...

Cheers

[Image: lEsjpxf.png]
#2
What are the size in pixels and the print definitions of the images involved? It is possible that your process scales the image up, which blurs it.
#3
(02-25-2021, 08:34 PM)Ofnuts Wrote: What are the size in pixels and the print definitions of the images involved? It is possible that your process scales the image up, which blurs it.

For the JPEGs in GIMP (after exporting as PDF):
If I go to File - Print - Image Settings it says:

Width 186.95 mm
Height 276.43 mm

X Resolution - 321.054
Y Resolution - 321.054

And if I go to Image - Print Size

Width 200.07 mm
Height 295.83 mm

X Resolution - 300
Y Resolution - 300

--------
One of the JPEGs in Properties says:
Width 2363 pixels
Height 3494 pixels

300 dpi

Bit depth 24
----------------------
PDF Document Properties:
PDF Producer - cairo 1.17.3 (http://cairographics.org)
PDF Version - 1.5
File Size 13.29 MB
Number of Pages - 4
Page Size 20.01 x 29.58 cm
--------------------------
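A quick sanity check on the numbers above, as a sketch: print size in mm is just pixels / ppi × 25.4, so both sets of dimensions describe the same 2363 × 3494 pixel image at two different resolutions. If that is right, nothing has been resampled on the GIMP side. The helper name `print_size_mm` is made up for illustration.

```python
MM_PER_INCH = 25.4

def print_size_mm(pixels: int, ppi: float) -> float:
    """Physical size in mm of `pixels` printed at `ppi` pixels per inch."""
    return pixels / ppi * MM_PER_INCH

width_px, height_px = 2363, 3494  # from the JPEG properties above

for label, ppi in (("Image - Print Size", 300.0), ("File - Print", 321.054)):
    w = print_size_mm(width_px, ppi)
    h = print_size_mm(height_px, ppi)
    print(f"{label}: {w:.2f} x {h:.2f} mm at {ppi} ppi")
```

This reproduces 200.07 x 295.83 mm at 300 ppi (which is also the 20.01 x 29.58 cm PDF page size) and 186.95 x 276.43 mm at 321.054 ppi, so the print dialogue is only shrinking the same pixels to fit the printable area.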

In GIMP it's still pixelated, until it is exported.

I could upload the GIMP file and the originals if you want. Can you do that on this forum?

Bit clearer view of the difference when zoomed right in:

[Image: TnVIW7O.png]
#4
Quote:I need to scan a massive load of textbooks, and am trying to figure out the best way to do it.

Best way? Probably not Gimp, but what format are the textbooks? Mostly text with a few diagrams? A mixture of text and pictures? Mostly colour pictures?

Quote:I have made some test scans to JPEG, 300dpi, Colour. Which look kind of OK, pixelated obviously when zoomed in.

300 ppi is photo quality. Anything will look pixelated to some degree under extreme magnification, so judge it at normal reading size.

Quote:But I imported those same JPEGs to GIMP, and exported as a PDF, and it made them look a load better, smoother. The 4 JPEGS are 1.95 MB together But the PDF from GIMP is 13.2 MB
I gather this might have something to do with GIMP converting the JPEGs into Bitmap (is it?).

Gimp 'embeds' the scanned jpeg images in the PDF 'wrapper' but not with jpeg compression. It is compressed but not as much. A little bit about it here. https://gitlab.gnome.org/GNOME/gimp/-/issues/6117

Quote:I want the nice smooth PDF but without the much bigger file size. Is that possible?...

How to make the files smaller? Stick with JPEG. For photo quality use 300 ppi, but for pure text and line illustrations use a greyscale scan.
If the scans are not already made, try scanning directly into LibreOffice https://www.libreoffice.org/ (AFAIK the Windows TWAIN scan is back to working). LO has very good PDF exporting functions.
If the scans are already made, then command line ImageMagick https://imagemagick.org will join a series of JPEGs into a single PDF. The size is not much more than the sum of the image sizes.

Code:
convert *.jpg -units pixelsperinch -density 300  file.pdf
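One caveat with `*.jpg` across thousands of pages: the shell expands it in plain alphabetical order, so `page10.jpg` comes before `page2.jpg` unless the file names are zero-padded. Below is a minimal Python sketch that orders the files numerically and builds the same command as above. The names `natural_key` and `build_convert_command`, the `scans` folder and `book.pdf` are all made up for illustration, and newer ImageMagick 7 installs generally expect `magick` in place of `convert`.

```python
import re
from pathlib import Path

def natural_key(name: str):
    """Sort key that compares digit runs as numbers: page2 < page10."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]

def build_convert_command(files, output, density=300, exe="convert"):
    """Argument list equivalent to the one-liner above, pages in natural order."""
    ordered = sorted((str(f) for f in files), key=natural_key)
    return [exe, *ordered, "-units", "pixelsperinch",
            "-density", str(density), str(output)]

if __name__ == "__main__":
    pages = Path("scans").glob("*.jpg")  # hypothetical folder of scans
    cmd = build_convert_command(pages, "book.pdf")
    print(" ".join(cmd))  # check the page order before running anything
```

Once the printed order looks right, the list can be handed to `subprocess.run(cmd, check=True)`, one call per book.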

Just a note about opening PDFs in Gimp: the default is 100 ppi even when the PDF was made at 300 ppi. You can increase the ppi in the Open PDF dialogue.
#5
Thanks

Quote:What type 'format' are the textbooks?

Handwriting in pen and pencil.

Quote:300 ppi is photo quality, anything will look pixelated to a certain degree under extreme magnification but consider viewing at normal reading size.

I know, but the PDF version just looks better all round, especially zoomed in. It looks more natural, like looking at the actual piece of paper, and not a pixelated JPEG. I just want to get PDFs like that if possible, but not if the files are going to be so much bigger.

Quote:Gimp 'embeds' the scanned jpeg images in the PDF 'wrapper' but not with jpeg compression. It is compressed but not as much. A little bit about it here. https://gitlab.gnome.org/GNOME/gimp/-/issues/6117

The options for compression etc. that they are discussing there would be very helpful.
I don’t have a clue how it is smoothing things out nicely though. Is the data still JPEG wrapped in a PDF, or was it changed to BMP and wrapped in a PDF?

Quote:How to make smaller ? Stick with jpeg and for photo quality, 300 ppi but for pure text and line illustrations use a greyscale scan.

That won’t get things smooth like the PDF. And I want each book to end up in one PDF, not a folder full of JPEGs for each book.

Quote:If the scans are not already made, try scanning directly into LibreOffice https://www.libreoffice.org/ (AFAIK the Windows TWAIN scan is back to working) LO has very good PDF exporting functions.

I’ve got LibreOffice, but I can’t scan into it because the drivers are a bit duff. (Canon can't be bothered to update them, probably to force people to buy a newer scanner. If I go to scan from Windows 10 it says "You need a WIA driver to use this device". Scanning with the Canon software works though.) I tried importing the JPEGs into LibreOffice and Draw. The problem is the images are not perfect A4 size and I would have to move each one into place. My estimate is 17600 pages. NOT a chance am I moving each one into place manually lol… No batch import in LibreOffice either >.<

Plus still no smoothing that way. Kind of wish I never saw the smoothed version now, because I want it.

Quote:Just a note about opening pdf's in Gimp. The default ppi is 100 even when the pdf was made with 300 ppi. You can increase the ppi in the Open PDF dialogue.

Yeah I saw that somewhere else. I didn’t import a PDF though, only JPEGs.

I’ve been trying out a load of JPEG to PDF programs.

Some make uncompressed PDFs but they make big files (Same 4 JPEGS is 97 MB in one of them), and no smoothing like with GIMP's. I haven't found any that just put the JPEGs into the PDF wrapper without messing with the quality at all. Found one that makes minimal difference though.
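For what it's worth, 97 MB is about what the raw pixels alone would weigh: an uncompressed 24-bit image takes width × height × 3 bytes. The sketch below assumes all four pages are the same 2363 × 3494 pixels as the one JPEG whose properties were posted earlier in the thread, which may not be exactly true.

```python
def raw_size_bytes(width_px: int, height_px: int, channels: int = 3) -> int:
    """Uncompressed size of an 8-bit-per-channel image, ignoring file headers."""
    return width_px * height_px * channels

pages = 4
per_page = raw_size_bytes(2363, 3494)    # 24-bit RGB
total_mb = pages * per_page / 1_000_000  # decimal megabytes

print(f"raw, per page : {per_page / 1_000_000:.1f} MB")
print(f"raw, 4 pages  : {total_mb:.1f} MB")
print(f"GIMP PDF / raw: {13.29 / total_mb:.0%}")
print(f"JPEGs / raw   : {1.95 / total_mb:.0%}")
```

That comes to roughly 99 MB of raw data for the four pages. The 13.29 MB GIMP PDF is about 13% of that and the JPEGs about 2%, which fits the earlier explanation that GIMP compresses the images in the PDF, just not with JPEG compression.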
#6
Quote: Some make uncompressed PDFs but they make big files (Same 4 JPEGS is 97 MB in one of them), and no smoothing like with GIMP's.

I know you say you are not opening a PDF in Gimp but be careful what you are comparing.  Gimp can re-render and introduce antialiasing (smoothing).

There is always a trade-off between quality and file size. When exporting from Gimp, the factors are pixels-per-inch (ppi) and colour mode, RGB or greyscale (gs).

A typical two-page A4 PDF from Gimp:
@ 300 ppi: RGB = 10 MB, gs = 3.8 MB
@ 200 ppi: RGB = 3.7 MB, gs = 2 MB

That is a considerable saving in size from going greyscale and reducing the ppi.
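Why those two settings help, as a back-of-the-envelope sketch: the pixel count scales with the square of the ppi, and greyscale stores one channel instead of three. This only models the raw data. Compressed sizes depend on the content, so the measured figures above will not follow it exactly. The function name is made up for illustration.

```python
def relative_raw_size(ppi: float, channels: int,
                      ref_ppi: float = 300.0, ref_channels: int = 3) -> float:
    """Raw data size relative to a reference scan of the same page."""
    return (ppi / ref_ppi) ** 2 * (channels / ref_channels)

for ppi, channels, label in ((300, 3, "300 ppi RGB"), (300, 1, "300 ppi gs"),
                             (200, 3, "200 ppi RGB"), (200, 1, "200 ppi gs")):
    print(f"{label}: {relative_raw_size(ppi, channels):.0%} of the raw data")
```

Dropping from 300 to 200 ppi leaves 4/9 of the pixels, and going greyscale on top of that leaves about 15% of the original raw data.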

Quote:...I haven't found any that just put the JPEGs into the PDF wrapper without messing with the quality at all. Found one that makes minimal difference though.

It is a different situation when using ImageMagick (IM), where the files are added to the PDF wrapper.
For a JPEG the quality setting is not linear: you can get a sizeable reduction in file size by going down from 90 to 80 and visually see very little difference.

For that one-page 300 ppi file:
RGB: quality 90 = 1 MB, quality 80 = 0.74 MB
gs: quality 90 = 0.7 MB, quality 80 = 0.47 MB
For IM, that two-page PDF could range from 1.8 MB at 300 ppi RGB down to 0.76 MB at 200 ppi gs.
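Scaled up to the 17,600 pages mentioned earlier in the thread, the sizes quoted so far give roughly the following totals. This is only a sketch: it assumes every page compresses like the test pages, which real scans will not do exactly.

```python
PAGES = 17_600  # estimate given earlier in the thread

# (description, size in MB, number of pages that size covers)
samples = [
    ("original JPEGs",            1.95, 4),
    ("GIMP PDF from those JPEGs", 13.29, 4),
    ("GIMP PDF, 300 ppi RGB",     10.0, 2),
    ("IM PDF, 300 ppi RGB",       1.8, 2),
    ("IM PDF, 200 ppi gs",        0.76, 2),
]

def total_gb(size_mb: float, pages_in_sample: int, pages: int = PAGES) -> float:
    """Extrapolate a sample's size to `pages` pages, in decimal gigabytes."""
    return size_mb / pages_in_sample * pages / 1000

for name, size_mb, n in samples:
    print(f"{name:28s} ~{total_gb(size_mb, n):5.1f} GB")
```

On these figures the whole collection is somewhere under 10 GB as JPEG-based PDFs, against roughly 60 to 90 GB going through the GIMP export.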

At the end of the day, to a certain extent it is all in the eye of the beholder.

edit: just as a comparison, the two extremes in a PDF viewer, both at 100% view size

   

Both readable, all depends on the content and what is considered acceptable.

