Some PDF files use suboptimal image compression; this page lists ways to fix that by recompressing or recreating the file. The methods described should be lossless; if a file cannot be compressed losslessly, it is better to submit the larger file.
See here for information on this program.
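This method reconstructs the PDF file by extracting its images and building a new PDF from them, so any text or vector content in the original is not carried over. The Ruby script below does this: it extracts the images with pdfimages, recompresses each one with convert (ImageMagick) using CCITT Group 4 by default, wraps each result in a single-page PDF with tiff2pdf, and joins the pages with pdftk. Group 4 is a bilevel (black-and-white) encoding, so the default options are lossless only for black-and-white scans.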
- #!/usr/bin/ruby
- require( 'fileutils' )
- BASICCONVERTOPTIONS = " -compress Group4"
- DELETEIGNOREFILE = false #Automatically delete files which grow in size after recompression?
- TMPDIRNAME = "tmpx139toslw"
- if ARGV[0].nil? #The NIL constant no longer exists in current Ruby versions
- $stderr.puts "Syntax: pdfcompress.rb <PDF file> ( <additional convert options> )"
- exit 1
- end
- if ARGV[1].nil?
- convertoptions = BASICCONVERTOPTIONS
- else
- convertoptions = ARGV[1] + BASICCONVERTOPTIONS
- end
- begin
- Dir.mkdir( TMPDIRNAME )
- $stderr.puts "Processing file " + ( file = ARGV[0] ) + "..."
- #Extract the images, then convert each one to a single-page PDF
- system( "pdfimages \"" + file +"\" " + File.join( TMPDIRNAME, "images" ) )
- Dir.glob( File.join( TMPDIRNAME, "*" ) ).each { |imagefile|
- $stderr.printf( "\rCompressing " + File.basename( imagefile ) + "..." );
- system( "convert #{convertoptions} \"" + imagefile + "\" \"" + imagefile.sub( /\.[^.]*$/, ".tiff" ) + "\"" )
- system( "tiff2pdf \"" + imagefile.sub( /\.[^.]*$/, ".tiff" ) + "\" -o \"" + imagefile.sub( /\.[^.]*$/, ".pdf" ) +"\"" )
- }
- $stderr.printf( "\n" );
- #Put them all together now
- $stderr.printf( "Combining PDF files... " );
- #Sort the page PDFs so they are joined in page order; Dir.glob does not guarantee an order on older Ruby versions
- system( "pdftk \"" + Dir.glob( File.join( TMPDIRNAME, "*.pdf" ) ).sort.join( "\" \"" ) + "\" cat output \"" + ( output_filename = File.basename( file, ".*" ) + ".2.pdf" ) + "\"" )
- $stderr.printf( "Done\n" );
- #Compare the sizes
- if( File.size( file ) > File.size( output_filename ) )
- $stdout.puts "Compressed file " + File.basename( file ) + " - Compressed from " + File.size( file ).to_s + " to " + File.size( output_filename ).to_s
- else
- $stdout.puts "Ignored file " + File.basename( file ) + " - Changed from " + File.size( file ).to_s + " to " + File.size( output_filename ).to_s
- File.delete( output_filename ) if DELETEIGNOREFILE
- end
- ensure
- #Clean up temp dir
- FileUtils.rm_rf( TMPDIRNAME )
- end
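The script above builds each command as a single shell string, so a file name containing a double quote or other shell metacharacters can break it. A minimal sketch of one way around this, assuming the same tools and options as the script, is to build an argument array per command and pass it to `system` with several arguments, which bypasses the shell. The names `BASIC_CONVERT_OPTIONS`, `with_ext`, `commands_for_image`, and `run!` are hypothetical helpers introduced for illustration; they are not part of the original script.

```ruby
require 'shellwords'

# Default convert options, taken from the script above.
BASIC_CONVERT_OPTIONS = %w[-compress Group4].freeze

# Replace the extension of path with ext (for example ".tiff").
# If path has no extension, ext is simply appended.
def with_ext(path, ext)
  path.sub(/\.[^.\/]*\z/, '') + ext
end

# Build the argument arrays for one extracted image.
# extra_options is the optional second command-line argument of the
# script, e.g. "-density 300"; it is split the way a shell would split it.
def commands_for_image(imagefile, extra_options = nil)
  tiff  = with_ext(imagefile, '.tiff')
  pdf   = with_ext(imagefile, '.pdf')
  extra = extra_options ? Shellwords.split(extra_options) : []
  [
    ['convert', *extra, *BASIC_CONVERT_OPTIONS, imagefile, tiff],
    ['tiff2pdf', '-o', pdf, tiff]
  ]
end

# Run one command without going through a shell and stop on failure.
def run!(argv)
  system(*argv) or raise "command failed: #{argv.join(' ')}"
end
```

Inside the image loop, the two `system` calls could then be replaced by `commands_for_image( imagefile, ARGV[1] ).each { |argv| run!( argv ) }`. Unlike the original, this sketch stops at the first failing command rather than continuing silently.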