Command-line, quality-first PDF optimizer. Drop PDFs into input/, get optimized results in output/. Ghostscript + qpdf with optional PSNR quality gate and a never-worse guarantee.

laguileracl/pdf-ultra-compressor

🚀 PDF Ultra Compressor

Command-line, quality-first PDF optimizer for text- and image-heavy PDFs. Drop files into input/ and get optimized results in output/. The focus is maximum size reduction without perceptible quality loss, backed by strict “never worse” guards. See docs/ for more details; longer documentation lives in the Wiki (Home, Usage, Quality Gates, Roadmap).

Keywords: pdf compression, pdf optimizer, ghostscript, qpdf, ocr, jbig2, jpeg2000, lossless, high quality, macos, linux, ci, command line

Features

  • Drop-in folder workflow: put PDFs in input/, get results in output/.
  • Multi-pass strategy: Ghostscript (prepress/printer/ebook) + qpdf.
  • Quality-first scoring with “never worse” safeguard (copies original if no gain).
  • Optional perceptual quality gate (PSNR) to prevent visible degradation.
  • Anonymous telemetry, enabled by default, records technical, privacy-safe metrics locally to help improve the algorithms. Opt out with --disable-telemetry.
  • Anti-noise mode suppresses artifacts in optimized PDFs (text/gray-safe filters and optional grayscale). Enable with --anti-noise.
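
To make the multi-pass idea concrete, here is a minimal sketch of how one Ghostscript + qpdf command pair per preset could be assembled. It is not taken from compressor.py: the function names (`ghostscript_cmd`, `qpdf_cmd`, `plan_passes`), the intermediate file naming, and the exact flag selection are assumptions for illustration. The flags themselves are standard Ghostscript and qpdf options. The sketch only builds argument lists and does not run anything.

```python
from pathlib import Path

# Ghostscript -dPDFSETTINGS presets named in the feature list above.
GS_PRESETS = ("prepress", "printer", "ebook")


def ghostscript_cmd(src: Path, dst: Path, preset: str) -> list[str]:
    """Build a Ghostscript pdfwrite command for one preset (hypothetical helper)."""
    if preset not in GS_PRESETS:
        raise ValueError(f"unknown preset: {preset}")
    return [
        "gs",
        "-sDEVICE=pdfwrite",
        f"-dPDFSETTINGS=/{preset}",
        "-dNOPAUSE",
        "-dBATCH",
        "-dQUIET",
        f"-sOutputFile={dst}",
        str(src),
    ]


def qpdf_cmd(src: Path, dst: Path) -> list[str]:
    """Build a qpdf command for structural cleanup and linearization."""
    return [
        "qpdf",
        "--object-streams=generate",
        "--linearize",
        str(src),
        str(dst),
    ]


def plan_passes(src: Path, workdir: Path) -> list[list[str]]:
    """Return a Ghostscript + qpdf command pair per preset, without executing them."""
    commands = []
    for preset in GS_PRESETS:
        gs_out = workdir / f"{src.stem}.{preset}.gs.pdf"  # assumed naming scheme
        final = workdir / f"{src.stem}.{preset}.pdf"
        commands.append(ghostscript_cmd(src, gs_out, preset))
        commands.append(qpdf_cmd(gs_out, final))
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)`; the resulting files are the candidates that the scoring step chooses between.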

Highlights

  • 🎯 Smart multi-pass pipeline: Ghostscript + qpdf
  • 🧠 Quality-first scoring: selects the best candidate (size vs. visual safety)
  • 📂 Zero-config workflow: input/output (processed moved to input/processed/)
  • 🧹 Structural cleanup and linearization when possible
  • 🛡️ Never-worse guarantee: falls back to original if not improved
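
The scoring and never-worse behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual scoring code: `Candidate`, `pick_best`, and the 35 dB threshold used in the usage note are hypothetical, and `psnr` works on flat sequences of 8-bit samples rather than on rendered pages.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


def psnr(reference: Sequence[int], candidate: Sequence[int], peak: int = 255) -> float:
    """PSNR in dB between two equally sized sequences of 8-bit samples."""
    if len(reference) != len(candidate) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    if mse == 0:
        return math.inf  # identical samples
    return 10 * math.log10(peak ** 2 / mse)


@dataclass
class Candidate:
    name: str
    size_bytes: int
    psnr_db: Optional[float] = None  # None: quality was not measured


def pick_best(
    original_size: int,
    candidates: Sequence[Candidate],
    min_psnr_db: Optional[float] = None,
) -> Optional[Candidate]:
    """Return the smallest acceptable candidate, or None to keep the original."""
    best = None
    for cand in candidates:
        # Optional quality gate: reject candidates measured below the threshold.
        if (
            min_psnr_db is not None
            and cand.psnr_db is not None
            and cand.psnr_db < min_psnr_db
        ):
            continue
        # Never-worse guard: a candidate must be strictly smaller than the original.
        if cand.size_bytes >= original_size:
            continue
        if best is None or cand.size_bytes < best.size_bytes:
            best = cand
    return best
```

With a hypothetical gate of 35 dB, a 400-byte candidate at 28 dB is rejected in favor of a 700-byte candidate at 41 dB. When `pick_best` returns `None`, the caller copies the original to output/ unchanged.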

Quick Start (macOS)

Install system tools (recommended):

brew install ghostscript qpdf
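
To confirm that both tools are on your PATH before the first run, a small preflight check like the one below can help. It is a sketch, not part of compressor.py; `missing_tools` is a hypothetical helper, and it assumes the Ghostscript executable is named `gs`, as it is on macOS and Linux.

```python
import shutil

# Executable names assumed for macOS/Linux installs.
REQUIRED_TOOLS = ("gs", "qpdf")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of required executables that are not found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("Ghostscript and qpdf found.")
```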

Then run:

# Put PDFs in input/
cp ~/Downloads/my.pdf input/

# Run the compressor (English v1)
python3 compressor.py

# Results in output/
ls output/

Telemetry is enabled by default and stores anonymized, technical-only data in telemetry_data/ locally. To opt out:

python3 compressor.py --disable-telemetry

To reduce compression artifacts/noise in the output (helpful for scanned text docs):

python3 compressor.py --anti-noise

Folder Layout

pdf-ultra-compressor/
├─ input/                 # Place PDFs here
│  └─ processed/          # Processed originals are moved here
├─ output/                # Optimized PDFs are written here
├─ compressor.py          # Primary CLI optimizer (English v1)
├─ ci/                    # Smoke test
├─ install_tools.sh       # macOS helper to install ghostscript & qpdf
└─ docs & meta
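
The folder workflow implied by this layout can be sketched as below. This is an assumption-laden illustration rather than the logic in compressor.py: `run_folder` is a hypothetical name, and `optimize` stands in for whatever step produces the optimized file.

```python
import shutil
from pathlib import Path
from typing import Callable


def run_folder(root: Path, optimize: Callable[[Path, Path], None]) -> list[Path]:
    """Optimize every PDF in input/, write to output/, archive originals."""
    input_dir = root / "input"
    output_dir = root / "output"
    processed_dir = input_dir / "processed"
    output_dir.mkdir(parents=True, exist_ok=True)
    processed_dir.mkdir(parents=True, exist_ok=True)

    results = []
    # Non-recursive glob, so files already in input/processed/ are skipped.
    for src in sorted(input_dir.glob("*.pdf")):
        dst = output_dir / src.name
        optimize(src, dst)
        # Move the original out of the way once its result exists.
        shutil.move(str(src), str(processed_dir / src.name))
        results.append(dst)
    return results
```

For a dry run, passing `shutil.copyfile` as `optimize` exercises the folder handling without invoking Ghostscript or qpdf.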

Typical Results

  • Scanned documents: 40–70% reduction
  • Image-heavy PDFs: 30–60% reduction
  • Mostly text PDFs: 10–30% reduction
  • Visual quality: preserved; never-worse guarantee (PSNR gate optional)
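
The reduction figures above are percentages of the original file size. A small helper makes the arithmetic explicit; `reduction_percent` is a hypothetical name, and the sizes in the usage note are made up for illustration.

```python
def reduction_percent(original_bytes: int, optimized_bytes: int) -> float:
    """Size reduction as a percentage of the original file size."""
    if original_bytes <= 0:
        raise ValueError("original size must be positive")
    return 100.0 * (original_bytes - optimized_bytes) / original_bytes
```

For example, a 10,000,000-byte scan that shrinks to 4,000,000 bytes gives `reduction_percent(10_000_000, 4_000_000) == 60.0`, which falls in the scanned-document range.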

Roadmap

  • Add OCRmyPDF + JBIG2 for scanned PDFs (MRC-style pipeline)
  • Perceptual quality gates with SSIM/LPIPS (PSNR already available)

Contributing

Contributions are welcome! Please read CONTRIBUTING.md and open an issue or pull request.

License

MIT — see LICENSE.

Community & Discussions

Have questions, feature ideas, or want to share results? Join the project Discussions: https://github.com/laguileracl/pdf-ultra-compressor/discussions

  • Announcements: pinned “Welcome & Roadmap”
  • Q&A: ask questions
  • Ideas: feature proposals
