Difference between revisions of "PDF, PS and DjVu"

From ArchWiki
(13 intermediate revisions by 9 users not shown)
Latest revision as of 07:29, 10 November 2019

This article covers software to view, edit and convert PDF, PostScript (PS), DjVu (déjà vu) and XPS files.

Engines

  • Poppler — PDF rendering library based on Xpdf.
https://poppler.freedesktop.org/ || poppler
  • libspectre — Small library for rendering Postscript documents.
https://www.freedesktop.org/wiki/Software/libspectre || libspectre
  • Ghostscript — Interpreter for PostScript and PDF. Provides the gs(1) command-line interface, see also /usr/share/doc/ghostscript/*/Use.htm (online), along with many wrapper scripts like ps2pdf and pdf2ps.
https://ghostscript.com/ || ghostscript
  • DjVuLibre — Suite to create, manipulate and view DjVu documents.
http://djvu.sourceforge.net/ || djvulibre
  • libgxps — GObject based library for handling and rendering XPS documents.
https://wiki.gnome.org/Projects/libgxps || libgxps

Viewers

Framebuffer

  • fbgs — Poor man's PostScript/pdf viewer for the linux framebuffer console.
https://www.kraxel.org/blog/linux/fbida/ || fbida
  • fbpdf — Small framebuffer PDF and DjVu viewer based on MuPDF, with Vim keybindings, written in C.
http://repo.or.cz/w/fbpdf.git || fbpdf-gitAUR
  • JFBView — Framebuffer PDF and image viewer. Features include Vim-like controls, zoom-to-fit, a TOC (outline) view, fast multi-threaded rendering and asynchronous pre-caching. Originally a fork of fbpdf called jfbpdf, now completely rewritten.
https://seasonofcode.com/pages/jfbview.html || jfbviewAUR

Graphical

Note: Some web browsers can display PDF files, for example with PDF.js.
  • Adobe Reader — Proprietary PDF file viewer offered by Adobe. Discontinued for Linux.
http://www.adobe.com/products/reader.html || acroreadAUR
  • apvlv — Lightweight document viewer with Vim keybindings. Supports PDF, DjVu, UMD and TXT.
https://naihe2010.github.io/apvlv/ || apvlvAUR
  • Atril — Simple multi-page document viewer for MATE. Supports DjVu, DVI, EPS, EPUB, PDF, PostScript, TIFF, XPS and Comicbook.
https://github.com/mate-desktop/atril || atril
  • DjView — Viewer for DjVu documents.
http://djvu.sourceforge.net/djview4.html || djview
  • ePDFView — Lightweight PDF document viewer using the Poppler and GTK libraries. Development stopped.
http://freecode.com/projects/epdfview || epdfview
  • Emacs — See also pdf-tools for improved PDF support and the djvu package for DjVu support.
https://www.gnu.org/software/emacs/ || emacs
  • Evince — Document viewer for GNOME. Supports DjVu, DVI, EPS, PDF, PostScript, TIFF, XPS and Comicbook.
https://wiki.gnome.org/Apps/Evince || evince
  • Foxit Reader — Small, fast (compared to Acrobat) proprietary PDF viewer.
https://www.foxitsoftware.com/pdf-reader/ || foxitreaderAUR
  • gv — Graphical user interface for the Ghostscript interpreter for viewing and navigating PostScript and PDF documents.
https://www.gnu.org/software/gv/ || gv
  • llpp — Very fast PDF reader based on MuPDF that supports continuous page scrolling, bookmarking, and text search through the whole document.
http://repo.or.cz/w/llpp.git || llpp
  • MuPDF — Very fast EPUB, FictionBook, PDF, XPS and Comicbook viewer written in portable C. Features CJK font support.
https://mupdf.com/ || mupdf
  • Okular — Universal document viewer for KDE. Supports CHM, Comicbook, DjVu, DVI, EPUB, FictionBook, Mobipocket, ODT, PDF, Plucker, PostScript, TIFF and XPS.
https://okular.kde.org/ || okular
  • pdfpc — Presenter console with multi-monitor support for PDF files.
https://pdfpc.github.io/ || pdfpc
  • qpdfview — Tabbed document viewer. It uses Poppler for PDF support, libspectre for PS support, DjVuLibre for DjVu support, CUPS for printing support and the Qt toolkit for its interface.
https://launchpad.net/qpdfview || qpdfview
  • Xpdf — Viewer that can decode LZW and read encrypted PDFs.
http://www.xpdfreader.com/ || xpdf
  • Xreader — Document viewer part of the X-Apps Project. Supports DjVu, DVI, EPUB, PDF, PostScript, TIFF, XPS, Comicbook.
https://github.com/linuxmint/xreader/ || xreader
  • Zathura — Highly customizable and functional document viewer (plugin based). Supports PDF, DjVu, PostScript and Comicbook.
https://pwmt.org/projects/zathura/ || zathura

Comparison

The factual accuracy of this article or section is disputed.

Reason: Filling out PDF forms seems to be broken in MuPDF and llpp. (Discuss in Talk:PDF, PS and DjVu#)
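Name || PDF || PostScript || DjVu || XPS || PDF forms || License
Adobe Reader || custom || – || – || – || || proprietary
apvlv || Poppler || – || DjVuLibre || – || || GPLv2
Atril || Poppler || libspectre || DjVuLibre || libgxps || || GPLv2
DjView || – || – || DjVuLibre || – || || GPLv2
Emacs || Ghostscript* || Ghostscript* || DjVuLibre* || – || || GPLv3
ePDFView || Poppler || – || – || – || || GPLv2
Evince || Poppler || libspectre || DjVuLibre || libgxps || || GPLv2
Foxit Reader || custom || – || – || – || || proprietary
gv || Ghostscript || Ghostscript || – || – || || GPLv3
llpp || libmupdf || – || – || libmupdf || || GPLv3
MuPDF || custom || – || – || custom || || AGPLv3
Okular || Poppler || libspectre || DjVuLibre || custom || || GPL, LGPL
pdfpc || Poppler || – || – || – || || GPLv2
qpdfview || Poppler || libspectre* || DjVuLibre* || – || || GPLv2
Xpdf || custom || – || – || – || || GPLv3
Xreader || Poppler || libspectre* || DjVuLibre* || libgxps* || || GPLv2
Zathura || Poppler* / libmupdf* || libspectre* || DjVuLibre* || libmupdf* || No || zlib
(* means optional)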

PDF forms

The PDF forms column in the above table refers to AcroForms support. If you do not need your input to be directly extractable from the PDF, you can also use the applications in #Annotation or #Graphical PDF editing to put text on top of a PDF. PDF forms can be created with LibreOffice Writer (View > Toolbars > Form Controls) and the advanced PDF editors.

The proprietary and deprecated XFA format for forms is not fully supported by Poppler[1][2]; it is only supported by Adobe Reader and Master PDF Editor.

Annotation

  • flpsed — A PostScript and PDF annotator, only supports text boxes.
http://flpsed.org/flpsed.html || flpsedAUR

See also List of applications/Documents#Stylus note-taking.

Graphical PDF editing

  • Scribus can import and export PDF; text is imported as polygons.[3]
  • LibreOffice Draw can import and export PDF; text is imported as text; embedded fonts are substituted.[4][5]
  • Inkscape can import a single page from a PDF and export to PDF; text is imported as cloned glyphs or text; with the latter embedded fonts are substituted.
  • Graphics editors like GIMP and Krita can also import and export PDFs at the cost of rasterization.

Basic editors

  • PDF Arranger — Helps merge or split pdf documents and rotate, crop and rearrange pages. It is a maintained fork of PDF-Shuffler.
https://github.com/jeromerobert/pdfarranger || pdfarranger
  • PDF Chain — GTK front-end for PDFtk, written in C++, supporting concatenation, burst, watermarks, attaching files and more.
http://pdfchain.sourceforge.net/ || pdfchainAUR
  • PDF Mix Tool — Qt front-end for PoDoFo, written in C++, supports splitting, merging, rotating and mixing PDF files.
https://scarpetta.eu/pdfmixtool/ || pdfmixtool
  • PDF Mod — Reorder, rotate, and remove pages, export images from a document, edit the title, subject, author, and keywords, and combine documents via drag and drop.
https://wiki.gnome.org/Attic/PdfMod || pdfmod
  • PDFsam — Open source application, written in Java, supports merging, splitting and rotating.
https://pdfsam.org/ || pdfsam
  • PDF Slicer — Simple application to extract, merge, rotate and reorder pages of PDF documents.
https://junrrein.github.io/pdfslicer/ || pdfslicerAUR
  • PDF Tricks — Simple, efficient application for small manipulations in PDF files using Ghostscript.
https://github.com/muriloventuroso/pdftricks || pdftricks

Cropping tools

  • briss — Java GUI to crop pages of PDF documents to one or more regions selected.
http://sourceforge.net/projects/briss/ || brissAUR
  • krop — Simple graphical tool to crop the pages of PDF files.
http://arminstraub.com/software/krop || kropAUR
  • PdfHandoutCrop — Tool to crop pdf handout with multiple pages per sheet.
https://cges30901.github.io/pdfhandoutcrop/ || pdfhandoutcropAUR

Advanced editors

  • Master PDF Editor — Functional proprietary PDF editor. Free for non-commercial use.
https://code-industry.net/free-pdf-editor/ || masterpdfeditorAUR
  • PDF Studio — All-in-one proprietary PDF editor similar to Adobe Acrobat.
https://www.qoppa.com/pdfstudio/ || pdfstudioAUR

PDF tools

This article or section needs expansion.

Reason: Add cpdfAUR. (Discuss in Talk:PDF, PS and DjVu#)
  • PDFtk — Simple tool for doing everyday things with PDF documents
http://www.pdfhacks.com/pdftk || pdftk
  • Stapler — Light alternative to PDFtk using the PyPDF2 library.
https://github.com/hellerbarde/stapler || staplerAUR, stapler-gitAUR
  • QPDF — Content-preserving PDF transformation system.
https://github.com/qpdf/qpdf || qpdf
  • pdfgrep — Command-line utility to search text in PDF files.
https://pdfgrep.org/ || pdfgrep
  • pdf2svg — Convert PDF files to SVG files.
http://www.cityinthesky.co.uk/opensource/pdf2svg/ || pdf2svg
  • mupdf-tools — Tools developed as part of MuPDF, contains mutool(1) and muraster.
https://mupdf.com || mupdf-tools

Create a PDF from images

With GraphicsMagick:

$ gm convert 1.jpg 2.jpg 3.jpg out.pdf

Concatenate PDFs

With Ghostscript:

$ gs -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE=out.pdf -dBATCH 1.pdf 2.pdf 3.pdf

With PDFtk:

$ pdftk 1.pdf 2.pdf 3.pdf cat output out.pdf

With Poppler:

$ pdfunite 1.pdf 2.pdf 3.pdf out.pdf

With QPDF:

$ qpdf --empty --pages 1.pdf 2.pdf 3.pdf -- out.pdf
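When merging many files it can help to wrap one of the commands above in a small function. The following is a minimal sketch built around pdfunite; the function name merge_pdfs and the DRY_RUN variable are hypothetical and not part of Poppler.

```shell
# merge_pdfs OUT.pdf IN1.pdf IN2.pdf ...
# Hypothetical helper: joins the inputs in the given order with pdfunite.
# Set DRY_RUN=1 to print the command instead of running it.
merge_pdfs() {
    if [ "$#" -lt 3 ]; then
        echo "usage: merge_pdfs OUT.pdf IN1.pdf IN2.pdf ..." >&2
        return 2
    fi
    out=$1
    shift
    # Do not silently clobber an existing output file.
    if [ -z "${DRY_RUN:-}" ] && [ -e "$out" ]; then
        echo "merge_pdfs: refusing to overwrite $out" >&2
        return 1
    fi
    # pdfunite takes the inputs first and the output last.
    ${DRY_RUN:+echo} pdfunite "$@" "$out"
}
```

For example, merge_pdfs out.pdf chapter-*.pdf joins the files in the order the shell expands the glob, which is lexical, so zero-padded file names keep the pages in sequence.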

Convert a PDF to text

With Poppler and maintaining the layout:

$ pdftotext -layout in.pdf out.txt

See also pdftotext(1).

Decrypt a PDF

This section lists commands to decrypt a PDF to an unencrypted file. Note that most PDF viewers also support encrypted PDFs.

With PDFtk:

$ pdftk in.pdf input_pw password output out.pdf

With Poppler to PostScript:

$ pdftops -upw password in.pdf out.ps

With QPDF:

$ qpdf --decrypt --password=password in.pdf out.pdf
Tip: Forgotten passwords might be recovered with pdfcrack, see pdfcrack(1).

Encrypt a PDF

The user password is used for encryption; the owner password restricts operations once the document is decrypted. For more information, see Wikipedia:PDF#Security and signatures.

With PDFtk:

$ pdftk in.pdf output out.pdf user_pw password

With PoDoFo:

$ podofoencrypt -u user_password -o owner_password in.pdf out.pdf

With QPDF:

$ qpdf --encrypt user_password owner_password key_length -- in.pdf out.pdf

where key_length can be 40, 128 or 256.
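Since only three key lengths are accepted, a wrapper can reject anything else before qpdf is started. This is a sketch only; the function name encrypt_pdf and the DRY_RUN variable are hypothetical.

```shell
# encrypt_pdf USER_PW OWNER_PW KEY_LENGTH IN.pdf OUT.pdf
# Hypothetical helper: validates KEY_LENGTH, then runs the qpdf command
# shown above. Set DRY_RUN=1 to print the command instead of running it.
encrypt_pdf() {
    if [ "$#" -ne 5 ]; then
        echo "usage: encrypt_pdf USER_PW OWNER_PW KEY_LENGTH IN.pdf OUT.pdf" >&2
        return 2
    fi
    case $3 in
        40|128|256) ;;
        *)
            echo "encrypt_pdf: key length must be 40, 128 or 256" >&2
            return 2
            ;;
    esac
    ${DRY_RUN:+echo} qpdf --encrypt "$1" "$2" "$3" -- "$4" "$5"
}
```

Note that passwords given as command-line arguments may be visible to other local users in the process list while the command runs.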

Extract images from a PDF

With Poppler to JPEG:

$ pdfimages infile.pdf -j outfileroot

Extract page range from PDF, split multipage PDF document

With Ghostscript as a single file:[6]

$ gs -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dSAFER -dFirstPage=first -dLastPage=last -sOutputFile=outfile.pdf infile.pdf

With PDFtk as a single file:

$ pdftk infile.pdf cat first-last output outfile.pdf

With Poppler as separate files:

$ pdfseparate -f first -l last infile.pdf outfileroot-%d.pdf

With QPDF as a single file:

$ qpdf --empty --pages infile.pdf first-last -- outfile.pdf
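To split a long document into several files of a fixed number of pages, the QPDF command above can be run once per page range. The following sketch assumes qpdf is installed; the function names page_ranges and split_pdf are hypothetical.

```shell
# page_ranges TOTAL CHUNK
# Print "first-last" ranges that cover pages 1..TOTAL in steps of CHUNK.
page_ranges() {
    total=$1
    chunk=$2
    [ "$chunk" -ge 1 ] || return 2
    first=1
    while [ "$first" -le "$total" ]; do
        last=$((first + chunk - 1))
        # The final range may be shorter than CHUNK.
        if [ "$last" -gt "$total" ]; then
            last=$total
        fi
        echo "$first-$last"
        first=$((last + 1))
    done
}

# split_pdf IN.pdf CHUNK OUTROOT
# Write OUTROOT-1.pdf, OUTROOT-2.pdf, ... with at most CHUNK pages each.
split_pdf() {
    pages=$(qpdf --show-npages "$1") || return 1
    n=1
    page_ranges "$pages" "$2" | while read -r range; do
        qpdf --empty --pages "$1" "$range" -- "$3-$n.pdf" || exit 1
        n=$((n + 1))
    done
}
```

For example, split_pdf infile.pdf 10 part writes part-1.pdf with pages 1 to 10, part-2.pdf with pages 11 to 20, and so on.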

Imposing a PDF

PDF imposition can be done with pdfjam. For example, pdfnup reduces paper waste by placing several pages on one sheet, and pdfbook arranges PDFs into a format suitable for book binding.

Inspecting metadata

With ExifTool:

$ exiftool file.pdf

With Poppler:

$ pdfinfo file.pdf

Optimize, reduce size of a PDF

With Ghostscript, use one of:

$ ps2pdf -dPDFSETTINGS=/screen in.pdf out.pdf
$ gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/printer -sOutputFile=out.pdf in.pdf

For different settings see the documentation.

There is also shrinkpdfAUR, a script wrapping gs.
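The Ghostscript command above can be wrapped so that the result is kept only when it is actually smaller than the input. This is a rough sketch and not how shrinkpdf works; the function names and the quality names low, medium, high and max are hypothetical, while the -dPDFSETTINGS values are Ghostscript presets.

```shell
# preset_for LEVEL
# Map an informal quality name to a Ghostscript -dPDFSETTINGS preset.
preset_for() {
    case $1 in
        low)    echo /screen ;;
        medium) echo /ebook ;;
        high)   echo /printer ;;
        max)    echo /prepress ;;
        *)
            echo "preset_for: unknown level '$1'" >&2
            return 2
            ;;
    esac
}

# shrink_pdf LEVEL IN.pdf OUT.pdf
# Re-distill IN.pdf and discard OUT.pdf if it did not get smaller.
shrink_pdf() {
    preset=$(preset_for "$1") || return 2
    gs -dNOPAUSE -dBATCH -dQUIET -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
        "-dPDFSETTINGS=$preset" "-sOutputFile=$3" "$2" || return 1
    insize=$(wc -c < "$2")
    outsize=$(wc -c < "$3")
    if [ "$((outsize))" -ge "$((insize))" ]; then
        rm -f -- "$3"
        echo "shrink_pdf: no size reduction, output removed" >&2
        return 1
    fi
}
```

For example, shrink_pdf medium in.pdf out.pdf uses the /ebook preset. Lower presets downsample images more aggressively, so check the output before deleting the original.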

Rasterize a PDF

With GraphicsMagick to convert a specific page:

$ gm convert -density dpi infile.pdf[page] outfile.jpg

With Poppler to convert all pages:

$ pdftoppm -jpeg -r dpi infile.pdf outfileroot

With Poppler to convert a specific page:

$ pdftoppm -jpeg -r dpi -f page -singlefile infile.pdf outfileroot

Splitting PDF pages

With mupdf-tools to split every page vertically into two pages:

$ mutool poster -y 2 in.pdf out.pdf

Can be used to undo simple imposition.

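Add a signature or other image to a page of a PDF

An image can be added at any location in a PDF with ImageMagick (convert), xv and PDFtk. See the wrapper script at http://emmanuel.branlard.free.fr/work/linux/dev/SignPDF/SignPDF and the further hints at https://unix.stackexchange.com/questions/85873/how-can-i-add-a-signature-png-to-a-pdf-in-linux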

DjVu tools

  • DjVuLibre provides many command-line tools, like ddjvu(1) for example.
  • img2djvu — Single-pass DjVu encoder based on DjVuLibre and ImageMagick.
https://github.com/ashipunov/img2djvu || img2djvu-gitAUR
  • pdf2djvu — Creates DjVu files from PDF files.
https://jwilk.net/software/pdf2djvu || pdf2djvu

Convert DjVu to images

Break a DjVu document into separate pages:

$ djvmcvt -i input.djvu /path/to/out/dir output-index.djvu

Convert DjVu pages into images:

$ ddjvu --format=tiff page.djvu page.tiff

Convert DjVu pages into PDF:

$ ddjvu --format=pdf inputfile.djvu outputfile.pdf

You can also use --page to export specific pages:

$ ddjvu --format=tiff --page=1-10 input.djvu output.tiff

This converts pages 1 to 10 into a single TIFF file.
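To get one image file per page instead of a single multi-page file, ddjvu can be run once per page. The sketch below assumes DjVuLibre is installed and uses djvused to read the page count; the function names page_name and djvu_to_tiffs are hypothetical.

```shell
# page_name ROOT PAGE
# Build a zero-padded output name so that files sort in page order.
page_name() {
    printf '%s-%04d.tiff\n' "$1" "$2"
}

# djvu_to_tiffs IN.djvu ROOT
# Export every page of IN.djvu to ROOT-0001.tiff, ROOT-0002.tiff, ...
djvu_to_tiffs() {
    # The djvused command "n" prints the number of pages.
    pages=$(djvused -e n "$1") || return 1
    p=1
    while [ "$p" -le "$pages" ]; do
        ddjvu --format=tiff --page="$p" "$1" "$(page_name "$2" "$p")" || return 1
        p=$((p + 1))
    done
}
```

For example, djvu_to_tiffs input.djvu scan writes scan-0001.tiff, scan-0002.tiff and so on into the current directory.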

Processing images

You can use scantailor to:

  • fix orientation
  • split pages
  • deskew
  • crop
  • adjust margins

Make DjVu from images

There is a useful script, img2djvu-gitAUR:

$ img2djvu -c1 -d600 -v1 ./out

This creates a 600 DPI out.djvu from all files in the ./out directory.

Alternatively, you can try didjvuAUR, which seems to create smaller files, especially for images with a well-defined background.

PostScript tools

  • pstotext — Converts PostScript files to text.
http://www.cs.wisc.edu/~ghost/doc/pstotext.htm || pstotext

ps2pdf

ps2pdf is a wrapper around Ghostscript to convert PostScript to PDF:

$ ps2pdf -sPAPERSIZE=a4 -dOptimize=true -dEmbedAllFonts=true YourPSFile.ps

Explanation:

  • -sPAPERSIZE=something defines the paper size. For valid PAPERSIZE values, see [7].
  • -dOptimize=true optimizes the created PDF for loading.
  • -dEmbedAllFonts=true embeds all fonts in the PDF, so that it looks the same on systems where those fonts are not installed.
Note: You cannot choose the paper orientation in ps2pdf. If your input PS file is healthy, it already contains the orientation information. If you try to use an Encapsulated PS file that does not fit the -sPAPERSIZE you specified, you will have problems, because EPS files usually do not contain paper orientation information. A workaround is to create a new paper size in the Ghostscript settings (call it e.g. "slide") and use it as -sPAPERSIZE=slide.
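To convert several PostScript files in one go, the ps2pdf call can be put in a loop with an explicit output name for each file. This is a minimal sketch; the function names pdf_name and convert_ps_files are hypothetical, and the a4 paper size is only an example.

```shell
# pdf_name FILE
# Derive the output name: replace a trailing ".ps" with ".pdf",
# or append ".pdf" when there is no ".ps" suffix.
pdf_name() {
    case $1 in
        *.ps) echo "${1%.ps}.pdf" ;;
        *)    echo "$1.pdf" ;;
    esac
}

# convert_ps_files FILE.ps ...
# Convert each input, writing the PDF next to its source file.
convert_ps_files() {
    for f in "$@"; do
        ps2pdf -sPAPERSIZE=a4 -dEmbedAllFonts=true "$f" "$(pdf_name "$f")" || return 1
    done
}
```

For example, convert_ps_files docs/*.ps writes docs/report.pdf for docs/report.ps and stops at the first file that fails to convert.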

Libraries

  • libharu — C library for generating PDF documents.
https://github.com/libharu/libharu || libharu, Lua binding: lua-hpdfAUR
  • PoDoFo — A C++ library to work with the PDF file format.
http://podofo.sourceforge.net || podofo

Python

  • PDFMiner — Utilities to extract and analyze text data from PDF files. Includes pdf2txt, dumppdf, and latin2ascii.
http://www.unixuser.org/~euske/python/pdfminer/ || pdfminerAUR, python-pdfminer.sixAUR
  • pdfrw — A pure Python library that reads and writes PDFs.
https://github.com/pmaupin/pdfrw || python-pdfrw, python2-pdfrw
  • PyPDF2 — A pure-Python library built as a PDF toolkit.
https://mstamy2.github.com/PyPDF2 || python-pypdf2AUR, python2-pypdf2AUR
  • PyX — Python library for the creation of PostScript and PDF files.
http://pyx.sourceforge.net || python-pyx, python2-pyx[broken link: archived in aur-mirror]
  • ReportLab — A proven industry-strength PDF generating solution
https://bitbucket.org/rptlab/reportlab || python-reportlab, python2-reportlab

See also