
Build an Automated PDF Conversion System

Build a reliable automated PDF converter using unoconv, a Bash script, and a cron job. It's a good small project for practicing and improving your Linux skills.

By Roland Taylor · 21 min read

Objective: Build a self-contained system that watches a folder, converts any dropped document to PDF using headless LibreOffice, sorts originals by extension, and logs every run, all fully automated via cron.
Duration: 60-90 minutes
Difficulty: Intermediate
Environment: Linux (Debian/Ubuntu/RHEL/Arch), desktop or server, no GUI required

Scenario

Imagine this: In a company or organization, there's a shared folder where staff (or automated systems) drop finished documents that need to be standardized for archival or distribution. Everyone can keep editing their working files in the usual place. When a document is ready for the day, it gets saved to the Document Inbox folder and synced to the file server.

Every few minutes, a conversion job runs automatically, checks this folder for any supported documents (ODT, ODS, ODP, DOCX, and so on), and converts them to PDF. The resulting PDFs are saved to "Reports-PDF", replacing any previous versions if necessary, and the processed copy of each source document is filed into a folder in "Originals", sorted by extension for traceability.

There are no extra buttons to press and no manual exporting to remember. Anyone can drop a file and go on about their day, and the PDFs will be neatly arranged and waiting in the output directory minutes later. This lets the team keep a simple routine while ensuring consistent, ready-to-share PDFs appear on schedule. This is precisely the solution we're aiming for in this lab.

Our automation goals

We’ll build a practical, approachable system that does the following:

  1. Watch a single folder for new documents in any supported file format (ODF, DOCX, etc.).
  2. Convert each file to PDF using unoconv.
  3. Move converted PDFs into a dedicated folder.
  4. Move original files into subfolders matching their extensions (e.g., originals/odt/).
  5. Prevent overlapping runs using a lockfile.
  6. Log all actions to /var/log/lo-unoconv.log with automatic log rotation.

This gives us a self-contained, resilient system that can handle everything from a trickle of invoices to hundreds of archived reports.
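To make goals 2 to 4 concrete, here is a minimal sketch of the per-file flow: convert, then file the original under its lowercased extension. It is not the lab's final script. The function names `process_file` and `process_inbox`, the `BASE` and `CONVERTER` variables, and the exact list of extensions are assumptions chosen for illustration; only the `-f` (format) and `-o` (output directory) options are taken from unoconv itself.

```shell
#!/usr/bin/env bash
# Sketch only: BASE, CONVERTER and both function names are hypothetical.
BASE="${BASE:-/srv/convert}"        # root of the folder layout
CONVERTER="${CONVERTER:-unoconv}"   # command that performs the conversion

# Convert one document to PDF, then move the original into originals/<ext>/.
process_file() {
    local src name ext
    src="$1"
    name="$(basename "$src")"
    # Lowercase the extension so Report.ODT and report.odt share a folder.
    ext="$(printf '%s' "${name##*.}" | tr '[:upper:]' '[:lower:]')"

    mkdir -p "$BASE/PDFs" "$BASE/originals/$ext" || return 1

    # unoconv: -f selects the output format, -o the output directory.
    if "$CONVERTER" -f pdf -o "$BASE/PDFs" "$src"; then
        mv -f "$src" "$BASE/originals/$ext/"
    else
        # Leave the original in the inbox so a later run can retry it.
        echo "conversion failed: $src" >&2
        return 1
    fi
}

# Walk the inbox once and process every file with a listed extension.
process_inbox() {
    local f
    for f in "$BASE"/inbox/*.odt "$BASE"/inbox/*.ods \
             "$BASE"/inbox/*.odp "$BASE"/inbox/*.docx; do
        [ -e "$f" ] || continue     # pattern matched nothing
        process_file "$f" || true   # one bad file should not stop the run
    done
}
```

Calling `process_inbox` performs one pass over the inbox. Keeping the converter in a variable is a design choice for this sketch: it makes the flow easy to try with a stand-in command before unoconv is installed.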

📋
By supported file formats, we're referring to any file type that we include in our script. LibreOffice supports many file formats that we are unlikely to need.

Creating PDFs is one of the easiest tasks to take for granted on Linux, thanks to the robust PDF support provided by CUPS and Ghostscript. However, converting multiple files to this portable format can get tedious fast, especially for students, non-profits, and businesses that may have several files to handle on any given day. Fortunately, the Linux ecosystem gives you everything you need to fully automate this task, supporting several file formats and any number of files.

This guide will show you how to use unoconv (powered by headless LibreOffice) to build a simple, reliable system that converts any supported document format into PDF, and optionally sorts your original files into subfolders for storage or further management.

We’ll cover common open document formats, and show you how to expand the approach so you can drop in other types as needed. We’ll also use cron to automate execution, flock to prevent overlapping runs, and logrotate to handle log rotation automatically. The final result will be a lightweight, low-maintenance automation you can replicate on almost any Linux system.

The methods here work on both desktop and server environments, which makes them a practical fit for organisations that need to handle regular PDF conversions. Once configured, the process is fully hands-free. We'll keep things approachable and script-first, run everything as a non-privileged user, and focus on a clear folder layout you can adapt to your own workflow with no GUI required.

📋
Even if you do not need such a system, working through tutorials like this one helps sharpen your Linux skills. Try it, and learn new things while having fun.

Prerequisites

  • Linux system: Debian, Ubuntu, RHEL/CentOS Stream, openSUSE, Arch, or a comparable distribution
  • sudo or root access
  • Basic familiarity with bash and the command line
  • systemd (used for the persistent LibreOffice listener service)
  • No display server or GUI required: everything runs headless

End-state architecture

When the lab is complete, the following will be running on your system:

/srv/convert/
├── inbox/          ← world-writable; drop documents here
├── PDFs/           ← converted PDFs appear here
└── originals/      ← originals sorted by extension
    ├── odt/
    ├── ods/
    └── docx/

Services
  libreoffice-listener.service   ← persistent headless LO on port 2002
  cron (lo-svc)                  ← runs /usr/local/bin/lo-autopdf.sh every 5 min
  logrotate                       ← rotates /var/log/lo-unoconv.log weekly
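
The folder tree above could be created with a few commands. This is a sketch under assumptions: the `create_layout` name and the overridable `BASE` variable are hypothetical, mode 777 is one plain reading of "world-writable", and ownership (for example by the lo-svc user) is left out entirely.

```shell
#!/usr/bin/env bash
# Sketch only: BASE defaults to the path in the diagram; point it at a
# scratch directory to experiment without root.
BASE="${BASE:-/srv/convert}"

create_layout() {
    # The per-extension subfolders under originals/ can be created on demand.
    mkdir -p "$BASE/inbox" "$BASE/PDFs" "$BASE/originals" || return 1
    # Assumption: a plain world-writable inbox, as labelled in the diagram.
    chmod 777 "$BASE/inbox"
}
```

For a dry run as a normal user, something like `BASE="$HOME/convert-test"` followed by `create_layout` builds the same tree in your home directory.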

Understanding Unoconv (optional)

Unoconv (short for UNO Converter) is a Python wrapper for LibreOffice's Universal Network Objects (UNO) API. It interfaces directly with a headless instance of LibreOffice, either by launching a new instance or connecting to an existing one, and uses this to convert between supported file formats.

🚧
unoconv is available on most Linux distributions, but is no longer under development. Its replacement, unoserver, is under active development, but does not yet have all the features of unoconv.

You might wonder why we're not using LibreOffice directly, since it has a headless version that can even be used on servers. The answer lies in how headless LibreOffice works. It is designed to launch a new instance every time the libreoffice --headless command is run.

This works fine for one-time tasks, but it strains the system in a recurring job: every run has to load LibreOffice from storage and allocate its resources all over again. By using unoconv as a wrapper, we can keep headless LibreOffice running as a persistent listener with predictable resource usage, and avoid overlap when multiple conversions are needed. This saves time and makes it an ideal fit for recurring jobs like ours.
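
The difference between the two modes can be sketched with unoconv's own options. The function names and the `UNOCONV` and `PORT` variables below are assumptions for illustration, and starting the listener with `unoconv --listener` is only one possible way to do it; port 2002 is the port used for the listener in this lab.

```shell
#!/usr/bin/env bash
# Sketch only: start_listener and convert_via_listener are hypothetical names.
UNOCONV="${UNOCONV:-unoconv}"
PORT="${PORT:-2002}"

# Start one long-lived headless LibreOffice listener in the background.
start_listener() {
    "$UNOCONV" --listener --port "$PORT" &
    LISTENER_PID=$!
}

# Convert one file ($1) to PDF in directory ($2) through the running listener.
convert_via_listener() {
    # --no-launch: fail instead of starting a private LibreOffice instance
    # when no listener answers, so a dead listener is noticed rather than
    # silently worked around.
    "$UNOCONV" --no-launch --port "$PORT" -f pdf -o "$2" "$1"
}
```

With a listener already up, each call to `convert_via_listener` reuses the same LibreOffice process instead of paying the start-up cost again.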

Step 1: Installing the packages

You'll need to install LibreOffice, unoconv, and the UNO Python bindings (pyuno) for this setup to work. The Writer, Calc, and Impress components are also required, as they provide filters needed for file format conversions.

However, we won't need any GUI add-ons, since everything here is headless/server-friendly. Even if some small GUI-related libraries are installed as dependencies, everything you'll install will run fully headless; absolutely no display server required.

Note: on desktops, some of these packages may already be installed. Running these commands will ensure you're not missing any dependencies, but will not cause any problems if the packages already exist.

Debian / Ubuntu:

sudo apt update
sudo apt install unoconv libreoffice-core libreoffice-writer libreoffice-calc libreoffice-impress python3-uno fonts-dejavu fonts-liberation

RHEL/CentOS Stream
First enable EPEL, which is often required for unoconv on RHEL and its derivatives (Fedora has unoconv in its default repos):

sudo dnf install epel-release

Then install:

sudo dnf install unoconv libreoffice-writer libreoffice-calc libreoffice-impress libreoffice-pyuno python3-setuptools dejavu-sans-fonts liberation-fonts

openSUSE (Leap / Tumbleweed)

sudo zypper install unoconv libreoffice-writer libreoffice-calc libreoffice-impress python3-uno python3-setuptools dejavu-fonts liberation-fonts

Arch Linux (and Manjaro)
Heads up: there's no separate libreoffice-core/libreoffice-headless split on Arch, but the packages still run headless.

sudo pacman -S unoconv libreoffice-fresh python-setuptools ttf-dejavu ttf-liberation
📋
libreoffice-fresh includes pyuno on Arch; use libreoffice-still instead if you prefer the more conservative maintenance branch.
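
Whichever distribution you use, a quick sanity check after installation can save debugging time later. The helper names below are hypothetical, and the binary names you pass in are assumptions: LibreOffice is commonly exposed as `soffice`, but some systems only provide `libreoffice`.

```shell
#!/usr/bin/env bash
# Sketch only: check_commands and check_pyuno are hypothetical helper names.

# Report, for each argument, whether a command of that name is on PATH.
# Returns non-zero if at least one is missing.
check_commands() {
    local missing=0 cmd
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "ok:      $cmd"
        else
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Succeeds only if the system Python can import the UNO bindings (pyuno).
check_pyuno() {
    python3 -c 'import uno' 2>/dev/null
}
```

For example, `check_commands unoconv soffice && check_pyuno && echo "ready"` prints "ready" only when both commands and the Python bindings are available.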

About the author

Roland Taylor
Updated on Apr 21, 2026