| Objective | Build a self-contained system that watches a folder, converts any dropped document to PDF using headless LibreOffice, sorts originals by extension, and logs every run, fully automated via cron. |
|---|---|
| Duration | 60-90 minutes |
| Difficulty | Intermediate |
| Environment | Linux (Debian/Ubuntu/RHEL/Arch), desktop or server; no GUI required |
Scenario
Imagine this: In a company or organization, there's a shared folder where staff (or automated systems) drop finished documents that need to be standardized for archival or distribution. Everyone can keep editing their working files in the usual place. When a document is ready for the day, it gets saved to the Document Inbox folder and synced to the file server.
Every few minutes, a conversion job runs automatically, checks this folder for supported documents (ODT, ODS, ODP, DOCX, and so on), and converts them to PDF. The resulting PDFs are saved to "Reports-PDF", replacing any previous versions if necessary, and the processed source document is filed into a folder in "Originals", sorted by extension for traceability.
There are no extra buttons to press and no manual exporting to remember. Anyone can drop a file and go on about their day, and the PDFs will be neatly arranged and waiting in the output directory minutes later. This lets the team keep a simple routine while ensuring consistent, ready-to-share PDFs appear on schedule. This is precisely the solution we're aiming for in this lab.
Our automation goals
We'll build a practical, approachable system that does the following:
- Watch a single folder for new documents in any supported file format (ODF, DOCX, etc.).
- Convert each file to PDF using unoconv.
- Move converted PDFs into a dedicated folder.
- Move original files into subfolders matching their extensions (e.g., originals/odt/).
- Prevent overlapping runs using a lockfile.
- Log all actions to /var/log/lo-unoconv.log with automatic log rotation.
This gives us a self-contained, resilient system that can handle everything from a trickle of invoices to hundreds of archived reports.
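As an illustration of the "sort originals by extension" goal, here is a minimal sketch. It is not the lab's final script: the function name `file_original` is hypothetical, and the demo runs in a temporary directory rather than the real folder tree.

```shell
#!/usr/bin/env bash
# Sketch: file a source document under <dest_root>/<extension>/.
# file_original is an illustrative name, not part of the lab's script.
set -eu

file_original() {
    local src="$1" dest_root="$2" name ext
    name="$(basename -- "$src")"
    case "$name" in
        *.*) ext="${name##*.}" ;;   # text after the last dot
        *)   ext="noext" ;;         # fallback for files without an extension
    esac
    # Lowercase the extension so Report.ODT lands in odt/, not ODT/.
    ext="$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')"
    mkdir -p -- "$dest_root/$ext"
    mv -f -- "$src" "$dest_root/$ext/$name"
}

# Demo in a throwaway directory.
demo="$(mktemp -d)"
mkdir -p "$demo/inbox" "$demo/originals"
touch "$demo/inbox/Report.ODT" "$demo/inbox/budget.ods"
for f in "$demo"/inbox/*; do
    file_original "$f" "$demo/originals"
done
find "$demo/originals" -type f | sort
```

Creating the extension folder on demand means new formats need no changes to the folder layout.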
Creating PDFs is one of the easiest tasks to take for granted on Linux, thanks to the robust PDF support provided by CUPS and Ghostscript. However, converting multiple files to this portable format can get tedious fast, especially for students, non-profits, and businesses that may have several files to handle on any given day. Fortunately, the Linux ecosystem gives you everything you need to fully automate this task, supporting several file formats and any number of files.
This guide will show you how to use unoconv (powered by headless LibreOffice) to build a simple, reliable system that converts any supported document format into PDF, and optionally sorts your original files into subfolders for storage or further management.
We'll cover common open document formats, and show you how to expand the approach so you can drop in other types as needed. We'll also use cron to automate execution, flock to prevent overlapping runs, and logrotate to handle log rotation automatically. The final result will be a lightweight, low-maintenance automation you can replicate on almost any Linux system.
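To show how flock keeps runs from overlapping, here is a small sketch that assumes the util-linux `flock` command is installed. The lock file is a temporary file and `run_job` is a hypothetical stand-in for the conversion script.

```shell
#!/usr/bin/env bash
# Sketch: allow only one copy of a job at a time with flock(1).
lock="$(mktemp)"    # illustrative; a real job would use a fixed lock path

run_job() {
    # -n: give up immediately if another process already holds the lock
    flock -n "$lock" -c 'sleep 2; echo "job finished"' \
        || echo "skipped: already running"
}

run_job > "$lock.first" &    # first run takes the lock and works for 2 s
sleep 1
second="$(run_job)"          # second run overlaps with the first
wait
first="$(cat "$lock.first")"

echo "first:  $first"
echo "second: $second"

# A cron entry could wrap the script the same way (lock path is an assumption):
# */5 * * * * flock -n /tmp/lo-autopdf.lock /usr/local/bin/lo-autopdf.sh
```

With `-n`, a run that finds the lock taken exits at once instead of queueing, so slow conversions never pile up behind each other.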
The methods here work on both desktop and server environments, which makes them a practical fit for organisations that need to handle regular PDF conversions. Once configured, the process is fully hands-free. We'll keep things approachable and script-first, run everything as a non-privileged user, and focus on a clear folder layout you can adapt to your own workflow with no GUI required.
Prerequisites
- Linux system: Debian, Ubuntu, RHEL/CentOS Stream, openSUSE, Arch, or another Linux distribution
- sudo or root access
- Basic familiarity with bash and the command line
- systemd
- No display server or GUI required; everything runs headless
End-state architecture
When the lab is complete, the following will be running on your system:
/srv/convert/
├── inbox/       ← world-writable; drop documents here
├── PDFs/        ← converted PDFs appear here
└── originals/   ← originals sorted by extension
    ├── odt/
    ├── ods/
    └── docx/
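The directory tree above could be created along these lines. This is a sketch under assumptions: `BASE` defaults to a temporary directory so the example is safe to try, and the sticky bit on the inbox is an addition of this example, not something the layout itself specifies.

```shell
#!/usr/bin/env bash
# Sketch: create the folder layout. On the real system BASE would be
# /srv/convert and the commands would need sudo.
BASE="${BASE:-$(mktemp -d)}"

mkdir -p "$BASE/inbox" "$BASE/PDFs" "$BASE/originals"

# World-writable inbox; the sticky bit (assumption) stops users from
# deleting each other's files, as on /tmp.
chmod 1777 "$BASE/inbox"

ls -l "$BASE"
```

The per-extension subfolders under originals/ do not need to exist up front; they can be created when the first file of each type is processed.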
Services
libreoffice-listener.service: persistent headless LibreOffice on port 2002
cron (lo-svc): runs /usr/local/bin/lo-autopdf.sh every 5 min
logrotate: rotates /var/log/lo-unoconv.log weekly
Understanding Unoconv (optional)
Unoconv (short for Universal Office Converter) is a Python wrapper for LibreOffice's Universal Network Objects (UNO) API. It interfaces directly with a headless instance of LibreOffice, either by launching a new instance or connecting to an existing one, and uses this to convert between supported file formats.
You might wonder why we're not using LibreOffice directly, since it has a headless mode that works even on servers. The answer lies in how that mode works: every time the libreoffice --headless command is run, it launches a new instance.
This works fine for one-off tasks, but it strains the system when the program must be loaded from storage and resources reallocated on every run. By using unoconv as a wrapper, we can let headless LibreOffice run as a persistent listener with predictable resource usage, and avoid overlap when multiple conversions are needed. This saves time and makes it an ideal solution for recurring jobs like ours.
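The two approaches look like this on the command line. The sketch only prints the commands (a dry run), so it can be read or run without LibreOffice installed; the file and folder names are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: one-shot LibreOffice conversion vs. unoconv with a listener.
# Commands are printed, not executed; paths are placeholders.
src="inbox/report.odt"
out="PDFs"

# One-shot: starts a fresh headless LibreOffice for this single file.
oneshot="soffice --headless --convert-to pdf --outdir $out $src"

# Listener: start LibreOffice once and leave it running in the background...
listener="unoconv --listener"

# ...then each conversion connects to that running instance.
# --no-launch tells unoconv not to start its own LibreOffice if none is found.
convert="unoconv --no-launch -f pdf -o $out $src"

printf '%s\n' "$oneshot" "$listener" "$convert"
```

With the listener in place, the cost of starting LibreOffice is paid once rather than once per document.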
Step 1: Installing the packages
You'll need to install LibreOffice, unoconv, and the UNO Python bindings (pyuno) for this setup to work. The Writer, Calc, and Impress components are also required, as they provide filters needed for file format conversions.
However, we won't need any GUI add-ons; everything here is headless and server-friendly. Even if some small GUI-related libraries are pulled in as dependencies, everything you install will run fully headless, with no display server required.
Note: on desktops, some of these packages may already be installed. Running these commands will ensure you're not missing any dependencies, but will not cause any problems if the packages already exist.
Debian / Ubuntu:
sudo apt update
sudo apt install unoconv libreoffice-core libreoffice-writer libreoffice-calc libreoffice-impress python3-uno fonts-dejavu fonts-liberation
RHEL/CentOS Stream
First enable EPEL (often required for unoconv on RHEL and its derivatives; Fedora has it in the default repos):
sudo dnf install epel-release
Then install:
sudo dnf install unoconv libreoffice-writer libreoffice-calc libreoffice-impress libreoffice-pyuno python3-setuptools dejavu-sans-fonts liberation-fonts
openSUSE (Leap / Tumbleweed)
sudo zypper install unoconv libreoffice-writer libreoffice-calc libreoffice-impress python3-uno python3-setuptools dejavu-fonts liberation-fonts
Arch Linux (and Manjaro)
Heads up: There's no separate libreoffice-core/libreoffice-headless split on Arch, but the packages still run headless.
sudo pacman -S unoconv libreoffice-fresh python-setuptools ttf-dejavu ttf-liberation
libreoffice-fresh includes pyuno on Arch; use libreoffice-still instead if you prefer the older, more conservative stable branch.
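After installing, a quick check like the following can confirm the tools are on the PATH. This is a sketch: `missing_tools` is a hypothetical helper, and the `import uno` test assumes the system `python3` is the interpreter the UNO bindings were packaged for.

```shell
#!/usr/bin/env bash
# Sketch: report which expected commands are missing after installation.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing="$(missing_tools unoconv soffice python3)"

if [ -z "$missing" ]; then
    echo "All tools found."
    # Optional: confirm the UNO Python bindings can be imported.
    if python3 -c 'import uno' 2>/dev/null; then
        echo "pyuno OK"
    else
        echo "pyuno not importable by this python3"
    fi
else
    echo "Missing:" $missing
fi
```

If anything is reported missing, re-run the install command for your distribution before moving on.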