Complete Guide to Self-hosting Plausible [Privacy Friendly Google Analytics Alternative]
Learn how to self-host a Plausible instance along with email reporting and global stats.
As an ethical website, we try to keep Linux Handbook as free of Google and tracking as possible. In that regard, we refrain from using Google Analytics for website traffic measurement.
Instead, we use Plausible Analytics. It is a simple, lightweight (<1 KB), open-source and privacy-friendly alternative to Google Analytics (GA).
It may not give you as many details as GA, but it gives you an idea about the traffic you are getting on your website along with the bounce rate and visit duration.
You can also see which pages get the most visits and where your traffic comes from. You can also measure traffic by geographical region and device.
Founded and developed by Uku Taht and Marko Saric, Plausible champions the idea that website traffic can be analyzed without compromising visitors' privacy.
If you can afford it, support the project by opting for their managed hosting plan. They even offer a 30-day free trial.
If you have several websites with high traffic, and you find the pricing out of your budget, you can self-host the open source project Plausible like we do on Linux Handbook.
Self-hosting Plausible analytics with Docker
When I first worked on Plausible deployment, the process was utterly complicated. Thankfully, it is now quite convenient to deploy it on your own server. To make it easier, the wonderful folks at Plausible have also created a separate hosting repository on GitHub to get you started.
In this in-depth guide, you'll learn two ways of deploying the Plausible instance:
- The standalone method (single server, single service): Only Plausible runs on the entire server
- The reverse proxy method (single server, multiple services): You can deploy multiple web services like WordPress, Nextcloud etc with Plausible.
Additionally, I'll also show a couple of optional but useful steps to enjoy all features of Plausible:
- Configuring SMTP on Plausible so that you can receive weekly or monthly reports via email.
- Configuring GeoIP to display country wise statistics on the Plausible dashboard map
Prerequisites
Here's what you need apart from some knowledge of Linux commands, docker and docker-compose.
- A Linux server. You can use a physical server, a virtual machine or a cloud server. You may sign up with our partner Linode and get $100 in free credits.
- Docker and Docker Compose installed on your server.
- Access to DNS of your domain where you want to deploy Plausible.
- Nginx reverse proxy setup if you are opting for the second method of deployment.
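Before going further, it can help to confirm that the tools are actually on the server. The following is a minimal sketch, not part of the Plausible tooling: the function names require_cmd and check_prerequisites are my own, and the list of commands in the usage line is an assumption based on what this guide uses later.

```shell
# Report whether a single command is available on PATH.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# Check every command passed as an argument.
# Succeeds only when none of them is missing.
check_prerequisites() {
  missing=0
  for cmd in "$@"; do
    require_cmd "$cmd" || missing=$((missing + 1))
  done
  [ "$missing" -eq 0 ]
}

# Usage (hypothetical list, adjust to your setup):
#   check_prerequisites docker docker-compose git
```

The functions only look the commands up; they do not verify versions or whether the Docker daemon is running.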
Step 1: [Method 1] Preparing the deployment of Plausible in standalone way (single server, single service)
In this section, I'm going to assume that you want to directly host your domain at port 80 on a standalone server.
Let's start with the bare essentials. Note that docker-compose is required beforehand.
The Plausible deployment configuration basically consists of 3 main components:
- Postgres database for user data
- Clickhouse database for analytics data
- Plausible itself that relies on the two databases
Since Plausible deploys itself with Docker, all the above three components are deployed as their own respective containers.
Now look at how they are configured with Docker Compose one by one:
For Postgres, use the official Postgres 12 image available on Docker Hub (at the time of writing this tutorial). Pinning the major version matters here: using the latest tag is not advisable, since a Postgres data directory created by one major version cannot be started by another.
plausible_db:
  image: postgres:12
  volumes:
    - db-data:/var/lib/postgresql/data
  environment:
    - POSTGRES_PASSWORD=postgres
Use a volume named db-data to store the user data at /var/lib/postgresql/data. You also need to set an environment variable that assigns the Postgres password.
For Clickhouse, use the Yandex Clickhouse Docker Hub image:
plausible_events_db:
  image: yandex/clickhouse-server:latest
  volumes:
    - event-data:/var/lib/clickhouse
    - ./clickhouse/clickhouse-config.xml:/etc/clickhouse-server/config.d/logging.xml:ro
    - ./clickhouse/clickhouse-user-config.xml:/etc/clickhouse-server/users.d/logging.xml:ro
  ulimits:
    nofile:
      soft: 262144
      hard: 262144
Let me explain what it is doing! You are using a volume named event-data to store the analytics data at /var/lib/clickhouse. Then two configuration files are bind mounted read-only to disable logging tables, which keeps Clickhouse from accumulating log data in the long run. After the bind mounted XML files, ulimits sets both the soft and the hard limit on open files (nofile) to 262144 inside the Clickhouse container.
For the Plausible service itself, use the Docker Hub image that the developers tag latest as a stable release:
plausible:
  image: plausible/analytics:latest
  command: sh -c "sleep 10 && /entrypoint.sh db createdb && /entrypoint.sh db migrate && /entrypoint.sh db init-admin && /entrypoint.sh run"
  depends_on:
    - plausible_db
    - plausible_events_db
    - mail
    - geoip
  ports:
    - 80:8000
  env_file:
    - plausible-conf.env
On first run, it creates a Postgres database for user data, a Clickhouse database for analytics data, migrates them to prepare the schema and creates the admin account for you.
As you can also see, the service relies on plausible_db and plausible_events_db to be operational. mail and geoip are two additional services that I'll discuss later.
Since you are using the standalone method, you can publish the container port 8000 directly on port 80 of the host. As for the env_file, I'll discuss it in the "Environment Files" section later in this guide.
Each of the database services has its own Docker volume for storing user and analytics data. So, you also need to include a volumes section within the docker compose file with the following details:
volumes:
  db-data:
    driver: local
  event-data:
    driver: local
  geoip:
    driver: local
You now have the necessary components for the basic Plausible deployment.
These three components are enough for a basic deployment, but it is not yet the complete setup. Two more additions turn it into a complete web analytics deployment:
SMTP setup for email reports [optional]
You can make use of the Bytemark SMTP service, which Plausible will use to send weekly or monthly analytics reports. This additional configuration is simple but needs to be specified in the Plausible service configuration later on:
mail:
  image: bytemark/smtp
  restart: always
I'm going to use SendGrid as the SMTP provider for this guide. You can create a free API key once logged into SendGrid. Save the 69 character string, as you are going to need it later as the password for your SMTP config.
GeoIP for dashboard map [optional]
This part is needed to show the country wise visitor counts as you hover the cursor over the world map on the Plausible dashboard for your domain.
For this, you use MaxMind's free GeoLite2 databases. GeoLite2 databases are free IP geolocation databases, and hosting them yourself eliminates network latency and per-query charges. In this setup, a separate container downloads the database and keeps it updated.
To set up a GeoIP database and let it update automatically, you need to sign up for a free account at MaxMind. After signing up, go to Services>My License Key from the left panel on your MaxMind account page. Click on "Generate new license key" and save it locally, as you can view it only once when generating it.
geoip:
  image: maxmindinc/geoipupdate
  environment:
    - GEOIPUPDATE_EDITION_IDS=GeoLite2-Country
    - GEOIPUPDATE_FREQUENCY=168 # update every 7 days
  env_file:
    - geoip-conf.env
  volumes:
    - geoip:/usr/share/GeoIP
  restart: always
Through the above two environment variables, you set the edition ID and how frequently (in hours) the database is updated. The GeoLite2 Country, City, and ASN databases are updated weekly, every Tuesday. The geoip-conf.env file discussed later in this guide has to include the credentials you obtain after generating the license key discussed above.
Environment Files
This section is perhaps the most important one, as it covers all the essential environment variables that need to be in place for the above five components of the Plausible instance to work correctly. Throughout the entire configuration, we make use of them directly and via environment files.
Environment file for Plausible configuration
The plausible-conf.env file stores the most essential environment variables to deploy the Plausible instance.
ADMIN_USER_EMAIL=replace-me
ADMIN_USER_NAME=replace-me
ADMIN_USER_PWD=replace-me
BASE_URL=replace-me
SECRET_KEY_BASE=replace-me
- ADMIN_USER_EMAIL is the email address you use to log in and to receive weekly or monthly analytics reports.
- For ADMIN_USER_NAME, you can mention your own name.
- The value for ADMIN_USER_PWD is your login password.
- BASE_URL can be in the format: http://plausible.domain.com. Note that for enabling HTTPS, it is recommended to use a reverse proxy method (discussed in the second part of this guide) to make use of an SSL certificate.
- SECRET_KEY_BASE is a random secret key, at least 64 characters long, used to secure Plausible. To generate one, use: openssl rand -base64 64. The command base64-encodes 64 random bytes, so it prints 88 characters across two lines. Join them into a single line in the env file.
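If you would rather generate the file than edit it by hand, the variables above can be written out with a small script. This is only a sketch under my own assumptions: the function names generate_secret_key and write_plausible_conf and the argument order are hypothetical, and the values in the usage line are examples.

```shell
# Generate a random secret. openssl wraps base64 output at 64 characters,
# so the newlines are stripped to get a single-line value.
generate_secret_key() {
  openssl rand -base64 64 | tr -d '\n'
}

# Write the env file to the path given in $1.
# $2 = admin email, $3 = admin name, $4 = admin password, $5 = base URL
# (hypothetical argument order chosen for this example).
write_plausible_conf() {
  (
    umask 077   # a newly created file is readable only by its owner
    cat > "$1" <<EOF
ADMIN_USER_EMAIL=$2
ADMIN_USER_NAME=$3
ADMIN_USER_PWD=$4
BASE_URL=$5
SECRET_KEY_BASE=$(generate_secret_key)
EOF
  )
}

# Usage (example values):
#   write_plausible_conf plausible-conf.env you@domain.com "Your Name" 'a-strong-password' http://plausible.domain.com
```

The umask only affects a file that does not exist yet. If plausible-conf.env is already there, its permissions stay as they are.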
Environment file for Plausible SMTP configuration with SendGrid
Though you can also include the SMTP environment variables in the same file, using a separate one makes it clearer to follow. So, here I've used a file called plausible-smtp.env for this purpose.
The below configuration is specific to SendGrid, but you can change it accordingly if you prefer a different SMTP service:
MAILER_EMAIL=replace-me
SMTP_HOST_ADDR=smtp.sendgrid.net
SMTP_HOST_PORT=465
SMTP_USER_NAME=apikey
SMTP_USER_PWD=replace-me
SMTP_HOST_SSL_ENABLED=true
SMTP_RETRIES=20
CRON_ENABLED=true
- MAILER_EMAIL is the customizable "from" email address that shows up in your inbox when you receive your weekly or monthly reports.
- SMTP_HOST_ADDR is the SMTP server hostname. In case of SendGrid, it is smtp.sendgrid.net.
- Here you use 465 for the SMTP host port number via SMTP_HOST_PORT.
- apikey is the username credential for SendGrid in particular, set via SMTP_USER_NAME.
- SMTP_USER_PWD is the 69 character key that you obtained from SendGrid (discussed in the SMTP setup section), used as the password.
- I've set SSL to true via SMTP_HOST_SSL_ENABLED for SendGrid.
- The number of retries until the mailer gives up can be set via SMTP_RETRIES.
CRON_ENABLED is not strictly an SMTP setting. I've included it here because without this variable set to true, you will not receive any weekly or monthly reports via email. By default, this value is false, though that may change in a future release of Plausible. As of now, this setting is absolutely crucial.
Environment file for global stats on Plausible with GeoIP
With geoip-conf.env, you include the essential credentials obtained from MaxMind as discussed earlier:
GEOIPUPDATE_ACCOUNT_ID=replace-me
GEOIPUPDATE_LICENSE_KEY=replace-me
For a complete reference to every kind of environment variable on Plausible, you can visit its documentation page here.
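Since all three env files ship with replace-me placeholders, a quick check before starting the containers can catch a value you forgot to fill in. Here is a minimal sketch. The function name check_env_files is my own and not something Plausible provides.

```shell
# Print every line that still contains the "replace-me" placeholder.
# Returns 0 when all files are clean, 1 when a placeholder is left,
# and 2 when a file cannot be read.
check_env_files() {
  for f in "$@"; do
    if [ ! -r "$f" ]; then
      echo "cannot read: $f" >&2
      return 2
    fi
  done
  if grep -n 'replace-me' "$@"; then
    echo "Fill in the values listed above before starting Plausible." >&2
    return 1
  fi
  return 0
}

# Usage:
#   check_env_files plausible-conf.env plausible-smtp.env geoip-conf.env
```

It only looks for the literal placeholder text, so it cannot tell whether the values you entered are actually valid.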
Now at this point, you've looked into all the necessary details for hosting a standalone deployment of Plausible. The complete docker compose configuration would look like this:
version: "3.3"
services:
  plausible_db:
    image: postgres:12
    volumes:
      - db-data:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=postgres
  plausible_events_db:
    image: yandex/clickhouse-server:latest
    volumes:
      - event-data:/var/lib/clickhouse
      - ./clickhouse/clickhouse-config.xml:/etc/clickhouse-server/config.d/logging.xml:ro
      - ./clickhouse/clickhouse-user-config.xml:/etc/clickhouse-server/users.d/logging.xml:ro
    ulimits:
      nofile:
        soft: 262144
        hard: 262144
  mail:
    image: bytemark/smtp
    restart: always
  geoip:
    image: maxmindinc/geoipupdate
    environment:
      - GEOIPUPDATE_EDITION_IDS=GeoLite2-Country
      - GEOIPUPDATE_FREQUENCY=168 # update every 7 days
    env_file:
      - geoip-conf.env
    volumes:
      - geoip:/usr/share/GeoIP
    restart: always
  plausible:
    image: plausible/analytics:latest
    command: sh -c "sleep 10 && /entrypoint.sh db createdb && /entrypoint.sh db migrate && /entrypoint.sh db init-admin && /entrypoint.sh run"
    depends_on:
      - plausible_db
      - plausible_events_db
      - mail
      - geoip
    ports:
      - 80:8000
    env_file:
      - plausible-conf.env
      - plausible-smtp.env
volumes:
  db-data:
    driver: local
  event-data:
    driver: local
  geoip:
    driver: local
If you followed this method, skip the next section.
Step 1: [Method 2] Preparing the deployment of Plausible with Nginx reverse proxy (single server, multiple services)
Let us quickly go through the revisions needed to make the above Plausible configuration also work behind a reverse proxy. I'm using the example from our previous Nginx Docker article.
For the four services other than Plausible, I'll use an internal network named plausible, as they only need to be visible to Plausible:
networks:
  - plausible
The one exception in the final file is geoip, which also joins net: an internal Docker network has no outbound connectivity, and geoip has to download its database from MaxMind.
But for the Plausible service, the same net network used in the reverse proxy configuration needs to be specified along with the plausible network. Only then can it work with the Nginx Docker container.
networks:
  - net
  - plausible
You also need to replace the ports parameter with expose inside your Plausible service, since you are now using a reverse proxy:
expose:
  - 8000
At the end of the configuration, you also need to specify which of the networks are internal and external:
networks:
  net:
    external: true
  plausible:
    internal: true
Additional environment variables
You also need to make sure you update the plausible-conf.env file with the following variables for the setup to work correctly:
VIRTUAL_HOST=plausible.domain.com
LETSENCRYPT_HOST=plausible.domain.com
TRUSTED_PROXIES=172.x.0.0/16
Specify the domain name without https:// for VIRTUAL_HOST and LETSENCRYPT_HOST, which are meant for the reverse proxy and the SSL certificate respectively.
With TRUSTED_PROXIES, you explicitly define the proxy servers for Plausible to trust. The exact value is the subnet of the net network, which you can obtain with:
docker network inspect -f '{{ json .IPAM.Config }}' net | jq -r '.[].Subnet'
For the above command to work, you need the jq tool installed.
On Ubuntu, you can install it with:
sudo apt -y install jq
On CentOS, you need to add the EPEL repository first:
sudo yum install epel-release -y
sudo yum install jq -y
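If you would rather not install jq, Docker's own Go templates can print the subnet directly. The sketch below assumes the reverse proxy network is called net, as in this guide. The helper names network_subnets and trusted_proxies_line are hypothetical.

```shell
# Print the subnet(s) of a Docker network, one per line, without jq.
# Needs a running Docker daemon.
network_subnets() {
  docker network inspect -f '{{range .IPAM.Config}}{{println .Subnet}}{{end}}' "$1"
}

# Turn the first non-empty line read from stdin into an env-file line.
trusted_proxies_line() {
  awk 'NF { print "TRUSTED_PROXIES=" $1; exit }'
}

# Usage:
#   network_subnets net | trusted_proxies_line
# Review the printed line and then add it to plausible-conf.env.
```

Only the first subnet is used, so check the output by hand if your network has more than one IPAM entry.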
So, here is the final docker-compose.yml file, revised for the reverse proxy:
version: "3.3"
services:
  mail:
    image: bytemark/smtp
    restart: always
    networks:
      - plausible
  plausible_db:
    image: postgres:12
    volumes:
      - db-data:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=postgres
    restart: always
    networks:
      - plausible
  plausible_events_db:
    image: yandex/clickhouse-server:latest
    volumes:
      - event-data:/var/lib/clickhouse
      - ./clickhouse/clickhouse-config.xml:/etc/clickhouse-server/config.d/logging.xml:ro
      - ./clickhouse/clickhouse-user-config.xml:/etc/clickhouse-server/users.d/logging.xml:ro
    ulimits:
      nofile:
        soft: 262144
        hard: 262144
    restart: always
    networks:
      - plausible
  geoip:
    image: maxmindinc/geoipupdate
    environment:
      - GEOIPUPDATE_EDITION_IDS=GeoLite2-Country
      - GEOIPUPDATE_FREQUENCY=168 # update every 7 days
    env_file:
      - geoip-conf.env
    volumes:
      - geoip:/usr/share/GeoIP
    restart: always
    networks:
      - net
      - plausible
  plausible:
    image: plausible/analytics:latest
    command: sh -c "sleep 10 && /entrypoint.sh db createdb && /entrypoint.sh db migrate && /entrypoint.sh db init-admin && /entrypoint.sh run"
    depends_on:
      - plausible_db
      - plausible_events_db
      - mail
      - geoip
    expose:
      - 8000
    env_file:
      - plausible-conf.env
      - plausible-smtp.env
    restart: always
    networks:
      - net
      - plausible
volumes:
  db-data:
    driver: local
  event-data:
    driver: local
  geoip:
    driver: local
networks:
  net:
    external: true
  plausible:
    internal: true
Step 2: Deploying Plausible Analytics
Irrespective of whether you used method 1 or method 2, you should have the docker-compose file ready. It's time to use that file.
On your server, clone the Plausible hosting repository:
git clone https://github.com/plausible/hosting
Move into the directory for revising the files:
cd hosting
Now edit the docker-compose file so that it has the same content you saw in method 1 or method 2 (whichever you chose). Also create or revise the environment files discussed above for your chosen method.
Start up the Plausible instance:
docker-compose up -d
Access the Plausible domain you had specified in the configuration. You should be greeted by Plausible's login screen. Using the credentials that you provided in the plausible-conf.env file, log in to the Plausible dashboard.
Step 3: Using Plausible analytics for your websites
It is time to add the website(s) you want to track and analyze with Plausible. When you are logged in to the dashboard of your Plausible instance, click on "+ Add a website".
Enter your domain name (say domain.com) without www or https:// and click on "Add snippet".
Note that domain.com can be any domain and has nothing to do with hosting Plausible on its subdomain. A Plausible instance can be hosted on any other domain name and not necessarily on a subdomain of the domain being analyzed.
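If you add several sites and copy their addresses from a browser, stripping the extras by hand gets tedious. As a small illustration of the expected format, this sketch reduces a pasted URL to the bare domain. The function name normalize_domain is my own and is not part of Plausible.

```shell
# Strip the scheme, a leading "www." and any path from a URL,
# leaving the bare domain that the "Add a website" form expects.
normalize_domain() {
  d="$1"
  d="${d#http://}"    # drop a leading http://
  d="${d#https://}"   # drop a leading https://
  d="${d#www.}"       # drop a leading www.
  d="${d%%/*}"        # drop everything from the first slash onward
  printf '%s\n' "$d"
}

# Usage:
#   normalize_domain "https://www.domain.com/blog"
```

For the URL in the usage line, the function prints domain.com.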
You need to add this JavaScript snippet to the header scripts of the domain.com website. It is up to you to figure out how to add a header script on your website.
On Linux Handbook, we use Ghost, a fast and lightweight CMS.
On Ghost, you need to paste it in the header section under SETTINGS > Code injection and click "Save".
Once added, you will be able to observe the web analytics for the domain shortly after you click on it on the main panel.
For other web apps including WordPress and Discourse, you can also refer to the official integration guides.
Enable weekly or monthly reports with Plausible
Assuming domain.com is the domain that you have added for analyzing, head over to domain.com > Settings > Email Reports after logging in to the dashboard.
Enable the option(s) you want as required.
If you prefer to go the nerd way, you can also access the email setting directly based on the following URL syntax:
https://plausible.domain.com/domain.com/settings#email-reports
where plausible.domain.com is where you host your Plausible instance and domain.com is the site you want to analyze.
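With more than a couple of sites, that URL pattern is easy to script. This is a tiny sketch of the syntax above. The function name email_reports_url is hypothetical, and it assumes your instance is served over https as in the example.

```shell
# Build the email report settings URL.
# $1 = host of the Plausible instance, $2 = site being analyzed
email_reports_url() {
  printf 'https://%s/%s/settings#email-reports\n' "$1" "$2"
}

# Usage:
#   email_reports_url plausible.domain.com domain.com
```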
What about global stats?
If you configured GeoIP, you can view the world map on the dashboard, where the visitor countries are shaded in purple. The darker the shade, the higher the country's visitor count.
Hovering the cursor over a country on the map will show you its visitor count for your website. Clicking on "MORE" will show you the complete list of country wise visitor counts from the map.
Maintenance tips
If you want to follow the Plausible container's logs in real time, you can run:
docker logs -f hosting_plausible_1
At any time, if you want to stop the instance, you can use:
docker-compose down
That's it! You have successfully deployed Plausible Analytics with email reporting and countrywise map stats on the dashboard!
Personal notes on Plausible
So, you learned to deploy Plausible analytics on your server. Which method did you choose for this purpose?
I prefer to use the reverse proxy method every time because it always leaves an option to deploy other web services and hence save the server cost.
A reverse proxy method is preferable even for a single server, single service setup, as it makes the entire setup future proof. If I later wanted to deploy a second service next to a standalone deployment, moving to a reverse proxy configuration would take a lot of additional effort. So it is better to have it ready from the very beginning.
Additionally, you do not have to worry about SSL certificates either.
If you have live restore enabled on Docker, you'd want to use the restart policy on-failure instead of always shown in this guide. It avoids restarting the containers in case the Docker daemon gets restarted.
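To pick the policy without guessing, you can ask the Docker daemon whether live restore is on. The sketch below just maps that answer to the policy suggested above. The function name restart_policy_for is my own, and the docker info call in the usage line needs a running daemon.

```shell
# Map Docker's live restore setting ("true" or "false")
# to the restart policy suggested in this guide.
restart_policy_for() {
  case "$1" in
    true) echo "on-failure" ;;
    *)    echo "always" ;;
  esac
}

# Usage:
#   restart_policy_for "$(docker info --format '{{.LiveRestoreEnabled}}')"
```

You would still have to put the printed value into the restart lines of your docker-compose file yourself.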
Instead of Google Analytics, enjoy a Google-free Analytics :)
If you encounter any error or face any issues or have a suggestion, please let me know by leaving a comment.
DevOps Geek at Linux Handbook | Doctoral Researcher on GPU-based Bioinformatics & author of 'Hands-On GPU Computing with Python' | Strong Believer in the role of Linux & Decentralization in Science