Backups, Monitoring, and Security for small Mastodon servers

With Mastodon quickly becoming a refuge for former bird-site users fleeing the new regime, many are considering self-hosting their own Fediverse instance. There are many good reasons to do this, such as privacy, data ownership, or simply maintaining consistent performance while larger communities struggle to onboard an influx of new users.

But, as always, self-hosting means new responsibilities: keeping the data safe and secure, making sure everything operates correctly, and ensuring the server is not disseminating malware.

This post assumes the server is already set up according to the official installation guide; these steps should be carried out shortly after that setup is complete.

Backups

First and foremost is backups. The official guide is sadly quite thin here, which is unfortunate, as I consider backups to be one of the most important parts of running a public service.

This backup script is very similar to the ones I’ve used on other servers over the years: it essentially just creates a database dump file, copies the redis database, deletes backups that are more than two weeks old, and rsyncs the installation directory & database backups to a storage server.

Since it exports the entire database, it’s great for a small server, but will quickly become untenable as the database grows.

/opt/backup_mastodon.sh

#!/bin/bash
set -eo pipefail

NOW=$(date +"%Y-%m-%d-%H-%M-%S")
SERVER="backup-nas.local"
USER="mastodon_backup"

# Export database, compress, and send to file with today's date
sudo -u postgres pg_dump mastodon_production | gzip > \
  /home/mastodon/backups/database/mastodon_production-"$NOW".sql.gz

# Copy redis automatic backup
cp /var/lib/redis/dump.rdb /home/mastodon/backups/redis/redis-"$NOW".rdb

# Delete database & redis backups older than 2 weeks
find /home/mastodon/backups/ -type f -mtime +14 -delete

# Mirror the mastodon home directory (including the dumps above) to the backup server
rsync -azh --delete /home/mastodon/ "$USER"@"$SERVER":/backups/mastodon/data/

# Marker file recording when the last backup completed (handy for monitoring)
touch /tmp/backup

$SERVER and $USER should be set to the host and user on the server/NAS where backups will be sent. In addition, passwordless SSH should be configured so rsync can run unsupervised.
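
Since cron runs the backup job as root, a minimal sketch of that setup might look like the following (the key path and type are just one reasonable choice):

# Create an SSH key for root (no passphrase, so cron can use it unattended)
sudo install -d -m 700 /root/.ssh
sudo ssh-keygen -t ed25519 -f /root/.ssh/id_ed25519 -N ""

# Authorize that key on the backup host
sudo ssh-copy-id -i /root/.ssh/id_ed25519.pub mastodon_backup@backup-nas.local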

This job is scheduled every 6 hours, which meets my recovery requirements – you may opt to run this hourly, daily, or on another schedule as you see fit.

/etc/cron.d/backup_mastodon

MAILTO=root
0 */6 * * *    root    /opt/backup_mastodon.sh

In my case, I chose to host the media files locally rather than upload them to cloud storage, since it’s a single small server with very few users and little expected growth. This means that all image files, including cached files from other servers, will be part of my backup.
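
A backup is only useful if it can be restored, so here is a rough restore sketch (the dump file names are illustrative, and this assumes the mastodon_production database and role already exist per the official setup):

# Restore the PostgreSQL dump into an empty mastodon_production database
DUMP=/home/mastodon/backups/database/mastodon_production-2022-11-20-00-00-00.sql.gz
gunzip -c "$DUMP" | sudo -u postgres psql mastodon_production

# Restore the redis snapshot with the service stopped
sudo systemctl stop redis-server
sudo cp /home/mastodon/backups/redis/redis-2022-11-20-00-00-00.rdb /var/lib/redis/dump.rdb
sudo chown redis:redis /var/lib/redis/dump.rdb
sudo systemctl start redis-server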

Reverse proxy & Cloudflare weirdness

Rather than just using NAT port-forwarding like a normal person, I opted to use Cloudflare’s Tunnels product to reverse-proxy my site. I’m unconvinced of its security benefits; I just wanted to try something new…

After setting everything up, I noticed that the nginx access log was recording every request as coming from ::1, i.e. localhost. This ended up being an easy fix: configure nginx’s real_ip module to replace ::1 with the contents of the CF-Connecting-IP HTTP header, so the access log shows the actual client address.

This was just two lines added to the top of the server block:

/etc/nginx/sites-enabled/mastodon

...
  set_real_ip_from ::1;
  real_ip_header CF-Connecting-IP;
...

Health Checks

Mastodon comes with a few easy-to-use health check endpoints out of the box. I like to use Monit for these kinds of really simple checks:

check process mastodon-web matching 'puma [0-9]+'
  start program = "/usr/bin/systemctl start mastodon-web.service"
  stop program  = "/usr/bin/systemctl stop  mastodon-web.service"
  if failed host localhost port 443 protocol HTTPS request /health then alert

check process mastodon-streaming matching "/usr/bin/node ./streaming"
  start program = "/usr/bin/systemctl start mastodon-streaming.service"
  stop program  = "/usr/bin/systemctl stop  mastodon-streaming.service"
  if failed host localhost port 443 protocol HTTPS request /api/v1/streaming/health then alert

If a service dies or stops responding, Monit will quickly restart it and send me an email notification.
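
Since the backup script touches /tmp/backup each time it finishes, Monit can also alert if that job silently stops running. A small sketch (the threshold is arbitrary, a bit longer than the 6-hour schedule):

check file backup_marker with path /tmp/backup
  if timestamp > 7 hours then alert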

Monitoring

Mastodon already has a built-in metrics system implemented with statsd. However, my environment is already built around Prometheus, as I assume most are nowadays. To get good telemetry from this server, I needed to install an intermediary statsd_exporter, as well as the nginx & postgres exporters.

Nginx

Since the nginx exporter is already in the Debian software repositories, it was a simple installation:

sudo apt install prometheus-nginx-exporter

Then, a minimal stub_status endpoint was enabled at the bottom of nginx.conf:

server {
    listen 127.0.0.1:8080;
    location /stub_status {
        stub_status;
        access_log off;
    }
}

The service can be started and enabled:

sudo systemctl enable --now prometheus-nginx-exporter
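
After reloading nginx to pick up the stub_status change, both endpoints can be spot-checked (the exporter listens on port 9113 by default):

sudo nginx -t && sudo systemctl reload nginx
curl -s http://127.0.0.1:8080/stub_status
curl -s http://127.0.0.1:9113/metrics | grep -v '^#' | head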

Postgres

The postgres exporter can also be installed from the software repository:

sudo apt install prometheus-postgres-exporter

To read data from postgres, the exporter must be configured to authenticate and connect with the database server:

/etc/default/prometheus-postgres-exporter

DATA_SOURCE_NAME='user=prometheus host=/run/postgresql dbname=postgres'
ARGS='--disable-settings-metrics'

Likewise, the database server must be configured to allow the exporter to read data from it. A quick SQL script sets up a dedicated schema and views so the exporter has access to only the monitoring data it needs:

sudo -u postgres psql
  CREATE USER prometheus;
  ALTER USER prometheus SET SEARCH_PATH TO prometheus,pg_catalog;
  
  CREATE SCHEMA prometheus AUTHORIZATION prometheus;
  
  CREATE FUNCTION prometheus.f_select_pg_stat_activity()
  RETURNS setof pg_catalog.pg_stat_activity
  LANGUAGE sql
  SECURITY DEFINER
  AS $$
    SELECT * from pg_catalog.pg_stat_activity;
  $$;
  
  CREATE FUNCTION prometheus.f_select_pg_stat_replication()
  RETURNS setof pg_catalog.pg_stat_replication
  LANGUAGE sql
  SECURITY DEFINER
  AS $$
    SELECT * from pg_catalog.pg_stat_replication;
  $$;
  
  CREATE VIEW prometheus.pg_stat_replication
  AS
    SELECT * FROM prometheus.f_select_pg_stat_replication();
  
  CREATE VIEW prometheus.pg_stat_activity
  AS
    SELECT * FROM prometheus.f_select_pg_stat_activity();
  
  GRANT SELECT ON prometheus.pg_stat_replication TO prometheus;
  GRANT SELECT ON prometheus.pg_stat_activity TO prometheus;

Then, the exporter service can be started up:

sudo systemctl enable --now prometheus-postgres-exporter
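
A quick curl confirms the exporter can actually reach the database (pg_up should report 1):

curl -s http://127.0.0.1:9187/metrics | grep '^pg_up'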

statsd_exporter

In the process of implementing this, I came across some wonderful resources from others who had tackled the same setup.

At this time, statsd_exporter is not in the Debian software repositories, so it was installed manually on my box:

VERSION="0.22.8"
wget "https://github.com/prometheus/statsd_exporter/releases/download/v$VERSION/statsd_exporter-$VERSION.linux-amd64.tar.gz"

tar xzf ./statsd_exporter-$VERSION.linux-amd64.tar.gz
sudo mv ./statsd_exporter-$VERSION.linux-amd64/statsd_exporter /usr/local/bin/
sudo chmod +x /usr/local/bin/statsd_exporter
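
The unit below runs the exporter as the prometheus user; if that account doesn't already exist on the system (the Debian exporter packages above typically create it), something like this would do:

sudo useradd --system --no-create-home --shell /usr/sbin/nologin prometheus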

Then, a systemd service is created to run the exporter:

/etc/systemd/system/prometheus-statsd-exporter.service

[Unit]
Description=statsd_exporter
Documentation=https://github.com/prometheus/statsd_exporter

[Service]
Type=simple
ExecStart=/usr/local/bin/statsd_exporter
User=prometheus
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target

Once installed, it may be started up and enabled on boot:

sudo systemctl daemon-reload
sudo systemctl enable --now prometheus-statsd-exporter.service

Configure Mastodon

https://docs.joinmastodon.org/admin/config/#statsd

Mastodon supports statsd, but it is not enabled by default. To get our data, we must point it towards the statsd exporter configured previously:

~/live/.env.production

...
STATSD_ADDR=127.0.0.1:9125

The Mastodon services need a quick restart to apply the change:

sudo systemctl restart mastodon*

Once everything is running, a simple curl request shows the metrics collected so far:

curl localhost:9102/metrics -s  | grep -v '^#' | less

In my environment, I scrape Prometheus metrics using Fluent Bit, so this config was added to that particular instance:

# Mastodon Statsd listener
[INPUT]
    name prometheus_scrape
    host 127.0.0.1
    port 9102
    tag  metrics.mastodon
    scrape_interval 10s 

# Nginx connection stats
[INPUT]
    name prometheus_scrape
    host 127.0.0.1
    port 9113
    tag  metrics.nginx
    scrape_interval 10s 

# Postgresql Database stats
[INPUT]
    name prometheus_scrape
    host 127.0.0.1
    port 9187
    tag  metrics.psql
    scrape_interval 10s 

From here, I simply send this data into InfluxDB on a nearby machine, though that configuration is slightly outside the scope of this post.

This step will of course be vastly different depending on what poller is used.
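
For example, if a plain Prometheus server were doing the polling instead, the equivalent scrape config might look roughly like this (job names and the server address are placeholders):

scrape_configs:
  - job_name: mastodon_statsd
    static_configs:
      - targets: ['<server-ip>:9102']
  - job_name: nginx
    static_configs:
      - targets: ['<server-ip>:9113']
  - job_name: postgres
    static_configs:
      - targets: ['<server-ip>:9187']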

Anti-malware scan

Though it’s highly unlikely, it is possible that a vulnerability in a media library could either infect my server or help spread malware to others. To help mitigate this, I elected to run a regular virus scan on my user media directories. This is almost certainly “overkill”, and by no means a core requirement on the same level as backups ;)

First, clamd is installed:

sudo apt install clamav clamav-daemon clamdscan

Then, this scanner script runs periodically as a background scan. It will only produce output if any signatures are detected, and will move the infected files out of the server directories into a quarantine.

/opt/scan.sh

#!/bin/bash

# Scan user media; --fdpass lets the unprivileged clamd daemon read the files,
# and any matches are moved into the quarantine directory
clamdscan --fdpass --infected --no-summary \
  --move=/var/cache/clamdjail \
  /home/mastodon/live/public/system/

# Marker file recording when the last scan completed (handy for monitoring)
touch /tmp/av_scan
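
One thing to note: the quarantine directory passed to --move is not created by the clamav packages, so it has to exist before the first scan. Something like:

# Create the quarantine directory used by --move above
sudo mkdir -p /var/cache/clamdjail
sudo chown clamav:clamav /var/cache/clamdjail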

I schedule this daily, though this frequency may be reduced as the amount of data to scan increases.

/etc/cron.d/scan_mastodon

MAILTO=root
0 0 * * *       root    /opt/scan.sh

It should also be noted that the scan can be sped up with --multiscan, which will use all available CPU cores. While this dramatically reduces the time taken to perform a full scan, it does negatively affect overall system performance while it runs.
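
For reference, that just means adding the flag to the clamdscan invocation in the script above:

clamdscan --multiscan --fdpass --infected --no-summary \
  --move=/var/cache/clamdjail \
  /home/mastodon/live/public/system/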


That’s it so far… I’m sure I will find other tweaks to make on my personal Mastodon server with experience.

Until then, be sure to configure the other self-checks, monitoring, security, and reporting tools that every little server needs.

For more info on scaling up larger Mastodon servers, see this write-up from Hazel of Hachyderm: