19 KiB
title | eleventyNavigation | ||||||
---|---|---|---|---|---|---|---|
How to monitor your backups |
|
Monitoring and alerting
Having backups is great, but they won't do you a lot of good unless you have confidence that they're running on a regular basis. That's where monitoring and alerting comes in.
There are several different ways you can monitor your backups and find out whether they're succeeding. Which of these you choose to do is up to you and your particular infrastructure.
Job runner alerts
The easiest place to start is with failure alerts from the scheduled job runner (cron, systemd, etc.) that's running borgmatic. But note that if the job doesn't even get scheduled (e.g. due to the job runner not running), you probably won't get an alert at all! Still, this is a decent first line of defense, especially when combined with some of the other approaches below.
Commands run on error
The on_error
hook allows you to run an arbitrary command or script when
borgmatic itself encounters an error running your backups. So for instance,
you can run a script to send yourself a text message alert. But note that if
borgmatic doesn't actually run, this alert won't fire. See error
hooks
below for how to configure this.
Third-party monitoring services
borgmatic integrates with these monitoring services and libraries, pinging them as backups happen:
The idea is that you'll receive an alert when something goes wrong or when the service doesn't hear from borgmatic for a configured interval (if supported). See the documentation links above for configuration information.
While these services and libraries offer different features, you probably only need to use one of them at most.
Third-party monitoring software
You can use traditional monitoring software to consume borgmatic JSON output and track when the last successful backup occurred. See scripting borgmatic below for how to configure this.
Borg hosting providers
Most Borg hosting providers include monitoring and alerting as part of their offering. This gives you a dashboard to check on all of your backups, and can alert you if the service doesn't hear from borgmatic for a configured interval.
Consistency checks
While not strictly part of monitoring, if you want confidence that your backups are not only running but are restorable as well, you can configure particular consistency checks or even script full extract tests.
Error hooks
When an error occurs during a create
, prune
, compact
, or check
action,
borgmatic can run configurable shell commands to fire off custom error
notifications or take other actions, so you can get alerted as soon as
something goes wrong. Here's a not-so-useful example:
on_error:
- echo "Error while creating a backup or running a backup hook."
Prior to version 1.8.0 Put
this option in the hooks:
section of your configuration.
The on_error
hook supports interpolating particular runtime variables into
the hook command. Here's an example that assumes you provide a separate shell
script to handle the alerting:
on_error:
- send-text-message.sh {configuration_filename} {repository}
In this example, when the error occurs, borgmatic interpolates runtime values into the hook command: the borgmatic configuration filename and the path of the repository. Here's the full set of supported variables you can use here:
configuration_filename
: borgmatic configuration filename in which the error occurredrepository
: path of the repository in which the error occurred (may be blank if the error occurs in a hook)error
: the error message itselfoutput
: output of the command that failed (may be blank if an error occurred without running a command)
Note that borgmatic runs the on_error
hooks only for create
, prune
,
compact
, or check
actions/hooks in which an error occurs and not other
actions. borgmatic does not run on_error
hooks if an error occurs within a
before_everything
or after_everything
hook. For more about hooks, see the
borgmatic hooks
documentation,
especially the security information.
New in version 1.8.7 borgmatic automatically escapes these interpolated values to prevent shell injection attacks. One implication of this change is that you shouldn't wrap the interpolated values in your own quotes, as that will interfere with the quoting performed by borgmatic and result in your command receiving incorrect arguments. For instance, this won't work:
on_error:
# Don't do this! It won't work, as the {error} value is already quoted.
- send-text-message.sh "Uh oh: {error}"
Do this instead:
on_error:
- send-text-message.sh {error}
Healthchecks hook
Healthchecks is a service that provides "instant alerts when your cron jobs fail silently," and borgmatic has built-in integration with it. Once you create a Healthchecks account and project on their site, all you need to do is configure borgmatic with the unique "Ping URL" for your project. Here's an example:
healthchecks:
ping_url: https://hc-ping.com/addffa72-da17-40ae-be9c-ff591afb942a
Prior to version 1.8.0 Put
this option in the hooks:
section of your configuration.
With this configuration, borgmatic pings your Healthchecks project when a
backup begins, ends, or errors, but only when any of the create
, prune
,
compact
, or check
actions are run.
Then, if the actions complete successfully, borgmatic notifies Healthchecks of the success and includes borgmatic logs in the payload data sent to Healthchecks. This means that borgmatic logs show up in the Healthchecks UI, although be aware that Healthchecks currently has a 100-kilobyte limit for the logs in each ping.
If an error occurs during any action or hook, borgmatic notifies Healthchecks,
also tacking on logs including the error itself. But the logs are only
included for errors that occur when a create
, prune
, compact
, or check
action is run.
You can customize the verbosity of the logs that are sent to Healthchecks with
borgmatic's --monitoring-verbosity
flag. The --list
and --stats
flags
may also be of use. See borgmatic create --help
for more information.
Additionally, see the borgmatic configuration
file for
additional Healthchecks options.
You can configure Healthchecks to notify you by a variety of mechanisms when backups fail or it doesn't hear from borgmatic for a certain period of time.
Cronitor hook
Cronitor provides "Cron monitoring and uptime healthchecks for websites, services and APIs," and borgmatic has built-in integration with it. Once you create a Cronitor account and cron job monitor on their site, all you need to do is configure borgmatic with the unique "Ping API URL" for your monitor. Here's an example:
cronitor:
ping_url: https://cronitor.link/d3x0c1
Prior to version 1.8.0 Put
this option in the hooks:
section of your configuration.
With this configuration, borgmatic pings your Cronitor monitor when a backup
begins, ends, or errors, but only when any of the prune
, compact
,
create
, or check
actions are run. Then, if the actions complete
successfully or errors, borgmatic notifies Cronitor accordingly.
You can configure Cronitor to notify you by a variety of mechanisms when backups fail or it doesn't hear from borgmatic for a certain period of time.
Cronhub hook
Cronhub provides "instant alerts when any of your background jobs fail silently or run longer than expected," and borgmatic has built-in integration with it. Once you create a Cronhub account and monitor on their site, all you need to do is configure borgmatic with the unique "Ping URL" for your monitor. Here's an example:
cronhub:
ping_url: https://cronhub.io/start/1f5e3410-254c-11e8-b61d-55875966d031
Prior to version 1.8.0 Put
this option in the hooks:
section of your configuration.
With this configuration, borgmatic pings your Cronhub monitor when a backup
begins, ends, or errors, but only when any of the prune
, compact
,
create
, or check
actions are run. Then, if the actions complete
successfully or errors, borgmatic notifies Cronhub accordingly.
Note that even though you configure borgmatic with the "start" variant of the ping URL, borgmatic substitutes the correct state into the URL when pinging Cronhub ("start", "finish", or "fail").
You can configure Cronhub to notify you by a variety of mechanisms when backups fail or it doesn't hear from borgmatic for a certain period of time.
PagerDuty hook
In case you're new here: borgmatic is simple, configuration-driven backup software for servers and workstations, powered by Borg Backup.
PagerDuty provides incident monitoring and alerting. borgmatic has built-in integration that can notify you via PagerDuty as soon as a backup fails, so you can make sure your backups keep working.
First, create a PagerDuty account and service on their site. On the service, add an integration and set the Integration Type to "borgmatic".
Then, configure borgmatic with the unique "Integration Key" for your service. Here's an example:
pagerduty:
integration_key: a177cad45bd374409f78906a810a3074
Prior to version 1.8.0 Put
this option in the hooks:
section of your configuration.
With this configuration, borgmatic creates a PagerDuty event for your service
whenever backups fail, but only when any of the create
, prune
, compact
,
or check
actions are run. Note that borgmatic does not contact PagerDuty
when a backup starts or when it ends without error.
You can configure PagerDuty to notify you by a variety of mechanisms when backups fail.
If you have any issues with the integration, please contact us.
ntfy hook
New in version 1.6.3 ntfy is a free, simple, service (either hosted or self-hosted) which offers simple pub/sub push notifications to multiple platforms including web, Android and iOS.
Since push notifications for regular events might soon become quite annoying,
this hook only fires on any errors by default in order to instantly alert you
to issues. The states
list can override this. Each state can have its own
custom messages, priorities and tags or, if none are provided, will use the
default.
An example configuration is shown here with all the available options, including priorities and tags:
ntfy:
topic: my-unique-topic
server: https://ntfy.my-domain.com
username: myuser
password: secret
start:
title: A borgmatic backup started
message: Watch this space...
tags: borgmatic
priority: min
finish:
title: A borgmatic backup completed successfully
message: Nice!
tags: borgmatic,+1
priority: min
fail:
title: A borgmatic backup failed
message: You should probably fix it
tags: borgmatic,-1,skull
priority: max
states:
- start
- finish
- fail
Prior to version 1.8.0 Put
the ntfy:
option in the hooks:
section of your configuration.
New in version 1.8.9 Instead of
username
/password
, you can specify an ntfy access
token:
ntfy:
topic: my-unique-topic
server: https://ntfy.my-domain.com
access_token: tk_AgQdq7mVBoFD37zQVN29RhuMzNIz2
Loki hook
New in version 1.8.3 Grafana Loki is a "horizontally scalable, highly available, multi-tenant log aggregation system inspired by Prometheus." borgmatic has built-in integration with Loki, sending both backup status and borgmatic logs.
You can configure borgmatic to use either a self-hosted Loki instance or a Grafana Cloud account. Start by setting your Loki API push URL. Here's an example:
loki:
url: http://localhost:3100/loki/api/v1/push
With this configuration, borgmatic sends its logs to your Loki instance as any
of the prune
, compact
, create
, or check
actions are run. Then, after
the actions complete, borgmatic notifies Loki of success or failure.
This hook supports sending arbitrary labels to Loki. For instance:
loki:
url: http://localhost:3100/loki/api/v1/push
labels:
app: borgmatic
hostname: example.org
There are also a few placeholders you can optionally use as label values:
__config
: name of the borgmatic configuration file__config_path
: full path of the borgmatic configuration file__hostname
: the local machine hostname
These placeholders are only substituted for the whole label value, not interpolated into a larger string. For instance:
loki:
url: http://localhost:3100/loki/api/v1/push
labels:
app: borgmatic
config: __config
hostname: __hostname
Also check out this Loki dashboard for borgmatic if you'd like to see your backup logs and statistics in one place.
Apprise hook
New in version 1.8.4 Apprise is a local notification library that "allows you to send a notification to almost all of the most popular notification services available to us today such as: Telegram, Discord, Slack, Amazon SNS, Gotify, etc."
Depending on how you installed borgmatic, it may not have come with Apprise. For instance, if you originally installed borgmatic with pipx, run the following to install Apprise so borgmatic can use it:
sudo pipx install --force borgmatic[Apprise]
Omit sudo
if borgmatic is installed as a non-root user.
Once Apprise is installed, configure borgmatic to notify one or more Apprise services. For example:
apprise:
services:
- url: gotify://hostname/token
label: gotify
- url: mastodons://access_key@hostname/@user
label: mastodon
With this configuration, borgmatic pings each of the configured Apprise
services when a backup begins, ends, or errors, but only when any of the
prune
, compact
, create
, or check
actions are run.
You can optionally customize the contents of the default messages sent to these services:
apprise:
services:
- url: gotify://hostname/token
label: gotify
start:
title: Ping!
body: Starting backup process.
finish:
title: Ping!
body: Backups successfully made.
fail:
title: Ping!
body: Your backups have failed.
states:
- start
- finish
- fail
New in version 1.8.9 borgmatic logs are automatically included in the body data sent to your Apprise services when a backup finishes or fails.
You can customize the verbosity of the logs that are sent with borgmatic's
--monitoring-verbosity
flag. The --list
and --stats
flags may also be of
use. See borgmatic create --help
for more information.
If you don't want any logs sent, you can disable this feature by setting
send_logs
to false
:
apprise:
services:
- url: gotify://hostname/token
label: gotify
send_logs: false
Or to limit the size of logs sent to Apprise services:
apprise:
services:
- url: gotify://hostname/token
label: gotify
logs_size_limit: 500
This may be necessary for some services that reject large requests.
See the configuration reference for details.
Scripting borgmatic
To consume the output of borgmatic in other software, you can include an
optional --json
flag with create
, rlist
, rinfo
, or info
to get the
output formatted as JSON.
Note that when you specify the --json
flag, Borg's other non-JSON output is
suppressed so as not to interfere with the captured JSON. Also note that JSON
output only shows up at the console and not in syslog.
Latest backups
All borgmatic actions that accept an --archive
flag allow you to specify an
archive name of latest
. This lets you get the latest archive without having
to first run borgmatic rlist
manually, which can be handy in automated
scripts. Here's an example:
borgmatic info --archive latest