I host a single-node Kubernetes cluster (MicroK8s) at home running various applications. I use simple hostPath volumes for storage and need a way to back up this data. The solution I chose combines Kubernetes CronJobs with rsync. The basic steps are:
Create a backups namespace
Create a volume for the backup directory
Create a volume mapping the application's data directory
Create a CronJob to perform the sync
I use Ansible for deployments, so I will be showing those task/template files.
Create a Backups Namespace
Originally I created the CronJobs in the same namespace as the applications, but the number of containers they create started to make a mess of my monitoring data in Prometheus. I recommend creating a dedicated namespace for backups.
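The manifest for that is tiny; a sketch (the name `backups` is an assumption):

```yaml
# Hypothetical namespace manifest; any name works as long as the
# backup PVCs and CronJobs are created in the same one.
apiVersion: v1
kind: Namespace
metadata:
  name: backups
```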
Create the Volumes
Use a separate drive from the source data to protect against disk failure.
These are the templates I use to create the volumes:
First I'll create a volume for the backup destination. PersistentVolumes are cluster-scoped, but PersistentVolumeClaims are namespaced. This claim lives in the same namespace as the backup Pods, so they can all share it:
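A minimal sketch of that pair, assuming a hostPath mount at a placeholder path on the separate backup drive (names, path, and sizes are all assumptions):

```yaml
# Hypothetical backup-destination PersistentVolume; /mnt/backup-disk is a
# placeholder path on a drive separate from the application data.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: backups-pv
spec:
  capacity:
    storage: 500Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /mnt/backup-disk
---
# The claim lives in the backups namespace so every backup Pod can share it.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: backups-pvc
  namespace: backups
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Gi
  volumeName: backups-pv
  storageClassName: ""
```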
At this point we have somewhere to write the backups; now we need to set up the reads. I reuse the same templates and just fill in the specifics for each application. rsync will copy everything from the application's data_dir to the backup data_dir.
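The read side follows the same shape: a second PersistentVolume pointing at the application's hostPath directory, claimed from the backups namespace. A sketch, with the source path, names, and size all being assumptions:

```yaml
# Hypothetical read-side PersistentVolume pointing at the application's
# hostPath data; the path below is a placeholder.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nextcloud-data-backup-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadOnlyMany
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /srv/nextcloud-data      # placeholder: the app's hostPath directory
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nextcloud-data-backup-pvc
  namespace: backups
spec:
  accessModes:
    - ReadOnlyMany
  resources:
    requests:
      storage: 100Gi
  volumeName: nextcloud-data-backup-pv
  storageClassName: ""
```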
One gotcha you might run into here is that you can't reuse the existing volume claim of the application you are targeting (unless the backup jobs live in the same namespace as the application, which I don't think they should).
Create the CronJob
For this kind of backup I have three separate CronJobs: daily, weekly, and monthly. The daily and weekly jobs each keep only the most recent copy for their period. The monthly job, however, creates a separate directory for each run, suffixed with the current date.
Below is an example that I use to back up NextCloud data.
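A daily CronJob along these lines might look like the following sketch; the image name, schedule, claim names, and mount paths are all assumptions (the monthly variant would append `$(date +%F)` to the destination directory instead of overwriting a fixed one):

```yaml
# Hypothetical daily backup CronJob; image, schedule, and paths are placeholders.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nextcloud-backup-daily
  namespace: backups
spec:
  schedule: "0 3 * * *"          # every day at 03:00
  concurrencyPolicy: Forbid      # never run two syncs at once
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: rsync
              image: my-registry/alpine-rsync:latest   # placeholder image
              command: ["/bin/sh", "-c"]
              args:
                - mkdir -p /backups/nextcloud &&
                  rsync -a --delete /source/ /backups/nextcloud/daily/
              volumeMounts:
                - name: source
                  mountPath: /source
                  readOnly: true
                - name: backups
                  mountPath: /backups
          volumes:
            - name: source
              persistentVolumeClaim:
                claimName: nextcloud-data-backup-pvc   # placeholder claim
            - name: backups
              persistentVolumeClaim:
                claimName: backups-pvc                 # placeholder claim
```

`--delete` keeps the daily copy an exact mirror of the source, which is what makes it "most recent only"; the dated monthly directories skip that concern by being written once and left alone.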
And finally, probably the most important part is the deployment template.
I have created a Docker image that is just Alpine plus rsync:
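Such an image can be only a few lines; a sketch (the base tag is an assumption):

```dockerfile
# Minimal rsync image: Alpine base plus the rsync package, nothing else.
FROM alpine:3.19
RUN apk add --no-cache rsync
```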