Implement restic backup support for Darwin systems #52

Open
opened 2025-11-08 10:11:04 -08:00 by onlyhavecans · 0 comments

Summary

Implement restic backup configuration for Darwin (macOS) systems to achieve feature parity with the existing NixOS backup implementation. Currently, nixos/backups/ provides automated backups to SFTP and AWS S3 for NixOS hosts, but nix-darwin does not support native restic options, requiring a custom implementation.

Background

The NixOS backup system (nixos/backups/) provides:

  • Automated backups to dual destinations (SFTP to Edelgard, AWS S3)
  • Per-host exclude patterns
  • SOPS-encrypted credentials management
  • Systemd timers for scheduled backups
  • Email notifications on failure
  • Maintenance tasks (prune, check, reporting)

Reference: nixos/backups/client.nix, nixos/backups/shared.nix, nixos/backups/maintenance.nix

Key Differences: NixOS vs Darwin

| Aspect | NixOS | Darwin |
|--------|-------|--------|
| Service Manager | systemd (services, timers) | launchd (agents, plist files) |
| Restic Module | services.restic.backups | None (custom implementation needed) |
| Home Path | /home/dos | /Users/dos |
| Notification System | systemd hooks + journalctl | Exit-code checks in wrapper scripts + log files |
| Scheduling | OnBootSec, OnUnitActiveSec, RandomizedDelaySec | StartCalendarInterval |

Implementation Tasks

1. Create Directory Structure

Create darwin/backups/ with the following files:

  • default.nix - Main backup implementation using launchd
  • shared.nix - Helper functions and configuration generators
  • Catra-restic-excludes.txt - Per-host exclude patterns
  • Madison-restic-excludes.txt
  • Piper-restic-excludes.txt

2. Implement darwin/backups/shared.nix

Port helper functions from NixOS version:

  • destinations list (sftp, aws)
  • secretPaths helpers for SOPS secret paths
  • mkBackupConfig, mkSftpConfig, mkAwsConfig configuration builders
  • Export via _module.args.backupHelpers
  • Remove systemd-specific code
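
A minimal sketch of how the ported helpers might be exported. The secret naming scheme ("restic/<destination>/...") and the shape of secretPaths are assumptions for illustration, not the actual contents of nixos/backups/shared.nix, and the configuration builders are left out.

```nix
# darwin/backups/shared.nix (sketch, not the actual NixOS helper code)
{ config, ... }:
let
  # One launchd job is generated per destination.
  destinations = [ "sftp" "aws" ];

  # Assumption: secrets are declared in sops-nix as "restic/<dest>/<name>".
  secretPaths = dest: {
    passwordFile = config.sops.secrets."restic/${dest}/password".path;
    repositoryFile = config.sops.secrets."restic/${dest}/repository".path;
  };
in
{
  # Makes `backupHelpers` available as a module argument in sibling modules.
  _module.args.backupHelpers = {
    inherit destinations secretPaths;
  };
}
```

Exporting through _module.args keeps the helpers out of the option namespace, so other modules can take backupHelpers as a function argument.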

3. Implement darwin/backups/default.nix

Core backup implementation:

  • SSH Configuration: Port SFTP config from NixOS (known hosts, identity file)
  • Backup Options: Add skwrls.backups.paths option (defaults to vars.home)
  • launchd Agents: Create agents for each destination:
    • restic-backup-sftp
    • restic-backup-aws
  • Wrapper Scripts: Shell scripts that:
    • Source SOPS secrets (passwordFile, repositoryFile, AWS credentials)
    • Execute restic with flags: --exclude-file, --skip-if-unchanged, --retry-lock 30m
    • Handle error logging and notifications
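  • Scheduling: Use StartCalendarInterval with one entry per even hour (Hour = 0, 2, ..., 22; Minute = 0) to run every 2 hours; StartInterval = 7200 is the closer analogue of the interval-based NixOS timer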
  • Environment Variables: Set RESTIC_PASSWORD_FILE, RESTIC_REPOSITORY, AWS credentials
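
The pieces above could fit together roughly as follows. This is a sketch under several assumptions: vars and backupHelpers are module arguments, backupHelpers.secretPaths dest is a hypothetical helper returning passwordFile and repositoryFile paths, the exclude file is named after networking.hostName, and the repository is passed through RESTIC_REPOSITORY_FILE rather than RESTIC_REPOSITORY. Loading AWS credentials for the aws destination is not shown.

```nix
# Sketch of the core module; helper names and log locations are assumptions.
{ config, lib, pkgs, vars, backupHelpers, ... }:
let
  cfg = config.skwrls.backups;

  # Assumption: exclude files are named "<hostName>-restic-excludes.txt".
  excludeFile = ./. + "/${config.networking.hostName}-restic-excludes.txt";

  # 00:00, 02:00, ..., 22:00: one calendar entry per even hour.
  everyTwoHours = builtins.genList (i: { Hour = i * 2; Minute = 0; }) 12;

  mkWrapper = dest:
    let secrets = backupHelpers.secretPaths dest;
    in pkgs.writeShellScript "restic-backup-${dest}" ''
      set -euo pipefail
      export RESTIC_PASSWORD_FILE=${secrets.passwordFile}
      export RESTIC_REPOSITORY_FILE=${secrets.repositoryFile}
      exec ${pkgs.restic}/bin/restic backup \
        --exclude-file ${excludeFile} \
        --skip-if-unchanged \
        --retry-lock 30m \
        ${lib.escapeShellArgs cfg.paths}
    '';

  mkAgent = dest: lib.nameValuePair "restic-backup-${dest}" {
    serviceConfig = {
      ProgramArguments = [ "${mkWrapper dest}" ];
      StartCalendarInterval = everyTwoHours;
      StandardOutPath = "${vars.home}/Library/Logs/restic-backup-${dest}.log";
      StandardErrorPath = "${vars.home}/Library/Logs/restic-backup-${dest}.log";
    };
  };
in
{
  options.skwrls.backups.paths = lib.mkOption {
    type = lib.types.listOf lib.types.str;
    default = [ vars.home ];
    description = "Paths to include in each restic backup.";
  };

  config.launchd.agents =
    lib.listToAttrs (map mkAgent backupHelpers.destinations);
}
```

If the jobs need to run as root (for example to read root-owned SOPS secrets), launchd.daemons accepts the same serviceConfig attributes.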

4. Create Host-Specific Exclude Files

Create exclude patterns for each Darwin host (similar to Morgan-restic-excludes.txt):

  • Exclude /Users/dos/* by default
  • Include important directories: Code, Desktop, Documents, Pictures, etc.
  • Exclude macOS-specific cache directories:
    • Library/Caches
    • Library/Application Support/*/Cache
    • .Trash
  • Exclude common build artifacts: node_modules, build, dist, target, *.log
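
As an illustration of the pattern order, here is a sketch of one exclude file written as a Nix expression; the same text could live directly in Catra-restic-excludes.txt. It relies on restic's documented "!" negation in exclude files, where later patterns override earlier ones. The list of re-included directories is only the subset named above.

```nix
# Sketch: generates an exclude file equivalent to a hand-written .txt file.
{ pkgs, ... }:
pkgs.writeText "Catra-restic-excludes.txt" ''
  # Exclude everything directly under the home directory...
  /Users/dos/*
  # ...then re-include the directories worth keeping.
  !/Users/dos/Code
  !/Users/dos/Desktop
  !/Users/dos/Documents
  !/Users/dos/Pictures

  # macOS caches; these only matter if Library is ever re-included.
  Library/Caches
  Library/Application Support/*/Cache
  .Trash

  # Build artifacts inside the re-included directories.
  node_modules
  build
  dist
  target
  *.log
''
```

Patterns without a leading slash match at any depth, so an entry such as target also excludes a directory of that name under Documents.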

5. Configure SOPS Secrets

  • Configure SOPS templates for AWS environment file (similar to NixOS)
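
A sketch of what the template could look like, assuming the sops-nix Darwin module exposes the same sops.templates and sops.placeholder interface as the NixOS module. The secret names are hypothetical.

```nix
{ config, ... }:
{
  # Hypothetical secret names; reuse whatever the NixOS side already defines.
  sops.secrets."restic/aws/access_key_id" = { };
  sops.secrets."restic/aws/secret_access_key" = { };

  # Rendered at activation time with placeholders replaced by decrypted values.
  sops.templates."restic-aws.env".content = ''
    AWS_ACCESS_KEY_ID=${config.sops.placeholder."restic/aws/access_key_id"}
    AWS_SECRET_ACCESS_KEY=${config.sops.placeholder."restic/aws/secret_access_key"}
  '';
}
```

The rendered file path is then available as config.sops.templates."restic-aws.env".path for a wrapper script to source.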

6. Implement Email Notifications

Port notification system from NixOS:

  • On failure: Send email with log excerpt using msmtp
  • launchd has no equivalent of systemd failure hooks, so trigger notification scripts from the wrapper scripts
  • Check exit codes in wrapper scripts
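
One way the exit-code check and email could be combined is sketched below. The recipient address is a placeholder, msmtp is assumed to be configured already, and mkNotifyingWrapper is a hypothetical helper name.

```nix
{ lib, pkgs, ... }:
let
  # Placeholder recipient; msmtp account configuration is assumed to exist.
  notifyTo = "admin@example.com";
in
{
  # Runs `command`, and mails the tail of `logFile` if it exits non-zero.
  mkNotifyingWrapper = { name, command, logFile }:
    pkgs.writeShellScript "${name}-notify" ''
      status=0
      ${command} || status=$?
      if [ "$status" -ne 0 ]; then
        {
          printf 'To: %s\n' ${lib.escapeShellArg notifyTo}
          printf 'Subject: %s failed with exit code %s\n\n' \
            ${lib.escapeShellArg name} "$status"
          tail -n 50 ${lib.escapeShellArg logFile}
        } | ${pkgs.msmtp}/bin/msmtp ${lib.escapeShellArg notifyTo}
      fi
      # Preserve the original exit code so launchd still sees the failure.
      exit "$status"
    '';
}
```

Here command is the store path of the backup wrapper, for example the output of another pkgs.writeShellScript call.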

7. Update Host Configurations

Import backup module in Darwin hosts:

  • Add ../../darwin/backups to imports in hosts/Madison/default.nix
  • Add ../../darwin/backups to imports in hosts/Piper/default.nix
  • Add ../../darwin/backups to imports in hosts/Catra/default.nix

Technical Challenges

1. launchd vs systemd

  • Scheduling: Replicate timer behavior with StartCalendarInterval
  • Randomization: No RandomizedDelaySec equivalent (may need wrapper script with random sleep)
  • Environment: Different environment variable handling in launchd
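
The random-sleep idea and explicit environment handling might look like this. The 600-second bound and the placeholder backup script are assumptions for illustration; PATH includes OpenSSH because restic's sftp backend invokes ssh.

```nix
{ pkgs, ... }:
let
  # Stand-in for the real backup wrapper.
  backup = pkgs.writeShellScript "restic-backup-sftp" ''
    echo "placeholder for the real backup wrapper"
  '';

  # Runs `wrapper` after a random delay below `maxDelaySec` seconds.
  # bash's RANDOM tops out at 32767, which caps the usable delay.
  withJitter = { name, wrapper, maxDelaySec ? 600 }:
    pkgs.writeShellScript "${name}-jitter" ''
      sleep $(( RANDOM % ${toString maxDelaySec} ))
      exec ${wrapper}
    '';
in
{
  launchd.agents.restic-backup-sftp.serviceConfig = {
    ProgramArguments = [
      "${withJitter { name = "restic-backup-sftp"; wrapper = backup; }}"
    ];
    # launchd jobs do not inherit a login shell environment.
    EnvironmentVariables.PATH = "${pkgs.openssh}/bin:/usr/bin:/bin";
  };
}
```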

2. No systemd failure hooks

  • Check exit codes in wrapper scripts
  • Use launchd KeepAlive with SuccessfulExit = false for automatic retries
  • Implement notification logic directly in scripts
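
A sketch of the retry setting in nix-darwin terms. ThrottleInterval and its value are additions for illustration, chosen to keep a persistently failing backup from relaunching in a tight loop.

```nix
{
  launchd.agents.restic-backup-sftp.serviceConfig = {
    # Relaunch the job whenever the wrapper exits non-zero.
    KeepAlive.SuccessfulExit = false;
    # Minimum number of seconds between launches; 600 is illustrative.
    ThrottleInterval = 600;
  };
}
```

Per the launchd.plist documentation, SuccessfulExit implies RunAtLoad, so the job also runs once when it is loaded. A backup that can never succeed would be retried indefinitely, which may mean repeated failure emails.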

3. Logging

  • systemd journal → macOS unified logging or custom log files
  • Use StandardErrorPath and StandardOutPath in launchd config
  • Consider using logger command for syslog integration
  • NixOS SSH config uses IdentityAgent none
    • Verify this works on Darwin or adjust as needed

Acceptance Criteria

  • SSH configuration properly set up for SFTP access to Edelgard
  • Backups run automatically on schedule (every 2 hours)
  • Email notifications sent on backup failures
  • Backups successfully write to both SFTP and AWS destinations
  • just check and just test pass with new configuration
  • Documentation updated in CLAUDE.md if needed
  • Can run sudo restic-aws snapshots and see snapshots
  • Can run sudo restic-sftp snapshots and see snapshots

Expected Behavior

After implementation:

  1. Each Darwin host (Catra, Madison, Piper) runs automated restic backups
  2. Backups occur every 2 hours to both SFTP (Edelgard) and AWS S3
  3. Only configured directories are backed up (excluding caches, build artifacts)
  4. Backups use SOPS-encrypted credentials securely
  5. Email notifications sent on failures with relevant log output
  6. All systems have appropriate wrappers to run restic with the environments set
  7. System behaves identically to NixOS backup system (from user perspective)
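
The per-destination command wrappers mentioned in item 6 could be generated along these lines. This sketch assumes a hypothetical backupHelpers.secretPaths dest helper returning passwordFile and repositoryFile paths, and it does not load AWS credentials, which restic-aws would also need.

```nix
{ pkgs, backupHelpers, ... }:
let
  # Produces a `restic-<dest>` command with the repository settings preloaded.
  mkCli = dest:
    let secrets = backupHelpers.secretPaths dest;
    in pkgs.writeShellScriptBin "restic-${dest}" ''
      export RESTIC_PASSWORD_FILE=${secrets.passwordFile}
      export RESTIC_REPOSITORY_FILE=${secrets.repositoryFile}
      exec ${pkgs.restic}/bin/restic "$@"
    '';
in
{
  environment.systemPackages = map mkCli backupHelpers.destinations;
}
```

With this in place, sudo restic-sftp snapshots passes the snapshots subcommand straight through to restic.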

References

  • NixOS Restic Module: https://github.com/NixOS/nixpkgs/blob/nixos-unstable/nixos/modules/services/backup/restic.nix
  • NixOS Implementation: nixos/backups/client.nix:1-70
  • Shared Helpers: nixos/backups/shared.nix:1-98
  • Exclude Pattern Example: nixos/backups/Morgan-restic-excludes.txt:1-85
  • nix-darwin launchd docs: https://daiderd.com/nix-darwin/manual/index.html#opt-launchd.agents
  • Restic documentation: https://restic.readthedocs.io/

Notes

  • This is a complete reimplementation due to lack of native restic support in nix-darwin
  • Should maintain configuration consistency with NixOS implementation where possible
  • Do not implement maintenance tasks
  • May need to handle macOS-specific permissions for certain directories
  • May need to account for macOS-specific local network access permissions
  • Unified-log entries can be viewed with: log show --predicate 'subsystem == "restic-backup-sftp"' --last 1h (this only matches if the job logs under that subsystem; output redirected via StandardOutPath and StandardErrorPath goes to the log files instead)
  • Test thoroughly before enabling on all hosts
Reference
ops/nixos-skwrls#52