Recovery Procedures
Overview
This document describes recovery procedures for when node-hardening changes cause loss of remote access or system instability.
Prerequisites
- Console access via IPMI, VNC, or physical connection
- Knowledge of backup file locations
- Root or sudo access
Scenario 1: SSH Access Lost
Symptoms
- Cannot SSH to the server
- Connection is refused or times out
Recovery Steps
- Access console (IPMI/VNC/physical)
- Run emergency restore:

      cd ~/.claude/skills/node-hardening
      ./scripts/rollback/emergency_restore.sh

- If emergency_restore fails, manually restore:

      # Disable UFW
      sudo ufw --force disable
      # Restore SSH config
      sudo cp /path/to/outputs/backups/sshd_config.before /etc/ssh/sshd_config
      # Restart SSH
      sudo systemctl restart ssh  # or sudo systemctl restart sshd

- Verify from another terminal:

      ssh user@server
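
The manual restore above leaves open which service name to restart (`ssh` or `sshd`). The following is a minimal sketch of one way to pick it, assuming the host uses systemd; `pick_ssh_unit` is a hypothetical helper, not part of the node-hardening scripts. It reads `systemctl list-unit-files` output on stdin so it can be checked without touching the running service.

```shell
# Hypothetical helper: print "ssh" or "sshd" depending on which service unit
# appears in `systemctl list-unit-files` output supplied on stdin.
# Returns non-zero if neither unit is listed.
pick_ssh_unit() {
    awk '$1 == "ssh.service"  { found_ssh = 1 }
         $1 == "sshd.service" { found_sshd = 1 }
         END {
             if (found_ssh) print "ssh"
             else if (found_sshd) print "sshd"
             else exit 1
         }'
}

# Possible usage from the console (not executed here):
#   unit=$(systemctl list-unit-files --type=service --no-legend | pick_ssh_unit) &&
#       sudo sshd -t &&
#       sudo systemctl restart "$unit"
```

In the usage sketch, `sshd -t` checks the restored configuration for syntax errors before the restart, so a damaged backup file is caught while console access is still available.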
Scenario 2: Firewall Blocking All Traffic
Symptoms
- All network services unreachable
- SSH, HTTP, and HTTPS all time out
Recovery Steps
- Access console (IPMI/VNC/physical)
- Disable UFW:

      sudo ufw --force disable

- Verify rules:

      sudo ufw status verbose

- Restore from backup if available:

      sudo iptables-restore < /path/to/outputs/backups/iptables_rules_before.txt
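
The verification step can also be scripted. This is a minimal sketch, assuming `ufw status` prints a `Status: inactive` line when the firewall is disabled (the wording may differ under a non-English locale); `ufw_reports_inactive` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if `ufw status` output on stdin
# contains the exact line "Status: inactive".
ufw_reports_inactive() {
    grep -qx 'Status: inactive'
}

# Possible usage from the console (not executed here):
#   sudo ufw status | ufw_reports_inactive && echo "UFW is disabled"
#
# Optionally, parse the backup without committing it before the real restore:
#   sudo iptables-restore --test < /path/to/outputs/backups/iptables_rules_before.txt
```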
Scenario 3: fail2ban Blocking Legitimate Access
Symptoms
- SSH works from some IPs but not others
- Intermittent connection failures
Recovery Steps
- Check banned IPs:

      sudo fail2ban-client status sshd

- Unban IP:

      sudo fail2ban-client set sshd unbanip <IP_ADDRESS>

- Whitelist operator IP in /etc/fail2ban/jail.local:

      [DEFAULT]
      ignoreip = 127.0.0.1/8 ::1 <OPERATOR_IP>

- Restart fail2ban:

      sudo systemctl restart fail2ban
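
To confirm whether a specific address is the one being blocked, the status output can be filtered. This is a minimal sketch, assuming the output contains a `Banned IP list:` line with space-separated addresses, as recent fail2ban versions print; `is_banned` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the IP given as $1 appears in the
# "Banned IP list:" line of `fail2ban-client status <jail>` output on stdin.
is_banned() {
    sed -n 's/.*Banned IP list:[[:space:]]*//p' |
        tr -s ' ' '\n' |
        grep -Fxq -- "$1"
}

# Possible usage (not executed here):
#   sudo fail2ban-client status sshd | is_banned <IP_ADDRESS> &&
#       sudo fail2ban-client set sshd unbanip <IP_ADDRESS>
```

Each address is matched as a whole line, so checking `192.0.2.1` does not produce a false hit on a banned `192.0.2.10`.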
Backup Locations
| File | Description |
|---|---|
| outputs/backups/sshd_config.before | Original SSH configuration |
| outputs/backups/ufw_status_before.txt | UFW state before changes |
| outputs/backups/iptables_rules_before.txt | iptables rules before changes |
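
Before relying on these files during a recovery, it may help to confirm they exist and are not empty. This is a minimal sketch using the file names listed above; `missing_backups` is a hypothetical helper, and the backup directory passed to it is an assumption about where `outputs/backups` lives on the node.

```shell
# Hypothetical helper: report any expected backup file that is missing or
# empty in the directory given as $1. Returns non-zero if any is reported.
missing_backups() {
    mb_dir="$1"
    mb_status=0
    for mb_name in sshd_config.before ufw_status_before.txt iptables_rules_before.txt; do
        if [ ! -s "$mb_dir/$mb_name" ]; then
            echo "missing or empty: $mb_dir/$mb_name"
            mb_status=1
        fi
    done
    return "$mb_status"
}

# Possible usage (not executed here):
#   missing_backups /path/to/outputs/backups || echo "do not proceed without backups"
```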
Prevention
- Always keep a secondary SSH session open during changes
- Test from a different network before closing sessions
- Have console access ready before running apply scripts
- Review plan output before running apply
Contact
If the recovery procedures fail, escalate to the infrastructure team with:
- Node name and IP
- Time of last successful access
- Changes that were applied
- Error messages from recovery attempts