Files
vm-skills/node-hardening/references/recovery_procedures.md
Vault Sovereign eac77ef7b4 Initial commit: VaultMesh Skills collection
Collection of operational skills for VaultMesh infrastructure including:
- backup-sovereign: Backup and recovery operations
- btc-anchor: Bitcoin anchoring
- cloudflare-tunnel-manager: Cloudflare tunnel management
- container-registry: Container registry operations
- disaster-recovery: Disaster recovery procedures
- dns-sovereign: DNS management
- eth-anchor: Ethereum anchoring
- gitea-bootstrap: Gitea setup and configuration
- hetzner-bootstrap: Hetzner server provisioning
- merkle-forest: Merkle tree operations
- node-hardening: Node security hardening
- operator-bootstrap: Operator initialization
- proof-verifier: Cryptographic proof verification
- rfc3161-anchor: RFC3161 timestamping
- secrets-vault: Secrets management

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-27 00:25:00 +00:00

2.7 KiB

Recovery Procedures

Overview

This document describes recovery procedures for when node-hardening changes cause loss of remote access or system instability.

Prerequisites

  • Console access via IPMI, VNC, or physical connection
  • Knowledge of backup file locations
  • Root or sudo access

Scenario 1: SSH Access Lost

Symptoms

  • Cannot SSH to the server
  • Connection refused or timeout

Recovery Steps

  1. Access console (IPMI/VNC/physical)

  2. Run emergency restore:

    cd ~/.claude/skills/node-hardening
    ./scripts/rollback/emergency_restore.sh
    
  3. If emergency_restore fails, manually restore:

    # Disable UFW
    sudo ufw --force disable
    
    # Restore SSH config
    sudo cp /path/to/outputs/backups/sshd_config.before /etc/ssh/sshd_config
    
    # Restart SSH
    sudo systemctl restart ssh
    # or
    sudo systemctl restart sshd
    
  4. Verify from another terminal:

    ssh user@server
    

Scenario 2: Firewall Blocking All Traffic

Symptoms

  • All network services unreachable
  • SSH, HTTP, HTTPS all timeout

Recovery Steps

  1. Access console (IPMI/VNC/physical)

  2. Disable UFW:

    sudo ufw --force disable
    
  3. Verify rules:

    sudo ufw status verbose
    
  4. Restore from backup if available:

    sudo iptables-restore < /path/to/outputs/backups/iptables_rules_before.txt
    

Scenario 3: fail2ban Blocking Legitimate Access

Symptoms

  • SSH works from some IPs but not others
  • Intermittent connection failures

Recovery Steps

  1. Check banned IPs:

    sudo fail2ban-client status sshd
    
  2. Unban IP:

    sudo fail2ban-client set sshd unbanip <IP_ADDRESS>
    
  3. Whitelist operator IP in /etc/fail2ban/jail.local:

    [DEFAULT]
    ignoreip = 127.0.0.1/8 ::1 <OPERATOR_IP>
    
  4. Restart fail2ban:

    sudo systemctl restart fail2ban
    

Backup Locations

File Description
outputs/backups/sshd_config.before Original SSH configuration
outputs/backups/ufw_status_before.txt UFW state before changes
outputs/backups/iptables_rules_before.txt iptables rules before changes

Prevention

  1. Always keep a secondary SSH session open during changes
  2. Test from a different network before closing sessions
  3. Have console access ready before running apply scripts
  4. Review plan output before running apply

Contact

If recovery procedures fail, escalate to infrastructure team with:

  • Node name and IP
  • Time of last successful access
  • Changes that were applied
  • Error messages from recovery attempts