Files
vm-control/README.md
2025-12-27 02:07:32 +00:00

156 lines
5.1 KiB
Markdown

# VaultMesh Command Center
Minimal, no-bloat control plane for VaultMesh nodes.
- Rust backend (Axum) with server-rendered HTML (HTMX-ready)
- Node agent (Rust daemon) that runs on each VaultMesh node
- Zero external infra deps (no Redis, Kafka, k8s)
- Cloudflare-native: fronted by Cloudflare Tunnel + Access
## Repository Layout
```text
vm-control/
├── Cargo.toml # Workspace manifest
├── README.md
├── .gitignore
├── command-center/ # Backend + Web UI
│ ├── Cargo.toml
│ └── src/
│ ├── main.rs # Entry point, server setup
│ ├── routes.rs # HTTP handlers
│ └── state.rs # AppState, in-memory node store
├── node-agent/ # Daemon for each VaultMesh node
│ ├── Cargo.toml
│ └── src/
│ ├── main.rs # Heartbeat loop
│ └── config.rs # Env config loader
├── docs/
│ ├── ARCHITECTURE.md # How it all fits together
│ └── NODE_AGENT_CONTRACT.md # Agent API spec
└── systemd/
├── vaultmesh-command-center.service
└── vaultmesh-node-agent.service
```
## Quick Start
```bash
# Clone and build
git clone <your-url> vm-control
cd vm-control
# Run the command center locally
cd command-center
RUST_LOG=info cargo run
# listens on 127.0.0.1:8088
# In another terminal, run the agent (pointing at local CC)
cd ../node-agent
RUST_LOG=info VAULTMESH_OS_PROFILE=ArchVault cargo run
```
Then:
- Put the Command Center behind a Cloudflare Tunnel.
- Protect it with Cloudflare Access.
- Install the node agent as a systemd service on each VaultMesh node.
## Deployment
### Command Center
1. Build release binary:
```bash
cargo build --release -p vaultmesh-command-center
```
2. Copy to `/usr/local/bin/`:
```bash
sudo cp target/release/vaultmesh-command-center /usr/local/bin/
```
3. Install systemd unit:
```bash
sudo cp systemd/vaultmesh-command-center.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now vaultmesh-command-center
```
4. Configure Cloudflare Tunnel to point at `http://127.0.0.1:8088`.
### Node Agent
1. Build release binary:
```bash
cargo build --release -p vaultmesh-node-agent
```
2. Copy to each node:
```bash
sudo cp target/release/vaultmesh-node-agent /usr/local/bin/
```
3. Create environment file `/etc/vaultmesh/agent.env`:
```bash
VAULTMESH_CC_URL=https://cc.your-domain.example
VAULTMESH_OS_PROFILE=ArchVault
VAULTMESH_ROOT=/var/lib/vaultmesh
VAULTMESH_HEARTBEAT_SECS=30
```
4. Install systemd unit:
```bash
sudo cp systemd/vaultmesh-node-agent.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now vaultmesh-node-agent
```
## API Endpoints
| Method | Path | Description |
|--------|------------------------------------|------------------------------------------|
| GET | `/` | HTML dashboard showing all nodes |
| GET | `/nodes` | JSON array of all node heartbeats |
| GET | `/nodes/:id` | Node detail page (HTML) |
| POST | `/nodes/:id/commands` | Queue command for node (web form or API) |
| POST | `/api/agent/heartbeat` | Agent heartbeat endpoint |
| GET | `/api/agent/commands?node_id=<uuid>` | Agent polls for pending commands |
| POST | `/api/agent/command-result` | Agent reports command execution result |
## Fleet Operation Model (V0.5)
**Green fleet:**
- Attention = `OK`.
- Heartbeats fresh (age < `VAULTMESH_HEARTBEAT_STALE_MINUTES`).
- Last scan age < `VAULTMESH_SCAN_STALE_HOURS`.
- No critical or high findings.
**Yellow/Red fleet:**
Check Attention reasons in order:
1. `heartbeat_stale` → connectivity / host issue.
2. `cloudflare_down` / `services_down` → control plane or local service failure.
3. `never_scanned` / `scan_stale` → wait for scheduler or trigger manual scan (if policy allows).
4. `critical_findings` / `high_findings` → prioritize remediation on that node.
**Policy guardrail:**
- Disallowed commands are blocked with HTTP 403.
- Scheduler respects policy; `sovereign-scan` must be allowed for a profile to be auto-queued.
## Configuration
| Variable | Default | Description |
|------------------------------------|-------------------|------------------------------|
| `VAULTMESH_SCAN_INTERVAL_HOURS` | `24` | Auto-scan interval |
| `VAULTMESH_SCAN_STALE_HOURS` | `48` | Scan staleness threshold |
| `VAULTMESH_HEARTBEAT_STALE_MINUTES`| `10` | Heartbeat staleness |
| `VAULTMESH_SCHEDULER_TICK_SECONDS` | `300` | Scheduler tick interval |
| `VAULTMESH_CC_KEY_PATH` | `cc-ed25519.key` | Command signing key path |
## License
MIT