# VaultMesh Command Center

Minimal, no-bloat control plane for VaultMesh nodes.

- Rust backend (Axum) with server-rendered HTML (HTMX-ready)
- Node agent (Rust daemon) that runs on each VaultMesh node
- Zero external infra deps (no Redis, Kafka, k8s)
- Cloudflare-native: fronted by Cloudflare Tunnel + Access

## Repository Layout

```text
vm-control/
├── Cargo.toml                          # Workspace manifest
├── README.md
├── .gitignore
├── command-center/                     # Backend + Web UI
│   ├── Cargo.toml
│   └── src/
│       ├── main.rs                     # Entry point, server setup
│       ├── routes.rs                   # HTTP handlers
│       └── state.rs                    # AppState, in-memory node store
├── node-agent/                         # Daemon for each VaultMesh node
│   ├── Cargo.toml
│   └── src/
│       ├── main.rs                     # Heartbeat loop
│       └── config.rs                   # Env config loader
├── docs/
│   ├── ARCHITECTURE.md                 # How it all fits together
│   └── NODE_AGENT_CONTRACT.md          # Agent API spec
└── systemd/
    ├── vaultmesh-command-center.service
    └── vaultmesh-node-agent.service
```

## Quick Start

```bash
# Clone and build
git clone <your-url> vm-control
cd vm-control

# Run the command center locally
cd command-center
RUST_LOG=info cargo run
# listens on 127.0.0.1:8088

# In another terminal, run the agent (pointing at local CC)
cd ../node-agent
RUST_LOG=info VAULTMESH_OS_PROFILE=ArchVault cargo run
```

Then:

- Put the Command Center behind a Cloudflare Tunnel.
- Protect it with Cloudflare Access.
- Install the node agent as a systemd service on each VaultMesh node.

## Deployment

### Command Center

1. Build release binary:
   ```bash
   cargo build --release -p vaultmesh-command-center
   ```

2. Copy to `/usr/local/bin/`:
   ```bash
   sudo cp target/release/vaultmesh-command-center /usr/local/bin/
   ```

3. Install systemd unit:
   ```bash
   sudo cp systemd/vaultmesh-command-center.service /etc/systemd/system/
   sudo systemctl daemon-reload
   sudo systemctl enable --now vaultmesh-command-center
   ```

4. Configure Cloudflare Tunnel to point at `http://127.0.0.1:8088`.

### Node Agent

1. Build release binary:
   ```bash
   cargo build --release -p vaultmesh-node-agent
   ```

2. Install the binary into `/usr/local/bin/` on each node:
   ```bash
   sudo cp target/release/vaultmesh-node-agent /usr/local/bin/
   ```

3. Create environment file `/etc/vaultmesh/agent.env`:
   ```bash
   VAULTMESH_CC_URL=https://cc.your-domain.example
   VAULTMESH_OS_PROFILE=ArchVault
   VAULTMESH_ROOT=/var/lib/vaultmesh
   VAULTMESH_HEARTBEAT_SECS=30
   ```

4. Install systemd unit:
   ```bash
   sudo cp systemd/vaultmesh-node-agent.service /etc/systemd/system/
   sudo systemctl daemon-reload
   sudo systemctl enable --now vaultmesh-node-agent
   ```

## API Endpoints

| Method | Path                                 | Description                              |
|--------|--------------------------------------|------------------------------------------|
| GET    | `/`                                  | HTML dashboard showing all nodes         |
| GET    | `/nodes`                             | JSON array of all node heartbeats        |
| GET    | `/nodes/:id`                         | Node detail page (HTML)                  |
| POST   | `/nodes/:id/commands`                | Queue command for node (web form or API) |
| POST   | `/api/agent/heartbeat`               | Agent heartbeat endpoint                 |
| GET    | `/api/agent/commands?node_id=<uuid>` | Agent polls for pending commands         |
| POST   | `/api/agent/command-result`          | Agent reports command execution result   |

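
As a rough illustration of how an agent might address the three `/api/agent/*` endpoints in the table, here is a minimal std-only sketch. The function names are hypothetical and not taken from the node-agent source, and the sketch assumes `node_id` is a UUID string that needs no percent-encoding.

```rust
/// Heartbeat endpoint URL. Trailing slashes on `base` are trimmed so the
/// path separator is not doubled.
fn heartbeat_url(base: &str) -> String {
    format!("{}/api/agent/heartbeat", base.trim_end_matches('/'))
}

/// Command-polling URL for one node.
/// Assumption: `node_id` is a UUID string, so no percent-encoding is applied.
fn commands_url(base: &str, node_id: &str) -> String {
    format!(
        "{}/api/agent/commands?node_id={}",
        base.trim_end_matches('/'),
        node_id
    )
}

/// Command-result endpoint URL.
fn command_result_url(base: &str) -> String {
    format!("{}/api/agent/command-result", base.trim_end_matches('/'))
}
```

With `base` taken from `VAULTMESH_CC_URL`, the same helpers would work against both the local address used in Quick Start and a Cloudflare-fronted hostname.
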
## Fleet Operation Model (V0.5)

**Green fleet:**

- Attention = `OK`.
- Heartbeats fresh (age < `VAULTMESH_HEARTBEAT_STALE_MINUTES`).
- Last scan age < `VAULTMESH_SCAN_STALE_HOURS`.
- No critical or high findings.

**Yellow/Red fleet:**

Check Attention reasons in order:

1. `heartbeat_stale` → connectivity / host issue.
2. `cloudflare_down` / `services_down` → control plane or local service failure.
3. `never_scanned` / `scan_stale` → wait for scheduler or trigger manual scan (if policy allows).
4. `critical_findings` / `high_findings` → prioritize remediation on that node.

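
The green criteria and the triage order above can be sketched as a single function that collects Attention reasons in the listed order, so that an empty result means the node is green. This is an illustrative sketch only: `NodeStatus`, `Thresholds`, and their field names are hypothetical, and the actual Command Center may compute Attention differently.

```rust
/// Attention reasons, named after the reason strings in this README.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AttentionReason {
    HeartbeatStale,
    CloudflareDown,
    ServicesDown,
    NeverScanned,
    ScanStale,
    CriticalFindings,
    HighFindings,
}

/// Hypothetical snapshot of one node; field names are illustrative.
struct NodeStatus {
    heartbeat_age_minutes: u64,
    cloudflare_ok: bool,
    services_ok: bool,
    last_scan_age_hours: Option<u64>, // None = never scanned
    critical_findings: u32,
    high_findings: u32,
}

/// Mirrors VAULTMESH_HEARTBEAT_STALE_MINUTES and VAULTMESH_SCAN_STALE_HOURS.
struct Thresholds {
    heartbeat_stale_minutes: u64,
    scan_stale_hours: u64,
}

/// Collect reasons in triage order: heartbeat, services, scans, findings.
fn attention_reasons(node: &NodeStatus, t: &Thresholds) -> Vec<AttentionReason> {
    use AttentionReason::*;
    let mut reasons = Vec::new();
    // "Fresh" means age < threshold, so stale is age >= threshold.
    if node.heartbeat_age_minutes >= t.heartbeat_stale_minutes {
        reasons.push(HeartbeatStale);
    }
    if !node.cloudflare_ok {
        reasons.push(CloudflareDown);
    }
    if !node.services_ok {
        reasons.push(ServicesDown);
    }
    match node.last_scan_age_hours {
        None => reasons.push(NeverScanned),
        Some(age) if age >= t.scan_stale_hours => reasons.push(ScanStale),
        Some(_) => {}
    }
    if node.critical_findings > 0 {
        reasons.push(CriticalFindings);
    }
    if node.high_findings > 0 {
        reasons.push(HighFindings);
    }
    reasons
}

/// A node is green when no Attention reason applies.
fn is_green(node: &NodeStatus, t: &Thresholds) -> bool {
    attention_reasons(node, t).is_empty()
}
```

Because reasons are pushed in triage order, the first element of the returned vector is the one an operator would look at first.
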
**Policy guardrail:**

- Disallowed commands are blocked with HTTP 403.
- Scheduler respects policy; `sovereign-scan` must be allowed for a profile to be auto-queued.

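
One way to picture the guardrail is a per-profile allowlist that both the command endpoint and the scheduler consult. The sketch below is an assumption about shape, not the project's actual policy code: the `Policy` type is hypothetical, and command names other than `sovereign-scan` would be placeholders.

```rust
use std::collections::HashSet;

/// Hypothetical per-profile policy: the set of command names a profile may run.
struct Policy {
    allowed: HashSet<String>,
}

impl Policy {
    fn new(commands: &[&str]) -> Self {
        Policy {
            allowed: commands.iter().map(|c| c.to_string()).collect(),
        }
    }

    /// Ok(()) if the command may be queued; Err(403) mirrors the HTTP status
    /// this README describes for disallowed commands.
    fn check(&self, command: &str) -> Result<(), u16> {
        if self.allowed.contains(command) {
            Ok(())
        } else {
            Err(403)
        }
    }

    /// The scheduler only auto-queues scans for profiles that allow
    /// `sovereign-scan`.
    fn allows_auto_scan(&self) -> bool {
        self.check("sovereign-scan").is_ok()
    }
}
```

Routing the scheduler through the same `check` as manual requests keeps a single source of truth, so a profile cannot be auto-scanned if an operator would be refused the same command.
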
## Configuration

| Variable                            | Default          | Description              |
|-------------------------------------|------------------|--------------------------|
| `VAULTMESH_SCAN_INTERVAL_HOURS`     | `24`             | Auto-scan interval       |
| `VAULTMESH_SCAN_STALE_HOURS`        | `48`             | Scan staleness threshold |
| `VAULTMESH_HEARTBEAT_STALE_MINUTES` | `10`             | Heartbeat staleness      |
| `VAULTMESH_SCHEDULER_TICK_SECONDS`  | `300`            | Scheduler tick interval  |
| `VAULTMESH_CC_KEY_PATH`             | `cc-ed25519.key` | Command signing key path |

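
A minimal sketch of reading these variables with the defaults from the table, using only the standard library. The `CcConfig` struct and function names are hypothetical, and silently falling back to the default on an unparsable value is a choice made for this sketch; the real loader may reject bad input instead.

```rust
use std::path::PathBuf;

/// Hypothetical config struct; field names mirror the variables in the table.
#[derive(Debug, PartialEq)]
struct CcConfig {
    scan_interval_hours: u64,
    scan_stale_hours: u64,
    heartbeat_stale_minutes: u64,
    scheduler_tick_seconds: u64,
    cc_key_path: PathBuf,
}

/// Read a numeric setting, falling back to `default` when the variable is
/// unset or is not a valid unsigned integer.
fn num(lookup: &dyn Fn(&str) -> Option<String>, key: &str, default: u64) -> u64 {
    lookup(key)
        .and_then(|v| v.trim().parse::<u64>().ok())
        .unwrap_or(default)
}

/// Build the config from any key lookup, which keeps the logic testable
/// without touching the process environment.
fn load_config(lookup: &dyn Fn(&str) -> Option<String>) -> CcConfig {
    CcConfig {
        scan_interval_hours: num(lookup, "VAULTMESH_SCAN_INTERVAL_HOURS", 24),
        scan_stale_hours: num(lookup, "VAULTMESH_SCAN_STALE_HOURS", 48),
        heartbeat_stale_minutes: num(lookup, "VAULTMESH_HEARTBEAT_STALE_MINUTES", 10),
        scheduler_tick_seconds: num(lookup, "VAULTMESH_SCHEDULER_TICK_SECONDS", 300),
        cc_key_path: lookup("VAULTMESH_CC_KEY_PATH")
            .map(PathBuf::from)
            .unwrap_or_else(|| PathBuf::from("cc-ed25519.key")),
    }
}

/// Read the settings from the real process environment.
fn load_config_from_env() -> CcConfig {
    load_config(&|key| std::env::var(key).ok())
}
```

Passing the lookup as a closure means a test can supply a fixed map of values, while `load_config_from_env` covers the normal startup path.
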
## License

MIT