chore: initial import
docs/NODE_AGENT_CONTRACT.md
# Node Agent Contract

This document defines the API contract between VaultMesh Node Agents and the Command Center.

## Heartbeat Endpoint

### Request

```
POST /api/agent/heartbeat
Content-Type: application/json
```

### Heartbeat Schema

```json
{
  "node_id": "UUID v4",
  "hostname": "string",
  "os_profile": "string",
  "cloudflare_ok": "boolean",
  "services_ok": "boolean",
  "vaultmesh_root": "string",
  "timestamp": "ISO 8601 / RFC 3339"
}
```

### Field Definitions

| Field | Type | Description |
|-------|------|-------------|
| `node_id` | UUID v4 | Unique identifier for this node. Should persist across reboots. |
| `hostname` | String | System hostname (from `hostname::get()` or `/etc/hostname`) |
| `os_profile` | String | VaultMesh profile name: `ArchVault`, `DebianVault`, etc. |
| `cloudflare_ok` | Boolean | `true` if the `cloudflared` service is active |
| `services_ok` | Boolean | `true` if `VAULTMESH_ROOT` exists and is a directory |
| `vaultmesh_root` | String | Path to `VAULTMESH_ROOT` (e.g., `/var/lib/vaultmesh`) |
| `timestamp` | RFC 3339 | UTC timestamp of when the heartbeat was generated |
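
As an illustration of the schema above, here is a minimal std-only sketch of building the request body. The `Heartbeat` struct, `json_escape` helper, and `to_json` method are hypothetical names chosen for this example; an actual agent would more likely rely on a serialization crate, and the timestamp is assumed to be formatted by the caller.

```rust
/// Hypothetical in-memory form of the heartbeat payload (std only, no serde).
pub struct Heartbeat {
    pub node_id: String,
    pub hostname: String,
    pub os_profile: String,
    pub cloudflare_ok: bool,
    pub services_ok: bool,
    pub vaultmesh_root: String,
    /// RFC 3339 UTC timestamp, assumed to be formatted by the caller.
    pub timestamp: String,
}

/// Escape the characters that JSON does not allow raw inside a string literal.
pub fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

impl Heartbeat {
    /// Serialize to a JSON object with the seven fields from the schema.
    pub fn to_json(&self) -> String {
        format!(
            "{{\"node_id\":\"{}\",\"hostname\":\"{}\",\"os_profile\":\"{}\",\
             \"cloudflare_ok\":{},\"services_ok\":{},\
             \"vaultmesh_root\":\"{}\",\"timestamp\":\"{}\"}}",
            json_escape(&self.node_id),
            json_escape(&self.hostname),
            json_escape(&self.os_profile),
            self.cloudflare_ok,
            self.services_ok,
            json_escape(&self.vaultmesh_root),
            json_escape(&self.timestamp),
        )
    }
}
```

The booleans are emitted as bare JSON `true`/`false`, not as quoted strings; the quoted values in the schema listing describe types rather than literal content.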

### Response

**Success (200 OK)**:
```json
{
  "status": "ok"
}
```

**Error (4xx/5xx)**:
```json
{
  "error": "description"
}
```

## Environment Variables

The node agent is configured via environment variables, typically set in `/etc/vaultmesh/agent.env`.

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `VAULTMESH_NODE_ID` | No | Auto-generated UUID v4 | Persistent node identifier |
| `VAULTMESH_CC_URL` | No | `http://127.0.0.1:8088` | Command Center base URL |
| `VAULTMESH_OS_PROFILE` | No | `ArchVault` | OS profile name to report |
| `VAULTMESH_ROOT` | No | `/var/lib/vaultmesh` | Path checked for `services_ok` |
| `VAULTMESH_HEARTBEAT_SECS` | No | `30` | Seconds between heartbeats |
| `RUST_LOG` | No | `info` | Log level (`trace`, `debug`, `info`, `warn`, `error`) |
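
The defaults in the table above could be applied roughly as follows. This is a sketch only: `AgentConfig`, `from_lookup`, and `from_env` are hypothetical names, and falling back to the default when a value is empty or unparsable is an assumption, not documented agent behavior. `RUST_LOG` is left out because it is normally consumed by the logging setup rather than by agent code.

```rust
use std::time::Duration;

/// Hypothetical configuration struct mirroring the variables in the table.
pub struct AgentConfig {
    /// `None` means the agent has to generate an ID itself.
    pub node_id: Option<String>,
    pub cc_url: String,
    pub os_profile: String,
    pub vaultmesh_root: String,
    pub heartbeat_interval: Duration,
}

impl AgentConfig {
    /// Build the config from any key/value source, which keeps it testable.
    pub fn from_lookup<F>(lookup: F) -> Self
    where
        F: Fn(&str) -> Option<String>,
    {
        // Assumption: an empty value is treated the same as an unset one.
        let get = |key: &str, default: &str| {
            lookup(key)
                .filter(|v| !v.is_empty())
                .unwrap_or_else(|| default.to_string())
        };
        // Assumption: a non-numeric or zero interval falls back to 30 seconds.
        let secs = lookup("VAULTMESH_HEARTBEAT_SECS")
            .and_then(|v| v.trim().parse::<u64>().ok())
            .filter(|&s| s > 0)
            .unwrap_or(30);

        AgentConfig {
            node_id: lookup("VAULTMESH_NODE_ID").filter(|v| !v.is_empty()),
            cc_url: get("VAULTMESH_CC_URL", "http://127.0.0.1:8088"),
            os_profile: get("VAULTMESH_OS_PROFILE", "ArchVault"),
            vaultmesh_root: get("VAULTMESH_ROOT", "/var/lib/vaultmesh"),
            heartbeat_interval: Duration::from_secs(secs),
        }
    }

    /// Read the same variables from the process environment.
    pub fn from_env() -> Self {
        Self::from_lookup(|key| std::env::var(key).ok())
    }
}
```

Taking a lookup closure instead of calling `std::env::var` directly lets the defaults be exercised without mutating the process environment.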

### Example `/etc/vaultmesh/agent.env`

```bash
VAULTMESH_NODE_ID=550e8400-e29b-41d4-a716-446655440000
VAULTMESH_CC_URL=https://cc.vaultmesh.example
VAULTMESH_OS_PROFILE=ArchVault
VAULTMESH_ROOT=/var/lib/vaultmesh
VAULTMESH_HEARTBEAT_SECS=30
RUST_LOG=info
```

## Node Registration

Nodes self-register on first heartbeat. There is no explicit registration endpoint.

When the Command Center receives a heartbeat with a new `node_id`, it creates a new entry. Subsequent heartbeats update the existing entry.
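
The create-or-update behavior described above amounts to an upsert keyed on `node_id`. The sketch below models it with an in-memory map; `Registry`, `NodeEntry`, and the `heartbeat_count` field are hypothetical and say nothing about how the Command Center actually stores nodes.

```rust
use std::collections::HashMap;

/// Hypothetical per-node record kept by the Command Center.
#[derive(Debug, Clone, PartialEq)]
pub struct NodeEntry {
    pub hostname: String,
    pub last_seen: String,
    /// Illustrative counter, not part of the contract.
    pub heartbeat_count: u64,
}

/// Hypothetical in-memory registry keyed by `node_id`.
#[derive(Default)]
pub struct Registry {
    nodes: HashMap<String, NodeEntry>,
}

impl Registry {
    /// Record a heartbeat. Returns `true` when the node was newly registered.
    pub fn record_heartbeat(&mut self, node_id: &str, hostname: &str, timestamp: &str) -> bool {
        if let Some(entry) = self.nodes.get_mut(node_id) {
            // Known node: update the existing entry in place.
            entry.hostname = hostname.to_string();
            entry.last_seen = timestamp.to_string();
            entry.heartbeat_count += 1;
            return false;
        }
        // Unknown node: the first heartbeat doubles as registration.
        self.nodes.insert(
            node_id.to_string(),
            NodeEntry {
                hostname: hostname.to_string(),
                last_seen: timestamp.to_string(),
                heartbeat_count: 1,
            },
        );
        true
    }

    pub fn get(&self, node_id: &str) -> Option<&NodeEntry> {
        self.nodes.get(node_id)
    }

    pub fn len(&self) -> usize {
        self.nodes.len()
    }
}
```

Because the key is the `node_id` alone, an agent that starts with a different ID shows up as a separate node.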

### Node ID Persistence

For consistent tracking, `VAULTMESH_NODE_ID` should be persisted. Options:

1. **Environment file**: Set it in `/etc/vaultmesh/agent.env`.
2. **Machine ID**: The ID could be derived from `/etc/machine-id`.
3. **Auto-generated**: If not set, the agent generates a new UUID on each start (not recommended for production).
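
**Recommended**: Generate a UUID once during node bootstrap and store it in `agent.env`:

```bash
# During node bootstrap: append an ID only if none is set yet,
# so re-running bootstrap does not add a second, conflicting entry.
grep -q '^VAULTMESH_NODE_ID=' /etc/vaultmesh/agent.env 2>/dev/null \
  || echo "VAULTMESH_NODE_ID=$(uuidgen)" >> /etc/vaultmesh/agent.env
```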

## Health Checks

### cloudflare_ok

The agent runs:
```bash
systemctl is-active --quiet cloudflared
```

`cloudflare_ok` is `true` if the exit code is 0 (service active).
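
A minimal sketch of this check using only the standard library is shown below. The function names `command_succeeds` and `cloudflare_ok` are hypothetical, and reporting `false` when `systemctl` cannot be launched at all (for example on a host without systemd) is an assumption rather than documented behavior.

```rust
use std::process::{Command, Stdio};

/// Run a command silently and report whether it exited with status 0.
/// Assumption: a command that cannot be started counts as a failed check.
pub fn command_succeeds(program: &str, args: &[&str]) -> bool {
    Command::new(program)
        .args(args)
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|status| status.success())
        .unwrap_or(false)
}

/// Equivalent of `systemctl is-active --quiet cloudflared`.
pub fn cloudflare_ok() -> bool {
    command_succeeds("systemctl", &["is-active", "--quiet", "cloudflared"])
}
```

Keeping the generic helper separate from `cloudflare_ok` makes it possible to exercise the exit-code mapping on machines where `cloudflared` is not installed.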

### services_ok

The agent checks if `VAULTMESH_ROOT` exists and is a directory:
```rust
std::path::Path::new(vaultmesh_root).is_dir()
```

Future versions may add additional checks:
- Disk space
- Key services running
- ProofChain integrity

## Error Handling

### Network Errors

If the agent cannot reach the Command Center, it will:
- Log the error at WARN or ERROR level
- Sleep for the heartbeat interval
- Retry on the next cycle

V1 has no exponential backoff. The agent retries every `VAULTMESH_HEARTBEAT_SECS` seconds indefinitely.

### Invalid Response

If the Command Center returns a non-2xx status, the agent will:
- Log a warning with the status code
- Continue normal operation
- Retry on the next cycle
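
Both failure paths (network errors and non-2xx responses) lead to the same fixed-interval retry, which can be sketched as follows. `Outcome`, `classify`, and `run_cycles` are hypothetical names; the HTTP call and the sleep are passed in as closures so the sketch stays std-only, and the loop is bounded for illustration whereas the agent described here retries indefinitely.

```rust
use std::time::Duration;

/// Hypothetical result of a single heartbeat attempt.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Delivered,
    /// The Command Center answered with a non-2xx status.
    Rejected(u16),
    /// The Command Center could not be reached.
    Unreachable(String),
}

/// Map a send result (HTTP status or transport error) to an outcome.
pub fn classify(result: Result<u16, String>) -> Outcome {
    match result {
        Ok(status) if (200..300).contains(&status) => Outcome::Delivered,
        Ok(status) => Outcome::Rejected(status),
        Err(err) => Outcome::Unreachable(err),
    }
}

/// Run `cycles` heartbeat attempts with a fixed sleep between them.
/// There is no backoff: every cycle waits the same `interval`.
pub fn run_cycles<S, Z>(mut send: S, mut sleep: Z, interval: Duration, cycles: usize) -> Vec<Outcome>
where
    S: FnMut() -> Result<u16, String>,
    Z: FnMut(Duration),
{
    let mut outcomes = Vec::with_capacity(cycles);
    for _ in 0..cycles {
        let outcome = classify(send());
        match &outcome {
            Outcome::Delivered => {}
            Outcome::Rejected(status) => eprintln!("WARN heartbeat rejected: HTTP {}", status),
            Outcome::Unreachable(err) => eprintln!("ERROR heartbeat failed: {}", err),
        }
        outcomes.push(outcome);
        // Failure or not, wait one interval and try again on the next cycle.
        sleep(interval);
    }
    outcomes
}
```

In a real agent, `send` would wrap the HTTP POST and `sleep` would be `std::thread::sleep` or an async timer.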

## Security Considerations

### Transport Security

- The agent should connect to the Command Center via HTTPS (Cloudflare Tunnel)
- `reqwest` is configured with `rustls-tls` (no OpenSSL dependency)

### Authentication (Future)

V1 has no agent authentication. Future versions may add:
- Signed JWTs
- Shared secrets
- mTLS

### Data Sensitivity

Heartbeat data is low-sensitivity:
- No secrets or credentials
- No PII
- No file contents

The `vaultmesh_root` path and hostname are the most identifying fields.