chore: initial import

This commit is contained in:
Sovereign
2025-12-18 00:29:15 +01:00
commit 789397eb33
22 changed files with 5944 additions and 0 deletions

260
CHANGELOG.md Normal file
View File

@@ -0,0 +1,260 @@
# Changelog
## V0.7.2 Communication Layer
### Unified Event API
- Added `EventEnvelope` as the canonical message format for all comms events.
- New event kinds: `note`, `incident`, `ack`, `tag`, `resolve`.
- Events support optional `node_id` for node-specific or global scope.
- Author field tracks origin: "operator", "system", "vm-copilot", etc.
### New API Endpoints
- `POST /api/events` - Create a new event envelope.
- Server assigns `id` and `ts`.
- Author can be overridden via `X-VM-Author` header.
- `GET /api/events` - Query events with filtering:
- `?since=RFC3339` - Filter by timestamp.
- `?kind=note,incident` - Comma-separated kind filter.
- `?node_id=uuid` - Filter by node.
- `?limit=N` - Max results (default: 100).
### SSE Integration
- Envelope events broadcast via SSE using their `kind` as the event name.
- E.g., `event: note` for note envelopes.
### Persistence
- Events logged to `events.jsonl` in `$VAULTMESH_LOG_DIR`.
- Replayed on startup to restore in-memory state.
- Memory-bounded to 500 most recent envelopes.
### Usage
```bash
# Post a note
curl -X POST http://127.0.0.1:8088/api/events \
-H "Content-Type: application/json" \
-d '{"kind":"note","node_id":"uuid","author":"operator","body":{"text":"Test","severity":"info"}}'
# Query events
curl "http://127.0.0.1:8088/api/events?since=2025-01-01T00:00:00Z&kind=note"
```
> V0.7.2 transforms CC from a control plane into a coordination plane.
---
## V0.7.1 Mission Console
### NASA-Style Dashboard
- Added `GET /console` endpoint with 3-panel Mission Console UI.
- Global Mission Bar with fleet KPIs: Total, Healthy, Attention, Critical, Last Scan.
- Left panel: Clickable node list with live status pills (OK/ATTN/CRIT).
- Center panel: Selected node telemetry with metrics cards and per-node event timeline.
- Right panel: Attention summary chips, global scan findings, and global event feed.
### Live SSE Integration
- Real-time heartbeat indicator glow when nodes report in.
- Live status pill updates via `attention` events.
- Dynamic KPI recalculation without page refresh.
- Per-node and global event timelines populated from SSE stream.
### Visual Design
- Dark NASA-inspired theme (#05070a background).
- Monospace typography (JetBrains Mono).
- CSS Grid 3-column layout with sticky headers.
- Animated pulse for critical nodes.
- Color-coded severity indicators (green/yellow/red).
### Usage
```bash
# Open Mission Console
open http://127.0.0.1:8088/console
# Original table dashboard still at /
open http://127.0.0.1:8088/
```
> V0.7.1 transforms the basic table into an operator-grade control room.
---
## V0.7 SSE Event Bus
### Real-Time Event Streaming
- Added `GET /events` endpoint for Server-Sent Events (SSE).
- Events are streamed in real-time as they occur (no page refresh needed).
- Broadcast channel (`tokio::sync::broadcast`) distributes events to all connected clients.
### Event Types
Four named SSE events are published:
- `heartbeat` when a node sends a heartbeat.
- `scan` when a sovereign-scan completes.
- `command` when any command result is reported.
- `attention` when a node's attention status changes.
### Wire Format
Each event is JSON-encoded with wire-efficient payloads:
```
event: heartbeat
data: {"ts":"2024-01-15T10:30:00Z","node_id":"...","hostname":"vault-01","cloudflare_ok":true,...}
event: attention
data: {"ts":"2024-01-15T10:30:00Z","node_id":"...","needs_attention":true,"reasons":["critical_findings"]}
```
### Dashboard Integration
- Minimal JS probe added to dashboard and node detail pages.
- Events are logged to browser console (`[SSE][HB]`, `[SSE][SCAN]`, etc.) for debugging.
- Foundation for V0.7.1 Mission Console with live row updates.
### Keepalive
- SSE connection sends keepalive every 15 seconds to prevent timeouts.
- Clients that lag behind receive a warning and skip missed events.
### Testing
```bash
# Connect to SSE stream
curl -N http://127.0.0.1:8088/events
# Or open dashboard in browser, open DevTools Console
# Trigger heartbeat → see [SSE][HB] in console
```
> V0.7 provides the real-time infrastructure for live dashboards without page refresh.
---
## V0.6.1 Log Tools
### CLI Log Commands
- Added `logs view` subcommand to query event logs with filters:
- `--kind heartbeats|scans|commands` filter by event type
- `--node NODE_ID` filter by node UUID
- `--since 1h|24h|7d|30m` time-based filtering
- `--min-severity info|low|medium|high|critical` severity filter (scans only)
- `-n|--limit N` number of events to show (default: 20)
- Added `logs tail` subcommand for real-time log following
- Added `logs stats` subcommand for per-node and per-kind statistics
- Server mode unchanged: `vm-cc` or `vm-cc serve` starts HTTP server
### Files Added
- `command-center/src/cli.rs` CLI argument parsing with clap
- `command-center/src/logs.rs` Log reading and query logic
### Usage Examples
```bash
vm-cc logs view # last 20 events
vm-cc logs view --kind scans # last 20 scans
vm-cc logs view --since 1h --kind heartbeats
vm-cc logs view --kind scans --min-severity high
vm-cc logs stats # event counts by node
vm-cc logs tail # follow all logs
```
---
## V0.6 Append-Only Persistence
### Event Logging
- All CC state changes are now persisted to append-only JSONL log files.
- Log files are stored in `$VAULTMESH_LOG_DIR` (default: `/var/lib/vaultmesh/cc-logs`).
- Three event types:
- `heartbeats.jsonl` HeartbeatEvent for each agent heartbeat.
- `scans.jsonl` ScanEvent when sovereign-scan completes successfully.
- `commands.jsonl` CommandEvent when agent reports any command result.
### Replay on Startup
- CC replays all log files on startup to reconstruct in-memory state.
- Nodes, heartbeat history, scan results, and command history survive restarts.
- Log replay counts are printed at startup for visibility.
### Ledger-Ready Format
- JSONL format is ledger-native; each line is a self-contained JSON object.
- Designed as foundation for V0.8 Ledger Bridge (Merkle tree over logs).
- No external database dependencies (SQLite not required).
### Configuration
| Variable | Default | Description |
|----------------------|------------------------------|----------------------------|
| `VAULTMESH_LOG_DIR` | `/var/lib/vaultmesh/cc-logs` | Directory for event logs |
> With V0.6, CC state is durable across restarts. The append-only design provides an audit trail of all fleet events.
---
## V0.5 Fleet Orchestrator
### Scan Orchestrator
- Background scheduler runs every `VAULTMESH_SCHEDULER_TICK_SECONDS` (default: `300s`).
- Automatically queues `sovereign-scan` commands for nodes whose last scan is older than
`VAULTMESH_SCAN_INTERVAL_HOURS` (default: `24h`).
- Commands are signed with the CC Ed25519 key and added to the per-node command queue for agent pickup.
### Staleness / Drift Detection
- New **Attention** column on the main dashboard for at-a-glance fleet health.
- Attention reasons:
- `never_scanned` node has no scan history.
- `scan_stale` last scan older than `VAULTMESH_SCAN_STALE_HOURS`.
- `heartbeat_stale` no heartbeat within `VAULTMESH_HEARTBEAT_STALE_MINUTES`.
- `critical_findings` last scan reported critical vulnerabilities.
- `high_findings` last scan reported high-severity vulnerabilities.
- `cloudflare_down` Cloudflare service flag is `false`.
- `services_down` services flag is `false`.
- Visual cues:
- `.attn-ok` green background, label `OK` for healthy nodes.
- `.attn-bad` red background, comma-separated reasons for nodes needing attention.
### Command Policy Enforcement
- Introduced `CommandPolicy` with:
- `global_allowed` commands.
- Optional `per_profile` allowlists keyed by OS profile.
- Default allowed commands:
- `service-status`
- `tail-journal`
- `sovereign-scan`
- `restart-service`
- Web UI returns HTTP `403 Forbidden` for disallowed commands.
- Scheduler respects policy and will not auto-queue scans if they are disallowed for the node's profile.
### Configuration (Environment Variables)
| Variable | Default | Description |
|------------------------------------|---------|-------------------------------------|
| `VAULTMESH_SCAN_INTERVAL_HOURS` | `24` | How often each node should be scanned |
| `VAULTMESH_SCAN_STALE_HOURS` | `48` | Scan staleness threshold |
| `VAULTMESH_HEARTBEAT_STALE_MINUTES`| `10` | Heartbeat staleness threshold |
| `VAULTMESH_SCHEDULER_TICK_SECONDS` | `300` | Scheduler loop interval |
> With V0.5, the fleet tends itself; operators only need to investigate nodes marked red in the Attention column.
---
## V0.4.1 Scan Status UI
- Dashboard columns: Last Scan, Crit/High, Source (REAL/MOCK badges).
- Parse `sovereign-scan` stdout to track `LastScan` in state.
- Visual indicators for critical (red) and high (orange) findings.
## V0.4 Sovereign Scan Integration
- Node agent executes `sovereign-scan` command.
- Scan results reported via command-result API.
- ProofChain receipts written to `/var/lib/vaultmesh/proofchain/`.
## V0.3 Signed Commands
- Ed25519 command signing (CC signs, agent verifies).
- Command queue per node with nonce replay protection.
- Commands: `service-status`, `restart-service`, `tail-journal`.
## V0.2 Node Metrics
- Added system metrics to heartbeat (load, memory, disk).
- Node detail page with metrics cards.
- Heartbeat history tracking.
## V0.1 Initial Release
- Command Center with HTML dashboard.
- Node Agent with heartbeat loop.
- Basic health monitoring (`cloudflare_ok`, `services_ok`).