# Changelog

## V0.7.3 – Envelope Canonicalization

- Added `format` and `schema` to `EventEnvelope` for stable codec/semantics pinning.
- Renamed `body` → `payload` in the comms event API (server still accepts `body` as an alias).
- Canonicalization enforced before persistence/broadcast:
  - Timestamps truncated to RFC3339 UTC `Z` with seconds precision
  - `payload` object keys recursively sorted (arrays preserve order)
- `events.jsonl` now stores one canonical `EventEnvelope` per line.

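The canonicalization rules above can be illustrated with a minimal Python sketch. It is not the server's implementation: the function names are hypothetical, and the compact one-line JSON output and the untouched top-level field order are assumptions.

```python
import json
from datetime import datetime, timezone


def canonical_ts(ts: str) -> str:
    """Render an RFC3339 timestamp in UTC with a `Z` suffix, truncated to whole seconds."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so rewrite it as an explicit offset first.
    parsed = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    utc = parsed.astimezone(timezone.utc).replace(microsecond=0)
    return utc.strftime("%Y-%m-%dT%H:%M:%SZ")


def sort_keys(value):
    """Recursively sort object keys; arrays keep their element order."""
    if isinstance(value, dict):
        return {key: sort_keys(value[key]) for key in sorted(value)}
    if isinstance(value, list):
        return [sort_keys(item) for item in value]
    return value


def canonicalize(envelope: dict) -> str:
    """Return one canonical JSON line for an envelope (assumed layout)."""
    out = dict(envelope)
    out["ts"] = canonical_ts(out["ts"])
    out["payload"] = sort_keys(out.get("payload", {}))
    # Assumption: compact separators, one envelope per line.
    return json.dumps(out, separators=(",", ":"))
```

Sorting keys and fixing the timestamp precision means two envelopes with the same content serialize to the same bytes, which is what makes per-line comparison or hashing of `events.jsonl` meaningful.
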
## V0.7.2 – Communication Layer

### Unified Event API

- Added `EventEnvelope` as the canonical message format for all comms events.
- New event kinds: `note`, `incident`, `ack`, `tag`, `resolve`.
- Events support optional `node_id` for node-specific or global scope.
- Author field tracks origin: "operator", "system", "vm-copilot", etc.

### New API Endpoints

- `POST /api/events` - Create a new event envelope.
  - Server assigns `id` and `ts`.
  - Author can be overridden via `X-VM-Author` header.
- `GET /api/events` - Query events with filtering:
  - `?since=RFC3339` - Filter by timestamp.
  - `?kind=note,incident` - Comma-separated kind filter.
  - `?node_id=uuid` - Filter by node.
  - `?limit=N` - Max results (default: 100).

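As a client-side illustration of the `GET /api/events` filters, the sketch below assembles a query URL in Python. The helper name `events_query_url` is hypothetical; only the parameter names come from the list above.

```python
from urllib.parse import urlencode


def events_query_url(base, since=None, kinds=None, node_id=None, limit=None):
    """Build a GET /api/events URL from the optional filters."""
    params = {}
    if since is not None:
        params["since"] = since            # RFC3339 timestamp
    if kinds:
        params["kind"] = ",".join(kinds)   # comma-separated kind filter
    if node_id is not None:
        params["node_id"] = node_id
    if limit is not None:
        params["limit"] = str(limit)
    url = f"{base}/api/events"
    if params:
        # Keep ":" and "," literal so the URL matches the documented form.
        url += "?" + urlencode(params, safe=":,")
    return url
```

Omitting `limit` leaves the server default of 100 in effect.
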
### SSE Integration

- Envelope events broadcast via SSE using their `kind` as the event name.
  - E.g., `event: note` for note envelopes.

### Persistence

- Events logged to `events.jsonl` in `$VAULTMESH_LOG_DIR`.
- Replayed on startup to restore in-memory state.
- Memory-bounded to 500 most recent envelopes.

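The replay-with-a-bound behaviour can be sketched in Python with a fixed-size deque. This is an illustration only; the function name is hypothetical and the real server may bound memory differently.

```python
import json
from collections import deque

MAX_ENVELOPES = 500  # bound stated in the changelog


def replay(lines, max_envelopes=MAX_ENVELOPES):
    """Rebuild the in-memory window from JSONL lines, keeping only the newest entries."""
    window = deque(maxlen=max_envelopes)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        # Appending to a full deque silently drops the oldest envelope.
        window.append(json.loads(line))
    return window
```
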
### Usage

```bash
# Post a note
curl -X POST http://127.0.0.1:8088/api/events \
  -H "Content-Type: application/json" \
  -d '{"kind":"note","node_id":"uuid","author":"operator","payload":{"text":"Test","severity":"info"}}'

# Query events
curl "http://127.0.0.1:8088/api/events?since=2025-01-01T00:00:00Z&kind=note"
```

> V0.7.2 transforms CC from a control plane into a coordination plane.

---

## V0.7.1 – Mission Console

### NASA-Style Dashboard

- Added `GET /console` endpoint with 3-panel Mission Console UI.
- Global Mission Bar with fleet KPIs: Total, Healthy, Attention, Critical, Last Scan.
- Left panel: Clickable node list with live status pills (OK/ATTN/CRIT).
- Center panel: Selected node telemetry with metrics cards and per-node event timeline.
- Right panel: Attention summary chips, global scan findings, and global event feed.

### Live SSE Integration

- Real-time heartbeat indicator glow when nodes report in.
- Live status pill updates via `attention` events.
- Dynamic KPI recalculation without page refresh.
- Per-node and global event timelines populated from SSE stream.

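The KPI recalculation amounts to recounting status pills whenever one changes. The console presumably does this in browser JavaScript; the Python sketch below only shows the logic. The function names are hypothetical, and the mapping of OK/ATTN/CRIT onto Healthy/Attention/Critical is an assumption.

```python
from collections import Counter


def fleet_kpis(statuses):
    """statuses maps node id -> "OK" | "ATTN" | "CRIT" (assumed pill values)."""
    counts = Counter(statuses.values())
    return {
        "total": len(statuses),
        "healthy": counts["OK"],
        "attention": counts["ATTN"],
        "critical": counts["CRIT"],
    }


def apply_status_change(statuses, node_id, new_status):
    """Update one node's pill, then recompute the mission-bar KPIs."""
    statuses[node_id] = new_status
    return fleet_kpis(statuses)
```
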
### Visual Design

- Dark NASA-inspired theme (#05070a background).
- Monospace typography (JetBrains Mono).
- CSS Grid 3-column layout with sticky headers.
- Animated pulse for critical nodes.
- Color-coded severity indicators (green/yellow/red).

### Usage

```bash
# Open Mission Console
open http://127.0.0.1:8088/console

# Original table dashboard still at /
open http://127.0.0.1:8088/
```

> V0.7.1 transforms the basic table into an operator-grade control room.

---

## V0.7 – SSE Event Bus

### Real-Time Event Streaming

- Added `GET /events` endpoint for Server-Sent Events (SSE).
- Events are streamed in real-time as they occur (no page refresh needed).
- Broadcast channel (`tokio::sync::broadcast`) distributes events to all connected clients.

### Event Types

Four named SSE events are published:

- `heartbeat` – when a node sends a heartbeat.
- `scan` – when a sovereign-scan completes.
- `command` – when any command result is reported.
- `attention` – when a node's attention status changes.

### Wire Format

Each event carries a compact JSON object in its `data` field:

```
event: heartbeat
data: {"ts":"2024-01-15T10:30:00Z","node_id":"...","hostname":"vault-01","cloudflare_ok":true,...}

event: attention
data: {"ts":"2024-01-15T10:30:00Z","node_id":"...","needs_attention":true,"reasons":["critical_findings"]}
```

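To show how a client might consume this wire format, here is a minimal Python parser for SSE text. It is a sketch rather than a full implementation of the SSE specification (no `id:` or `retry:` handling), the function names are hypothetical, and treating `:`-prefixed lines as keepalive comments is an assumption about how the server sends keepalives.

```python
import json


def _field_value(line, prefix):
    """Return the text after `prefix`, dropping the single optional leading space."""
    value = line[len(prefix):]
    return value[1:] if value.startswith(" ") else value


def parse_sse(stream_text):
    """Return a list of (event_name, data_dict) pairs; frames end at a blank line."""
    events = []
    name, data_lines = "message", []
    # The extra "" flushes a final frame that lacks a trailing blank line.
    for line in stream_text.splitlines() + [""]:
        if line == "":
            if data_lines:
                events.append((name, json.loads("\n".join(data_lines))))
            name, data_lines = "message", []
        elif line.startswith(":"):
            continue  # comment line, commonly used for keepalives
        elif line.startswith("event:"):
            name = _field_value(line, "event:")
        elif line.startswith("data:"):
            data_lines.append(_field_value(line, "data:"))
    return events
```
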
### Dashboard Integration

- Minimal JS probe added to dashboard and node detail pages.
- Events are logged to browser console (`[SSE][HB]`, `[SSE][SCAN]`, etc.) for debugging.
- Foundation for V0.7.1 Mission Console with live row updates.

### Keepalive

- SSE connection sends keepalive every 15 seconds to prevent timeouts.
- Clients that lag behind receive a warning and skip missed events.

### Testing

```bash
# Connect to SSE stream
curl -N http://127.0.0.1:8088/events

# Or open dashboard in browser, open DevTools Console
# Trigger heartbeat → see [SSE][HB] in console
```

> V0.7 provides the real-time infrastructure for live dashboards without page refresh.

---

## V0.6.1 – Log Tools

### CLI Log Commands

- Added `logs view` subcommand to query event logs with filters:
  - `--kind heartbeats|scans|commands` – filter by event type
  - `--node NODE_ID` – filter by node UUID
  - `--since 1h|24h|7d|30m` – time-based filtering
  - `--min-severity info|low|medium|high|critical` – severity filter (scans only)
  - `-n|--limit N` – number of events to show (default: 20)
- Added `logs tail` subcommand for real-time log following
- Added `logs stats` subcommand for per-node and per-kind statistics
- Server mode unchanged: `vm-cc` or `vm-cc serve` starts HTTP server

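The `--since` values shown above follow a number-plus-unit pattern. A Python sketch of parsing them into a time window might look like this; the function name is hypothetical, and the accepted units (`m`, `h`, `d`) are inferred from the examples only, so the actual CLI may accept more.

```python
import re
from datetime import timedelta

# Units inferred from the documented examples: 30m, 1h, 24h, 7d.
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_since(spec):
    """Turn a string such as "24h" into a timedelta; reject anything else."""
    match = re.fullmatch(r"(\d+)([mhd])", spec)
    if match is None:
        raise ValueError(f"unsupported duration: {spec!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})
```

A log reader would then keep only events whose timestamp is later than `now - parse_since(spec)`.
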
### Files Added

- `command-center/src/cli.rs` – CLI argument parsing with clap
- `command-center/src/logs.rs` – Log reading and query logic

### Usage Examples

```bash
vm-cc logs view                                   # last 20 events
vm-cc logs view --kind scans                      # last 20 scans
vm-cc logs view --since 1h --kind heartbeats
vm-cc logs view --kind scans --min-severity high
vm-cc logs stats                                  # event counts by node
vm-cc logs tail                                   # follow all logs
```

---

## V0.6 – Append-Only Persistence

### Event Logging

- All CC state changes are now persisted to append-only JSONL log files.
- Log files are stored in `$VAULTMESH_LOG_DIR` (default: `/var/lib/vaultmesh/cc-logs`).
- Three log files, one per event type:
  - `heartbeats.jsonl` – HeartbeatEvent for each agent heartbeat.
  - `scans.jsonl` – ScanEvent when sovereign-scan completes successfully.
  - `commands.jsonl` – CommandEvent when agent reports any command result.

### Replay on Startup

- CC replays all log files on startup to reconstruct in-memory state.
- Nodes, heartbeat history, scan results, and command history survive restarts.
- Log replay counts are printed at startup for visibility.

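The append-then-replay cycle can be sketched in a few lines of Python. The file names come from the list above; the function names and the shape of the returned state are hypothetical, and the real replay logic lives in the Rust server.

```python
import json
import os


def append_event(path, event):
    """Append one event as a single JSON line; existing lines are never rewritten."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")


def replay_log(path):
    """Read every event back in write order; a missing file replays as empty."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def replay_all(log_dir):
    """Replay the three logs and report how many events each one contributed."""
    state, counts = {}, {}
    for name in ("heartbeats", "scans", "commands"):
        events = replay_log(os.path.join(log_dir, f"{name}.jsonl"))
        state[name] = events
        counts[name] = len(events)
    return state, counts
```
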
### Ledger-Ready Format

- JSONL format is ledger-native; each line is a self-contained JSON object.
- Designed as a foundation for the V0.8 Ledger Bridge (Merkle tree over logs).
- No external database dependencies (SQLite not required).

### Configuration

| Variable             | Default                      | Description                |
|----------------------|------------------------------|----------------------------|
| `VAULTMESH_LOG_DIR`  | `/var/lib/vaultmesh/cc-logs` | Directory for event logs   |

> With V0.6, CC state is durable across restarts. The append-only design provides an audit trail of all fleet events.

---

## V0.5 – Fleet Orchestrator

### Scan Orchestrator

- Background scheduler runs every `VAULTMESH_SCHEDULER_TICK_SECONDS` seconds (default: `300`).
- Automatically queues `sovereign-scan` commands for nodes whose last scan is older than `VAULTMESH_SCAN_INTERVAL_HOURS` hours (default: `24`).
- Commands are signed with the CC Ed25519 key and added to the per-node command queue for agent pickup.

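The selection step that runs on each scheduler tick can be sketched as follows. This is a Python illustration with a hypothetical function name; treating never-scanned nodes as due is an assumption, and signing and queueing are left out.

```python
from datetime import datetime, timedelta, timezone

SCAN_INTERVAL_HOURS = 24  # default of VAULTMESH_SCAN_INTERVAL_HOURS


def nodes_due_for_scan(last_scans, now, interval_hours=SCAN_INTERVAL_HOURS):
    """last_scans maps node id -> datetime of the last scan, or None if never scanned."""
    cutoff = now - timedelta(hours=interval_hours)
    return sorted(
        node_id
        for node_id, last in last_scans.items()
        # Assumption: a node with no scan history is always due.
        if last is None or last < cutoff
    )
```
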
### Staleness / Drift Detection

- New **Attention** column on the main dashboard for at-a-glance fleet health.
- Attention reasons:
  - `never_scanned` – node has no scan history.
  - `scan_stale` – last scan older than `VAULTMESH_SCAN_STALE_HOURS`.
  - `heartbeat_stale` – no heartbeat within `VAULTMESH_HEARTBEAT_STALE_MINUTES`.
  - `critical_findings` – last scan reported critical vulnerabilities.
  - `high_findings` – last scan reported high-severity vulnerabilities.
  - `cloudflare_down` – Cloudflare service flag is `false`.
  - `services_down` – services flag is `false`.
- Visual cues:
  - `.attn-ok` – green background, label `OK` for healthy nodes.
  - `.attn-bad` – red background, comma-separated reasons for nodes needing attention.

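The attention reasons listed above can be read as a series of independent checks. The Python sketch below shows one way to evaluate them; the node dictionary layout (`last_scan`, `last_heartbeat`, `critical`, `high`) and the function name are hypothetical, while the reason strings and default thresholds come from this changelog.

```python
from datetime import datetime, timedelta, timezone


def attention_reasons(node, now, scan_stale_hours=48, heartbeat_stale_minutes=10):
    """Return the list of attention reasons for one node; empty means healthy."""
    reasons = []
    last_scan = node.get("last_scan")
    if last_scan is None:
        reasons.append("never_scanned")
    else:
        if now - last_scan["ts"] > timedelta(hours=scan_stale_hours):
            reasons.append("scan_stale")
        if last_scan.get("critical", 0) > 0:
            reasons.append("critical_findings")
        if last_scan.get("high", 0) > 0:
            reasons.append("high_findings")
    if now - node["last_heartbeat"] > timedelta(minutes=heartbeat_stale_minutes):
        reasons.append("heartbeat_stale")
    if not node.get("cloudflare_ok", True):
        reasons.append("cloudflare_down")
    if not node.get("services_ok", True):
        reasons.append("services_down")
    return reasons
```

An empty list would correspond to the `.attn-ok` cell; anything else would be joined with commas for the `.attn-bad` cell.
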
### Command Policy Enforcement

- Introduced `CommandPolicy` with:
  - `global_allowed` commands.
  - Optional `per_profile` allowlists keyed by OS profile.
- Default allowed commands:
  - `service-status`
  - `tail-journal`
  - `sovereign-scan`
  - `restart-service`
- Web UI returns HTTP `403 Forbidden` for disallowed commands.
- Scheduler respects policy and will not auto-queue scans if they are disallowed for the node's profile.

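A policy check along these lines could be sketched as below. How `global_allowed` and `per_profile` combine is not stated in this changelog; the sketch assumes a per-profile allowlist, when present, replaces the global one. The function name and the profile names in the usage note are hypothetical.

```python
DEFAULT_ALLOWED = frozenset(
    {"service-status", "tail-journal", "sovereign-scan", "restart-service"}
)


def is_allowed(command, profile=None, global_allowed=DEFAULT_ALLOWED, per_profile=None):
    """Return True if `command` may be queued for a node with the given OS profile."""
    per_profile = per_profile or {}
    # Assumption: a profile-specific allowlist overrides the global list.
    if profile in per_profile:
        return command in per_profile[profile]
    return command in global_allowed
```

With `per_profile={"hardened": {"service-status"}}`, a `sovereign-scan` request for a `hardened` node is rejected, which is the case where the Web UI would answer `403 Forbidden` and the scheduler would skip the node.
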
### Configuration (Environment Variables)

| Variable                            | Default | Description                           |
|-------------------------------------|---------|---------------------------------------|
| `VAULTMESH_SCAN_INTERVAL_HOURS`     | `24`    | How often each node should be scanned |
| `VAULTMESH_SCAN_STALE_HOURS`        | `48`    | Scan staleness threshold              |
| `VAULTMESH_HEARTBEAT_STALE_MINUTES` | `10`    | Heartbeat staleness threshold         |
| `VAULTMESH_SCHEDULER_TICK_SECONDS`  | `300`   | Scheduler loop interval               |

> With V0.5, the fleet tends itself; operators only need to investigate nodes marked red in the Attention column.

---

## V0.4.1 – Scan Status UI

- Dashboard columns: Last Scan, Crit/High, Source (REAL/MOCK badges).
- Parse `sovereign-scan` stdout to track `LastScan` in state.
- Visual indicators for critical (red) and high (orange) findings.

## V0.4 – Sovereign Scan Integration

- Node agent executes `sovereign-scan` command.
- Scan results reported via command-result API.
- ProofChain receipts written to `/var/lib/vaultmesh/proofchain/`.

## V0.3 – Signed Commands

- Ed25519 command signing (CC signs, agent verifies).
- Command queue per node with nonce replay protection.
- Commands: `service-status`, `restart-service`, `tail-journal`.

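Nonce replay protection can be illustrated with a small Python sketch. It covers only the nonce bookkeeping: Ed25519 verification is omitted because Python's standard library does not provide it, and the class name and in-memory set are hypothetical rather than the agent's actual design.

```python
class NonceGuard:
    """Remembers nonces already seen so a captured command cannot be replayed."""

    def __init__(self):
        self._seen = set()

    def accept(self, nonce):
        """Return True the first time a nonce is offered, False on any repeat."""
        if nonce in self._seen:
            return False
        self._seen.add(nonce)
        return True
```
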
## V0.2 – Node Metrics

- Added system metrics to heartbeat (load, memory, disk).
- Node detail page with metrics cards.
- Heartbeat history tracking.

## V0.1 – Initial Release

- Command Center with HTML dashboard.
- Node Agent with heartbeat loop.
- Basic health monitoring (`cloudflare_ok`, `services_ok`).