
Mesh Observatory

Prometheus + Grafana monitoring stack for Cloudflare infrastructure state.

Components

Component         Port  Description
Prometheus        9090  Metrics collection and storage
Grafana           3000  Visualization dashboards
Metrics Exporter  9100  Custom Cloudflare metrics

Quick Start

1. Configure Environment

cp .env.example .env
# Edit .env with your credentials

Required environment variables:

CLOUDFLARE_API_TOKEN=<your-token>
CLOUDFLARE_ZONE_ID=<your-zone-id>
CLOUDFLARE_ACCOUNT_ID=<your-account-id>
GRAFANA_PASSWORD=<secure-password>
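
Before starting the stack, it can help to confirm that .env defines every variable listed above. The following is a minimal sketch; the helper names parse_env and missing_vars are illustrative and not part of the repository, and the parser assumes simple KEY=VALUE lines without quoting.

```python
# Variable names taken from the list above.
REQUIRED_VARS = (
    "CLOUDFLARE_API_TOKEN",
    "CLOUDFLARE_ZONE_ID",
    "CLOUDFLARE_ACCOUNT_ID",
    "GRAFANA_PASSWORD",
)


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def missing_vars(env):
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

For example, missing_vars(parse_env(open(".env").read())) returns an empty list when the file is complete.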

2. Start Stack

docker-compose up -d
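
Once the containers are running, a quick probe can confirm that the services respond. This sketch assumes the stack is reachable on localhost at the ports from the Components table and uses Prometheus's /-/ready endpoint and Grafana's /api/health endpoint; the helper names are illustrative.

```python
import urllib.request


def health_urls(host="localhost"):
    """Build health-check URLs from the ports in the Components table."""
    return {
        "prometheus": f"http://{host}:9090/-/ready",
        "grafana": f"http://{host}:3000/api/health",
    }


def is_up(url, timeout=3):
    """Return True if the URL answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError and HTTPError are OSError subclasses, so this also
        # covers refused connections, timeouts, and non-2xx responses.
        return False
```

Calling {name: is_up(url) for name, url in health_urls().items()} gives a per-service status map.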

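3. Access Dashboards

Grafana is served on port 3000 and Prometheus on port 9090 (see Components). The Grafana password is the value of GRAFANA_PASSWORD in .env.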

Dashboards

Dashboard                 UID            Description
Cloudflare Mesh Overview  cf-overview    Main command center
DNS Health                cf-dns         DNS records, DNSSEC, types
Tunnel Status             cf-tunnel      Tunnel health, connections
Invariants & Compliance   cf-invariants  Invariant pass/fail, anomalies
Security Settings         cf-security    SSL, TLS, Access apps
ProofChain & Anchors      cf-proofchain  Merkle roots, snapshot freshness

Metrics Reference

DNS Metrics

  • cloudflare_dns_records_total - Total DNS records
  • cloudflare_dns_records_proxied - Proxied records count
  • cloudflare_dns_records_unproxied - DNS-only records count
  • cloudflare_dns_records_by_type{type="A|AAAA|CNAME|..."} - Records by type
  • cloudflare_dnssec_enabled - DNSSEC status (0/1)
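
As an illustration of how the record counts above relate to each other, the sketch below derives them from a list of DNS records and renders them in the Prometheus text format. It assumes each record is a dict with "type" and "proxied" keys; the actual logic in metrics-exporter.py may differ, and dns_metrics is a hypothetical name.

```python
from collections import Counter


def dns_metrics(records):
    """Render DNS record counts as Prometheus exposition lines."""
    proxied = sum(1 for r in records if r.get("proxied"))
    by_type = Counter(r["type"] for r in records)
    lines = [
        f"cloudflare_dns_records_total {len(records)}",
        f"cloudflare_dns_records_proxied {proxied}",
        f"cloudflare_dns_records_unproxied {len(records) - proxied}",
    ]
    # One labelled sample per record type, sorted for stable output.
    for rtype, count in sorted(by_type.items()):
        lines.append(f'cloudflare_dns_records_by_type{{type="{rtype}"}} {count}')
    return lines
```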

Tunnel Metrics

  • cloudflare_tunnels_total - Total active tunnels
  • cloudflare_tunnels_healthy - Tunnels with active connections
  • cloudflare_tunnels_unhealthy - Tunnels without connections
  • cloudflare_tunnel_connections_total - Total tunnel connections
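
The healthy/unhealthy split above can be sketched as follows, assuming each tunnel is a dict whose "connections" key holds a list of active connections. That input shape and the function name are assumptions for illustration, not the exporter's confirmed implementation.

```python
def tunnel_metrics(tunnels):
    """Classify tunnels as healthy when they have at least one connection."""
    counts = [len(t.get("connections") or []) for t in tunnels]
    healthy = sum(1 for c in counts if c > 0)
    return {
        "cloudflare_tunnels_total": len(tunnels),
        "cloudflare_tunnels_healthy": healthy,
        "cloudflare_tunnels_unhealthy": len(tunnels) - healthy,
        "cloudflare_tunnel_connections_total": sum(counts),
    }
```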

Zone Settings

  • cloudflare_zone_ssl_strict - SSL mode is strict (0/1)
  • cloudflare_zone_tls_version_secure - TLS 1.2+ enforced (0/1)
  • cloudflare_zone_always_https - HTTPS redirect enabled (0/1)
  • cloudflare_zone_browser_check - Browser integrity check (0/1)
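
Each of these gauges reduces a zone setting to 0 or 1. A minimal sketch of that mapping, assuming the settings have been flattened into a dict keyed by Cloudflare setting ID (ssl, min_tls_version, always_use_https, browser_check); the flattening step and the function name are hypothetical.

```python
def zone_metrics(settings):
    """Map zone setting values to 0/1 gauges."""
    return {
        "cloudflare_zone_ssl_strict": int(settings.get("ssl") == "strict"),
        # Treat a minimum TLS version of 1.2 or 1.3 as secure.
        "cloudflare_zone_tls_version_secure": int(
            settings.get("min_tls_version") in ("1.2", "1.3")
        ),
        "cloudflare_zone_always_https": int(settings.get("always_use_https") == "on"),
        "cloudflare_zone_browser_check": int(settings.get("browser_check") == "on"),
    }
```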

Access Metrics

  • cloudflare_access_apps_total - Total Access applications
  • cloudflare_access_apps_by_type{type="..."} - Apps by type

Invariant Metrics

  • cloudflare_invariants_total - Total invariant checks
  • cloudflare_invariants_passed - Passing invariants
  • cloudflare_invariants_failed - Failing invariants
  • cloudflare_invariants_pass_rate - Pass percentage
  • cloudflare_invariant_report_age_seconds - Report freshness
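
The pass rate follows directly from the passed and total counts. The sketch below assumes the rate is expressed on a 0-100 scale and that an empty report yields 0.0; both choices, along with the input shape (invariant name mapped to a boolean), are assumptions rather than confirmed exporter behavior.

```python
def invariant_metrics(results):
    """Summarize invariant results given a {name: passed} mapping."""
    total = len(results)
    passed = sum(1 for ok in results.values() if ok)
    # Guard against division by zero when no checks were reported.
    rate = 100.0 * passed / total if total else 0.0
    return {
        "cloudflare_invariants_total": total,
        "cloudflare_invariants_passed": passed,
        "cloudflare_invariants_failed": total - passed,
        "cloudflare_invariants_pass_rate": rate,
    }
```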

Snapshot Metrics

  • cloudflare_snapshot_age_seconds - Seconds since last snapshot
  • cloudflare_snapshot_merkle_root_set - Merkle root present (0/1)
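
One way to derive these two values is from the snapshot file itself. This sketch assumes the age is measured from the file's modification time and that the snapshot JSON stores its Merkle root under a "merkle_root" key; that key name is hypothetical, as is the function name.

```python
import json
import os
import time


def snapshot_metrics(path, now=None):
    """Compute snapshot age and Merkle root presence for one JSON file."""
    now = time.time() if now is None else now
    with open(path, encoding="utf-8") as fh:
        snapshot = json.load(fh)
    return {
        # Clamp at zero in case of clock skew between writer and reader.
        "cloudflare_snapshot_age_seconds": max(0.0, now - os.path.getmtime(path)),
        "cloudflare_snapshot_merkle_root_set": 1 if snapshot.get("merkle_root") else 0,
    }
```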

Anomaly Metrics

  • cloudflare_anomalies_total - Total anomaly receipts
  • cloudflare_anomalies_last_24h - Recent anomalies
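
A rough sketch of how the two anomaly counts could be computed, assuming each anomaly receipt is a separate file and that its modification time marks when the anomaly was recorded. The real exporter may read a timestamp from inside the receipt instead; anomaly_metrics is an illustrative name.

```python
import os
import time

DAY_SECONDS = 24 * 3600


def anomaly_metrics(directory, now=None):
    """Count receipt files in total and those modified in the last 24 hours."""
    now = time.time() if now is None else now
    with os.scandir(directory) as entries:
        mtimes = [e.stat().st_mtime for e in entries if e.is_file()]
    recent = sum(1 for m in mtimes if now - m <= DAY_SECONDS)
    return {
        "cloudflare_anomalies_total": len(mtimes),
        "cloudflare_anomalies_last_24h": recent,
    }
```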

Drift Visualizer

Standalone tool that compares a state snapshot against the DNS manifest and reports the differences.

Usage

python3 drift-visualizer.py \
    --snapshot ../snapshots/cloudflare-latest.json \
    --manifest ../cloudflare_dns_manifest.md \
    --output-dir ../reports

Output

  • drift-report-<timestamp>.json - Machine-readable diff
  • drift-report-<timestamp>.html - Visual HTML report
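
The core of a drift comparison is a set difference between what is live and what is declared. The sketch below is not the logic of drift-visualizer.py, which is not shown here; it assumes both sources have already been parsed into (name, type, content) tuples, and the result keys are hypothetical.

```python
def diff_records(snapshot_records, manifest_records):
    """Compare live records (snapshot) with declared records (manifest)."""
    live = set(snapshot_records)
    declared = set(manifest_records)
    return {
        # Declared in the manifest but absent from the live snapshot.
        "missing_from_live": sorted(declared - live),
        # Present in the live snapshot but not declared in the manifest.
        "undeclared_in_live": sorted(live - declared),
        "in_sync": sorted(live & declared),
    }
```

Serializing the result with json.dumps would give a machine-readable diff comparable in spirit to the JSON report.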

Directory Structure

observatory/
├── docker-compose.yml      # Stack definition
├── Dockerfile.exporter     # Metrics exporter container
├── prometheus.yml          # Prometheus config
├── metrics-exporter.py     # Custom exporter
├── drift-visualizer.py     # Drift analysis tool
├── datasources/            # Grafana datasource provisioning
│   └── prometheus.yml
├── dashboards/             # Grafana dashboard provisioning
│   ├── dashboards.yml
│   ├── cloudflare-overview.json
│   ├── dns-health.json
│   ├── tunnel-status.json
│   ├── invariants.json
│   ├── security-settings.json
│   └── proofchain.json
└── rules/                  # Prometheus alerting rules (optional)

Integration with CI/CD

The metrics exporter reads from:

  • ../snapshots/ - State snapshots from state-reconciler.py
  • ../anomalies/ - Anomaly receipts from invariant-checker.py

Ensure these directories are populated by the GitLab CI pipeline or systemd services.
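
A small pre-flight check can flag input directories that are missing or empty before the exporter starts. This is a minimal sketch; the function names are illustrative, and the default paths are the two directories listed above.

```python
from pathlib import Path


def is_populated(directory):
    """Return True if the directory exists and contains at least one file."""
    path = Path(directory)
    return path.is_dir() and any(p.is_file() for p in path.iterdir())


def unpopulated(directories=("../snapshots", "../anomalies")):
    """Return the directories that are missing or contain no files."""
    return [d for d in directories if not is_populated(d)]
```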

Alerting (Optional)

Create alerting rules in rules/alerts.yml:

groups:
  - name: cloudflare
    rules:
      - alert: InvariantFailure
        expr: cloudflare_invariants_failed > 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Cloudflare invariant check failing"

      - alert: TunnelUnhealthy
        expr: cloudflare_tunnels_unhealthy > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Cloudflare tunnel has no connections"

      - alert: SnapshotStale
        expr: cloudflare_snapshot_age_seconds > 7200
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Cloudflare state snapshot older than 2 hours"
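
To sanity-check the thresholds outside Prometheus, the three expressions above can be mirrored in a few lines of Python. This sketch only evaluates the conditions against a single metrics sample; it does not model the `for:` durations, and the RULES table and firing function are illustrative.

```python
# Conditions copied from the expr fields of the rules above.
RULES = {
    "InvariantFailure": lambda m: m["cloudflare_invariants_failed"] > 0,
    "TunnelUnhealthy": lambda m: m["cloudflare_tunnels_unhealthy"] > 0,
    "SnapshotStale": lambda m: m["cloudflare_snapshot_age_seconds"] > 7200,
}


def firing(metrics):
    """Return the names of rules whose condition holds for this sample."""
    return [name for name, condition in RULES.items() if condition(metrics)]
```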