- Complete Cloudflare Terraform configuration (DNS, WAF, tunnels, access)
- WAF Intelligence MCP server with threat analysis and ML classification
- GitOps automation with PR workflows and drift detection
- Observatory monitoring stack with Prometheus/Grafana
- IDE operator rules for governed development
- Security playbooks and compliance frameworks
- Autonomous remediation and state reconciliation
Mesh Observatory
Prometheus + Grafana monitoring stack for Cloudflare infrastructure state.
Components
| Component | Port | Description |
|---|---|---|
| Prometheus | 9090 | Metrics collection and storage |
| Grafana | 3000 | Visualization dashboards |
| Metrics Exporter | 9100 | Custom Cloudflare metrics |
Quick Start
1. Configure Environment
cp .env.example .env
# Edit .env with your credentials
Required environment variables:
CLOUDFLARE_API_TOKEN=<your-token>
CLOUDFLARE_ZONE_ID=<your-zone-id>
CLOUDFLARE_ACCOUNT_ID=<your-account-id>
GRAFANA_PASSWORD=<secure-password>
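A startup check along these lines can catch an unset variable before the stack is launched. This is a minimal sketch, not code from this repository: the helper name missing_env_vars is hypothetical, and only the variable names come from the list above.

```python
import os

# Variable names taken from the list above.
REQUIRED_VARS = (
    "CLOUDFLARE_API_TOKEN",
    "CLOUDFLARE_ZONE_ID",
    "CLOUDFLARE_ACCOUNT_ID",
    "GRAFANA_PASSWORD",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or blank.

    `env` defaults to the process environment; pass a dict to test.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling missing_env_vars() with no arguments inspects the current process environment; an empty list means every required variable is present.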
2. Start Stack
docker-compose up -d
3. Access Dashboards
- Grafana: http://localhost:3000 (admin / $GRAFANA_PASSWORD)
- Prometheus: http://localhost:9090
Dashboards
| Dashboard | UID | Description |
|---|---|---|
| Cloudflare Mesh Overview | cf-overview | Main command center |
| DNS Health | cf-dns | DNS records, DNSSEC, types |
| Tunnel Status | cf-tunnel | Tunnel health, connections |
| Invariants & Compliance | cf-invariants | Invariant pass/fail, anomalies |
| Security Settings | cf-security | SSL, TLS, Access apps |
| ProofChain & Anchors | cf-proofchain | Merkle roots, snapshot freshness |
Metrics Reference
DNS Metrics
- cloudflare_dns_records_total - Total DNS records
- cloudflare_dns_records_proxied - Proxied records count
- cloudflare_dns_records_unproxied - DNS-only records count
- cloudflare_dns_records_by_type{type="A|AAAA|CNAME|..."} - Records by type
- cloudflare_dnssec_enabled - DNSSEC status (0/1)
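To illustrate how the record counts above relate to one another, here is a hedged sketch that derives them from a list of DNS records and renders them in the Prometheus text exposition format. It is not the code in metrics-exporter.py; the function names and the assumed record shape (a "type" string and a boolean "proxied" flag per record) are assumptions.

```python
from collections import Counter


def dns_metrics(records):
    """Compute DNS gauge values from a list of record dicts.

    Assumption: each record has a "type" string and a boolean "proxied" flag.
    """
    proxied = sum(1 for r in records if r.get("proxied"))
    by_type = Counter(r.get("type", "UNKNOWN") for r in records)
    return {
        "cloudflare_dns_records_total": len(records),
        "cloudflare_dns_records_proxied": proxied,
        "cloudflare_dns_records_unproxied": len(records) - proxied,
        "cloudflare_dns_records_by_type": dict(by_type),
    }


def render(metrics):
    """Render the values as Prometheus text exposition lines."""
    lines = []
    for name, value in metrics.items():
        if isinstance(value, dict):
            # Labelled series: one line per record type.
            for label, count in sorted(value.items()):
                lines.append(f'{name}{{type="{label}"}} {count}')
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines)
```

Under these assumptions the proxied and unproxied counts always sum to the total, which is a quick sanity check when reading the DNS Health dashboard.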
Tunnel Metrics
- cloudflare_tunnels_total - Total active tunnels
- cloudflare_tunnels_healthy - Tunnels with active connections
- cloudflare_tunnels_unhealthy - Tunnels without connections
- cloudflare_tunnel_connections_total - Total tunnel connections
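The healthy/unhealthy split described above can be sketched as follows. This is an illustration under stated assumptions rather than the exporter's actual logic: it assumes each tunnel is a dict with a "connections" list, and the function name tunnel_metrics is hypothetical.

```python
def tunnel_metrics(tunnels):
    """Classify tunnels by whether they have at least one connection.

    Assumption: each tunnel dict carries a "connections" list
    (missing or None is treated as no connections).
    """
    conn_counts = [len(t.get("connections") or []) for t in tunnels]
    healthy = sum(1 for count in conn_counts if count > 0)
    return {
        "cloudflare_tunnels_total": len(tunnels),
        "cloudflare_tunnels_healthy": healthy,
        "cloudflare_tunnels_unhealthy": len(tunnels) - healthy,
        "cloudflare_tunnel_connections_total": sum(conn_counts),
    }
```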
Zone Settings
- cloudflare_zone_ssl_strict - SSL mode is strict (0/1)
- cloudflare_zone_tls_version_secure - TLS 1.2+ enforced (0/1)
- cloudflare_zone_always_https - HTTPS redirect enabled (0/1)
- cloudflare_zone_browser_check - Browser integrity check (0/1)
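Each zone metric above is a 0/1 gauge derived from a zone setting. The sketch below shows one way that mapping could look. The setting ids and values used here ("ssl", "min_tls_version", "always_use_https", "browser_check") are assumptions about the input, and zone_setting_gauges is a hypothetical name, not the exporter's API.

```python
def zone_setting_gauges(settings):
    """Map zone settings to 0/1 gauges.

    Assumption: `settings` maps a setting id to its value,
    e.g. {"ssl": "strict", "min_tls_version": "1.2"}.
    """

    def version_tuple(value):
        # "1.2" -> (1, 2) so versions compare numerically.
        return tuple(int(part) for part in str(value).split("."))

    min_tls = settings.get("min_tls_version", "1.0")
    return {
        "cloudflare_zone_ssl_strict": int(settings.get("ssl") == "strict"),
        "cloudflare_zone_tls_version_secure": int(version_tuple(min_tls) >= (1, 2)),
        "cloudflare_zone_always_https": int(settings.get("always_use_https") == "on"),
        "cloudflare_zone_browser_check": int(settings.get("browser_check") == "on"),
    }
```

A missing setting is reported as 0 in this sketch, so an incomplete snapshot shows up as a failing gauge rather than a silent pass.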
Access Metrics
- cloudflare_access_apps_total - Total Access applications
- cloudflare_access_apps_by_type{type="..."} - Apps by type
Invariant Metrics
- cloudflare_invariants_total - Total invariant checks
- cloudflare_invariants_passed - Passing invariants
- cloudflare_invariants_failed - Failing invariants
- cloudflare_invariants_pass_rate - Pass percentage
- cloudflare_invariant_report_age_seconds - Report freshness
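The counts and pass rate above are related by simple arithmetic, sketched here. This assumes the pass rate is expressed on a 0-100 scale and is reported as 0 when no checks ran; both are assumptions, as is the function name invariant_metrics.

```python
def invariant_metrics(results):
    """Summarise invariant results.

    Assumption: `results` maps an invariant name to True (passed)
    or False (failed).
    """
    total = len(results)
    passed = sum(1 for ok in results.values() if ok)
    # Assumption: percentage on a 0-100 scale, 0.0 when nothing ran.
    pass_rate = 100.0 * passed / total if total else 0.0
    return {
        "cloudflare_invariants_total": total,
        "cloudflare_invariants_passed": passed,
        "cloudflare_invariants_failed": total - passed,
        "cloudflare_invariants_pass_rate": pass_rate,
    }
```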
Snapshot Metrics
- cloudflare_snapshot_age_seconds - Seconds since last snapshot
- cloudflare_snapshot_merkle_root_set - Merkle root present (0/1)
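As an illustration of the two snapshot gauges, the sketch below computes them from a parsed snapshot. The field names "timestamp" (ISO 8601) and "merkle_root" are hypothetical, since the snapshot schema is not documented here, and snapshot_metrics is not a function from this repository.

```python
from datetime import datetime, timezone


def snapshot_metrics(snapshot, now=None):
    """Compute snapshot freshness gauges.

    Assumption: `snapshot` has an ISO 8601 "timestamp" field (with an
    explicit offset such as +00:00) and an optional "merkle_root" field.
    """
    now = now or datetime.now(timezone.utc)
    taken_at = datetime.fromisoformat(snapshot["timestamp"])
    if taken_at.tzinfo is None:
        # Treat naive timestamps as UTC.
        taken_at = taken_at.replace(tzinfo=timezone.utc)
    age = (now - taken_at).total_seconds()
    return {
        # Clamp to zero so clock skew never yields a negative age.
        "cloudflare_snapshot_age_seconds": max(age, 0.0),
        "cloudflare_snapshot_merkle_root_set": 1 if snapshot.get("merkle_root") else 0,
    }
```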
Anomaly Metrics
- cloudflare_anomalies_total - Total anomaly receipts
- cloudflare_anomalies_last_24h - Recent anomalies
Drift Visualizer
Standalone tool that compares a Cloudflare state snapshot against the DNS manifest and reports the differences.
Usage
python3 drift-visualizer.py \
--snapshot ../snapshots/cloudflare-latest.json \
--manifest ../cloudflare_dns_manifest.md \
--output-dir ../reports
Output
- drift-report-<timestamp>.json - Machine-readable diff
- drift-report-<timestamp>.html - Visual HTML report
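The core of a drift comparison can be sketched as a three-way set difference. This is not the logic in drift-visualizer.py, whose parsing of the snapshot and manifest is not shown here; it assumes both sources have already been reduced to a mapping of (name, type) to content, which simplifies away multiple records per name and type. The names diff_records and to_json are hypothetical.

```python
import json


def diff_records(snapshot, manifest):
    """Compare two record mappings keyed by (name, type).

    Assumption: each mapping has one content value per (name, type) key.
    """
    snap_keys, man_keys = set(snapshot), set(manifest)
    return {
        # Declared in the manifest but absent from the live snapshot.
        "missing_in_cloudflare": sorted(man_keys - snap_keys),
        # Present in the live snapshot but not declared in the manifest.
        "not_in_manifest": sorted(snap_keys - man_keys),
        # Present in both, with different content.
        "changed": sorted(
            key for key in snap_keys & man_keys if snapshot[key] != manifest[key]
        ),
    }


def to_json(report):
    """Serialise the report; (name, type) tuples become JSON arrays."""
    return json.dumps(report, indent=2)
```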
Directory Structure
observatory/
├── docker-compose.yml # Stack definition
├── Dockerfile.exporter # Metrics exporter container
├── prometheus.yml # Prometheus config
├── metrics-exporter.py # Custom exporter
├── drift-visualizer.py # Drift analysis tool
├── datasources/ # Grafana datasource provisioning
│ └── prometheus.yml
├── dashboards/ # Grafana dashboard provisioning
│ ├── dashboards.yml
│ ├── cloudflare-overview.json
│ ├── dns-health.json
│ ├── tunnel-status.json
│ ├── invariants.json
│ ├── security-settings.json
│ └── proofchain.json
└── rules/ # Prometheus alerting rules (optional)
Integration with CI/CD
The metrics exporter reads from:
- ../snapshots/ - State snapshots from state-reconciler.py
- ../anomalies/ - Anomaly receipts from invariant-checker.py
Ensure these directories are populated by the GitLab CI pipeline or systemd services.
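One way to confirm a directory is populated is to look up its newest file. The sketch below does this by modification time. It assumes the snapshots are stored as .json files, which matches the cloudflare-latest.json name used in the drift visualizer example but is otherwise an assumption; latest_snapshot is a hypothetical helper.

```python
from pathlib import Path


def latest_snapshot(directory):
    """Return the most recently modified *.json file, or None if empty.

    Assumption: snapshots are plain .json files directly inside `directory`.
    """
    candidates = list(Path(directory).glob("*.json"))
    if not candidates:
        return None
    return max(candidates, key=lambda path: path.stat().st_mtime)
```

A None result from latest_snapshot("../snapshots") suggests the pipeline or service that writes snapshots has not run yet.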
Alerting (Optional)
Create alerting rules in rules/alerts.yml:
groups:
  - name: cloudflare
    rules:
      - alert: InvariantFailure
        expr: cloudflare_invariants_failed > 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Cloudflare invariant check failing"
      - alert: TunnelUnhealthy
        expr: cloudflare_tunnels_unhealthy > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Cloudflare tunnel has no connections"
      - alert: SnapshotStale
        expr: cloudflare_snapshot_age_seconds > 7200
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Cloudflare state snapshot older than 2 hours"
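Each rule above fires only after its expression has stayed true for the whole "for" duration. The sketch below is a simplified model of that behaviour for threshold rules of the form metric > threshold. It is not how Prometheus is implemented (Prometheus evaluates rules at its own evaluation interval), and the function name would_fire is hypothetical.

```python
def would_fire(samples, threshold, for_seconds):
    """Simplified model of a `metric > threshold` rule with a `for` clause.

    `samples` is a time-ordered list of (unix_timestamp, value) pairs.
    Returns True if the condition has held without interruption for at
    least `for_seconds` as of the last sample.
    """
    pending_since = None
    for timestamp, value in samples:
        if value > threshold:
            if pending_since is None:
                pending_since = timestamp  # condition just became true
        else:
            pending_since = None  # any false sample resets the timer
    if pending_since is None:
        return False
    return samples[-1][0] - pending_since >= for_seconds
```

For example, TunnelUnhealthy corresponds to would_fire(samples, threshold=0, for_seconds=300), and SnapshotStale to would_fire(samples, threshold=7200, for_seconds=600).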