vm-cloudflare/observatory/prometheus/alerts/proofchain-alerts.yml
Vault Sovereign 37a867c485 Initial commit: Cloudflare infrastructure with WAF Intelligence
- Complete Cloudflare Terraform configuration (DNS, WAF, tunnels, access)
- WAF Intelligence MCP server with threat analysis and ML classification
- GitOps automation with PR workflows and drift detection
- Observatory monitoring stack with Prometheus/Grafana
- IDE operator rules for governed development
- Security playbooks and compliance frameworks
- Autonomous remediation and state reconciliation
2025-12-16 18:31:53 +00:00


# Proofchain Alert Rules for Cloudflare Mesh Observatory
# Phase 5B - Alerts & Escalation
groups:
  - name: proofchain_alerts
    interval: 60s
    rules:
      # ============================================
      # CRITICAL - Chain Integrity Failure
      # ============================================
      - alert: ProofchainIntegrityFailure
        expr: cloudflare_proofchain_integrity_valid == 0
        for: 1m
        labels:
          severity: critical
          component: proofchain
          security_incident: "true"
        annotations:
          summary: "CRITICAL: Proofchain integrity verification FAILED"
          description: |
            Proofchain {{ $labels.chain_name }} has failed integrity verification.
            Last valid hash: {{ $labels.last_valid_hash }}
            Expected hash: {{ $labels.expected_hash }}
            Computed hash: {{ $labels.computed_hash }}
            This indicates potential:
            - Ledger tampering
            - Receipt corruption
            - Chain fork
            IMMEDIATELY HALT new receipt generation until resolved.
          impact: "Audit trail integrity compromised"
          runbook_url: "https://wiki.internal/playbooks/proofchain-incident"
      # ============================================
      # CRITICAL - Receipt Hash Mismatch
      # ============================================
      - alert: ReceiptHashMismatch
        expr: cloudflare_receipt_hash_valid == 0
        for: 1m
        labels:
          severity: critical
          component: proofchain
          security_incident: "true"
        annotations:
          summary: "Receipt hash mismatch detected"
          description: |
            Receipt {{ $labels.receipt_id }} ({{ $labels.receipt_type }})
            hash does not match stored value.
            This receipt may have been modified after creation.
            Investigate for potential tampering.
          runbook_url: "https://wiki.internal/playbooks/proofchain-incident"
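      # --------------------------------------------
      # Sketch only, not part of this rule file: one way Alertmanager could
      # route the two alerts above by their security_incident label.
      # Assumptions: the receiver names (default, security-oncall) are
      # hypothetical, the timing values are illustrative, and the snippet
      # would live in alertmanager.yml rather than here.
      #
      # route:
      #   receiver: default
      #   routes:
      #     - matchers:
      #         - security_incident="true"
      #       receiver: security-oncall
      #       group_wait: 0s
      #       repeat_interval: 1h
      # receivers:
      #   - name: default
      #   - name: security-oncall
      # --------------------------------------------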
      # ============================================
      # CRITICAL - Anchor Missing
      # ============================================
      - alert: ProofchainAnchorMissing
        expr: cloudflare_proofchain_anchor_age_hours > 24
        for: 1h
        labels:
          severity: critical
          component: proofchain
        annotations:
          summary: "Proofchain anchor overdue"
          description: |
            No proofchain anchor has been created in {{ $value | humanize }} hours.
            Anchors should be created at least daily.
            This weakens the audit trail's immutability guarantees.
          runbook_url: "https://wiki.internal/playbooks/proofchain-maintenance"
      # ============================================
      # WARNING - Receipt Generation Failed
      # ============================================
      - alert: ReceiptGenerationFailed
        expr: increase(cloudflare_receipt_generation_failures_total[1h]) > 0
        for: 5m
        labels:
          severity: warning
          component: proofchain
        annotations:
          summary: "Receipt generation failures detected"
          description: |
            {{ $value }} receipt generation failures in the last hour.
            Receipt type: {{ $labels.receipt_type }}
            Error: {{ $labels.error_type }}
            Operations are proceeding but not being properly logged.
      # ============================================
      # WARNING - Chain Growth Stalled
      # ============================================
      - alert: ProofchainGrowthStalled
        expr: increase(cloudflare_proofchain_receipts_total[6h]) == 0
        # The 6h window in expr already covers the stall period. A long
        # `for` would add to it (6h window + 6h pending = 12h of silence),
        # so `for` is kept short and only debounces.
        for: 5m
        labels:
          severity: warning
          component: proofchain
        annotations:
          summary: "No new receipts in 6 hours"
          description: |
            Proofchain {{ $labels.chain_name }} has not received new receipts
            in 6 hours. This may indicate:
            - Receipt generation failure
            - System not operational
            - Configuration issue
            Verify receipt generation is working.
      # ============================================
      # WARNING - Chain Drift from Root
      # ============================================
      - alert: ProofchainDrift
        expr: cloudflare_proofchain_drift_receipts > 100
        for: 1h
        labels:
          severity: warning
          component: proofchain
        annotations:
          summary: "Proofchain has {{ $value }} unanchored receipts"
          description: |
            Chain {{ $labels.chain_name }} has {{ $value }} receipts since
            the last anchor. Consider creating a new anchor to checkpoint
            the current state.
      # ============================================
      # INFO - Anchor Created
      # ============================================
      - alert: ProofchainAnchorCreated
        expr: changes(cloudflare_proofchain_anchor_count[1h]) > 0
        for: 0m
        labels:
          severity: info
          component: proofchain
        annotations:
          summary: "New proofchain anchor created"
          description: |
            A new anchor has been created for chain {{ $labels.chain_name }}.
            Anchor hash: {{ $labels.anchor_hash }}
            Receipts anchored: {{ $labels.receipts_anchored }}
      # ============================================
      # CRITICAL - Frontier Corruption
      # ============================================
      - alert: ProofchainFrontierCorrupt
        expr: cloudflare_proofchain_frontier_valid == 0
        for: 1m
        labels:
          severity: critical
          component: proofchain
        annotations:
          summary: "Proofchain frontier is corrupt"
          description: |
            The frontier (latest state) of chain {{ $labels.chain_name }}
            cannot be verified. The chain may be in an inconsistent state.
            Do not append new receipts until this is resolved.
          runbook_url: "https://wiki.internal/playbooks/proofchain-incident"
      # ============================================
      # WARNING - Receipt Backlog
      # ============================================
      - alert: ReceiptBacklog
        expr: cloudflare_receipt_queue_depth > 100
        for: 10m
        labels:
          severity: warning
          component: proofchain
        annotations:
          summary: "Receipt generation backlog"
          description: |
            {{ $value }} receipts waiting to be written.
            This may indicate performance issues or blocked writes.
      # ============================================
      # CRITICAL - Receipt Queue Overflow
      # ============================================
      - alert: ReceiptQueueOverflow
        expr: cloudflare_receipt_queue_depth > 1000
        for: 5m
        labels:
          severity: critical
          component: proofchain
        annotations:
          summary: "Receipt queue overflow imminent"
          description: |
            {{ $value }} receipts in queue. Queue may overflow.
            Some operational events may not be recorded.
            Investigate and resolve immediately.
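      # --------------------------------------------
      # Sketch only, not part of this rule file: a queue depth above 1000
      # also satisfies ReceiptBacklog (> 100), so both alerts can fire for
      # the same queue. An inhibit rule in alertmanager.yml could mute the
      # warning while the critical alert is firing.
      # Assumption: the `equal` list below guesses that the metric carries
      # an `instance` label; adjust it to the labels actually present.
      #
      # inhibit_rules:
      #   - source_matchers:
      #       - alertname="ReceiptQueueOverflow"
      #     target_matchers:
      #       - alertname="ReceiptBacklog"
      #     equal: [instance]
      # --------------------------------------------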
      # ============================================
      # WARNING - Receipt Write Latency High
      # ============================================
      - alert: ReceiptWriteLatencyHigh
        expr: cloudflare_receipt_write_duration_seconds > 5
        for: 5m
        labels:
          severity: warning
          component: proofchain
        annotations:
          summary: "High receipt write latency"
          description: |
            Receipt write operations taking {{ $value | humanize }}s.
            This may cause backlog buildup.
            Check storage performance.
      # ============================================
      # CRITICAL - Storage Near Capacity
      # ============================================
      - alert: ProofchainStorageNearFull
        expr: cloudflare_proofchain_storage_used_bytes / cloudflare_proofchain_storage_total_bytes > 0.9
        for: 1h
        labels:
          severity: critical
          component: proofchain
        annotations:
          summary: "Proofchain storage >90% full"
          description: |
            Proofchain storage is {{ $value | humanizePercentage }} full.
            Expand storage or archive old receipts immediately.
      # ============================================
      # WARNING - Cross-Ledger Verification Failed
      # ============================================
      - alert: CrossLedgerVerificationFailed
        expr: cloudflare_proofchain_cross_verification_valid == 0
        for: 5m
        labels:
          severity: warning
          component: proofchain
        annotations:
          summary: "Cross-ledger verification failed"
          description: |
            Verification between {{ $labels.chain_a }} and {{ $labels.chain_b }}
            has failed. The ledgers may have diverged.
            Investigate the root cause before proceeding.
      # ============================================
      # INFO - Receipt Type Distribution Anomaly
      # ============================================
      - alert: ReceiptDistributionAnomaly
        # Both sides are aggregated with sum(): without it the division
        # matches series label-for-label, so the anomaly series would only
        # ever be divided by itself and the ratio could not fall below 1.
        expr: |
          (sum(rate(cloudflare_receipts_by_type_total{type="anomaly"}[1h]))
            / sum(rate(cloudflare_receipts_by_type_total[1h]))) > 0.5
        for: 1h
        labels:
          severity: info
          component: proofchain
        annotations:
          summary: "High proportion of anomaly receipts"
          description: |
            More than 50% of recent receipts are anomaly type.
            This may indicate systemic issues being logged.
            Review recent anomaly receipts for patterns.
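      # --------------------------------------------
      # Sketch only, not part of this rule file: a promtool unit test for
      # ProofchainAnchorMissing, kept commented out so this file still loads.
      # Assumptions: the test lives in a hypothetical file
      # proofchain-alerts_test.yml next to this one, and the metric carries
      # a chain_name label with the hypothetical value "main".
      # promtool compares annotations as well as labels, so exp_annotations
      # has to match the rendered text exactly.
      # Run with: promtool test rules proofchain-alerts_test.yml
      #
      # rule_files:
      #   - proofchain-alerts.yml
      # evaluation_interval: 1m
      # tests:
      #   - interval: 1m
      #     input_series:
      #       # Anchor age stays at 25 hours for two hours of samples.
      #       - series: 'cloudflare_proofchain_anchor_age_hours{chain_name="main"}'
      #         values: '25+0x120'
      #     alert_rule_test:
      #       # Past the 1h `for` period, so the alert should be firing.
      #       - eval_time: 90m
      #         alertname: ProofchainAnchorMissing
      #         exp_alerts:
      #           - exp_labels:
      #               severity: critical
      #               component: proofchain
      #               chain_name: main
      #             exp_annotations:
      #               summary: "Proofchain anchor overdue"
      #               description: |
      #                 No proofchain anchor has been created in 25 hours.
      #                 Anchors should be created at least daily.
      #                 This weakens the audit trail's immutability guarantees.
      #               runbook_url: "https://wiki.internal/playbooks/proofchain-maintenance"
      # --------------------------------------------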