- Complete Cloudflare Terraform configuration (DNS, WAF, tunnels, access)
- WAF Intelligence MCP server with threat analysis and ML classification
- GitOps automation with PR workflows and drift detection
- Observatory monitoring stack with Prometheus/Grafana
- IDE operator rules for governed development
- Security playbooks and compliance frameworks
- Autonomous remediation and state reconciliation
170 lines
4.8 KiB
Cheetah
{{/* PagerDuty notification templates for Cloudflare Mesh Observatory */}}

{{/* Main description template */}}
{{ define "pagerduty.cloudflare.description" -}}
[{{ .CommonLabels.severity | toUpper }}] {{ .CommonLabels.alertname }} - {{ .CommonAnnotations.summary }}
{{- end }}

{{/* Detailed incident description */}}
{{ define "pagerduty.cloudflare.details" -}}
{{ range .Alerts }}
Alert: {{ .Labels.alertname }}
Severity: {{ .Labels.severity }}
Component: {{ .Labels.component }}

Summary: {{ .Annotations.summary }}

Description: {{ .Annotations.description }}

Labels:
{{ range .Labels.SortedPairs -}}
{{ .Name }}: {{ .Value }}
{{ end }}

Started: {{ .StartsAt.UTC.Format "2006-01-02 15:04:05 UTC" }}
{{ if eq .Status "resolved" }}Resolved: {{ .EndsAt.UTC.Format "2006-01-02 15:04:05 UTC" }}{{ end }}

Runbook: {{ if .Annotations.runbook_url }}{{ .Annotations.runbook_url }}{{ else }}https://wiki.internal/playbooks/cloudflare{{ end }}

---
{{ end }}
{{- end }}

{{/* Critical tunnel incident */}}
{{ define "pagerduty.cloudflare.tunnel.critical" -}}
CRITICAL TUNNEL FAILURE

Tunnel: {{ .CommonLabels.tunnel_name }} ({{ .CommonLabels.tunnel_id }})
Zone: {{ .CommonLabels.zone }}

All tunnel connections have failed. Services behind this tunnel are UNREACHABLE.

Immediate Actions Required:
1. Check cloudflared daemon status on origin server
2. Verify network path to Cloudflare edge
3. Review recent configuration changes
4. Consider emergency tunnel rotation

Impact: {{ .CommonAnnotations.impact }}
ETA to degradation: IMMEDIATE

Escalation Chain:
1. On-call Infrastructure Engineer
2. Platform Team Lead
3. Security Team (if compromise suspected)
{{- end }}

{{/* Critical DNS incident */}}
{{ define "pagerduty.cloudflare.dns.critical" -}}
CRITICAL DNS INCIDENT

Type: {{ .CommonLabels.alertname }}
Zone: {{ .CommonLabels.zone }}
Record: {{ .CommonLabels.record_name }}

{{ if eq .CommonLabels.alertname "DNSHijackDetected" -}}
POTENTIAL DNS HIJACK DETECTED

This is a SECURITY INCIDENT. DNS records do not match expected configuration.

Immediate Actions:
1. Verify DNS resolution from multiple locations
2. Check Cloudflare dashboard for unauthorized changes
3. Review audit logs for suspicious activity
4. Engage security incident response

DO NOT dismiss without verification.
{{- else -}}
DNS configuration drift detected. Records have changed from expected baseline.

Actions:
1. Compare current vs expected records
2. Determine if change was authorized
3. Restore from known-good state if needed
{{- end }}
{{- end }}

{{/* Critical WAF incident */}}
{{ define "pagerduty.cloudflare.waf.critical" -}}
CRITICAL WAF INCIDENT

Attack Type: {{ .CommonLabels.attack_type }}
Source: {{ .CommonLabels.source_ip }}
Request Volume: {{ .CommonLabels.request_count }} requests

{{ if eq .CommonLabels.alertname "WAFMassiveAttack" -}}
MASSIVE ATTACK IN PROGRESS

Request volume significantly exceeds baseline. This may indicate:
- DDoS attack
- Credential stuffing
- Application-layer attack

Immediate Actions:
1. Review attack traffic patterns
2. Consider enabling Under Attack Mode
3. Tighten rate limiting (lower the thresholds)
4. Block attacking IPs if identified

Current Mitigation: {{ .CommonAnnotations.current_mitigation }}
{{- else -}}
WAF rule bypass detected. Malicious traffic may be reaching origin.

Actions:
1. Analyze bypassed requests
2. Tighten rule specificity
3. Add supplementary blocking rules
{{- end }}
{{- end }}

{{/* Critical invariant violation */}}
{{ define "pagerduty.cloudflare.invariant.critical" -}}
SECURITY INVARIANT VIOLATION

Invariant: {{ .CommonLabels.invariant_name }}
Category: {{ .CommonLabels.category }}

A critical security invariant has been violated. This indicates:
- Unauthorized configuration change
- Potential security misconfiguration
- Compliance violation

Violation Details:
- Expected: {{ .CommonLabels.expected_value }}
- Actual: {{ .CommonLabels.actual_value }}
- Impact: {{ .CommonAnnotations.impact }}

Affected Frameworks: {{ .CommonLabels.frameworks }}

This violation requires immediate investigation and remediation.
{{- end }}

{{/* Critical proofchain incident */}}
{{ define "pagerduty.cloudflare.proofchain.critical" -}}
PROOFCHAIN INTEGRITY FAILURE

Chain: {{ .CommonLabels.chain_name }}
Receipt Type: {{ .CommonLabels.receipt_type }}

CRITICAL: Proofchain integrity verification has FAILED.

This indicates one of:
1. Ledger tampering
2. Receipt corruption
3. Chain fork
4. Hash collision (extremely unlikely)

Integrity Details:
- Last Valid Hash: {{ .CommonLabels.last_valid_hash }}
- Expected Hash: {{ .CommonLabels.expected_hash }}
- Computed Hash: {{ .CommonLabels.computed_hash }}

IMMEDIATE ACTIONS:
1. HALT all new receipt generation
2. Preserve current state for forensics
3. Identify last known-good checkpoint
4. Engage proofchain administrator

This is a potential SECURITY INCIDENT if tampering is suspected.
{{- end }}
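
{{/*
Usage sketch (an assumption, not taken from this repository): one way the
templates defined above might be wired into an Alertmanager receiver. The file
path, receiver name, routing key placeholder, and the "incident" details key
are hypothetical; only the template names come from this file.

  templates:
    - '/etc/alertmanager/templates/pagerduty.tmpl'   # hypothetical path to this file

  receivers:
    - name: 'pagerduty-cloudflare'                   # hypothetical receiver name
      pagerduty_configs:
        - routing_key: '<pagerduty-integration-key>' # placeholder, supply your own
          description: '{{ template "pagerduty.cloudflare.description" . }}'
          details:
            incident: '{{ template "pagerduty.cloudflare.details" . }}'

The component-specific templates (tunnel, dns, waf, invariant, proofchain)
could be referenced the same way from receivers whose routes match the
corresponding alerts.
*/}}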