Initial commit: Cloudflare infrastructure with WAF Intelligence
- Complete Cloudflare Terraform configuration (DNS, WAF, tunnels, access)
- WAF Intelligence MCP server with threat analysis and ML classification
- GitOps automation with PR workflows and drift detection
- Observatory monitoring stack with Prometheus/Grafana
- IDE operator rules for governed development
- Security playbooks and compliance frameworks
- Autonomous remediation and state reconciliation
169
observatory/alertmanager/templates/pagerduty.tmpl
Normal file
@@ -0,0 +1,169 @@
{{/* PagerDuty notification templates for Cloudflare Mesh Observatory */}}

{{/* Main description template */}}
{{ define "pagerduty.cloudflare.description" -}}
[{{ .CommonLabels.severity | toUpper }}] {{ .CommonLabels.alertname }} - {{ .CommonAnnotations.summary }}
{{- end }}

{{/* Detailed incident description */}}
{{ define "pagerduty.cloudflare.details" -}}
{{ range .Alerts }}
Alert: {{ .Labels.alertname }}
Severity: {{ .Labels.severity }}
Component: {{ .Labels.component }}

Summary: {{ .Annotations.summary }}

Description: {{ .Annotations.description }}

Labels:
{{ range .Labels.SortedPairs -}}
{{ .Name }}: {{ .Value }}
{{ end }}

Started: {{ .StartsAt.UTC.Format "2006-01-02 15:04:05 UTC" }}
{{ if eq .Status "resolved" }}Resolved: {{ .EndsAt.UTC.Format "2006-01-02 15:04:05 UTC" }}{{ end }}

Runbook: {{ if .Annotations.runbook_url }}{{ .Annotations.runbook_url }}{{ else }}https://wiki.internal/playbooks/cloudflare{{ end }}

---
{{ end }}
{{- end }}

{{/* Critical tunnel incident */}}
{{ define "pagerduty.cloudflare.tunnel.critical" -}}
CRITICAL TUNNEL FAILURE

Tunnel: {{ .CommonLabels.tunnel_name }} ({{ .CommonLabels.tunnel_id }})
Zone: {{ .CommonLabels.zone }}

All tunnel connections have failed. Services behind this tunnel are UNREACHABLE.

Immediate Actions Required:
1. Check cloudflared daemon status on origin server
2. Verify network path to Cloudflare edge
3. Review recent configuration changes
4. Consider emergency tunnel rotation

Impact: {{ .CommonAnnotations.impact }}
ETA to degradation: IMMEDIATE

Escalation Chain:
1. On-call Infrastructure Engineer
2. Platform Team Lead
3. Security Team (if compromise suspected)
{{- end }}

{{/* Critical DNS incident */}}
{{ define "pagerduty.cloudflare.dns.critical" -}}
CRITICAL DNS INCIDENT

Type: {{ .CommonLabels.alertname }}
Zone: {{ .CommonLabels.zone }}
Record: {{ .CommonLabels.record_name }}

{{ if eq .CommonLabels.alertname "DNSHijackDetected" -}}
POTENTIAL DNS HIJACK DETECTED

This is a SECURITY INCIDENT. DNS records do not match expected configuration.

Immediate Actions:
1. Verify DNS resolution from multiple locations
2. Check Cloudflare dashboard for unauthorized changes
3. Review audit logs for suspicious activity
4. Engage security incident response

DO NOT dismiss without verification.
{{- else -}}
DNS configuration drift detected. Records have changed from expected baseline.

Actions:
1. Compare current vs expected records
2. Determine if change was authorized
3. Restore from known-good state if needed
{{- end }}
{{- end }}

{{/* Critical WAF incident */}}
{{ define "pagerduty.cloudflare.waf.critical" -}}
CRITICAL WAF INCIDENT

Attack Type: {{ .CommonLabels.attack_type }}
Source: {{ .CommonLabels.source_ip }}
Request Volume: {{ .CommonLabels.request_count }} requests

{{ if eq .CommonLabels.alertname "WAFMassiveAttack" -}}
MASSIVE ATTACK IN PROGRESS

Request volume significantly exceeds baseline. This may indicate:
- DDoS attack
- Credential stuffing
- Application-layer attack

Immediate Actions:
1. Review attack traffic patterns
2. Consider enabling Under Attack Mode
3. Tighten rate limiting (lower thresholds)
4. Block attacking IPs if identified

Current Mitigation: {{ .CommonAnnotations.current_mitigation }}
{{- else -}}
WAF rule bypass detected. Malicious traffic may be reaching origin.

Actions:
1. Analyze bypassed requests
2. Tighten rule specificity
3. Add supplementary blocking rules
{{- end }}
{{- end }}

{{/* Critical invariant violation */}}
{{ define "pagerduty.cloudflare.invariant.critical" -}}
SECURITY INVARIANT VIOLATION

Invariant: {{ .CommonLabels.invariant_name }}
Category: {{ .CommonLabels.category }}

A critical security invariant has been violated. This indicates:
- Unauthorized configuration change
- Potential security misconfiguration
- Compliance violation

Violation Details:
- Expected: {{ .CommonLabels.expected_value }}
- Actual: {{ .CommonLabels.actual_value }}
- Impact: {{ .CommonAnnotations.impact }}

Affected Frameworks: {{ .CommonLabels.frameworks }}

This violation requires immediate investigation and remediation.
{{- end }}

{{/* Critical proofchain incident */}}
{{ define "pagerduty.cloudflare.proofchain.critical" -}}
PROOFCHAIN INTEGRITY FAILURE

Chain: {{ .CommonLabels.chain_name }}
Receipt Type: {{ .CommonLabels.receipt_type }}

CRITICAL: Proofchain integrity verification has FAILED.

This indicates one of:
1. Ledger tampering
2. Receipt corruption
3. Chain fork
4. Hash collision (extremely unlikely)

Integrity Details:
- Last Valid Hash: {{ .CommonLabels.last_valid_hash }}
- Expected Hash: {{ .CommonLabels.expected_hash }}
- Computed Hash: {{ .CommonLabels.computed_hash }}

IMMEDIATE ACTIONS:
1. HALT all new receipt generation
2. Preserve current state for forensics
3. Identify last known-good checkpoint
4. Engage proofchain administrator

This is a potential SECURITY INCIDENT if tampering is suspected.
{{- end }}