Initial commit: Cloudflare infrastructure with WAF Intelligence

- Complete Cloudflare Terraform configuration (DNS, WAF, tunnels, access)
- WAF Intelligence MCP server with threat analysis and ML classification
- GitOps automation with PR workflows and drift detection
- Observatory monitoring stack with Prometheus/Grafana
- IDE operator rules for governed development
- Security playbooks and compliance frameworks
- Autonomous remediation and state reconciliation
Vault Sovereign
2025-12-16 18:31:53 +00:00
commit 37a867c485
123 changed files with 25407 additions and 0 deletions

gitops/README.md Normal file

@@ -0,0 +1,343 @@
# Phase 6 - GitOps PR Workflows
Cloudflare Mesh Observatory - Automated Drift Remediation & Plan Comments
## Overview
Phase 6 completes the observability feedback loop by converting alerts and drift
detection into actionable Merge Requests.
```
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Observatory │────▶│ Alerts │────▶│ GitOps │
│ (Phase 5A) │ │ (Phase 5B) │ │ (Phase 6) │
└─────────────┘ └─────────────┘ └─────────────┘
│ │ │
│ │ ▼
│ │ ┌─────────────┐
│ │ │ Drift PR │
│ │ │ Created │
│ │ └─────────────┘
│ │ │
│ │ ▼
│ │ ┌─────────────┐
│ └───────────▶│ Review & │
│ │ Merge │
│ └─────────────┘
│ │
└───────────────────────────────────────┘
Terraform Apply
```
## Components
| File | Purpose |
|------|---------|
| `config.yml` | GitOps configuration, risk classification, compliance mapping |
| `plan_summarizer.py` | Parses terraform plan JSON, scores risk, generates markdown |
| `drift_pr_bot.py` | Creates drift remediation MRs in GitLab/GitHub |
| `ci_plan_comment.py` | Posts plan summaries as MR comments |
| `webhook_receiver.py` | Receives Alertmanager webhooks, triggers pipelines |
## Quick Start
### 1. Configure Environment
```bash
# Copy and edit config
cd ~/Desktop/CLOUDFLARE/gitops
cp config.yml config.local.yml # optional local override
# Set environment variables
export GITLAB_TOKEN="glpat-xxxx"
export GITLAB_PROJECT_ID="12345678"
export SLACK_WEBHOOK_URL="https://hooks.slack.com/..."
```
### 2. Test Plan Summarizer
```bash
# Generate a terraform plan first
cd ../terraform
terraform init
terraform plan -out=plan.tfplan
# Run summarizer
cd ../gitops
python3 plan_summarizer.py --format markdown
python3 plan_summarizer.py --format json
```
### 3. Test Drift PR Bot (Dry Run)
```bash
python3 drift_pr_bot.py --dry-run
```
### 4. Start Webhook Receiver (Optional)
```bash
python3 webhook_receiver.py --port 8080
# POST to http://localhost:8080/webhook/alert
```
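
For a quick smoke test without `curl`, a minimal sketch that posts a fake Alertmanager payload (assuming the receiver is listening locally on port 8080, as above):

```python
import requests

# Minimal Alertmanager-style payload; the alertname matches a configured trigger alert
payload = {"alerts": [{"labels": {"alertname": "DNSDriftDetected"}}]}
resp = requests.post("http://localhost:8080/webhook/alert", json=payload, timeout=5)
print(resp.status_code, resp.text)
```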
## Configuration Reference
### Risk Classification
The `config.yml` maps Cloudflare resources to risk levels:
```yaml
risk:
dns:
resource_types:
- "cloudflare_record"
- "cloudflare_zone"
base_risk: "high"
waf:
resource_types:
- "cloudflare_waf_rule"
- "cloudflare_firewall_rule"
base_risk: "high"
actions:
create:
modifier: 0 # Neutral
update:
modifier: 1 # +1 level
delete:
modifier: 2 # +2 levels (always dangerous)
```
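
Final risk is the base level plus the highest action modifier, capped at critical; a condensed sketch of the arithmetic performed by `calculate_risk` in `plan_summarizer.py`:

```python
LEVELS = ["low", "medium", "high", "critical"]

def score(base: str, modifiers: list) -> str:
    # Base risk index + highest action modifier, capped at critical
    idx = min(LEVELS.index(base) + max(modifiers, default=0), len(LEVELS) - 1)
    return LEVELS[idx]

# Deleting a cloudflare_record: high (2) + delete (+2) -> capped at critical
print(score("high", [2]))  # critical
```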
### Compliance Frameworks
Map resources/actions to compliance frameworks:
```yaml
compliance:
frameworks:
- name: "SOC2"
triggers:
- resource_types: ["cloudflare_zone_settings_override"]
fields: ["ssl", "always_use_https"]
- resource_types: ["cloudflare_waf_rule"]
actions: ["delete"]
- name: "PCI-DSS"
triggers:
- resource_types: ["cloudflare_zone_settings_override"]
fields: ["min_tls_version"]
```
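
Trigger matching uses shell-style globs on resource types (so `cloudflare_waf_*` covers every WAF resource), optionally narrowed by actions and changed fields; a condensed sketch of the check in `plan_summarizer.py`:

```python
from fnmatch import fnmatch

def trigger_matches(trigger, resource_type, actions, before, after):
    # Resource type must match one of the glob patterns
    if not any(fnmatch(resource_type, t) for t in trigger.get("resource_types", [])):
        return False
    # If actions are listed, at least one must apply
    if trigger.get("actions") and not set(actions) & set(trigger["actions"]):
        return False
    # If fields are listed and both states exist, one of them must have changed
    fields = trigger.get("fields", [])
    if fields and before and after:
        return any(before.get(f) != after.get(f) for f in fields)
    return True

# Deleting a cloudflare_waf_rule trips the SOC2 delete trigger
print(trigger_matches(
    {"resource_types": ["cloudflare_waf_rule"], "actions": ["delete"]},
    "cloudflare_waf_rule", ["delete"], {}, None))  # True
```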
### Drift PR Settings
```yaml
drift_pr:
branch_prefix: "drift/remediation-"
title_prefix: "Drift Remediation"
labels:
- "drift"
- "terraform"
# Auto-assign reviewers by category
reviewer_mapping:
dns: ["dns-team"]
waf: ["security-team"]
tunnels: ["infra-team"]
```
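
Branch names and MR titles are derived from these settings plus a UTC timestamp; a sketch of the naming used by `drift_pr_bot.py`:

```python
from datetime import datetime

now = datetime.utcnow().strftime("%Y-%m-%dT%H%M%SZ")
branch = "drift/remediation-" + now   # e.g. drift/remediation-2025-12-16T183153Z
title = f"Drift Remediation: {now}"
```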
## GitLab CI Integration
Three jobs are added to `.gitlab-ci.yml`:
### 1. Plan Comment on MRs
```yaml
gitops:plan_comment:
stage: gitops
script:
- python3 gitops/ci_plan_comment.py
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
```
Posts a rich markdown comment showing:
- Overall risk level
- Action breakdown (create/update/delete)
- Affected zones
- Compliance flags
- Resource change table
### 2. Drift Remediation
```yaml
gitops:drift_remediation:
stage: gitops
script:
- python3 gitops/drift_pr_bot.py
rules:
- if: $CI_PIPELINE_SOURCE == "schedule" && $GITOPS_DRIFT_CHECK == "true"
- if: $CI_PIPELINE_SOURCE == "trigger" && $GITOPS_TRIGGER_SOURCE == "alert"
```
Triggered by:
- Scheduled pipelines (daily drift check)
- Alertmanager webhooks (alert-triggered)
### 3. Risk Gate
```yaml
gitops:risk_gate:
stage: gitops
script:
- |
RISK=$(python3 plan_summarizer.py --format json | ...)
if [ "$RISK" = "CRITICAL" ]; then
exit 1
fi
allow_failure: true
```
Fails the job when the plan's overall risk is CRITICAL. Note that with `allow_failure: true` the gate is advisory only; set it to `false` to hard-block auto-merge.
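
The extraction command is elided above; one way to pull the risk level out of the summarizer output is to read the JSON directly (a sketch based on the JSON format shown under Output Examples):

```python
import json
import subprocess

# Run the summarizer in JSON mode and read the overall risk field
out = subprocess.run(
    ["python3", "plan_summarizer.py", "--format", "json"],
    capture_output=True, text=True, check=True,
).stdout
print(json.loads(out)["overall_risk"])  # e.g. "HIGH"
```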
## Alertmanager Integration
### Add Webhook Receiver
Add to `observatory/alertmanager/alertmanager.yml`:
```yaml
receivers:
- name: 'gitops-webhook'
webhook_configs:
- url: 'http://gitops-webhook:8080/webhook/alert'
send_resolved: false
```
### Route Drift Alerts
```yaml
route:
routes:
- match:
alertname: DNSDriftDetected
receiver: 'gitops-webhook'
continue: true
- match:
alertname: WAFRuleMissing
receiver: 'gitops-webhook'
continue: true
```
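
The receiver's source is not shown in this excerpt, but per the `webhook:` section of `config.yml` its alert dispatch reduces to a set lookup; a minimal sketch:

```python
# Alert routing per config.yml (webhook: trigger_alerts / notify_only_alerts)
TRIGGER = {"DNSDriftDetected", "WAFRuleMissing", "TunnelConfigChanged",
           "InvariantViolation", "FirewallRuleMissing"}
NOTIFY_ONLY = {"DNSHijackDetected", "ProofchainIntegrityFailure", "WAFRuleBypass"}

def dispatch(alertname: str) -> str:
    if alertname in NOTIFY_ONLY:
        return "notify"            # security incidents: never auto-remediate
    if alertname in TRIGGER:
        return "trigger_pipeline"  # kicks off drift remediation
    return "ignore"
```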
## Output Examples
### MR Comment
```markdown
## 🟠 Terraform Plan Summary
**Overall Risk:** 🟠 **HIGH**
**Total Changes:** `5`
**Actions:** create=2, update=2, delete=1
**By Category:**
- dns: 3
- waf: 2
**Affected Zones:** `example.com`, `staging.example.com`
**Compliance Impact:**
- ⚠️ SOC2
- ⚠️ PCI-DSS
### Resource Changes
| Resource | Actions | Risk | Compliance |
|----------|---------|------|------------|
| `cloudflare_record.api` | `delete` | **CRITICAL** | SOC2 |
| `cloudflare_waf_rule.sqli` | `update` | **HIGH** | PCI-DSS |
...
```
### JSON Output
```json
{
"total_changes": 5,
"overall_risk": "HIGH",
"by_action": {"create": 2, "update": 2, "delete": 1},
"by_risk": {"LOW": 1, "MEDIUM": 1, "HIGH": 2, "CRITICAL": 1},
"by_category": {"dns": 3, "waf": 2},
"affected_zones": ["example.com", "staging.example.com"],
"compliance_violations": ["SOC2", "PCI-DSS"],
"changes": [...]
}
```
## Environment Variables
| Variable | Required | Description |
|----------|----------|-------------|
| `GITLAB_TOKEN` | Yes | GitLab API token with `api` scope |
| `GITLAB_PROJECT_ID` | Yes | Target project ID |
| `GITLAB_BASE_URL` | No | GitLab instance URL (default: gitlab.com) |
| `GITLAB_TRIGGER_TOKEN` | No | For pipeline triggers from webhooks |
| `SLACK_WEBHOOK_URL` | No | Slack notifications |
| `GITOPS_DRY_RUN` | No | Set `true` to skip actual PR creation |
| `WEBHOOK_SECRET` | No | HMAC secret for webhook verification |
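
The scripts expand `${VAR}` and `${VAR:-default}` placeholders in `config.yml` against these variables at load time; a condensed version of the `expand_env` helper they share:

```python
import os

def expand(value: str):
    # Expand "${VAR}" or "${VAR:-default}" placeholders from the environment
    if value.startswith("${") and value.endswith("}"):
        var, _, default = value[2:-1].partition(":-")
        return os.environ.get(var, default or None)
    return value

print(expand("${GITLAB_BASE_URL:-https://gitlab.com}"))  # https://gitlab.com if unset
```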
## Security Considerations
1. **Token Scope**: Use minimal GitLab token scope (`api` for MR creation)
2. **Webhook Security**: Set `WEBHOOK_SECRET` for signature verification (see the sketch after this list)
3. **Review Before Merge**: Always review auto-generated PRs
4. **Compliance Blocking**: Consider `block_on_violation: true` for strict mode
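
For item 2, a minimal sketch of receiver-side HMAC verification, assuming the sender supplies a hex SHA-256 digest of the request body in a header (the header name and digest scheme are assumptions, not shown in this excerpt):

```python
import hashlib
import hmac
import os

def verify_signature(body: bytes, signature_header: str) -> bool:
    # Assumes signature_header carries hex(HMAC-SHA256(WEBHOOK_SECRET, body))
    secret = os.environ["WEBHOOK_SECRET"].encode()
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)
```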
## Troubleshooting
### Plan Summarizer Fails
```bash
# Check terraform plan exists
ls -la terraform/plan.tfplan
# Run terraform show manually
cd terraform
terraform show -json plan.tfplan | head -100
```
### MR Comment Not Posted
```bash
# Check CI variables are set
echo $GITLAB_TOKEN
echo $CI_MERGE_REQUEST_IID
# Run comment script manually
python3 ci_plan_comment.py --dry-run
```
### Webhook Not Triggering
```bash
# Check webhook receiver logs
curl -X POST http://localhost:8080/webhook/alert \
-H "Content-Type: application/json" \
-d '{"alerts":[{"labels":{"alertname":"DNSDriftDetected"}}]}'
# Check Alertmanager config
amtool config show
```
## Next Phases
- **Phase 7 (WAF Intelligence)**: ML-lite analysis of attack patterns
- **Phase 8 (Zero Trust Auditor)**: Identity policy compliance
- **Phase 9 (VaultMesh Integration)**: ProofChain anchoring
---
*Phase 6 GitOps - Cloudflare Mesh Observatory*

gitops/ci_plan_comment.py Normal file

@@ -0,0 +1,358 @@
#!/usr/bin/env python3
"""
CI Plan Comment Bot for Cloudflare GitOps
Phase 6 - PR Workflows
Posts Terraform plan summaries as comments on Merge Requests.
Designed to run in GitLab CI/CD pipelines.
"""
import json
import os
import subprocess
import sys
from pathlib import Path
from typing import Any, Dict, Optional
try:
import requests
import yaml
except ImportError:
print("ERROR: pip install requests pyyaml", file=sys.stderr)
sys.exit(1)
HERE = Path(__file__).resolve().parent
CONFIG_PATH = HERE / "config.yml"
def load_config() -> Dict[str, Any]:
"""Load gitops configuration with env expansion"""
with open(CONFIG_PATH) as f:
config = yaml.safe_load(f)
def expand_env(obj):
if isinstance(obj, str):
if obj.startswith("${") and "}" in obj:
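                # Handle ${VAR:-default} syntax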
inner = obj[2:obj.index("}")]
default = None
var = inner
if ":-" in inner:
var, default = inner.split(":-", 1)
return os.environ.get(var, default)
return obj
elif isinstance(obj, dict):
return {k: expand_env(v) for k, v in obj.items()}
elif isinstance(obj, list):
return [expand_env(i) for i in obj]
return obj
return expand_env(config)
def get_plan_summary() -> tuple[str, Dict]:
"""Run plan_summarizer and get both formats"""
# Markdown for comment
result = subprocess.run(
["python3", "plan_summarizer.py", "--format", "markdown"],
cwd=HERE,
capture_output=True,
text=True,
check=True,
)
markdown = result.stdout
# JSON for processing
result = subprocess.run(
["python3", "plan_summarizer.py", "--format", "json"],
cwd=HERE,
capture_output=True,
text=True,
check=True,
)
summary_json = json.loads(result.stdout)
return markdown, summary_json
class GitLabCI:
"""GitLab CI integration"""
def __init__(self, token: str):
self.base_url = os.environ.get("CI_API_V4_URL", "https://gitlab.com/api/v4")
self.project_id = os.environ.get("CI_PROJECT_ID")
self.mr_iid = os.environ.get("CI_MERGE_REQUEST_IID")
self.commit_sha = os.environ.get("CI_COMMIT_SHA", "")[:8]
self.pipeline_url = os.environ.get("CI_PIPELINE_URL", "")
self.job_name = os.environ.get("CI_JOB_NAME", "terraform-plan")
self.token = token
self.headers = {"PRIVATE-TOKEN": token}
@property
def is_mr_pipeline(self) -> bool:
return bool(self.mr_iid)
def get_existing_comments(self) -> list:
"""Get existing MR comments"""
url = f"{self.base_url}/projects/{self.project_id}/merge_requests/{self.mr_iid}/notes"
resp = requests.get(url, headers=self.headers)
resp.raise_for_status()
return resp.json()
def find_bot_comment(self, marker: str) -> Optional[Dict]:
"""Find existing bot comment by marker"""
comments = self.get_existing_comments()
for comment in comments:
if marker in comment.get("body", ""):
return comment
return None
def post_comment(self, body: str) -> Dict:
"""Post a new comment on the MR"""
url = f"{self.base_url}/projects/{self.project_id}/merge_requests/{self.mr_iid}/notes"
resp = requests.post(url, headers=self.headers, data={"body": body})
resp.raise_for_status()
return resp.json()
def update_comment(self, note_id: int, body: str) -> Dict:
"""Update an existing comment"""
url = f"{self.base_url}/projects/{self.project_id}/merge_requests/{self.mr_iid}/notes/{note_id}"
resp = requests.put(url, headers=self.headers, data={"body": body})
resp.raise_for_status()
return resp.json()
def delete_comment(self, note_id: int):
"""Delete a comment"""
url = f"{self.base_url}/projects/{self.project_id}/merge_requests/{self.mr_iid}/notes/{note_id}"
resp = requests.delete(url, headers=self.headers)
resp.raise_for_status()
class GitHubActions:
"""GitHub Actions integration"""
def __init__(self, token: str):
self.base_url = "https://api.github.com"
self.repo = os.environ.get("GITHUB_REPOSITORY", "")
self.pr_number = self._get_pr_number()
self.commit_sha = os.environ.get("GITHUB_SHA", "")[:8]
self.run_url = f"https://github.com/{self.repo}/actions/runs/{os.environ.get('GITHUB_RUN_ID', '')}"
self.token = token
self.headers = {
"Authorization": f"token {token}",
"Accept": "application/vnd.github.v3+json",
}
def _get_pr_number(self) -> Optional[str]:
"""Extract PR number from GitHub event"""
event_path = os.environ.get("GITHUB_EVENT_PATH")
if event_path and os.path.exists(event_path):
with open(event_path) as f:
event = json.load(f)
pr = event.get("pull_request", {})
return str(pr.get("number", "")) if pr else None
return None
@property
def is_pr_pipeline(self) -> bool:
return bool(self.pr_number)
def find_bot_comment(self, marker: str) -> Optional[Dict]:
"""Find existing bot comment"""
url = f"{self.base_url}/repos/{self.repo}/issues/{self.pr_number}/comments"
resp = requests.get(url, headers=self.headers)
resp.raise_for_status()
for comment in resp.json():
if marker in comment.get("body", ""):
return comment
return None
def post_comment(self, body: str) -> Dict:
"""Post a new comment"""
url = f"{self.base_url}/repos/{self.repo}/issues/{self.pr_number}/comments"
resp = requests.post(url, headers=self.headers, json={"body": body})
resp.raise_for_status()
return resp.json()
def update_comment(self, comment_id: int, body: str) -> Dict:
"""Update existing comment"""
url = f"{self.base_url}/repos/{self.repo}/issues/comments/{comment_id}"
resp = requests.patch(url, headers=self.headers, json={"body": body})
resp.raise_for_status()
return resp.json()
def build_comment_body(
cfg: Dict[str, Any],
summary_md: str,
summary_json: Dict,
ci_info: Dict,
) -> str:
"""Build the full comment body"""
ci_cfg = cfg.get("ci", {})
header = ci_cfg.get("comment_header", "Terraform Plan Summary")
# Risk indicator
risk = summary_json.get("overall_risk", "UNKNOWN")
risk_emoji = {
"LOW": "🟢",
"MEDIUM": "🟡",
"HIGH": "🟠",
"CRITICAL": "🔴",
}.get(risk, "")
# Marker for finding/updating this comment
marker = "<!-- gitops-plan-comment -->"
changes = summary_json.get("total_changes", 0)
compliance = summary_json.get("compliance_violations", [])
# Build body
lines = [
marker,
f"# {risk_emoji} {header}",
"",
f"**Commit:** `{ci_info.get('commit_sha', 'N/A')}`",
f"**Pipeline:** [{ci_info.get('job_name', 'terraform-plan')}]({ci_info.get('pipeline_url', '#')})",
"",
]
# Compliance warning banner
if compliance:
frameworks = ", ".join(compliance)
lines.extend([
f"> ⚠️ **Compliance Impact:** {frameworks}",
"",
])
# No changes case
if changes == 0:
lines.extend([
"✅ **No changes detected.**",
"",
"Terraform state matches the current configuration.",
])
else:
# Add summary
lines.append(summary_md)
# Add approval reminder for high risk
if risk in ("HIGH", "CRITICAL"):
lines.extend([
"",
"---",
f"⚠️ **{risk} risk changes detected.** Additional review recommended.",
])
lines.extend([
"",
"---",
f"*Last updated: {ci_info.get('timestamp', 'N/A')} • Phase 6 GitOps*",
])
return "\n".join(lines)
def main():
"""Main entry point"""
import argparse
from datetime import datetime
parser = argparse.ArgumentParser(
description="Post terraform plan comment on MR"
)
parser.add_argument(
"--dry-run",
action="store_true",
help="Print comment but don't post",
)
parser.add_argument(
"--update",
action="store_true",
default=True,
help="Update existing comment instead of creating new one",
)
args = parser.parse_args()
# Load config
cfg = load_config()
# Detect CI platform
token = os.environ.get("GITLAB_TOKEN") or os.environ.get("GITHUB_TOKEN")
if not token:
print("ERROR: GITLAB_TOKEN or GITHUB_TOKEN required", file=sys.stderr)
sys.exit(1)
# Determine platform
if os.environ.get("GITLAB_CI"):
ci = GitLabCI(token)
platform = "gitlab"
elif os.environ.get("GITHUB_ACTIONS"):
ci = GitHubActions(token)
platform = "github"
else:
print("ERROR: Must run in GitLab CI or GitHub Actions", file=sys.stderr)
sys.exit(1)
    # Check if this is an MR/PR pipeline (each platform class exposes only one flag)
    if not getattr(ci, "is_mr_pipeline", False) and not getattr(ci, "is_pr_pipeline", False):
        print("Not an MR/PR pipeline. Skipping comment.")
        return
# Get plan summary
print("Getting plan summary...")
summary_md, summary_json = get_plan_summary()
# Build CI info
ci_info = {
"commit_sha": getattr(ci, "commit_sha", ""),
"pipeline_url": getattr(ci, "pipeline_url", "") or getattr(ci, "run_url", ""),
"job_name": getattr(ci, "job_name", "terraform-plan"),
"timestamp": datetime.utcnow().strftime("%Y-%m-%d %H:%M:%S UTC"),
}
# Build comment
body = build_comment_body(cfg, summary_md, summary_json, ci_info)
if args.dry_run:
print("\n" + "=" * 60)
print("[DRY RUN] Would post comment:")
print("=" * 60)
print(body)
return
# Find existing comment to update
marker = "<!-- gitops-plan-comment -->"
existing = ci.find_bot_comment(marker)
if existing and args.update:
print(f"Updating existing comment {existing.get('id') or existing.get('note_id')}...")
note_id = existing.get("id") or existing.get("note_id")
ci.update_comment(note_id, body)
print("Comment updated.")
else:
print("Posting new comment...")
result = ci.post_comment(body)
print(f"Comment posted: {result.get('id') or result.get('html_url')}")
# Output for CI
risk = summary_json.get("overall_risk", "UNKNOWN")
changes = summary_json.get("total_changes", 0)
print(f"\nSummary: {changes} changes, {risk} risk")
# Set CI output variables (for use in subsequent jobs)
if os.environ.get("GITHUB_OUTPUT"):
with open(os.environ["GITHUB_OUTPUT"], "a") as f:
f.write(f"risk_level={risk}\n")
f.write(f"change_count={changes}\n")
elif os.environ.get("GITLAB_CI"):
# GitLab: write to dotenv artifact
with open("plan_output.env", "w") as f:
f.write(f"PLAN_RISK_LEVEL={risk}\n")
f.write(f"PLAN_CHANGE_COUNT={changes}\n")
if __name__ == "__main__":
main()

gitops/config.yml Normal file

@@ -0,0 +1,373 @@
# Phase 6 GitOps Configuration
# Cloudflare Mesh Observatory - PR Workflows
#
# This config drives:
# - Risk classification for Terraform changes
# - Drift PR generation
# - CI plan comments
# - Alertmanager → GitLab webhook triggers
---
# ==============================================================================
# GIT PLATFORM CONFIGURATION
# ==============================================================================
gitlab:
base_url: "${GITLAB_BASE_URL:-https://gitlab.com}"
project_id: "${GITLAB_PROJECT_ID}"
default_branch: "main"
# API settings
api_version: "v4"
timeout_seconds: 30
# GitHub alternative (uncomment if using GitHub)
# github:
# base_url: "https://api.github.com"
# owner: "your-org"
# repo: "cloudflare-infra"
# default_branch: "main"
# ==============================================================================
# TERRAFORM CONFIGURATION
# ==============================================================================
terraform:
working_dir: "terraform"
plan_file: "plan.tfplan"
state_file: "terraform.tfstate"
# Backend configuration hints (for plan summarizer)
backend_type: "local" # or "s3", "gcs", "azurerm", etc.
# Parallelism for plan operations
parallelism: 10
# ==============================================================================
# RISK CLASSIFICATION
# ==============================================================================
# Maps Cloudflare resource types to risk levels
# Used by plan_summarizer.py to score changes
risk:
# DNS changes - high blast radius
dns:
resource_types:
- "cloudflare_record"
- "cloudflare_zone"
- "cloudflare_zone_settings_override"
- "cloudflare_zone_dnssec"
base_risk: "high"
# WAF/Security changes - security-critical
waf:
resource_types:
- "cloudflare_waf_rule"
- "cloudflare_waf_package"
- "cloudflare_waf_group"
- "cloudflare_waf_override"
- "cloudflare_firewall_rule"
- "cloudflare_filter"
- "cloudflare_rate_limit"
- "cloudflare_zone_lockdown"
- "cloudflare_access_rule"
- "cloudflare_user_agent_blocking_rule"
base_risk: "high"
# Tunnel changes - connectivity-critical
tunnels:
resource_types:
- "cloudflare_tunnel"
- "cloudflare_tunnel_config"
- "cloudflare_tunnel_route"
- "cloudflare_argo_tunnel"
base_risk: "high"
# Access/Zero Trust - identity-critical
access:
resource_types:
- "cloudflare_access_application"
- "cloudflare_access_policy"
- "cloudflare_access_group"
- "cloudflare_access_identity_provider"
- "cloudflare_access_service_token"
- "cloudflare_access_ca_certificate"
- "cloudflare_access_mutual_tls_certificate"
- "cloudflare_teams_account"
- "cloudflare_teams_list"
- "cloudflare_teams_rule"
- "cloudflare_device_posture_rule"
- "cloudflare_device_posture_integration"
base_risk: "high"
# Performance/Caching - medium risk
performance:
resource_types:
- "cloudflare_page_rule"
- "cloudflare_tiered_cache"
- "cloudflare_cache_reserve"
- "cloudflare_regional_tiered_cache"
- "cloudflare_argo"
- "cloudflare_load_balancer"
- "cloudflare_load_balancer_pool"
- "cloudflare_load_balancer_monitor"
base_risk: "medium"
# Workers - code deployment
workers:
resource_types:
- "cloudflare_worker_script"
- "cloudflare_worker_route"
- "cloudflare_worker_cron_trigger"
- "cloudflare_workers_kv_namespace"
- "cloudflare_workers_kv"
base_risk: "medium"
# Certificates - availability-critical
certificates:
resource_types:
- "cloudflare_certificate_pack"
- "cloudflare_origin_ca_certificate"
- "cloudflare_authenticated_origin_pulls"
- "cloudflare_authenticated_origin_pulls_certificate"
base_risk: "high"
# Other/Low risk
other:
resource_types:
- "cloudflare_api_token"
- "cloudflare_logpush_job"
- "cloudflare_logpull_retention"
- "cloudflare_notification_policy"
- "cloudflare_notification_policy_webhooks"
base_risk: "low"
# Action-based risk modifiers
actions:
create:
modifier: 0 # Neutral - new resources
update:
modifier: 1 # +1 risk level
delete:
modifier: 2 # +2 risk levels (always dangerous)
replace:
modifier: 2 # Same as delete (destroy + create)
no-op:
modifier: -10 # Effectively ignore
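  # Example: deleting a cloudflare_record -> base "high" (2) + delete (+2) = 4,
  # capped at "critical" (3) by the levels mapping below.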
# Final risk level mapping
levels:
low: 0
medium: 1
high: 2
critical: 3
# ==============================================================================
# DRIFT PR CONFIGURATION
# ==============================================================================
drift_pr:
# Branch naming
branch_prefix: "drift/remediation-"
# MR/PR settings
title_prefix: "Drift Remediation"
labels:
- "drift"
- "terraform"
- "auto-generated"
# Auto-assign reviewers based on component
reviewer_mapping:
dns: ["dns-team"]
waf: ["security-team"]
tunnels: ["infra-team"]
access: ["security-team", "identity-team"]
default: ["platform-team"]
# Approval requirements by risk level
approvals_required:
low: 1
medium: 1
high: 2
critical: 2
# Auto-merge settings
auto_merge:
enabled: false
allowed_risk_levels: ["low"]
require_pipeline_success: true
# ==============================================================================
# CI PLAN COMMENT CONFIGURATION
# ==============================================================================
ci:
comment_header: "Terraform Plan Summary"
# What to include in comments
include:
risk_summary: true
resource_table: true
action_counts: true
affected_zones: true
compliance_flags: true
# Collapse large tables
collapse_threshold: 10
# Link to dashboards
dashboard_links:
grafana: "http://localhost:3000/d/cloudflare-overview"
prometheus: "http://localhost:9090"
# ==============================================================================
# ALERTMANAGER WEBHOOK INTEGRATION
# ==============================================================================
webhook:
# GitLab pipeline trigger
gitlab_trigger:
enabled: true
trigger_token: "${GITLAB_TRIGGER_TOKEN}"
ref: "main"
# Alerts that trigger drift remediation
trigger_alerts:
- "DNSDriftDetected"
- "WAFRuleMissing"
- "TunnelConfigChanged"
- "InvariantViolation"
- "FirewallRuleMissing"
# Alerts that only notify (no auto-PR)
notify_only_alerts:
- "DNSHijackDetected" # Security incident - manual only
- "ProofchainIntegrityFailure" # Never auto-remediate
- "WAFRuleBypass" # Needs investigation first
# ==============================================================================
# SLACK NOTIFICATIONS
# ==============================================================================
slack:
webhook_url: "${SLACK_WEBHOOK_URL}"
channel: "#cloudflare-gitops"
# Notification settings
notify_on:
pr_created: true
pr_merged: true
pr_failed: true
high_risk_plan: true
# Message templates
templates:
pr_created: |
*GitOps PR Created*
Title: {title}
Risk Level: {risk_level}
Changes: {change_count}
Link: {url}
pr_merged: |
*GitOps PR Merged*
Title: {title}
Merged by: {merged_by}
Applied changes: {change_count}
# ==============================================================================
# COMPLIANCE INTEGRATION
# ==============================================================================
compliance:
# Flag changes that affect compliance frameworks
frameworks:
- name: "SOC2"
triggers:
- resource_types: ["cloudflare_zone_settings_override"]
fields: ["ssl", "always_use_https", "min_tls_version"]
- resource_types: ["cloudflare_waf_rule"]
actions: ["delete"]
- name: "PCI-DSS"
triggers:
- resource_types: ["cloudflare_zone_settings_override"]
fields: ["min_tls_version"]
- resource_types: ["cloudflare_waf_*"]
actions: ["delete", "update"]
- name: "HIPAA"
triggers:
- resource_types: ["cloudflare_zone_settings_override"]
fields: ["ssl", "always_use_https"]
- resource_types: ["cloudflare_access_*"]
actions: ["delete"]
# Add compliance warnings to PR descriptions
add_warnings: true
# Block merge for compliance violations
block_on_violation: false # Set true for strict mode
# ==============================================================================
# PHASE 7: WAF INTELLIGENCE CONFIGURATION
# ==============================================================================
waf_intelligence:
# Enable/disable Phase 7 features
enabled: true
# Threat intelligence collection
threat_intel:
enabled: true
log_paths:
- "logs/cloudflare"
- "/var/log/cloudflare"
max_indicators: 100
min_hit_count: 3 # Minimum hits before flagging
# External threat feeds (optional)
external_feeds:
abuseipdb:
enabled: false
api_key: "${ABUSEIPDB_API_KEY}"
min_abuse_score: 80
emerging_threats:
enabled: false
feed_url: "https://rules.emergingthreats.net/blockrules/compromised-ips.txt"
# ML classifier settings
classifier:
enabled: true
min_confidence: 0.7
sample_limit: 50
# Attack type detection
detect_types:
- sqli
- xss
- rce
- path_traversal
- scanner
# Rule proposal settings
proposals:
max_per_batch: 10
auto_deploy_min_confidence: 0.85
auto_deploy_severities:
- critical
- high
require_review_severities:
- medium
- low
# GitOps integration for WAF rules
gitops:
create_mrs: true
branch_prefix: "waf-intel/"
labels:
- "waf-intelligence"
- "auto-generated"
- "security"
reviewers:
- "security-team"
# Auto-merge high-confidence critical blocks
auto_merge:
enabled: false
min_confidence: 0.95
allowed_severities:
- critical

gitops/drift_pr_bot.py Normal file

@@ -0,0 +1,466 @@
#!/usr/bin/env python3
"""
Drift Remediation PR Bot for Cloudflare GitOps
Phase 6 - PR Workflows
Creates Merge Requests when Terraform drift is detected.
Can be triggered by:
- Alertmanager webhooks
- Scheduled CI jobs
- Manual invocation
"""
import json
import os
import subprocess
import sys
import textwrap
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, List, Optional
try:
import requests
import yaml
except ImportError:
print("ERROR: pip install requests pyyaml", file=sys.stderr)
sys.exit(1)
HERE = Path(__file__).resolve().parent
CONFIG_PATH = HERE / "config.yml"
def load_config() -> Dict[str, Any]:
"""Load gitops configuration with env expansion"""
with open(CONFIG_PATH) as f:
config = yaml.safe_load(f)
def expand_env(obj):
if isinstance(obj, str):
if obj.startswith("${") and "}" in obj:
# Handle ${VAR:-default} syntax
inner = obj[2:obj.index("}")]
default = None
var = inner
if ":-" in inner:
var, default = inner.split(":-", 1)
return os.environ.get(var, default)
return obj
elif isinstance(obj, dict):
return {k: expand_env(v) for k, v in obj.items()}
elif isinstance(obj, list):
return [expand_env(i) for i in obj]
return obj
return expand_env(config)
def run_cmd(cmd: List[str], cwd: Optional[Path] = None, check: bool = True,
capture: bool = False) -> subprocess.CompletedProcess:
"""Run a shell command"""
print(f"+ {' '.join(cmd)}")
return subprocess.run(
cmd,
cwd=cwd,
check=check,
text=True,
capture_output=capture,
)
class GitLabClient:
"""GitLab API client"""
def __init__(self, base_url: str, project_id: str, token: str):
self.base_url = base_url.rstrip("/")
self.project_id = project_id
self.token = token
self.headers = {"PRIVATE-TOKEN": token}
def create_branch(self, branch: str, ref: str) -> Dict:
"""Create a new branch"""
url = f"{self.base_url}/api/v4/projects/{self.project_id}/repository/branches"
resp = requests.post(
url,
headers=self.headers,
data={"branch": branch, "ref": ref},
)
resp.raise_for_status()
return resp.json()
def create_merge_request(
self,
source_branch: str,
target_branch: str,
title: str,
description: str,
labels: Optional[List[str]] = None,
reviewers: Optional[List[str]] = None,
remove_source_branch: bool = True,
) -> Dict:
"""Create a merge request"""
url = f"{self.base_url}/api/v4/projects/{self.project_id}/merge_requests"
data = {
"source_branch": source_branch,
"target_branch": target_branch,
"title": title,
"description": description,
"remove_source_branch": remove_source_branch,
}
if labels:
data["labels"] = ",".join(labels)
if reviewers:
# Note: reviewers need to be user IDs, not usernames
data["reviewer_ids"] = reviewers
resp = requests.post(url, headers=self.headers, data=data)
resp.raise_for_status()
return resp.json()
def trigger_pipeline(self, ref: str, token: str, variables: Optional[Dict] = None) -> Dict:
"""Trigger a pipeline"""
url = f"{self.base_url}/api/v4/projects/{self.project_id}/trigger/pipeline"
data = {"ref": ref, "token": token}
if variables:
for k, v in variables.items():
data[f"variables[{k}]"] = v
resp = requests.post(url, data=data)
resp.raise_for_status()
return resp.json()
class GitHubClient:
"""GitHub API client (alternative to GitLab)"""
def __init__(self, owner: str, repo: str, token: str):
self.base_url = "https://api.github.com"
self.owner = owner
self.repo = repo
self.headers = {
"Authorization": f"token {token}",
"Accept": "application/vnd.github.v3+json",
}
def create_pull_request(
self,
head: str,
base: str,
title: str,
body: str,
labels: Optional[List[str]] = None,
) -> Dict:
"""Create a pull request"""
url = f"{self.base_url}/repos/{self.owner}/{self.repo}/pulls"
data = {
"head": head,
"base": base,
"title": title,
"body": body,
}
resp = requests.post(url, headers=self.headers, json=data)
resp.raise_for_status()
pr = resp.json()
# Add labels if specified
if labels:
labels_url = f"{self.base_url}/repos/{self.owner}/{self.repo}/issues/{pr['number']}/labels"
requests.post(labels_url, headers=self.headers, json={"labels": labels})
return pr
def run_terraform_plan(tf_dir: Path, plan_file: str) -> tuple[bool, str]:
"""
Run terraform plan and return (has_changes, plan_output)
Uses -detailed-exitcode: 0=no changes, 1=error, 2=changes
"""
# Initialize
run_cmd(["terraform", "init", "-input=false"], cwd=tf_dir)
# Plan with detailed exit code
result = run_cmd(
[
"terraform", "plan",
"-input=false",
"-no-color",
"-out", plan_file,
"-detailed-exitcode",
],
cwd=tf_dir,
check=False,
capture=True,
)
if result.returncode == 0:
return False, result.stdout
elif result.returncode == 2:
return True, result.stdout
else:
print(f"Terraform plan failed:\n{result.stderr}", file=sys.stderr)
sys.exit(1)
def get_plan_summary(cfg: Dict[str, Any]) -> tuple[str, Dict]:
"""Run plan_summarizer and get markdown + json"""
result = run_cmd(
["python3", "plan_summarizer.py", "--format", "markdown"],
cwd=HERE,
capture=True,
)
markdown = result.stdout
result = run_cmd(
["python3", "plan_summarizer.py", "--format", "json"],
cwd=HERE,
capture=True,
)
summary_json = json.loads(result.stdout)
return markdown, summary_json
def get_reviewers(cfg: Dict[str, Any], summary: Dict) -> List[str]:
"""Determine reviewers based on affected categories"""
drift_cfg = cfg.get("drift_pr", {})
reviewer_mapping = drift_cfg.get("reviewer_mapping", {})
reviewers = set()
by_category = summary.get("by_category", {})
for category in by_category.keys():
if category in reviewer_mapping:
reviewers.update(reviewer_mapping[category])
# Add default reviewers
if not reviewers and "default" in reviewer_mapping:
reviewers.update(reviewer_mapping["default"])
return list(reviewers)
def notify_slack(cfg: Dict[str, Any], title: str, url: str, risk: str, changes: int):
"""Send Slack notification about created PR"""
slack_cfg = cfg.get("slack", {})
webhook_url = slack_cfg.get("webhook_url")
if not webhook_url or not slack_cfg.get("notify_on", {}).get("pr_created"):
return
template = slack_cfg.get("templates", {}).get("pr_created", "PR Created: {title}")
message = template.format(
title=title,
url=url,
risk_level=risk,
change_count=changes,
)
# Send to Slack
payload = {
"channel": slack_cfg.get("channel", "#cloudflare-gitops"),
"text": message,
"attachments": [
{
"color": {"LOW": "good", "MEDIUM": "warning", "HIGH": "danger", "CRITICAL": "danger"}.get(risk, "#808080"),
"fields": [
{"title": "Risk Level", "value": risk, "short": True},
{"title": "Changes", "value": str(changes), "short": True},
],
"actions": [
{
"type": "button",
"text": "View MR",
"url": url,
}
],
}
],
}
try:
requests.post(webhook_url, json=payload, timeout=10)
except Exception as e:
print(f"Slack notification failed: {e}", file=sys.stderr)
def create_mr_description(
cfg: Dict[str, Any],
summary_md: str,
summary_json: Dict,
trigger_source: str = "scheduled",
) -> str:
"""Generate MR description"""
drift_cfg = cfg.get("drift_pr", {})
title_prefix = drift_cfg.get("title_prefix", "Drift Remediation")
compliance = summary_json.get("compliance_violations", [])
compliance_warning = ""
if compliance:
frameworks = ", ".join(compliance)
compliance_warning = f"""
> **Compliance Notice:** This change affects the following frameworks: {frameworks}
> Please ensure appropriate review and approval processes are followed.
"""
return textwrap.dedent(f"""
## {title_prefix}
Detected by Phase 6 GitOps automation.
**Trigger:** {trigger_source}
**Timestamp:** {datetime.utcnow().isoformat()}Z
{compliance_warning}
---
{summary_md}
---
## Review Checklist
- [ ] Verified changes match expected drift
- [ ] No conflicting manual changes in Cloudflare dashboard
- [ ] Compliance requirements satisfied
- [ ] Tested in staging (if applicable)
## Notes
- This MR was auto-generated by the GitOps drift remediation bot
- Please review especially **HIGH** and **CRITICAL** risk resources
- Apply only after confirming no conflicting manual changes
---
*Generated by Cloudflare Mesh Observatory - Phase 6 GitOps*
""").strip()
def main():
"""Main entry point"""
import argparse
parser = argparse.ArgumentParser(
description="Create drift remediation MR"
)
parser.add_argument(
"--dry-run",
action="store_true",
default=os.environ.get("GITOPS_DRY_RUN", "false").lower() == "true",
help="Don't actually create MR",
)
parser.add_argument(
"--trigger-source",
default=os.environ.get("GITOPS_TRIGGER_SOURCE", "scheduled"),
help="What triggered this run (alert, scheduled, manual)",
)
parser.add_argument(
"--alert-name",
help="Name of alert that triggered this (for alert triggers)",
)
args = parser.parse_args()
# Load config
cfg = load_config()
tf_cfg = cfg.get("terraform", {})
gitlab_cfg = cfg.get("gitlab", {})
drift_cfg = cfg.get("drift_pr", {})
# Paths
tf_dir = HERE.parent / tf_cfg.get("working_dir", "terraform")
plan_file = tf_cfg.get("plan_file", "plan.tfplan")
# Check for changes
print("Running terraform plan...")
has_changes, plan_output = run_terraform_plan(tf_dir, plan_file)
if not has_changes:
print("No changes detected. Nothing to do.")
return
print("Changes detected. Generating summary...")
summary_md, summary_json = get_plan_summary(cfg)
# Generate branch name and title
now = datetime.utcnow().strftime("%Y-%m-%dT%H%M%SZ")
branch_prefix = drift_cfg.get("branch_prefix", "drift/remediation-")
branch = f"{branch_prefix}{now}"
title_prefix = drift_cfg.get("title_prefix", "Drift Remediation")
title = f"{title_prefix}: {now}"
# Get trigger info
trigger_source = args.trigger_source
if args.alert_name:
trigger_source = f"Alert: {args.alert_name}"
# Generate description
description = create_mr_description(cfg, summary_md, summary_json, trigger_source)
# Get reviewers
reviewers = get_reviewers(cfg, summary_json)
labels = drift_cfg.get("labels", ["drift", "terraform"])
if args.dry_run:
print("\n" + "=" * 60)
print("[DRY RUN] Would create MR:")
print(f" Branch: {branch}")
print(f" Title: {title}")
print(f" Labels: {labels}")
print(f" Reviewers: {reviewers}")
print(f" Risk: {summary_json.get('overall_risk')}")
print(f" Changes: {summary_json.get('total_changes')}")
print("=" * 60)
print("\nDescription:")
print(description)
return
# Create MR via GitLab API
base_url = gitlab_cfg.get("base_url", os.environ.get("GITLAB_BASE_URL", "https://gitlab.com"))
project_id = gitlab_cfg.get("project_id", os.environ.get("GITLAB_PROJECT_ID"))
token = os.environ.get("GITLAB_TOKEN")
default_branch = gitlab_cfg.get("default_branch", "main")
if not project_id or not token:
print("ERROR: GITLAB_PROJECT_ID and GITLAB_TOKEN required", file=sys.stderr)
sys.exit(1)
client = GitLabClient(base_url, project_id, token)
print(f"Creating branch {branch}...")
try:
client.create_branch(branch, default_branch)
except requests.HTTPError as e:
if e.response.status_code == 400: # Branch exists
print(f"Branch {branch} already exists, using it")
else:
raise
print(f"Creating MR: {title}")
mr = client.create_merge_request(
source_branch=branch,
target_branch=default_branch,
title=title,
description=description,
labels=labels,
remove_source_branch=True,
)
mr_url = mr.get("web_url", "")
print(f"\nCreated MR: {mr_url}")
# Notify Slack
notify_slack(
cfg,
title=title,
url=mr_url,
risk=summary_json.get("overall_risk", "UNKNOWN"),
changes=summary_json.get("total_changes", 0),
)
print("\nDone!")
if __name__ == "__main__":
main()

gitops/plan_summarizer.py Normal file

@@ -0,0 +1,487 @@
#!/usr/bin/env python3
"""
Terraform Plan Summarizer for Cloudflare GitOps
Phase 6 - PR Workflows
Parses terraform plan JSON output and generates:
- Risk-scored change summaries
- Markdown reports for MR comments
- Compliance violation flags
- Affected zone analysis
"""
import json
import os
import subprocess
import sys
from dataclasses import dataclass, field
from enum import IntEnum
from fnmatch import fnmatch
from pathlib import Path
from typing import Any, Dict, List, Optional, Set
try:
import yaml
except ImportError:
print("ERROR: pip install pyyaml", file=sys.stderr)
sys.exit(1)
HERE = Path(__file__).resolve().parent
CONFIG_PATH = HERE / "config.yml"
class RiskLevel(IntEnum):
"""Risk levels for changes"""
LOW = 0
MEDIUM = 1
HIGH = 2
CRITICAL = 3
@classmethod
def from_string(cls, s: str) -> "RiskLevel":
return cls[s.upper()]
def __str__(self) -> str:
return self.name
@dataclass
class ResourceChange:
"""Represents a single resource change from terraform plan"""
address: str
resource_type: str
name: str
actions: List[str]
before: Optional[Dict[str, Any]] = None
after: Optional[Dict[str, Any]] = None
risk_level: RiskLevel = RiskLevel.LOW
category: str = "other"
compliance_flags: List[str] = field(default_factory=list)
@dataclass
class PlanSummary:
"""Aggregated plan summary"""
total_changes: int = 0
by_action: Dict[str, int] = field(default_factory=dict)
by_risk: Dict[str, int] = field(default_factory=dict)
by_category: Dict[str, int] = field(default_factory=dict)
changes: List[ResourceChange] = field(default_factory=list)
affected_zones: Set[str] = field(default_factory=set)
compliance_violations: List[str] = field(default_factory=list)
overall_risk: RiskLevel = RiskLevel.LOW
def load_config() -> Dict[str, Any]:
"""Load gitops configuration"""
if not CONFIG_PATH.exists():
raise FileNotFoundError(f"Config not found: {CONFIG_PATH}")
with open(CONFIG_PATH) as f:
config = yaml.safe_load(f)
# Expand environment variables
def expand_env(obj):
if isinstance(obj, str):
if obj.startswith("${") and obj.endswith("}"):
var = obj[2:-1]
default = None
if ":-" in var:
var, default = var.split(":-", 1)
return os.environ.get(var, default)
return obj
elif isinstance(obj, dict):
return {k: expand_env(v) for k, v in obj.items()}
elif isinstance(obj, list):
return [expand_env(i) for i in obj]
return obj
return expand_env(config)
def run_terraform_show(plan_path: Path, tf_dir: Path) -> Dict[str, Any]:
"""Run terraform show -json on plan file"""
result = subprocess.run(
["terraform", "show", "-json", str(plan_path)],
cwd=tf_dir,
capture_output=True,
text=True,
)
if result.returncode != 0:
print(f"terraform show failed: {result.stderr}", file=sys.stderr)
sys.exit(1)
return json.loads(result.stdout)
def get_resource_category(cfg: Dict[str, Any], resource_type: str) -> tuple[str, RiskLevel]:
"""Determine category and base risk for a resource type"""
risk_cfg = cfg.get("risk", {})
for category, cat_cfg in risk_cfg.items():
if category in ("actions", "levels"):
continue
resource_types = cat_cfg.get("resource_types", [])
for pattern in resource_types:
if fnmatch(resource_type, pattern):
base_risk = cat_cfg.get("base_risk", "low")
return category, RiskLevel.from_string(base_risk)
return "other", RiskLevel.LOW
def calculate_risk(
cfg: Dict[str, Any],
resource_type: str,
actions: List[str],
) -> tuple[str, RiskLevel]:
"""Calculate risk level for a change"""
category, base_risk = get_resource_category(cfg, resource_type)
risk_cfg = cfg.get("risk", {})
actions_cfg = risk_cfg.get("actions", {})
# Find highest action modifier
max_modifier = 0
for action in actions:
action_cfg = actions_cfg.get(action, {})
modifier = action_cfg.get("modifier", 0)
max_modifier = max(max_modifier, modifier)
# Calculate final risk
final_risk_value = min(base_risk.value + max_modifier, RiskLevel.CRITICAL.value)
final_risk = RiskLevel(final_risk_value)
return category, final_risk
def check_compliance(
cfg: Dict[str, Any],
resource_type: str,
actions: List[str],
before: Optional[Dict],
after: Optional[Dict],
) -> List[str]:
"""Check for compliance framework violations"""
violations = []
compliance_cfg = cfg.get("compliance", {})
frameworks = compliance_cfg.get("frameworks", [])
for framework in frameworks:
name = framework.get("name", "Unknown")
triggers = framework.get("triggers", [])
for trigger in triggers:
trigger_types = trigger.get("resource_types", [])
trigger_actions = trigger.get("actions", [])
trigger_fields = trigger.get("fields", [])
# Check resource type match
type_match = any(fnmatch(resource_type, t) for t in trigger_types)
if not type_match:
continue
# Check action match (if specified)
if trigger_actions and not any(a in trigger_actions for a in actions):
continue
# Check field changes (if specified)
if trigger_fields and before and after:
field_changed = any(
before.get(f) != after.get(f)
for f in trigger_fields
)
if not field_changed:
continue
violations.append(name)
return list(set(violations))
def extract_zone(change: ResourceChange) -> Optional[str]:
"""Extract zone name from resource if available"""
# Check after state first, then before
state = change.after or change.before or {}
# Common zone identifiers
for key in ("zone", "zone_id", "zone_name"):
if key in state:
return str(state[key])
# Try to extract from address
if "zone" in change.address.lower():
parts = change.address.split(".")
for i, part in enumerate(parts):
if "zone" in part.lower() and i + 1 < len(parts):
return parts[i + 1]
return None
def parse_plan(plan_json: Dict[str, Any], cfg: Dict[str, Any]) -> PlanSummary:
"""Parse terraform plan JSON into summary"""
summary = PlanSummary()
resource_changes = plan_json.get("resource_changes", [])
for rc in resource_changes:
change = rc.get("change", {})
actions = change.get("actions", [])
# Skip no-op changes
if actions == ["no-op"]:
continue
resource_type = rc.get("type", "unknown")
address = rc.get("address", "unknown")
name = rc.get("name", "unknown")
before = change.get("before")
after = change.get("after")
# Calculate risk
category, risk_level = calculate_risk(cfg, resource_type, actions)
# Check compliance
compliance_flags = check_compliance(
cfg, resource_type, actions, before, after
)
resource_change = ResourceChange(
address=address,
resource_type=resource_type,
name=name,
actions=actions,
before=before,
after=after,
risk_level=risk_level,
category=category,
compliance_flags=compliance_flags,
)
summary.changes.append(resource_change)
# Update counts
summary.total_changes += 1
for action in actions:
summary.by_action[action] = summary.by_action.get(action, 0) + 1
risk_name = str(risk_level)
summary.by_risk[risk_name] = summary.by_risk.get(risk_name, 0) + 1
summary.by_category[category] = summary.by_category.get(category, 0) + 1
# Track zones
zone = extract_zone(resource_change)
if zone:
summary.affected_zones.add(zone)
# Track compliance
summary.compliance_violations.extend(compliance_flags)
# Calculate overall risk
if summary.by_risk.get("CRITICAL", 0) > 0:
summary.overall_risk = RiskLevel.CRITICAL
elif summary.by_risk.get("HIGH", 0) > 0:
summary.overall_risk = RiskLevel.HIGH
elif summary.by_risk.get("MEDIUM", 0) > 0:
summary.overall_risk = RiskLevel.MEDIUM
else:
summary.overall_risk = RiskLevel.LOW
# Deduplicate compliance
summary.compliance_violations = list(set(summary.compliance_violations))
return summary
def format_markdown(summary: PlanSummary, cfg: Dict[str, Any]) -> str:
"""Format summary as Markdown for MR comments"""
ci_cfg = cfg.get("ci", {})
include = ci_cfg.get("include", {})
collapse_threshold = ci_cfg.get("collapse_threshold", 10)
lines = []
# Header with risk badge
risk_emoji = {
RiskLevel.LOW: "🟢",
RiskLevel.MEDIUM: "🟡",
RiskLevel.HIGH: "🟠",
RiskLevel.CRITICAL: "🔴",
}
emoji = risk_emoji.get(summary.overall_risk, "")
lines.append(f"## {emoji} Terraform Plan Summary")
lines.append("")
# Risk summary
if include.get("risk_summary", True):
lines.append(f"**Overall Risk:** {emoji} **{summary.overall_risk}**")
lines.append(f"**Total Changes:** `{summary.total_changes}`")
lines.append("")
# Action counts
if include.get("action_counts", True):
actions_str = ", ".join(
f"{k}={v}" for k, v in sorted(summary.by_action.items())
)
lines.append(f"**Actions:** {actions_str}")
lines.append("")
# Category breakdown
if summary.by_category:
lines.append("**By Category:**")
for cat, count in sorted(summary.by_category.items()):
lines.append(f"- {cat}: {count}")
lines.append("")
# Affected zones
if include.get("affected_zones", True) and summary.affected_zones:
zones = ", ".join(f"`{z}`" for z in sorted(summary.affected_zones))
lines.append(f"**Affected Zones:** {zones}")
lines.append("")
# Compliance flags
if include.get("compliance_flags", True) and summary.compliance_violations:
lines.append("**Compliance Impact:**")
for framework in sorted(set(summary.compliance_violations)):
lines.append(f"- ⚠️ {framework}")
lines.append("")
# Resource table
if include.get("resource_table", True) and summary.changes:
lines.append("### Resource Changes")
lines.append("")
# Collapse if many changes
if len(summary.changes) > collapse_threshold:
lines.append("<details>")
lines.append(f"<summary>Show {len(summary.changes)} changes</summary>")
lines.append("")
lines.append("| Resource | Actions | Risk | Compliance |")
lines.append("|----------|---------|------|------------|")
# Sort by risk (highest first)
sorted_changes = sorted(
summary.changes,
key=lambda c: c.risk_level.value,
reverse=True,
)
for change in sorted_changes[:50]: # Cap at 50
actions = ",".join(change.actions)
risk = str(change.risk_level)
compliance = ",".join(change.compliance_flags) if change.compliance_flags else "-"
lines.append(
f"| `{change.address}` | `{actions}` | **{risk}** | {compliance} |"
)
if len(summary.changes) > 50:
lines.append("")
lines.append(f"_... {len(summary.changes) - 50} more resources omitted_")
if len(summary.changes) > collapse_threshold:
lines.append("")
lines.append("</details>")
lines.append("")
# Dashboard links
dashboard_links = ci_cfg.get("dashboard_links", {})
if dashboard_links:
lines.append("### Quick Links")
for name, url in dashboard_links.items():
lines.append(f"- [{name.title()}]({url})")
lines.append("")
return "\n".join(lines)
def format_json(summary: PlanSummary) -> str:
"""Format summary as JSON for programmatic use"""
return json.dumps(
{
"total_changes": summary.total_changes,
"overall_risk": str(summary.overall_risk),
"by_action": summary.by_action,
"by_risk": summary.by_risk,
"by_category": summary.by_category,
"affected_zones": list(summary.affected_zones),
"compliance_violations": summary.compliance_violations,
"changes": [
{
"address": c.address,
"resource_type": c.resource_type,
"actions": c.actions,
"risk_level": str(c.risk_level),
"category": c.category,
"compliance_flags": c.compliance_flags,
}
for c in summary.changes
],
},
indent=2,
)
def main():
"""Main entry point"""
import argparse
parser = argparse.ArgumentParser(
description="Summarize Terraform plan for GitOps"
)
parser.add_argument(
"--plan-file",
help="Path to plan file (default: from config)",
)
parser.add_argument(
"--plan-json",
help="Path to pre-generated plan JSON (skip terraform show)",
)
parser.add_argument(
"--format",
choices=["markdown", "json"],
default="markdown",
help="Output format",
)
parser.add_argument(
"--tf-dir",
help="Terraform working directory",
)
args = parser.parse_args()
# Load config
cfg = load_config()
tf_cfg = cfg.get("terraform", {})
# Determine paths
tf_dir = Path(args.tf_dir) if args.tf_dir else HERE.parent / tf_cfg.get("working_dir", "terraform")
plan_file = args.plan_file or tf_cfg.get("plan_file", "plan.tfplan")
plan_path = tf_dir / plan_file
# Get plan JSON
if args.plan_json:
with open(args.plan_json) as f:
plan_json = json.load(f)
else:
plan_json = run_terraform_show(plan_path, tf_dir)
# Parse and summarize
summary = parse_plan(plan_json, cfg)
# Output
if args.format == "json":
print(format_json(summary))
else:
print(format_markdown(summary, cfg))
if __name__ == "__main__":
main()

gitops/waf_rule_proposer.py Normal file

@@ -0,0 +1,565 @@
#!/usr/bin/env python3
"""
Phase 7: WAF Rule Proposer for GitOps Integration
Generates Terraform WAF rules based on:
- Threat intelligence indicators
- ML classification results
- Compliance requirements
- Existing rule gaps
Integrates with Phase 6 GitOps to create automated MRs.
"""
from __future__ import annotations
import json
import os
import re
from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, List, Optional, Set
# Import sibling modules
import sys
sys.path.insert(0, str(Path(__file__).parent.parent.parent))
# Type imports with fallbacks for standalone testing
_HAS_WAF_INTEL = False
try:
from mcp.waf_intelligence.threat_intel import ThreatIndicator, ThreatIntelReport
from mcp.waf_intelligence.classifier import ClassificationResult, ThreatClassifier
from mcp.waf_intelligence.generator import GeneratedRule, WAFRuleGenerator
from mcp.waf_intelligence.compliance import ComplianceMapper, FrameworkMapping
_HAS_WAF_INTEL = True
except ImportError:
pass
# TYPE_CHECKING block for type hints when modules unavailable
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from mcp.waf_intelligence.threat_intel import ThreatIndicator, ThreatIntelReport
from mcp.waf_intelligence.classifier import ClassificationResult, ThreatClassifier
@dataclass
class RuleProposal:
"""A proposed WAF rule with full context for GitOps review."""
rule_name: str
rule_type: str # "ip_block", "pattern_block", "rate_limit", "managed_rule"
terraform_code: str
severity: str # "low", "medium", "high", "critical"
confidence: float
justification: str
threat_indicators: List[str] = field(default_factory=list)
compliance_refs: List[str] = field(default_factory=list)
estimated_impact: str = ""
auto_deploy_eligible: bool = False
tags: List[str] = field(default_factory=list)
def to_markdown(self) -> str:
"""Render proposal as Markdown for MR description."""
emoji = {"critical": "🔴", "high": "🟠", "medium": "🟡", "low": "🟢"}.get(self.severity, "")
md = f"""### {emoji} {self.rule_name}
**Type:** `{self.rule_type}` | **Severity:** `{self.severity}` | **Confidence:** `{self.confidence:.0%}`
**Justification:**
{self.justification}
**Compliance:** {', '.join(self.compliance_refs) or 'N/A'}
**Estimated Impact:** {self.estimated_impact or 'Unknown'}
<details>
<summary>Terraform Code</summary>
```hcl
{self.terraform_code}
```
</details>
**Tags:** {', '.join(f'`{t}`' for t in self.tags) or 'None'}
---
"""
return md
@dataclass
class ProposalBatch:
"""Batch of rule proposals for a single MR."""
proposals: List[RuleProposal] = field(default_factory=list)
generated_at: datetime = field(default_factory=datetime.utcnow)
source_report: Optional[str] = None
metadata: Dict[str, Any] = field(default_factory=dict)
@property
def critical_count(self) -> int:
return sum(1 for p in self.proposals if p.severity == "critical")
@property
def auto_deployable(self) -> List[RuleProposal]:
return [p for p in self.proposals if p.auto_deploy_eligible]
def to_markdown(self) -> str:
"""Generate full MR description."""
header = f"""# WAF Rule Proposals - Phase 7 Intelligence
**Generated:** {self.generated_at.strftime('%Y-%m-%d %H:%M:%S UTC')}
**Total Proposals:** {len(self.proposals)}
**Critical:** {self.critical_count}
**Auto-Deploy Eligible:** {len(self.auto_deployable)}
---
## Summary
| Rule | Type | Severity | Confidence | Auto-Deploy |
|------|------|----------|------------|-------------|
"""
for p in self.proposals:
auto = "" if p.auto_deploy_eligible else ""
header += f"| {p.rule_name} | {p.rule_type} | {p.severity} | {p.confidence:.0%} | {auto} |\n"
header += "\n---\n\n## Detailed Proposals\n\n"
for p in self.proposals:
header += p.to_markdown() + "\n"
return header
def to_terraform_file(self) -> str:
"""Generate combined Terraform file."""
header = f"""# Auto-generated WAF rules from Phase 7 Intelligence
# Generated: {self.generated_at.strftime('%Y-%m-%d %H:%M:%S UTC')}
# Review carefully before applying
"""
return header + "\n\n".join(p.terraform_code for p in self.proposals)
class WAFRuleProposer:
"""
Generates WAF rule proposals from threat intelligence and ML analysis.
Usage:
proposer = WAFRuleProposer(workspace_path="/path/to/cloudflare")
batch = proposer.generate_proposals(threat_report)
print(batch.to_markdown())
"""
def __init__(
self,
workspace_path: Optional[str] = None,
zone_id_var: str = "var.zone_id",
account_id_var: str = "var.cloudflare_account_id",
):
self.workspace = Path(workspace_path) if workspace_path else Path.cwd()
self.zone_id_var = zone_id_var
self.account_id_var = account_id_var
# Initialize components only if available
self.classifier = None
self.rule_generator = None
self.compliance_mapper = None
if _HAS_WAF_INTEL:
try:
self.classifier = ThreatClassifier()
except Exception:
pass
try:
self.rule_generator = WAFRuleGenerator()
except Exception:
pass
try:
self.compliance_mapper = ComplianceMapper()
except Exception:
pass
# Auto-deploy thresholds
self.auto_deploy_min_confidence = 0.85
self.auto_deploy_severities = {"critical", "high"}
def generate_proposals(
self,
threat_report: Optional[Any] = None,
indicators: Optional[List[Any]] = None,
max_proposals: int = 10,
) -> ProposalBatch:
"""
Generate rule proposals from threat intelligence.
Args:
threat_report: Full threat intel report
indicators: Or just a list of indicators
max_proposals: Maximum number of proposals to generate
Returns:
ProposalBatch ready for GitOps MR
"""
proposals: List[RuleProposal] = []
# Get indicators from report or directly
if threat_report:
all_indicators = threat_report.indicators
elif indicators:
all_indicators = indicators
else:
all_indicators = []
# Group indicators by type
ip_indicators = [i for i in all_indicators if i.indicator_type == "ip"]
pattern_indicators = [i for i in all_indicators if i.indicator_type == "pattern"]
ua_indicators = [i for i in all_indicators if i.indicator_type == "ua"]
# Generate IP blocking rules
proposals.extend(self._generate_ip_rules(ip_indicators))
# Generate pattern-based rules
proposals.extend(self._generate_pattern_rules(pattern_indicators))
# Generate user-agent rules
proposals.extend(self._generate_ua_rules(ua_indicators))
# Generate managed rule recommendations
proposals.extend(self._generate_managed_rule_proposals(all_indicators))
# Sort by severity and confidence
severity_order = {"critical": 4, "high": 3, "medium": 2, "low": 1}
proposals.sort(
key=lambda p: (severity_order.get(p.severity, 0), p.confidence),
reverse=True
)
return ProposalBatch(
proposals=proposals[:max_proposals],
source_report=str(threat_report.collection_time) if threat_report else None,
metadata={
"total_indicators": len(all_indicators),
"ip_indicators": len(ip_indicators),
"pattern_indicators": len(pattern_indicators),
}
)
def _generate_ip_rules(self, indicators: List[Any]) -> List[RuleProposal]:
"""Generate IP blocking rules."""
proposals: List[RuleProposal] = []
# Group by severity
critical_ips = [i for i in indicators if i.severity == "critical"]
high_ips = [i for i in indicators if i.severity == "high"]
# Critical IPs - individual block rules
for ind in critical_ips[:5]: # Limit to top 5
rule_name = f"waf_block_ip_{ind.value.replace('.', '_')}"
terraform = self._ip_block_terraform(rule_name, [ind.value], "block")
proposals.append(RuleProposal(
rule_name=rule_name,
rule_type="ip_block",
terraform_code=terraform,
severity="critical",
confidence=ind.confidence,
justification=f"Critical threat actor IP detected. Sources: {', '.join(ind.sources)}. "
f"Hit count: {ind.hit_count}. {ind.context.get('abuse_score', 'N/A')} abuse score.",
threat_indicators=[ind.value],
compliance_refs=["Zero-Trust", "Threat Intelligence"],
estimated_impact="Blocks all traffic from this IP",
auto_deploy_eligible=ind.confidence >= self.auto_deploy_min_confidence,
tags=["auto-generated", "threat-intel", "ip-block"]
))
# Batch high-severity IPs into one rule
if high_ips:
ips = [i.value for i in high_ips[:20]] # Limit batch size
rule_name = "waf_block_high_risk_ips"
terraform = self._ip_block_terraform(rule_name, ips, "block")
avg_confidence = sum(i.confidence for i in high_ips[:20]) / len(high_ips[:20])
proposals.append(RuleProposal(
rule_name=rule_name,
rule_type="ip_block",
terraform_code=terraform,
severity="high",
confidence=avg_confidence,
justification=f"Batch block of {len(ips)} high-risk IPs from threat intelligence.",
threat_indicators=ips,
compliance_refs=["Zero-Trust", "Threat Intelligence"],
estimated_impact=f"Blocks traffic from {len(ips)} IPs",
auto_deploy_eligible=False, # Batch rules require manual review
tags=["auto-generated", "threat-intel", "ip-block", "batch"]
))
return proposals
def _generate_pattern_rules(self, indicators: List[Any]) -> List[RuleProposal]:
"""Generate pattern-based blocking rules."""
proposals: List[RuleProposal] = []
# Group by attack type
attack_types: Dict[str, List[Any]] = {}
for ind in indicators:
for tag in ind.tags:
if tag in ("sqli", "xss", "rce", "path_traversal"):
attack_types.setdefault(tag, []).append(ind)
# Generate rules per attack type
for attack_type, inds in attack_types.items():
if not inds:
continue
            # Cross-check with the ML classifier when available
            if self.classifier:
                # Classify a sample payload; if the model confidently
                # disagrees with the tag-derived attack type, discount the
                # heuristic confidence instead of dropping the rule outright.
                sample = inds[0].value[:500]
                result = self.classifier.classify(sample)
                if result.label != attack_type and result.confidence > 0.7:
                    confidence = min(i.confidence for i in inds) * 0.7
                else:
                    confidence = max(i.confidence for i in inds)
            else:
                confidence = max(i.confidence for i in inds)
rule_name = f"waf_protect_{attack_type}"
terraform = self._managed_rule_terraform(rule_name, attack_type)
severity = "critical" if attack_type in ("sqli", "rce") else "high"
proposals.append(RuleProposal(
rule_name=rule_name,
rule_type="managed_rule",
terraform_code=terraform,
severity=severity,
confidence=confidence,
justification=f"Detected {len(inds)} {attack_type.upper()} attack patterns in traffic. "
f"Enabling managed ruleset protection.",
threat_indicators=[ind.value[:100] for ind in inds[:3]],
compliance_refs=self._get_compliance_refs(attack_type),
estimated_impact=f"Blocks {attack_type.upper()} attacks via managed rules",
auto_deploy_eligible=confidence >= self.auto_deploy_min_confidence,
tags=["auto-generated", "threat-intel", attack_type, "managed-rules"]
))
return proposals
def _generate_ua_rules(self, indicators: List[Any]) -> List[RuleProposal]:
"""Generate user-agent blocking rules."""
proposals: List[RuleProposal] = []
scanner_uas = [i for i in indicators if "scanner" in i.tags or "bad_ua" in i.tags]
if scanner_uas:
            # Extract unique patterns (sorted so output is deterministic)
            patterns = sorted(set(i.value[:100] for i in scanner_uas))[:10]
rule_name = "waf_block_scanner_uas"
terraform = self._ua_block_terraform(rule_name, patterns)
proposals.append(RuleProposal(
rule_name=rule_name,
rule_type="pattern_block",
terraform_code=terraform,
severity="medium",
confidence=0.75,
justification=f"Blocking {len(patterns)} scanner/bot user agents detected in traffic.",
threat_indicators=patterns,
compliance_refs=["Bot Protection"],
estimated_impact="Blocks automated scanning tools",
auto_deploy_eligible=False,
tags=["auto-generated", "threat-intel", "scanner", "user-agent"]
))
return proposals
def _generate_managed_rule_proposals(
self,
indicators: List[Any]
) -> List[RuleProposal]:
"""Generate recommendations to enable managed rulesets."""
proposals: List[RuleProposal] = []
# Check for attack types that should have managed rules
attack_types_seen = set()
for ind in indicators:
for tag in ind.tags:
if tag in ("sqli", "xss", "rce", "path_traversal"):
attack_types_seen.add(tag)
# Check existing terraform for gaps
tf_path = self.workspace / "terraform" / "waf.tf"
existing_coverage = set()
if tf_path.exists():
try:
content = tf_path.read_text().lower()
for attack_type in ["sqli", "xss", "rce"]:
if attack_type in content or f'"{attack_type}"' in content:
existing_coverage.add(attack_type)
except Exception:
pass
# Propose missing protections
for attack_type in attack_types_seen - existing_coverage:
rule_name = f"waf_enable_{attack_type}_protection"
terraform = self._managed_rule_terraform(rule_name, attack_type)
proposals.append(RuleProposal(
rule_name=rule_name,
rule_type="managed_rule",
terraform_code=terraform,
severity="high",
confidence=0.9,
justification=f"Traffic shows {attack_type.upper()} attack patterns but no protection enabled. "
f"Recommend enabling Cloudflare managed {attack_type.upper()} ruleset.",
threat_indicators=[],
compliance_refs=self._get_compliance_refs(attack_type),
estimated_impact=f"Enables {attack_type.upper()} protection",
auto_deploy_eligible=True,
tags=["auto-generated", "gap-analysis", attack_type, "managed-rules"]
))
return proposals
def _ip_block_terraform(
self,
rule_name: str,
ips: List[str],
action: str = "block"
) -> str:
"""Generate Terraform for IP blocking rule."""
if len(ips) == 1:
expression = f'(ip.src eq {ips[0]})'
else:
ip_list = " ".join(ips)
expression = f'(ip.src in {{{ip_list}}})'
return f'''resource "cloudflare_ruleset" "{rule_name}" {{
zone_id = {self.zone_id_var}
name = "{rule_name.replace('_', ' ').title()}"
description = "Auto-generated by Phase 7 WAF Intelligence"
kind = "zone"
phase = "http_request_firewall_custom"
rules {{
action = "{action}"
expression = "{expression}"
description = "Block threat intel IPs"
enabled = true
}}
}}
'''
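    # The generated expression takes one of two shapes:
    #   single IP : (ip.src eq 192.0.2.100)
    #   batch     : (ip.src in {192.0.2.100 192.0.2.101})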
def _managed_rule_terraform(self, rule_name: str, attack_type: str) -> str:
"""Generate Terraform for managed ruleset."""
ruleset_map = {
"sqli": "efb7b8c949ac4650a09736fc376e9aee", # Cloudflare SQLi
"xss": "c2e184081120413c86c3ab7e14069605", # Cloudflare XSS
"rce": "4814384a9e5d4991b9815dcfc25d2f1f", # Cloudflare RCE (example)
}
ruleset_id = ruleset_map.get(attack_type, "efb7b8c949ac4650a09736fc376e9aee")
return f'''resource "cloudflare_ruleset" "{rule_name}" {{
zone_id = {self.zone_id_var}
name = "{attack_type.upper()} Protection"
description = "Managed {attack_type.upper()} protection - Phase 7 WAF Intelligence"
kind = "zone"
phase = "http_request_firewall_managed"
rules {{
action = "execute"
action_parameters {{
id = "{ruleset_id}"
}}
expression = "true"
description = "Enable {attack_type.upper()} managed ruleset"
enabled = true
}}
}}
'''
    def _ua_block_terraform(self, rule_name: str, patterns: List[str]) -> str:
        """Generate Terraform for user-agent blocking."""
        # Cloudflare's `contains` operator matches literal substrings, so
        # strip quotes/backslashes rather than regex-escaping, and build one
        # clause per detected pattern instead of a hard-coded scanner list.
        safe_patterns = [p[:50].replace('"', '').replace('\\', '') for p in patterns]
        clauses = " or ".join(
            f'http.user_agent contains \\"{p}\\"' for p in safe_patterns
        )
        return f'''resource "cloudflare_ruleset" "{rule_name}" {{
  zone_id = {self.zone_id_var}
  name = "Block Scanner User Agents"
  description = "Auto-generated by Phase 7 WAF Intelligence"
  kind = "zone"
  phase = "http_request_firewall_custom"
  rules {{
    action = "block"
    expression = "({clauses})"
    description = "Block known scanner user agents"
    enabled = true
  }}
}}
'''
def _get_compliance_refs(self, attack_type: str) -> List[str]:
"""Get compliance references for an attack type."""
refs = {
"sqli": ["PCI-DSS 6.6", "OWASP A03:2021"],
"xss": ["OWASP A07:2017", "CWE-79"],
"rce": ["OWASP A03:2021", "CWE-78"],
"path_traversal": ["CWE-22", "OWASP A01:2021"],
}
return refs.get(attack_type, [])
# CLI for testing
if __name__ == "__main__":
import sys
workspace = sys.argv[1] if len(sys.argv) > 1 else "."
# Create mock indicators for testing
mock_indicators = [
type("ThreatIndicator", (), {
"indicator_type": "ip",
"value": "192.0.2.100",
"severity": "critical",
"confidence": 0.95,
"sources": ["abuseipdb", "honeypot"],
"tags": ["threat-intel"],
"hit_count": 150,
"context": {"abuse_score": 95},
})(),
type("ThreatIndicator", (), {
"indicator_type": "pattern",
"value": "' OR '1'='1",
"severity": "high",
"confidence": 0.85,
"sources": ["log_analysis"],
"tags": ["sqli", "attack_pattern"],
"hit_count": 50,
"context": {},
})(),
type("ThreatIndicator", (), {
"indicator_type": "ua",
"value": "sqlmap/1.0",
"severity": "medium",
"confidence": 0.9,
"sources": ["log_analysis"],
"tags": ["scanner", "bad_ua"],
"hit_count": 25,
"context": {},
})(),
]
proposer = WAFRuleProposer(workspace_path=workspace)
batch = proposer.generate_proposals(indicators=mock_indicators)
print(batch.to_markdown())
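    # Example run (assuming this module is saved as waf_rule_proposer.py):
    #   python3 waf_rule_proposer.py ~/Desktop/CLOUDFLARE
    # Prints a markdown proposal batch built from the mock indicators above.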
373
gitops/webhook_receiver.py Normal file

@@ -0,0 +1,373 @@
#!/usr/bin/env python3
"""
Alertmanager Webhook Receiver for Cloudflare GitOps
Phase 6 - PR Workflows
Receives alerts from Alertmanager and triggers GitOps actions:
- Drift remediation PRs
- Pipeline triggers
- Slack notifications
"""
import hashlib
import hmac
import json
import os
import subprocess
import sys
from dataclasses import dataclass
from datetime import datetime, timezone
from http.server import HTTPServer, BaseHTTPRequestHandler
from pathlib import Path
from typing import Any, Dict, List, Optional
import threading
import queue
try:
import requests
import yaml
except ImportError:
print("ERROR: pip install requests pyyaml", file=sys.stderr)
sys.exit(1)
HERE = Path(__file__).resolve().parent
CONFIG_PATH = HERE / "config.yml"
# Job queue for background processing
job_queue: queue.Queue = queue.Queue()
def load_config() -> Dict[str, Any]:
"""Load gitops configuration"""
with open(CONFIG_PATH) as f:
config = yaml.safe_load(f)
def expand_env(obj):
if isinstance(obj, str):
if obj.startswith("${") and "}" in obj:
inner = obj[2:obj.index("}")]
default = None
var = inner
if ":-" in inner:
var, default = inner.split(":-", 1)
return os.environ.get(var, default)
return obj
elif isinstance(obj, dict):
return {k: expand_env(v) for k, v in obj.items()}
elif isinstance(obj, list):
return [expand_env(i) for i in obj]
return obj
return expand_env(config)
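# Supported value syntax (whole-string only), e.g.:
#   "${GITLAB_TOKEN}"       -> value of $GITLAB_TOKEN, or None if unset
#   "${WEBHOOK_PORT:-8080}" -> value of $WEBHOOK_PORT, falling back to "8080"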
@dataclass
class AlertPayload:
"""Parsed Alertmanager webhook payload"""
receiver: str
status: str # "firing" or "resolved"
alerts: List[Dict]
group_labels: Dict[str, str]
common_labels: Dict[str, str]
common_annotations: Dict[str, str]
external_url: str
version: str
group_key: str
@classmethod
def from_json(cls, data: Dict) -> "AlertPayload":
return cls(
receiver=data.get("receiver", ""),
status=data.get("status", ""),
alerts=data.get("alerts", []),
group_labels=data.get("groupLabels", {}),
common_labels=data.get("commonLabels", {}),
common_annotations=data.get("commonAnnotations", {}),
external_url=data.get("externalURL", ""),
version=data.get("version", "4"),
group_key=data.get("groupKey", ""),
)
@property
def alert_name(self) -> str:
return self.common_labels.get("alertname", "unknown")
@property
def severity(self) -> str:
return self.common_labels.get("severity", "unknown")
@property
def component(self) -> str:
return self.common_labels.get("component", "unknown")
def should_trigger_pr(cfg: Dict[str, Any], payload: AlertPayload) -> bool:
"""Determine if this alert should trigger a PR"""
webhook_cfg = cfg.get("webhook", {})
trigger_alerts = webhook_cfg.get("trigger_alerts", [])
notify_only = webhook_cfg.get("notify_only_alerts", [])
# Never auto-PR for resolved alerts
if payload.status == "resolved":
return False
# Check if in trigger list
if payload.alert_name in trigger_alerts:
return True
# Check if explicitly notify-only
if payload.alert_name in notify_only:
return False
# Default: don't trigger
return False
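# Illustrative config.yml snippet (alert names here are hypothetical):
#   webhook:
#     trigger_alerts:
#       - TerraformDriftDetected
#     notify_only_alerts:
#       - CloudflareAPILatencyHigh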
def trigger_gitlab_pipeline(cfg: Dict[str, Any], payload: AlertPayload) -> Optional[str]:
"""Trigger GitLab pipeline for drift remediation"""
gitlab_cfg = cfg.get("gitlab", {})
webhook_cfg = cfg.get("webhook", {}).get("gitlab_trigger", {})
if not webhook_cfg.get("enabled", False):
return None
base_url = gitlab_cfg.get("base_url", "https://gitlab.com")
project_id = gitlab_cfg.get("project_id")
trigger_token = webhook_cfg.get("trigger_token") or os.environ.get("GITLAB_TRIGGER_TOKEN")
ref = webhook_cfg.get("ref", "main")
if not project_id or not trigger_token:
print("GitLab trigger not configured", file=sys.stderr)
return None
url = f"{base_url}/api/v4/projects/{project_id}/trigger/pipeline"
data = {
"ref": ref,
"token": trigger_token,
"variables[GITOPS_TRIGGER_SOURCE]": "alert",
"variables[GITOPS_ALERT_NAME]": payload.alert_name,
"variables[GITOPS_ALERT_SEVERITY]": payload.severity,
"variables[GITOPS_ALERT_COMPONENT]": payload.component,
}
try:
resp = requests.post(url, data=data, timeout=30)
resp.raise_for_status()
result = resp.json()
return result.get("web_url")
except Exception as e:
print(f"Failed to trigger pipeline: {e}", file=sys.stderr)
return None
def run_drift_bot_locally(cfg: Dict[str, Any], payload: AlertPayload):
"""Run drift_pr_bot.py directly (for local webhook receiver)"""
env = os.environ.copy()
env["GITOPS_TRIGGER_SOURCE"] = "alert"
env["GITOPS_ALERT_NAME"] = payload.alert_name
subprocess.run(
["python3", "drift_pr_bot.py", "--trigger-source", "alert", "--alert-name", payload.alert_name],
cwd=HERE,
env=env,
)
def notify_slack(cfg: Dict[str, Any], message: str, alert: AlertPayload):
"""Send Slack notification"""
slack_cfg = cfg.get("slack", {})
webhook_url = slack_cfg.get("webhook_url")
if not webhook_url:
return
color = {
"critical": "danger",
"warning": "warning",
"info": "#439FE0",
}.get(alert.severity, "#808080")
payload = {
"channel": slack_cfg.get("channel", "#cloudflare-gitops"),
"attachments": [
{
"color": color,
"title": f"GitOps Alert: {alert.alert_name}",
"text": message,
"fields": [
{"title": "Status", "value": alert.status, "short": True},
{"title": "Severity", "value": alert.severity, "short": True},
{"title": "Component", "value": alert.component, "short": True},
],
"footer": "Cloudflare GitOps Webhook",
"ts": int(datetime.utcnow().timestamp()),
}
],
}
try:
requests.post(webhook_url, json=payload, timeout=10)
except Exception as e:
print(f"Slack notification failed: {e}", file=sys.stderr)
def process_alert(cfg: Dict[str, Any], payload: AlertPayload):
"""Process a single alert payload"""
print(f"Processing alert: {payload.alert_name} ({payload.status})")
# Check if we should trigger a PR
if should_trigger_pr(cfg, payload):
print(f"Alert {payload.alert_name} triggers drift remediation")
# Try GitLab pipeline trigger first
pipeline_url = trigger_gitlab_pipeline(cfg, payload)
if pipeline_url:
message = f"Triggered drift remediation pipeline: {pipeline_url}"
else:
# Fall back to local execution
print("Falling back to local drift_pr_bot execution")
run_drift_bot_locally(cfg, payload)
message = "Triggered local drift remediation"
notify_slack(cfg, message, payload)
else:
# Just notify
webhook_cfg = cfg.get("webhook", {})
notify_only = webhook_cfg.get("notify_only_alerts", [])
if payload.alert_name in notify_only:
message = f"Alert {payload.alert_name} received (notify-only, no auto-PR)"
notify_slack(cfg, message, payload)
def job_worker():
"""Background worker to process jobs"""
cfg = load_config()
while True:
try:
payload = job_queue.get(timeout=1)
if payload is None: # Shutdown signal
break
process_alert(cfg, payload)
except queue.Empty:
continue
except Exception as e:
print(f"Job processing error: {e}", file=sys.stderr)
class WebhookHandler(BaseHTTPRequestHandler):
"""HTTP handler for Alertmanager webhooks"""
    # Stateless handler: configuration is loaded once by the background
    # worker thread, so there is no per-request config parsing here.
def log_message(self, format, *args):
print(f"[{datetime.utcnow().isoformat()}] {format % args}")
def do_GET(self):
"""Health check endpoint"""
if self.path == "/health":
self.send_response(200)
self.send_header("Content-Type", "application/json")
self.end_headers()
self.wfile.write(json.dumps({"status": "ok"}).encode())
else:
self.send_response(404)
self.end_headers()
def do_POST(self):
"""Handle webhook POST"""
if self.path != "/webhook/alert":
self.send_response(404)
self.end_headers()
return
# Read body
content_length = int(self.headers.get("Content-Length", 0))
body = self.rfile.read(content_length)
# Verify signature if configured
secret = os.environ.get("WEBHOOK_SECRET")
if secret:
signature = self.headers.get("X-Webhook-Signature")
expected = hmac.new(
secret.encode(),
body,
hashlib.sha256
).hexdigest()
if not hmac.compare_digest(signature or "", expected):
self.send_response(403)
self.send_header("Content-Type", "application/json")
self.end_headers()
self.wfile.write(json.dumps({"error": "invalid signature"}).encode())
return
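        # A signed test request can compute the same digest, e.g.:
        #   hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
        # and send it in the X-Webhook-Signature header.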
# Parse payload
try:
data = json.loads(body)
payload = AlertPayload.from_json(data)
except Exception as e:
self.send_response(400)
self.send_header("Content-Type", "application/json")
self.end_headers()
self.wfile.write(json.dumps({"error": str(e)}).encode())
return
# Queue for processing
job_queue.put(payload)
# Respond immediately
self.send_response(202)
self.send_header("Content-Type", "application/json")
self.end_headers()
self.wfile.write(json.dumps({
"status": "accepted",
"alert": payload.alert_name,
}).encode())
def main():
"""Main entry point"""
import argparse
parser = argparse.ArgumentParser(
description="Alertmanager webhook receiver for GitOps"
)
parser.add_argument(
"--host",
default=os.environ.get("WEBHOOK_HOST", "0.0.0.0"),
help="Host to bind to",
)
parser.add_argument(
"--port",
type=int,
default=int(os.environ.get("WEBHOOK_PORT", "8080")),
help="Port to listen on",
)
args = parser.parse_args()
# Start worker thread
worker = threading.Thread(target=job_worker, daemon=True)
worker.start()
# Start server
server = HTTPServer((args.host, args.port), WebhookHandler)
print(f"GitOps webhook receiver listening on {args.host}:{args.port}")
print(f" POST /webhook/alert - Alertmanager webhook")
print(f" GET /health - Health check")
try:
server.serve_forever()
except KeyboardInterrupt:
print("\nShutting down...")
job_queue.put(None) # Signal worker to stop
server.shutdown()
if __name__ == "__main__":
main()
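# Smoke test (minimal hand-written payload; a real Alertmanager webhook
# sends more fields, and the alert name below is hypothetical):
#   curl -s -X POST http://localhost:8080/webhook/alert \
#     -H 'Content-Type: application/json' \
#     -d '{"status":"firing","commonLabels":{"alertname":"TerraformDriftDetected","severity":"critical"}}'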