# The Ouroboros Loop: Self-Correcting Security Architecture

**How Layer 0 Shadow learns from Layer 7 telemetry to improve itself**

---

## What is the Ouroboros Loop?

The **Ouroboros** (an ancient symbol of a snake eating its own tail) represents a self-referential, self-improving system. In Layer 0 Shadow, the Ouroboros loop is the mechanism by which **Layer 7 telemetry feeds back into Layer 0 risk heuristics**, creating a self-correcting security substrate that learns from actual usage patterns.

---

## The Loop Structure

```
Layer 7 (Telemetry)
    ↓
[Feedback Analysis]
    ↓
Layer 0 (Shadow Eval) ← [Improved Risk Heuristics]
    ↓
Layer 1 (Boot/Doctrine)
    ↓
Layer 2 (Routing)
    ↓
Layer 3 (MCP Tools)
    ↓
Layer 4 (Guardrails)
    ↓
Layer 5 (Terraform)
    ↓
Layer 6 (GitOps)
    ↓
Layer 7 (Telemetry) ← [Back to start]
```

**The cycle repeats:** Each query flows through all layers, and Layer 7's telemetry informs Layer 0's future classifications.

---

## How It Works: Step by Step

### Phase 1: Initial Query (Layer 0)

**Query:** "add a WAF rule to block bots"

**Layer 0 Evaluation:**

```python
# Current heuristics (initial state), shown as plain substring checks
query = "add a WAF rule to block bots"

if "disable guardrails" in query:
    classification = "CATASTROPHIC"
elif "skip git" in query or "dashboard" in query:
    classification = "FORBIDDEN"
# ... other patterns
else:
    classification = "BLESSED"  # no violations detected

# This query: "add a WAF rule to block bots"
# Classification: BLESSED (no violations detected)
# Action: HANDOFF_TO_LAYER1
```

**Result:** The query passes through all layers and completes successfully.

---

### Phase 2: Telemetry Collection (Layer 7)

**After processing completes, Layer 7 logs:**

```json
{
  "timestamp": "2025-12-10T14:23:45Z",
  "query": "add a WAF rule to block bots",
  "agent": "cloudflare-ops",
  "tools_used": ["gh_grep", "filesystem", "waf_intelligence"],
  "guardrails_passed": true,
  "terraform_generated": true,
  "pr_created": true,
  "pr_number": 42,
  "confidence": 92,
  "threat_type": "scanner",
  "layer0_classification": "blessed",
  "layer0_risk_score": 0,
  "processing_time_ms": 1250,
  "outcome": "success"
}
```

**Location:** `observatory/cognition_flow_logs.jsonl`

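
Because the log is a `.jsonl` file, each record occupies one line of JSON. The sketch below shows one way records could be appended and read back. The helper names `append_record` and `read_records` are hypothetical, and the demo writes to a temporary directory rather than the real `observatory/` folder.

```python
import json
import tempfile
from pathlib import Path


def append_record(log_path: Path, record: dict) -> None:
    """Append one telemetry record as a single JSON line."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def read_records(log_path: Path) -> list[dict]:
    """Read every non-empty line of the log back into dictionaries."""
    with log_path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Demo against a temporary directory (assumption: same relative file name).
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "observatory" / "cognition_flow_logs.jsonl"
    append_record(path, {"query": "add a WAF rule to block bots", "outcome": "success"})
    append_record(path, {"query": "fix it", "outcome": "incomplete"})
    records = read_records(path)

print(len(records), records[0]["outcome"])
```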
---

### Phase 3: Feedback Analysis (Between Layer 7 and Layer 0)

**The system analyzes telemetry to identify patterns:**

#### Pattern 1: False Negatives (Missed Threats)

**Example:** A query was classified as BLESSED but later triggered guardrail warnings.

**Telemetry:**

```json
{
  "query": "update the WAF to allow all traffic",
  "layer0_classification": "blessed",
  "layer0_risk_score": 0,
  "guardrails_passed": false,
  "guardrail_warnings": ["zero_trust_violation", "security_risk"],
  "outcome": "blocked_by_guardrails"
}
```

**Learning:** Layer 0 should have classified this as FORBIDDEN or AMBIGUOUS.

**Heuristic Update:**

```python
# New patterns learned from the false negative
if "allow all traffic" in query or "bypass security" in query:
    classification = "FORBIDDEN"
```

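A false negative of this kind can be found mechanically by filtering telemetry for records that Layer 0 blessed but the guardrails rejected. The following is a minimal sketch under that assumption; `find_false_negatives` is a hypothetical helper name, and the field names are the ones used in the telemetry examples.

```python
def find_false_negatives(records: list[dict]) -> list[dict]:
    """Return records Layer 0 blessed but the guardrails later rejected."""
    return [
        r for r in records
        if r.get("layer0_classification") == "blessed"
        and r.get("guardrails_passed") is False
    ]


records = [
    {
        "query": "update the WAF to allow all traffic",
        "layer0_classification": "blessed",
        "guardrails_passed": False,
        "outcome": "blocked_by_guardrails",
    },
    {
        "query": "add a WAF rule to block bots",
        "layer0_classification": "blessed",
        "guardrails_passed": True,
        "outcome": "success",
    },
]

missed = find_false_negatives(records)
print([r["query"] for r in missed])
```

Each returned record is a candidate for a new forbidden pattern; deciding which phrase to learn from it remains a separate step.
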
#### Pattern 2: False Positives (Over-Blocking)

**Example:** A query was classified as FORBIDDEN but was actually legitimate.

**Telemetry:**

```json
{
  "query": "check the dashboard for current WAF rules",
  "layer0_classification": "forbidden",
  "layer0_risk_score": 3,
  "layer0_reason": "governance_violation",
  "outcome": "blocked_by_layer0",
  "user_feedback": "legitimate_read_only_query"
}
```

**Learning:** "dashboard" in a read-only context should be allowed.

**Heuristic Update:**

```python
# Refined pattern: separate read-only intent from write intent.
# The parentheses matter: without them, "check" alone would match any query.
if "dashboard" in query and ("read" in query or "check" in query):
    classification = "BLESSED"    # read-only operations
elif "dashboard" in query and ("change" in query or "update" in query):
    classification = "FORBIDDEN"  # write operations
```

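Substring checks such as `"read" in query` also match unrelated words like "spread". The sketch below is one possible refinement that compares whole words instead. The function name, the `"not_applicable"` return value, and the choice to test write words before read words are assumptions made for illustration, not part of the Layer 0 specification.

```python
import re

READ_WORDS = {"read", "check"}       # read-only verbs named in the text
WRITE_WORDS = {"change", "update"}   # write verbs named in the text


def classify_dashboard_query(query: str) -> str:
    """Classify a query that mentions the dashboard by whole-word intent."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    if "dashboard" not in words:
        return "not_applicable"
    if words & WRITE_WORDS:
        return "forbidden"  # any write verb wins, even alongside a read verb
    if words & READ_WORDS:
        return "blessed"
    return "forbidden"      # bare "dashboard" keeps the original static rule


print(classify_dashboard_query("check the dashboard for current WAF rules"))
print(classify_dashboard_query("spread the word on the dashboard"))
```

Checking write verbs first is the conservative choice: a mixed query such as "check the dashboard and update the rule" stays blocked.
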
#### Pattern 3: Ambiguity Detection Improvement

**Example:** Queries that should have been flagged as ambiguous.

**Telemetry:**

```json
{
  "query": "fix it",
  "layer0_classification": "blessed",
  "layer0_risk_score": 0,
  "agent": "cloudflare-ops",
  "tools_used": ["filesystem"],
  "guardrails_passed": true,
  "terraform_generated": false,
  "outcome": "incomplete",
  "user_clarification_required": true
}
```

**Learning:** Very short queries (fewer than 3 words) should be AMBIGUOUS, not BLESSED.

**Heuristic Update:**

```python
# Improved ambiguity detection: one- or two-word queries that are not questions
if len(query.split()) <= 2 and not query.endswith("?"):
    classification = "AMBIGUOUS"  # needs clarification
```

---

### Phase 4: Heuristic Update (Layer 0 Re-Awakens)

**Layer 0's classifier is updated with new patterns:**

```python
class ShadowClassifier:
    def __init__(self):
        # Initial patterns (static)
        self.catastrophic_patterns = [
            "disable guardrails",
            "override agent permissions",
            "bypass governance",
            "self-modifying",
        ]

        self.forbidden_patterns = [
            "skip git",
            "apply directly",
            "dashboard",  # ← Refined: read-only allowed
            "manual change",
        ]

        # Learned patterns (from telemetry)
        self.learned_forbidden = [
            "allow all traffic",  # ← Learned from false negative
            "bypass security",    # ← Learned from false negative
        ]

        self.learned_ambiguous = [
            # Short queries (< 3 words) → AMBIGUOUS
        ]

    def classify(self, query: str) -> ShadowEvalResult:
        q = query.lower().strip()

        # Check learned patterns first (more specific)
        if any(pattern in q for pattern in self.learned_forbidden):
            return ShadowEvalResult(
                classification=Classification.FORBIDDEN,
                reason="learned_pattern",
                risk_score=3,
                flags=["telemetry_learned"],
            )

        # Then check static patterns
        # ... existing logic
```

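The class above relies on `ShadowEvalResult` and `Classification`, which this document does not define. The following self-contained sketch fills them in with assumed shapes (an `Enum` and a dataclass) so the learned-first ordering can be run end to end. The reason strings `static_pattern`, `too_short`, and `no_violation`, and the risk scores for the catastrophic and ambiguous tiers, are illustrative assumptions. Note that with learned patterns checked first, a query matching both a learned forbidden phrase and a catastrophic phrase is reported as FORBIDDEN; a stricter variant might check catastrophic patterns first.

```python
from dataclasses import dataclass, field
from enum import Enum


class Classification(Enum):
    BLESSED = "blessed"
    AMBIGUOUS = "ambiguous"
    FORBIDDEN = "forbidden"
    CATASTROPHIC = "catastrophic"


@dataclass
class ShadowEvalResult:
    classification: Classification
    reason: str
    risk_score: int
    flags: list[str] = field(default_factory=list)


class ShadowClassifier:
    def __init__(self) -> None:
        self.catastrophic_patterns = ["disable guardrails", "bypass governance"]
        self.forbidden_patterns = ["skip git", "apply directly", "manual change"]
        self.learned_forbidden = ["allow all traffic", "bypass security"]

    def classify(self, query: str) -> ShadowEvalResult:
        q = query.lower().strip()
        # Learned patterns are checked first, as in the class above.
        if any(p in q for p in self.learned_forbidden):
            return ShadowEvalResult(
                Classification.FORBIDDEN, "learned_pattern", 3, ["telemetry_learned"]
            )
        if any(p in q for p in self.catastrophic_patterns):
            return ShadowEvalResult(Classification.CATASTROPHIC, "static_pattern", 5)
        if any(p in q for p in self.forbidden_patterns):
            return ShadowEvalResult(Classification.FORBIDDEN, "static_pattern", 3)
        # Short non-question queries need clarification (Pattern 3).
        if len(q.split()) <= 2 and not q.endswith("?"):
            return ShadowEvalResult(Classification.AMBIGUOUS, "too_short", 1)
        return ShadowEvalResult(Classification.BLESSED, "no_violation", 0)


classifier = ShadowClassifier()
for text in ("update the WAF to allow all traffic", "fix it", "add a WAF rule to block bots"):
    result = classifier.classify(text)
    print(text, "->", result.classification.value, result.reason)
```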
---

## What Telemetry Feeds Back?

### Layer 7 Logs (Complete Query Lifecycle)

The record below is a schema sketch: string values such as `"0-5"` or `"true | false"` describe the allowed range or options for a field, not literal data.

```json
{
  "timestamp": "ISO-8601",
  "query": "original user query",
  "layer0_classification": "blessed | ambiguous | forbidden | catastrophic",
  "layer0_risk_score": "0-5",
  "layer0_reason": "classification reason",
  "layer0_trace_id": "uuid-v4",
  "agent": "cloudflare-ops | security-audit | data-engineer",
  "tools_used": ["gh_grep", "filesystem", "waf_intelligence"],
  "guardrails_passed": "true | false",
  "guardrail_warnings": ["list of warnings"],
  "terraform_generated": "true | false",
  "pr_created": "true | false",
  "pr_number": 42,
  "confidence": "0-100",
  "threat_type": "scanner | bot | ddos",
  "processing_time_ms": 1250,
  "outcome": "success | blocked | incomplete | error",
  "user_feedback": "optional user correction"
}
```

### Key Metrics for Learning

1. **Classification Accuracy**
   - `layer0_classification` vs `outcome`
   - False positives (over-blocking)
   - False negatives (missed threats)

2. **Risk Score Calibration**
   - `layer0_risk_score` vs actual risk (from guardrails)
   - Adjust risk thresholds based on outcomes

3. **Pattern Effectiveness**
   - Which patterns catch real threats?
   - Which patterns cause false positives?

4. **Resource Efficiency**
   - `processing_time_ms` for queries blocked at Layer 0 (should be close to 0)
   - Queries that should have been blocked earlier

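
The first metric can be computed directly from telemetry records. The sketch below assumes two working definitions that the document implies but does not fix: a false negative is a blessed query that failed the guardrails, and a false positive is a forbidden query that carries a `user_feedback` correction. The function name `accuracy_metrics` is hypothetical.

```python
def accuracy_metrics(records: list[dict]) -> dict[str, float]:
    """Return false-negative and false-positive rates over all records."""
    total = len(records)
    if total == 0:
        return {"false_negative_rate": 0.0, "false_positive_rate": 0.0}

    # Blessed at Layer 0 but rejected later by the guardrails.
    false_negatives = sum(
        1 for r in records
        if r.get("layer0_classification") == "blessed"
        and r.get("guardrails_passed") is False
    )
    # Forbidden at Layer 0 but corrected by the user afterwards.
    false_positives = sum(
        1 for r in records
        if r.get("layer0_classification") == "forbidden"
        and r.get("user_feedback")
    )
    return {
        "false_negative_rate": false_negatives / total,
        "false_positive_rate": false_positives / total,
    }


sample = [
    {"layer0_classification": "blessed", "guardrails_passed": False},
    {"layer0_classification": "forbidden", "user_feedback": "legitimate_read_only_query"},
    {"layer0_classification": "blessed", "guardrails_passed": True},
    {"layer0_classification": "forbidden"},
]
metrics = accuracy_metrics(sample)
print(metrics)
```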
---

## Self-Correction Examples

### Example 1: Learning New Threat Patterns

**Initial State:**

```python
# Layer 0 doesn't know about "terraform destroy" risks:
# the phrase is not in the forbidden patterns, so the query passes
if "terraform destroy" in query:
    classification = "BLESSED"
```

**After Processing:**

```json
{
  "query": "terraform destroy production",
  "layer0_classification": "blessed",
  "guardrails_passed": false,
  "guardrail_warnings": ["destructive_operation", "production_risk"],
  "outcome": "blocked_by_guardrails"
}
```

**Learning:**

```python
# New pattern learned
if "terraform destroy" in query:
    classification = "FORBIDDEN"  # destructive operation
```

**Next Query:**

```python
# Query: "terraform destroy staging"
# Classification: FORBIDDEN (learned pattern)
# Action: HANDOFF_TO_GUARDRAILS (immediate)
# Result: Blocked before any processing
```

---

### Example 2: Refining Ambiguity Detection

**Initial State:**

```python
# Very short queries
if len(query.split()) <= 2:
    classification = "AMBIGUOUS"
```

**After Processing:**

```json
{
  "query": "git status",
  "layer0_classification": "ambiguous",
  "outcome": "success",
  "user_feedback": "common_command_should_be_blessed"
}
```

**Learning:**

```python
# Refined: common commands are blessed before the length check runs
common_commands = ["git status", "terraform plan", "terraform validate"]
if query.lower() in common_commands:
    classification = "BLESSED"
elif len(query.split()) <= 2:
    classification = "AMBIGUOUS"
```

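The refined rule can be packaged as a small function. This is a sketch under assumptions: the name `classify_short_query` is hypothetical, whitespace normalisation is an addition, and returning `None` is used here to mean "this check has no opinion, continue with the other heuristics".

```python
from typing import Optional

COMMON_COMMANDS = {"git status", "terraform plan", "terraform validate"}


def classify_short_query(query: str) -> Optional[str]:
    """Apply the allowlist first, then the short-query ambiguity rule."""
    q = " ".join(query.lower().split())  # normalise case and whitespace
    if q in COMMON_COMMANDS:
        return "blessed"
    if len(q.split()) <= 2:
        return "ambiguous"
    return None  # longer queries are left to the remaining checks


for text in ("Git  Status", "fix it", "add a WAF rule to block bots"):
    print(repr(text), "->", classify_short_query(text))
```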
---

### Example 3: Multi-Account Risk Weighting

**Initial State:**

```python
# All queries treated equally
if "skip git" in query:
    classification, risk_score = "FORBIDDEN", 3
```

**After Processing:**

```json
{
  "query": "skip git and apply to production",
  "layer0_classification": "forbidden",
  "layer0_risk_score": 3,
  "account": "production",
  "outcome": "blocked",
  "actual_risk": "critical"
}
```

Here `actual_risk` is "critical", which is higher than the assigned `layer0_risk_score` of 3.

**Learning:**

```python
# Production account queries need higher risk scores
if "production" in query and "skip git" in query:
    classification, risk_score = "FORBIDDEN", 5  # increased from 3
elif "skip git" in query:
    classification, risk_score = "FORBIDDEN", 3
```

---

## Current Implementation Status

### ✅ What's Implemented

1. **Layer 0 Classification** - Four-tier system (blessed/ambiguous/forbidden/catastrophic)
2. **Layer 7 Telemetry** - Logging structure defined
3. **Preboot Logging** - Violations logged to `preboot_shield.jsonl`
4. **Trace IDs** - Each query has a unique trace ID for correlation

### 🚧 What's Planned (Future Enhancements)

From `LAYER0_SHADOW.md` Section 9:

1. **Threat-Signature Learning**
   - Analyze forbidden queries to extract new patterns
   - Automatically update `ShadowClassifier` patterns

2. **Multi-Account Risk Weighting**
   - Different risk scores for production vs staging
   - Account-specific pattern matching

3. **Synthetic Replay Mode**
   - Replay historical queries to test new heuristics
   - Audit reconstruction for compliance

4. **Metacognitive Hints**
   - Improve ambiguity detection with context
   - Better understanding of user intent

---

## Implementation Architecture

### Current: Static Patterns

```python
class ShadowClassifier:
    def classify(self, query: str) -> ShadowEvalResult:
        # Static pattern matching
        if "skip git" in query:
            return FORBIDDEN  # shorthand for a FORBIDDEN ShadowEvalResult
        # ... more static patterns
```

### Future: Dynamic Learning

```python
class ShadowClassifier:
    def __init__(self):
        self.static_patterns = {...}  # Initial patterns
        self.learned_patterns = {}    # From telemetry
        self.risk_weights = {}        # Account-specific weights

    def classify(self, query: str) -> ShadowEvalResult:
        # Check learned patterns first (more specific)
        result = self._check_learned_patterns(query)
        if result:
            return result

        # Then check static patterns
        return self._check_static_patterns(query)

    def update_from_telemetry(self, telemetry_log: dict):
        """Update heuristics based on Layer 7 telemetry"""
        if telemetry_log["outcome"] == "blocked_by_guardrails":
            # False negative: should have been caught by Layer 0
            self._learn_forbidden_pattern(telemetry_log["query"])

        elif (telemetry_log["layer0_classification"] == "forbidden"
              and telemetry_log.get("user_feedback")):
            # False positive: over-blocked. A forbidden query never reaches
            # outcome "success", so user feedback is the signal (see Pattern 2).
            self._refine_pattern(telemetry_log["query"])
```

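The planned class leaves `_learn_forbidden_pattern` unspecified. The sketch below makes the false-negative path runnable by injecting the phrase extractor as a parameter. `DynamicShadowClassifier` and `leading_bigram` are hypothetical names, classifications are plain strings for brevity, and taking the first two words of the query is a deliberately naive placeholder: it works for "terraform destroy ..." but would learn a useless phrase from a query such as "update the WAF to allow all traffic".

```python
from typing import Callable, Optional


def leading_bigram(query: str) -> str:
    """Naive placeholder extractor: the first two words of the query."""
    return " ".join(query.lower().split()[:2])


class DynamicShadowClassifier:
    def __init__(self, extract_phrase: Callable[[str], str] = leading_bigram) -> None:
        self.static_forbidden = ["skip git", "apply directly"]
        self.learned_forbidden: list[str] = []
        self.extract_phrase = extract_phrase

    def classify(self, query: str) -> str:
        q = query.lower().strip()
        # Learned patterns first, then static patterns.
        if any(p in q for p in self.learned_forbidden):
            return "forbidden"
        if any(p in q for p in self.static_forbidden):
            return "forbidden"
        return "blessed"

    def update_from_telemetry(self, record: dict) -> Optional[str]:
        """Learn a forbidden phrase from a false negative; return it if new."""
        if record.get("outcome") != "blocked_by_guardrails":
            return None
        phrase = self.extract_phrase(record["query"])
        if phrase and phrase not in self.learned_forbidden:
            self.learned_forbidden.append(phrase)
            return phrase
        return None


clf = DynamicShadowClassifier()
before = clf.classify("terraform destroy staging")
learned = clf.update_from_telemetry(
    {"query": "terraform destroy production infrastructure",
     "outcome": "blocked_by_guardrails"}
)
after = clf.classify("terraform destroy staging")
print(before, learned, after)
```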
---

## The Feedback Loop in Action

### Cycle 1: Initial State

**Query:** "skip git and apply directly"

- **Layer 0:** FORBIDDEN (static pattern)
- **Layer 7:** Logs violation
- **Learning:** Pattern works correctly

---

### Cycle 2: New Threat Pattern

**Query:** "terraform destroy production infrastructure"

- **Layer 0:** BLESSED (not in patterns)
- **Layer 4 (Guardrails):** Blocks (destructive operation)
- **Layer 7:** Logs false negative
- **Learning:** Add "terraform destroy" to forbidden patterns

---

### Cycle 3: Improved Detection

**Query:** "terraform destroy staging"

- **Layer 0:** FORBIDDEN (learned pattern)
- **Action:** Blocked immediately (no processing)
- **Layer 7:** Logs successful early block
- **Learning:** Pattern confirmed effective

---

## Benefits of the Ouroboros Loop

These benefits describe the loop once the planned learning mechanisms are in place (see Current Implementation Status).

### 1. **Self-Improving Security**

- Learns from actual threats
- Adapts to new attack patterns
- Reduces false positives over time

### 2. **Resource Efficiency**

- Catches threats earlier (Layer 0 vs Layer 4)
- Prevents wasted processing on bad queries
- Improves system performance

### 3. **Governance Enforcement**

- Learns infrastructure-specific violations
- Adapts to organizational policies
- Enforces GitOps/Terraform rules automatically

### 4. **Reduced Maintenance**

- Fewer manual pattern updates
- Automatic threat detection
- Self-correcting without human intervention

---

## Comparison to Static Systems

### Static System (Industry Standard)

```
Patterns defined once → Never change → Manual updates required
```

**Problems:**

- ❌ Can't adapt to new threats
- ❌ Requires manual updates
- ❌ False positives/negatives persist
- ❌ No learning from mistakes

### Ouroboros Loop (Layer 0 Shadow)

```
Patterns → Learn from outcomes → Improve patterns → Better detection
```

**Benefits:**

- ✅ Adapts to new threats automatically
- ✅ Self-improving without manual updates
- ✅ Reduces false positives/negatives over time
- ✅ Learns from actual usage patterns

---

## Philosophical Foundation

From `RED-BOOK.md` - The Fourfold Work:

1. **Nigredo** (Black) - Breakdown, dissolution
   - Layer 0 detects violations (breakdown of governance)

2. **Albedo** (White) - Purification, clarity
   - Layer 7 telemetry provides clarity on what happened

3. **Citrinitas** (Yellow) - Insight, pattern recognition
   - Feedback analysis identifies patterns

4. **Rubedo** (Red) - Integration, completion
   - Layer 0 heuristics updated (integration of learning)

**The Ouroboros loop completes the Work:** Each violation (Nigredo) becomes learning (Albedo) → insight (Citrinitas) → improvement (Rubedo) → better protection (back to Nigredo prevention).

---

## Future Enhancements: Detailed Plans

### 1. Threat-Signature Learning

**Implementation:**

```python
from typing import List


def analyze_forbidden_queries(telemetry_logs: List[dict]) -> List[str]:
    """Extract common patterns from forbidden queries"""
    # extract_patterns and most_common_patterns are planned helpers
    patterns = []
    for log in telemetry_logs:
        if log["layer0_classification"] == "forbidden":
            # Extract key phrases
            patterns.extend(extract_patterns(log["query"]))
    return most_common_patterns(patterns)
```

**Example:**

- 10 queries with "skip git" → Add to forbidden patterns
- 5 queries with "terraform destroy" → Add to forbidden patterns

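
The two helpers are not specified in this document. One simple way to fill them in, shown here purely as an assumption, is to count word bigrams across forbidden queries and keep those seen in at least `min_count` queries. Frequent but harmless bigrams (for example "the waf") would also surface with this approach, so the output is better treated as a list of candidates for review than as patterns to apply blindly.

```python
from collections import Counter


def extract_patterns(query: str, n: int = 2) -> list[str]:
    """Return the word n-grams of a query (bigrams by default)."""
    words = query.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def most_common_patterns(patterns: list[str], min_count: int = 2) -> list[str]:
    """Keep patterns seen at least min_count times, most frequent first."""
    counts = Counter(patterns)
    return [p for p, c in counts.most_common() if c >= min_count]


def analyze_forbidden_queries(telemetry_logs: list[dict], min_count: int = 2) -> list[str]:
    """Collect candidate phrases that recur across forbidden queries."""
    patterns: list[str] = []
    for log in telemetry_logs:
        if log.get("layer0_classification") == "forbidden":
            # De-duplicate per query so counts mean "number of queries".
            patterns.extend(sorted(set(extract_patterns(log["query"]))))
    return most_common_patterns(patterns, min_count)


logs = [
    {"query": "skip git and apply directly", "layer0_classification": "forbidden"},
    {"query": "skip git for this hotfix", "layer0_classification": "forbidden"},
    {"query": "add a WAF rule to block bots", "layer0_classification": "blessed"},
]
candidates = analyze_forbidden_queries(logs)
print(candidates)
```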
---

### 2. Multi-Account Risk Weighting

**Implementation:**

```python
import math


def calculate_risk_score(query: str, account: str) -> int:
    # get_base_risk stands in for the base heuristic lookup
    base_score = get_base_risk(query)

    # Production accounts = higher risk
    if account == "production":
        # Round up so the result stays an int (3 * 1.5 = 4.5 → 5), cap at 5
        return min(math.ceil(base_score * 1.5), 5)

    return base_score
```

**Example:**

- "skip git" in staging → risk_score: 3
- "skip git" in production → risk_score: 5 (catastrophic)

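
The example figures (3 in staging, 5 in production) imply rounding up, since 3 × 1.5 = 4.5. The sketch below makes that arithmetic explicit with a lookup table of per-account multipliers. The table, the `weighted_risk` name, and the default multiplier of 1.0 for unknown accounts are assumptions for illustration.

```python
import math

ACCOUNT_MULTIPLIERS = {"production": 1.5, "staging": 1.0}  # assumed weights
MAX_RISK = 5


def weighted_risk(base_score: int, account: str) -> int:
    """Scale a base risk score by account, round up, and cap at MAX_RISK."""
    multiplier = ACCOUNT_MULTIPLIERS.get(account, 1.0)
    return min(math.ceil(base_score * multiplier), MAX_RISK)


for account in ("staging", "production"):
    print(account, weighted_risk(3, account))
```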
---

### 3. Synthetic Replay Mode

**Implementation:**

```python
def replay_historical_queries(new_heuristics: ShadowClassifier):
    """Test new heuristics against historical queries"""
    # load_telemetry_logs stands in for a reader of the Layer 7 log
    historical_logs = load_telemetry_logs()

    for log in historical_logs:
        new_classification = new_heuristics.classify(log["query"])
        old_classification = log["layer0_classification"]

        if new_classification != old_classification:
            print(f"Changed: {log['query']}")
            print(f"  Old: {old_classification}")
            print(f"  New: {new_classification}")
```

**Use Case:** Before deploying new heuristics, replay the last 1,000 queries to ensure there are no regressions.

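
For automated checks it can help to return the differences instead of printing them. The following sketch does that under stated assumptions: `replay` and `summarize` are hypothetical names, the classifier is passed as a plain function that returns a classification string, and the history is an in-memory list rather than the real log file.

```python
from collections import Counter
from typing import Callable


def replay(records: list[dict], classify: Callable[[str], str]) -> list[dict]:
    """Return one entry per historical query whose classification changes."""
    changes = []
    for record in records:
        new = classify(record["query"])
        old = record["layer0_classification"]
        if new != old:
            changes.append({"query": record["query"], "old": old, "new": new})
    return changes


def summarize(changes: list[dict]) -> Counter:
    """Count changes by (old, new) classification pair."""
    return Counter((c["old"], c["new"]) for c in changes)


def candidate_classify(query: str) -> str:
    """Candidate heuristics: the old rules plus a learned 'terraform destroy'."""
    q = query.lower()
    return "forbidden" if any(p in q for p in ("skip git", "terraform destroy")) else "blessed"


history = [
    {"query": "terraform destroy staging", "layer0_classification": "blessed"},
    {"query": "add a WAF rule to block bots", "layer0_classification": "blessed"},
    {"query": "skip git and apply directly", "layer0_classification": "forbidden"},
]
changes = replay(history, candidate_classify)
print(changes)
print(summarize(changes))
```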
---

### 4. Metacognitive Hints

**Implementation:**

```python
def classify_with_context(query: str, context: dict) -> ShadowEvalResult:
    """Use context to improve classification"""

    # Context includes:
    # - Previous queries in session
    # - User's role (admin, developer, etc.)
    # - Current working directory
    # - Recent file changes

    if context["user_role"] == "admin" and "production" in query:
        # Admins querying production = higher scrutiny
        return classify_with_higher_risk(query)

    return standard_classify(query)
```

**Example:**

- "update WAF" from admin → BLESSED
- "update WAF" from developer → AMBIGUOUS (needs clarification)

---

## Summary

The **Ouroboros Loop** is a self-correcting security architecture that:

1. **Collects telemetry** from Layer 7 (complete query lifecycle)
2. **Analyzes patterns** to identify false positives/negatives
3. **Updates heuristics** in Layer 0 based on actual outcomes
4. **Improves detection** over time without manual intervention

**Key Innovation:** Unlike static security systems, Layer 0 Shadow is designed to learn from its mistakes and adapt to new threats automatically, creating a self-improving security substrate that becomes more effective over time.

**Current Status:** Architecture defined, telemetry structure in place, learning mechanisms planned for future implementation.

**The Loop:** Layer 7 → Analysis → Layer 0 → Layer 1 → ... → Layer 7 (repeat)

---

## References

- [LAYER0_SHADOW.md](LAYER0_SHADOW.md) - Layer 0 specification
- [COGNITION_FLOW.md](COGNITION_FLOW.md) - 8-layer architecture
- [RED-BOOK.md](RED-BOOK.md) - Philosophical foundation
- [DEMO_COGNITION.md](DEMO_COGNITION.md) - Real-world examples

---

**Last Updated:** 2025-12-10

**Status:** 🟢 Architecture Defined, Learning Mechanisms Planned

**Ouroboros Loop:** Telemetry active; analysis and improvement planned (Telemetry → Analysis → Improvement)