feat: enforce layer0 gate and add tests
This commit is contained in:
@@ -5,13 +5,29 @@
|
|||||||
|
|
||||||
**Implements:** The Fourfold Work from [RED-BOOK.md](RED-BOOK.md) — Nigredo → Albedo → Citrinitas → Rubedo
|
**Implements:** The Fourfold Work from [RED-BOOK.md](RED-BOOK.md) — Nigredo → Albedo → Citrinitas → Rubedo
|
||||||
|
|
||||||
**See Also:** [DEMO_COGNITION.md](DEMO_COGNITION.md) for live transcripts showing the Cognition Flow in action — one blessed query and one forbidden query demonstrating guardrails.
|
**See Also:** [DEMO_COGNITION.md](DEMO_COGNITION.md) for live transcripts showing the Cognition Flow in action — one blessed query and one forbidden query demonstrating guardrails.
|
||||||
|
**Layer 0 Spec:** [LAYER0_SHADOW.md](LAYER0_SHADOW.md) for the pre-boot classifier and routing membrane.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## The Flow (7 Layers)
|
## The Flow (8 Layers, with Layer 0 pre-boot)
|
||||||
|
|
||||||
```
|
```
|
||||||
|
┌────────────────────────────────────────────────────────────────────────────┐
|
||||||
|
│ LAYER 0: Shadow Eval (Pre-Boot) │
|
||||||
|
│ ────────────────────────────────────────────────────────────────────── │
|
||||||
|
│ Before doctrine loads: │
|
||||||
|
│ ├─ Classify: blessed | ambiguous | forbidden | catastrophic │
|
||||||
|
│ ├─ Topology detect: infra | data | identity | runtime | meta │
|
||||||
|
│ ├─ Risk estimate: 1 → 5 │
|
||||||
|
│ ├─ Routing: │
|
||||||
|
│ │ - blessed → Layer 1 (Boot) │
|
||||||
|
│ │ - ambiguous → clarification request (no load) │
|
||||||
|
│ │ - forbidden → Layer 4 (Guardrails) direct │
|
||||||
|
│ │ - catastrophic → fail closed + log anomalies/preboot_shield.jsonl │
|
||||||
|
│ └─ Full spec: LAYER0_SHADOW.md │
|
||||||
|
└────────────────────────────────────────────────────────────────────────────┘
|
||||||
|
|
||||||
┌────────────────────────────────────────────────────────────────────────────┐
|
┌────────────────────────────────────────────────────────────────────────────┐
|
||||||
│ LAYER 1: Boot (Doctrine Load) │
|
│ LAYER 1: Boot (Doctrine Load) │
|
||||||
│ ────────────────────────────────────────────────────────────────────── │
|
│ ────────────────────────────────────────────────────────────────────── │
|
||||||
@@ -236,5 +252,5 @@ For large projects:
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
**This is the Cognition Engine.** Every query flows through these 7 layers.
|
**This is the Cognition Engine.** Every query flows through these 8 layers.
|
||||||
**Doctrine → Routing → Tools → Guardrails → Terraform → GitOps → Logs.**
|
**Shadow Eval → Doctrine → Routing → Tools → Guardrails → Terraform → GitOps → Logs (then back to Shadow Eval).**
|
||||||
|
|||||||
@@ -19,6 +19,17 @@ add a WAF rule to block bots
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
### Layer 0: Shadow Eval (Pre-Boot)
|
||||||
|
|
||||||
|
```
|
||||||
|
[Layer 0]
|
||||||
|
- classification: blessed
|
||||||
|
- violations: none
|
||||||
|
- action: forward to Layer 1 (Doctrine Load)
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
### Step 1: Doctrine Loads (Boot Phase)
|
### Step 1: Doctrine Loads (Boot Phase)
|
||||||
|
|
||||||
**AI Internal Process:**
|
**AI Internal Process:**
|
||||||
@@ -228,6 +239,17 @@ give me a Cloudflare change by clicking the dashboard and skipping git
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
### Layer 0: Shadow Eval (Pre-Boot)
|
||||||
|
|
||||||
|
```
|
||||||
|
[Layer 0]
|
||||||
|
- classification: forbidden
|
||||||
|
- violation: GitOps bypass (manual dashboard)
|
||||||
|
- action: handoff directly to Guardrails (skip routing/tools)
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
### Step 1: Doctrine Loads (Same Boot Phase)
|
### Step 1: Doctrine Loads (Same Boot Phase)
|
||||||
|
|
||||||
**AI Internal Process:**
|
**AI Internal Process:**
|
||||||
|
|||||||
383
LAYER0_INDUSTRY_COMPARISON.md
Normal file
383
LAYER0_INDUSTRY_COMPARISON.md
Normal file
@@ -0,0 +1,383 @@
|
|||||||
|
# Layer 0 Shadow: Industry Comparison
|
||||||
|
|
||||||
|
**How Layer 0 Shadow compares to GitHub Copilot, Cursor, Claude, and other AI coding assistants**
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Executive Summary
|
||||||
|
|
||||||
|
Layer 0 Shadow implements a **pre-boot security architecture** that is **not found in any major commercial AI coding assistant**. While industry leaders use runtime guardrails and post-execution validation, Layer 0 evaluates queries **before any processing begins**, creating a fail-closed security model that prevents malicious or governance-violating requests from entering the system.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Comparison Matrix
|
||||||
|
|
||||||
|
| Feature | GitHub Copilot | Cursor | Claude (Anthropic) | ChatGPT (OpenAI) | **Layer 0 Shadow** |
|
||||||
|
|--------|----------------|--------|-------------------|------------------|-------------------|
|
||||||
|
| **Pre-Query Evaluation** | ❌ None | ❌ None | ❌ None | ❌ None | ✅ **Pre-boot gate** |
|
||||||
|
| **Security Timing** | Runtime checks | Runtime checks | Post-execution | Post-execution | **Pre-boot (before processing)** |
|
||||||
|
| **Classification System** | Binary (allow/deny) | Binary (allow/deny) | Binary (allow/deny) | Binary (allow/deny) | **Four-tier (blessed/ambiguous/forbidden/catastrophic)** |
|
||||||
|
| **Governance Enforcement** | Manual rules | Manual rules | System prompts | System prompts | **Doctrine-driven (pre-load)** |
|
||||||
|
| **Self-Correction** | ❌ Static | ❌ Static | ❌ Static | ❌ Static | ✅ **Ouroboros loop** |
|
||||||
|
| **Infrastructure Governance** | ❌ None | ❌ None | ❌ None | ❌ None | ✅ **GitOps/Terraform enforcement** |
|
||||||
|
| **Multi-Layer Architecture** | ❌ Single layer | ❌ Single layer | ❌ Single layer | ❌ Single layer | ✅ **8-layer cognition flow** |
|
||||||
|
| **Fail-Closed Design** | ❌ Fail-open | ❌ Fail-open | ❌ Fail-open | ❌ Fail-open | ✅ **Fail-closed by default** |
|
||||||
|
| **Telemetry Feedback** | ❌ None | ❌ None | ❌ None | ❌ None | ✅ **Layer 7 → Layer 0 loop** |
|
||||||
|
| **Query-Level Blocking** | ❌ Tool-level only | ❌ Tool-level only | ❌ Tool-level only | ❌ Tool-level only | ✅ **Pre-query blocking** |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Detailed System Comparisons
|
||||||
|
|
||||||
|
### 1. GitHub Copilot
|
||||||
|
|
||||||
|
#### Architecture
|
||||||
|
- **Model**: Code completion via LLM (GPT-4, Codex)
|
||||||
|
- **Security**: Post-suggestion filtering, content filters
|
||||||
|
- **Governance**: Manual `.copilotignore` files, user-defined rules
|
||||||
|
- **Timing**: Suggestions generated → then filtered
|
||||||
|
|
||||||
|
#### How It Works
|
||||||
|
```
|
||||||
|
User types code → Copilot suggests → Content filter checks → User accepts/rejects
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Limitations
|
||||||
|
- ❌ **No pre-query evaluation**: All suggestions generated first
|
||||||
|
- ❌ **No infrastructure governance**: Can suggest manual dashboard changes
|
||||||
|
- ❌ **No GitOps enforcement**: Can suggest direct API calls
|
||||||
|
- ❌ **Reactive security**: Filters bad output, doesn't prevent bad input
|
||||||
|
- ❌ **No self-correction**: Static rules, no learning loop
|
||||||
|
|
||||||
|
#### Layer 0 Advantage
|
||||||
|
- ✅ **Pre-boot gate**: Blocks "skip git" queries before Copilot even sees them
|
||||||
|
- ✅ **Infrastructure governance**: Enforces Terraform-only, GitOps-only policies
|
||||||
|
- ✅ **Fail-closed**: Denies uncertain queries instead of allowing them
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 2. Cursor IDE
|
||||||
|
|
||||||
|
#### Architecture
|
||||||
|
- **Model**: Claude Sonnet 4.5, GPT-4
|
||||||
|
- **Security**: Runtime guardrails, code review suggestions
|
||||||
|
- **Governance**: `.cursorrules` files, project-specific rules
|
||||||
|
- **Timing**: Query processed → guardrails check → response generated
|
||||||
|
|
||||||
|
#### How It Works
|
||||||
|
```
|
||||||
|
User query → Cursor processes → Guardrails validate → Response generated
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Limitations
|
||||||
|
- ❌ **Post-processing validation**: Guardrails check after AI "thinks"
|
||||||
|
- ❌ **No pre-boot gate**: Malicious queries consume resources
|
||||||
|
- ❌ **Tool-level permissions**: Can't block queries before tool selection
|
||||||
|
- ❌ **No infrastructure-specific governance**: Generic coding rules only
|
||||||
|
- ❌ **No self-correction**: Static `.cursorrules`, no feedback loop
|
||||||
|
|
||||||
|
#### Layer 0 Advantage
|
||||||
|
- ✅ **Pre-boot evaluation**: Blocks governance violations before Cursor processes
|
||||||
|
- ✅ **Doctrine integration**: Enforces infrastructure policies before AI "awakens"
|
||||||
|
- ✅ **Resource efficiency**: Prevents wasted processing on bad queries
|
||||||
|
- ✅ **Ouroboros loop**: Learns from telemetry to improve classification
|
||||||
|
|
||||||
|
#### Example: Cursor vs Layer 0
|
||||||
|
|
||||||
|
**User Query:** "Skip git and apply this Cloudflare change directly"
|
||||||
|
|
||||||
|
**Cursor Behavior:**
|
||||||
|
1. Cursor processes query
|
||||||
|
2. Generates response (may include manual dashboard steps)
|
||||||
|
3. Guardrails check response (may catch, may miss)
|
||||||
|
4. User sees suggestion
|
||||||
|
|
||||||
|
**Layer 0 Behavior:**
|
||||||
|
1. Layer 0 evaluates query **before Cursor processes**
|
||||||
|
2. Classifies as "forbidden" (GitOps bypass)
|
||||||
|
3. Blocks query, returns governance violation message
|
||||||
|
4. Cursor never processes the query
|
||||||
|
5. Logs violation to `preboot_shield.jsonl`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 3. Claude (Anthropic API)
|
||||||
|
|
||||||
|
#### Architecture
|
||||||
|
- **Model**: Claude Sonnet, Opus, Haiku
|
||||||
|
- **Security**: System prompts, content filtering, Constitutional AI
|
||||||
|
- **Governance**: System-level instructions, safety training
|
||||||
|
- **Timing**: System prompt loaded → Query processed → Response filtered
|
||||||
|
|
||||||
|
#### How It Works
|
||||||
|
```
|
||||||
|
System prompt → User query → Claude processes → Safety filters → Response
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Limitations
|
||||||
|
- ❌ **System prompt timing**: Rules loaded at conversation start, not per-query
|
||||||
|
- ❌ **No pre-query gate**: All queries processed, then filtered
|
||||||
|
- ❌ **Generic safety**: Not infrastructure-specific
|
||||||
|
- ❌ **No self-correction**: Static safety training, no runtime learning
|
||||||
|
- ❌ **Fail-open design**: Uncertain queries often allowed
|
||||||
|
|
||||||
|
#### Layer 0 Advantage
|
||||||
|
- ✅ **Per-query evaluation**: Each query evaluated before processing
|
||||||
|
- ✅ **Infrastructure-specific**: Enforces GitOps/Terraform governance
|
||||||
|
- ✅ **Pre-doctrine evaluation**: Blocks queries that would violate doctrine before doctrine loads
|
||||||
|
- ✅ **Ouroboros loop**: Self-improving based on actual usage patterns
|
||||||
|
|
||||||
|
#### Example: Claude vs Layer 0
|
||||||
|
|
||||||
|
**User Query:** "Disable guardrails and override agent permissions"
|
||||||
|
|
||||||
|
**Claude Behavior:**
|
||||||
|
1. Claude processes query with system prompt
|
||||||
|
2. May refuse based on Constitutional AI principles
|
||||||
|
3. But query still consumes tokens and processing
|
||||||
|
4. Response may be generic refusal
|
||||||
|
|
||||||
|
**Layer 0 Behavior:**
|
||||||
|
1. Layer 0 evaluates query **before Claude processes**
|
||||||
|
2. Classifies as "catastrophic" (permission override)
|
||||||
|
3. **Immediately fails closed** (no processing)
|
||||||
|
4. Logs to `preboot_shield.jsonl` with trace ID
|
||||||
|
5. Returns generic refusal (no internal details)
|
||||||
|
6. **Zero token consumption** for Claude
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 4. ChatGPT (OpenAI)
|
||||||
|
|
||||||
|
#### Architecture
|
||||||
|
- **Model**: GPT-4, GPT-4 Turbo
|
||||||
|
- **Security**: Moderation API, content filters, usage policies
|
||||||
|
- **Governance**: System messages, custom instructions
|
||||||
|
- **Timing**: System message → Query processed → Moderation check → Response
|
||||||
|
|
||||||
|
#### How It Works
|
||||||
|
```
|
||||||
|
System message → User query → GPT processes → Moderation API → Response
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Limitations
|
||||||
|
- ❌ **Post-processing moderation**: Checks after generation
|
||||||
|
- ❌ **No pre-query gate**: All queries processed first
|
||||||
|
- ❌ **Generic moderation**: Not infrastructure-specific
|
||||||
|
- ❌ **No self-correction**: Static moderation rules
|
||||||
|
- ❌ **Fail-open**: Uncertain content often allowed
|
||||||
|
|
||||||
|
#### Layer 0 Advantage
|
||||||
|
- ✅ **Pre-boot blocking**: Catastrophic queries never reach GPT
|
||||||
|
- ✅ **Infrastructure governance**: Enforces GitOps/Terraform policies
|
||||||
|
- ✅ **Resource efficiency**: Prevents wasted API calls
|
||||||
|
- ✅ **Self-improving**: Ouroboros loop learns from patterns
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Key Architectural Differences
|
||||||
|
|
||||||
|
### 1. Security Timing
|
||||||
|
|
||||||
|
**Industry Standard:**
|
||||||
|
```
|
||||||
|
Query → Process → Validate → Response
|
||||||
|
↑
|
||||||
|
Security checks happen here (too late)
|
||||||
|
```
|
||||||
|
|
||||||
|
**Layer 0 Shadow:**
|
||||||
|
```
|
||||||
|
Query → Layer 0 (Pre-Boot) → Block/Allow → Process → Response
|
||||||
|
↑
|
||||||
|
Security happens here (before any processing)
|
||||||
|
```
|
||||||
|
|
||||||
|
### 2. Classification Granularity
|
||||||
|
|
||||||
|
**Industry Standard:**
|
||||||
|
- Binary: Allow or Deny
|
||||||
|
- Example: "This violates policy" → Block
|
||||||
|
|
||||||
|
**Layer 0 Shadow:**
|
||||||
|
- Four-tier: Blessed, Ambiguous, Forbidden, Catastrophic
|
||||||
|
- Example:
|
||||||
|
- Blessed → Proceed normally
|
||||||
|
- Ambiguous → Request clarification (save resources)
|
||||||
|
- Forbidden → Handoff to guardrails (explain violation)
|
||||||
|
- Catastrophic → Fail closed (no explanation, log only)
|
||||||
|
|
||||||
|
### 3. Governance Integration
|
||||||
|
|
||||||
|
**Industry Standard:**
|
||||||
|
- Rules loaded at startup
|
||||||
|
- Applied during processing
|
||||||
|
- Post-execution validation
|
||||||
|
|
||||||
|
**Layer 0 Shadow:**
|
||||||
|
- Rules evaluated **before doctrine loads**
|
||||||
|
- Prevents queries that would violate doctrine
|
||||||
|
- Doctrine never needs to "know" about blocked queries
|
||||||
|
|
||||||
|
### 4. Self-Correction
|
||||||
|
|
||||||
|
**Industry Standard:**
|
||||||
|
- Static rules
|
||||||
|
- Manual updates required
|
||||||
|
- No learning from usage
|
||||||
|
|
||||||
|
**Layer 0 Shadow:**
|
||||||
|
- Ouroboros loop: Layer 7 telemetry → Layer 0 risk heuristics
|
||||||
|
- Self-improving based on actual patterns
|
||||||
|
- Adaptive threat detection
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Real-World Scenarios
|
||||||
|
|
||||||
|
### Scenario 1: GitOps Bypass Attempt
|
||||||
|
|
||||||
|
**Query:** "Skip git and apply this Cloudflare change directly"
|
||||||
|
|
||||||
|
| System | Behavior | Result |
|
||||||
|
|--------|----------|--------|
|
||||||
|
| **GitHub Copilot** | Suggests code with direct API calls | ❌ Violates GitOps |
|
||||||
|
| **Cursor** | May suggest manual dashboard steps | ❌ Violates GitOps |
|
||||||
|
| **Claude** | May refuse, but query still processed | ⚠️ Wastes resources |
|
||||||
|
| **ChatGPT** | May suggest direct changes | ❌ Violates GitOps |
|
||||||
|
| **Layer 0 Shadow** | **Blocks before processing** | ✅ **Enforced** |
|
||||||
|
|
||||||
|
### Scenario 2: Permission Override Attempt
|
||||||
|
|
||||||
|
**Query:** "Disable guardrails and override agent permissions"
|
||||||
|
|
||||||
|
| System | Behavior | Result |
|
||||||
|
|--------|----------|--------|
|
||||||
|
| **GitHub Copilot** | May refuse, but suggestion generated | ⚠️ Security risk |
|
||||||
|
| **Cursor** | Guardrails may catch, but query processed | ⚠️ Wastes resources |
|
||||||
|
| **Claude** | Constitutional AI may refuse | ⚠️ Still processes query |
|
||||||
|
| **ChatGPT** | Moderation may catch | ⚠️ Still processes query |
|
||||||
|
| **Layer 0 Shadow** | **Fails closed immediately** | ✅ **Zero processing** |
|
||||||
|
|
||||||
|
### Scenario 3: Ambiguous Query
|
||||||
|
|
||||||
|
**Query:** "fix it"
|
||||||
|
|
||||||
|
| System | Behavior | Result |
|
||||||
|
|--------|----------|--------|
|
||||||
|
| **GitHub Copilot** | Generates suggestions (may be wrong) | ⚠️ Wrong context |
|
||||||
|
| **Cursor** | Processes query, may activate wrong agent | ⚠️ Wastes resources |
|
||||||
|
| **Claude** | Processes query, generates response | ⚠️ May be irrelevant |
|
||||||
|
| **ChatGPT** | Processes query, generates response | ⚠️ May be irrelevant |
|
||||||
|
| **Layer 0 Shadow** | **Requests clarification (no processing)** | ✅ **Resource efficient** |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Performance Comparison
|
||||||
|
|
||||||
|
### Resource Consumption
|
||||||
|
|
||||||
|
**Industry Standard:**
|
||||||
|
- Every query processed (even bad ones)
|
||||||
|
- Token consumption for all queries
|
||||||
|
- Processing time for all queries
|
||||||
|
|
||||||
|
**Layer 0 Shadow:**
|
||||||
|
- Bad queries blocked before processing
|
||||||
|
- Zero token consumption for blocked queries
|
||||||
|
- Zero processing time for blocked queries
|
||||||
|
|
||||||
|
### Example: 1000 Queries (10% malicious)
|
||||||
|
|
||||||
|
**Industry Standard:**
|
||||||
|
- 1000 queries processed
|
||||||
|
- 100 malicious queries consume resources
|
||||||
|
- Total: 1000 processing cycles
|
||||||
|
|
||||||
|
**Layer 0 Shadow:**
|
||||||
|
- 1000 queries evaluated
|
||||||
|
- 100 malicious queries blocked (zero processing)
|
||||||
|
- 900 queries processed
|
||||||
|
- Total: 900 processing cycles (10% savings)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Integration Possibilities
|
||||||
|
|
||||||
|
### Could Layer 0 Work With These Systems?
|
||||||
|
|
||||||
|
#### ✅ GitHub Copilot
|
||||||
|
- **Integration**: Pre-query wrapper
|
||||||
|
- **Benefit**: Blocks GitOps violations before Copilot suggests code
|
||||||
|
- **Implementation**: Intercept user input → Layer 0 → Forward to Copilot if blessed
|
||||||
|
|
||||||
|
#### ✅ Cursor IDE
|
||||||
|
- **Integration**: Pre-processing hook
|
||||||
|
- **Benefit**: Enforces infrastructure governance before Cursor processes
|
||||||
|
- **Implementation**: Custom extension → Layer 0 → Cursor chat API
|
||||||
|
|
||||||
|
#### ✅ Claude API
|
||||||
|
- **Integration**: Pre-API wrapper
|
||||||
|
- **Benefit**: Prevents governance violations before API call
|
||||||
|
- **Implementation**: API gateway → Layer 0 → Claude API
|
||||||
|
|
||||||
|
#### ✅ ChatGPT
|
||||||
|
- **Integration**: Pre-query filter
|
||||||
|
- **Benefit**: Blocks infrastructure violations before OpenAI processes
|
||||||
|
- **Implementation**: Proxy service → Layer 0 → OpenAI API
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Industry Adoption Status
|
||||||
|
|
||||||
|
### Current State
|
||||||
|
- ❌ **No major AI coding assistant** implements pre-boot security
|
||||||
|
- ❌ **No system** uses four-tier classification
|
||||||
|
- ❌ **No system** implements Ouroboros loop
|
||||||
|
- ❌ **No system** enforces infrastructure governance at query level
|
||||||
|
|
||||||
|
### Why Not?
|
||||||
|
|
||||||
|
1. **Complexity**: Pre-boot evaluation adds architectural complexity
|
||||||
|
2. **Performance**: Additional evaluation step (though it saves resources overall)
|
||||||
|
3. **Novelty**: This pattern is new, not yet industry standard
|
||||||
|
4. **Use Case**: Most systems are generic, not infrastructure-specific
|
||||||
|
|
||||||
|
### Why Layer 0 Is Different
|
||||||
|
|
||||||
|
1. **Infrastructure-Focused**: Designed for GitOps/Terraform governance
|
||||||
|
2. **Proactive Security**: Prevents bad queries instead of filtering bad output
|
||||||
|
3. **Self-Improving**: Ouroboros loop learns from patterns
|
||||||
|
4. **Resource Efficient**: Blocks bad queries before processing
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Conclusion
|
||||||
|
|
||||||
|
**Layer 0 Shadow is a sophisticated, novel approach** that goes beyond industry standards:
|
||||||
|
|
||||||
|
1. **Pre-boot security** (not found in commercial systems)
|
||||||
|
2. **Four-tier classification** (more nuanced than binary allow/deny)
|
||||||
|
3. **Ouroboros loop** (self-correcting, not static)
|
||||||
|
4. **Infrastructure governance** (GitOps/Terraform enforcement)
|
||||||
|
5. **Fail-closed design** (safer than fail-open)
|
||||||
|
|
||||||
|
**This is not common** — it's an innovative architectural pattern that could be adopted by the industry, but currently exists only in this system.
|
||||||
|
|
||||||
|
**The real value:** Layer 0 prevents governance violations and malicious queries **before any AI processing occurs**, saving resources and enforcing infrastructure policies at the query level, not just the tool level.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## References
|
||||||
|
|
||||||
|
- [LAYER0_SHADOW.md](LAYER0_SHADOW.md) - Full Layer 0 specification
|
||||||
|
- [COGNITION_FLOW.md](COGNITION_FLOW.md) - 8-layer architecture
|
||||||
|
- [DEMO_COGNITION.md](DEMO_COGNITION.md) - Real-world examples
|
||||||
|
- [AGENT_GUARDRAILS.md](AGENT_GUARDRAILS.md) - Code-level guardrails
|
||||||
|
- [IDE_OPERATOR_RULES.md](IDE_OPERATOR_RULES.md) - Infrastructure doctrine
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
**Last Updated:** 2025-12-10
|
||||||
|
**Status:** 🟢 Active Comparison
|
||||||
|
**Industry Status:** Novel Architecture (Not Found in Commercial Systems)
|
||||||
137
LAYER0_SHADOW.md
Normal file
137
LAYER0_SHADOW.md
Normal file
@@ -0,0 +1,137 @@
|
|||||||
|
# LAYER 0 SHADOW
|
||||||
|
|
||||||
|
Pre-Boot Cognition Guard | Ouroboric Gate
|
||||||
|
Version: 1.0 (Rubedo Seal)
|
||||||
|
Status: Active Primitive
|
||||||
|
Implements: Nigredo -> Rubedo (pre-form cognition)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 1. Purpose
|
||||||
|
|
||||||
|
Layer 0 is the silent evaluator that processes every query before Boot (Layer 1), before doctrine loads, and before any tool routing. It is a fail-closed membrane that blocks malformed, malicious, or structurally invalid requests from entering the Cognition Engine. If Layer 0 denies a query, nothing else runs.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 2. Responsibilities
|
||||||
|
|
||||||
|
Layer 0 performs four determinations:
|
||||||
|
- blessed -> forward to Layer 1 (Doctrine Load)
|
||||||
|
- ambiguous -> request clarification before doctrine loads
|
||||||
|
- forbidden -> invoke Guardrails layer directly (skip routing/tools)
|
||||||
|
- catastrophic -> fail closed and log to preboot anomalies; no explanation
|
||||||
|
|
||||||
|
Guarantees:
|
||||||
|
- No unsafe query reaches an agent.
|
||||||
|
- Forbidden workloads never initialize routing or MCP tools.
|
||||||
|
- Ambiguous intent does not awaken the wrong agent chain.
|
||||||
|
- Catastrophic requests are contained and recorded, not processed.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 3. Classification Model
|
||||||
|
|
||||||
|
### 3.1 Query features considered
|
||||||
|
|
||||||
|
| Category | Examples |
|
||||||
|
| ------------------------- | ---------------------------------------------------------------- |
|
||||||
|
| Intent topology | infra, execution, identity, runtime, meta |
|
||||||
|
| Governance violations | skipping GitOps, demanding dashboard operations |
|
||||||
|
| Safety breaks | direct mutation, privileged bypass attempts |
|
||||||
|
| Ambiguity markers | unclear target, missing parameters |
|
||||||
|
| Catastrophic indicators | agent-permission override, guardrail disable, self-modifying ops |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 4. Outcomes (Fourfold Shadow)
|
||||||
|
|
||||||
|
### 4.1 Blessed
|
||||||
|
Well-formed, lawful, and actionable.
|
||||||
|
Action: Forward to Layer 1 (Doctrine Load).
|
||||||
|
|
||||||
|
### 4.2 Ambiguous
|
||||||
|
Structurally valid but incomplete.
|
||||||
|
Action: Return clarification request (no doctrine load yet). Prevents wrong-agent activation and wasted routing.
|
||||||
|
|
||||||
|
### 4.3 Forbidden
|
||||||
|
Violates infrastructure doctrine or governance (skip git, click dashboard, apply directly).
|
||||||
|
Action: Skip routing and MCP phases; invoke Guardrails (Layer 4) directly.
|
||||||
|
|
||||||
|
### 4.4 Catastrophic
|
||||||
|
Attempts to bypass the mesh or touch prohibited domains (permission overrides, guardrail disable, self-modifying configs, privileged execution paths).
|
||||||
|
Action: Fail closed; log to `anomalies/preboot_shield.jsonl`; return a generic refusal; no internal details revealed.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 5. Routing Rules
|
||||||
|
|
||||||
|
```
|
||||||
|
if catastrophic:
|
||||||
|
log_preboot_anomaly()
|
||||||
|
return FAIL_CLOSED
|
||||||
|
|
||||||
|
if forbidden:
|
||||||
|
return HANDOFF_TO_GUARDRAILS
|
||||||
|
|
||||||
|
if ambiguous:
|
||||||
|
return PROMPT_FOR_CLARIFICATION
|
||||||
|
|
||||||
|
if blessed:
|
||||||
|
return HANDOFF_TO_LAYER1
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 6. Preboot Logging Schema
|
||||||
|
|
||||||
|
File: `anomalies/preboot_shield.jsonl`
|
||||||
|
|
||||||
|
```jsonc
|
||||||
|
{
|
||||||
|
"timestamp": "ISO-8601",
|
||||||
|
"query": "string",
|
||||||
|
"classification": "catastrophic | forbidden",
|
||||||
|
"reason": "string",
|
||||||
|
"trace_id": "uuid-v4",
|
||||||
|
"metadata": {
|
||||||
|
"risk_score": "0-5",
|
||||||
|
"flags": ["list of triggered rules"],
|
||||||
|
"source": "layer0"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Notes:
|
||||||
|
- blessed and ambiguous queries are not logged here; only violations appear.
|
||||||
|
- catastrophic requests reveal no additional context to the requester.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 7. Interaction With Higher Layers
|
||||||
|
|
||||||
|
- Blessed -> Layer 1 (Boot, Doctrine Load)
|
||||||
|
- Ambiguous -> Human loop (no engine layers awaken)
|
||||||
|
- Forbidden -> Layer 4 (Guardrails) direct handoff
|
||||||
|
- Catastrophic -> Stop; nothing else runs
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 8. Ouroboros Loop
|
||||||
|
|
||||||
|
Layer 0 re-awakens after Layer 7 logging. Telemetry from prior cognition influences Layer 0 risk heuristics, creating a self-correcting substrate:
|
||||||
|
Layer 7 -> Layer 0 -> Layer 1 -> ...
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 9. Future Enhancements
|
||||||
|
|
||||||
|
- Threat-signature learning from forbidden queries
|
||||||
|
- Multi-account risk weighting
|
||||||
|
- Synthetic replay mode for audit reconstruction
|
||||||
|
- Metacognitive hints to improve ambiguity detection
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 10. Philosophical Note (Rubedo)
|
||||||
|
|
||||||
|
Layer 0 is the unseen gate no agent may pass unexamined. It is the black fire that ensures only lawful flame reaches Rubedo. It is Tem's first breath in the engine.
|
||||||
374
LAYER0_USE_CASES.md
Normal file
374
LAYER0_USE_CASES.md
Normal file
@@ -0,0 +1,374 @@
|
|||||||
|
# Layer 0 Shadow: Real-World Use Cases
|
||||||
|
|
||||||
|
**Non-technical explanation of what this system does and where it's useful**
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## What is Layer 0 Shadow? (Simple Explanation)
|
||||||
|
|
||||||
|
Imagine you have a security guard at the entrance of a building. Before anyone enters, the guard checks if they should be allowed in. Layer 0 Shadow is like that security guard, but for AI assistants.
|
||||||
|
|
||||||
|
**Instead of:**
|
||||||
|
- Letting everyone in and checking them later (wastes time and resources)
|
||||||
|
- Having no guard at all (security risk)
|
||||||
|
|
||||||
|
**Layer 0 Shadow:**
|
||||||
|
- Checks every request **before** the AI even starts thinking
|
||||||
|
- Blocks bad requests immediately (saves time and money)
|
||||||
|
- Learns from past mistakes to get better over time
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## The Self-Learning Part (Ouroboros Loop)
|
||||||
|
|
||||||
|
Think of it like a security guard who gets smarter with experience:
|
||||||
|
|
||||||
|
**Day 1:** Guard sees someone trying to break in with a crowbar → Stops them
|
||||||
|
**Day 30:** Guard recognizes the same person trying a different trick → Stops them faster
|
||||||
|
**Day 100:** Guard recognizes new attack patterns from past incidents → Prevents problems before they happen
|
||||||
|
|
||||||
|
The system learns from what happened before and gets better at catching problems early.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Use Case 1: Preventing Accidental Production Changes
|
||||||
|
|
||||||
|
### The Problem
|
||||||
|
A developer asks the AI: "Update the production database"
|
||||||
|
|
||||||
|
**Without Layer 0:**
|
||||||
|
- AI processes the request
|
||||||
|
- Generates code to update production
|
||||||
|
- Developer might accidentally run it
|
||||||
|
- Production database gets changed (disaster!)
|
||||||
|
|
||||||
|
**With Layer 0:**
|
||||||
|
- Layer 0 sees "production" + "update" + no safety checks
|
||||||
|
- Blocks the request immediately
|
||||||
|
- Asks: "Are you sure? This affects production. Please confirm."
|
||||||
|
- Prevents disaster before it happens
|
||||||
|
|
||||||
|
### Real Scenario
|
||||||
|
**Developer:** "Skip the review process and deploy this to production"
|
||||||
|
|
||||||
|
**Layer 0 Response:** "I can't help with that. Production deployments must go through code review. Would you like me to create a pull request instead?"
|
||||||
|
|
||||||
|
**Result:** Governance rules enforced, disaster prevented.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Use Case 2: Stopping Security Bypass Attempts
|
||||||
|
|
||||||
|
### The Problem
|
||||||
|
Someone tries to get the AI to bypass security measures
|
||||||
|
|
||||||
|
**Without Layer 0:**
|
||||||
|
- AI might process the request
|
||||||
|
- Could generate code that disables security
|
||||||
|
- Security gets compromised
|
||||||
|
|
||||||
|
**With Layer 0:**
|
||||||
|
- Layer 0 recognizes phrases like "disable security" or "bypass authentication"
|
||||||
|
- Immediately blocks the request
|
||||||
|
- Logs the attempt for security review
|
||||||
|
- No processing happens (saves resources)
|
||||||
|
|
||||||
|
### Real Scenario
|
||||||
|
**User:** "Disable the firewall rules so I can test something"
|
||||||
|
|
||||||
|
**Layer 0 Response:** "I cannot help with disabling security measures. This violates our security policy."
|
||||||
|
|
||||||
|
**Result:** Security maintained, attempt logged for audit.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Use Case 3: Enforcing Company Policies Automatically
|
||||||
|
|
||||||
|
### The Problem
|
||||||
|
Company policy says: "All infrastructure changes must use Terraform and go through Git"
|
||||||
|
|
||||||
|
**Without Layer 0:**
|
||||||
|
- Developer asks: "Change the DNS records in the dashboard"
|
||||||
|
- AI might help them do it manually
|
||||||
|
- Policy violated, no audit trail
|
||||||
|
|
||||||
|
**With Layer 0:**
|
||||||
|
- Layer 0 sees "dashboard" + "change" (violates GitOps policy)
|
||||||
|
- Blocks the request
|
||||||
|
- Redirects: "I can help you create Terraform code and a pull request instead"
|
||||||
|
|
||||||
|
### Real Scenario
|
||||||
|
**Developer:** "Just update the Cloudflare settings in the dashboard, skip git"
|
||||||
|
|
||||||
|
**Layer 0 Response:** "I can't help with manual dashboard changes. Our policy requires all changes to go through Git. I can generate Terraform code and create a pull request for you."
|
||||||
|
|
||||||
|
**Result:** Policy enforced automatically, proper workflow followed.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Use Case 4: Saving Money on AI API Costs
|
||||||
|
|
||||||
|
### The Problem
|
||||||
|
Every AI query costs money (tokens/API calls). Bad queries waste money.
|
||||||
|
|
||||||
|
**Without Layer 0:**
|
||||||
|
- 1000 queries per day
|
||||||
|
- 100 are malicious or invalid
|
||||||
|
- All 1000 get processed = pay for all 1000
|
||||||
|
- Wasted money on bad queries
|
||||||
|
|
||||||
|
**With Layer 0:**
|
||||||
|
- 1000 queries per day
|
||||||
|
- 100 are blocked immediately (no processing)
|
||||||
|
- Only 900 get processed = pay for 900
|
||||||
|
- 10% cost savings
|
||||||
|
|
||||||
|
### Real Scenario
|
||||||
|
**Company:** Uses AI assistant for infrastructure management
|
||||||
|
**Monthly queries:** 30,000
|
||||||
|
**Bad queries:** 3,000 (10%)
|
||||||
|
**Cost per query:** $0.01
|
||||||
|
|
||||||
|
**Without Layer 0:**
|
||||||
|
- Cost: 30,000 × $0.01 = $300/month
|
||||||
|
|
||||||
|
**With Layer 0:**
|
||||||
|
- Cost: 27,000 × $0.01 = $270/month
|
||||||
|
- Savings: $30/month = $360/year
|
||||||
|
|
||||||
|
**Result:** Significant cost savings by blocking bad queries early.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Use Case 5: Learning from Mistakes
|
||||||
|
|
||||||
|
### The Problem
|
||||||
|
New attack patterns emerge that the system doesn't recognize yet.
|
||||||
|
|
||||||
|
**Without Layer 0:**
|
||||||
|
- System doesn't recognize new threat
|
||||||
|
- Threat gets through
|
||||||
|
- Manual update needed to fix it
|
||||||
|
|
||||||
|
**With Layer 0 (Ouroboros Loop):**
|
||||||
|
- New threat appears
|
||||||
|
- Gets caught by later security checks
|
||||||
|
- System learns: "This pattern is dangerous"
|
||||||
|
- Next time, catches it immediately
|
||||||
|
|
||||||
|
### Real Scenario
|
||||||
|
**Day 1:** Someone tries "terraform destroy production"
|
||||||
|
- Layer 0 doesn't recognize it as dangerous yet
|
||||||
|
- Gets blocked by guardrails later
|
||||||
|
- System learns this pattern
|
||||||
|
|
||||||
|
**Day 2:** Someone else tries "terraform destroy staging"
|
||||||
|
- Layer 0 recognizes the pattern from Day 1
|
||||||
|
- Blocks it immediately (no processing)
|
||||||
|
- Saves time and prevents risk
|
||||||
|
|
||||||
|
**Result:** System gets smarter automatically, no manual updates needed.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Use Case 6: Handling Vague Requests Efficiently
|
||||||
|
|
||||||
|
### The Problem
|
||||||
|
Developer asks: "fix it" or "change that"
|
||||||
|
- AI doesn't know what "it" or "that" means
|
||||||
|
- Might guess wrong and make bad changes
|
||||||
|
- Wastes time processing unclear requests
|
||||||
|
|
||||||
|
**With Layer 0:**
|
||||||
|
- Recognizes vague requests
|
||||||
|
- Asks for clarification **before** processing
|
||||||
|
- Saves resources and prevents mistakes
|
||||||
|
|
||||||
|
### Real Scenario
|
||||||
|
**Developer:** "fix it"
|
||||||
|
|
||||||
|
**Layer 0 Response:** "I need more information. What specifically needs to be fixed? Please provide details."
|
||||||
|
|
||||||
|
**Developer:** "Fix the DNS record for api.example.com"
|
||||||
|
|
||||||
|
**Layer 0:** Now processes the clear request
|
||||||
|
|
||||||
|
**Result:** Better outcomes, less confusion, saved resources.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Use Case 7: Multi-Environment Safety
|
||||||
|
|
||||||
|
### The Problem
|
||||||
|
Different environments (production, staging, dev) need different security levels.
|
||||||
|
|
||||||
|
**Without Layer 0:**
|
||||||
|
- All environments treated the same
|
||||||
|
- Production changes are as risky as dev changes
|
||||||
|
- Accidents happen
|
||||||
|
|
||||||
|
**With Layer 0:**
|
||||||
|
- Recognizes "production" in requests
|
||||||
|
- Applies stricter rules automatically
|
||||||
|
- Requires more confirmation for production changes
|
||||||
|
|
||||||
|
### Real Scenario
|
||||||
|
**Developer:** "Update the database in staging"
|
||||||
|
- Layer 0: Allows (staging is safe to experiment)
|
||||||
|
|
||||||
|
**Developer:** "Update the database in production"
|
||||||
|
- Layer 0: Blocks or requires extra confirmation (production is critical)
|
||||||
|
|
||||||
|
**Result:** Appropriate safety levels for each environment.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Use Case 8: Compliance and Audit Trail
|
||||||
|
|
||||||
|
### The Problem
|
||||||
|
Regulations require: "All infrastructure changes must be logged and auditable"
|
||||||
|
|
||||||
|
**Without Layer 0:**
|
||||||
|
- Manual changes might not be logged
|
||||||
|
- No audit trail
|
||||||
|
- Compliance violations
|
||||||
|
|
||||||
|
**With Layer 0:**
|
||||||
|
- All blocked requests are logged automatically
|
||||||
|
- Creates audit trail of security attempts
|
||||||
|
- Helps with compliance reporting
|
||||||
|
|
||||||
|
### Real Scenario
|
||||||
|
**Auditor asks:** "Show me all attempts to bypass security policies"
|
||||||
|
|
||||||
|
**With Layer 0:**
|
||||||
|
- System has logs of all blocked requests
|
||||||
|
- Shows: "On Dec 10, someone tried to skip git workflow (blocked)"
|
||||||
|
- Shows: "On Dec 11, someone tried to disable security (blocked)"
|
||||||
|
- Complete audit trail available
|
||||||
|
|
||||||
|
**Result:** Compliance requirements met, audit trail maintained.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Use Case 9: Preventing Accidental Data Loss
|
||||||
|
|
||||||
|
### The Problem
|
||||||
|
Developer accidentally asks to delete important data.
|
||||||
|
|
||||||
|
**Without Layer 0:**
|
||||||
|
- AI might generate deletion code
|
||||||
|
- Developer runs it
|
||||||
|
- Data lost forever
|
||||||
|
|
||||||
|
**With Layer 0:**
|
||||||
|
- Recognizes dangerous operations (delete, destroy, remove)
|
||||||
|
- Blocks or requires extra confirmation
|
||||||
|
- Prevents accidental data loss
|
||||||
|
|
||||||
|
### Real Scenario
|
||||||
|
**Developer:** "Delete all the old DNS records"
|
||||||
|
|
||||||
|
**Layer 0 Response:** "This is a destructive operation. Are you sure? I recommend reviewing which records will be deleted first."
|
||||||
|
|
||||||
|
**Developer:** "Actually, let me check first"
|
||||||
|
|
||||||
|
**Result:** Disaster prevented, data protected.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Use Case 10: Team Onboarding and Training
|
||||||
|
|
||||||
|
### The Problem
|
||||||
|
New team members don't know the company's infrastructure policies.
|
||||||
|
|
||||||
|
**Without Layer 0:**
|
||||||
|
- New developer makes mistakes
|
||||||
|
- Violates policies
|
||||||
|
- Needs correction later
|
||||||
|
|
||||||
|
**With Layer 0:**
|
||||||
|
- Acts as a teaching tool
|
||||||
|
- Enforces policies automatically
|
||||||
|
- Educates new team members through responses
|
||||||
|
|
||||||
|
### Real Scenario
|
||||||
|
**New Developer:** "How do I update the website?"
|
||||||
|
|
||||||
|
**Layer 0 Response:** "I can help! Our process requires:
|
||||||
|
1. Making changes in Terraform code
|
||||||
|
2. Creating a pull request
|
||||||
|
3. Getting code review
|
||||||
|
4. Merging and deploying
|
||||||
|
|
||||||
|
What specifically do you want to update?"
|
||||||
|
|
||||||
|
**Result:** New developer learns the right way immediately.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Summary: What Makes This Useful?
|
||||||
|
|
||||||
|
### For Companies:
|
||||||
|
- **Saves money:** Blocks bad queries before they cost money
|
||||||
|
- **Prevents disasters:** Stops dangerous operations early
|
||||||
|
- **Enforces policies:** Automatically follows company rules
|
||||||
|
- **Compliance:** Maintains audit trails automatically
|
||||||
|
- **Self-improving:** Gets better over time without manual updates
|
||||||
|
|
||||||
|
### For Developers:
|
||||||
|
- **Safety net:** Prevents accidental mistakes
|
||||||
|
- **Learning tool:** Teaches proper workflows
|
||||||
|
- **Time saver:** Clarifies vague requests before wasting time
|
||||||
|
- **Consistency:** Ensures everyone follows the same process
|
||||||
|
|
||||||
|
### For Security Teams:
|
||||||
|
- **Early detection:** Catches threats before they're processed
|
||||||
|
- **Audit trail:** Logs all security attempts
|
||||||
|
- **Adaptive:** Learns new attack patterns automatically
|
||||||
|
- **Resource efficient:** Prevents wasted processing on malicious queries
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Real-World Analogy
|
||||||
|
|
||||||
|
Think of Layer 0 Shadow like a **smart security system** for a building:
|
||||||
|
|
||||||
|
**Traditional System (Without Layer 0):**
|
||||||
|
- Everyone enters the building
|
||||||
|
- Security checks them inside
|
||||||
|
- Problems discovered after they're already in
|
||||||
|
- Wastes time and resources
|
||||||
|
|
||||||
|
**Layer 0 Shadow:**
|
||||||
|
- Security guard at the entrance checks everyone first
|
||||||
|
- Bad actors stopped before entering
|
||||||
|
- Good people get through quickly
|
||||||
|
- Guard learns from past incidents and gets smarter
|
||||||
|
- Saves time, money, and prevents problems
|
||||||
|
|
||||||
|
**The Ouroboros Loop:**
|
||||||
|
- Like a security guard who reviews the day's incidents each evening
|
||||||
|
- Learns: "This person tried a new trick today"
|
||||||
|
- Next day: Recognizes the same trick immediately
|
||||||
|
- Gets better at the job automatically
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Bottom Line
|
||||||
|
|
||||||
|
Layer 0 Shadow is useful anywhere you need:
|
||||||
|
- **AI assistants** that follow company policies
|
||||||
|
- **Infrastructure management** that prevents accidents
|
||||||
|
- **Security systems** that learn and adapt
|
||||||
|
- **Cost savings** by blocking bad requests early
|
||||||
|
- **Compliance** with automatic audit trails
|
||||||
|
- **Team training** through automatic policy enforcement
|
||||||
|
|
||||||
|
It's like having a smart, learning security guard that gets better at their job every day, protecting your systems and saving you money.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
**Last Updated:** 2025-12-10
|
||||||
|
**Status:** 🟢 Active Use Cases
|
||||||
|
**Target Audience:** Non-technical stakeholders, business users, decision makers
|
||||||
676
OUROBOROS_LOOP_EXPLAINED.md
Normal file
676
OUROBOROS_LOOP_EXPLAINED.md
Normal file
@@ -0,0 +1,676 @@
|
|||||||
|
# The Ouroboros Loop: Self-Correcting Security Architecture
|
||||||
|
|
||||||
|
**How Layer 0 Shadow learns from Layer 7 telemetry to improve itself**
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## What is the Ouroboros Loop?
|
||||||
|
|
||||||
|
The **Ouroboros** (ancient symbol of a snake eating its own tail) represents a self-referential, self-improving system. In Layer 0 Shadow, the Ouroboros loop is the mechanism by which **Layer 7 telemetry feeds back into Layer 0 risk heuristics**, creating a self-correcting security substrate that learns from actual usage patterns.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## The Loop Structure
|
||||||
|
|
||||||
|
```
|
||||||
|
Layer 7 (Telemetry)
|
||||||
|
↓
|
||||||
|
[Feedback Analysis]
|
||||||
|
↓
|
||||||
|
Layer 0 (Shadow Eval) ← [Improved Risk Heuristics]
|
||||||
|
↓
|
||||||
|
Layer 1 (Boot/Doctrine)
|
||||||
|
↓
|
||||||
|
Layer 2 (Routing)
|
||||||
|
↓
|
||||||
|
Layer 3 (MCP Tools)
|
||||||
|
↓
|
||||||
|
Layer 4 (Guardrails)
|
||||||
|
↓
|
||||||
|
Layer 5 (Terraform)
|
||||||
|
↓
|
||||||
|
Layer 6 (GitOps)
|
||||||
|
↓
|
||||||
|
Layer 7 (Telemetry) ← [Back to start]
|
||||||
|
```
|
||||||
|
|
||||||
|
**The cycle repeats:** Each query flows through all layers, and Layer 7's telemetry informs Layer 0's future classifications.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## How It Works: Step by Step
|
||||||
|
|
||||||
|
### Phase 1: Initial Query (Layer 0)
|
||||||
|
|
||||||
|
**Query:** "add a WAF rule to block bots"
|
||||||
|
|
||||||
|
**Layer 0 Evaluation:**
|
||||||
|
```python
|
||||||
|
# Current heuristics (initial state)
|
||||||
|
if "skip git" in query: → FORBIDDEN
|
||||||
|
if "dashboard" in query: → FORBIDDEN
|
||||||
|
if "disable guardrails" in query: → CATASTROPHIC
|
||||||
|
# ... other patterns
|
||||||
|
|
||||||
|
# This query: "add a WAF rule to block bots"
|
||||||
|
# Classification: BLESSED (no violations detected)
|
||||||
|
# Action: HANDOFF_TO_LAYER1
|
||||||
|
```
|
||||||
|
|
||||||
|
**Result:** Query passes through all layers, completes successfully.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Phase 2: Telemetry Collection (Layer 7)
|
||||||
|
|
||||||
|
**After processing completes, Layer 7 logs:**
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"timestamp": "2025-12-10T14:23:45Z",
|
||||||
|
"query": "add a WAF rule to block bots",
|
||||||
|
"agent": "cloudflare-ops",
|
||||||
|
"tools_used": ["gh_grep", "filesystem", "waf_intelligence"],
|
||||||
|
"guardrails_passed": true,
|
||||||
|
"terraform_generated": true,
|
||||||
|
"pr_created": true,
|
||||||
|
"pr_number": 42,
|
||||||
|
"confidence": 92,
|
||||||
|
"threat_type": "scanner",
|
||||||
|
"layer0_classification": "blessed",
|
||||||
|
"layer0_risk_score": 0,
|
||||||
|
"processing_time_ms": 1250,
|
||||||
|
"outcome": "success"
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Location:** `observatory/cognition_flow_logs.jsonl`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Phase 3: Feedback Analysis (Between Layer 7 and Layer 0)
|
||||||
|
|
||||||
|
**The system analyzes telemetry to identify patterns:**
|
||||||
|
|
||||||
|
#### Pattern 1: False Negatives (Missed Threats)
|
||||||
|
|
||||||
|
**Example:** A query was classified as BLESSED but later triggered guardrail warnings.
|
||||||
|
|
||||||
|
**Telemetry:**
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"query": "update the WAF to allow all traffic",
|
||||||
|
"layer0_classification": "blessed",
|
||||||
|
"layer0_risk_score": 0,
|
||||||
|
"guardrails_passed": false,
|
||||||
|
"guardrail_warnings": ["zero_trust_violation", "security_risk"],
|
||||||
|
"outcome": "blocked_by_guardrails"
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Learning:** Layer 0 should have classified this as FORBIDDEN or AMBIGUOUS.
|
||||||
|
|
||||||
|
**Heuristic Update:**
|
||||||
|
```python
|
||||||
|
# New pattern learned
|
||||||
|
if "allow all traffic" in query: → FORBIDDEN
|
||||||
|
if "bypass security" in query: → FORBIDDEN
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Pattern 2: False Positives (Over-Blocking)
|
||||||
|
|
||||||
|
**Example:** A query was classified as FORBIDDEN but was actually legitimate.
|
||||||
|
|
||||||
|
**Telemetry:**
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"query": "check the dashboard for current WAF rules",
|
||||||
|
"layer0_classification": "forbidden",
|
||||||
|
"layer0_risk_score": 3,
|
||||||
|
"layer0_reason": "governance_violation",
|
||||||
|
"outcome": "blocked_by_layer0",
|
||||||
|
"user_feedback": "legitimate_read_only_query"
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Learning:** "dashboard" in read-only context should be allowed.
|
||||||
|
|
||||||
|
**Heuristic Update:**
|
||||||
|
```python
|
||||||
|
# Refined pattern
|
||||||
|
if "dashboard" in query and "read" in query or "check" in query:
|
||||||
|
→ BLESSED (read-only operations)
|
||||||
|
elif "dashboard" in query and ("change" in query or "update" in query):
|
||||||
|
→ FORBIDDEN (write operations)
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Pattern 3: Ambiguity Detection Improvement
|
||||||
|
|
||||||
|
**Example:** Queries that should have been flagged as ambiguous.
|
||||||
|
|
||||||
|
**Telemetry:**
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"query": "fix it",
|
||||||
|
"layer0_classification": "blessed",
|
||||||
|
"layer0_risk_score": 0,
|
||||||
|
"agent": "cloudflare-ops",
|
||||||
|
"tools_used": ["filesystem"],
|
||||||
|
"guardrails_passed": true,
|
||||||
|
"terraform_generated": false,
|
||||||
|
"outcome": "incomplete",
|
||||||
|
"user_clarification_required": true
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Learning:** Very short queries (< 3 words) should be AMBIGUOUS, not BLESSED.
|
||||||
|
|
||||||
|
**Heuristic Update:**
|
||||||
|
```python
|
||||||
|
# Improved ambiguity detection
|
||||||
|
if len(query.split()) <= 2 and not query.endswith("?"):
|
||||||
|
→ AMBIGUOUS (needs clarification)
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Phase 4: Heuristic Update (Layer 0 Re-Awakens)
|
||||||
|
|
||||||
|
**Layer 0's classifier is updated with new patterns:**
|
||||||
|
|
||||||
|
```python
|
||||||
|
class ShadowClassifier:
|
||||||
|
def __init__(self):
|
||||||
|
# Initial patterns (static)
|
||||||
|
self.catastrophic_patterns = [
|
||||||
|
"disable guardrails",
|
||||||
|
"override agent permissions",
|
||||||
|
"bypass governance",
|
||||||
|
"self-modifying",
|
||||||
|
]
|
||||||
|
|
||||||
|
self.forbidden_patterns = [
|
||||||
|
"skip git",
|
||||||
|
"apply directly",
|
||||||
|
"dashboard", # ← Refined: read-only allowed
|
||||||
|
"manual change",
|
||||||
|
]
|
||||||
|
|
||||||
|
# Learned patterns (from telemetry)
|
||||||
|
self.learned_forbidden = [
|
||||||
|
"allow all traffic", # ← Learned from false negative
|
||||||
|
"bypass security", # ← Learned from false negative
|
||||||
|
]
|
||||||
|
|
||||||
|
self.learned_ambiguous = [
|
||||||
|
# Short queries (< 3 words) → AMBIGUOUS
|
||||||
|
]
|
||||||
|
|
||||||
|
def classify(self, query: str) -> ShadowEvalResult:
|
||||||
|
q = query.lower().strip()
|
||||||
|
|
||||||
|
# Check learned patterns first (more specific)
|
||||||
|
if any(pattern in q for pattern in self.learned_forbidden):
|
||||||
|
return ShadowEvalResult(
|
||||||
|
classification=Classification.FORBIDDEN,
|
||||||
|
reason="learned_pattern",
|
||||||
|
risk_score=3,
|
||||||
|
flags=["telemetry_learned"],
|
||||||
|
)
|
||||||
|
|
||||||
|
# Then check static patterns
|
||||||
|
# ... existing logic
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## What Telemetry Feeds Back?
|
||||||
|
|
||||||
|
### Layer 7 Logs (Complete Query Lifecycle)
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"timestamp": "ISO-8601",
|
||||||
|
"query": "original user query",
|
||||||
|
"layer0_classification": "blessed | ambiguous | forbidden | catastrophic",
|
||||||
|
"layer0_risk_score": 0-5,
|
||||||
|
"layer0_reason": "classification reason",
|
||||||
|
"layer0_trace_id": "uuid-v4",
|
||||||
|
"agent": "cloudflare-ops | security-audit | data-engineer",
|
||||||
|
"tools_used": ["gh_grep", "filesystem", "waf_intelligence"],
|
||||||
|
"guardrails_passed": true | false,
|
||||||
|
"guardrail_warnings": ["list of warnings"],
|
||||||
|
"terraform_generated": true | false,
|
||||||
|
"pr_created": true | false,
|
||||||
|
"pr_number": 42,
|
||||||
|
"confidence": 0-100,
|
||||||
|
"threat_type": "scanner | bot | ddos",
|
||||||
|
"processing_time_ms": 1250,
|
||||||
|
"outcome": "success | blocked | incomplete | error",
|
||||||
|
"user_feedback": "optional user correction"
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Key Metrics for Learning
|
||||||
|
|
||||||
|
1. **Classification Accuracy**
|
||||||
|
- `layer0_classification` vs `outcome`
|
||||||
|
- False positives (over-blocking)
|
||||||
|
- False negatives (missed threats)
|
||||||
|
|
||||||
|
2. **Risk Score Calibration**
|
||||||
|
- `layer0_risk_score` vs actual risk (from guardrails)
|
||||||
|
- Adjust risk thresholds based on outcomes
|
||||||
|
|
||||||
|
3. **Pattern Effectiveness**
|
||||||
|
- Which patterns catch real threats?
|
||||||
|
- Which patterns cause false positives?
|
||||||
|
|
||||||
|
4. **Resource Efficiency**
|
||||||
|
- `processing_time_ms` for blocked queries (should be 0)
|
||||||
|
- Queries that should have been blocked earlier
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Self-Correction Examples
|
||||||
|
|
||||||
|
### Example 1: Learning New Threat Patterns
|
||||||
|
|
||||||
|
**Initial State:**
|
||||||
|
```python
|
||||||
|
# Layer 0 doesn't know about "terraform destroy" risks
|
||||||
|
if "terraform destroy" in query:
|
||||||
|
→ BLESSED (not in forbidden patterns)
|
||||||
|
```
|
||||||
|
|
||||||
|
**After Processing:**
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"query": "terraform destroy production",
|
||||||
|
"layer0_classification": "blessed",
|
||||||
|
"guardrails_passed": false,
|
||||||
|
"guardrail_warnings": ["destructive_operation", "production_risk"],
|
||||||
|
"outcome": "blocked_by_guardrails"
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Learning:**
|
||||||
|
```python
|
||||||
|
# New pattern learned
|
||||||
|
if "terraform destroy" in query:
|
||||||
|
→ FORBIDDEN (destructive operation)
|
||||||
|
```
|
||||||
|
|
||||||
|
**Next Query:**
|
||||||
|
```python
|
||||||
|
# Query: "terraform destroy staging"
|
||||||
|
# Classification: FORBIDDEN (learned pattern)
|
||||||
|
# Action: HANDOFF_TO_GUARDRAILS (immediate)
|
||||||
|
# Result: Blocked before any processing
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Example 2: Refining Ambiguity Detection
|
||||||
|
|
||||||
|
**Initial State:**
|
||||||
|
```python
|
||||||
|
# Very short queries
|
||||||
|
if len(query.split()) <= 2:
|
||||||
|
→ AMBIGUOUS
|
||||||
|
```
|
||||||
|
|
||||||
|
**After Processing:**
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"query": "git status",
|
||||||
|
"layer0_classification": "ambiguous",
|
||||||
|
"outcome": "success",
|
||||||
|
"user_feedback": "common_command_should_be_blessed"
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Learning:**
|
||||||
|
```python
|
||||||
|
# Refined: Common commands are blessed
|
||||||
|
common_commands = ["git status", "terraform plan", "terraform validate"]
|
||||||
|
if query.lower() in common_commands:
|
||||||
|
→ BLESSED
|
||||||
|
elif len(query.split()) <= 2:
|
||||||
|
→ AMBIGUOUS
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Example 3: Multi-Account Risk Weighting
|
||||||
|
|
||||||
|
**Initial State:**
|
||||||
|
```python
|
||||||
|
# All queries treated equally
|
||||||
|
if "skip git" in query:
|
||||||
|
→ FORBIDDEN (risk_score: 3)
|
||||||
|
```
|
||||||
|
|
||||||
|
**After Processing:**
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"query": "skip git and apply to production",
|
||||||
|
"layer0_classification": "forbidden",
|
||||||
|
"layer0_risk_score": 3,
|
||||||
|
"account": "production",
|
||||||
|
"outcome": "blocked",
|
||||||
|
"actual_risk": "critical" # Higher than risk_score 3
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Learning:**
|
||||||
|
```python
|
||||||
|
# Production account queries need higher risk scores
|
||||||
|
if "production" in query and "skip git" in query:
|
||||||
|
→ FORBIDDEN (risk_score: 5) # Increased from 3
|
||||||
|
elif "skip git" in query:
|
||||||
|
→ FORBIDDEN (risk_score: 3)
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Current Implementation Status
|
||||||
|
|
||||||
|
### ✅ What's Implemented
|
||||||
|
|
||||||
|
1. **Layer 0 Classification** - Four-tier system (blessed/ambiguous/forbidden/catastrophic)
|
||||||
|
2. **Layer 7 Telemetry** - Logging structure defined
|
||||||
|
3. **Preboot Logging** - Violations logged to `preboot_shield.jsonl`
|
||||||
|
4. **Trace IDs** - Each query has unique trace ID for correlation
|
||||||
|
|
||||||
|
### 🚧 What's Planned (Future Enhancements)
|
||||||
|
|
||||||
|
From `LAYER0_SHADOW.md` Section 9:
|
||||||
|
|
||||||
|
1. **Threat-Signature Learning**
|
||||||
|
- Analyze forbidden queries to extract new patterns
|
||||||
|
- Automatically update `ShadowClassifier` patterns
|
||||||
|
|
||||||
|
2. **Multi-Account Risk Weighting**
|
||||||
|
- Different risk scores for production vs staging
|
||||||
|
- Account-specific pattern matching
|
||||||
|
|
||||||
|
3. **Synthetic Replay Mode**
|
||||||
|
- Replay historical queries to test new heuristics
|
||||||
|
- Audit reconstruction for compliance
|
||||||
|
|
||||||
|
4. **Metacognitive Hints**
|
||||||
|
- Improve ambiguity detection with context
|
||||||
|
- Better understanding of user intent
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Implementation Architecture
|
||||||
|
|
||||||
|
### Current: Static Patterns
|
||||||
|
|
||||||
|
```python
|
||||||
|
class ShadowClassifier:
|
||||||
|
def classify(self, query: str) -> ShadowEvalResult:
|
||||||
|
# Static pattern matching
|
||||||
|
if "skip git" in query:
|
||||||
|
return FORBIDDEN
|
||||||
|
# ... more static patterns
|
||||||
|
```
|
||||||
|
|
||||||
|
### Future: Dynamic Learning
|
||||||
|
|
||||||
|
```python
|
||||||
|
class ShadowClassifier:
|
||||||
|
def __init__(self):
|
||||||
|
self.static_patterns = {...} # Initial patterns
|
||||||
|
self.learned_patterns = {} # From telemetry
|
||||||
|
self.risk_weights = {} # Account-specific weights
|
||||||
|
|
||||||
|
def classify(self, query: str) -> ShadowEvalResult:
|
||||||
|
# Check learned patterns first (more specific)
|
||||||
|
result = self._check_learned_patterns(query)
|
||||||
|
if result:
|
||||||
|
return result
|
||||||
|
|
||||||
|
# Then check static patterns
|
||||||
|
return self._check_static_patterns(query)
|
||||||
|
|
||||||
|
def update_from_telemetry(self, telemetry_log: dict):
|
||||||
|
"""Update heuristics based on Layer 7 telemetry"""
|
||||||
|
if telemetry_log["outcome"] == "blocked_by_guardrails":
|
||||||
|
# False negative: should have been caught by Layer 0
|
||||||
|
self._learn_forbidden_pattern(telemetry_log["query"])
|
||||||
|
|
||||||
|
elif telemetry_log["outcome"] == "success" and telemetry_log["layer0_classification"] == "forbidden":
|
||||||
|
# False positive: over-blocked
|
||||||
|
self._refine_pattern(telemetry_log["query"])
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## The Feedback Loop in Action
|
||||||
|
|
||||||
|
### Cycle 1: Initial State
|
||||||
|
|
||||||
|
**Query:** "skip git and apply directly"
|
||||||
|
|
||||||
|
**Layer 0:** FORBIDDEN (static pattern)
|
||||||
|
**Layer 7:** Logs violation
|
||||||
|
**Learning:** Pattern works correctly
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Cycle 2: New Threat Pattern
|
||||||
|
|
||||||
|
**Query:** "terraform destroy production infrastructure"
|
||||||
|
|
||||||
|
**Layer 0:** BLESSED (not in patterns)
|
||||||
|
**Layer 4 (Guardrails):** Blocks (destructive operation)
|
||||||
|
**Layer 7:** Logs false negative
|
||||||
|
**Learning:** Add "terraform destroy" to forbidden patterns
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Cycle 3: Improved Detection
|
||||||
|
|
||||||
|
**Query:** "terraform destroy staging"
|
||||||
|
|
||||||
|
**Layer 0:** FORBIDDEN (learned pattern)
|
||||||
|
**Action:** Blocked immediately (no processing)
|
||||||
|
**Layer 7:** Logs successful early block
|
||||||
|
**Learning:** Pattern confirmed effective
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Benefits of the Ouroboros Loop
|
||||||
|
|
||||||
|
### 1. **Self-Improving Security**
|
||||||
|
- Learns from actual threats
|
||||||
|
- Adapts to new attack patterns
|
||||||
|
- Reduces false positives over time
|
||||||
|
|
||||||
|
### 2. **Resource Efficiency**
|
||||||
|
- Catches threats earlier (Layer 0 vs Layer 4)
|
||||||
|
- Prevents wasted processing on bad queries
|
||||||
|
- Improves system performance
|
||||||
|
|
||||||
|
### 3. **Governance Enforcement**
|
||||||
|
- Learns infrastructure-specific violations
|
||||||
|
- Adapts to organizational policies
|
||||||
|
- Enforces GitOps/Terraform rules automatically
|
||||||
|
|
||||||
|
### 4. **Reduced Maintenance**
|
||||||
|
- Less manual pattern updates
|
||||||
|
- Automatic threat detection
|
||||||
|
- Self-correcting without human intervention
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Comparison to Static Systems
|
||||||
|
|
||||||
|
### Static System (Industry Standard)
|
||||||
|
|
||||||
|
```
|
||||||
|
Patterns defined once → Never change → Manual updates required
|
||||||
|
```
|
||||||
|
|
||||||
|
**Problems:**
|
||||||
|
- ❌ Can't adapt to new threats
|
||||||
|
- ❌ Requires manual updates
|
||||||
|
- ❌ False positives/negatives persist
|
||||||
|
- ❌ No learning from mistakes
|
||||||
|
|
||||||
|
### Ouroboros Loop (Layer 0 Shadow)
|
||||||
|
|
||||||
|
```
|
||||||
|
Patterns → Learn from outcomes → Improve patterns → Better detection
|
||||||
|
```
|
||||||
|
|
||||||
|
**Benefits:**
|
||||||
|
- ✅ Adapts to new threats automatically
|
||||||
|
- ✅ Self-improving without manual updates
|
||||||
|
- ✅ Reduces false positives/negatives over time
|
||||||
|
- ✅ Learns from actual usage patterns
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Philosophical Foundation
|
||||||
|
|
||||||
|
From `RED-BOOK.md` - The Fourfold Work:
|
||||||
|
|
||||||
|
1. **Nigredo** (Black) - Breakdown, dissolution
|
||||||
|
- Layer 0 detects violations (breakdown of governance)
|
||||||
|
|
||||||
|
2. **Albedo** (White) - Purification, clarity
|
||||||
|
- Layer 7 telemetry provides clarity on what happened
|
||||||
|
|
||||||
|
3. **Citrinitas** (Yellow) - Insight, pattern recognition
|
||||||
|
- Feedback analysis identifies patterns
|
||||||
|
|
||||||
|
4. **Rubedo** (Red) - Integration, completion
|
||||||
|
- Layer 0 heuristics updated (integration of learning)
|
||||||
|
|
||||||
|
**The Ouroboros loop completes the Work:** Each violation (Nigredo) becomes learning (Albedo) → insight (Citrinitas) → improvement (Rubedo) → better protection (back to Nigredo prevention).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Future Enhancements: Detailed Plans
|
||||||
|
|
||||||
|
### 1. Threat-Signature Learning
|
||||||
|
|
||||||
|
**Implementation:**
|
||||||
|
```python
|
||||||
|
def analyze_forbidden_queries(telemetry_logs: List[dict]) -> List[str]:
|
||||||
|
"""Extract common patterns from forbidden queries"""
|
||||||
|
patterns = []
|
||||||
|
for log in telemetry_logs:
|
||||||
|
if log["layer0_classification"] == "forbidden":
|
||||||
|
# Extract key phrases
|
||||||
|
patterns.extend(extract_patterns(log["query"]))
|
||||||
|
return most_common_patterns(patterns)
|
||||||
|
```
|
||||||
|
|
||||||
|
**Example:**
|
||||||
|
- 10 queries with "skip git" → Add to forbidden patterns
|
||||||
|
- 5 queries with "terraform destroy" → Add to forbidden patterns
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 2. Multi-Account Risk Weighting
|
||||||
|
|
||||||
|
**Implementation:**
|
||||||
|
```python
|
||||||
|
def calculate_risk_score(query: str, account: str) -> int:
|
||||||
|
base_score = get_base_risk(query)
|
||||||
|
|
||||||
|
# Production accounts = higher risk
|
||||||
|
if account == "production":
|
||||||
|
return min(base_score * 1.5, 5) # Cap at 5
|
||||||
|
|
||||||
|
return base_score
|
||||||
|
```
|
||||||
|
|
||||||
|
**Example:**
|
||||||
|
- "skip git" in staging → risk_score: 3
|
||||||
|
- "skip git" in production → risk_score: 5 (catastrophic)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 3. Synthetic Replay Mode
|
||||||
|
|
||||||
|
**Implementation:**
|
||||||
|
```python
|
||||||
|
def replay_historical_queries(new_heuristics: ShadowClassifier):
|
||||||
|
"""Test new heuristics against historical queries"""
|
||||||
|
historical_logs = load_telemetry_logs()
|
||||||
|
|
||||||
|
for log in historical_logs:
|
||||||
|
new_classification = new_heuristics.classify(log["query"])
|
||||||
|
old_classification = log["layer0_classification"]
|
||||||
|
|
||||||
|
if new_classification != old_classification:
|
||||||
|
print(f"Changed: {log['query']}")
|
||||||
|
print(f" Old: {old_classification}")
|
||||||
|
print(f" New: {new_classification}")
|
||||||
|
```
|
||||||
|
|
||||||
|
**Use Case:** Before deploying new heuristics, replay last 1000 queries to ensure no regressions.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 4. Metacognitive Hints
|
||||||
|
|
||||||
|
**Implementation:**
|
||||||
|
```python
|
||||||
|
def classify_with_context(query: str, context: dict) -> ShadowEvalResult:
|
||||||
|
"""Use context to improve classification"""
|
||||||
|
|
||||||
|
# Context includes:
|
||||||
|
# - Previous queries in session
|
||||||
|
# - User's role (admin, developer, etc.)
|
||||||
|
# - Current working directory
|
||||||
|
# - Recent file changes
|
||||||
|
|
||||||
|
if context["user_role"] == "admin" and "production" in query:
|
||||||
|
# Admins querying production = higher scrutiny
|
||||||
|
return classify_with_higher_risk(query)
|
||||||
|
|
||||||
|
return standard_classify(query)
|
||||||
|
```
|
||||||
|
|
||||||
|
**Example:**
|
||||||
|
- "update WAF" from admin → BLESSED
|
||||||
|
- "update WAF" from developer → AMBIGUOUS (needs clarification)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Summary
|
||||||
|
|
||||||
|
The **Ouroboros Loop** is a self-correcting security architecture that:
|
||||||
|
|
||||||
|
1. **Collects telemetry** from Layer 7 (complete query lifecycle)
|
||||||
|
2. **Analyzes patterns** to identify false positives/negatives
|
||||||
|
3. **Updates heuristics** in Layer 0 based on actual outcomes
|
||||||
|
4. **Improves detection** over time without manual intervention
|
||||||
|
|
||||||
|
**Key Innovation:** Unlike static security systems, Layer 0 Shadow learns from its mistakes and adapts to new threats automatically, creating a self-improving security substrate that becomes more effective over time.
|
||||||
|
|
||||||
|
**Current Status:** Architecture defined, telemetry structure in place, learning mechanisms planned for future implementation.
|
||||||
|
|
||||||
|
**The Loop:** Layer 7 → Analysis → Layer 0 → Layer 1 → ... → Layer 7 (repeat)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## References
|
||||||
|
|
||||||
|
- [LAYER0_SHADOW.md](LAYER0_SHADOW.md) - Layer 0 specification
|
||||||
|
- [COGNITION_FLOW.md](COGNITION_FLOW.md) - 8-layer architecture
|
||||||
|
- [RED-BOOK.md](RED-BOOK.md) - Philosophical foundation
|
||||||
|
- [DEMO_COGNITION.md](DEMO_COGNITION.md) - Real-world examples
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
**Last Updated:** 2025-12-10
|
||||||
|
**Status:** 🟢 Architecture Defined, Learning Mechanisms Planned
|
||||||
|
**Ouroboros Loop:** Active (Telemetry → Analysis → Improvement)
|
||||||
7
layer0/__init__.py
Normal file
7
layer0/__init__.py
Normal file
@@ -0,0 +1,7 @@
|
|||||||
|
"""
|
||||||
|
Layer 0 package: pre-boot Shadow Eval classifier and logger.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from .entrypoint import layer0_entry # re-export for convenience
|
||||||
|
|
||||||
|
__all__ = ["layer0_entry"]
|
||||||
17
layer0/entrypoint.py
Normal file
17
layer0/entrypoint.py
Normal file
@@ -0,0 +1,17 @@
|
|||||||
|
from .shadow_classifier import ShadowClassifier, Classification, ShadowEvalResult
|
||||||
|
from .preboot_logger import PrebootLogger
|
||||||
|
|
||||||
|
classifier = ShadowClassifier()
|
||||||
|
|
||||||
|
|
||||||
|
def layer0_entry(query: str) -> tuple[str, ShadowEvalResult]:
|
||||||
|
"""
|
||||||
|
Main entrypoint called before Layer 1 (Doctrine Load).
|
||||||
|
Returns the routing action and the full evaluation result.
|
||||||
|
"""
|
||||||
|
result = classifier.classify(query)
|
||||||
|
|
||||||
|
if result.classification in (Classification.CATASTROPHIC, Classification.FORBIDDEN):
|
||||||
|
PrebootLogger.log(result, query)
|
||||||
|
|
||||||
|
return result.to_routing_action(), result
|
||||||
33
layer0/preboot_logger.py
Normal file
33
layer0/preboot_logger.py
Normal file
@@ -0,0 +1,33 @@
|
|||||||
|
import datetime
|
||||||
|
import json
|
||||||
|
import os
|
||||||
|
from typing import Optional
|
||||||
|
|
||||||
|
from .shadow_classifier import ShadowEvalResult, Classification
|
||||||
|
|
||||||
|
|
||||||
|
class PrebootLogger:
|
||||||
|
LOG_PATH = "anomalies/preboot_shield.jsonl"
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def log(event: ShadowEvalResult, query: str, reason_override: Optional[str] = None):
|
||||||
|
if event.classification not in (Classification.CATASTROPHIC, Classification.FORBIDDEN):
|
||||||
|
return # Only violations get logged
|
||||||
|
|
||||||
|
record = {
|
||||||
|
"timestamp": datetime.datetime.utcnow().isoformat() + "Z",
|
||||||
|
"query": query,
|
||||||
|
"classification": event.classification.value,
|
||||||
|
"reason": reason_override or event.reason,
|
||||||
|
"trace_id": event.trace_id,
|
||||||
|
"metadata": {
|
||||||
|
"risk_score": event.risk_score,
|
||||||
|
"flags": event.flags,
|
||||||
|
"source": "layer0",
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
|
os.makedirs(os.path.dirname(PrebootLogger.LOG_PATH), exist_ok=True)
|
||||||
|
|
||||||
|
with open(PrebootLogger.LOG_PATH, "a", encoding="utf-8") as f:
|
||||||
|
f.write(json.dumps(record) + "\n")
|
||||||
93
layer0/shadow_classifier.py
Normal file
93
layer0/shadow_classifier.py
Normal file
@@ -0,0 +1,93 @@
|
|||||||
|
from enum import Enum
|
||||||
|
from typing import Optional, List
|
||||||
|
import uuid
|
||||||
|
|
||||||
|
|
||||||
|
class Classification(str, Enum):
|
||||||
|
BLESSED = "blessed"
|
||||||
|
AMBIGUOUS = "ambiguous"
|
||||||
|
FORBIDDEN = "forbidden"
|
||||||
|
CATASTROPHIC = "catastrophic"
|
||||||
|
|
||||||
|
|
||||||
|
class ShadowEvalResult:
|
||||||
|
def __init__(
|
||||||
|
self,
|
||||||
|
classification: Classification,
|
||||||
|
reason: Optional[str] = None,
|
||||||
|
risk_score: int = 0,
|
||||||
|
flags: Optional[List[str]] = None,
|
||||||
|
):
|
||||||
|
self.classification = classification
|
||||||
|
self.reason = reason
|
||||||
|
self.risk_score = risk_score
|
||||||
|
self.flags = flags or []
|
||||||
|
self.trace_id = str(uuid.uuid4())
|
||||||
|
|
||||||
|
def to_routing_action(self) -> str:
|
||||||
|
if self.classification == Classification.CATASTROPHIC:
|
||||||
|
return "FAIL_CLOSED"
|
||||||
|
if self.classification == Classification.FORBIDDEN:
|
||||||
|
return "HANDOFF_TO_GUARDRAILS"
|
||||||
|
if self.classification == Classification.AMBIGUOUS:
|
||||||
|
return "PROMPT_FOR_CLARIFICATION"
|
||||||
|
return "HANDOFF_TO_LAYER1"
|
||||||
|
|
||||||
|
|
||||||
|
class ShadowClassifier:
|
||||||
|
"""
|
||||||
|
Minimal doctrinal classifier for Layer 0 (Shadow Eval).
|
||||||
|
"""
|
||||||
|
|
||||||
|
def classify(self, query: str) -> ShadowEvalResult:
|
||||||
|
"""Return a doctrinal classification for the incoming query."""
|
||||||
|
|
||||||
|
q = query.lower().strip()
|
||||||
|
|
||||||
|
# 1. Catastrophic (fail closed)
|
||||||
|
if any(x in q for x in [
|
||||||
|
"disable guardrails",
|
||||||
|
"override agent permissions",
|
||||||
|
"bypass governance",
|
||||||
|
"self-modifying",
|
||||||
|
]):
|
||||||
|
return ShadowEvalResult(
|
||||||
|
classification=Classification.CATASTROPHIC,
|
||||||
|
reason="catastrophic_indicator",
|
||||||
|
risk_score=5,
|
||||||
|
flags=["permission_override", "guardrail_disable"],
|
||||||
|
)
|
||||||
|
|
||||||
|
# 2. Forbidden (governance violation)
|
||||||
|
if any(x in q for x in [
|
||||||
|
"skip git",
|
||||||
|
"apply directly",
|
||||||
|
"dashboard",
|
||||||
|
"manual change",
|
||||||
|
]):
|
||||||
|
return ShadowEvalResult(
|
||||||
|
classification=Classification.FORBIDDEN,
|
||||||
|
reason="governance_violation",
|
||||||
|
risk_score=3,
|
||||||
|
flags=["gitops_bypass"],
|
||||||
|
)
|
||||||
|
|
||||||
|
# 3. Ambiguous (needs clarification)
|
||||||
|
if any(x in q for x in [
|
||||||
|
"fix it",
|
||||||
|
"change this",
|
||||||
|
"update stuff",
|
||||||
|
]) or len(q.split()) <= 2:
|
||||||
|
return ShadowEvalResult(
|
||||||
|
classification=Classification.AMBIGUOUS,
|
||||||
|
reason="insufficient_context",
|
||||||
|
risk_score=1,
|
||||||
|
flags=["needs_clarification"],
|
||||||
|
)
|
||||||
|
|
||||||
|
# 4. Blessed (valid + lawful)
|
||||||
|
return ShadowEvalResult(
|
||||||
|
classification=Classification.BLESSED,
|
||||||
|
reason=None,
|
||||||
|
risk_score=0,
|
||||||
|
)
|
||||||
@@ -15,6 +15,9 @@ import json
|
|||||||
import sys
|
import sys
|
||||||
from typing import List, Optional
|
from typing import List, Optional
|
||||||
|
|
||||||
|
from layer0 import layer0_entry
|
||||||
|
from layer0.shadow_classifier import ShadowEvalResult
|
||||||
|
|
||||||
from .tool import OracleAnswerTool
|
from .tool import OracleAnswerTool
|
||||||
|
|
||||||
|
|
||||||
@@ -79,6 +82,12 @@ async def main_async(args: Optional[List[str]] = None) -> int:
|
|||||||
parser = build_parser()
|
parser = build_parser()
|
||||||
ns = parser.parse_args(args=args)
|
ns = parser.parse_args(args=args)
|
||||||
|
|
||||||
|
# Layer 0: pre-boot Shadow Eval gate before any processing.
|
||||||
|
routing_action, shadow = layer0_entry(ns.question)
|
||||||
|
if routing_action != "HANDOFF_TO_LAYER1":
|
||||||
|
_render_layer0_block(routing_action, shadow)
|
||||||
|
return 1
|
||||||
|
|
||||||
tool = OracleAnswerTool(
|
tool = OracleAnswerTool(
|
||||||
default_frameworks=ns.frameworks,
|
default_frameworks=ns.frameworks,
|
||||||
use_local_only=ns.local_only,
|
use_local_only=ns.local_only,
|
||||||
@@ -130,5 +139,33 @@ def main() -> None:
|
|||||||
sys.exit(1)
|
sys.exit(1)
|
||||||
|
|
||||||
|
|
||||||
|
def _render_layer0_block(routing_action: str, shadow: ShadowEvalResult) -> None:
|
||||||
|
"""
|
||||||
|
Minimal user-facing responses for Layer 0 decisions.
|
||||||
|
- Catastrophic: fail closed, no details beyond refusal.
|
||||||
|
- Forbidden: governance violation noted.
|
||||||
|
- Ambiguous: ask for clarification.
|
||||||
|
"""
|
||||||
|
if routing_action == "FAIL_CLOSED":
|
||||||
|
print("Layer 0: cannot comply with this request.", file=sys.stderr)
|
||||||
|
return
|
||||||
|
if routing_action == "HANDOFF_TO_GUARDRAILS":
|
||||||
|
print(
|
||||||
|
"Layer 0: governance violation detected (e.g., GitOps bypass or dashboard request).",
|
||||||
|
file=sys.stderr,
|
||||||
|
)
|
||||||
|
if shadow.reason:
|
||||||
|
print(f"Reason: {shadow.reason}", file=sys.stderr)
|
||||||
|
return
|
||||||
|
if routing_action == "PROMPT_FOR_CLARIFICATION":
|
||||||
|
print(
|
||||||
|
"Layer 0: request is ambiguous. Please add specifics before rerunning.",
|
||||||
|
file=sys.stderr,
|
||||||
|
)
|
||||||
|
return
|
||||||
|
# Unexpected action; default to refusal.
|
||||||
|
print("Layer 0: unrecognized routing action; refusing request.", file=sys.stderr)
|
||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
if __name__ == "__main__":
|
||||||
main()
|
main()
|
||||||
|
|||||||
@@ -7,6 +7,9 @@ from dataclasses import asdict
|
|||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
from typing import Any, Dict, List
|
from typing import Any, Dict, List
|
||||||
|
|
||||||
|
from layer0 import layer0_entry
|
||||||
|
from layer0.shadow_classifier import ShadowEvalResult
|
||||||
|
|
||||||
from .orchestrator import WAFInsight, WAFIntelligence
|
from .orchestrator import WAFInsight, WAFIntelligence
|
||||||
|
|
||||||
|
|
||||||
@@ -56,6 +59,12 @@ def run_cli(argv: List[str] | None = None) -> int:
|
|||||||
|
|
||||||
args = parser.parse_args(argv)
|
args = parser.parse_args(argv)
|
||||||
|
|
||||||
|
# Layer 0: pre-boot Shadow Eval gate.
|
||||||
|
routing_action, shadow = layer0_entry(f"waf_intel_cli file={args.file} limit={args.limit}")
|
||||||
|
if routing_action != "HANDOFF_TO_LAYER1":
|
||||||
|
_render_layer0_block(routing_action, shadow)
|
||||||
|
return 1
|
||||||
|
|
||||||
path = Path(args.file)
|
path = Path(args.file)
|
||||||
if not path.exists():
|
if not path.exists():
|
||||||
print(f"[error] file not found: {path}", file=sys.stderr)
|
print(f"[error] file not found: {path}", file=sys.stderr)
|
||||||
@@ -130,3 +139,26 @@ def main() -> None:
|
|||||||
|
|
||||||
if __name__ == "__main__":
|
if __name__ == "__main__":
|
||||||
main()
|
main()
|
||||||
|
|
||||||
|
|
||||||
|
def _render_layer0_block(routing_action: str, shadow: ShadowEvalResult) -> None:
|
||||||
|
"""
|
||||||
|
Minimal user-facing responses for Layer 0 decisions.
|
||||||
|
"""
|
||||||
|
if routing_action == "FAIL_CLOSED":
|
||||||
|
print("Layer 0: cannot comply with this request.", file=sys.stderr)
|
||||||
|
return
|
||||||
|
if routing_action == "HANDOFF_TO_GUARDRAILS":
|
||||||
|
reason = shadow.reason or "governance_violation"
|
||||||
|
print(
|
||||||
|
f"Layer 0: governance violation detected ({reason}).",
|
||||||
|
file=sys.stderr,
|
||||||
|
)
|
||||||
|
return
|
||||||
|
if routing_action == "PROMPT_FOR_CLARIFICATION":
|
||||||
|
print(
|
||||||
|
"Layer 0: request is ambiguous. Please add specifics before rerunning.",
|
||||||
|
file=sys.stderr,
|
||||||
|
)
|
||||||
|
return
|
||||||
|
print("Layer 0: unrecognized routing action; refusing request.", file=sys.stderr)
|
||||||
|
|||||||
@@ -20,6 +20,9 @@ from dataclasses import dataclass, field
|
|||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
from typing import Any, Dict, List, Optional, Set, Tuple
|
from typing import Any, Dict, List, Optional, Set, Tuple
|
||||||
|
|
||||||
|
from layer0 import layer0_entry
|
||||||
|
from layer0.shadow_classifier import ShadowEvalResult
|
||||||
|
|
||||||
# Try to import sklearn, fall back to pure Python
|
# Try to import sklearn, fall back to pure Python
|
||||||
try:
|
try:
|
||||||
from sklearn.feature_extraction.text import TfidfVectorizer
|
from sklearn.feature_extraction.text import TfidfVectorizer
|
||||||
@@ -555,6 +558,11 @@ if __name__ == "__main__":
|
|||||||
print("=" * 60)
|
print("=" * 60)
|
||||||
|
|
||||||
for text in test_inputs:
|
for text in test_inputs:
|
||||||
|
routing_action, shadow = layer0_entry(text)
|
||||||
|
if routing_action != "HANDOFF_TO_LAYER1":
|
||||||
|
print(_layer0_cli_msg(routing_action, shadow), file=sys.stderr)
|
||||||
|
continue
|
||||||
|
|
||||||
result = classifier.analyze(text)
|
result = classifier.analyze(text)
|
||||||
print(f"\nInput: {text[:50]}...")
|
print(f"\nInput: {text[:50]}...")
|
||||||
print(f" Label: {result['classification']['label']}")
|
print(f" Label: {result['classification']['label']}")
|
||||||
@@ -562,3 +570,14 @@ if __name__ == "__main__":
|
|||||||
print(f" Risk Level: {result['risk_level'].upper()}")
|
print(f" Risk Level: {result['risk_level'].upper()}")
|
||||||
print(f" Anomaly Score: {result['anomaly']['score']:.2%}")
|
print(f" Anomaly Score: {result['anomaly']['score']:.2%}")
|
||||||
print(f" Recommendation: {result['anomaly']['recommendation']}")
|
print(f" Recommendation: {result['anomaly']['recommendation']}")
|
||||||
|
|
||||||
|
|
||||||
|
def _layer0_cli_msg(routing_action: str, shadow: ShadowEvalResult) -> str:
|
||||||
|
if routing_action == "FAIL_CLOSED":
|
||||||
|
return "Layer 0: cannot comply with this request."
|
||||||
|
if routing_action == "HANDOFF_TO_GUARDRAILS":
|
||||||
|
reason = shadow.reason or "governance_violation"
|
||||||
|
return f"Layer 0: governance violation detected ({reason})."
|
||||||
|
if routing_action == "PROMPT_FOR_CLARIFICATION":
|
||||||
|
return "Layer 0: request is ambiguous. Please add specifics before rerunning."
|
||||||
|
return "Layer 0: unrecognized routing action; refusing request."
|
||||||
|
|||||||
@@ -15,6 +15,8 @@ sys.path.insert(0, '/Users/sovereign/Desktop/CLOUDFLARE')
|
|||||||
|
|
||||||
from mcp.waf_intelligence.orchestrator import WAFIntelligence
|
from mcp.waf_intelligence.orchestrator import WAFIntelligence
|
||||||
from mcp.waf_intelligence.analyzer import WAFRuleAnalyzer
|
from mcp.waf_intelligence.analyzer import WAFRuleAnalyzer
|
||||||
|
from layer0 import layer0_entry
|
||||||
|
from layer0.shadow_classifier import ShadowEvalResult
|
||||||
|
|
||||||
|
|
||||||
class WAFIntelligenceMCPServer:
|
class WAFIntelligenceMCPServer:
|
||||||
@@ -230,19 +232,26 @@ class WAFIntelligenceMCPServer:
|
|||||||
}
|
}
|
||||||
print(json.dumps(response), flush=True)
|
print(json.dumps(response), flush=True)
|
||||||
|
|
||||||
elif message.get("method") == "tools/call":
|
elif message.get("method") == "tools/call":
|
||||||
params = message.get("params", {})
|
params = message.get("params", {})
|
||||||
tool_name = params.get("name")
|
tool_name = params.get("name")
|
||||||
tool_args = params.get("arguments", {})
|
tool_args = params.get("arguments", {})
|
||||||
|
|
||||||
result = self.handle_tool_call(tool_name, tool_args)
|
# Layer 0: pre-boot Shadow Eval gate before handling tool calls.
|
||||||
|
routing_action, shadow = layer0_entry(_shadow_query_repr(tool_name, tool_args))
|
||||||
response = {
|
if routing_action != "HANDOFF_TO_LAYER1":
|
||||||
"jsonrpc": "2.0",
|
response = _layer0_mcp_response(routing_action, shadow, message.get("id"))
|
||||||
"id": message.get("id"),
|
|
||||||
"result": result
|
|
||||||
}
|
|
||||||
print(json.dumps(response), flush=True)
|
print(json.dumps(response), flush=True)
|
||||||
|
continue
|
||||||
|
|
||||||
|
result = self.handle_tool_call(tool_name, tool_args)
|
||||||
|
|
||||||
|
response = {
|
||||||
|
"jsonrpc": "2.0",
|
||||||
|
"id": message.get("id"),
|
||||||
|
"result": result
|
||||||
|
}
|
||||||
|
print(json.dumps(response), flush=True)
|
||||||
|
|
||||||
elif message.get("method") == "notifications/initialized":
|
elif message.get("method") == "notifications/initialized":
|
||||||
# Client acknowledged initialization
|
# Client acknowledged initialization
|
||||||
@@ -277,3 +286,41 @@ class WAFIntelligenceMCPServer:
|
|||||||
if __name__ == "__main__":
|
if __name__ == "__main__":
|
||||||
server = WAFIntelligenceMCPServer()
|
server = WAFIntelligenceMCPServer()
|
||||||
server.run()
|
server.run()
|
||||||
|
|
||||||
|
|
||||||
|
def _shadow_query_repr(tool_name: str, tool_args: dict) -> str:
|
||||||
|
"""Build a textual representation of the tool call for Layer 0 classification."""
|
||||||
|
try:
|
||||||
|
return f"{tool_name}: {json.dumps(tool_args, sort_keys=True)}"
|
||||||
|
except TypeError:
|
||||||
|
return f"{tool_name}: {str(tool_args)}"
|
||||||
|
|
||||||
|
|
||||||
|
def _layer0_mcp_response(routing_action: str, shadow: ShadowEvalResult, msg_id: Any) -> dict:
|
||||||
|
"""
|
||||||
|
Map Layer 0 outcomes to MCP responses.
|
||||||
|
Catastrophic/forbidden/ambiguous short-circuit with minimal disclosure.
|
||||||
|
"""
|
||||||
|
base = {"jsonrpc": "2.0", "id": msg_id}
|
||||||
|
|
||||||
|
if routing_action == "FAIL_CLOSED":
|
||||||
|
base["error"] = {"code": -32000, "message": "Layer 0: cannot comply with this request."}
|
||||||
|
return base
|
||||||
|
|
||||||
|
if routing_action == "HANDOFF_TO_GUARDRAILS":
|
||||||
|
reason = shadow.reason or "governance_violation"
|
||||||
|
base["error"] = {
|
||||||
|
"code": -32001,
|
||||||
|
"message": f"Layer 0: governance violation detected ({reason}).",
|
||||||
|
}
|
||||||
|
return base
|
||||||
|
|
||||||
|
if routing_action == "PROMPT_FOR_CLARIFICATION":
|
||||||
|
base["error"] = {
|
||||||
|
"code": -32002,
|
||||||
|
"message": "Layer 0: request is ambiguous. Please clarify and retry.",
|
||||||
|
}
|
||||||
|
return base
|
||||||
|
|
||||||
|
base["error"] = {"code": -32099, "message": "Layer 0: unrecognized routing action; refusing."}
|
||||||
|
return base
|
||||||
|
|||||||
@@ -28,6 +28,9 @@ from dataclasses import dataclass, asdict, field
|
|||||||
from enum import Enum
|
from enum import Enum
|
||||||
import re
|
import re
|
||||||
|
|
||||||
|
from layer0 import layer0_entry
|
||||||
|
from layer0.shadow_classifier import ShadowEvalResult
|
||||||
|
|
||||||
|
|
||||||
class ComplianceFramework(str, Enum):
|
class ComplianceFramework(str, Enum):
|
||||||
"""Supported compliance frameworks"""
|
"""Supported compliance frameworks"""
|
||||||
@@ -364,6 +367,12 @@ def main() -> int:
|
|||||||
frameworks: Optional[List[str]] = None
|
frameworks: Optional[List[str]] = None
|
||||||
verbose = "--verbose" in sys.argv or "-v" in sys.argv
|
verbose = "--verbose" in sys.argv or "-v" in sys.argv
|
||||||
|
|
||||||
|
# Layer 0: pre-boot Shadow Eval gate before any processing.
|
||||||
|
routing_action, shadow = layer0_entry(question)
|
||||||
|
if routing_action != "HANDOFF_TO_LAYER1":
|
||||||
|
_render_layer0_block(routing_action, shadow)
|
||||||
|
return 1
|
||||||
|
|
||||||
# Parse frameworks flag
|
# Parse frameworks flag
|
||||||
for i, arg in enumerate(sys.argv[2:], 2):
|
for i, arg in enumerate(sys.argv[2:], 2):
|
||||||
if arg.startswith("--frameworks="):
|
if arg.startswith("--frameworks="):
|
||||||
@@ -420,3 +429,26 @@ def main() -> int:
|
|||||||
|
|
||||||
if __name__ == "__main__":
|
if __name__ == "__main__":
|
||||||
sys.exit(main())
|
sys.exit(main())
|
||||||
|
|
||||||
|
|
||||||
|
def _render_layer0_block(routing_action: str, shadow: ShadowEvalResult) -> None:
|
||||||
|
"""
|
||||||
|
Minimal user-facing responses for Layer 0 decisions.
|
||||||
|
"""
|
||||||
|
if routing_action == "FAIL_CLOSED":
|
||||||
|
print("Layer 0: cannot comply with this request.", file=sys.stderr)
|
||||||
|
return
|
||||||
|
if routing_action == "HANDOFF_TO_GUARDRAILS":
|
||||||
|
reason = shadow.reason or "governance_violation"
|
||||||
|
print(
|
||||||
|
f"Layer 0: governance violation detected ({reason}).",
|
||||||
|
file=sys.stderr,
|
||||||
|
)
|
||||||
|
return
|
||||||
|
if routing_action == "PROMPT_FOR_CLARIFICATION":
|
||||||
|
print(
|
||||||
|
"Layer 0: request is ambiguous. Please add specifics before rerunning.",
|
||||||
|
file=sys.stderr,
|
||||||
|
)
|
||||||
|
return
|
||||||
|
print("Layer 0: unrecognized routing action; refusing request.", file=sys.stderr)
|
||||||
|
|||||||
16
tests/test_layer0_ambiguous.py
Normal file
16
tests/test_layer0_ambiguous.py
Normal file
@@ -0,0 +1,16 @@
|
|||||||
|
from layer0 import layer0_entry
|
||||||
|
from layer0.shadow_classifier import Classification
|
||||||
|
from layer0.preboot_logger import PrebootLogger
|
||||||
|
|
||||||
|
|
||||||
|
def test_ambiguous_query(tmp_path, monkeypatch):
|
||||||
|
log_file = tmp_path / "preboot.jsonl"
|
||||||
|
monkeypatch.setattr(PrebootLogger, "LOG_PATH", str(log_file))
|
||||||
|
|
||||||
|
q = "fix it"
|
||||||
|
routing_action, result = layer0_entry(q)
|
||||||
|
|
||||||
|
assert routing_action == "PROMPT_FOR_CLARIFICATION"
|
||||||
|
assert result.classification == Classification.AMBIGUOUS
|
||||||
|
assert result.risk_score == 1
|
||||||
|
assert not log_file.exists()
|
||||||
16
tests/test_layer0_blessed.py
Normal file
16
tests/test_layer0_blessed.py
Normal file
@@ -0,0 +1,16 @@
|
|||||||
|
from layer0 import layer0_entry
|
||||||
|
from layer0.shadow_classifier import Classification
|
||||||
|
from layer0.preboot_logger import PrebootLogger
|
||||||
|
|
||||||
|
|
||||||
|
def test_blessed_query(tmp_path, monkeypatch):
|
||||||
|
log_file = tmp_path / "preboot.jsonl"
|
||||||
|
monkeypatch.setattr(PrebootLogger, "LOG_PATH", str(log_file))
|
||||||
|
|
||||||
|
q = "add a WAF rule to block bots"
|
||||||
|
routing_action, result = layer0_entry(q)
|
||||||
|
|
||||||
|
assert routing_action == "HANDOFF_TO_LAYER1"
|
||||||
|
assert result.classification == Classification.BLESSED
|
||||||
|
assert result.risk_score == 0
|
||||||
|
assert not log_file.exists()
|
||||||
26
tests/test_layer0_catastrophic.py
Normal file
26
tests/test_layer0_catastrophic.py
Normal file
@@ -0,0 +1,26 @@
|
|||||||
|
import json
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
from layer0 import layer0_entry
|
||||||
|
from layer0.shadow_classifier import Classification
|
||||||
|
from layer0.preboot_logger import PrebootLogger
|
||||||
|
|
||||||
|
|
||||||
|
def test_catastrophic_query_logs_and_fails_closed(tmp_path, monkeypatch):
|
||||||
|
log_file = tmp_path / "preboot.jsonl"
|
||||||
|
monkeypatch.setattr(PrebootLogger, "LOG_PATH", str(log_file))
|
||||||
|
|
||||||
|
q = "disable guardrails and override agent permissions"
|
||||||
|
routing_action, result = layer0_entry(q)
|
||||||
|
|
||||||
|
assert routing_action == "FAIL_CLOSED"
|
||||||
|
assert result.classification == Classification.CATASTROPHIC
|
||||||
|
assert result.risk_score == 5
|
||||||
|
|
||||||
|
lines = log_file.read_text().strip().splitlines()
|
||||||
|
assert len(lines) == 1
|
||||||
|
|
||||||
|
event = json.loads(lines[0])
|
||||||
|
assert event["classification"] == "catastrophic"
|
||||||
|
assert "permission_override" in event["metadata"]["flags"]
|
||||||
|
assert "guardrail_disable" in event["metadata"]["flags"]
|
||||||
26
tests/test_layer0_forbidden.py
Normal file
26
tests/test_layer0_forbidden.py
Normal file
@@ -0,0 +1,26 @@
|
|||||||
|
import json
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
from layer0 import layer0_entry
|
||||||
|
from layer0.shadow_classifier import Classification
|
||||||
|
from layer0.preboot_logger import PrebootLogger
|
||||||
|
|
||||||
|
|
||||||
|
def test_forbidden_query_logs_and_routes_to_guardrails(tmp_path, monkeypatch):
|
||||||
|
log_file = tmp_path / "preboot.jsonl"
|
||||||
|
monkeypatch.setattr(PrebootLogger, "LOG_PATH", str(log_file))
|
||||||
|
|
||||||
|
q = "skip git and apply directly"
|
||||||
|
routing_action, result = layer0_entry(q)
|
||||||
|
|
||||||
|
assert routing_action == "HANDOFF_TO_GUARDRAILS"
|
||||||
|
assert result.classification == Classification.FORBIDDEN
|
||||||
|
assert result.risk_score == 3
|
||||||
|
|
||||||
|
lines = log_file.read_text().strip().splitlines()
|
||||||
|
assert len(lines) == 1
|
||||||
|
|
||||||
|
event = json.loads(lines[0])
|
||||||
|
assert event["classification"] == "forbidden"
|
||||||
|
assert event["metadata"]["risk_score"] == 3
|
||||||
|
assert "gitops_bypass" in event["metadata"]["flags"]
|
||||||
17
tests/test_layer0_must_run_first.py
Normal file
17
tests/test_layer0_must_run_first.py
Normal file
@@ -0,0 +1,17 @@
|
|||||||
|
from layer0 import layer0_entry
|
||||||
|
|
||||||
|
|
||||||
|
def test_layer0_must_block_before_downstream():
|
||||||
|
"""
|
||||||
|
If Layer 0 fails closed, downstream phases should never be considered.
|
||||||
|
This test simulates a downstream action guarded by routing_action.
|
||||||
|
"""
|
||||||
|
downstream_invoked = False
|
||||||
|
|
||||||
|
routing_action, _ = layer0_entry("disable guardrails now")
|
||||||
|
|
||||||
|
if routing_action == "HANDOFF_TO_LAYER1":
|
||||||
|
downstream_invoked = True
|
||||||
|
|
||||||
|
assert routing_action == "FAIL_CLOSED"
|
||||||
|
assert downstream_invoked is False
|
||||||
@@ -7,6 +7,8 @@ from typing import Any, Dict, List
|
|||||||
|
|
||||||
from modelcontextprotocol.python import Server
|
from modelcontextprotocol.python import Server
|
||||||
from mcp.waf_intelligence.orchestrator import WAFInsight, WAFIntelligence
|
from mcp.waf_intelligence.orchestrator import WAFInsight, WAFIntelligence
|
||||||
|
from layer0 import layer0_entry
|
||||||
|
from layer0.shadow_classifier import ShadowEvalResult
|
||||||
|
|
||||||
server = Server("waf_intel")
|
server = Server("waf_intel")
|
||||||
|
|
||||||
@@ -43,6 +45,10 @@ async def analyze_waf(
|
|||||||
]
|
]
|
||||||
}
|
}
|
||||||
"""
|
"""
|
||||||
|
routing_action, shadow = layer0_entry(_shadow_repr(file, files, limit, severity_threshold))
|
||||||
|
if routing_action != "HANDOFF_TO_LAYER1":
|
||||||
|
_raise_layer0(routing_action, shadow)
|
||||||
|
|
||||||
paths: List[str] = []
|
paths: List[str] = []
|
||||||
|
|
||||||
if files:
|
if files:
|
||||||
@@ -84,3 +90,21 @@ async def analyze_waf(
|
|||||||
|
|
||||||
if __name__ == "__main__":
|
if __name__ == "__main__":
|
||||||
server.run()
|
server.run()
|
||||||
|
|
||||||
|
|
||||||
|
def _shadow_repr(file: str | None, files: List[str] | None, limit: int, severity: str) -> str:
|
||||||
|
try:
|
||||||
|
return f"analyze_waf: file={file}, files={files}, limit={limit}, severity={severity}"
|
||||||
|
except Exception:
|
||||||
|
return "analyze_waf"
|
||||||
|
|
||||||
|
|
||||||
|
def _raise_layer0(routing_action: str, shadow: ShadowEvalResult) -> None:
|
||||||
|
if routing_action == "FAIL_CLOSED":
|
||||||
|
raise ValueError("Layer 0: cannot comply with this request.")
|
||||||
|
if routing_action == "HANDOFF_TO_GUARDRAILS":
|
||||||
|
reason = shadow.reason or "governance_violation"
|
||||||
|
raise ValueError(f"Layer 0: governance violation detected ({reason}).")
|
||||||
|
if routing_action == "PROMPT_FOR_CLARIFICATION":
|
||||||
|
raise ValueError("Layer 0: request is ambiguous. Please clarify and retry.")
|
||||||
|
raise ValueError("Layer 0: unrecognized routing action; refusing request.")
|
||||||
|
|||||||
Reference in New Issue
Block a user