vm-core/docs/VAULTMESH-PROOFBUNDLE-SPEC.md
2025-12-27 00:10:32 +00:00

# VAULTMESH-PROOFBUNDLE-SPEC
_Version 1.1.0: ProofBundle Data Model & Verification Semantics_
## 1. Introduction
This document specifies the structure and verification semantics of the **VaultMesh ProofBundle**.
A ProofBundle is a self-contained evidence artifact intended for regulators, auditors, and relying parties. It packages:
- A document-specific event chain (e.g. skill validations → document download),
- Cryptographic identities (DIDs) for human and system actors,
- Sealed ledger state (Guardian anchor and scroll roots),
- Placeholder references for external ProofChain anchors (e.g. BTC/ETH/OTS).
A ProofBundle is designed to be verifiable **offline**, using only the bundle JSON and an open-source verifier.
---
## 2. Terminology
The following key words are used in the RFC 2119 sense:
- **MUST** / **MUST NOT**: absolute requirement.
- **SHOULD** / **SHOULD NOT**: strong recommendation; valid reasons may exist to deviate, but they must be understood.
- **MAY**: optional behavior.
Additional terms:
| Term | Definition |
|------|------------|
| **Receipt** | A canonical JSON object representing a single ledger event (e.g. `document_download`, `skill_validation`), including at minimum a `root_hash`. |
| **Scroll** | An append-only JSONL file containing receipts of a given class (e.g. Automation, Guardian, Identity). |
| **Guardian Anchor** | A special receipt that commits to the current state of all scrolls via BLAKE3 roots, written to the Guardian scroll. |
| **DID** | Decentralized Identifier in the VaultMesh namespace, e.g. `did:vm:human:karol`, `did:vm:portal:shield`, `did:vm:guardian:local`. |
| **ProofChain** | Optional external anchoring backends (e.g. Bitcoin, Ethereum, OpenTimestamps) referenced by the bundle. |
---
## 3. Data Model
### 3.1 Top-Level Structure
A ProofBundle MUST be a single JSON object with the following top-level fields:
```jsonc
{
  "bundle_id": "pb-20251206T174406-dl-20251206T165831-2ebdac",
  "schema_version": "1.1.0",
  "generated_at": "2025-12-06T17:44:06.123Z",
  "document": { ... },
  "actor": { ... },
  "portal": { ... },
  "chain": { ... },
  "guardian_anchor": { ... },
  "proofchain": { ... },
  "meta": { ... }
}
```
#### 3.1.1 bundle_id
- **Type:** string
- **Semantics:** Globally unique identifier for the bundle instance.
- **Format:** Implementation-defined; SHOULD include the download ID and timestamp.
#### 3.1.2 schema_version
- **Type:** string
- **Semantics:** Version of this specification the bundle adheres to.
- This document describes version **1.1.0**.
**Verifiers:**
- MUST reject unknown major versions.
- SHOULD attempt best-effort parsing of minor version bumps (e.g. 1.2.x), ignoring unknown fields.
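
The version rules above can be sketched as a small classifier. This is an illustration only: the function name, the three return labels, and the decision to accept older minor versions of major version 1 are assumptions, not part of this specification.

```python
SUPPORTED_MAJOR = 1  # this document describes 1.1.0
KNOWN_MINOR = 1


def classify_schema_version(version: str) -> str:
    """Return 'supported', 'best_effort', or 'reject' for a schema_version string."""
    try:
        major, minor, _patch = (int(part) for part in version.split("."))
    except (AttributeError, ValueError):
        return "reject"  # not a MAJOR.MINOR.PATCH string
    if major != SUPPORTED_MAJOR:
        return "reject"  # unknown major version: MUST reject
    if minor > KNOWN_MINOR:
        return "best_effort"  # newer minor: parse, ignoring unknown fields
    return "supported"  # assumption: older minors of the same major are accepted
```

A verifier built this way would stop immediately on `"reject"` and continue, skipping unknown fields, on `"best_effort"`.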
#### 3.1.3 generated_at
- **Type:** string (ISO 8601 with UTC Z).
- **Semantics:** Time at which the ProofBundle was generated by the portal.
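
As a rough illustration, a timestamp in this form can be parsed with the Python standard library alone. The helper name `parse_utc_timestamp` is hypothetical, and accepting both fractional and whole-second forms is an assumption based on the examples in this document.

```python
from datetime import datetime, timezone


def parse_utc_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing 'Z' into an aware UTC datetime."""
    if not value.endswith("Z"):
        raise ValueError(f"timestamp must end with 'Z': {value!r}")
    # Accept both fractional ("...06.123Z") and whole-second ("...28Z") forms.
    for fmt in ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ"):
        try:
            return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"not an ISO 8601 UTC timestamp: {value!r}")
```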
---
### 3.2 Document Section
```json
"document": {
  "doc_id": "001 Conformity Declaration",
  "filename": "VM-AI-CON-001_Conformity_Declaration.docx",
  "category": "AI Governance"
}
```
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `doc_id` | string | REQUIRED | Human-readable identifier used in the portal and receipts. |
| `filename` | string | REQUIRED | The file name of the underlying document. |
| `category` | string | OPTIONAL | High-level classification (e.g. "AI Governance", "Data Protection"). |
| `path` | string | OPTIONAL | Full path in repository. |
---
### 3.3 Actor & Portal Sections
```json
"actor": {
  "did": "did:vm:human:karol",
  "display_name": "Karol S",
  "role": "auditor"
},
"portal": {
  "did": "did:vm:portal:shield",
  "instance": "shield.story-ule.ts.net",
  "description": "VaultMesh Auditor Portal Shield node"
}
```
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `actor.did` | string | REQUIRED | DID of the human or agent initiating the document download. |
| `actor.display_name` | string | OPTIONAL | Human-readable name; MAY be "Unknown Auditor" when not resolved. |
| `actor.role` | string | OPTIONAL | Role or function (e.g. "auditor", "DPO", "regulator"). |
| `portal.did` | string | REQUIRED | DID of the portal instance. |
| `portal.instance` | string | OPTIONAL | Hostname or logical instance ID. |
#### 3.3.1 Actor Identity Semantics
The `actor.did` field is the **normative identity anchor** for the human or agent
responsible for the documented action. It MUST be a valid VaultMesh DID (e.g.
`did:vm:human:karol`), resolvable in the VaultMesh Identity scroll.
The `actor.display_name` field is **non-normative convenience metadata**. It is
resolved from the Identity scroll and/or local configuration (e.g. environment
variables) at bundle generation time. Implementations:
- MUST treat `actor.did` as the authoritative identity reference.
- MUST NOT rely on `actor.display_name` for any cryptographic or access control decisions.
- MAY omit or localize `actor.display_name` without affecting ProofBundle validity.
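
A purely syntactic check of `actor.did` might look like the sketch below. The `did:vm:<kind>:<name>` layout is inferred from the examples in this document and is an assumption, as is the helper name; resolving the DID against the Identity scroll is out of scope here.

```python
def parse_vaultmesh_did(did: str) -> tuple[str, str]:
    """Split 'did:vm:<kind>:<name>' into (kind, name); raise ValueError otherwise."""
    parts = did.split(":")
    # Assumed layout: exactly four non-empty segments, starting with "did:vm".
    if len(parts) != 4 or parts[:2] != ["did", "vm"] or not all(parts[2:]):
        raise ValueError(f"not a VaultMesh DID: {did!r}")
    return parts[2], parts[3]
```

Note that `actor.display_name` plays no part in the check, in line with the rule that it carries no cryptographic or access control weight.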
---
### 3.4 Chain Section
```jsonc
"chain": {
  "ok": true,
  "length": 7,
  "start": { /* receipt summary */ },
  "end": { /* receipt summary */ },
  "receipts": [ /* full receipts */ ]
}
```
#### 3.4.1 ok
- **Type:** boolean
- **Semantics:** Declarative statement by the generator that the chain is believed to be cryptographically valid at generation time.
- Verifiers MUST NOT rely on this field alone and MUST recompute chain validity.
#### 3.4.2 length
- **Type:** integer
- **Semantics:** Number of receipts represented in `receipts`.
- Verifiers SHOULD check that `length` equals `receipts.length`.
#### 3.4.3 start and end
- **Type:** object
- **Semantics:** Human-oriented summaries of the first and last receipts in the chain.
```json
"start": {
  "type": "skill_validation",
  "timestamp": "2025-12-06T14:47:14.000Z",
  "root_hash": "blake3:de01c8b3..."
},
"end": {
  "type": "document_download",
  "timestamp": "2025-12-06T16:58:31.826Z",
  "root_hash": "blake3:bb379364..."
}
```
Verifiers MAY recompute these summaries from `receipts` and SHOULD treat any inconsistency as an error.
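
The `length`, `start`, and `end` consistency checks could be sketched as follows. The function name is hypothetical, and comparing exactly the three summary fields shown in the example above (`type`, `timestamp`, `root_hash`) is an assumption.

```python
SUMMARY_FIELDS = ("type", "timestamp", "root_hash")


def check_chain_summaries(chain: dict) -> list[str]:
    """Return inconsistencies between chain metadata and chain['receipts']."""
    receipts = chain.get("receipts", [])
    errors = []
    if chain.get("length") != len(receipts):
        errors.append(f"length={chain.get('length')} but {len(receipts)} receipts present")
    if not receipts:
        return errors
    # 'start' summarizes the first receipt, 'end' the last one.
    for name, receipt in (("start", receipts[0]), ("end", receipts[-1])):
        summary = chain.get(name, {})
        for field in SUMMARY_FIELDS:
            if summary.get(field) != receipt.get(field):
                errors.append(f"{name}.{field} does not match the corresponding receipt")
    return errors
```

An empty list means the declared summaries agree with the receipts; it says nothing yet about hash validity.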
#### 3.4.4 receipts
- **Type:** array of objects
- **Semantics:** Full chain of receipts from genesis (index 0) to the document download receipt (last index).
Each receipt object:
```json
{
  "type": "document_download",
  "timestamp": "2025-12-06T16:58:31.826Z",
  "root_hash": "blake3:bb379364566df7179a982d632267b492...",
  "previous_hash": "blake3:de01c8b34e9d0453484d73048be11dd5...",
  "actor_did": "did:vm:human:karol",
  "portal_did": "did:vm:portal:shield"
}
```
**Minimum required fields per receipt:**
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `type` | string | REQUIRED | Event type (e.g. `skill_validation`, `document_download`). |
| `timestamp` | string | REQUIRED | ISO 8601 with UTC Z. |
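| `root_hash` | string | REQUIRED | BLAKE3 digest of the canonical JSON form of the receipt, computed with the `root_hash` field itself excluded (see section 4.2). |
| `previous_hash` | string\|null | REQUIRED | `root_hash` of the preceding receipt; MUST be set for every receipt except the first, where it MAY be `null` or absent (see section 4.3). |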
Additional fields (e.g. `actor_did`, `portal_did`, `session_id`, `ip_hash`, `user_agent_hash`) are RECOMMENDED.
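
A minimal field-presence check for a single receipt, under the requirements in the table above, might look like this. The helper name and the choice to report missing fields as a list are illustrative assumptions.

```python
REQUIRED_RECEIPT_FIELDS = ("type", "timestamp", "root_hash")


def missing_receipt_fields(receipt: dict, index: int) -> list[str]:
    """Return the required fields that are missing or not strings in this receipt."""
    missing = [f for f in REQUIRED_RECEIPT_FIELDS if not isinstance(receipt.get(f), str)]
    # previous_hash may be null or absent only for the first receipt (index 0).
    if index > 0 and not isinstance(receipt.get("previous_hash"), str):
        missing.append("previous_hash")
    return missing
```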
---
### 3.5 Guardian Anchor Section
```json
"guardian_anchor": {
  "anchor_id": "anchor-20251206155628",
  "anchor_by": "did:vm:guardian:local",
  "anchor_epoch": 1765036588,
  "anchor_timestamp": "2025-12-06T15:56:28Z",
  "root_hash": "blake3:1af3b9a4...",
  "scroll_roots": {
    "automation": { "root_hash": "blake3:aa12bb34...", "entries": 11, "has_root": true },
    "guardian": { "root_hash": "blake3:cc56dd78...", "entries": 5, "has_root": true },
    "identity": { "root_hash": "blake3:ee90ff12...", "entries": 4, "has_root": true }
  }
}
```
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `anchor_id` | string | REQUIRED | Identifier of the Guardian anchor receipt. |
| `anchor_by` | string | REQUIRED | DID of the Guardian engine. |
| `anchor_epoch` | integer | OPTIONAL | Epoch seconds at anchor time. |
| `anchor_timestamp` | string | REQUIRED | ISO 8601 timestamp of the anchor. |
| `root_hash` | string\|null | OPTIONAL | Global root hash (reserved for future use). |
| `scroll_roots` | object | REQUIRED | Map from scroll name to an object holding that scroll's `root_hash`, `entries` count, and `has_root` flag, as committed in the anchor. |
---
### 3.6 ProofChain Section
```json
"proofchain": {
  "btc": { "status": "not_anchored", "txid": null },
  "eth": { "status": "not_anchored", "txid": null },
  "ots": { "status": "not_anchored", "timestamp_url": null }
}
```
For each backend (`btc`, `eth`, `ots`):
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `status` | string | REQUIRED | One of: `"not_anchored"`, `"pending"`, `"anchored"` |
| `txid` / `timestamp_url` | string\|null | OPTIONAL | Backend-specific reference when anchored. |
Verifiers:
- MAY ignore this section when performing purely local verification.
- SHOULD treat unknown statuses conservatively.
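
One possible reading of "conservatively" is to treat any unrecognized status as `not_anchored`, so that an unknown value never counts as anchoring evidence. The sketch below follows that reading; the function names are hypothetical and the interpretation is an assumption.

```python
KNOWN_STATUSES = {"not_anchored", "pending", "anchored"}


def effective_status(entry: dict) -> str:
    """Map a backend entry to a known status, defaulting to 'not_anchored'."""
    status = entry.get("status")
    if isinstance(status, str) and status in KNOWN_STATUSES:
        return status
    return "not_anchored"  # unknown or malformed status: assume no anchor


def anchored_backends(proofchain: dict) -> list[str]:
    """Return the names of backends whose effective status is 'anchored'."""
    return sorted(
        name for name, entry in proofchain.items()
        if isinstance(entry, dict) and effective_status(entry) == "anchored"
    )
```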
---
### 3.7 Meta Section
```json
"meta": {
  "requested_by_session": "6pngxxbMxLYQf180qPmIeq-xkJ8nDBN3",
  "requested_by_user": "karol@vaultmesh.earth",
  "node": "shield"
}
```
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `requested_by_session` | string | OPTIONAL | Portal session that requested the bundle. |
| `requested_by_user` | string | OPTIONAL | Account identifier in the portal. |
| `node` | string | OPTIONAL | Node name. |
---
## 4. Cryptographic Properties
### 4.1 Hash Function
VaultMesh uses **BLAKE3** as the hash function for all `root_hash` and `previous_hash` values.
- **Digest encoding:** hex string, prefixed with `"blake3:"`, e.g. `blake3:1af3b9a4...`
- Implementations MUST preserve the prefix and encoding when serializing.
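
A format check for digest strings could be sketched as below. Requiring lowercase hex and leaving the digest length unconstrained are assumptions made for this illustration; the specification only fixes the `blake3:` prefix and hex encoding.

```python
import re

# Assumption: lowercase hex of any non-zero length after the "blake3:" prefix.
DIGEST_RE = re.compile(r"blake3:[0-9a-f]+")


def is_blake3_digest(value) -> bool:
    """Return True if value is a string of the form 'blake3:<hex>'."""
    return isinstance(value, str) and DIGEST_RE.fullmatch(value) is not None
```

The elided digests shown in this document's examples (ending in `...`) would not pass this check, which is the intended behaviour for real bundles.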
### 4.2 Receipt Hashing
For each receipt R in `chain.receipts`:
1. Serialize R, with its `root_hash` field removed, to **canonical JSON**:
   - UTF-8 encoding
   - Sorted keys
   - No insignificant whitespace: `separators=(",", ":")`
2. Compute `H = BLAKE3(R_canonical)`
3. Set `root_hash = "blake3:" + hex(H)`
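```python
import json
import blake3  # third-party package: pip install blake3

# `receipt` is one element of chain.receipts; its root_hash is not hashed.
receipt_without_root_hash = {k: v for k, v in receipt.items() if k != "root_hash"}
# Canonical JSON: sorted keys, no insignificant whitespace, UTF-8 bytes.
encoded = json.dumps(
    receipt_without_root_hash,
    sort_keys=True,
    separators=(",", ":"),
    ensure_ascii=False,
).encode("utf-8")
root_hash = f"blake3:{blake3.blake3(encoded).hexdigest()}"
```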
The verifier MUST recompute `root_hash` from the canonical JSON and compare it to the stored `root_hash`. Any mismatch indicates tampering.
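
The recompute-and-compare step might be sketched as follows. To keep the sketch free of third-party dependencies, the hash is passed in as a `hexdigest` callable, which is a design choice of this illustration; with the `blake3` package one would pass `lambda data: blake3.blake3(data).hexdigest()`. The function names are hypothetical.

```python
import json
from typing import Callable


def canonical_bytes(receipt: dict) -> bytes:
    """Canonical JSON of the receipt without its root_hash field, as UTF-8 bytes."""
    body = {k: v for k, v in receipt.items() if k != "root_hash"}
    return json.dumps(
        body, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def receipt_hash_ok(receipt: dict, hexdigest: Callable[[bytes], str]) -> bool:
    """Recompute the digest and compare it with the stored root_hash."""
    expected = "blake3:" + hexdigest(canonical_bytes(receipt))
    return receipt.get("root_hash") == expected
```

Any change to a hashed field, or to the stored `root_hash` itself, makes `receipt_hash_ok` return `False`.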
### 4.3 Hash-Chain Semantics
Given receipts `R[0] ... R[n-1]`:
- For `i = 0`: `R[0].previous_hash` MAY be `null` or absent.
- For `i > 0`: `R[i].previous_hash` MUST equal `R[i-1].root_hash`.
A verifier MUST treat any violation as chain breakage.
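
The linkage rule can be illustrated with a short loop over adjacent receipts. The function name and the choice to return the indices of broken links are assumptions of this sketch.

```python
def find_chain_breaks(receipts: list) -> list:
    """Return every index i > 0 whose previous_hash differs from receipts[i-1].root_hash."""
    breaks = []
    for i in range(1, len(receipts)):
        expected = receipts[i - 1].get("root_hash")
        # A missing root_hash on the predecessor also counts as a break.
        if expected is None or receipts[i].get("previous_hash") != expected:
            breaks.append(i)
    return breaks
```

An empty result means the chain is contiguous; `R[0].previous_hash` is never inspected, matching the rule for `i = 0`.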
---
## 5. Threat Model & Non-Goals
### 5.1 Threat Model
ProofBundle is designed to protect against:
| Threat | Mitigation |
|--------|------------|
| Post-hoc modification of receipts | Hash verification detects tampering |
| Removal or insertion of receipts | Chain linkage breaks |
| Misrepresentation of chain integrity | Verifier recomputes and compares to `chain.ok` |
| Partial disclosure attempts | Chain must be complete from genesis to download |
| Actor impersonation | DID attribution, not mutable username |
### 5.2 Non-Goals
ProofBundle explicitly does **not** guarantee:
- **Document content correctness**: The bundle proves *access*, not that the document is semantically correct or policy-compliant.
- **Real-world identity verification**: DIDs are cryptographic identifiers; binding a DID to a real-world person (KYC) depends on external identity processes.
- **Protection against malicious genesis**: If an adversary controls the VaultMesh node before receipts are created, the bundle cannot detect this.
- **IP/user-agent confidentiality**: BLAKE3 hashes may be reversible via brute force if the input space is small.
Regulators SHOULD combine ProofBundle verification with organizational and process audits.
---
## 6. Example Bundle
### 6.1 Minimal Example
```json
{
  "bundle_id": "pb-20251206T174406-dl-20251206T165831-2ebdac",
  "schema_version": "1.1.0",
  "generated_at": "2025-12-06T17:44:06.123Z",
  "document": {
    "doc_id": "001 Conformity Declaration",
    "filename": "VM-AI-CON-001_Conformity_Declaration.docx",
    "category": "AI Governance"
  },
  "actor": {
    "did": "did:vm:human:karol",
    "display_name": "Karol S",
    "role": "auditor"
  },
  "portal": {
    "did": "did:vm:portal:shield",
    "instance": "shield"
  },
  "chain": {
    "ok": true,
    "length": 3,
    "start": {
      "type": "skill_validation",
      "timestamp": "2025-12-06T14:47:14.000Z",
      "root_hash": "blake3:de01c8b34e9d0453..."
    },
    "end": {
      "type": "document_download",
      "timestamp": "2025-12-06T16:58:31.826Z",
      "root_hash": "blake3:bb379364566df717..."
    },
    "receipts": [
      {
        "type": "skill_validation",
        "timestamp": "2025-12-06T14:47:14.000Z",
        "root_hash": "blake3:de01c8b34e9d0453...",
        "previous_hash": null
      },
      {
        "type": "skill_validation",
        "timestamp": "2025-12-06T15:10:02.000Z",
        "root_hash": "blake3:4e7cf7352e25a150...",
        "previous_hash": "blake3:de01c8b34e9d0453..."
      },
      {
        "type": "document_download",
        "timestamp": "2025-12-06T16:58:31.826Z",
        "root_hash": "blake3:bb379364566df717...",
        "previous_hash": "blake3:4e7cf7352e25a150...",
        "actor_did": "did:vm:human:karol",
        "portal_did": "did:vm:portal:shield"
      }
    ]
  },
  "guardian_anchor": {
    "anchor_id": "anchor-20251206155628",
    "anchor_by": "did:vm:guardian:local",
    "anchor_timestamp": "2025-12-06T15:56:28Z",
    "root_hash": null,
    "scroll_roots": {
      "automation": { "root_hash": "blake3:b165f779...", "entries": 11, "has_root": true }
    }
  },
  "proofchain": {
    "btc": { "status": "not_anchored", "txid": null },
    "eth": { "status": "not_anchored", "txid": null },
    "ots": { "status": "not_anchored", "timestamp_url": null }
  }
}
```
### 6.2 Expected Verifier Output
```
ProofBundle: pb-20251206T174406-dl-20251206T165831-2ebdac
Document : 001 Conformity Declaration
File : VM-AI-CON-001_Conformity_Declaration.docx
Actor : did:vm:human:karol (Karol S)
Portal : did:vm:portal:shield (shield)
Receipts : 3
Hash check : OK
Chain linkage : OK
Bundle chain.ok: True (matches computed: True)
Result: OK chain of 3 receipts is contiguous and valid.
```
---
## 7. Compliance Crosswalk: AI Act Annex IX
This section provides a non-exhaustive mapping between AI Act Annex IX documentation expectations and ProofBundle fields.
| Annex IX Requirement | ProofBundle Support |
|---------------------|---------------------|
| Record-keeping of events and logs | `chain.receipts[]` (types, timestamps, DIDs) |
| Traceability of changes and operations | Hash-chain via `root_hash` and `previous_hash` |
| Identification of persons and systems involved | `actor.did`, `actor.display_name`, `portal.did` |
| Identification of system components | `guardian_anchor.anchor_by`, `portal.instance` |
| Technical documentation of integrity safeguards | Cryptographic model in this SPEC; BLAKE3 usage |
| Evidence of access to technical documentation | `document_download` receipts bound to specific doc IDs |
| Tamper-evidence of documentation and logs | BLAKE3 per receipt + chained `previous_hash` |
| Ability to provide evidence to market surveillance authorities | ProofBundle JSON + offline verifier |
Regulators MAY reference a valid ProofBundle, together with this specification, as part of the technical documentation demonstrating logging, traceability, and integrity controls.
---
## 8. HTML Viewer
The portal exposes an HTML view at:
```
/docs/proofbundle/:downloadId
```
This view:
- Renders the ProofBundle contents in a human-friendly layout
- Provides a Print button (browser print → PDF) for filing
- Displays a verification note:
> "This ProofBundle can be independently verified with the open-source `vm_verify_proofbundle.py` tool. No access to VaultMesh servers is required."
---
## 9. Verifier Exit Codes
| Code | Meaning |
|------|---------|
| 0 | Verification passed |
| 1 | Verification failed (chain or hashes) |
| 2 | Usage error or file not found |
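
In a Python verifier, the codes above could be modeled as an enumeration. The class, member, and function names below are hypothetical; only the numeric values and their meanings come from the table.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    OK = 0                   # verification passed
    VERIFICATION_FAILED = 1  # chain or hash failure
    USAGE_ERROR = 2          # usage error or file not found


def exit_code_for(hash_errors: int, chain_breaks: int) -> ExitCode:
    """Map counts of detected problems to the process exit code."""
    if hash_errors == 0 and chain_breaks == 0:
        return ExitCode.OK
    return ExitCode.VERIFICATION_FAILED
```

Because `IntEnum` members are integers, the result can be passed directly to `sys.exit`.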
---
## 10. Conformance Tests
This section defines **non-normative** but **strongly RECOMMENDED** test vectors
for implementers of ProofBundle verifiers.
### 10.1 Test Vector Location
Official VaultMesh test vectors are distributed under:
```
testvectors/proofbundle/
```
with the following files:
- `proofbundle-valid.json`
- `proofbundle-tampered-body.json`
- `proofbundle-tampered-root.json`
- `proofbundle-broken-chain.json`
### 10.2 Expected Behaviour
Implementations of `vm_verify_proofbundle` (or equivalent) SHOULD pass the
following conformance checks:
| Input file | Expected Exit | Expected Behaviour |
|------------|---------------|-------------------|
| `proofbundle-valid.json` | 0 | Chain verification succeeds; no errors reported. |
| `proofbundle-tampered-body.json` | 1 | Receipt hash mismatch is detected. |
| `proofbundle-tampered-root.json` | 1 | Receipt hash mismatch is detected. |
| `proofbundle-broken-chain.json` | 1 | Broken `previous_hash` linkage is detected. |
Implementations MAY emit different human-readable error messages, but MUST
distinguish success from failure via exit codes or equivalent programmatic
signals.
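
A conformance run over the vectors above might be driven by a small harness like the following sketch. It assumes a `verify` callable that takes a bundle path and returns the exit code; that interface, the function name, and the return shape are assumptions, while the file names and expected codes are taken from the table in this section.

```python
import os
from typing import Callable

EXPECTED_EXIT = {
    "proofbundle-valid.json": 0,
    "proofbundle-tampered-body.json": 1,
    "proofbundle-tampered-root.json": 1,
    "proofbundle-broken-chain.json": 1,
}


def run_conformance(
    verify: Callable[[str], int],
    vector_dir: str = "testvectors/proofbundle",
) -> dict:
    """Return {filename: (expected, actual)} for each vector whose exit code differs."""
    failures = {}
    for name, expected in EXPECTED_EXIT.items():
        actual = verify(os.path.join(vector_dir, name))
        if actual != expected:
            failures[name] = (expected, actual)
    return failures
```

An empty dictionary indicates that the verifier under test distinguishes all four vectors as expected.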
### 10.3 Schema Version Handling
Verifiers MUST check the `schema_version` field of a ProofBundle against a
known set of supported versions. If an unsupported schema version is
encountered, verifiers:
- MUST NOT attempt partial verification, and
- MUST return a non-zero exit code (e.g. `2`) indicating
`UNSUPPORTED_SCHEMA_VERSION`, and
- SHOULD direct implementers to the Standards Index
(`VAULTMESH-STANDARDS-INDEX.md`) for the current version matrix.
---
## 11. Versioning & Extensibility
- This document defines `schema_version = "1.1.0"`.
- Producers MUST include a `schema_version` string.
- Verifiers MUST:
- Reject unknown major versions (e.g. 2.x.x) by default.
- Tolerate additional fields for minor versions (e.g. 1.2.x) as long as required fields are present and valid.
Future extensions (e.g. richer ProofChain data, additional actor attributes) MAY be added under new fields, provided they do not alter the semantics defined in this version.
---
## 12. Appendix: Citation
This assessment relies on VaultMesh ProofBundle, specified in
**"VAULTMESH-PROOFBUNDLE-SPEC v1.1.0"**.
Verification was performed using the reference tool
`vm_verify_proofbundle.py` v1.1.0 and validated against the
**VaultMesh ProofBundle Conformance Test Pack v1.0**.
Implementations claiming interoperability **MUST** demonstrate
conformance against all official test vectors before asserting
support for this specification.
The tag `proofbundle-v1.1.0` in the VaultMesh repository marks
the reference implementation state for this version.
---
## 13. References
- [RFC 2119](https://www.rfc-editor.org/rfc/rfc2119): Key words for requirement levels
- [BLAKE3](https://github.com/BLAKE3-team/BLAKE3): Cryptographic hash function
- [DID Core](https://www.w3.org/TR/did-core/): Decentralized Identifiers
- [EU AI Act](https://eur-lex.europa.eu/eli/reg/2024/1689): Regulation 2024/1689
- [ISO/IEC 42001:2023](https://www.iso.org/standard/81230.html): AI Management System
---
_VaultMesh ProofBundle Specification v1.1.0_
_Sovereign Infrastructure for the Digital Age_