# Sentinel v1 Canonicalization & Hashing Rules

This document defines deterministic event hashing and Merkle root computation for Sentinel v1. Verification MUST be deterministic across platforms given the same artifacts.

## 1) Hash function (`hash_algo`)

`hash_algo` MUST be one of:

- `blake3` (recommended)
- `sha256` (fallback for constrained platforms)

The chosen `hash_algo` MUST be constant for a given Sentinel instance/build. Verifiers MUST reject mixed algorithms within a single bundle unless explicitly versioned.

### 1.1 `vmhash`

`vmhash(data: bytes) -> string` returns:

- `"blake3:" + hex(blake3(data))` when `hash_algo=blake3`
- `"sha256:" + hex(sha256(data))` when `hash_algo=sha256`

`hex(...)` is lowercase hex with no separators.

## 2) JSON canonicalization (`canonicalization_version`)

`canonicalization_version` for Sentinel v1 events is:

- `sentinel-event-jcs-v1`

Canonical JSON MUST use RFC 8785 (JSON Canonicalization Scheme, “JCS”):

- UTF-8 encoding
- Object keys sorted lexicographically
- No insignificant whitespace
- Numbers encoded per JCS rules

If a platform cannot implement full JCS, it MUST NOT claim `sentinel-event-jcs-v1`.

## 3) Event canonical bytes

Each exported event is a JSON object that conforms to `event.schema.json`.

`event_canonical_bytes` is the UTF-8 bytes of the JCS-canonicalized event object.

## 4) Event hash + hash chain

### 4.1 `event_hash`

`event_hash` MUST be computed over the canonical bytes of the event object *excluding* the `event_hash` field itself.

Define:

- `event_without_event_hash = event` with the `event_hash` property removed (if present)
- `event_canonical_bytes = jcs_bytes(event_without_event_hash)`

Then:

`event_hash = vmhash(event_canonical_bytes)`

For exported artifacts, `event_hash` MUST be present in the event record and verifiers MUST recompute and compare it.

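
The `event_hash` computation described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a conforming implementation: `jcs_bytes_approx` is a hypothetical stand-in built on `json.dumps` that agrees with RFC 8785 only for simple inputs (ASCII keys; string, integer, boolean, and null values), and `vmhash` is fixed to `sha256` because the Python standard library does not ship BLAKE3. The helper names `compute_event_hash` and `verify_event_hash` are illustrative and do not come from this specification.

```python
import hashlib
import json


def jcs_bytes_approx(obj) -> bytes:
    """Approximate JCS output (ASSUMPTION: not a full RFC 8785 implementation).

    Differs from real JCS for floating-point numbers and for keys containing
    characters outside the Basic Multilingual Plane (JCS sorts keys by UTF-16
    code units, Python sorts by code point).
    """
    text = json.dumps(
        obj,
        sort_keys=True,          # deterministic key order
        separators=(",", ":"),   # no insignificant whitespace
        ensure_ascii=False,      # emit non-ASCII characters as UTF-8, not \uXXXX
        allow_nan=False,         # reject NaN/Infinity, which are not valid JSON
    )
    return text.encode("utf-8")


def vmhash(data: bytes) -> str:
    """vmhash for hash_algo=sha256 (the blake3 variant needs a third-party package)."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def compute_event_hash(event: dict) -> str:
    """Hash the event with the event_hash property removed, if present."""
    event_without_event_hash = {k: v for k, v in event.items() if k != "event_hash"}
    return vmhash(jcs_bytes_approx(event_without_event_hash))


def verify_event_hash(event: dict) -> bool:
    """Recompute event_hash and compare it with the stored value."""
    return event.get("event_hash") == compute_event_hash(event)
```

Because `event_hash` is stripped before hashing, an exporter can compute the hash, store it in the record, and a verifier can call `verify_event_hash` on the stored record without any special handling.
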
### 4.2 `prev_event_hash`

- For `seq = 0` (or the first event in a new ledger): `prev_event_hash = "0"`
- For `seq = n > 0`: `prev_event_hash` MUST equal the computed `event_hash` of the immediately preceding event (`seq = n-1`) in the same ledger.

This provides fast tamper evidence even without Merkle recomputation.

## 5) Operation digest (`op_digest`)

`op_digest` commits to the *normalized* operation descriptor.

Define the normalized object:

```json
{
  "op": "",
  "params": { "canonical": "params" }
}
```

Normalization rules:

- `op` MUST be a stable, versioned identifier (e.g., `sentinel.export_seal.v1`).
- `params` MUST be JSON (no NaN/Infinity); omit unset fields rather than using null where possible.
- Canonicalize the object using `sentinel-event-jcs-v1`, then hash:

`op_digest = vmhash(jcs_bytes({"op": op, "params": params}))`

## 6) Merkle root (`ROOT.current.txt`)

### 6.1 Leaves

The Merkle tree commits to the ordered list of event hashes:

`leaves = [event_hash(seq=0), event_hash(seq=1), ...]`

Each leaf is a `vmhash` string (`algo:hex`).

Note on ranged bundles: A verifier can only recompute the global Merkle roots for an arbitrary `since_seq > 0` bundle if it is also given a verifiable Merkle continuation state (e.g., a frontier snapshot) at `since_seq-1`. Otherwise, verification MUST fall back to hash-chain + file-integrity checks for that range, or the bundle MUST start at `since_seq = 0`.

### 6.2 Parent computation (VaultMesh-style)

To compute a parent from two children:

- Let `left_hex = left.split(":", 1)[-1]`
- Let `right_hex = right.split(":", 1)[-1]`
- `parent = vmhash( (left_hex + right_hex).encode("utf-8") )`

If the level has an odd count, duplicate the last element (i.e., `right = left`).

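
The parent rule and the odd-count duplication can be combined into a level-by-level reduction, sketched below in Python. Several things here are assumptions rather than requirements of this document: `vmhash` is fixed to `sha256` (the standard library has no BLAKE3), the function names `parent` and `merkle_root` are illustrative, and a tree with exactly one leaf is assumed to have that leaf as its root because the reduction stops once a single node remains. The empty case returns `vmhash(b"empty")` as defined in section 6.3.

```python
import hashlib


def vmhash(data: bytes) -> str:
    """vmhash for hash_algo=sha256 (ASSUMPTION: sha256 chosen for portability)."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def parent(left: str, right: str) -> str:
    """Hash the concatenated hex digests of two children, dropping the algo prefix."""
    left_hex = left.split(":", 1)[-1]
    right_hex = right.split(":", 1)[-1]
    return vmhash((left_hex + right_hex).encode("utf-8"))


def merkle_root(leaves: list[str]) -> str:
    """Reduce an ordered list of vmhash strings to a single root."""
    if not leaves:
        # Empty-tree root as defined in section 6.3.
        return vmhash(b"empty")
    level = list(leaves)
    # ASSUMPTION: a single remaining node is the root and is not re-hashed.
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Odd count: duplicate the last element so it pairs with itself.
            level.append(level[-1])
        level = [parent(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]
```

With three leaves `a`, `b`, `c`, this sketch yields `parent(parent(a, b), parent(c, c))`. Note that the input to each parent hash is the UTF-8 text of the hex digests, not the raw digest bytes.
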
### 6.3 Empty tree root

If there are no leaves, the root MUST be:

`vmhash(b"empty")`

### 6.4 Root publication file format

`ROOT.current.txt` MUST be human-readable and parseable as key/value lines:

```
format=vm-sentinel-root-v1
root=
seq=
updated_at=
hash_algo=
canonicalization_version=sentinel-event-jcs-v1
```

Additional keys MAY be included, but verifiers MUST ignore unknown keys.
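
A verifier-side reader for this file could look like the Python sketch below. It is illustrative only: the function name `parse_root_file`, the choice to treat all six listed keys as required, the skipping of blank lines, and the trimming of surrounding whitespace are assumptions that this document does not state. Values are returned as strings because their exact formats are not specified here.

```python
KNOWN_KEYS = (
    "format",
    "root",
    "seq",
    "updated_at",
    "hash_algo",
    "canonicalization_version",
)


def parse_root_file(text: str) -> dict:
    """Parse key=value lines, keeping only the known keys."""
    fields = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # ASSUMPTION: blank lines are tolerated
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {raw_line!r}")
        fields[key.strip()] = value.strip()

    # ASSUMPTION: every key shown in the format listing is required.
    missing = [k for k in KNOWN_KEYS if k not in fields]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    if fields["format"] != "vm-sentinel-root-v1":
        raise ValueError(f"unsupported format: {fields['format']!r}")
    if fields["canonicalization_version"] != "sentinel-event-jcs-v1":
        raise ValueError("unsupported canonicalization_version")

    # Unknown keys are dropped here, so they are ignored by the verifier.
    return {k: fields[k] for k in KNOWN_KEYS}
```

Splitting with `partition("=")` cuts only at the first `=`, so a value that itself contains `=` is preserved intact.
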