# Format Specification (v0)

This document freezes the v0 formats used by `vm-ledger`.

Key idea: **signing and verification do not depend on CBOR serialization details**. The canonical signing bytes are defined below and locked by golden-vector tests.

## Entry (v0)

### Fields

`EntryUnsigned` (the data that is signed):

- `prev_hash`: 32 bytes (`[u8;32]`), the previous entry hash, or `00…00` for genesis
- `ts_ms`: `u64`, Unix epoch milliseconds
- `namespace`: UTF-8 string
- `payload_cbor`: bytes, an opaque payload (the ledger never interprets it)
- `author_pubkey`: 32 bytes (`[u8;32]`), the Ed25519 verifying key

`Entry` adds:

- `sig`: 64 bytes (`[u8;64]`), the Ed25519 signature over `signing_message_v0`

### Signing bytes (v0)

`signing_message_v0 =` (concatenation, in order):

1. Domain separator: ASCII `CLv0` (4 bytes)
2. `prev_hash` (32 bytes)
3. `ts_ms` as **little-endian** `u64` (8 bytes)
4. `namespace_len` as **little-endian** `u32` (4 bytes)
5. `namespace` bytes (`namespace_len` bytes)
6. `payload_hash = BLAKE3(payload_cbor)` (32 bytes)
7. `author_pubkey` (32 bytes)

This is implemented in `vm-ledger/crates/ledger-core/src/entry.rs:32` and locked by `vm-ledger/crates/ledger-core/tests/golden_vectors.rs:10`.

### Entry hash (v0)

`entry_hash = BLAKE3( … )` over:

- Domain separator `CL-entry-v0`
- `prev_hash`
- `ts_ms` (little-endian `u64`)
- `namespace_len` (little-endian `u32`)
- `namespace` bytes
- `payload_hash = BLAKE3(payload_cbor)`
- `author_pubkey`
- `sig_len` (little-endian `u32`, always `64` for v0)
- `sig` bytes (64)

This hash is the hash-chain link and the Merkle leaf value.

## Merkle (v0)

Leaves are entry hashes (`[u8;32]`).

- Empty tree root: `BLAKE3("CL-merkle-empty-v0")`
- Leaf compression: `leaf = BLAKE3("CL-merkle-leaf-v0" || entry_hash)`
- Node compression: `node = BLAKE3("CL-merkle-node-v0" || left || right)`
- If a level has an odd number of nodes, the last node is duplicated (`right = left`).
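
The Merkle rules above can be sketched in Rust as follows. This is an illustrative sketch, not the code in `ledger-core`: the function name `merkle_root_v0` and the `Hash` alias are hypothetical, and the hash is passed in as a parameter `h` so the block needs no external crate (a real caller would supply a BLAKE3 wrapper). It also assumes that the domain separators are prepended as raw ASCII with no length prefix, and that the root of a single-leaf tree is the compressed leaf itself, which the rules above do not state explicitly.

```rust
/// 32-byte digest (hypothetical alias for this sketch).
type Hash = [u8; 32];

const EMPTY_TAG: &[u8] = b"CL-merkle-empty-v0";
const LEAF_TAG: &[u8] = b"CL-merkle-leaf-v0";
const NODE_TAG: &[u8] = b"CL-merkle-node-v0";

/// Compute a v0-style Merkle root over `entry_hashes`.
///
/// `h` stands in for BLAKE3 (assumption: any function from bytes to a
/// 32-byte digest).
fn merkle_root_v0<F>(entry_hashes: &[Hash], h: F) -> Hash
where
    F: Fn(&[u8]) -> Hash,
{
    if entry_hashes.is_empty() {
        return h(EMPTY_TAG);
    }

    // Leaf compression: H("CL-merkle-leaf-v0" || entry_hash).
    let mut level: Vec<Hash> = entry_hashes
        .iter()
        .map(|e| h([LEAF_TAG, e.as_slice()].concat().as_slice()))
        .collect();

    // Fold pairs upward until a single node remains.
    while level.len() > 1 {
        let mut next = Vec::with_capacity((level.len() + 1) / 2);
        for pair in level.chunks(2) {
            let left = &pair[0];
            // Odd level: the last node is duplicated (right = left).
            let right = pair.get(1).unwrap_or(left);
            let input = [NODE_TAG, left.as_slice(), right.as_slice()].concat();
            next.push(h(input.as_slice()));
        }
        level = next;
    }

    // Assumption: with one leaf, the root is the compressed leaf.
    level[0]
}
```

To compute the root for a checkpoint, a caller would pass only the first `entry_count` entry hashes, since the root covers a prefix of the log.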

The Merkle root is computed over the **prefix** of entries included in a checkpoint.

## Checkpoints file (v0)

`log/checkpoints.jsonl` is JSON Lines. Each line is a `Checkpoint`:

- `ts_ms` (`u64`)
- `entry_count` (`u64`): number of entries covered by this checkpoint
- `merkle_root_hex` (64 hex chars)
- `head_hash_hex` (64 hex chars): the hash of entry `entry_count-1` (or `00…00` if `entry_count == 0`)
- optional witness fields:
  - `witness_pubkey_hex`
  - `witness_sig_hex`

## Checkpoint Attestations (v0, v1)

Checkpoint attestations are independent witness signatures over a checkpoint root.

File: `log/checkpoints.attestations.jsonl` (JSON Lines).

Each line is a `CheckpointAttestationV0` or `CheckpointAttestationV1`, distinguished by the `format` field.

- `format`: must equal `civ-ledger-checkpoint-attest-v0`
- `ledger_genesis_hash_hex`: 64 hex chars, the **first entry hash** (ledger identity anchor)
- `checkpoint_entry_count`: `u64`, the number of entries covered by the checkpoint
- `checkpoint_merkle_root_hex`: 64 hex chars
- `checkpoint_head_hash_hex`: 64 hex chars, the hash of entry `checkpoint_entry_count-1` (or `00…00` if `checkpoint_entry_count == 0`)
- `ts_seen_ms`: `u64`, the witness observation time (Unix epoch milliseconds)
- `witness_pubkey_hex`: 64 hex chars, the Ed25519 public key
- `witness_sig_hex`: 128 hex chars, the Ed25519 signature over the signing bytes below

### Attestation signing bytes (v0)

`attestation_signing_message_v0 =` (concatenation, in order):

1. Domain separator: ASCII `CIV_LEDGER_CHECKPOINT_ATTEST_V0`
2. `ledger_genesis_hash` (32 bytes)
3. `checkpoint_entry_count` as little-endian `u64` (8 bytes)
4. `checkpoint_merkle_root` (32 bytes)
5. `checkpoint_head_hash` (32 bytes)
6. `ts_seen_ms` as little-endian `u64` (8 bytes)

This is implemented in `vm-ledger/crates/ledger-core/src/attestation.rs` and locked by `vm-ledger/crates/ledger-core/tests/attestation_vectors.rs`.
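
As a concrete reading of the v0 attestation signing bytes, the concatenation can be sketched in Rust as below. The function name and parameter layout are hypothetical and are not taken from `attestation.rs`. The sketch assumes the domain separator is written as raw ASCII with no length prefix or terminator, and that the hash fields are the decoded 32-byte values rather than their hex strings.

```rust
/// Build a v0-style attestation signing message by concatenating the
/// fields in the order listed in the spec (hypothetical helper).
fn attestation_signing_message_v0(
    ledger_genesis_hash: &[u8; 32],
    checkpoint_entry_count: u64,
    checkpoint_merkle_root: &[u8; 32],
    checkpoint_head_hash: &[u8; 32],
    ts_seen_ms: u64,
) -> Vec<u8> {
    // Assumption: the separator is raw ASCII, with no length prefix.
    const DOMAIN: &[u8] = b"CIV_LEDGER_CHECKPOINT_ATTEST_V0";

    let mut msg = Vec::with_capacity(DOMAIN.len() + 32 + 8 + 32 + 32 + 8);
    msg.extend_from_slice(DOMAIN);
    msg.extend_from_slice(ledger_genesis_hash);
    // Integers are encoded as fixed-width little-endian bytes.
    msg.extend_from_slice(&checkpoint_entry_count.to_le_bytes());
    msg.extend_from_slice(checkpoint_merkle_root);
    msg.extend_from_slice(checkpoint_head_hash);
    msg.extend_from_slice(&ts_seen_ms.to_le_bytes());
    msg
}
```

Because every field has a fixed width, the message needs no length prefixes to be unambiguous. A witness would sign these bytes with its Ed25519 key and hex-encode the result into `witness_sig_hex`.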

### Checkpoint Attestation (v1)

`CheckpointAttestationV1` adds one field to bind the witness signature to the specific checkpoint record timestamp:

- `format`: must equal `civ-ledger-checkpoint-attest-v1`
- all fields from v0, plus:
  - `checkpoint_ts_ms`: `u64`, the `Checkpoint.ts_ms` value being attested

Policy:

- `ts_seen_ms >= checkpoint_ts_ms`

### Attestation signing bytes (v1)

`attestation_signing_message_v1 =` (concatenation, in order):

1. Domain separator: ASCII `CIV_LEDGER_CHECKPOINT_ATTEST_V1`
2. `ledger_genesis_hash` (32 bytes)
3. `checkpoint_entry_count` as little-endian `u64` (8 bytes)
4. `checkpoint_merkle_root` (32 bytes)
5. `checkpoint_head_hash` (32 bytes)
6. `checkpoint_ts_ms` as little-endian `u64` (8 bytes)
7. `ts_seen_ms` as little-endian `u64` (8 bytes)

## ReadProof / Receipt (v0)

Read proofs are self-contained JSON objects of type `ReadProofV0`:

- `format`: must equal `civ-ledger-readproof-v0`
- `entry_hash_hex`: 64 hex chars (the entry hash being proven)
- `entry_index`: 0-based index of the entry within the checkpoint prefix
- `entry_count`: number of entries in the checkpoint prefix
- `checkpoint_merkle_root_hex`: 64 hex chars
- `path`: array of Merkle steps, each with:
  - `sibling_side`: `"left"` or `"right"` (position of the sibling relative to the current hash)
  - `sibling_hash_hex`: 64 hex chars (the sibling hash at that Merkle level)

Verification:

1. `current = BLAKE3("CL-merkle-leaf-v0" || entry_hash)`
2. For each path step:
   - if `sibling_side == "left"`: `current = BLAKE3("CL-merkle-node-v0" || sibling || current)`
   - if `sibling_side == "right"`: `current = BLAKE3("CL-merkle-node-v0" || current || sibling)`
3. Accept iff `current == checkpoint_merkle_root`

This is implemented in `vm-ledger/crates/ledger-core/src/proof.rs`.

## Receipt (v0)

Receipts bundle: **entry bytes + inclusion proof + (optional) witness attestations**.
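
The inclusion proof carried in a receipt is a `ReadProofV0`, checked with the ReadProof verification steps. A minimal Rust sketch of that path fold is shown here. The types `SiblingSide` and `PathStep` and the function `verify_read_proof_v0` are hypothetical names for illustration, not the API of `proof.rs`. The hash is passed in as `h` as a stand-in for BLAKE3, and the hex fields are assumed to be already decoded to 32-byte arrays.

```rust
/// 32-byte digest (hypothetical alias for this sketch).
type Hash = [u8; 32];

/// Position of the sibling relative to the current hash.
enum SiblingSide {
    Left,
    Right,
}

/// One decoded Merkle path step (hypothetical in-memory form).
struct PathStep {
    sibling_side: SiblingSide,
    sibling_hash: Hash,
}

/// Fold the path from the leaf up and compare against the expected root.
/// `h` stands in for BLAKE3 (assumption: bytes in, 32-byte digest out).
fn verify_read_proof_v0<F>(
    entry_hash: &Hash,
    path: &[PathStep],
    checkpoint_merkle_root: &Hash,
    h: F,
) -> bool
where
    F: Fn(&[u8]) -> Hash,
{
    const LEAF_TAG: &[u8] = b"CL-merkle-leaf-v0";
    const NODE_TAG: &[u8] = b"CL-merkle-node-v0";

    // Step 1: compress the entry hash into a leaf.
    let mut current = h([LEAF_TAG, entry_hash.as_slice()].concat().as_slice());

    // Step 2: combine with each sibling, respecting its side.
    for step in path {
        let sibling = step.sibling_hash.as_slice();
        let input = match step.sibling_side {
            SiblingSide::Left => [NODE_TAG, sibling, current.as_slice()].concat(),
            SiblingSide::Right => [NODE_TAG, current.as_slice(), sibling].concat(),
        };
        current = h(input.as_slice());
    }

    // Step 3: accept iff the folded hash equals the checkpoint root.
    current == *checkpoint_merkle_root
}
```

This sketch covers only the three listed steps. It does not cross-check `entry_index` or `entry_count` against the shape of the path, which a stricter verifier might also want to do.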

JSON object `ReceiptV0`:

- `format`: must equal `civ-ledger-receipt-v0`
- `entry_cbor_b64`: base64 (no padding) of CBOR-encoded `Entry`
- `entry_hash_hex`: 64 hex chars
- `read_proof`: a `ReadProofV0`
- `attestations`: array of `CheckpointAttestationV0` or `CheckpointAttestationV1` (may be empty)

Verification:

1. Decode `entry_cbor_b64` → `Entry`, verify the entry signature, and compute `entry_hash`.
2. Verify `read_proof` and require `read_proof.entry_hash_hex == entry_hash_hex`.
3. If a witness is required: verify at least one included checkpoint attestation (v0 or v1) and require that it matches both `read_proof.entry_count` and `read_proof.checkpoint_merkle_root_hex`.

This is implemented in `vm-ledger/crates/ledger-core/src/receipt.rs`.

## Payload Type: `file_anchor.v0`

The `ledger anchor-file` CLI emits a CBOR payload with the following JSON shape:

- `type`: `"file_anchor.v0"`
- `path`: string (preferably repo-relative)
- `hash_blake3_hex`: 64 hex chars
- `bytes`: `u64`
- `git`:
  - `commit`: string or null
  - `dirty`: boolean or null