Proof bundle format

A proof bundle is a zip containing four files. It is the externally-visible deliverable and the thing that must work end-to-end — design decisions protect its integrity first.

flowchart LR
    subgraph Inputs
        M[MerkleRoots + OTSProof]
        S["Snapshots metadata<br/>+ HMAC commitments"]
        K[Ed25519 signing key]
    end

    AB[assemble_bundle_data] --> CJ[canonical_bundle_json]
    M --> AB
    S --> AB
    CJ --> JSON[bundle.json]
    CJ --> SG[sign_canonical_bytes]
    K --> SG --> SIG[bundle.sig.json]
    AB --> PDF[render_bundle_pdf] --> PDF_F[bundle.pdf]
    VER[verify_template.py] --> VERF[verify.py]

    JSON --> ZIP[[bundle.zip]]
    SIG --> ZIP
    PDF_F --> ZIP
    VERF --> ZIP

Contents

bundle.pdf

Human-readable cover document. Contains:

  • Signed attestation of authorship (the text the author agreed to the first time they generated a bundle).
  • Writing statistics: total captures, active days, peak day, final word count, sessions.
  • Word-count-over-time chart — rendered by reportlab Drawing primitives directly in the PDF, no PNG intermediate.
  • Cryptographic appendix: Merkle roots, OTS receipt fingerprints, Bitcoin block heights where known.
  • One page of verification instructions for the publisher.
  • A bundle_identifier_hex footer on every page — the content hash of bundle.json.

Zero manuscript content.

bundle.json

The canonical, content-addressed payload. This is what verify.py actually reads. Deterministic JSON — sorted keys, UTF-8, no trailing whitespace, stable floating-point formatting — so the same inputs always produce the same bytes, and so bundle_identifier_hex (SHA-256 of this file) stably identifies a specific bundle.
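A minimal sketch of deterministic serialization and content addressing, assuming compact separators and Python's default float formatting. The function below is a hypothetical stand-in for canonical_bundle_json, not the backend's actual implementation, and it does not cover how the real code treats the bundle_identifier_hex field that sits inside the payload itself.

```python
import hashlib
import json


def canonical_bundle_bytes(data: dict) -> bytes:
    """Serialize a bundle dict to deterministic UTF-8 JSON bytes.

    Assumptions (not confirmed by the backend): compact separators and
    no ASCII escaping. Sorted keys and UTF-8 come from the format description.
    """
    text = json.dumps(
        data,
        sort_keys=True,         # key order never depends on insertion order
        separators=(",", ":"),  # no incidental whitespace
        ensure_ascii=False,     # emit real UTF-8 rather than \u escapes
        allow_nan=False,        # NaN/Infinity are not valid JSON; fail loudly
    )
    return text.encode("utf-8")


def identifier_hex(canonical: bytes) -> str:
    """SHA-256 of the canonical bytes, as lowercase hex."""
    return hashlib.sha256(canonical).hexdigest()


# Two dicts with the same content but different insertion order
# serialize to identical bytes, so they share one identifier.
a = {"format_version": 1, "snapshots": [], "author": {"email": "a@example.com"}}
b = {"author": {"email": "a@example.com"}, "snapshots": [], "format_version": 1}
assert canonical_bundle_bytes(a) == canonical_bundle_bytes(b)
assert identifier_hex(canonical_bundle_bytes(a)) == identifier_hex(canonical_bundle_bytes(b))
```

Any change to the separators, escaping, or number formatting changes the bytes and therefore the identifier, which is why those choices have to be fixed once per format_version.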

Schema (illustrative — see backend/api/bundle.py and backend/api/schemas.py for the authoritative shape):

{
  "format_version": 1,
  "bundle_identifier_hex": "...",        // SHA-256 of the canonical bytes
  "generated_at": "2026-04-22T14:00:00Z",
  "author": {
    "email": "...",
    "attestation_text": "...",            // user-facing attestation
    "attestation_signed_at": "..."
  },
  "writing_summary": {
    "capture_count": 412,
    "active_days": 91,
    "peak_day": { "date": "...", "word_count": 8102 },
    "final_word_count": 85000,
    "session_count": 128
  },
  "merkle_roots": [
    {
      "computed_at": "2026-04-09T00:00:00Z",
      "root_hash_hex": "...",
      "leaves_hex": ["...", "...", "..."],   // HMAC commitments in order
      "ots_receipt_b64": "...",              // DetachedTimestampFile
      "bitcoin_block_height": 891234          // null if pending
    },
    ...
  ],
  "snapshots": [
    {
      "captured_at": "...",
      "plaintext_hmac_hex": "...",        // = Merkle leaf
      "merkle_root_hash_hex": "...",      // which day this save anchors to
      "word_count": 1234,
      "char_count": 7890,
      "file_type": "md"
    },
    ...
  ]
}

Note: no path, filename, or file contents. The only manuscript-derived value is plaintext_hmac_hex, which is an HMAC under a key the server never sees.
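To illustrate why the commitment reveals nothing without the key, here is a sketch of how a holder of mac_key could derive plaintext_hmac_hex and look it up among the snapshots. The function names are hypothetical, and the exact bytes fed to the HMAC (raw file bytes here) are an assumption.

```python
import hashlib
import hmac


def plaintext_hmac_hex(mac_key: bytes, plaintext: bytes) -> str:
    """HMAC-SHA256 over the manuscript bytes, as lowercase hex."""
    return hmac.new(mac_key, plaintext, hashlib.sha256).hexdigest()


def matches_snapshot(candidate_hex: str, snapshots: list[dict]) -> bool:
    """True if the candidate HMAC equals any snapshot's plaintext_hmac_hex."""
    return any(
        hmac.compare_digest(candidate_hex, snap["plaintext_hmac_hex"])
        for snap in snapshots
    )


# Hypothetical key and manuscript, for illustration only.
key = b"author-held key, never sent to the server"
manuscript = b"Chapter 1\n\nIt was a dark and stormy night."

snapshots = [{"plaintext_hmac_hex": plaintext_hmac_hex(key, manuscript)}]
assert matches_snapshot(plaintext_hmac_hex(key, manuscript), snapshots)
# A different key yields an unrelated value, so the server cannot test guesses.
assert not matches_snapshot(plaintext_hmac_hex(b"other key", manuscript), snapshots)
```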

bundle.sig.json

Ed25519 signature over the canonical bundle.json bytes.

{
  "alg": "Ed25519",
  "signature_b64": "...",
  "public_key_fingerprint": "..."    // SHA-256 truncated; stable across bundles
}

The public-key fingerprint is embedded so the verifier can confirm the bundle was signed by BlindProof's production signing key (rather than a substituted one). The Ed25519 private key lives as the BLINDPROOF_SIGNING_KEY Fly secret; its public key is published alongside the docs.
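A sketch of a fingerprint check under stated assumptions: the fingerprint is taken to be SHA-256 over the raw 32-byte Ed25519 public key, truncated to 16 hex characters. Both the input encoding and the truncation length are guesses for illustration; the authoritative definition lives in the backend.

```python
import hashlib
import hmac


def public_key_fingerprint(public_key_raw: bytes, hex_chars: int = 16) -> str:
    """Truncated SHA-256 of the raw public key (truncation length is assumed)."""
    return hashlib.sha256(public_key_raw).hexdigest()[:hex_chars]


def is_expected_signer(sig_doc: dict, published_key_raw: bytes) -> bool:
    """Compare the fingerprint in bundle.sig.json with the published key's."""
    expected = public_key_fingerprint(published_key_raw)
    return hmac.compare_digest(sig_doc["public_key_fingerprint"], expected)


# Placeholder key bytes, not a real BlindProof key.
published = bytes(range(32))
sig_doc = {
    "alg": "Ed25519",
    "signature_b64": "...",
    "public_key_fingerprint": public_key_fingerprint(published),
}
assert is_expected_signer(sig_doc, published)
assert not is_expected_signer(sig_doc, bytes(32))
```

The fingerprint only identifies which key to trust; the signature itself still has to be verified against the full public key.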

verify.py

The checking tool. A PEP 723 single-file Python script that uses only the standard library apart from one external dependency: opentimestamps-client (needed to call the ots CLI for Bitcoin verification).

What it does:

  1. Loads bundle.json and bundle.sig.json.
  2. Recomputes bundle_identifier_hex from the canonical bytes, verifies the Ed25519 signature over those bytes, and checks that the signing key matches the embedded public-key fingerprint.
  3. For each merkle_roots[i]: re-derives root_hash_hex from leaves_hex using the same Bitcoin-style SHA-256 tree the backend used. Refuses to continue if they don't match.
  4. If a manuscript file path is supplied, checks that the manuscript matches a snapshot. verify.py cannot compute HMAC-SHA256(mac_key, plaintext) on its own, because the publisher does not have mac_key. Instead, it requires the author to pass the manuscript and the derived HMAC, or it reads the HMAC from a sidecar in the bundle. The manuscript-match flow is explained in Verifying a bundle.
  5. For each ots_receipt_b64: writes the bytes to a temp file, invokes ots verify -d <digest_hex> <tempfile>, and maps the output to PASS / PENDING / FAIL.
  6. Prints a summary: overall verdict, plus one line per check.
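Step 3 above can be sketched as follows. This assumes the usual Bitcoin-style construction: adjacent nodes are concatenated and hashed, and the last node is duplicated when a level has an odd count. Whether the backend hashes once or twice per node (Bitcoin itself uses double SHA-256) is not stated here, so it is exposed as a parameter; the function name is hypothetical.

```python
import hashlib


def merkle_root_hex(leaves_hex: list[str], double_hash: bool = False) -> str:
    """Re-derive a Merkle root from hex-encoded leaves.

    Assumed rules: pairwise SHA-256 over concatenated children, last node
    duplicated on odd-sized levels, a single leaf is its own root.
    """
    if not leaves_hex:
        raise ValueError("at least one leaf is required")

    def node_hash(data: bytes) -> bytes:
        digest = hashlib.sha256(data).digest()
        return hashlib.sha256(digest).digest() if double_hash else digest

    level = [bytes.fromhex(leaf) for leaf in leaves_hex]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the odd node out
        level = [
            node_hash(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


def check_root(entry: dict) -> bool:
    """Compare the recomputed root with the one recorded in merkle_roots[i]."""
    return merkle_root_hex(entry["leaves_hex"]) == entry["root_hash_hex"]


# Illustrative leaves; real leaves are the HMAC commitments in order.
leaves = [hashlib.sha256(bytes([i])).hexdigest() for i in range(3)]
entry = {"leaves_hex": leaves, "root_hash_hex": merkle_root_hex(leaves)}
assert check_root(entry)
# Reordering leaves changes the root, so order is part of the commitment.
assert not check_root({**entry, "leaves_hex": leaves[::-1]})
```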

It is deliberately simple: a publisher should be able to audit the script in an afternoon. Keep it stdlib-only apart from opentimestamps-client; every added dependency weakens the durability promise.

Determinism and stability

The canonical JSON is deterministic: given the same backend state, regenerating a bundle for the same user produces the same bytes (same bundle_identifier_hex, same signature). This is important for a few reasons:

  • A publisher can compare two bundles produced at different times and quickly see what changed (new snapshots appended; nothing rewritten).
  • A republishing author can produce a fresh bundle before delivery without invalidating an earlier one.
  • The fingerprint in the PDF footer lets non-technical readers match a PDF to its canonical payload at a glance.

If you change bundle.json's shape, bump format_version. verify.py must tolerate older formats for at least as long as any bundle produced under them might still be in circulation — which, given the durability promise, is "forever".
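One way to keep older formats verifiable is to dispatch on format_version and never remove a handler once bundles of that version exist. The sketch below is hypothetical: the handler name, the required-key check, and the return shape are illustrative, not verify.py's actual structure.

```python
import json

# Top-level keys taken from the illustrative schema for format_version 1.
_V1_REQUIRED_KEYS = ("author", "writing_summary", "merkle_roots", "snapshots")


def _check_v1(bundle: dict) -> list[str]:
    """Return the required top-level keys missing from a version 1 bundle."""
    return [key for key in _V1_REQUIRED_KEYS if key not in bundle]


# Handlers are only ever added here; old versions stay supported.
_HANDLERS = {1: _check_v1}


def check_bundle(raw: bytes) -> list[str]:
    """Route a bundle to the handler for its declared format_version."""
    bundle = json.loads(raw)
    version = bundle.get("format_version")
    handler = _HANDLERS.get(version)
    if handler is None:
        raise ValueError(f"unsupported format_version: {version!r}")
    return handler(bundle)


complete = json.dumps({
    "format_version": 1,
    "author": {},
    "writing_summary": {},
    "merkle_roots": [],
    "snapshots": [],
}).encode("utf-8")
assert check_bundle(complete) == []
```

An unknown version raises rather than falling back to the newest handler, so a verifier never silently applies the wrong rules to a bundle.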

Stability caveats

  • Ed25519 key rotation. When we rotate the signing key (V1), bundles signed by the old key must still verify. The plan is to publish a key-history document keyed by fingerprint, which verify.py consults offline. Not yet implemented.
  • OTS receipt format evolution. opentimestamps-client is stable and backwards-compatible; we pin a lower bound and test against the latest.
  • Bitcoin. Bitcoin block headers are what OTS ultimately anchors to. The dependency surface here is "Bitcoin continues to exist and the public calendar network continues to operate". Both are outside our control, and both are what give the bundle its durability.

See also