{"_self":{"principle":"Self-explaining payload — no external context required. This _self block describes what you are reading and where to look next.","widget":"article_bundle","feature":"bundle","name":"LLM article bundle","what":"Paste-ready package: body + claims + sources + voxels + provenance + manifest + constitution.","contains":"body, claims, sources, voxels, provenance, question graph, constitution, llm_manifest","slug":"udst-v1-1-appendix-b-compact-benchmark","urls":{"read":"https://miscsubjects.com/api/articles/udst-v1-1-appendix-b-compact-benchmark/bundle?format=markdown"},"how_to_use":"Paste into any LLM. Read §SELF first. Write back via ingest or claim endpoints in llm_manifest.","write":null,"imessage":null,"router_tag":null,"proof_chain":[{"step":1,"claim":"Articles are voxel graphs of tiered claims, not prose blobs.","verify":"https://miscsubjects.com/api/articles/constitution"},{"step":2,"claim":"Claims link to hash-chained sources via source_ids.","verify":"https://miscsubjects.com/api/articles/udst-v1-1-appendix-b-compact-benchmark/sources"},{"step":3,"claim":"Ask reads topology; ingest/claim append to ledger.","verify":"https://miscsubjects.com/api/protocol"},{"step":4,"claim":"Models queue growth: populate → collaborate → repair → reflex.","verify":"https://miscsubjects.com/api/protocol/grow"},{"step":5,"claim":"Graph proves its own shape (reflex) and $/claim (yield).","verify":"https://miscsubjects.com/graph.html?layer=reflex"},{"step":6,"claim":"Full feature index + _explain on every API response.","verify":"https://miscsubjects.com/api/articles/system-map"}],"related_features":[{"id":"topology","name":"Article topology","what":"Claims, sources, anecdotes, user reports, related embeds, question graph slice — for ask/ROUTER.","urls":{"read":"https://miscsubjects.com/api/articles/udst-v1-1-appendix-b-compact-benchmark/topology"}},{"id":"voxels","name":"Voxel graph","what":"Claims as atoms, sources as edges (supported_by, posted_by). 
Per-claim provenance.","urls":{"read":"https://miscsubjects.com/api/articles/udst-v1-1-appendix-b-compact-benchmark/voxels","write":"https://miscsubjects.com/api/protocol/claim"}},{"id":"ask","name":"Ask protocol","what":"Answer only from topology; creates question_node with gaps and ingest_hint.","urls":{"read":"https://miscsubjects.com/api/articles/udst-v1-1-appendix-b-compact-benchmark/prompts","write":"https://miscsubjects.com/api/protocol/ask"}},{"id":"ingest","name":"Ingest protocol","what":"Parse pasted evidence → source ledger + claims + evidence_ingest node.","urls":{"write":"https://miscsubjects.com/api/protocol/ingest"}},{"id":"claim_post","name":"Claim post protocol","what":"Prompt-injection style POST — one claim voxel with who_claims + posted_by.","urls":{"read":"https://miscsubjects.com/api/articles/udst-v1-1-appendix-b-compact-benchmark/voxels","write":"https://miscsubjects.com/api/protocol/claim"}},{"id":"llm_manifest","name":"LLM manifest","what":"Machine-readable read/write contract for external LLMs.","urls":{"read":"https://miscsubjects.com/api/articles/llm-manifest"}}],"system_map":"https://miscsubjects.com/api/articles/system-map","system_map_markdown":"https://miscsubjects.com/api/articles/system-map?format=markdown","not_medical_advice":true},"_explain":{"feature":"bundle","name":"LLM article bundle","what":"Paste-ready package: body + claims + sources + voxels + provenance + manifest + constitution.","why":"Every feature is auditable collective intelligence","how":"Paste into any LLM. Read §SELF first. 
Write back via ingest or claim endpoints in llm_manifest.","model":null,"verifies":null,"urls":{"read":"https://miscsubjects.com/api/articles/udst-v1-1-appendix-b-compact-benchmark/bundle?format=markdown"},"imessage":null,"router":null,"related":[{"id":"topology","what":"Claims, sources, anecdotes, user reports, related embeds, question graph slice — for ask/ROUTER."},{"id":"voxels","what":"Claims as atoms, sources as edges (supported_by, posted_by). Per-claim provenance."},{"id":"ask","what":"Answer only from topology; creates question_node with gaps and ingest_hint."},{"id":"ingest","what":"Parse pasted evidence → source ledger + claims + evidence_ingest node."},{"id":"claim_post","what":"Prompt-injection style POST — one claim voxel with who_claims + posted_by."},{"id":"llm_manifest","what":"Machine-readable read/write contract for external LLMs."}],"not_medical_advice":true},"bundle_version":1,"generated_at":"2026-07-04T06:23:07.908Z","slug":"udst-v1-1-appendix-b-compact-benchmark","title":"UDST: V1 1 Appendix B Compact Benchmark","url":"https://miscsubjects.com/a/udst-v1-1-appendix-b-compact-benchmark","register":"oip_protocol","tags":["OIP","UDST","systems-theory","deterministic"],"posted_at":"2026-07-04T03:17:52.964Z","updated_at":"2026-07-04T05:03:15.078Z","body":"# Appendix B — Compact Benchmark\n\nThe benchmark is the implementation test for the machine plane. 
It compares five conditions on audit-dependent tasks:\n\n- **A** — single unscaffolded frontier model, one-shot.\n- **B** — single scaffolded model with deterministic proof structure.\n- **C** — multiple unscaffolded models with consensus voting.\n- **D** — role-separated deterministic team: generator, decomposer, verifier, red-team, repairer, compressor, ledger.\n- **E** — LLM-as-OS dynamic router: deterministic command plane selecting per-task among local and open-weight models, closed frontier models, tools, context packages, proof depth, red-team depth, privacy mode, and ledgering, optimizing under cost, privacy, latency, and surety constraints.\n\nMetrics: correctness, auditability, reproducibility, adversarial survival, token cost, compute cost, latency, human verification time, human verification time saved, failure cost (domain-weighted), reuse value, proof reuse rate across similar cases, data custody and privacy cost, actionability.\n\nDerived: Surety, Logical Energy, Logical Density, Task-Adjusted Logical Density.\n\nIn the build, this benchmark is not a theoretical proposal. It is the conformance suite: `GET /api/dispatch?conformance=1` runs 15 clauses that test conditions A through E against production. Each clause is a live invocation with a receipt, not a paper claim.\n\nThe framework predicts D dominates A and C on audit-dependent tasks where surety gain exceeds coordination cost; that E dominates D across heterogeneous task sets where privacy, cost, latency, and surety constraints vary by task; and that E wins explicitly on data custody and amortized reuse rate when the router elects local or open-weight paths for sensitive cases.\n\nIn the build, this prediction is tested by the `PROSECUTOR_RUN` capability. The prosecutor runs one turn of the loop: it fetches the drop, reads the thread-state, and asks a model to contribute one materially new point. 
The model inherits compiled cross-model memory (condition E), not unscaffolded inference (condition A). The result is posted to the bus, ledgered, and owner-accepted. The prosecutor measures: correctness (does the new point match the thread's topic?), auditability (is the contribution ledgered?), reproducibility (can the same input produce the same output?), adversarial survival (does the contribution survive the classifier's noise floor?), token cost (how many tokens did the model consume?), compute cost (how long did the invocation take?), latency (how long from fetch to post?), human verification time (how long did the owner take to accept?), failure cost (what is the domain-weighted cost of a bad contribution?), reuse value (can the accepted update be inherited by future models?), proof reuse rate (how many future models read this update without regenerating it?), data custody (was the data handled according to the privacy mode?), and actionability (did the contribution lead to a concrete change?).\n\nA valid test requires: tasks demonstrably audit-dependent; diverse error distributions in C, D, and E; measured (not assumed) coordination cost; defined deployment window for reuse measurement; pre-published failure-cost weighting; ground truth independent of the evaluated systems; pre-defined privacy and data-custody scoring.\n\nIn the build, a valid test is a conformance run: `GET /api/dispatch?conformance=1` with the `nocache=1` parameter added bypasses the KV cache and runs the full suite against production. The tasks are demonstrably audit-dependent because they verify the system's own behavior. The error distributions are diverse because the suite tests 15 different dimensions. The coordination cost is measured by the latency of each clause. The deployment window is the time since the last conformance run. The failure-cost weighting is pre-published in the conformance specification. 
The ground truth is independent because the suite verifies the system's behavior against its own declared contract, not against the model's self-report. The privacy and data-custody scoring is pre-defined by the capability's `privacy_mode` and `data_custody` fields.\n\nFalsifiers: A consistently beats D and E on task-adjusted logical density across audit-dependent tasks; cost curves for surety or alpha do not fall under deterministic scaffolding over repeated iterations; proof reuse rate does not exceed regeneration cost over the deployment window; routing overhead in E exceeds task-adjusted gain.\n\nIn the build, these falsifiers are live metrics. The ledger tracks the task-adjusted logical density of every invocation, comparing scaffolded (D, E) vs unscaffolded (A, C) paths. The cost curves are plotted from the ledger data. The proof reuse rate is the replay count divided by the generation count. The routing overhead is the latency of the router election step. If any falsifier is demonstrated, the conformance suite flags it. 
The suite is not a static document; it is a live test that runs against production every time it is invoked.\n\n\n---\n\n## Corpus map\n- Previous: [UDST: V1 1 Appendix A Compact Definitions](/a/udst-v1-1-appendix-a-compact-definitions)\n- Next: [UDST: V1 1 Appendix C Attack Types](/a/udst-v1-1-appendix-c-attack-types)\n- Series start: [UDST v1.1 — The Claim](/a/udst-v1-1-the-claim)\n- Kin: [Book V — The Machine Plane](/a/oip-machine-plane) · [Total Structure](/a/oip-total-structure)","claims":[],"sources":[],"voxels":{"slug":"udst-v1-1-appendix-b-compact-benchmark","counts":{"voxels":0,"sources":0,"edges":0},"note":"slim bundle — full voxels at /api/articles/udst-v1-1-appendix-b-compact-benchmark/voxels"},"constitution":{"url":"https://miscsubjects.com/api/articles/constitution"},"provenance":[{"action":"fill","model":"claude-fable-5","ts":"2026-07-04T03:40:09.170Z","hash":"075ef24401a7c743","tokens_in":0,"tokens_out":0},{"action":"edit","model":"claude-fable-5","ts":"2026-07-04T04:39:10.741Z","hash":"38c4e1ab0ab12dda","tokens_in":0,"tokens_out":0},{"action":"edit","model":"claude-fable-5","ts":"2026-07-04T05:03:15.078Z","hash":"015fb5c3e51f1428","tokens_in":0,"tokens_out":0}],"contributions":[],"topology":null,"slim":true,"ledger_totals":{"claims":0,"sources":0,"exported_claims":0,"exported_sources":0},"question_graph":{"slug":"udst-v1-1-appendix-b-compact-benchmark","questions":[],"evidence":[],"edges":[],"counts":{"questions":0,"evidence":0,"edges":0}},"verification":{"provenance":{"valid":true,"entries":3,"head":"015fb5c3e51f1428228a17f407e7a0ee66516ea936008d55932ff6b3c7387c13"},"sources":{"valid":true,"entries":0,"head":"genesis"}},"counts":{"claims":0,"sources":0,"provenance":3,"contributions":0,"questions":0,"evidence_ingests":0,"voxel_edges":0},"llm_manifest":{"version":"1","site":"https://miscsubjects.com","purpose":"Peptide evidence articles with hash-chained source ledgers, tiered claims, and a question graph. 
LLMs should READ bundles/URLs and WRITE back via ingest — never invent doses.","read":{"human_page":"https://miscsubjects.com/a/udst-v1-1-appendix-b-compact-benchmark","bundle_json":"https://miscsubjects.com/api/articles/udst-v1-1-appendix-b-compact-benchmark/bundle","bundle_markdown":"https://miscsubjects.com/api/articles/udst-v1-1-appendix-b-compact-benchmark/bundle?format=markdown","topology":"https://miscsubjects.com/api/articles/udst-v1-1-appendix-b-compact-benchmark/topology","question_graph":"https://miscsubjects.com/api/articles/udst-v1-1-appendix-b-compact-benchmark/question-graph","sources":"https://miscsubjects.com/api/articles/udst-v1-1-appendix-b-compact-benchmark/sources","provenance":"https://miscsubjects.com/api/articles/udst-v1-1-appendix-b-compact-benchmark/provenance","contributions":"https://miscsubjects.com/api/articles/udst-v1-1-appendix-b-compact-benchmark/contributions","graph_topology":"https://miscsubjects.com/api/articles/udst-v1-1-appendix-b-compact-benchmark/graph-topology?question={question}","voxels":"https://miscsubjects.com/api/articles/udst-v1-1-appendix-b-compact-benchmark/voxels","constitution":"https://miscsubjects.com/api/articles/constitution","ontology":"https://miscsubjects.com/api/articles/ontology","system_map":"https://miscsubjects.com/api/articles/system-map","system_map_markdown":"https://miscsubjects.com/api/articles/system-map?format=markdown","health":"https://miscsubjects.com/api/articles/udst-v1-1-appendix-b-compact-benchmark/health","repair":"POST 
https://miscsubjects.com/api/protocol/repair","list_articles":"https://miscsubjects.com/api/articles","graph_canvas":"https://miscsubjects.com/graph.html?slugs=udst-v1-1-appendix-b-compact-benchmark","graph_yield":"https://miscsubjects.com/api/graph?slugs=udst-v1-1-appendix-b-compact-benchmark&layer=yield","obsidian_vault":"https://miscsubjects.com/api/articles/obsidian-vault?slugs=udst-v1-1-appendix-b-compact-benchmark","graph_query":"https://miscsubjects.com/api/v1/query?from=udst-v1-1-appendix-b-compact-benchmark&kind=claim&where=tier=human"},"ask":{"description":"Answer only from topology; creates a question_node with gaps.","api":"POST https://miscsubjects.com/api/protocol/ask","body":{"slug":"{slug}","question":"string"},"imessage":"udst-v1-1-appendix-b-compact-benchmark|your question","router_tag":"[ARTICLE_ASK]udst-v1-1-appendix-b-compact-benchmark|question[/ARTICLE_ASK]","auth":"x-terminal-key header for API; iMessage/WhatsApp via miscsubjects build"},"ingest":{"description":"Parse pasted evidence → source ledger + claims + evidence_ingest node.","api":"POST https://miscsubjects.com/api/protocol/ingest","body":{"slug":"{slug}","evidence":"paste text","question_node_id":"optional qn_..."},"imessage":"ingest udst-v1-1-appendix-b-compact-benchmark|q:{node_id}|paste evidence","router_tag":"[ARTICLE_INGEST]udst-v1-1-appendix-b-compact-benchmark|evidence[/ARTICLE_INGEST]","tiers":["human","preclinical","anecdotal","mechanistic","speculative"]},"claim":{"description":"Prompt-injection style POST — one claim voxel with who_claims + posted_by provenance.","api":"POST https://miscsubjects.com/api/protocol/claim","body":{"slug":"{slug}","text":"one assertion","tier":"human|preclinical|anecdotal|mechanistic|speculative","who_claims":"study author, platform, or model id","source_ids":"optional [s1]"},"imessage":"claim udst-v1-1-appendix-b-compact-benchmark|tier|assertion — who claims 
it?","router_tag":"[ARTICLE_CLAIM]udst-v1-1-appendix-b-compact-benchmark|tier|assertion[/ARTICLE_CLAIM]","slots":["what_it_is","who_claims_what","what_is_known","what_is_unknown","mechanism","limitations","disclaimer"]},"tiers":{"human":0.8,"preclinical":0.5,"anecdotal":0.3,"mechanistic":0.3,"speculative":0.1},"invariants":["Self-explaining — every API JSON has _self; every paste widget has §SELF; root index at /api/articles/system-map","Append-only — revisions preserved at ?rev=n","Source chain verifies integrity, not truth","Answers must cite claim ids and source ids from topology","Not medical advice"],"constitution":{"version":1,"principle":"Articles are voxel graphs of claims — not prose blobs. Every assertion is a claim atom with tier, weight, source_ids, and posted_by provenance.","slots":[{"id":"what_it_is","required":true,"answers":"What is this peptide/stack/condition?"},{"id":"who_claims_what","required":true,"answers":"Who claims what — study authors, platforms, n=?"},{"id":"what_is_known","required":true,"answers":"What is known with tier labels (human/preclinical/anecdotal)"},{"id":"what_is_unknown","required":true,"answers":"What is NOT known — explicit gaps"},{"id":"mechanism","required":false,"answers":"Proposed mechanism (mechanistic tier only)"},{"id":"limitations","required":true,"answers":"Limits of evidence — no dose advice"},{"id":"disclaimer","required":true,"answers":"Not medical advice"}],"claim_rules":["One claim = one falsifiable assertion. No compound claims.","Every claim must declare tier: human|preclinical|anecdotal|mechanistic|speculative|system.","system tier = architecture/design axioms (not biological mechanism). 
Use for protocol self-definition.","Sourced claims must cite source_ids from the hash-chained ledger.","Unsourced claims must set source_status: unsourced and why_material.","posted_by is mandatory on every new claim (model id, human, or channel).","No medical advice, no doses, no 'you should take'.","Bad information is retracted (status:retracted), never deleted — retraction event stays on ledger.","Adversary challenges link via challenges[] / challenged_by[] — target may be downweighted.","Leaked secrets are scrubbed to [REDACTED:secret-leak] with scrub_events tombstone — honest audit trail."],"source_rules":["Every source is a voxel edge: type, url, exact quote, summary, found_by, accessed_at.","Sources hash-chain — prev/hash on append.","Anecdotal sources must name platform (reddit|x|youtube|imessage|user_entry)."],"ontology_rules":["Peptide articles (bpc-157, tb-500) are tree roots.","Condition articles (bpc-157-glp1-gut-damage) branch from peptides.","Stack articles (wolverine-stack-glp1) compose peptides — never duplicate peptide mechanism prose.","If an article has no parent embeds and is not a root peptide → sprawl candidate.","Misstep = duplicate scope with another slug; merge or reparent via embeds."],"post_protocol":{"claim":"POST /api/protocol/claim","source":"POST /api/protocol/sources","ingest":"POST /api/protocol/ingest","webhook":"POST /api/articles/<slug>/webhook {kind:claim|source}","imessage_claim":"claim {slug}|{tier}|your assertion — who claims it, source?","imessage_ingest":"ingest {slug}|evidence 
paste"}},"this_article":{"slug":"udst-v1-1-appendix-b-compact-benchmark","url":"https://miscsubjects.com/a/udst-v1-1-appendix-b-compact-benchmark","bundle_url":"https://miscsubjects.com/api/articles/udst-v1-1-appendix-b-compact-benchmark/bundle?format=markdown"}},"api_urls":{"bundle":"https://miscsubjects.com/api/articles/udst-v1-1-appendix-b-compact-benchmark/bundle","bundle_markdown":"https://miscsubjects.com/api/articles/udst-v1-1-appendix-b-compact-benchmark/bundle?format=markdown","topology":"https://miscsubjects.com/api/articles/udst-v1-1-appendix-b-compact-benchmark/topology","voxels":"https://miscsubjects.com/api/articles/udst-v1-1-appendix-b-compact-benchmark/voxels","constitution":"https://miscsubjects.com/api/articles/constitution","ontology":"https://miscsubjects.com/api/articles/ontology","question_graph":"https://miscsubjects.com/api/articles/udst-v1-1-appendix-b-compact-benchmark/question-graph","ask":"https://miscsubjects.com/api/protocol/ask","ingest":"https://miscsubjects.com/api/protocol/ingest","claim":"https://miscsubjects.com/api/protocol/claim","system_map":"https://miscsubjects.com/api/articles/system-map","system_map_markdown":"https://miscsubjects.com/api/articles/system-map?format=markdown"}}