Synthetic Enterprise Canvas Demo

Evidence-to-canvas review, runs stored, source-traceable

qa-plan-ready-renderer-blocked

Output QA

Manual review plan for future PowerPoint, PDF, and Word handoffs. The demo still rejects binary output claims until renderer, template, credential, and policy gates are complete.

QA lanes

Traceability

blocker

Every claim resolves to source ids, assumptions, or reviewer-approved notes.

Reject output if a factual claim has no source id or assumption marker.
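The traceability rule can be sketched as a small check. This is a minimal illustration, not the demo's implementation: the `Claim` type, its fields, and `untraceable_claims` are hypothetical names, and treating a reviewer-approved note as sufficient is an assumption drawn from the lane description above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Claim:
    """Hypothetical shape for one factual claim in an output."""
    text: str
    source_ids: List[str] = field(default_factory=list)
    is_assumption: bool = False            # claim carries an assumption marker
    reviewer_note: Optional[str] = None    # reviewer-approved note, if any


def untraceable_claims(claims: List[Claim]) -> List[Claim]:
    """Return claims with no source id, assumption marker, or reviewer note."""
    return [
        c for c in claims
        if not (c.source_ids or c.is_assumption or c.reviewer_note)
    ]


claims = [
    Claim("Revenue grew last year.", source_ids=["S1"]),
    Claim("Adoption will continue.", is_assumption=True),
    Claim("The market is saturated."),  # nothing backs this claim
]
# Any non-empty result means the output should be rejected.
failed = untraceable_claims(claims)
```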

Evidence strength

blocker

Low-confidence blocks, weak evidence, and conversion warnings remain visible.

Reject output if confidence/evidence warnings are hidden or softened.

Format fidelity

manual-review

PowerPoint, PDF, and Word preserve the agreed structure, appendix policy, and labels.

Compare every section against the output-template field map before handoff.

Live-output labeling

blocker

Model-produced language is labeled only after a live smoke run is stored and inspected.

Reject any live/model label while `OPENAI_API_KEY` or live-run approval is absent.
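As a rough sketch of this gate, assuming the approval and inspection states are available as booleans (the function name and its parameters are hypothetical; only `OPENAI_API_KEY` comes from the plan itself):

```python
import os
from typing import Mapping, Optional


def may_label_as_model_produced(
    live_run_approved: bool,
    smoke_run_inspected: bool,
    env: Optional[Mapping[str, str]] = None,
) -> bool:
    """Allow a live/model label only when every precondition holds."""
    env = os.environ if env is None else env
    # An unset or empty key counts as absent.
    has_key = bool(env.get("OPENAI_API_KEY"))
    return has_key and live_run_approved and smoke_run_inspected
```

Passing `env` explicitly keeps the check testable without touching the real process environment.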

Governance boundary

blocker

Approval, retention, archive, tenant, and client-facing boundaries remain explicit.

Reject output if it implies production authorization or client-data readiness.

Format-specific checks

PowerPoint

manual-qa-required

Slides include visible source ids, confidence badges, appendix references, and a boundary slide.

PDF

manual-qa-required

Locked package includes source appendix or attachment policy and does not hide draft status.

Word

manual-qa-required

Narrative memo keeps inline citations, assumption markers, and editable reviewer notes.
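The format-specific checks above could be tracked as a simple checklist map during manual QA. The element labels and the `missing_elements` helper below are illustrative assumptions; the review itself stays manual.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical labels for the elements each format is expected to show.
REQUIRED_ELEMENTS: Dict[str, Set[str]] = {
    "PowerPoint": {
        "source ids", "confidence badges",
        "appendix references", "boundary slide",
    },
    "PDF": {"source appendix or attachment policy", "draft status"},
    "Word": {"inline citations", "assumption markers", "reviewer notes"},
}


def missing_elements(output_format: str, found: Iterable[str]) -> List[str]:
    """List required elements the reviewer did not find, sorted by name."""
    return sorted(REQUIRED_ELEMENTS[output_format] - set(found))
```

For example, a reviewer who found only source ids and a boundary slide in a deck would get back the two remaining PowerPoint items to chase.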

Sampling protocol

  1. Review all title/context, governance, approval, and boundary sections.
  2. Sample at least three canvas blocks: one high-confidence, one review-confidence, and one warning-bearing block.
  3. Trace every sampled claim back to the source package and converted artifact metadata.
  4. Compare validation appendix rows against positive, negative, borderline, and known-miss test cases.
  5. Inspect one stored run detail before labeling anything as model-produced.
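The block-sampling step of this protocol can be sketched as follows. This assumes each canvas block is a dictionary with a `category` key holding one of the three labels; that data shape and the `sample_blocks` name are hypothetical.

```python
import random
from typing import Dict, List, Optional

REQUIRED_CATEGORIES = ("high-confidence", "review-confidence", "warning-bearing")


def sample_blocks(
    blocks: List[Dict[str, str]], seed: Optional[int] = None
) -> List[Dict[str, str]]:
    """Pick one block per required category, failing loudly if one is missing."""
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    sample = []
    for category in REQUIRED_CATEGORIES:
        candidates = [b for b in blocks if b["category"] == category]
        if not candidates:
            raise ValueError(f"no {category} block available to sample")
        sample.append(rng.choice(candidates))
    return sample
```

Raising instead of silently skipping a category keeps a thin sample from passing as a complete one.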

Failure actions

Record the failed lane, format, claim or section, source ids involved, and reviewer owner.
Do not send the output externally while any blocker-severity lane is failed.
Convert recurring misses into validation cases or output-template changes before rerunning.
Escalate renderer defects separately from product/content defects.
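A failure record and the external-send gate might look like the sketch below. The `QaFailure` type and `can_send_externally` are hypothetical names; the blocker set mirrors the lanes marked `blocker` in this plan, with Format fidelity left out because it is marked `manual-review`.

```python
from dataclasses import dataclass
from typing import List

# Lanes this plan marks with blocker severity.
BLOCKER_LANES = {
    "Traceability",
    "Evidence strength",
    "Live-output labeling",
    "Governance boundary",
}


@dataclass
class QaFailure:
    """One recorded failure, carrying the fields the plan asks for."""
    lane: str
    output_format: str
    claim_or_section: str
    source_ids: List[str]
    reviewer_owner: str


def can_send_externally(failures: List[QaFailure]) -> bool:
    """Block external handoff while any blocker-severity lane has failed."""
    return not any(f.lane in BLOCKER_LANES for f in failures)
```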

Human actions

Approve CLMBS output templates and brand rules for PowerPoint, PDF, and Word.
Decide who owns manual QA sign-off before a handoff becomes client-facing.
Provision `OPENAI_API_KEY` only after data/access/tenant decisions are accepted, then run one live smoke test.
Define archive, checksum, retention, deletion, and legal-hold rules before production export.