A Typed Prompt Compiler: Catching Hallucination Bugs at Build Time
We built a TypeScript-flavored DSL for prompts that statically verifies retrieved-context shape against expected JSON output. Bugs that used to ship to prod now fail CI.
Most "prompt engineering" today is string concatenation with extra steps. We wanted a type system.
PromptC is a small DSL that compiles to provider-specific calls. You declare the shape of your retrieval context and the shape of your expected output. The compiler verifies that the prompt body references only fields that exist in the declared context, and that the output schema is a structural subset of what the model can plausibly return given the system instructions.
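To make the two checks concrete, here is a minimal sketch in plain TypeScript. It is not PromptC's syntax or implementation, neither of which is shown in this post. The `{{path}}` placeholder style, the `FieldSchema` type, and the helper names `placeholders`, `hasPath`, `unknownFields`, and `isStructuralSubset` are all assumptions made for illustration. The "what the model can plausibly return" side is stood in for by an explicit `available` schema.

```typescript
// Illustrative sketch only: every name here is hypothetical, not PromptC's API.
type FieldSchema = { [field: string]: "string" | "number" | "boolean" | FieldSchema };

// Collect dotted paths referenced as {{path.to.field}} in a prompt body.
function placeholders(body: string): string[] {
  const found: string[] = [];
  const pattern = /\{\{\s*([A-Za-z_][\w.]*)\s*\}\}/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(body)) !== null) {
    found.push(match[1]);
  }
  return found;
}

// True if the dotted path resolves to a declared field in the schema.
function hasPath(schema: FieldSchema, path: string): boolean {
  let node: FieldSchema | string = schema;
  for (const key of path.split(".")) {
    if (typeof node === "string") return false; // cannot descend into a leaf
    if (!Object.prototype.hasOwnProperty.call(node, key)) return false;
    node = node[key];
  }
  return true;
}

// Check 1: placeholders in the prompt body that the context does not declare.
function unknownFields(body: string, context: FieldSchema): string[] {
  return placeholders(body).filter((path) => !hasPath(context, path));
}

// Check 2: every field in `expected` exists in `available` with the same shape.
function isStructuralSubset(expected: FieldSchema, available: FieldSchema): boolean {
  return Object.keys(expected).every((key) => {
    if (!Object.prototype.hasOwnProperty.call(available, key)) return false;
    const want = expected[key];
    const have = available[key];
    if (typeof want === "string" || typeof have === "string") return want === have;
    return isStructuralSubset(want, have);
  });
}

// Example: `user.tier` is referenced by the prompt but never declared.
const context: FieldSchema = { user: { name: "string", plan: "string" }, docs: "string" };
const body = "Answer for {{user.name}} on the {{user.tier}} plan using {{docs}}.";
const missing = unknownFields(body, context); // ["user.tier"]
```

A build step could fail whenever `missing` is non-empty or `isStructuralSubset` returns false. A real compiler would more likely run these checks on a parsed prompt AST than on a regular expression over the raw string.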
In our codebase, this caught 23 latent bugs in the first week — prompts that worked 95% of the time but failed silently on edge-case inputs.
Production deployment
We deployed this gradually behind an internal feature flag, mirroring 1% of traffic for 72 hours before promoting. The instrumentation surface is shipped as part of our open-source edge-trace crate.
// Pseudocode — the actual wiring lives in the repo
const router = createRouter({
  classify: classifier.predict,
  speculate: speculator.draft,
  verify: verifier.confirm,
  windowSize: 8,
});
export default router.handle;
What we got wrong
Our first iteration over-trusted the speculator on long sequences. The fix was a sliding acceptance threshold that decays with prefix length — obvious in hindsight, not obvious during the on-call that surfaced it.
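One way to read "a sliding acceptance threshold that decays with prefix length" is as a tolerance for disagreement between speculator and verifier that shrinks as the prefix grows, so long sequences get stricter checks. The sketch below assumes that reading. The exponential half-life form, the floor, the score-difference test, and all names and constants are hypothetical, since the post does not give the real ones.

```typescript
// Illustrative sketch only: names, constants, and the decay shape are assumptions.
interface DecayConfig {
  base: number;     // tolerance at prefix length 0
  halfLife: number; // prefix length at which the tolerance halves
  floor: number;    // the tolerance never drops below this value
}

// Tolerance shrinks exponentially with prefix length, clamped at the floor.
function tolerance(prefixLength: number, cfg: DecayConfig): number {
  const decayed = cfg.base * Math.pow(0.5, prefixLength / cfg.halfLife);
  return Math.max(cfg.floor, decayed);
}

// Accept a drafted step only if the speculator and verifier scores agree
// to within the tolerance allowed at this prefix length.
function acceptDraft(
  draftScore: number,
  verifierScore: number,
  prefixLength: number,
  cfg: DecayConfig,
): boolean {
  return Math.abs(draftScore - verifierScore) <= tolerance(prefixLength, cfg);
}

const cfg: DecayConfig = { base: 0.2, halfLife: 64, floor: 0.02 };

// The same 0.1 disagreement is accepted early and rejected deep into a sequence.
const acceptedEarly = acceptDraft(0.9, 0.8, 0, cfg);  // true
const acceptedLate = acceptDraft(0.9, 0.8, 256, cfg); // false
```

The floor keeps the check from becoming impossible to pass on very long prefixes. Whether a half-life or a linear schedule fits better depends on how disagreement actually grows with length in your traffic.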
The bottleneck is rarely where you think it is. Measure first; optimize the thing that actually moves the bill.