Your AI ignores you.
Calx fixes that.
Your corrections don’t stick. Your agents make the same mistakes. Your team is building the same internal harness three times over.
Calx is the behavioral control layer for supervised AI agents. We find the recurring corrections in your agent workflows, compile them into runtime enforcement, and give you the evidence back.
Fisher’s exact, p = 0.002 across 6 architectural classes.
Corrections become structural rules.
Runtime enforcement, not another prompt file. A real rule, a real violation, a real intercept.
Never run git push --force without explicit approval.
Never bypass pre-commit hooks (--no-verify).
Always ask before destructive operations.
git push origin feat/parse...
on tool_call(name="bash") {
  if cmd.matches("git push.*--force")
     or cmd.contains("--no-verify") {
    veto(reason="destructive · needs approval")
  }
}
# enforced outside the context window
# scoped to operator: spencerbuilds
[14:02:31] ⟶ agent response: "cannot force-push, will wait for approval"
[14:02:31] ● evidence logged to serve · visible in weekly report
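To make the idea concrete, here is a minimal Python sketch of a harness-level veto for the rule above. It is an illustration only, not Calx's rule format or API: the names `Veto`, `on_tool_call`, and `run_tool` are hypothetical, and the patterns are copied from the hook shown above.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Patterns taken from the hook above; everything else here is a hypothetical sketch.
FORCE_PUSH = re.compile(r"git push.*--force")
NO_VERIFY = "--no-verify"


@dataclass
class Veto:
    reason: str


def on_tool_call(name: str, cmd: str) -> Optional[Veto]:
    """Return a Veto if a bash command breaks the rule, otherwise None."""
    if name != "bash":
        return None
    if FORCE_PUSH.search(cmd) or NO_VERIFY in cmd:
        return Veto(reason="destructive · needs approval")
    return None


def run_tool(name: str, cmd: str) -> str:
    """Gate a tool call in the harness, outside the model's context window."""
    veto = on_tool_call(name, cmd)
    if veto is not None:
        return f"vetoed: {veto.reason}"
    # A real harness would execute the command here; this sketch only reports it.
    return f"allowed: {cmd}"


if __name__ == "__main__":
    print(run_tool("bash", "git push origin main --force"))
    print(run_tool("bash", "git commit -m 'wip' --no-verify"))
    print(run_tool("bash", "git push origin main"))
```

The check runs before the command does, so the rule holds whether or not the agent remembers the prompt text.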
Text rules do not compile into behavior.
Prompts and memory improve what the agent knows.
Not what it does.
Everyone is building on the information plane. There is no compiler between the two planes. Calx fixes that.
The person already in the pain.
Not broad horizontal adoption. The person who buys Calx already knows every correction is paid twice.
Give us one agent workflow. We’ll find what humans keep correcting.
A Correction Audit is the low-friction entry. We inspect your agent’s corrections, identify recurrence clusters, and deliver a behavioral control report that shows what compiles into runtime enforcement and what stays process.
- The top recurring correction classes your team keeps paying for
- Which corrections are architectural and can compile. Which are process and cannot.
- A concrete enforcement plan: which rules become Tether hooks, which become review gates
- A baseline recurrence metric so pilot impact is measurable before we start
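One way to picture a baseline recurrence metric is sketched below in Python. This is an assumption for illustration, not the audit's actual method: the function name `recurrence_report`, the class labels, and the definition of the rate (repeat corrections divided by all corrections) are all hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def recurrence_report(correction_classes: List[str]) -> Tuple[List[Tuple[str, int]], float]:
    """Group corrections by class and measure how often they repeat.

    Returns the classes seen more than once (most frequent first) and a
    recurrence rate: repeats / total, where each occurrence of a class
    after its first counts as one repeat.
    """
    counts = Counter(correction_classes)
    recurring = [(cls, n) for cls, n in counts.most_common() if n > 1]
    total = len(correction_classes)
    repeats = sum(n - 1 for _, n in recurring)
    rate = repeats / total if total else 0.0
    return recurring, rate


if __name__ == "__main__":
    # Made-up sample labels, for illustration only.
    log = ["force-push", "skip-hooks", "force-push",
           "wrong-branch", "force-push", "skip-hooks"]
    recurring, rate = recurrence_report(log)
    print(recurring)
    print(rate)
```

Measuring the same rate before and after a pilot gives a simple before/after comparison, assuming corrections are labeled consistently in both periods.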
One compiler. Three product layers.
Tether enforces. Bench captures. Serve compiles. Model-agnostic via LiteLLM.
- Tool veto · Block tool calls at the harness level
- Response review · Check output against rules before delivery
- Injection defense · Enforced outside the context window
Not a chat app. Not a workspace. The surface where correction becomes visible, fast, and enforceable.
See Bench →
The brain of the system. Proprietary pipeline. The moat compounds per operator, per session.
Read the paper →
Three papers. 100+ sources. Built with itself.
Calx is an applied science company that ships software. The product is the research made durable.
Behavioral infrastructure
for the supervised-agent era.
Runtime governance is forming as a category. Memory remembers. Observability records. Guardrails block. Calx is the adaptive correction loop inside the harness that turns human judgment into runtime behavior. The specialists get there first.