Technical posts on building runtime governance for autonomous agents. Philosophical posts on what we're really building, and why this layer has to exist.
Most agent failures are recoverable. Some are not. The ones that aren't are the ones the field is least equipped to talk about — and they happen earlier in production than anyone is admitting. A taxonomy, and a proposal.
The constraint is not the cost of the system. It is the system. On the difference between alignment as a wrapper and alignment as architecture — and why the second is the only one that holds at runtime.
A short, practical case for keeping the safety classifier outside the model that produced the action. Includes benchmarks.