Selected work
Orion-OS
A self-correcting runtime for autonomous code generation — it writes, tests, and repairs its own output.
Problem
LLM code generation fails silently. Plausible-looking code, no tests, no recovery path, no way to know what happened.
Approach
A deterministic pipeline wrapped around the model — coherence gate, real pytest execution, a failure taxonomy, and a self-repair loop — with every run traced and queryable.
Result
- average cost per generated-and-tested feature
- ~$0.001
- self-repair retries to recover a failure, on a fixed budget
- ≤3
- unattended on a VPS, with a nightly benchmark
- 24/7
Stack
- Python
- FastAPI
- Postgres
- DeepSeek