Software Engineering · 15:40–15:58 · Cinema 2

Stop vibing your agents to production: applying ML discipline to agent development

Justin Barias

Lead AI Engineer · Australian Government

When I joined my current team, I found a familiar pattern: 6-8 experiments over a year, each taking 10-12 weeks, with 60-70% of that time burned on infrastructure; one thing in production, held together with duct tape; and an entire agent lifecycle dependent on whatever our cloud provider made available in our region. The fix wasn't a new framework. It was an old playbook: ML engineering. Version artifacts like model checkpoints, define evaluators like loss functions, search hyperparameters systematically, and decouple your tooling from your cloud provider. The first experiment under this approach finished in 4 weeks, and other teams across the organisation started running their own experiments without us. In this talk, I'll walk through the methodology and its key trade-offs, and demo HoloDeck, the open-source distillation of everything I learned.