Democratizing Frontier LLMs - Cloud-cluster-scale intelligence running on any desktop PC
Explore the engineering breakthroughs behind running state-of-the-art, ultra-large Mixture-of-Experts (MoE) models (100B to 600B+ parameters) entirely offline on consumer-grade desktop hardware. This session dives into a novel storage-centric inference engine, developed by Monadd-AI and a research partner, that combines dynamic expert offloading, hardware-topology-aware scheduling, and prefetching algorithms to run frontier LLMs directly on PCs. Learn how we run cloud-scale AI models without cloud-scale resources, on just a desktop PC, and see a live demonstration of a massive MoE model running locally.