From 1 hour to 10 minutes: How I sped up my distributed LLM training without changing the code or GPUs

Henry Zhu·Sep 11, 2025·8 min read

Scaling AI Infrastructure at Abridge with SkyPilot

How we transformed our fragmented multi-cloud AI infrastructure into a unified system with SkyPilot, achieving 10x faster development cycles.

Sisil Mehta (ML Platform Lead, Abridge)·Sep 4, 2025·7 min read

From SLURM to SkyPilot: How Avataar cut costs 11x with multi-cloud AI infrastructure

Avataar's enterprise AI content platform cut costs 11x and unlocked GPU capacity by migrating from an inflexible SLURM deployment to SkyPilot's multi-cloud infrastructure.

Shubham Jain (AI & Engineering Lead, Avataar)·Aug 20, 2025·7 min read

Self-host open-source LLM agent sandbox on your own cloud

Alex Kim·Aug 12, 2025·9 min read

Slurm vs K8s for AI Infra: Academic HPC vs Cloud-Native Reality - the non-ideal solutions

Alex Kim·Jul 30, 2025·8 min read