Scaling Mixtral LLM Serving with High GPU Availability and Cost Efficiency

A tutorial for serving the Mixtral 8x7B model with SkyPilot and SkyServe.

Zhanghao Wu·Dec 21, 2023·8 min read

Scaling AI Robotics on the Cloud

Covariant runs AI on the cloud using SkyPilot, delivering models 4x faster and cost-effectively.

Rocky Duan (CTO, Covariant), Clay Rosenthal (Production Engineer, Covariant), Marco Almeida (TLM of Production Engineering Team, Covariant), Chris Colby (Head of Software and Research, Covariant)·Sep 26, 2023·10 min read

Finetuning Llama 2 in your own cloud environment, privately

An operational guide on finetuning Llama 2, ready for commercial use.

Zhanghao Wu, Wei-Lin Chiang, Zongheng Yang·Aug 2, 2023·12 min read

Serving LLM 24x Faster On the Cloud with vLLM and SkyPilot

SkyPilot makes the deployment and development of vLLM easy and fast on clouds.

Woosuk Kwon, Zhuohan Li, Zhanghao Wu·Jun 29, 2023·5 min read

SkyPilot 0.3: LLM support and unprecedented GPU availability across more clouds

SkyPilot Team·May 30, 2023·6 min read