https://skypilot.readthedocs.io/en/latest/serving/sky-serve.html

Introducing SkyServe: 50% Cheaper AI Serving on Any Cloud with High Availability

SkyServe: A simple, cost-efficient, multi-region/cloud library for serving GenAI models.

Tian Xia, Zhanghao Wu, Ziming Mao, Zongheng Yang·Feb 20, 2024·10 min read

Scaling Mixtral LLM Serving with High GPU Availability and Cost Efficiency

A tutorial for serving the Mixtral 8x7B model with SkyPilot and SkyServe.

Zhanghao Wu·Dec 21, 2023·8 min read

Scaling AI Robotics on the Cloud

Covariant runs AI on the cloud using SkyPilot, delivering models 4x faster and cost-effectively.

Rocky Duan (CTO, Covariant), Clay Rosenthal (Production Engineer, Covariant), Marco Almeida (TLM of Production Engineering Team, Covariant), Chris Colby (Head of Software and Research, Covariant)·Sep 26, 2023·10 min read

Finetuning Llama 2 in your own cloud environment, privately

An operational guide on finetuning Llama 2, ready for commercial use.

Zhanghao Wu, Wei-Lin Chiang, Zongheng Yang·Aug 2, 2023·12 min read

Serving LLM 24x Faster On the Cloud with vLLM and SkyPilot

SkyPilot makes deploying and developing vLLM on the cloud easy and fast.

Woosuk Kwon, Zhuohan Li, Zhanghao Wu·Jun 29, 2023·5 min read