Finetune Llama 3.1 on Your Infra

An operational guide to finetuning Llama 3.1, with everything packaged in a simple SkyPilot YAML.

Zhanghao Wu, Romil Bhardwaj, Zongheng Yang·Jul 23, 2024·5 min read

AI on Kubernetes Without the Pain

Romil Bhardwaj·Jul 11, 2024·12 min read

SkyPilot 0.6: Managed Jobs API, SkyServe on Kubernetes, Spot + On-demand mixing, Paperspace support

SkyPilot Team·Jun 4, 2024·4 min read

Introducing SkyServe: 50% Cheaper AI Serving on Any Cloud with High Availability

SkyServe: A simple, cost-efficient, multi-region/cloud library for serving GenAI models.

Tian Xia, Zhanghao Wu, Ziming Mao, Zongheng Yang·Feb 20, 2024·10 min read

Scaling Mixtral LLM Serving with High GPU Availability and Cost Efficiency

A tutorial for serving the Mixtral 8x7B model with SkyPilot and SkyServe.

Zhanghao Wu·Dec 21, 2023·8 min read