We are excited to bring you SkyPilot 0.10! This is our largest release by far, introducing enterprise-ready features including Single Sign-On (SSO), a feature-rich dashboard, external PostgreSQL support, workspace isolation, and SSH Node Pools. Whether you’re a startup scaling AI workloads or an enterprise managing complex multi-cloud infrastructure, SkyPilot 0.10 delivers the production-grade capabilities you need.
Get it now:
pip install -U skypilot
Enterprise authentication with SSO
SkyPilot 0.10 integrates seamlessly with enterprise SSO providers like Okta and Google Workspace, enabling secure authentication with automatic account creation and access control.
Log in to your organization’s SkyPilot deployment:
$ sky api login -e https://skypilot.example.com
A web browser has been opened to http://skypilot.example.com/token. Please continue the login in the web browser.
To manually copy the token, press ctrl+c.
Logged into SkyPilot API server at: http://skypilot.example.com
└── Dashboard: http://skypilot.example.com/dashboard
Users authenticate via their familiar SSO flow, and their identities are automatically tracked across all SkyPilot resources. Service accounts enable programmatic access for CI/CD pipelines and automated workflows.
Feature-rich dashboard
The new SkyPilot dashboard provides comprehensive infrastructure management in a single interface. Monitor all your cloud resources, edit configurations, manage users, view GPU metrics, and track job history with detailed YAML and git commit information.
Key dashboard features include:
- Infrastructure overview: see all available infrastructure on one page
- User management: add, remove, and manage organization users
- Real-time metrics: monitor GPU utilization and API server performance
- Job tracking: view YAML configurations, entrypoints, and git commit hashes
- Real-time logs: stream logs directly from managed jobs
Workspaces for team isolation
Workspaces provide declarative team isolation with custom cloud configurations. Define separate environments for different teams or projects, controlling which infrastructure each team can access.
Configure workspaces in your API server deployment:
# API server config
workspaces:
  research-private:
    private: true
    allowed_users:
      - [email protected]
      - [email protected]
    gcp:
      project_id: skypilot-research-private
    aws:
      disabled: true
  ml-team:
    gcp:
      project_id: skypilot-ml-team-prod
Teams simply set their active workspace to use their team’s configuration:
# In team's .sky.yaml
active_workspace: ml-team
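To make the isolation rules in the workspace config above concrete, here is a minimal Python sketch of how such a config could be interpreted. It is not SkyPilot's implementation: the rule that a private workspace admits only its listed users is an assumption read off the example, the helper names `can_use_workspace` and `disabled_clouds` are hypothetical, and the email addresses are placeholders.

```python
# Hypothetical model of the workspace config shown above; not SkyPilot's code.
WORKSPACES = {
    "research-private": {
        "private": True,
        # Placeholder addresses, used only for illustration.
        "allowed_users": ["alice@example.com", "bob@example.com"],
        "gcp": {"project_id": "skypilot-research-private"},
        "aws": {"disabled": True},
    },
    "ml-team": {
        "gcp": {"project_id": "skypilot-ml-team-prod"},
    },
}


def can_use_workspace(workspaces: dict, name: str, user: str) -> bool:
    """Assumed rule: a private workspace admits only its listed users."""
    ws = workspaces.get(name)
    if ws is None:
        return False  # unknown workspace
    if ws.get("private", False):
        return user in ws.get("allowed_users", [])
    return True  # non-private workspaces are open to every user


def disabled_clouds(workspaces: dict, name: str) -> list[str]:
    """Return the clouds explicitly marked `disabled: true` in a workspace."""
    ws = workspaces[name]
    return sorted(
        key for key, value in ws.items()
        if isinstance(value, dict) and value.get("disabled") is True
    )
```

Under these assumptions, `can_use_workspace(WORKSPACES, "research-private", "carol@example.com")` is false, while any user can select `ml-team`.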
External PostgreSQL for production
SkyPilot 0.10 supports external PostgreSQL databases for production API server deployments, enabling high availability and disaster recovery. Use managed database services like AWS RDS or Cloud SQL to ensure your cluster and job state survives API server restarts.
# API server config
db: postgresql://myusername:mypassword@hostname:5432/database
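A malformed connection string is an easy mistake to make here, so it can help to sanity-check the URI before putting it in the config. The sketch below uses only Python's standard library; `check_postgres_uri` is a hypothetical helper, and it checks only the shape of the URI, not whether the database is reachable.

```python
from urllib.parse import urlsplit


def check_postgres_uri(uri: str) -> dict:
    """Split a PostgreSQL URI and fail early on obviously missing parts."""
    parts = urlsplit(uri)
    if parts.scheme not in ("postgresql", "postgres"):
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("missing hostname")
    database = parts.path.lstrip("/")
    if not database:
        raise ValueError("missing database name")
    return {
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port or 5432,  # 5432 is PostgreSQL's default port
        "database": database,
        "has_password": parts.password is not None,  # never echo the secret
    }


print(check_postgres_uri(
    "postgresql://myusername:mypassword@hostname:5432/database"))
```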
SSH Node Pools: bring your own machines
Turn your existing infrastructure—on-premises servers, cloud reserved instances, or personal workstations—into SkyPilot-managed resources with SSH Node Pools.
Configure your machines in ~/.sky/ssh_node_pools.yaml:
# ~/.sky/ssh_node_pools.yaml
my-datacenter:
  hosts:
    - 10.0.1.100
    - 10.0.1.101
    - 10.0.1.102
Deploy SkyPilot workloads on them:
$ sky ssh up
$ sky launch --infra ssh/my-datacenter -- python train.py
Your machines now appear as infrastructure choices alongside cloud providers, complete with GPU availability tracking and resource management.
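If your host list lives in an inventory somewhere else, the pool file can be generated rather than written by hand. Here is a minimal sketch that renders the same pool-name and `hosts` layout as the example above. `render_node_pool` is a hypothetical helper, and it assumes hosts are given as IP addresses; any other fields the file may accept are out of scope.

```python
import ipaddress


def render_node_pool(pool_name: str, hosts: list[str]) -> str:
    """Render one pool in the same shape as the ssh_node_pools.yaml example."""
    if not hosts:
        raise ValueError("a pool needs at least one host")
    for host in hosts:
        ipaddress.ip_address(host)  # raises ValueError on a malformed IP
    lines = [f"{pool_name}:", "  hosts:"]
    lines += [f"    - {host}" for host in hosts]
    return "\n".join(lines) + "\n"


print(render_node_pool("my-datacenter",
                       ["10.0.1.100", "10.0.1.101", "10.0.1.102"]))
```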
Streamlined infrastructure selection
The new --infra flag simplifies infrastructure specification, replacing the separate --cloud/--region/--zone flags with a unified interface:
# New unified approach
sky launch --infra aws/us-west-2/us-west-2a task.yaml
sky launch --infra k8s/my-k8s-context task.yaml
sky launch --infra ssh/my-ssh-pool task.yaml
# Old approach (deprecated)
sky launch --cloud aws --region us-west-2 --zone us-west-2a task.yaml
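When migrating scripts that still pass the old flags, the mapping to the new string is mechanical. The sketch below shows that mapping in Python. It is not SkyPilot's own parser: `to_infra` and `split_infra` are hypothetical helpers, and treating everything after `k8s/` or `ssh/` as a single name is an assumption based on the examples above.

```python
from typing import Optional


def to_infra(cloud: str, region: Optional[str] = None,
             zone: Optional[str] = None) -> str:
    """Join the old --cloud/--region/--zone values into one --infra string."""
    if zone and not region:
        raise ValueError("a zone needs a region")
    return "/".join(part for part in (cloud, region, zone) if part)


def split_infra(infra: str) -> dict:
    """Split an --infra string into its parts (assumed layout, see lead-in)."""
    head, _, rest = infra.partition("/")
    if head in ("k8s", "ssh"):
        # Assumption: k8s takes a context name, ssh takes a pool name.
        key = "context" if head == "k8s" else "pool"
        return {"kind": head, key: rest or None}
    region, _, zone = rest.partition("/")
    return {"kind": head, "region": region or None, "zone": zone or None}


print(to_infra("aws", "us-west-2", "us-west-2a"))
print(split_infra("k8s/my-k8s-context"))
```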
Enhanced Kubernetes support
SkyPilot 0.10 brings enterprise-grade Kubernetes capabilities, including:
- Multi-cluster management: manage multiple Kubernetes clusters simultaneously
- Kueue integration: native support for Kubernetes job queueing
- PVC volumes: persistent storage for stateful workloads
- Exec-based auth: robust authentication with improved kubeconfig handling
Check GPU availability and utilization across your Kubernetes clusters:
$ sky show-gpus --infra k8s
Kubernetes GPUs
GPU UTILIZATION
H200 24 of 24 free
H100 24 of 24 free
Context: nebius-cluster
GPU REQUESTABLE_QTY_PER_NODE UTILIZATION
H100 1, 2, 4, 8 24 of 24 free
Context: lambda-cluster
GPU REQUESTABLE_QTY_PER_NODE UTILIZATION
H200 1, 2, 4, 8 24 of 24 free
Kubernetes per-node GPU availability
CONTEXT NODE GPU UTILIZATION
nebius-cluster <node_id-...> H100 8 of 8 free
nebius-cluster <node_id-...> H100 8 of 8 free
nebius-cluster <node_id-...> H100 8 of 8 free
lambda-cluster <node_id-...> H200 8 of 8 free
lambda-cluster <node_id-...> H200 8 of 8 free
lambda-cluster <node_id-...> H200 8 of 8 free
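If you want to feed these numbers into your own tooling, the per-node rows are regular enough to aggregate with a short script. The sketch below assumes each row has the form "CONTEXT NODE GPU N of M free", as in the sample output above. The CLI's text format is not a stable interface, so treat this as illustrative only; `free_gpus_by_context` is a hypothetical helper.

```python
import re

# One per-node row: context, node, GPU type, then "<free> of <total> free".
ROW_RE = re.compile(
    r"^(?P<context>\S+)\s+(?P<node>\S+)\s+(?P<gpu>\S+)\s+"
    r"(?P<free>\d+) of (?P<total>\d+) free$"
)


def free_gpus_by_context(per_node_lines: list[str]) -> dict:
    """Sum (free, total) GPU counts per (context, GPU type)."""
    totals = {}
    for line in per_node_lines:
        match = ROW_RE.match(line.strip())
        if match is None:
            continue  # skip headers and rows in an unexpected shape
        key = (match["context"], match["gpu"])
        free, total = totals.get(key, (0, 0))
        totals[key] = (free + int(match["free"]), total + int(match["total"]))
    return totals


rows = [
    "CONTEXT NODE GPU UTILIZATION",
    "nebius-cluster node-a H100 8 of 8 free",
    "nebius-cluster node-b H100 8 of 8 free",
    "lambda-cluster node-c H200 8 of 8 free",
]
print(free_gpus_by_context(rows))
```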
Automatic high-performance networking
Configure high-performance networking automatically with a single line:
resources:
  accelerators: H100:8
  network_tier: best  # Automatically configures optimal networking
This is supported on Nebius VMs and Nebius Managed Kubernetes Service, GCP VMs, and Google Kubernetes Engine, with GPUDirect-TCPX and RDMA support.
New cloud providers
SkyPilot 0.10 expands cloud support with Hyperbolic for cost-effective AI workloads and Samsung Cloud Platform (SCP) with an enhanced provisioner interface.
Enhanced managed jobs
Managed jobs gain high-availability job controllers with failure recovery, automatic restart capabilities, and a job consolidation mode for improved efficiency. Centralized job management across multiple users reduces overhead and improves resource utilization.
Get started today
SkyPilot 0.10 transforms how organizations deploy and manage AI infrastructure across clouds. From individual researchers to enterprise teams, this release provides the production-grade features needed to scale AI workloads reliably.
Install SkyPilot 0.10 (uses the local API server by default):
pip install -U skypilot
Or upgrade your existing API server deployment gracefully:
- Clients automatically wait for the upgrade to finish and retry
- Future compatibility across minor/major versions
NAMESPACE=skypilot
RELEASE_NAME=skypilot
VERSION=0.10.0
helm repo update skypilot
helm upgrade -n $NAMESPACE $RELEASE_NAME skypilot/skypilot \
  --set apiService.image=berkeleyskypilot/skypilot:$VERSION \
  --version $VERSION --devel --reuse-values
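If you drive upgrades from a deployment script rather than a shell, the same command can be assembled as an argument list. This sketch mirrors the flags shown above and only prints the command; `helm_upgrade_argv` is a hypothetical helper, and actually running the command (for example with `subprocess.run`) is left to you.

```python
import shlex


def helm_upgrade_argv(namespace: str, release: str, version: str) -> list[str]:
    """Build the helm upgrade command above as an argument list."""
    return [
        "helm", "upgrade", "-n", namespace, release, "skypilot/skypilot",
        "--set", f"apiService.image=berkeleyskypilot/skypilot:{version}",
        "--version", version, "--devel", "--reuse-values",
    ]


# Print the command for review instead of executing it.
print(shlex.join(helm_upgrade_argv("skypilot", "skypilot", "0.10.0")))
```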
Check out the full release notes, upgrade guide, and enterprise documentation to get started.
To receive the latest updates, please star and watch the project’s GitHub repo, follow @skypilot_org, or join the SkyPilot community Slack.