AI coding agents are quickly becoming a standard way for engineers to build and maintain workflows. A crucial part of most AI workloads is spinning up and managing remote GPU instances to run training and inference quickly and efficiently.

Launching remote GPU workloads used to mean writing long Kubernetes manifests or Slurm job scripts and submitting them by hand: repetitive, time-consuming maintenance work. You had to know the right resource fields, figure out the cheapest cloud, handle credentials safely, and remember to set autostop. For routine tasks like spinning up a dev cluster or kicking off a fine-tune, that overhead adds up.

With the SkyPilot Agent Skill, your AI coding agent can now launch and run GPU workloads efficiently and securely. Your agent can launch clusters, run training jobs, and manage cloud resources across any infrastructure using plain natural-language instructions.

This blog walks you through enabling your coding agent to launch GPU workloads. The examples use Claude Code, but the same skill works with any agent that supports the open Agent Skills standard, including Codex, GitHub Copilot, Cursor, and others.

Install the SkyPilot Agent Skill

Open a Claude Code session and run:

/plugin marketplace add skypilot-org/skypilot
/plugin install skypilot@skypilot

This installs the SkyPilot plugin, which contains everything Claude Code needs to launch, manage, and spin down remote cloud infrastructure. You may need to restart Claude Code to reload the skill.

Alternatively, or if you’re not using Claude Code, you can instruct your agent to load the SkyPilot Agent Skill directly from the install instructions in the SkyPilot repository:

Fetch and follow https://github.com/skypilot-org/skypilot/blob/HEAD/agent/INSTALL.md to install the skypilot skill

Your agent now knows how to launch GPU workloads.

Instruct Your Agent to Set Up Credentials

You can now instruct your agent to check your available cloud infrastructure:

Check my available infra

The agent will install the SkyPilot CLI if needed and report which cloud platforms are ready for use. For clouds that still need to be configured, it will walk you through setting up credentials. Select the clouds you want to use and follow the prompts to enable access.
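Under the hood, the agent typically shells out to the SkyPilot CLI. If you want to verify your setup manually, the equivalent commands are roughly the following (a sketch; the agent runs these for you, and exact output varies by SkyPilot version):

```shell
# Install the SkyPilot CLI (the agent does this automatically if it's missing).
pip install -U skypilot

# Verify which clouds have working credentials.
sky check
```

`sky check` prints each configured cloud and whether SkyPilot can authenticate against it, which is the same information the agent summarizes back to you.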


Your clouds are now recognized and correctly configured to launch your workloads.

Launch a GPU Cluster with One Sentence

This is where it gets fun. Instead of writing a YAML from scratch, just tell the agent what infrastructure you need to launch:

Launch a GPU cluster

The agent will:

  1. Generate the appropriate SkyPilot task YAML.
  2. Provision the resources on the indicated cloud provider.
  3. Launch the cluster with autostop set.

You don’t need to manually create the config file. Your agent will transparently walk you through the steps to launch your required resources.
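For reference, the task YAML the agent generates might look something like this minimal sketch (the GPU type, cluster name, and commands here are illustrative assumptions, not the agent's exact output):

```yaml
# dev.yaml — illustrative SkyPilot task; adjust the accelerator to your needs.
resources:
  accelerators: A100:1   # e.g. "H100:8" for a multi-GPU node

setup: |
  pip install torch

run: |
  python -c "import torch; print(torch.cuda.is_available())"
```

The agent would then launch it with something like `sky launch -c dev dev.yaml -i 30`, where `-i 30` sets the cluster to autostop after 30 idle minutes.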

Check GPUs Across K8s Clusters

Your agent can give you a unified view of what GPU resources are available right now across multiple Kubernetes clusters, whether on-prem or in the cloud:

What GPUs are available across my k8s clusters?

The agent queries all clusters registered with SkyPilot and returns a consolidated summary: which clusters have capacity, what GPU types are available, and how many are free. No jumping between dashboards or kubectl contexts.
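If you'd rather run the query yourself, the agent's consolidated view maps onto SkyPilot's own CLI (a sketch; flag support may vary by SkyPilot version):

```shell
# Show GPU availability across configured infrastructure,
# including registered Kubernetes clusters.
sky show-gpus

# Confirm which clouds and Kubernetes contexts SkyPilot can reach.
sky check
```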

Workloads Your Agent Can Launch

Your agent can also run many other kinds of complex workloads for you, such as:

  1. Configure interactive dev environments: “Launch a cluster with H100 GPU, connect my VS Code to it, and set it to auto-stop after 30 min idle.”

  2. Fine-tune models: “Fine-tune Llama 3.1 8B on my dataset at s3://my-data/train.jsonl. Use spot instances and recover from preemptions.”

  3. Configure multi-cloud failover: “Submit training jobs that try our Slurm cluster first and fall back to AWS if it’s full.”

  4. Compare GPU pricing: “What’s the cheapest 8x H200 across AWS, GCP, Lambda, and CoreWeave?”
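As a concrete sketch of the fine-tuning example above, the agent might generate a managed-job YAML along these lines (the training script, GPU count, and dependencies are illustrative assumptions; only the S3 path comes from the prompt):

```yaml
# finetune.yaml — illustrative spot fine-tuning task.
resources:
  accelerators: A100:8
  use_spot: true          # spot instances; managed jobs recover from preemptions

file_mounts:
  /data/train.jsonl: s3://my-data/train.jsonl

setup: |
  pip install transformers datasets

run: |
  python finetune.py --data /data/train.jsonl   # finetune.py is hypothetical
```

Launched as a managed job (e.g. `sky jobs launch finetune.yaml`), SkyPilot automatically re-provisions and resumes the job if the spot instance is preempted.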

Read the docs for more examples.

Spin Down Your Cluster

When you’re done, ask the agent to tear down your cluster:

Shut down my cluster.

If you have multiple clusters running, the agent will list them and confirm which one to stop before taking any action. You can also ask it to stop all clusters at once, or check the status of running infrastructure before deciding:

What clusters do I have running right now?

Note that if you set autostop at launch time, the cluster will shut down automatically after the idle period.
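The agent's teardown actions correspond to ordinary SkyPilot commands, if you ever want to run them yourself (the cluster name `dev` is an assumption):

```shell
# List all clusters, their status, and autostop settings.
sky status

# Stop a cluster (keeps its disk so it can be restarted later).
sky stop dev

# Terminate a cluster and release all of its resources.
sky down dev
```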

Get Started Now

Install the SkyPilot Agent Skill, bootstrap your cloud credentials, and start launching GPU workloads with natural language instructions.

Check out the Agent Skill documentation for the full list of supported capabilities, and share what you’re building in the SkyPilot Slack — we’d love to hear from you!