You’re an AI startup founder. You’ve just raised a round. You’re now racing to train models or build AI apps. When it comes to infrastructure, there’s good news and bad news.

The good news is that if you play your cards right, you can get up to $1M in credits from 3+ cloud providers. The bad news is that using those credits efficiently across multiple clouds – while maximizing GPU utilization and managing compute resources – can be quite a heavy infra burden.

In this guide, we’ll try to make your life easier by walking you through:

  • How to get $1M worth of cloud credits for AI startups;
  • Lessons for getting the GPU quotas you’ll need to actually use the credits;
  • Tips for easily and economically spending those credits while keeping your team focused on AI, not infra.

Getting ~$1M Cloud Credits

As of this writing, here’s a breakdown of the credits you can expect from the big three clouds:

Cloud               | Program Name                      | Credits        | Expiry
--------------------|-----------------------------------|----------------|------------------------------
Google Cloud        | Google for Startups Cloud Program | $250K + $100K  | 1 year, then 1 year as rebate
Amazon Web Services | AWS Activate                      | $100K or $200K | 1 or 3 years
Amazon Web Services | AWS GenAI Accelerator             | Up to $1M      | Unspecified
Microsoft Azure     | Microsoft for Startups            | $150K          | 1 year

Note that:

  • Some clouds give out credits in tranches. You have to spend the first tranche before unlocking the next.
  • Some credits may be in the form of rebates (e.g., you spend N dollars and get 20% off the bill, up to a maximum rebate amount).
  • Credits come with an expiration date.

Our tips, from many convos with AI founders:

  • Negotiate: The information above is from clouds’ public websites. Founders we’ve spoken with have been able to get clouds to compete with each other for longer credit durations and/or even more credits. Don’t be afraid to leverage your options.
  • Ladder your credit usage: Don’t start using all your credits across clouds at once—ladder your usage (exhaust one cloud’s credits before starting on the next)! Note that this also introduces the challenge of cloud migrations, which we’ll tackle below.

Even with these caveats, if you genuinely need to consume that much cloud/GPU compute, this represents up to ~$1M of solid savings from your own bank account.

Google Cloud Platform (GCP)

Through the Google for Startups Cloud Program, AI companies can receive up to $250K in credits with a 1-year expiry, and up to $100K credits as 20% rebates for the year after.

Application Steps:

  1. (Optional; can be helpful) Talk to your incubator, accelerator, or VC that is part of Google’s partner network to connect with Google Cloud representatives.
  2. Complete the application on the Google for Startups page.
  3. Contact the Google representative and send your Google Cloud account details for approval.
  4. You may ask the representative for additional quotas of different resource types, including GPUs and TPUs.

Amazon Web Services (AWS)

AWS offers up to ~$1M credits to startups. There are two programs you can enroll in: AWS Activate with $100K or $200K credits (1-year or 3-year expiration) and AWS Generative AI Accelerator with up to $1M credits.

Application Steps for AWS Activate:

  1. You can start with the founders tier for the initial $1K credit, but if you have a VC, contact your VC to get an organization ID for the portfolio tier, which can unlock >$100K credits.
  2. Register an AWS Builder ID and an AWS account — these are two different accounts.
  3. Apply for AWS Activate with the portfolio tier using the organization ID (note that different organization IDs may represent different amounts of credits).
  4. Once approved, your credits will be applied directly to your AWS account, and you can start using the $100K credits from AWS Activate.

The AWS Generative AI Accelerator program offers up to $1M in credits, but has a specific application window. Consider using your AWS Activate credits first before applying to this program.

Microsoft Azure

Microsoft for Startups offers up to $150,000 in Azure credits. Apply either directly or via a nomination from a Microsoft partner (check whether your VC is one).

Application Steps:

  1. Visit Microsoft for Startups and sign up for the program with your LinkedIn account.
  2. Fill in your company’s information and select your VC when prompted for a Microsoft partner.
  3. Upon approval, you can access the Microsoft for Startups Founders Hub and activate your credits on Azure.
  4. Azure credits can come in tranches. You can get more credits once you provide more information about your company and have used more than 50% of the credits already granted.

Other Clouds

There are a few other cloud credit programs available for your startup as well.

Besides credits, there are a bunch of AI clouds that offer 3-4x cheaper GPUs. For example, 8x H100 GPUs cost $88/hr on Google, $98/hr on AWS, and $24/hr on Lambda Cloud (with better capacity). Some common GPU clouds include (a non-exhaustive list): Lambda Cloud, RunPod, Paperspace, Fluidstack, and CUDO.

Getting Quotas

After getting credits approved, the next step is to request quotas for the desired hardware and locations.

With some battle scars, here are several important tips for requesting quota increases.

Request quotas in many regions: While certain regions in different clouds are known to have higher-than-average capacity (e.g., AWS us-east-1; GCP us-central1), you should always try to request quotas in different regions to maximize the total quota you can use. Don’t worry about the complexity of using many regions; we will tackle that in the section below.

Always reach out to your sales/account teams: Quota requests are largely a human-in-the-loop process. (Despite all the talk about AI agents, you should reach out to…human agents. For now.) Your sales/account reps can help escalate your quota requests.

Quotas for previous-generation GPUs are easier to get: The latest-generation GPUs (currently, H100s) are always the most in demand, so their quotas can be harder to get than those of previous generations, which are still great for many use cases. If your workloads do not need the latest-generation GPUs, apply for quotas for other GPUs first.

Start spending credits even before you get quota approval: Run some workloads on CPU or older-generation GPU instances early, so your account has some activity or payment history. This may help quota approval.

Share an account/project/subscription with different users to share quotas: You may have multiple team members using clouds. The best way to share quotas among users is to share the same entity that quotas are assigned to, i.e., an AWS account, a GCP project, or an Azure subscription. You can then create IAM identities under those entities to distinguish different users.
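
For example, on AWS this can be as simple as creating IAM users under the shared account. A minimal sketch (alice is a hypothetical teammate, and the broad EC2 policy is just for illustration):

# Create an IAM user under the shared AWS account (hypothetical name).
aws iam create-user --user-name alice
# Grant permissions (a broad managed policy, shown for brevity).
aws iam attach-user-policy --user-name alice \
  --policy-arn arn:aws:iam::aws:policy/AmazonEC2FullAccess
# Generate credentials to hand off; their usage counts against the shared account's quotas.
aws iam create-access-key --user-name alice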

Clouds publish official instructions for requesting quotas (AWS, GCP, and Azure), but there are still some important notes to share:

  • GCP: (1) A responsive sales representative is very important for quota requests. Try to find a connection through your network. (2) Remember to increase the GPUs All Regions quota first, which is a “global” quota limiting the total number of GPUs you can create across all GCP regions (see the snippet after this list for a quick way to check it).
  • AWS: (1) You can only open two quota requests in each region, so to speed up the entire process, be sure to apply for quotas in many regions in parallel. (2) Jumping on a call with your sales representative is not a bad idea.
  • Azure: In case your quota request gets rejected, submit a support ticket. Search for “Help + support” in the search bar, click “Create a support request” on the page, and select “Quota request for Compute-VM (cores-vCPUs) subscription limit increases” for the ticket. You will be connected to a representative through email or phone.
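
As a quick sanity check on GCP, you can inspect that global quota from the command line (a sketch, assuming gcloud is authenticated and my-project is your project ID):

# List project-wide quotas and pull out the global GPU cap (GPUS_ALL_REGIONS).
gcloud compute project-info describe --project my-project \
  | grep -B 1 -A 1 GPUS_ALL_REGIONS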

Challenges of (Multi)Cloud Credits

Once you’ve secured your cloud credits and gotten your quotas approved, the hard part begins: spending them wisely. Here’s where the going gets tough:

  • Multi-cloud & laddering complexity: Different clouds have different interfaces for creating instances and running workloads. If you ladder the credits, i.e., burn through one cloud’s credits before adding another, you’re looking at weeks of integration work per cloud just to keep your AI running.
  • Resource unavailability: GPU resource availability varies from time to time and across different regions/clouds. For an AI startup, you have to get GPUs quickly whenever you need them to speed up iteration.
  • Overspending on idle compute: Cloud is elastic and you can terminate your machine when idle, but that also means idle machines can cost you a lot if not well managed.

These challenges are exactly where SkyPilot comes in.

SkyPilot: A Unified System for Running AI on Different Clouds

SkyPilot is an open source framework that abstracts away the infra differences of running AI workloads on different clouds, while automatically finding the most cost-effective resources for your workloads.

With SkyPilot, you can run AI and batch workloads on any infra (12+ clouds and Kubernetes) with a unified interface and in a BYOC (launched in your own cloud accounts) fashion. Let’s take it for a spin to solve the multicloud credit problem.

Step 0: Install SkyPilot in 1 minute

pip install 'skypilot[all]'  # You can put cloud names here too.
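
If you only use one or two clouds, you can install just those extras (e.g., assuming AWS and GCP):

pip install 'skypilot[aws,gcp]'  # Smaller install with only the clouds you need.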

Docs here if you need them.

Step 1: Set up all your clouds in one place

To get started with utilizing your credits, simply set up the clouds you want to use with their native credentials/auth (e.g., ~/.aws/credentials or SSO). SkyPilot will then rely on the native auth to automatically detect access.

Here’s an example of how you’d do this with AWS:

  1. Get your credentials from the AWS console by clicking on “Security Credentials” in the account dropdown at the top right corner.
  2. Create an “Access Key”.
  3. Run aws configure locally and enter the access key you just generated.
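
If you prefer a non-interactive setup (e.g., in a bootstrap script), the equivalent is a few aws configure set calls (a sketch with placeholder values):

# Same effect as the interactive prompts (placeholder values shown).
aws configure set aws_access_key_id AKIAXXXXXXXXXXXXXXXX
aws configure set aws_secret_access_key your-secret-key
aws configure set region us-east-1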

More secure auth like AWS SSO is also supported.

Run the sky check command to see what clouds you have access to:

$ sky check

🎉 Enabled clouds 🎉
  ✔ AWS
  ✔ Azure
  ✔ Cudo
  ✔ Fluidstack
  ✔ GCP
  ✔ Lambda
  ✔ RunPod
  ✔ Kubernetes

You’re now ready to interact with different clouds (and on-prem GPU clusters too) with a unified interface provided by SkyPilot.

Step 2: Write once, run anywhere

First, you declare your project (for training) or model (for serving) in a SkyPilot YAML.

The YAML is sort of like a “Dockerfile for clouds”. It packages your project’s dependencies, its entrypoint, and any resource requirements:

resources:
  accelerators: A100
  cloud: gcp  # Suppose GCP is the first cloud you got some credits on.

# Optionally, launch a serving deployment:
# service: ...

workdir: .

run: |
  python my_entrypoint.py  

You can get this going by running:

sky launch task.yaml
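
Optionally, give the cluster a name so you can manage it later by that name (the name here is arbitrary):

sky launch -c my-first-cluster task.yaml  # -c names the cluster for later sky status/stop/down.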

Now, suppose after a few months you want to switch clouds to take advantage of credits you have elsewhere – you only need a flag switch:

sky launch --cloud aws task.yaml  # Or azure, oci, kubernetes, …

Notice that no workflow or workload changes are needed when moving from cloud X to cloud Y. Your AI workflows are now portable, which makes laddering the credits much easier.

Step 3: Observe and manage multicloud resources in a single pane of glass

Now that we can launch on several clouds, the next question is: How do I see and manage my team’s jobs spread across N clouds and M regions?

You can use SkyPilot’s CLI or API to manage your resources in different clouds:

  • sky status: show all your resources, jobs, and services created on different clouds
  • sky stop / down: stop or tear down your resources when they are idle
  • sky jobs: submit jobs with automatic failure recovery and lifecycle management
  • sky serve: start an autoscaling service with cross-cloud, cross-region capability
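
A typical management session might look like this (cluster names are hypothetical):

sky status             # One view of clusters, jobs, and services across clouds.
sky stop dev-cluster   # Stop an idle dev box (its disk is kept).
sky down old-exp       # Tear down a cluster you no longer need.
sky jobs queue         # Check managed jobs and their recovery status.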

In other words, there’s no need to build your custom tooling on top of cloud-specific APIs.

Here’s an example of the state of your infra, viewable in sky status: you may have some training jobs on Azure, some data processing jobs on GCP, and some dev nodes on AWS.

FAQs

Dealing with quotas & capacity shortage

The astute reader may ask about GPU unavailability. GPU capacity can vary significantly across regions/clouds and across time. Quotas are also notoriously hard to request.

To deal with these issues, SkyPilot has a built-in auto-failover provisioner that looks through all the resources across different regions and clouds that you have access to, improving GPU availability.
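
In practice, this means you avoid pinning a region or zone and let the provisioner walk through your options (a sketch; the GPU count is illustrative):

# No region/zone pinned: SkyPilot retries across regions (and clouds, if
# --cloud is omitted) until it finds available capacity.
sky launch --gpus H100:8 task.yaml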

Additionally, you can ask SkyPilot to utilize different pricing models: on-demand, spot and long/short-term reservations. This further enlarges your total resource pool.

For example, anecdotally we found that spot GPUs on Google Cloud can be more available than on-demand ones, and you can use SkyPilot to easily leverage spot instances.
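
For example, launching on spot capacity is a one-flag change (a sketch):

sky launch --use-spot --gpus A100:8 task.yaml  # Spot instances: cheaper, and sometimes more available.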

Are more cost savings possible?

The last question is: How can I stretch out my cloud credits for as long as possible?

SkyPilot offers three key cost-saving features for this:

Cost optimizer: On most clouds, pricing of the same hardware differs by regions or zones and can change over time. To fully utilize the credits you get from clouds, SkyPilot automatically finds the cheapest resources across your cloud accounts whenever you are launching new cloud resources.

Autostop: To avoid overspending, you can configure SkyPilot to automatically stop or tear down idle resources by simply running sky autostop <cluster-name>.
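
For example (my-cluster is a hypothetical name; -i sets the idle minutes before triggering):

sky autostop my-cluster -i 60         # Stop after 60 idle minutes.
sky autostop my-cluster -i 60 --down  # Tear down instead of stopping.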

Spot instance support: Lastly, SkyPilot enables you to cut costs by over 3x by running jobs on spot instances and automatically recovering from preemptions. To do so, use sky jobs launch --use-spot. Docs here.

What about data transfer / egress costs across clouds?

This is definitely a question that an AI infra lead should be asking. If my team’s AI jobs are launched on different clouds, egress costs will go crazy, right? Where do I store the data?

Fortunately, zero-egress object storage is being offered by the industry. For example, Cloudflare R2 is a zero-egress object storage that is S3-compatible. Tigris Data is another such storage provider.

S3 compatibility means that you can use these object stores from within SkyPilot with no code changes. Our suggested pattern is (using R2 as an example):

  • Store your large datasets (that training workers need to read) in R2
  • Make your training job workers read from that bucket

To have your training job read from an R2 bucket, add a file_mounts field, and pass a path to your training program:

resources:
  ...

file_mounts:
  /train_data:
    source: r2://my-bucket/
    mode: MOUNT  # Either MOUNT (stream) or COPY (copy to VMs first)

run: |
  python my_entrypoint.py --input-path=/train_data  

This ensures reading training datasets incurs zero egress costs, regardless of where you launch the GPU compute!
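
To seed the bucket in the first place, any S3-compatible client works; for example, with the AWS CLI pointed at R2’s endpoint (a sketch; the account ID and bucket name are placeholders):

# Upload a local dataset to R2 via its S3-compatible endpoint.
aws s3 cp ./train_data s3://my-bucket/ --recursive \
  --endpoint-url https://<account-id>.r2.cloudflarestorage.com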

Summary

Cloud providers offer up to $1M in credits for AI startups, but effectively managing those credits across multiple clouds can be a huge challenge.

SkyPilot simplifies running AI workloads on any (and many) clouds. It provides a single interface to manage jobs on AWS, GCP, Azure, and more, while optimizing for costs and ensuring availability.

Stop wasting credits on idle resources (a top factor for cloud overspending), and start maximizing your cloud credits’ leverage with SkyPilot’s automated cost-saving features.

To spend your cloud credits, get started with SkyPilot in 5 minutes here.

Thanks to Justin Gage for reviewing and editing earlier drafts of this post.