WTH are Azure Spot VMs?

This week we continue the sporadic series of “WTH is” posts, which look at Azure services where it is not all that obvious what they are for, or why you might want to use them. This time it is the turn of Azure Spot VMs.

What are Azure Spot VMs?

Some time ago, Microsoft introduced a service called “Low Priority VMs”. This service allowed you to take advantage of any spare capacity that was sitting idle in an Azure region. You could use this idle capacity at a reduced cost, on the proviso that if the capacity was needed it would be taken away from you at short notice. If you could cope with this potential removal of VMs then you could get up to an 80% reduction in cost.

Amazon had been doing something similar for a long time, but the big difference with Amazon was that you actually bid on the price of this spare capacity. The price for the spare capacity VMs was not fixed like the Azure version; it would fluctuate up and down based on how much spare capacity was available. If there was lots of spare capacity the price would be lower; if there was limited or no spare capacity the price would be higher, or the capacity would be unavailable. Amazon called this service “Spot Instances”.

Spot VMs are Microsoft's implementation of Amazon's model of bidding for spare capacity, with a slight name change, I assume to placate the legal team. So now, instead of being offered a fixed price for using spare capacity, you bid on it by indicating the maximum price you are willing to pay for your VM.

Spot VMs support both standalone VMs and VM Scale Sets (VMSS).

How do Azure Spot VMs work?

VM Creation

Using a Spot VM is a choice you make at the time you create a new virtual machine or VM scale set; it is not a separate type of VM. When you open the “New VM” wizard, one of the first options is whether to use a Spot VM or not.

Select Spot VM

If you select yes then you need to select the “eviction type”. This determines the criteria Azure uses to take the VM away from you (evict it), and so sets your payment criteria. There are two options:

  1. Capacity Only - Your VM will only be evicted when there is no spare capacity left and the capacity you are using is needed for pay-as-you-go (full price) resources. In this mode you are effectively saying you are OK with paying up to the standard pay-as-you-go price for the VM. If spare capacity is high you will pay less, and if the price goes up you will pay it rather than be evicted. This is effectively how “Low Priority VMs” worked.

  2. Price or Capacity - You will be evicted if either Azure needs your capacity, or the spot price reaches the maximum price you are prepared to pay per hour. This is in line with how the Amazon approach works.

    Eviction type

If you select Capacity Only then there is nothing more to do, but for Price or Capacity you now need to set the maximum price you want to pay. The portal will show you the minimum price you can enter.
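To make the difference between the two eviction types concrete, here is a minimal sketch of the decision as I understand it from the descriptions above. The names `EvictionType` and `should_evict` are my own and are not part of any Azure SDK, and treating "above the maximum price" as a strict comparison is an assumption.

```python
from enum import Enum
from typing import Optional


class EvictionType(Enum):
    """The two eviction types offered in the portal (names are illustrative)."""
    CAPACITY_ONLY = "capacity_only"
    PRICE_OR_CAPACITY = "price_or_capacity"


def should_evict(
    eviction_type: EvictionType,
    capacity_needed: bool,
    spot_price: float,
    max_price: Optional[float] = None,
) -> bool:
    """Return True if a VM would be evicted under the rules described above."""
    # Both eviction types lose the VM when Azure needs the capacity back.
    if capacity_needed:
        return True
    # Only "Price or Capacity" also evicts on price.
    if eviction_type is EvictionType.PRICE_OR_CAPACITY:
        if max_price is None:
            raise ValueError("max_price is required for PRICE_OR_CAPACITY")
        # Assumption: eviction happens once the spot price exceeds the maximum.
        return spot_price > max_price
    return False
```

With a maximum price of 0.04 per hour, a spot price of 0.05 would evict a Price or Capacity VM, while a Capacity Only VM would keep running and simply be billed the higher price.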

pricing

You can also see a list of the average spot pricing for nearby regions, in case they might be cheaper.

Regions

After this point you complete the rest of the wizard as you would a normal VM, and go ahead and start the creation.

At the time you deploy the VM, there must be capacity available at the price point you set, otherwise you will get an error deploying the VM.
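If you deploy with templates rather than the portal, the same choices show up as a handful of properties on the VM resource. The sketch below builds that fragment as a plain dictionary. The property names (`priority`, `evictionPolicy`, `billingProfile.maxPrice`) and the use of `-1` to mean "do not evict on price" are how I understand the ARM schema, so check them against the current template reference; the helper name `spot_vm_properties` is purely illustrative.

```python
from typing import Optional


def spot_vm_properties(max_price: Optional[float] = None) -> dict:
    """Build the Spot-related properties for a VM resource definition.

    max_price=None models "Capacity Only": maxPrice is set to -1, which
    (as I understand it) means the VM is never evicted for price reasons.
    A positive max_price models "Price or Capacity".
    """
    if max_price is not None and max_price <= 0:
        raise ValueError("max_price must be positive, or None for capacity only")
    return {
        "priority": "Spot",
        # Evicted VMs are de-allocated rather than deleted.
        "evictionPolicy": "Deallocate",
        "billingProfile": {"maxPrice": -1 if max_price is None else max_price},
    }
```

Calling `spot_vm_properties()` gives the Capacity Only settings, and `spot_vm_properties(0.05)` gives Price or Capacity with a maximum of 0.05 per hour. The result would be merged into the `properties` section of the VM resource alongside the usual hardware, storage and network settings.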

VM Use

Once your VM is created it will run for as long as there is spare capacity and the cost is less than the price you have set. It will act just like any other VM.

Should Azure need the spare capacity, or should the price go above the price you set, your VM will be given a 30-second warning before it is evicted. Once the machine is evicted it will be de-allocated. The pending eviction is sent to the VM through the Azure Instance Metadata Service, so it is possible to detect the upcoming eviction and do something, such as a graceful shutdown of your application, so long as you can do it in 30 seconds.
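As a rough illustration of detecting the eviction notice, the sketch below reads the Scheduled Events document from the Instance Metadata Service and picks out events of type `Preempt`, which is the type used for Spot evictions. The `api-version` value is an assumption on my part, the function names are my own, and `fetch_scheduled_events` only works when run from inside an Azure VM.

```python
import json
import urllib.request

# Scheduled Events endpoint of the Azure Instance Metadata Service.
# Assumption: the api-version below may not be the latest; check the docs.
SCHEDULED_EVENTS_URL = (
    "http://169.254.169.254/metadata/scheduledevents?api-version=2020-07-01"
)


def fetch_scheduled_events(timeout: float = 2.0) -> dict:
    """Fetch the scheduled events document (only reachable inside an Azure VM)."""
    request = urllib.request.Request(
        SCHEDULED_EVENTS_URL, headers={"Metadata": "true"}
    )
    # The metadata endpoint must be called directly, so bypass any proxy settings.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def pending_evictions(document: dict) -> list:
    """Return the events in the document that announce a Spot eviction."""
    return [
        event
        for event in document.get("Events", [])
        if event.get("EventType") == "Preempt"
    ]
```

In practice you would call `pending_evictions(fetch_scheduled_events())` in a loop every second or so and start your graceful shutdown as soon as it returns a non-empty list, since you only have around 30 seconds to react.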

When capacity becomes available again, or pricing comes back down below your limit, you are able to start up your evicted machines. However, they will not be started automatically for you.

Why would I want to use Azure Spot VMs?

There's really only one reason you would want to use Spot VMs, and that is to reduce your costs. If you can put up with the threat of eviction, and with not being able to get resources at certain times, then you can make some very significant savings on VM costs. For workloads that are not time sensitive but that require a lot of compute power, this can be a good way to reduce costs.
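To get a feel for the numbers, here is a small sketch that compares the cost of a batch of compute hours at the pay-as-you-go rate and at a spot rate. The function name and the rates used in the example are hypothetical and are not real Azure prices.

```python
def spot_saving(pay_as_you_go_rate: float, spot_rate: float, hours: float) -> dict:
    """Compare the cost of running for `hours` at full price and at a spot price."""
    if pay_as_you_go_rate <= 0 or spot_rate < 0 or hours < 0:
        raise ValueError("rates and hours must be non-negative, full rate positive")
    full_cost = pay_as_you_go_rate * hours
    spot_cost = spot_rate * hours
    saving = full_cost - spot_cost
    return {
        "full_cost": full_cost,
        "spot_cost": spot_cost,
        "saving": saving,
        # Percentage saved relative to the full pay-as-you-go cost.
        "saving_percent": 100.0 * saving / full_cost if hours > 0 else 0.0,
    }
```

For example, `spot_saving(0.10, 0.02, 100)` models 100 hours on a VM with a made-up full price of 0.10 per hour and a made-up spot price of 0.02 per hour, which works out at an 80% saving. It does not account for work lost or repeated because of evictions, which will eat into the saving for some workloads.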

There is no way to guarantee you will be able to get resources with Spot VMs, so you need to be comfortable with that. When capacity is low (as it is in many regions currently) you may struggle to get resources.

What issues do Spot VMs have?

Obviously the biggest concern with Spot VMs is that you may not be able to get resources. This is not a fault in the service, it is how it is designed to work, but you need to be aware of it and make sure you are happy with that approach.

Beyond that, there are a few limitations in the service that I think you should be aware of:

  • B series VMs are not supported
  • Promo Sizes are not supported
  • Spot VMs cannot use ephemeral disks
  • Azure China is not supported
  • Spot VMs are not available on Benefit, Sponsored or Free subscriptions
  • You cannot convert a Spot VM to a regular VM, or a regular VM to Spot. You would have to delete the VM and attach the disk to a new VM
  • Machines which are evicted and de-allocated are not turned back on when capacity or price comes back inside the allowed limits; you need to turn them back on manually
  • You will be unable to create your VM if capacity or pricing are not inside allowed limits

Further Reading:

If you are interested in trying out spot VMs you can find information and tutorials here: