Enforce Budgets with Azure Automation

Budgets are a feature of the new Azure cost management tool, which is primarily the integration of Cloudyn into the Azure portal. Budgets allow you to set a financial boundary for a subscription or a resource group. You can monitor spending against that boundary through cost management and trigger alerts when you get close to the amount set in your budget.

One thing you cannot do with budgets, and something that many people feel is an obvious requirement, is to stop people spending any more money when they hit the budget. If a subscription or resource group breaches its budget, then it will trigger an alert. However, it does not turn off any resources, and it does not stop further spending.

There is a way around this, though, and that is what we will look at today.

Using Budget Alerts to Trigger Scripts

As I mentioned, you can set up budgets to trigger an alert when you reach a certain threshold. These alerts are configured to call Azure Monitor action groups. Action groups allow you to configure things like email, SMS and phone alerts, but they also let you call out to other services such as Azure Functions, Logic Apps and Azure Automation. Because we can call these services, we can trigger any script or code we like when an alert occurs, and so implement our own logic to restrict further spending when a budget is hit.

In this example, we are going to create a straightforward Azure Automation script that turns off all the VMs in an Azure Resource Group or Subscription if a budget is hit. We will also apply a resource lock to prevent someone from turning the resources back on or creating more.

Pre-Requisites

There are a few things we need in place before we set this up:

  1. You need to use an Enterprise, Pay-As-You-Go or Dev/Test subscription for budgets to be supported. CSP and Sponsorship subscriptions are not currently supported.
  2. You need a service principal that will be used to run the Automation script and talk to the Azure API. See here for details on creating a service principal.
  3. The service principal needs to have permission to turn off the VMs and to create the resource lock. You may want to create a custom role to do this, as by default only the Owner and User Access Administrator built-in roles have permission to create locks. This role will require the "Microsoft.Authorization/locks/*" permission.
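
As a rough illustration of the custom role mentioned in step 3, the sketch below clones a built-in role and trims it down to the actions the runbook needs. The role name "Budget Enforcer", the subscription ID, the AppID and the exact list of actions are assumptions for this example, not something the portal requires; adjust them to your environment.

```powershell
# Sketch only: the IDs and the role name below are placeholders.
$subscriptionId = '00000000-0000-0000-0000-000000000000'
$appId          = '11111111-1111-1111-1111-111111111111'   # AppID of the service principal
$scope          = "/subscriptions/$subscriptionId"

# Start from a built-in role definition and strip it down
$role = Get-AzureRmRoleDefinition -Name 'Virtual Machine Contributor'
$role.Id          = $null
$role.Name        = 'Budget Enforcer'
$role.Description = 'Can stop VMs and manage resource locks.'
$role.Actions.Clear()
$role.NotActions.Clear()

# Read VMs and their power state, deallocate them, and manage locks
$role.Actions.Add('Microsoft.Resources/subscriptions/resourceGroups/read')
$role.Actions.Add('Microsoft.Compute/virtualMachines/read')
$role.Actions.Add('Microsoft.Compute/virtualMachines/instanceView/read')
$role.Actions.Add('Microsoft.Compute/virtualMachines/deallocate/action')
$role.Actions.Add('Microsoft.Authorization/locks/*')

$role.AssignableScopes.Clear()
$role.AssignableScopes.Add($scope)

New-AzureRmRoleDefinition -Role $role

# Assign the new role to the service principal at the same scope
New-AzureRmRoleAssignment -ServicePrincipalName $appId -RoleDefinitionName 'Budget Enforcer' -Scope $scope
```

A new role definition can take a few minutes to become available, so the assignment may need to be retried if it fails straight after the role is created.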

Set Up Automation

The first thing we want to do is set up Azure Automation with the resources we need.

Automation Account

We need an Automation account to run the script. You can use an existing one or create a new one. Make sure you update the AzureRM modules to the latest version in the modules tab. If the script fails with an error saying the resource lock type "readOnly" does not exist, it is because the AzureRM modules have not been updated.

Credential Asset

The script needs credentials to run as when it talks to Azure. We will use the service principal you created earlier for this. In Azure Automation, create a credential asset called "AzureCredential", set the username to the AppID of the service principal and the password to the service principal's password.
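
If you would rather script this step than use the portal, something along these lines should work. The resource group and Automation account names are placeholders for this sketch, and it assumes you are already logged in with an account that can manage the Automation account.

```powershell
# Sketch only: replace the placeholder names and AppID with your own values.
$appId  = '11111111-1111-1111-1111-111111111111'   # AppID of the service principal
$secret = Read-Host -Prompt 'Service principal password' -AsSecureString

# Username is the AppID, password is the service principal's password
$credential = New-Object System.Management.Automation.PSCredential ($appId, $secret)

New-AzureRmAutomationCredential `
    -ResourceGroupName 'my-automation-rg' `
    -AutomationAccountName 'my-automation-account' `
    -Name 'AzureCredential' `
    -Value $credential
```

Prompting for the password with Read-Host keeps the secret out of the script file and your shell history.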

Runbook

We now need to create the runbook we wish to trigger. Go to the runbooks tab and click create new. You can call it whatever you wish. The script we are going to use in this example is below. It is a simple script that takes four parameters:

  1. The name of a credential asset containing your Azure SP creds. This is optional; if it is not provided, the script uses the default name of "AzureCredential".
  2. The ID of the subscription to run against.
  3. The ID of the tenant the service principal is in.
  4. An optional name of the resource group to run against. If this is not supplied, the script runs against all VMs in the sub.

The script below is an adapted version of the gallery script available here.

# Returns strings with status messages
[OutputType([String])]
param (
    [Parameter(Mandatory=$false)]
    [String] $AzureCredentialAssetName = 'AzureCredential',

    [Parameter(Mandatory=$true)]
    [String] $AzureSubscriptionId,

    [Parameter(Mandatory=$true)]
    [String] $AzureTenantID,

    [Parameter(Mandatory=$false)]
    [String] $ResourceGroupName
)

# Connect to Azure as the service principal. -ErrorAction Stop makes any
# failure here terminate the runbook.
$Cred = Get-AutomationPSCredential -Name $AzureCredentialAssetName -ErrorAction Stop
Add-AzureRmAccount -Credential $Cred -TenantId $AzureTenantID -ServicePrincipal -ErrorAction Stop

# Select the subscription to work against
Set-AzureRmContext -SubscriptionId $AzureSubscriptionId -ErrorAction Stop

# If there is a specific resource group, then get all VMs in the resource group,
# otherwise get all VMs in the subscription.
if ($ResourceGroupName) 
{ 
    $VMs = Get-AzureRmVM -ResourceGroupName $ResourceGroupName -status
}
else 
{ 
    $VMs = Get-AzureRmVM -status
}

# Stop each of the VMs
foreach ($VM in $VMs)
{
    if ($VM.PowerState -eq "VM running") {
        $StopRtn = $VM | Stop-AzureRmVM -Force -ErrorAction Continue

        if ($StopRtn.Status -ne 'Succeeded')
        {
            # The VM failed to stop, so send notice
            Write-Output ($VM.Name + " failed to stop")
            Write-Error ($VM.Name + " failed to stop. Error was:") -ErrorAction Continue
            Write-Error (ConvertTo-Json $StopRtn.Error) -ErrorAction Continue
        }
        else
        {
            # The VM stopped, so send notice
            Write-Output ($VM.Name + " has been stopped")
        }
    }
}

# Add a resource lock to prevent further changes.
# -Force suppresses the confirmation prompt, which a runbook cannot answer.
if ($ResourceGroupName)
{
    New-AzureRmResourceLock -LockLevel ReadOnly -LockNotes "Lock after breaching budget - contact operations to remove" -LockName "BudgetBreachLock" -ResourceGroupName $ResourceGroupName -Force
}
else
{
    New-AzureRmResourceLock -LockLevel ReadOnly -LockNotes "Lock after breaching budget - contact operations to remove" -LockName "BudgetBreachLock" -Scope "/subscriptions/$AzureSubscriptionId" -Force
}
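
Before wiring the runbook up to an alert, it can be worth publishing it and starting it by hand against a disposable resource group to check the permissions and parameters. The sketch below assumes the runbook was saved as "Stop-VMsOnBudgetBreach" and that the account, resource group and ID values shown are placeholders you will replace.

```powershell
# Sketch only: all names and IDs here are placeholders.
$runbookParams = @{
    AzureSubscriptionId = '00000000-0000-0000-0000-000000000000'
    AzureTenantID       = '22222222-2222-2222-2222-222222222222'
    ResourceGroupName   = 'budget-test-rg'   # omit this key to target the whole subscription
}

# Start the published runbook and wait for the job to finish,
# so its output (the "has been stopped" messages) is returned here
Start-AzureRmAutomationRunbook `
    -ResourceGroupName 'my-automation-rg' `
    -AutomationAccountName 'my-automation-account' `
    -Name 'Stop-VMsOnBudgetBreach' `
    -Parameters $runbookParams `
    -Wait
```

Bear in mind that a successful test run leaves the VMs stopped and the "BudgetBreachLock" lock in place on the test resource group.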

Create Action Group

Now that we have the automation job ready to go, we want to set up the action group that is triggered when the budget is hit. Action groups are created under Azure Monitor: select alerts on the left, and then "manage action groups".

Click "Add action group" to create a new one and give it a name and short name. Each budget will need its own action group, as you specify the subscription and/or the resource group to turn off here. Select the subscription and resource group to store the action group in. This has nothing to do with the resources you want to alert on; it is just for storage.

Next, we need to create some actions. If you want to send email or SMS alerts when a budget is breached, you can set them up here. Then we need to create an action with a type of "Automation Runbook". In the page that opens, change the "Runbook Source" to "User" and then select the sub, automation account and runbook we configured previously.

Finally, we need to configure the parameters we want to send for this action. Click "configure parameters" and then enter the subscription ID and tenant ID. Enter the resource group name if you want this to operate at the resource group level; otherwise, leave it empty to act on the whole subscription.

If you named your credential asset anything other than "AzureCredential", you will also need to set that.

Click OK to save, and it is now ready. If you need to work with other resource groups or subscriptions, repeat the steps above for each of them.

Set Up Budget

The final step is to set up the budget to trigger this alert. In the Azure portal, open the cost management blade, then select "Cost Management" and then "Budgets". At this point, you need to select the scope you want to set your budget at in the scope selector. Click on this and filter down to the subscription or resource group you are interested in.

Click on "Add" to create a new budget. Fill in the fields to give it a name and the amount for the budget as well as how often it resets. You can also set an expiry if you wish.

Next, we complete the alerts section and configure a condition so that when the spend hits 100% of the budget, it triggers our action group, which runs our automation runbook.

You can also add some email alerts at the bottom of this page if you wish.

Wait for Trigger

At this point, we are ready to go. If spending in the scope of this budget hits 100%, it will trigger the defined action group, which will run our automation script. This script will turn off all VMs in the scope provided and set a lock that prevents turning them back on or adding new resources. Only users with rights to remove locks will be able to remove the lock and turn the VMs back on.
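
Once the overspend has been dealt with, someone with rights to manage locks has to remove the lock before anything can be started again. A minimal sketch of that recovery step, assuming the lock name used in the script above and a placeholder resource group name, might look like this:

```powershell
# Sketch only: run as a user who is allowed to delete locks.
$resourceGroupName = 'budget-test-rg'   # placeholder

# Remove the lock the runbook created on the resource group.
# For a subscription-level lock, use -Scope "/subscriptions/<subscription id>"
# in place of -ResourceGroupName.
Remove-AzureRmResourceLock -LockName 'BudgetBreachLock' -ResourceGroupName $resourceGroupName -Force

# Start every VM in the resource group again
Get-AzureRmVM -ResourceGroupName $resourceGroupName | Start-AzureRmVM
```

Lock removal can take a short while to take effect, so if the start fails immediately afterwards, wait a moment and try again. You may also prefer to start only selected VMs instead of all of them.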

Future Development

This example is a pretty simple one that works for VMs. If you want to turn off other resources or undertake other actions, you can easily do so by writing your own automation script. If you prefer writing C#, you can also do this with a Function, or undertake a more complex workflow with a Logic App. If you do come up with interesting scripts, please do share them and comment back here.