Using Terraform with Azure - What's the benefit?

2018-02-26

If you’ve been to any cloud or devops conference or meetup in the last year, you’ve probably heard about Terraform, the Infrastructure as Code tool from HashiCorp. It’s been one of the big talking points of the devops community for some time now. Terraform uses its own configuration language for creating infrastructure as code deployments, and one of its biggest selling points is that it supports multiple cloud vendors, including Azure. I completely understand the excitement around having a single language to support deployments to multiple clouds, if that’s the space you’re in. For me, 90% or more of my deployments are in Azure, so I have struggled to see a compelling reason to switch from using ARM templates. So I decided it was time for some investigation. I’ve spent some time using Terraform to deploy some basic Azure resources, and here are my views on its good and bad points and its feasibility as a replacement for ARM templates.

The rest of this article reflects my views based on what I have experienced so far. I haven’t been using Terraform for long and I am sure there are some concepts I have not grasped or areas I have missed. I’d be glad to take any feedback!

At this point in my testing I’ve not done enough deployments to have a view on whether performance or reliability is any better with Terraform than with ARM templates; it’s something I’d like to explore in the future.

Advantages

We’ll take a look first at the things I found in Terraform that provided value above what can be done with ARM templates:

Multi Provider

As I mentioned, one of the big selling points of Terraform is that you can use the same language to create deployments for Azure, AWS, GCE, OpenStack etc., as well as on-prem bare metal deployments. I can really see the benefits in this, especially if you are deploying resources that span cloud providers. The example provided for this scenario, deploying a cloud server in one vendor and adding a DNS entry in a different cloud vendor, is a pretty good demonstration of how this could do some really cool things, if you’re operating in this world.
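
As a rough sketch of that cross-vendor scenario, the fragment below creates a public IP address in Azure and points an AWS Route 53 record at it. All names, the region and the route53_zone_id variable are made up for illustration, both providers are assumed to be configured already, and the argument names follow the azurerm 1.x provider (newer versions rename some of them).

```hcl
# Hypothetical example: Azure public IP with a DNS record hosted in AWS Route 53.
variable "route53_zone_id" {}

resource "azurerm_resource_group" "web" {
  name     = "web-rg"
  location = "West Europe"
}

resource "azurerm_public_ip" "web" {
  name                = "web-pip"
  location            = "${azurerm_resource_group.web.location}"
  resource_group_name = "${azurerm_resource_group.web.name}"

  # azurerm 1.x argument name; later provider versions use allocation_method instead.
  public_ip_address_allocation = "static"
}

# Terraform infers from the reference that the public IP must be created first,
# even though the two resources belong to different providers.
resource "aws_route53_record" "web" {
  zone_id = "${var.route53_zone_id}"
  name    = "www.example.com"
  type    = "A"
  ttl     = 300
  records = ["${azurerm_public_ip.web.ip_address}"]
}
```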

Syntax

Terraform uses a custom language called HashiCorp Configuration Language (HCL). HCL is designed to be a compromise between being human readable and machine friendly, and in my experience is generally easier to read than ARM template JSON.

The example below shows creating a storage account using an ARM template:

```json
{
    "type": "Microsoft.Storage/storageAccounts",
    "name": "[variables('storageAccountName')]",
    "apiVersion": "2016-01-01",
    "location": "[resourceGroup().location]",
    "sku": {
        "name": "[parameters('storageAccountType')]"
    },
    "kind": "Storage",
    "properties": {}
}
```

Compare this to creating the same resource in a Terraform template:

```hcl
resource "azurerm_storage_account" "storage" {
  name                     = "${random_id.storageAccountName.dec}"
  resource_group_name      = "${azurerm_resource_group.storageRG.name}"
  location                 = "${azurerm_resource_group.storageRG.location}"
  account_kind             = "Storage"
  account_tier             = "${var.storageAccountType}"
  account_replication_type = "${var.storageAccountReplication}"
}
```

In my view this is simpler to read than the ARM version for a few reasons:

Type is implied from the resource header
No need to specify an API version
Less concern regarding quotations
Reduced nesting

It’s still not perfect: variable/parameter references still use a special interpolation syntax which is not particularly easy to read, particularly as Terraform doesn’t distinguish between parameters and variables like an ARM template does.

Additional Resources

I hadn’t expected this when I started looking at Terraform, but it does actually offer the ability to create some items that aren’t available in ARM templates. The obvious one is resource groups, which are defined as first class objects in a Terraform template, with resources then placed in them. I prefer this approach to the ARM one, where the resource group is implied by the command used to run the template rather than actually being part of the template. In addition to resource groups, Terraform can also create storage containers, queues, tables and file shares, which is currently not possible in an ARM template.
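
To make that concrete, here is a minimal sketch of a resource group and a blob container declared alongside the storage account from the earlier example. The names and region are hypothetical, and the container arguments are those of the azurerm 1.x provider, so treat them as an assumption if you are on a newer version.

```hcl
# Resource group declared in the template itself (name and region are hypothetical).
resource "azurerm_resource_group" "storageRG" {
  name     = "storage-rg"
  location = "West Europe"
}

# Blob container inside the storage account from the earlier example.
# resource_group_name is required by azurerm 1.x; later provider versions drop it.
resource "azurerm_storage_container" "data" {
  name                  = "data"
  resource_group_name   = "${azurerm_resource_group.storageRG.name}"
  storage_account_name  = "${azurerm_storage_account.storage.name}"
  container_access_type = "private"
}
```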

State

This is an interesting one. With an ARM template all of the state of the deployment is in the deployment itself running in Azure. However, when you deploy a Terraform template it creates a state file, “terraform.tfstate”. This state file is used to store the state of the infrastructure that is deployed, separate from that infrastructure. Coming from an ARM template world this seems a little odd initially, as now you have to manage a separate file rather than relying on the infrastructure itself, but there are some benefits to using this approach:

Terraform retains knowledge and ownership of the resources it deploys. Because of this you can do things like running “terraform destroy” and have it tear down the resources it created; with an ARM template the only way to do that is to remove the resource group (or delete the resources individually).
Terraform is able to store metadata about the deployment, which means it can have knowledge of things that aren’t specifically part of the resources you deploy, allowing Terraform to provide more functionality than just what the resource API provides.
State includes things like dependencies, which at first glance makes you think, well, Azure already provides this. But what about dependencies between providers, where you need an AWS DynamoDB table created before you create an Azure VM?
State files for one deployment can be used by another as input data, so if you’re doing an application deployment on top of an existing infrastructure deployment you can reference this state to get the details.
State files become really important when you are working with providers that don’t offer a complex resource API to get state from, such as on-premises deployments.

Terraform includes ways to share state between people working in the same environment (remote state) and to keep separate state for different environments (workspaces). It is also possible to refresh the state to make sure that it is in line with the existing deployment.
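
As an illustration of remote state, the sketch below keeps one deployment’s state in an Azure storage account and then reads it from a second deployment. The storage account, container, key and the subnet_id output are all hypothetical, and the syntax is Terraform 0.11 style to match the examples above (0.12 and later use config = { ... } and .outputs.subnet_id instead).

```hcl
# --- Infrastructure deployment: keep its state in Azure blob storage ---
# The storage access key is assumed to be supplied via the ARM_ACCESS_KEY
# environment variable rather than written into the file.
terraform {
  backend "azurerm" {
    storage_account_name = "tfstatestore"
    container_name       = "tfstate"
    key                  = "infrastructure.terraform.tfstate"
  }
}

# --- Application deployment (a separate folder): read that state as input ---
data "terraform_remote_state" "infra" {
  backend = "azurerm"

  config {
    storage_account_name = "tfstatestore"
    container_name       = "tfstate"
    key                  = "infrastructure.terraform.tfstate"
  }
}

# Uses an output named subnet_id that the infrastructure deployment is assumed to export.
output "infra_subnet_id" {
  value = "${data.terraform_remote_state.infra.subnet_id}"
}
```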

All of that said, state files also introduce some issues, which we will discuss later.

Dry Runs

One other advantage of using state is that before running a deployment you can run the command “terraform plan”. This looks at the state file and the template you are trying to deploy and determines what changes it needs to make, without actually making any. It then presents you with a summary of what the deployment will change in this environment.

This not only documents what it is going to change, it also creates the deployment plan it will use to create the resources. We can also specify the -out parameter with the plan command to write this plan to a file, and then feed that file to the “terraform apply” command. This provides two benefits:

We know that what is applied will be exactly what is in that plan.
If we don’t have permission to deploy resources, we can hand that plan file to someone who does.
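
A minimal sketch of that workflow is shown below, with the commands given as comments next to a throwaway resource. The resource name, region and the plan file name tfplan are arbitrary choices for illustration.

```hcl
# main.tf (hypothetical): a single resource group to plan and apply.
resource "azurerm_resource_group" "demo" {
  name     = "demo-rg"
  location = "West Europe"
}

# Run from the folder containing this file:
#   terraform init              # download the azurerm provider
#   terraform plan -out=tfplan  # record the proposed changes in a file named tfplan
#   terraform apply tfplan      # apply exactly the changes recorded in tfplan
```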

Data Sources

Data sources are configuration objects in Terraform that allow you to collect data from outside of Terraform. There is a wide range of data sources available within each provider. For example, in the Azure provider we can use data sources to pull in information about existing resources such as DNS zones, RBAC roles, disk images etc., and similar data sources exist for AWS resources and other cloud providers. There are also more generic data sources that allow you to pull data from a file or zip, as well as providers for services like Git, Datadog, New Relic etc.

If none of these built-in data sources meet your needs, there is also the external data source, which allows you to call a script and read data from it, so long as it is returned as JSON.
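
The sketch below shows both kinds side by side: an Azure data source that looks up an existing resource group, and the external data source calling a script. The resource group names, the build_info.py script and its version key are all hypothetical; the script is assumed to print a flat JSON object of strings such as {"version": "1.2.3"}.

```hcl
# Look up an existing resource group by name (the name is hypothetical).
data "azurerm_resource_group" "shared" {
  name = "shared-rg"
}

# Run a hypothetical script and read the JSON object it prints to stdout.
data "external" "build_info" {
  program = ["python", "${path.module}/build_info.py"]
}

# Use values from both data sources when creating a new resource.
resource "azurerm_resource_group" "app" {
  name     = "app-rg"
  location = "${data.azurerm_resource_group.shared.location}"

  tags = {
    version = "${data.external.build_info.result["version"]}"
  }
}
```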

Modules

Modules in Terraform provide a way to create re-usable code. At its core a module is just another Terraform template or set of templates that are packaged up together and can then be called from your main template like so:

```hcl
module "frontend" { source = "/modules/frontend-app" }
```

Modules can be configured with input and output variables to allow your variables to flow between the main template and the modules. In the example above the module is stored in a file path, but what makes modules really useful is that you can store them in version control and reference them directly from your template:
```hcl
module "frontend" { source = "git::github.com/repo/modules.git//frontend-app?ref=v0.0.1" }
```

When you run Terraform you now have an extra command to run, “terraform get”, which will fetch the modules from the repository ready for use. This makes code reuse within your organisation much easier.
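
To show how input and output variables flow between the main template and a module, here is a small sketch. The file layout, the location input and the app_name output are invented for illustration, and the three parts would normally live in separate files as the comments indicate.

```hcl
# modules/frontend-app/variables.tf: input accepted by the module
variable "location" {}

# modules/frontend-app/outputs.tf: value the module hands back to its caller
output "app_name" {
  value = "frontend-${var.location}"
}

# main.tf: call the module, pass an input, and read its output
module "frontend" {
  source   = "./modules/frontend-app"
  location = "westeurope"
}

output "frontend_app_name" {
  value = "${module.frontend.app_name}"
}
```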

Comments

This one is a bit silly, but Terraform’s HCL language supports the use of comments, whereas JSON in ARM templates technically does not.
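
For reference, HCL accepts the following comment styles (the resource itself is just a hypothetical placeholder):

```hcl
# Single-line comment
// Also a single-line comment

/*
  Multi-line comment, handy for explaining
  why a resource is configured a certain way.
*/
resource "azurerm_resource_group" "demo" {
  name     = "demo-rg"     # hypothetical name
  location = "West Europe" # hypothetical region
}
```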

Disadvantages

Not everything about Terraform was an improvement on ARM templates; there were some areas of concern:

Resource Availability and Dependency

Obviously resources in Terraform are created by HashiCorp, so there is potential for a delay between Azure resources being released by Microsoft and them being available to create in Terraform. Resources seem to be added pretty quickly, for example there is already a resource for AKS, but there are some things missing. For example, there is currently no resource to create an Azure Recovery Services vault or an App Service certificate. On a lower level, I have an ARM template that creates some PaaS resources (web app, SQL etc.), which I can create using Terraform. However, my ARM template also configures these resources to send their diagnostic data to Log Analytics, and the functionality to do this does not currently exist in Terraform templates for Azure.

There is a way round this, because you can have Terraform just deploy an ARM template, but this is an unpleasant solution. Firstly you end up writing ARM syntax JSON inside your HCL resource and secondly you end up splitting your resource configuration between the two methods.
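
A sketch of what that workaround looks like is below, using the azurerm_template_deployment resource. The deployment name, the workspaceId parameter and the logAnalyticsWorkspaceId variable are hypothetical, the embedded template is deliberately left empty, and the resource group is the one from the storage account example.

```hcl
variable "logAnalyticsWorkspaceId" {}

resource "azurerm_template_deployment" "diagnostics" {
  name                = "diagnostics-deployment"
  resource_group_name = "${azurerm_resource_group.storageRG.name}"
  deployment_mode     = "Incremental"

  # ARM JSON embedded as a heredoc; the resources array is left empty here,
  # the diagnostic settings resources would go inside it.
  template_body = <<DEPLOY
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "workspaceId": { "type": "string" }
  },
  "resources": []
}
DEPLOY

  # Values passed to the ARM template's parameters section.
  parameters = {
    workspaceId = "${var.logAnalyticsWorkspaceId}"
  }
}
```

Even in this small form you can see both problems: the JSON sits inside the HCL as an opaque string, and the configuration of a single resource ends up split across two syntaxes.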

State

We mentioned above the benefits of using state files, however there are some downsides too. Firstly, you have an extra file that is critical to your deployments that you need to manage and keep safe. If you lose your state file or it gets overwritten you are really in a lot of trouble. You can use the “terraform refresh” command to refresh your state file against the existing infrastructure, which helps if someone has deleted or added a resource or changed its configuration, but you can’t get a whole new state file from this approach. The metadata that Terraform records in the state file exists nowhere else but the state file. Obviously you can work with this by storing your state file in a central location that is backed up etc., but you need to consider this.

Additionally, state is tied to your environment, so each time you want to deploy another instance of your infrastructure you’re going to need to manage another state file. The use of workspaces manages this for you in some respects, allowing you to easily swap between environments, but you still need to manage and look after those files. With ARM templates I know that so long as I deploy to the right resource group, everything should go where it needs to.

The second issue with state is around security. If you look inside a state file you will see that it records lots of information about your deployment in plain text, including things like variables, resource information etc. If these items are sensitive this can be a real issue. If you take the example we used above to create a storage account, the state file contains all the information about the storage account, including the storage keys in plain text.

Parameter Files

With ARM templates I can maintain my parameters in a parameter file and pass it into the deployment command at deploy time. If I want to use a different set of parameters I can just pass in a different file. This sort of flexibility doesn’t exist in Terraform for a couple of reasons:

State: If I have a different set of variables it will still use the same state file, and so expect to work with the existing state. This could cause significant confusion.
File merging: When running a Terraform deployment you don’t specify a template and parameter file like you do in ARM, you just run it from a specific folder. Terraform will then merge together all the .tf files in that folder to create your deployment. Variables are just stored in a .tf file like anything else, so to have two sets of variables I would have to have two folders and replicate all the files in them.

This is all solvable using workspaces, separate variable files passed in with the -var-file option, and possibly storing variables in an external data source, but it’s a different way of working that will take more time.
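
One way this could be approached, sketched under the assumption that each environment gets its own workspace, is to keep the values in .tfvars files and pass the matching one on the command line with -var-file. The file names and values below are hypothetical; the variable names are the ones used in the storage account example.

```hcl
# variables.tf: declare the variables used by the storage account example
variable "storageAccountType" {}
variable "storageAccountReplication" {}

# dev.tfvars (hypothetical file)
#   storageAccountType        = "Standard"
#   storageAccountReplication = "LRS"
#
# prod.tfvars (hypothetical file)
#   storageAccountType        = "Standard"
#   storageAccountReplication = "GRS"
#
# Create (and switch to) a workspace so the environment keeps its own state,
# then pass the matching variable file:
#   terraform workspace new dev
#   terraform apply -var-file=dev.tfvars
```

This still leaves you responsible for pairing the right workspace with the right file, which is part of the extra process overhead described above.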

Key Vault Access

One other benefit of parameter files in ARM is that you can reference an Azure Key Vault directly for your secrets. Assuming your deployment account has access to the vault, it will pull the secrets out directly and use them. This isn’t currently possible in Terraform. A solution to this would be to add an Azure Key Vault data source, but I’ve not seen any evidence that this is coming yet.

Error Handling

I’ve seen a bit of this in my tests, but there are lots of articles online that complain about the unhelpfulness of Terraform error messages and that the debug logs are often not of much use. ARM templates suffer in this area as well, however, so it’s certainly not unique to Terraform.

Conclusions

Terraform has some really interesting concepts. I can see some definite benefits over ARM templates in certain areas and I actually really enjoyed working with it. If I had a need to undertake a multi-cloud deployment or deploy on-prem I would definitely make use of it. As for Azure-only deployments, I am still a bit on the fence. Whilst I can see the benefits it could bring, especially around modules, data sources and dry runs, it also has some downsides and it is going to bring added complexity to the deployment process. I am certainly going to look at using it in a new green-field project to further understand its real world usage, but as for re-writing my existing deployments, that’s something that will take a bit more convincing. That’s just my opinion, and maybe for your use case this will be a significant win, especially if you’re just starting your automated deployment journey and are looking to decide on the way forward. Hopefully this article has helped provide a good overview of what Terraform brings to the table.