Terraform with YAML: Part 1

04 Apr, 2023
This post is the first in a series of three about supercharging your Terraform setup using YAML.

Terraform is one of the most common tools for provisioning infrastructure from code or configuration. However, it uses a custom language called HCL (HashiCorp Configuration Language). In this blog post we will explore how to replace as much HCL code as possible with YAML, and what the benefits of doing so are.

Why YAML?

One of the best properties of YAML, in my opinion, is the absence of syntax overhead. It allows you to concisely write down parameters and values. Let’s compare HCL and YAML by configuring some Google Pub/Sub topics and subscriptions in both:

locals {
  config = {
    topics = [
      {
        name = "my-topic"
        labels = {
          environment = "prod"
        }
        subscriptions = [
          {
            name          = "my-subscription"
            push_endpoint = "https://example.com/push"
          }
        ]
      }
    ]
  }
}

topics:
  - name: my-topic
    labels:
      environment: prod
    subscriptions:
      - name: my-subscription

As you can see, the difference in the number of lines is quite large. Of course, the gap narrows once we add the HCL code that imports the YAML configuration, but the savings quickly add up as your infrastructure grows.

Loading the YAML file and converting it to HCL is very easy. You can even do it in a single line using the yamldecode and file functions:

locals {
  config = yamldecode(file("config.yaml"))
}

The result is an HCL representation of the data in the YAML file, structured like the earlier HCL example.
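To make this concrete, here is a minimal sketch of reading a value back out of the decoded configuration. It assumes config.yaml contains the topics list shown earlier and sits in the directory where Terraform runs; the output name first_topic_name is hypothetical and only there for illustration.

```hcl
locals {
  # Decode the YAML file into a regular HCL value (maps, lists, strings).
  config = yamldecode(file("config.yaml"))
}

# Hypothetical output, used only to inspect the decoded data.
output "first_topic_name" {
  # Decoded values are accessed like any other HCL object or list.
  value = local.config.topics[0].name
}
```

Note that locals are always referenced with the local. prefix, so the decoded data is available as local.config everywhere in the module.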

For this particular example, the total number of lines of code using plain HCL is 18, of which 9 are purely syntax. The total number of lines using YAML, including the loading and parsing of the file, is 9. That’s a 50% reduction!

For more information about YAML decoding in Terraform, check the official documentation.

Another benefit of YAML over HCL is familiarity. Many engineers who do not work on infrastructure are not familiar with the HCL syntax and its quirks. YAML, on the other hand, is so simple and widely used that almost every engineer has used it at some point in their career. This means that if your repository contains YAML for infrastructure configuration, other types of engineers can easily adjust the configuration and deploy it (preferably using a CI/CD pipeline and proper code review). This provides a self-sufficient environment for application or data teams that work on top of the base infrastructure.

A simple example

Let’s build a fully working example:

project:
  id: my-project-id
  region: europe-west4
bucket:
  name: example-bucket-123
  location: EU
  force_destroy: true

terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "4.47.0"
    }
  }
}

locals {
  config = yamldecode(file("config.yaml"))
}

provider "google" {
  # Local values are referenced with the "local." prefix.
  project = local.config.project.id
  region  = local.config.project.region
}

resource "google_storage_bucket" "bucket" {
  name          = local.config.bucket.name
  location      = local.config.bucket.location
  force_destroy = local.config.bucket.force_destroy
}

In this example, we create and configure a Cloud Storage bucket. We use two separate root objects (project and bucket) to keep the config tidy and readable.
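Because the YAML file is free-form, a key may be missing from it. One possible way to handle that, sketched here as an assumption rather than as part of the original example, is Terraform's built-in try function, which returns a fallback value when the lookup fails. The default of false is chosen only for illustration.

```hcl
locals {
  config = yamldecode(file("config.yaml"))
}

resource "google_storage_bucket" "bucket" {
  name     = local.config.bucket.name
  location = local.config.bucket.location

  # Assumption for illustration: treat force_destroy as optional and
  # fall back to false when the key is absent from config.yaml.
  force_destroy = try(local.config.bucket.force_destroy, false)
}
```

With this in place, the force_destroy line can be dropped from config.yaml without breaking the plan.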

Using loops

Often we want to configure multiple resources, for example different storage buckets for different applications. Let’s adjust the example above to use a for_each loop:

project:
  id: my-project-id
  region: europe-west4
buckets:
  - name: example-bucket-123
    location: EU
    force_destroy: true
  - name: example-bucket-456
    location: US
    force_destroy: false

terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "4.47.0"
    }
  }
}

locals {
  config = yamldecode(file("config.yaml"))
}

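provider "google" {
  project = local.config.project.id
  region  = local.config.project.region
}

resource "google_storage_bucket" "bucket" {
  # for_each accepts a map or a set of strings, not a list of objects,
  # so we build a map keyed by bucket name.
  for_each = { for bucket in local.config.buckets : bucket.name => bucket }

  name          = each.value.name
  location      = each.value.location
  force_destroy = each.value.force_destroy
}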

As you can see, with minimal extra code, we can now provision as many buckets as we want.
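To see what such a loop iterates over without creating any resources, you can key the buckets by name in a local value and expose it as an output. This is a minimal sketch that assumes the config.yaml with the buckets list shown above; the names buckets_by_name and bucket_locations are hypothetical.

```hcl
locals {
  config = yamldecode(file("config.yaml"))

  # Key each bucket by its name; a map like this is a valid for_each argument.
  buckets_by_name = { for bucket in local.config.buckets : bucket.name => bucket }
}

# Hypothetical output, used only to inspect the map.
output "bucket_locations" {
  value = { for name, bucket in local.buckets_by_name : name => bucket.location }
}
```

For the two buckets above, this output maps example-bucket-123 to EU and example-bucket-456 to US. Keying by name also means that removing one bucket from the YAML file only affects that bucket's resource instance.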

Up next

Now we have a basic understanding of the benefits of using YAML configuration files in your Terraform code. In the next post in this series we will dive into more advanced topics, like how to deal with nested loops, creating multiple resource types from a single YAML configuration, and dynamic variable injection and templating. As a bonus we will look into validating YAML files using a schema to get early feedback on the configuration without having to run a Terraform plan.

Chris ter Beke
Chris ter Beke has been using Google Cloud Platform for over 8 years. He has built multiple SaaS platforms using technologies like Terraform, Kubernetes, Python, and automated CI/CD pipelines.