Terraform: Read CSV data and convert it into a list of maps

Updated: February 4, 2024 By: Guest Contributor

Introduction

Infrastructure as Code (IaC) has changed the landscape of DevOps and cloud engineering, enabling professionals to manage infrastructure with the same approach as application code. Terraform, a tool developed by HashiCorp, stands at the forefront of this movement, allowing users to safely and predictably create, change, and improve infrastructure. This tutorial takes a deep dive into a somewhat niche but incredibly powerful feature of Terraform: reading CSV data and converting it into a list of maps for further manipulation or resource creation.

Understanding the Basics

Before diving into the complexities of working with CSV data in Terraform, it’s crucial to grasp a few basic concepts. Terraform allows us to define infrastructure through a configuration language named HCL (HashiCorp Configuration Language). Within these configurations, we can leverage the csvdecode function to parse a CSV-formatted string into a list of maps. The first row of the CSV supplies the keys, and every decoded value is a string.

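First, let’s look at a simple use case where we have a CSV file with basic information about some virtual machines we would like to manage through Terraform. Note that there are no spaces after the commas: csvdecode follows RFC 4180 and would otherwise treat a leading space as part of the header name or value.

name,image,size
vm1,ubuntu,small
vm2,centos,large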

Basic Example

locals {
  csv_data = file("${path.module}/vm_details.csv")
  vm_list = csvdecode(local.csv_data)
}
output "vm_list" {
  value = local.vm_list
}

The code above does the following: it reads the CSV file vm_details.csv, decodes the CSV content into a list of maps where each row after the header represents a virtual machine with its attributes, and finally exposes this information via an output value called vm_list. Below is how the output might look:

[{
  "name": "vm1",
  "image": "ubuntu",
  "size": "small"
},
{
  "name": "vm2",
  "image": "centos",
  "size": "large"
}]
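
Once the rows are available as a list of maps, ordinary for expressions can filter or re-key them. The following is a minimal sketch, assuming local.vm_list from the basic example and a header row without stray spaces; the names large_vm_names, vms_by_name, and vm1_image are hypothetical and chosen only for illustration.

```hcl
locals {
  # Assumes local.vm_list from the basic example above.
  # Names of the VMs whose size column equals "large".
  large_vm_names = [for vm in local.vm_list : vm.name if vm.size == "large"]

  # Re-key the rows by name so a single VM can be looked up directly.
  # This assumes the name column is unique across rows.
  vms_by_name = { for vm in local.vm_list : vm.name => vm }
}

output "vm1_image" {
  # Every value decoded by csvdecode is a string.
  value = local.vms_by_name["vm1"].image
}
```

Re-keying by a unique column is also the shape that for_each expects, which becomes useful when creating resources from the data.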

Handling More Complex CSV Data

Suppose we have a CSV file with a more complex data structure, including nested details or even lists within some entries. The csvdecode function handles flat structures well, but it returns every field as a plain string, so hierarchical data needs an extra parsing step inside Terraform or, in some cases, preprocessing outside it.

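For the sake of simplicity, let’s assume our complex CSV looks something like this. Because the JSON contains commas and double quotes, each info field is wrapped in double quotes, and every double quote inside it is escaped by doubling it, as CSV requires (backslash escapes are not recognized):

name,info
vm1,"{""os"":""ubuntu"",""features"":[""web"",""db""]}"
vm2,"{""os"":""centos"",""features"":[""app"",""cache""]}"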

Dealing with this information directly in Terraform is not entirely straightforward, as csvdecode does not automatically parse JSON embedded in CSV fields. Handling this scenario requires first decoding the CSV data into a list of maps and then parsing the complex fields separately.

locals {
  raw_csv_data = file("${path.module}/complex_vm_details.csv")
  intermediate_list = csvdecode(local.raw_csv_data)
  vm_list = [for vm in local.intermediate_list : {
    name = vm.name,
    info = jsondecode(vm.info)
  }]
}
output "vm_list" {
  value = local.vm_list
}

The snippet above first decodes the CSV data into an intermediate list of maps. A for expression then walks that list and applies jsondecode to the JSON string in each info field, so that info becomes a structured object whose attributes (such as info.os and info.features) can be referenced directly.
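
With info decoded, nested lists such as features can be expanded into one entry per VM and feature pair. The following is a sketch only, assuming local.vm_list from the snippet above; the local name vm_features and the keys vm and feature are hypothetical.

```hcl
locals {
  # Assumes local.vm_list from the snippet above, where info is a decoded object.
  # The inner for expression yields one object per feature; flatten merges the
  # per-VM lists into a single flat list.
  vm_features = flatten([
    for vm in local.vm_list : [
      for feature in vm.info.features : {
        vm      = vm.name
        feature = feature
      }
    ]
  ])
}

output "vm_features" {
  value = local.vm_features
}
```

A flat list like this is convenient when each pair should later map to its own resource or rule.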

Advanced Usage: Dynamic Resource Creation

Once you have mastered reading and parsing CSV files into structured data, a natural progression is leveraging this capability to dynamically create resources. Imagine using the parsed CSV data to dynamically provision a fleet of virtual machines on a cloud platform.

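# cloud_vm is used here as a placeholder resource type; substitute the
# VM resource of whichever provider you actually use.
resource "cloud_vm" "vm_instance" {
  # for_each needs a map, so key each VM by its (unique) name.
  for_each = { for vm in local.vm_list : vm.name => vm }

  name  = each.value.name
  image = each.value.info.os
  # Falls back to "medium" when info has no size attribute.
  size  = lookup(each.value.info, "size", "medium")
}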

In the example above, the for_each argument tells Terraform to create a separate cloud_vm.vm_instance instance for each item in vm_list, keyed by the VM’s name, so the names must be unique. The name and image come from the parsed CSV data, while size falls back to "medium" because the sample info objects do not contain a size attribute. This shows how Terraform can automate infrastructure from data that lives outside the configuration.
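
To try the same pattern without a cloud provider, one option is the built-in terraform_data resource (available in Terraform 1.4 and later), which simply stores whatever is passed to its input argument. This is a sketch under that assumption, reusing local.vm_list from the complex CSV example; the names vm_record and vm_record_keys are hypothetical.

```hcl
resource "terraform_data" "vm_record" {
  # One instance per CSV row, keyed by the VM name.
  for_each = { for vm in local.vm_list : vm.name => vm }

  input = {
    name  = each.value.name
    image = each.value.info.os
    # try returns the fallback when info has no size attribute.
    size  = try(each.value.info.size, "medium")
  }
}

output "vm_record_keys" {
  # The instance keys mirror the name column of the CSV.
  value = [for key, record in terraform_data.vm_record : key]
}
```

Running terraform plan against this should show one planned instance per row, which makes it a low-risk way to check the CSV parsing before pointing for_each at real infrastructure.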

Conclusion

Through this tutorial, we’ve explored how Terraform can ingest CSV data, parse it into a list of maps, and further manipulate this data for infrastructure management and automation. Mastery of these techniques opens up a wealth of possibilities for infrastructure as code practitioners, empowering them to handle dynamic and complex configurations with ease.