Terraform offers several ways to manipulate lists, including removing duplicates. Knowing how to deduplicate a list can make your configurations clearer and more predictable. This guide explores three approaches, with implementation steps, code examples, and a discussion of the pros and cons of each.

Approach 1: Use `distinct` Function

The distinct function is a built-in Terraform function designed specifically for removing duplicate elements from a list. It’s straightforward and the most direct way to approach the problem of duplicates in lists.

Steps to Implement:

Define a list variable in your Terraform configuration.
Apply the distinct function to the list.
Use the output of the distinct function as needed in your Terraform configuration.

Code Example:

variable "mylist" {
  type    = list(string)
  default = ["apple", "banana", "apple", "orange"]
}

output "unique_list" {
  value = distinct(var.mylist)
}

Output:

["apple", "banana", "orange"]

Notes: The distinct function is efficient and straightforward. It keeps the first occurrence of each element and preserves the original order of the list. It is meant for lists only: a set never contains duplicates in the first place, and a map cannot have duplicate keys. Because the result is predictable, it is an excellent choice for most scenarios requiring deduplication.
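A common place where duplicates appear is when several lists are merged. The following is a minimal sketch of that case, combining distinct with the built-in concat function; the variable and output names are hypothetical and chosen only for illustration.

```hcl
variable "team_a_zones" {
  type    = list(string)
  default = ["us-east-1a", "us-east-1b"]
}

variable "team_b_zones" {
  type    = list(string)
  default = ["us-east-1b", "us-east-1c"]
}

output "all_zones" {
  # concat joins the two lists end to end; distinct then drops the
  # second "us-east-1b" and keeps the order of first appearance
  value = distinct(concat(var.team_a_zones, var.team_b_zones))
}
```

With the defaults above, the output should contain "us-east-1a", "us-east-1b", and "us-east-1c", in that order.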

Approach 2: `for` Expression with `toset`

In scenarios where more complex logic or conditions are needed for removing duplicates, a for expression in a local value, combined with the toset function, can be used. This method offers more flexibility at the cost of increased complexity.

Steps to Implement:

Define a local value that uses a for expression to iterate over your list.
Within the for expression, apply any custom logic, such as normalizing or filtering elements.
Convert the resulting list to a set with the toset function to remove duplicates.
Convert the set back to a list with tolist if your use case needs a list.

Code Example:

variable "mylist" {
  type    = list(string)
  default = ["apple", "banana", "pear", "apple"]
}

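locals {
  # The for expression is where custom per-element logic would go;
  # toset drops duplicates, and tolist turns the set back into a list
  unique_list = tolist(toset([for i in var.mylist : i]))
}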

output "unique_list_custom" {
  value = local.unique_list
}

Output:

["apple", "banana", "pear"]

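Notes: This method provides flexibility but adds complexity. It can handle deduplication scenarios beyond simple value matching, because elements can be transformed or filtered before duplicates are removed. The downsides are that it requires more lines of Terraform configuration, is less immediately intuitive than the distinct function, and does not preserve the original order: a set is unordered, and a set of strings comes back sorted lexically when converted to a list.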
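As an illustration of "beyond simple value matching", the sketch below treats values that differ only in case or surrounding whitespace as duplicates. It is one possible way to do this, not the only one; the names raw_names, normalized_names, and unique_names are hypothetical. It finishes with distinct rather than toset so that the order of first appearance is kept.

```hcl
variable "raw_names" {
  type    = list(string)
  default = ["Apple", " apple", "banana ", "BANANA"]
}

locals {
  # Normalize each element first, so that values differing only in case
  # or surrounding whitespace compare as equal
  normalized_names = [for name in var.raw_names : lower(trimspace(name))]

  # distinct keeps the first occurrence of each normalized value
  unique_names = distinct(local.normalized_names)
}

output "unique_names" {
  value = local.unique_names
}
```

With the defaults above, the output should be a list containing "apple" followed by "banana". Swapping distinct for tolist(toset(...)) would also work when order does not matter.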

Approach 3: Using External Data Processing

For extremely large lists or when the deduplication logic is too complex for Terraform’s native functions, processing the list outside of Terraform and then passing the deduplicated list into Terraform may be the best solution. This approach relies on external scripting or programming languages.

Steps to Implement:

Create a script in your preferred language (e.g., Python, Bash) that reads the original list, removes duplicates, and prints the result as a single JSON object whose values are all strings, which is the format the external data source requires.
Use Terraform’s external data source to call the script and capture its output.
Utilize the output from the external data source as needed in your Terraform configuration.

Code Example:

The deduplication itself happens in the external script. An example in Python that reads a list from a file called mylist.txt (one item per line) and prints the deduplicated result could look like this:

# Python script example
import json

# Read one item per line from mylist.txt
with open('mylist.txt', 'r') as file:
    lines = file.read().splitlines()

# dict.fromkeys drops duplicates while keeping first-seen order
unique_lines = list(dict.fromkeys(lines))

# Values in the result object must be strings, so the list is JSON-encoded
print(json.dumps({"items": json.dumps(unique_lines)}))

Notes: This method allows for leveraging the full power of a programming language, offering maximum flexibility and capability for complex deduplication. However, it adds an external dependency to the Terraform configuration: the script and its interpreter must be available wherever Terraform runs, which can complicate the deployment process.
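On the Terraform side, the script could be wired in roughly as follows. This is a sketch under stated assumptions, not a definitive setup: it assumes the script is saved as dedupe.py (a hypothetical name) next to the configuration, that python3 is on the PATH, and that the script prints a single JSON object of the form {"items": "<JSON-encoded list>"}.

```hcl
terraform {
  required_providers {
    external = {
      source = "hashicorp/external"
    }
  }
}

data "external" "dedupe" {
  # Hypothetical script name; run from the module directory so that
  # both dedupe.py and mylist.txt are found there
  program     = ["python3", "dedupe.py"]
  working_dir = path.module
}

output "unique_list_external" {
  # result is a map of strings, so the JSON-encoded list is decoded here
  value = jsondecode(data.external.dedupe.result.items)
}
```

Encoding the list as a JSON string inside the result object is a design choice that works around the data source's strings-only result; joining with a delimiter and using split would be an alternative when the items are known not to contain the delimiter.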

Conclusion

Removing duplicates from lists in Terraform can be accomplished through several methods, each with its advantages and limitations. The choice of method should be based on the specific requirements of your Terraform configuration, such as the complexity of the deduplication logic and the size of the list. For straightforward scenarios, the distinct function offers a simple and efficient solution. For more complex needs, a for expression with custom logic or external data processing might be more appropriate. Understanding these options allows you to write more efficient and effective Terraform configurations.
