Managing NSX ALB with Terraform part 1: Overview

ALB has a great UI, but it also comes with a complete (if convoluted) API and Terraform provider. I’m a big fan of using Infrastructure-as-Code principles, and that’s exactly what I’ll dive into in this series.

Background and assumptions

We’re currently working through a datacenter renewal. One of the cornerstones of this renewal has been automating where we can and introducing IaC principles along the way. A substantial component of this renewal is the replacement of our F5 loadbalancers with NSX ALB. As Terraform was our platform of choice for managing NSX, it goes without saying that we use it for ALB as well.

This might become a pretty long series of posts, so in order to keep it somewhat readable, I’ll make a few assumptions:

  1. You have a working knowledge of the NSX Advanced Loadbalancer
  2. You have a working knowledge of Terraform
  3. You have some basic knowledge of how to use Swagger/OpenAPI

I might dive into these at some point, but there are plenty of resources out there to get you going if you’re starting from scratch; I’ve listed a few below.

ALB

Terraform

Manual stuff

Our general mindset is that we want to push everything we reasonably can into code.

Unfortunately, some configuration items are just too complex or unique to make that worth it. We did not bother with the items below:

  • Clouds: This is essentially a one-off for us, and it’s a fairly complex object (for NSX clouds) to put into code
  • vCenter objects: See above
  • SE Interface networks: We’ve got some legacy setups where we needed specific configuration. Again, this was a one-off so it was not worth it for us.

Structuring the repository

It’s generally a good idea to spend some time working out your repository structure and general way of working with Terraform. This might seem trivial, but given that a Terraform folder generally means a Terraform state file, it becomes a pretty big pain in the butt to change later. In our deployment, we have the following tree:

├── Compute
├── Network
   ├── Gateways
   ├── Loadbalancer
      ├── General
         ├── certificates.tf
         ├── main.tf
         ├── provider.tf
         ├── segroups.tf
         ├── variables.tf
      ├── TenantA
         ├── PRD
            ├── main.tf
            ├── __FQDN1__.tf
            ├── __FQDN2__.tf
            ├── provider.tf
            ├── variables.tf
         ├── UAT
         ├── TST
         ├── DEV
      ├── TenantB
   ├── Segments
├── Security

We’ve set up a single repository to define all the virtual infrastructure in our deployment. Loadbalancing is nested under the Network folder, and we’ve split it up into a few folders:

  • General: This is all the global configuration that should be available to multiple tenants. It contains configuration for certificates, IPAM profiles and SE groups (see the sketch after this list).
  • Tenant: We split our environment into several tenants. This gives a clear overview of each tenant’s configuration and reduces the potential impact in case things go wrong.
  • PRD/UAT/TST/DEV: Within each tenant, we’ve created folders for every environment. This further reduces the blast radius of things going wrong.
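
To give an idea of what lives in General: segroups.tf holds our SE group definitions. The sketch below is a heavily trimmed, hypothetical example against the vmware/avi provider (the names and sizing are made up, and most arguments are omitted), just to show the shape of it:

  # Look up the cloud the SE group lives in (cloud name is made up)
  data "avi_cloud" "nsx" {
    name = "nsx-cloud"
  }

  # A minimal SE group definition; the real ones carry a lot more settings
  resource "avi_serviceenginegroup" "tenant_a" {
    name      = "se-group-tenant-a"
    cloud_ref = data.avi_cloud.nsx.id
    ha_mode   = "HA_MODE_SHARED_PAIR"  # elastic HA active/active
    max_se    = 4
  }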

You might have noticed that quite a few files keep coming back; this is part of our workflow. We always define all variables in variables.tf, all providers and the backend in provider.tf, and general configuration in main.tf. All other files are optional and only created when we feel main.tf would become too large or unwieldy. In this case, we split out the configuration of our SE groups and certificates in the General folder, and we created a TF file for every individual application.
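
For reference, a minimal provider.tf for one of these folders could look something like the sketch below. It assumes the vmware/avi provider; the version pin is only an example, and the backend block is left as a placeholder since remote state configuration is specific to your environment:

  terraform {
    required_providers {
      avi = {
        source  = "vmware/avi"
        version = "~> 22.1"   # example pin, match it to your controller version
      }
    }
    # backend "..." {}        # your remote state backend goes here
  }

  # Credentials and endpoint come from variables.tf, values arrive via tfvars or the pipeline
  provider "avi" {
    avi_controller = var.avi_controller
    avi_username   = var.avi_username
    avi_password   = var.avi_password
    avi_tenant     = var.avi_tenant
    avi_version    = var.avi_api_version  # ALB API version to talk to
  }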

The structure above works for us, but it has an important implication to consider: we need to update files in every folder on a semi-regular basis, usually when we want to upgrade either a module or the provider version. The big downside here is of course the amount of overhead; the upside is that you can test new versions on a subset of your configuration instead of pushing them to production immediately. You can work around this by using something like Terragrunt. We did not bother, as our environment is small enough to live with the overhead.
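
To make that concrete: when a new provider release comes out, the idea is to bump the pin in a single low-impact folder first and only roll it out to the other folders once a plan and apply there look clean. The version number below is a placeholder:

  # Network/Loadbalancer/TenantA/DEV/provider.tf -- trial the newer release here first
  terraform {
    required_providers {
      avi = {
        source  = "vmware/avi"
        version = "30.1.1"   # placeholder for the release under test
      }
    }
  }
  # PRD/UAT/TST keep their existing pin until this folder checks out.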

Modules

We use modules for stuff we deploy on a (semi-)regular basis. This gives us quite a few benefits:

  • Pretty granular control over the settings we want to enforce
  • Shorter, more readable config files
  • A way for other teams to deploy configuration without needing to understand all the intricate details of the platform

In our deployment, we created modules for the following items:

  • Pool
  • VSVIP
  • Virtual Service
  • All of the above (see the sketch below)
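
The module internals are for the next post, but to give an idea of what an application file in a tenant folder ends up looking like: it roughly boils down to a single call to that combined module. The module path and inputs below are made up for illustration:

  # One application per file (e.g. __FQDN1__.tf); module path and inputs are hypothetical
  module "app1_prd" {
    source = "../../../../Modules/alb-application"   # wraps pool, VSVIP and virtual service

    name     = "app1-prd"
    fqdn     = "app1.example.com"
    port     = 443
    servers  = ["10.0.10.11", "10.0.10.12"]
    se_group = "se-group-tenant-a"
  }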

Conclusion

This post has been pretty light on technical content, but I hope it gives a clear overview of where we start from before I dive into the details. In the next post, I’ll go over the modules.