Managing NSX ALB with Terraform part 2: The modules

It’s been a while! As promised, I’ll be taking a look at the various modules that we built in Terraform.

A recap on TF modules

As mentioned in part 1, I’m assuming that you know the basics of Terraform, so I won’t be going into too much detail here. In short: Terraform defines a module as any directory that contains .tf files. The directory where you run Terraform is considered the root module; any directories underneath that are considered child modules. Modules can be called in a similar fashion to regular resources or data sources.
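A minimal sketch of what calling a module looks like (the ./modules/example path and the input name are purely illustrative, not one of our actual modules):

module "example" {
  # Any directory with .tf files can be a child module; a registry source works too.
  source = "./modules/example"

  # Inputs are passed just like resource arguments.
  some_input = "value"
}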

Modules have many advantages, the major ones (for me) are:

  1. It allows you to build complex constructs and offer them for consumption to people that might not necessarily have the knowledge needed to build them from scratch.
  2. Tying into that, you can set up fixed values for many parameters in your modules. This allows deeper standardization and predictability.
  3. Modules allow you to adhere to the DRY (Don’t Repeat Yourself) principle. You only define them once and then feed the module only the necessary parameters.

We host our modules internally on TerraReg (which I highly recommend!). This gives us a central point which assists in discovery and deployment.
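To give an idea, consuming a module from a private registry such as TerraReg looks roughly like this (the hostname, namespace and version constraint are placeholders, not our real registry):

module "vsvip" {
  # <registry host>/<namespace>/<name>/<provider> addressing, plus a version constraint
  source  = "terrareg.example.internal/networking/vsvip/nsx"
  version = "~> 2.0"

  # ...module inputs go here
}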


The modules

We’ve created 6 modules for our purposes right now; I’ve listed most of them below. Most of our modules have two major versions that we host on TerraReg: one for our “legacy” vSphere cloud, and one for our newer NSX clouds. I’ll be focusing on the NSX versions here, but the differences are pretty minor.

Our module repositories are generally laid out like this:

├── Examples
│   └── Example
│       └── main.tf
├── CHANGELOG.md
├── main.tf
├── output.tf
├── README.md
└── variables.tf
  • Examples contains any examples we provide (duh); this is the formatting that the Terraform Registry Protocol (and therefore TerraReg) expects.
  • The CHANGELOG and README files should be self-explanatory :)
  • output.tf contains any outputs we might provide (names, IPs, IDs, …)
  • variables.tf contains any non-local variables we use.
  • main.tf contains the resources and datasources that the module creates and uses. It will also contain the providers we use.

Shared stuff

Since this post will be plenty long as it is, I’ll try to keep it concise and adhere to the DRY principle mentioned above. Every single one of the modules mentioned contains the components below in its variables.tf and main.tf files.

variables.tf

We request information regarding the region, environment, tenant and subnet. This is specific to our deployment, as we use all of this information to determine which cloud, VRF and T1 our resources need to refer to. We validate these inputs to make sure we can actually use them later on.

variable "region" {
  type = string

  validation {
    condition     = contains(["EMEA", "AMER", "APAC"], var.vsvip_region)
    error_message = "Location should be one of: AMER, APAC or EMEA."
  }
}

variable "environment" {
  type = string

  validation {
    condition     = contains(["PRD", "UAT", "TST", "DEV"], var.vsvip_environment)
    error_message = "Environment should be one of: PRD, UAT, TST, or DEV."
  }
}

variable "tenant" {
  type = string

  validation {
    condition     = contains(["A", "B", "C", "D", "E"], var.vsvip_tenant)
    error_message = "Tenant should be one of: A, B, C, D or E."
  }
}

main.tf

We’re obviously gonna need the avi provider. In addition, for any NSX-integrated clouds we’ll need the nsxt provider as well, so we can fetch some data about our T1 gateway.

terraform {
  required_providers {
    avi = {
      source  = "vmware/avi"
      version = ">=22.1.2"
    }
    nsxt = {
      source  = "vmware/nsxt"
      version = "3.3.1"
    }
  }
}

We’ll be fetching information from our cloud, VRF, placement network and the NSX T1 gateway. These will be relevant for almost every module. While it’s not the most efficient thing to fetch this data repeatedly, it does allow our modules to be self-contained, which we value. As you can see, we’re already using most of the variables we defined earlier to fetch the data we need.

data "avi_cloud" "nsxmgr" {
  name = "s-${lower(var.region)}-nsxmgr-${lower(var.environment)}"
}
data "avi_vrfcontext" "vrf" {
  name      = "T1-${var.region}-${var.tenant}"
  cloud_ref = data.avi_cloud.nsxmgr.id
}

data "avi_network" "placement_network" {
  name      = "LS-${var.region}-${var.tenant}-${var.environment}-VIP"
  cloud_ref = data.avi_cloud.nsxmgr.id
}

data "nsxt_policy_tier1_gateway" "tier1" {
  display_name = "T1-${var.region}-${var.tenant}"
}

VSVIP

Let’s start off with the easiest one: the VSVIP module. This module will create a VSVIP on ALB (obviously). Our deployment leverages the Infoblox integration, which allows us to dynamically fetch the IP address and create DNS records for the VSVIP.

variables.tf

First off, we want a name for our VIP. Our internal naming convention demands that these start with “vip_”, hence the validation.

variable "vsvip_name" {
  type = string

  validation {
    condition     = startswith(var.vsvip_name, "vip_")
    error_message = "Name must start with vip_"
  }
}

Next up, we request information regarding the subnet. We use this information to determine which placement network our VIP needs to be in.

variable "vsvip_subnet" {
  type = string

  validation {
    condition     = can(cidrnetmask("${var.vsvip_subnet}/23"))
    error_message = "Must be valid IPv4 VIP subnet"
  }
}

Finally, we request a list of all FQDNs that should be registered for this VIP. We once again validate the input, this time to make sure each entry is in a valid domain.

variable "vsvip_dnsinfo" {
  type = list(string)  

  validation {
    condition     = alltrue([for fqdn in var.vsvip_dnsinfo : endswith(fqdn, "myfirstdomain.com") || endswith(fqdn, "myseconddomain.net")])
    error_message = "All FQDNs must be in myfirstdomain.com or myseconddomain.net"
  }
}

main.tf

Finally, there’s the resource itself. For the VSVIP module, this is a single resource. You’ll notice here that we’re using the data objects we referred to before, as well as all the rest of the input variables described above.

First, we’ll set some basic parameters.

resource "avi_vsvip" "vsvip" {
  name            = var.vsvip_name
  cloud_ref       = data.avi_cloud.nsxmgr.id
  tier1_lr        = "/infra/tier-1s/${data.nsxt_policy_tier1_gateway.tier1.id}"
  vrf_context_ref = data.avi_vrfcontext.vrf.id

Next up is all the information that AVI needs to go and fetch an IP from Infoblox. The network_ref refers to the NSX Logical Segment name, while the addr itself selects the network address of the subnet.

  vip {
    vip_id                = 1
    enabled               = true
    auto_allocate_ip      = true
    auto_allocate_ip_type = "V4_ONLY"
    ipam_network_subnet {
      network_ref = data.avi_network.placement_network.id
      subnet {
        ip_addr {
          addr = var.vsvip_subnet
          type = "V4"
        }
        mask = 23
      }
    }
  }

Finally, we have a dynamic block to create any DNS records we might want for our VSVIP.

  dynamic "dns_info" {
    for_each = toset(var.vsvip_dnsinfo)
    content {
      type = "DNS_RECORD_A"
      fqdn = dns_info.key
    }
  }
}

output.tf

As you may have guessed, this module creates nothing more than a VSVIP. Since this won’t do much by itself, we’ll probably need some way to reference the VIP elsewhere in code. That’s where the output comes in: we chose to just output the VSVIP ID, as sketched in the example after the code.

output "vsvip_id" {
  value = avi_vsvip.vsvip.id
}
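To put it all together, here’s a rough sketch of what a root module calling this module could look like; the source path and every value below are placeholders rather than our real registry or naming:

module "vsvip_myapp" {
  source = "./modules/vsvip" # placeholder source, could equally be a TerraReg address

  region      = "EMEA"
  environment = "PRD"
  tenant      = "A"

  vsvip_name    = "vip_myapp"
  vsvip_subnet  = "10.10.10.0"
  vsvip_dnsinfo = ["myapp.myfirstdomain.com"]
}

# Elsewhere in the root module, the ID is then available as:
#   module.vsvip_myapp.vsvip_id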

Pool (& Health Monitor)

Next up, the pool module. As you probably guessed, this module will create a pool. But wait, there’s more! In addition, the module will also create a health monitor, based on the inputs.

variables.tf

As you can see below, there isn’t anything special here. We request the basic information you need for a pool. Two of these might warrant some more explanation. The reencrypt bool will set an SSL profile on the pool to make sure traffic to a backend TLS application is encrypted. The healthmonitor object will be used for HTTP(S) monitors later; it expects a path and the credentials to access this path.

variable "pool_name" {
  type = string

  validation {
    condition     = startswith(var.pool_name, "pool_")
    error_message = "Name must start with pool_"
  }
}

variable "pool_port" {
  type = number
}

variable "pool_servers" {
  type = map(string)
}

variable "pool_persistence" {
  type = bool
}

variable "pool_reencrypt" {
  type = bool

  default = false
}

variable "healthmonitor" {
  type = object({
    path     = string
    user     = string
    password = string
  })
  default = null
}

main.tf

Locals

These warrant some more explanation. We’ve defined two locals here. First, there’s types, a simple map to tie 80 to HTTP and 443 to HTTPS. Next, there’s hm_type, where we use the types local as a lookup map to see whether we want to create an HTTP, HTTPS or TCP health monitor. These locals are essentially what allows us to distinguish between the types of traffic we’re expecting. It obviously does not cover every scenario (there is no UDP, there are no other protocol-specific health monitors), but it does cover more than 90% of our pools.

locals {
  types = {
    80  = "HTTP"
    443 = "HTTPS"
  }
  hm_type = lookup(local.types, var.pool_port, "TCP")
}
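To make the fallback behaviour concrete, this is how hm_type resolves for a few example ports (illustrative only):

# pool_port = 80   -> "HTTP"
# pool_port = 443  -> "HTTPS"
# pool_port = 8443 -> "TCP" (no match in the map, so lookup() returns the default)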

Data

For the pool, we also grab two more data sources: the persistence profile (we always default to cookie) and the SSL profile used for encrypting traffic to the backend. It’s obviously possible to customize these and even allow them as inputs, but we had no need for that.

data "avi_applicationpersistenceprofile" "persistence" {
  name = "System-Persistence-Http-Cookie"
}

data "avi_sslprofile" "sslprofile" {
  name = "System-Standard-PFS"
}

Resources

First up, we define the namesake of the module: the venerable pool. Just like the VSVIP, we have some basic parameters, followed by the pool-specific information. For the pool we have default_server_port, which defines the port traffic will be forwarded to. Next, there is the servers dynamic block, which loops through all servers that were in the input. We then set lb_algorithm to define how we want our traffic load balanced, and use health_monitor_refs to link back to the health monitor we’re creating.

Finally, we’re using some conditional expressions to decide whether or not we’re setting persistency and SSL profiles.

resource "avi_pool" "pool" {
  name      = var.pool_name
  cloud_ref = data.avi_cloud.nsxmgr.id
  tier1_lr  = "/infra/tier-1s/${data.nsxt_policy_tier1_gateway.tier1.id}"
  vrf_ref   = data.avi_vrfcontext.vrf.id

  default_server_port = var.pool_port
  dynamic "servers" {
    for_each = var.pool_servers
    content {
      hostname = servers.key
      ip {
        addr = servers.value
        type = "V4"
      }
    }
  }
  lb_algorithm        = "LB_ALGORITHM_LEAST_CONNECTIONS"
  health_monitor_refs = [avi_healthmonitor.hm.id]

  application_persistence_profile_ref = var.pool_persistence ? data.avi_applicationpersistenceprofile.persistence.id : null
  ssl_profile_ref                     = var.pool_reencrypt ? data.avi_sslprofile.sslprofile.id : null
}

Next up is the health monitor! We start off by defining the type of health monitor, based on the locals mentioned before. We also set the parameters regarding health check intervals, the number of checks and timeouts.

resource "avi_healthmonitor" "hm" {
  name = "HM_${local.hm_type}_${var.pool_name}"
  type = "HEALTH_MONITOR_${local.hm_type}"

  receive_timeout   = 4
  send_interval     = 10
  successful_checks = 2
  failed_checks     = 2

Next up, we have a conditional dynamic authentication block. If our health monitor is defined as HTTP or HTTPS, we’ll create the block; otherwise we don’t. Check here for a more detailed explanation.

  dynamic "authentication" {
    for_each = local.hm_type == "HTTP" || local.hm_type == "HTTPS" ? [1] : []

    content {
      username = var.healthmonitor.user
      password = var.healthmonitor.password
    }
  }

Next up, we once again use conditional dynamic blocks to create our specific monitor settings. Based on the hm_type local, we’ll create an HTTP, HTTPS or TCP monitor. In the case of the HTTP(S) monitors, we’ll set up a path and basic authentication, and we’ll mark any 2XX code as healthy. For the TCP monitor, we only need to define that we’re using TCP half-open (which reduces the performance impact).

  dynamic "http_monitor" {
    for_each = local.hm_type == "HTTP" ? [1] : []
    content {
      auth_type          = "AUTH_BASIC"
      exact_http_request = false
      http_request       = "GET /${var.healthmonitor.path} HTTP/1.1\r\nHost: ${split("_", var.pool_name)[1]}\r\n"
      http_response      = "HTTP/1.1 200 OK"
      http_response_code = ["HTTP_2XX"]
    }
  }

  dynamic "https_monitor" {
    for_each = local.hm_type == "HTTPS" ? [1] : []
    content {
      auth_type          = "AUTH_BASIC"
      exact_http_request = false
      http_request       = "GET /${var.healthmonitor.path} HTTP/1.1\r\nHost: ${split("_", var.pool_name)[1]}\r\n"
      http_response      = "HTTP/1.1 200 OK"
      http_response_code = ["HTTP_2XX"]

      ssl_attributes {
        ssl_profile_ref = data.avi_sslprofile.sslprofile.id
      }
    }
  }

  dynamic "tcp_monitor" {
    for_each = local.hm_type == "TCP" ? [1] : []

    content {
      tcp_half_open = true
    }
  }
}
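To round off the pool module, a sketch of what calling it could look like; the source path, server IPs and credentials below are placeholders, not values from our environment:

module "pool_myapp" {
  source = "./modules/pool" # placeholder source

  region      = "EMEA"
  environment = "PRD"
  tenant      = "A"

  pool_name = "pool_myapp"
  pool_port = 443 # resolves to an HTTPS health monitor via the locals above
  pool_servers = {
    "web01" = "10.10.20.11"
    "web02" = "10.10.20.12"
  }
  pool_persistence = true
  pool_reencrypt   = true

  healthmonitor = {
    path     = "healthz"
    user     = "monitor"
    password = "not-a-real-password"
  }
}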

Conclusion

I was aiming to describe all modules here, but I’m afraid that would make this post a bit too long, so I’ll split them up a bit further. Next up: the virtual service modules and the “all of the above” module.