Magitech.ca written by Wyatt Zacharias

Managing a Route53 DNS zone with Terraform

As we continue our migration of resources into AWS, one item on the list was our public DNS zone and all of the records within. Many cloud-native solutions have a specific dependency on DNS: they are otherwise completely dynamic and rely on DNS being updated automatically whenever these dynamic resources are created or destroyed. This is easily accomplished with a service such as Route53, which provides full automation control through its API to adjust DNS records as resources come and go. That works well when an automation tool deploys resources and creates the appropriate DNS records at the same time, but what about a mixed environment, where manually created, traditional DNS records for static resources live alongside records deployed automatically with other resources?

Since we use Terraform at work to dynamically deploy our AWS resources, I was looking for a simple but scalable method for maintaining a set of static DNS records with Terraform. This turns out to be surprisingly difficult, because Terraform doesn't handle certain dynamic input sets well. We use the aws_route53_record resource in Terraform to create DNS records in Route53. This resource accepts a list of strings for the data portion of the record, but only a single string for the record's name. We could use Terraform's count directive to create one of these resources per input record, but this is where Terraform falls short in its dynamic data handling: resources created with a count are assigned to a specific index in the state file for that resource. Any change to the data set, such as an insertion or removal, reorders those indexes and causes Terraform to destroy and recreate the DNS records whose positions shifted.
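As a minimal sketch of the count-based approach (the variable names and parallel-list layout here are my own illustration, not our actual module, and assume a zone_id input; Terraform 0.11-style syntax):

  variable "record_names"  { type = "list" }   # e.g. ["record1.example.com", ...]
  variable "record_values" { type = "list" }   # e.g. ["203.0.113.10", ...]

  resource "aws_route53_record" "static" {
    count   = "${length(var.record_names)}"
    zone_id = "${var.zone_id}"
    name    = "${element(var.record_names, count.index)}"
    type    = "A"
    ttl     = 300
    records = ["${element(var.record_values, count.index)}"]
  }

Each record lands in state as aws_route53_record.static[0], static[1], and so on. Delete the first entry from the lists and every remaining record shifts down one index, so Terraform plans a destroy and recreate for all of them.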

Essentially there is no scalable way to deploy a large set of DNS records with Terraform in a single apply. Instead, I decided the best approach would be to deploy each record individually. This is slightly less efficient, since it creates an individual state file for each record, but it resolves all of the other issues and makes every DNS record independent of the others. Instead of a single tfvars file containing all the records, we create a subfolder representing each DNS record, with a tfvars file inside that contains the input data for only that one record. We then run an apply-all on the parent folder, which runs the same DNS record module once per tfvars file, each run creating a single DNS record.

It looks something like this:

env_name/
  region/
    dns_records/
      record1.example.com/
        terraform.tfvars
      record2.example.com/
        terraform.tfvars
      record3.example.com/
        terraform.tfvars

Running an apply-all (a Terragrunt command) in the dns_records folder runs the module against the tfvars file in each subdirectory. Running an apply-all from the top-level folder instead would also run every other tfvars file in the tree.
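Each subfolder's terraform.tfvars then carries only the inputs for its one record, and the shared module creates it without any count indexing. A sketch with hypothetical variable names (again Terraform 0.11-style, not our exact code):

  # dns_records/record1.example.com/terraform.tfvars
  record_name   = "record1.example.com"
  record_type   = "A"
  record_ttl    = 300
  record_values = ["203.0.113.10"]

  # the module's single resource
  resource "aws_route53_record" "this" {
    zone_id = "${var.zone_id}"
    name    = "${var.record_name}"
    type    = "${var.record_type}"
    ttl     = "${var.record_ttl}"
    records = "${var.record_values}"
  }

Because each record has its own state, adding or removing a record is just adding or removing a folder; nothing else in the tree is touched.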

Checkpoint Client VPN Certificate Enrollment Fails

Checkpoint's Mobile Access VPN blade is what Checkpoint calls their client VPN function, where external users can tunnel their traffic into the corporate network and access the services within. A variety of authentication mechanisms can be configured for clients to use when they connect, including their Active Directory credentials. Most organisations won't consider a single user credential sufficient authentication for allowing connectivity into internal networks from an external source, so the Mobile Access blade supports a second factor of authentication, in this case a client certificate signed by a trusted certificate authority. By default the Checkpoint management server runs a certificate authority, which can be used to issue certificates to clients that will be trusted by the gateways. Instead of using CSR files or exporting certificates as PKCS#12 files, Checkpoint VPN clients include a certificate enrollment function that uses a one-time key to securely connect to the CA, then request and install the certificate on the client in a single step. Additionally, once a certificate is issued it can be securely renewed through the same process, without the need for a new one-time key each time.
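Pieced together from observation (Checkpoint does not document the flow in detail), enrollment looks roughly like this:

  1. The administrator creates the certificate on the management server, which generates a one-time registration key.
  2. The key is delivered to the user out of band.
  3. The client connects to the VPN site and presents the key to the certificate authority.
  4. The CA issues the certificate and the client installs it, all in that single exchange.
  5. Later renewals authenticate with the existing certificate, so no new key is needed.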

Issues with certificate enrollment arise when client machines are connected internally and traffic takes an abnormal route between the client, the gateway running the Mobile Access blade, and the management server that performs the certificate authority functions. When a client starts enrollment externally, it initiates communication with the external IP address configured for the VPN site, which is the gateway running the Mobile Access blade, on port 18264. The gateway then DNATs the traffic to the address of the management server, which performs the enrollment and communicates back to the client, with the return traffic routed through the gateway. This is where internal issues arise: if the client's source address is directly reachable from the management server, or the return route otherwise bypasses the gateway, the reply traffic is out of state and never makes it back to the client correctly.
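To make the asymmetry concrete, the broken internal flow looks roughly like this (addresses are placeholders):

  client 10.1.1.50    ->  VPN site external IP, port 18264
  gateway             ->  DNATs to management server 10.2.2.10
  management server   ->  replies straight to 10.1.1.50, bypassing the gateway
  client              ->  discards the reply: wrong source address, out of state

The gateway never sees the reply, so it cannot reverse its DNAT, and from the client's perspective the enrollment never completes.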

This is a fairly standard case of asymmetric routing, but it is difficult to isolate since the steps of enrollment are not documented by Checkpoint, leaving the administrator stuck looking through packet captures trying to understand the flow of traffic and where it goes wrong. In our case the problem was resolved by creating a new Hide NAT rule with the original source being our workstation subnets, the original destination being the IP of the management server, and the original destination port being 18264. The translated source should be set to the internal IP address of the gateway that faces the management server, with the NAT method set to Hide. This tricks the management server into sending traffic back to the gateway instead of directly to the client, and because the gateway reverses the translation on the return path, the return traffic is properly routed back to the client.
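Laid out as the rule's fields (object names are placeholders), it looks like this:

  Original source:         Workstation_Subnets
  Original destination:    Mgmt_Server
  Original service:        TCP 18264
  Translated source:       Gateway_Internal_IP (Hide)
  Translated destination:  = Original
  Translated service:      = Original

With only the source hidden behind the gateway's internal address, the management server addresses its replies to the gateway, which reverses the translation and routes the response back to the real client.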

See Also:

I couldn't find any Checkpoint SK that exactly described this issue, but here are a couple that pointed me in the right direction (support entitlement required): SK109993 and SK114266.

Checkpoint IPv6 BGP Configuration Challenges and RouteD Crashes

Recently at work we finished going through all the ARIN bureaucracy and acquired our own ASN along with IPv4 and IPv6 space. We did this so that we'd have the ability to easily run multiple upstream internet providers, potentially across different geographic locations. Our intention was then to use our Checkpoint 5400 firewalls to peer BGP with our upstream providers. In theory this should have worked fine: the 5400s ship with 8GB of RAM, and the full IPv4 and IPv6 route tables need about 4GB on their own.

Checkpoint firewalls support BGP by default; the routed daemon handles all of the dynamic routing protocols running on the firewall. An important thing to note about Checkpoint firewalls and their routing architecture is that at the base level it is the Linux kernel running the box, and it is the kernel's IP forwarding that actually moves traffic through it. BGP peering is configured through the Gaia web UI and can also be configured from the CLI.
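For reference, a minimal clish peer configuration looks roughly like the following; this is from memory of the Gaia Advanced Routing guide, with placeholder ASNs and addresses, so verify the exact syntax against your release:

  set as 64512
  set bgp external remote-as 65000 on
  set bgp external remote-as 65000 peer 2001:db8:ffff::1 on
  save config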

I started with configuring the IPv6 peering first, as I was still waiting on final approval from ARIN before they would issue the IPv4 space we had requested. Right off the bat I ran into some odd compatibility issues with IPv6. For starters, our upstream ISP initially wanted to use a /127 point-to-point link address for peering. This practice is fully defined in RFC 6164 and supported by most major "real" network hardware vendors (Cisco, Juniper, etc.), but not by Checkpoint, which meant I had to ask our upstream to change to a /126 peering link. The second initial issue is that Checkpoint does not support MD5 authentication for BGP peering, so again I had to ask our upstream to allow an unauthenticated peering session.
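The difference between the two prefix lengths is just the number of addresses on the point-to-point link (documentation prefixes used as placeholders):

  2001:db8:ffff::/127  ->  2 addresses: ::0 and ::1, one per peer
  2001:db8:ffff::/126  ->  4 addresses: ::0 through ::3

On the /126 the peers typically take ::1 and ::2, leaving the remaining two addresses unused.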

With the initial incompatibilities out of the way I got our session established and started receiving routes from our upstream. This is where things got weird. Our route table would quickly climb to just over 60,000 routes, the full IPv6 routing table, and then, within 60 seconds of the session establishing and the routes being received, the BGP session would die, dropping all of the routes from the table and falling back to the active state. This repeated continuously every 60-120 seconds. The routed daemon's log at /var/log/routed_messages was full of error messages similar to this:

  Dec 19 14:03:39 ERROR: KRT SEND ADD 2620:1c0:61:: mask ffff:ffff:ffff:: router 2606:xxxx:xxxx::5: Cannot allocate memory

This was quite surprising, as current memory utilization was less than 50%, and monitoring with top showed no major variation or increase in memory allocation while the daemon was trying to run. After opening a support case with Checkpoint and getting nowhere, I was able to surmise that the error messages were actually referring to the kernel's route table (the "KRT" in the message), which is separate from the dynamic route table maintained by routed but is still updated with all of the routes from routed's dynamic table. With that in mind I went searching for Linux kernel issues with the IPv6 route table, and right away found other users encountering the same inability to handle any significant number of IPv6 routes in the Linux kernel. I confirmed this by running routed on the firewall, waiting for the route table to fill up, and then trying to add an IPv6 route to the kernel manually:

  [Expert@gateway:0]# ip -6 route add 2001:db8::1 via 2606:1234::1
  RTNETLINK answers: Cannot allocate memory

So it is the kernel that cannot allocate memory, not the routed daemon. One more Google search turned up the kernel parameter net.ipv6.route.max_size, which caps the size of the kernel's IPv6 route table. Note that each protocol version has its own parameter, and the two can hold different values: on our gateways the IPv6 limit defaulted to a measly 4096, while net.ipv4.route.max_size was set to 4194304. Thankfully Checkpoint allows root access to their firewalls, so it's an easy fix to raise the IPv6 limit to match the IPv4 one:

  [Expert@gateway:0]# sysctl -w net.ipv6.route.max_size=4194304

After that, routed immediately stabilised and the established session stayed online.
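One caveat: sysctl -w only changes the running kernel, so the new limit will not survive a reboot on its own. On a stock Linux system you would persist it in /etc/sysctl.conf; I have not verified where Gaia expects persistent sysctl overrides, so treat this as an untested sketch and confirm with Checkpoint support before relying on it:

  # untested: persist the larger IPv6 route table limit across reboots
  echo "net.ipv6.route.max_size = 4194304" >> /etc/sysctl.conf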

It's clear that Checkpoint QA never tested their IPv6 functionality to the extent that IPv4 is tested, as this would easily have been caught. The value the IPv4 limit is set to is non-standard, so at some point they knew they needed to accommodate large routing tables; they just never made the same accommodation for IPv6.