Don’t let your Azure Routes bite you

In Azure there are a couple of situations where you need to be careful with your User-Defined Routes, and after looking at some of them with other engineers we decided that a blog post around those might be interesting. So let’s go!

The most typical situation where you use User-Defined Routes (UDRs) in Azure is in combination with a network firewall, which can be either Azure Firewall or a firewall from one of Microsoft’s partners. If you are using an Azure Firewall, it is useful to decompose the Azure Firewall resource into its individual components. These are normally hidden from the user, but at times knowing about them can be the difference between solving a problem in 30 minutes or in 1 week.

Essentially, an Azure Firewall is composed of two or more instances with two load balancers in front of them: a public one (to get traffic from the public Internet), and an internal one (to get traffic from on-premises or the rest of the Azure Virtual Network environment). Something like this:

Internal structure of Azure Firewall

The following diagram illustrates the design we are going to investigate: a hub and spoke model, with some common services in the hub, and an application served out of a virtual machine in the spoke. Route tables in the different subnets send traffic to the firewall, to make sure that every IP flow is inspected:

Overall design

Flow: Application to Common Services

Let’s start by having a look at some of the flows in the diagram, for example the connectivity between the spokes and the Common Services in the hub. These Common Services might be anything such as Domain Controllers, DNS servers or file shares, and connections are typically initiated from the spokes. Let’s walk through this flow:

Flow: spoke to common services (hub)

1. The spoke VM (172.16.1.4) sends a packet to the common services system (192.168.1.10). That packet will be intercepted by the UDR for 192.168.0.0/16, and it will be forwarded to the Azure Firewall’s internal Load Balancer.
Source IP: 172.16.1.4
Destination IP: 192.168.1.10

2. The internal Azure Load Balancer in the firewall will pick one of the firewall instances. In this particular example, it has picked Instance 0.

3. The Azure Firewall instance will inspect the packet, and if it decides to forward it (instead of dropping it), it will forward it to the destination IP. Note that by default the Azure Firewall will not source-NAT traffic to private IP addresses (RFC 1918), so there is no change in the packet addresses:
Source IP: 172.16.1.4
Destination IP: 192.168.1.10

4. The common services VM answers, and Azure sends the packet to the Azure Firewall’s internal load balancer as per the User-Defined Route configured in the CSRT (Common Services Route Table) for 172.16.0.0/16:
Source IP: 192.168.1.10
Destination IP: 172.16.1.4

5. Now the load balancer needs to choose the firewall instance to process the packet. Any given Azure Load Balancer will choose the same backend instance for the same packet attributes (even if they are reversed). In this case, the source and destination IP addresses are the same as in step 2 but reversed (the source is now the destination, and the destination is now the source), so the Azure Load Balancer will pick Instance 0 again.

6. The Azure Firewall Instance 0 already knows about that TCP flow, since it saw the initial packet, and it will allow it and forward it to its destination in the spoke.
Source IP: 192.168.1.10
Destination IP: 172.16.1.4
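
The property that steps 2 and 5 rely on can be sketched in a few lines of Python. This is only a toy model: the hash function and the `pick_instance` helper are assumptions made for illustration, and they are not the algorithm that Azure Load Balancer actually uses. The point is just that a hash which ignores the direction of the packet sends both directions of a flow to the same instance:

```python
import hashlib


def pick_instance(src_ip, dst_ip, instance_count):
    """Toy direction-independent hash (hypothetical, not Azure's real one)."""
    # Sorting the endpoints makes A->B and B->A produce the same key.
    key = "|".join(sorted([src_ip, dst_ip]))
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % instance_count


# Step 2: spoke VM to common services
forward = pick_instance("172.16.1.4", "192.168.1.10", 2)
# Step 5: common services back to the spoke VM
reverse = pick_instance("192.168.1.10", "172.16.1.4", 2)

print(forward == reverse)  # True: both directions land on the same instance
```
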

So everything good, right? There is one problem with this design though, that the following flow will explain.

Flow: Internet to Application

Now we will look at an Internet user trying to reach the application in the spoke through the Azure Firewall’s public IP:

Asymmetric traffic between the public Internet and a spoke

1. In this case, the traffic is initiated from the public Internet. The application client will reach the public IP address of the Azure Firewall (AzFWPIP in the diagram). That public IP address is actually configured in the public Azure Load Balancer.
Source IP: ClientPIP
Destination IP: AzFWPIP

2. The public Load Balancer will pick one of the Azure Firewall instances, in this particular example Instance 1 (I believe that the public ALB is configured with Floating IP aka Direct Server Return, but this is not relevant for this blog):
Source IP: ClientPIP
Destination IP: AzFWPIP

3. The Azure Firewall instance will destination-NAT the traffic (assuming here that a DNAT rule is configured in the Azure Firewall). Additionally, it will source-NAT the packet to its own private IP address to make sure that return traffic comes back to the same instance. Finally, it will forward it to the application in the spoke:
Source IP: 192.168.100.7
Destination IP: 172.16.1.4

4. And here we finally have the problem: when the virtual machine in the spoke answers, the return traffic will go to 192.168.100.7. This destination is covered by the UDR for 192.168.0.0/16, so Azure will send it to the Azure Firewall’s internal load balancer:
Source IP: 172.16.1.4
Destination IP: 192.168.100.7

5. This internal load balancer is different from the public load balancer that saw the inbound packet from the client, so it might well pick a different firewall instance than the one used for the inbound flow. If we assume the worst case, the internal load balancer will now pick Azure Firewall Instance 0. This instance has not seen the inbound packet, so it will drop the reply as an answer to a question that was never asked (a SYN-ACK packet for which no SYN packet has been seen). Oh no.
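
To make the failure concrete, here is a minimal Python sketch of two stateful firewall instances. The `FirewallInstance` class and its methods are hypothetical and hugely simplified (real firewalls track ports, protocols and TCP state), but they capture the rule that matters here: a reply is only forwarded by the instance that saw the original packet.

```python
class FirewallInstance:
    """Toy stateful firewall: remembers the flows it has opened itself."""

    def __init__(self, name):
        self.name = name
        self.flows = set()

    def open_flow(self, src_ip, dst_ip):
        # Record the outbound packet (after NAT) in the connection table.
        self.flows.add((src_ip, dst_ip))

    def handle_reply(self, src_ip, dst_ip):
        # A reply is valid only if this instance saw the reverse direction.
        return "forward" if (dst_ip, src_ip) in self.flows else "drop"


instance0 = FirewallInstance("Instance 0")
instance1 = FirewallInstance("Instance 1")

# Step 3: Instance 1 source-NATs to its own IP and forwards to the spoke VM.
instance1.open_flow("192.168.100.7", "172.16.1.4")

# Step 5, worst case: the internal load balancer hands the reply to Instance 0.
print(instance0.handle_reply("172.16.1.4", "192.168.100.7"))  # drop
# If the reply had reached Instance 1 directly, it would have been allowed.
print(instance1.handle_reply("172.16.1.4", "192.168.100.7"))  # forward
```
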

How can we avoid this? The problem here is that we put the UDR for the whole hub vnet prefix 192.168.0.0/16 in the AppRT route table (the route marked in red in the above diagram), and this includes the IP addresses of the individual firewall instances. However, when a packet arrives sourced from one of the firewall instances, we want the reply to go straight back to that instance, not through the Azure Load Balancer. There are two ways of doing this:

  • You could add more specific routes to the route table as exceptions. You might be tempted to use the next-hop type “VnetLocal” (also known as “Virtual Network” in the portal), but that is only valid for the local vnet, and not for a peered one. Instead, here you would have to add each individual IP address for the firewall instances:
    192.168.100.6/32 -> 192.168.100.6 (next-hop type “Virtual Appliance”)
    192.168.100.7/32 -> 192.168.100.7 (next-hop type “Virtual Appliance”)
  • The previous approach might be acceptable if you know in advance the private IP addresses of your firewall, but in some cases (like with the Azure Firewall) you will not know them. You can of course guess, but an alternative is configuring a smaller UDR that does not cover the AzureFirewallSubnet. Instead of sending the whole hub IP space to the Azure Firewall, you actually only need to specify the subnets where your common services are located, as the following diagram suggests:
Fixing asymmetric routing with smaller UDRs

As you can see here, when the application in the spoke answers to the firewall, the return packets will not be intercepted by any route, and will flow normally through the VNet peering back to the instance that originated them.

The flow between the spoke and the common services subnet in the hub should still work, as the UDR for 192.168.1.0/24 covers those systems. If there were further subnets in the hub that need to be exposed to the spokes through the firewall, you would have to add subsequent routes to the route table in the spoke.
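
The effect of both fixes can be checked with a small longest-prefix-match sketch in Python. The `next_hop` helper and the next-hop labels are assumptions for illustration only, and the model ignores everything else Azure considers when selecting a route; it simply picks the most specific UDR that contains the destination, and falls back to the system route when no UDR matches:

```python
import ipaddress

SYSTEM_ROUTE = "system route (VNet peering)"


def next_hop(route_table, destination):
    """Return the next hop of the most specific UDR matching destination."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in route_table.items()
        if dest in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return SYSTEM_ROUTE
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# Original AppRT: the whole hub prefix goes to the firewall.
original = {"192.168.0.0/16": "AzFW internal LB"}
# Fix 1: /32 exceptions for each firewall instance.
exceptions = {
    **original,
    "192.168.100.6/32": "192.168.100.6",
    "192.168.100.7/32": "192.168.100.7",
}
# Fix 2: a smaller UDR that only covers the common services subnet.
narrow = {"192.168.1.0/24": "AzFW internal LB"}

print(next_hop(original, "192.168.100.7"))    # AzFW internal LB
print(next_hop(exceptions, "192.168.100.7"))  # 192.168.100.7
print(next_hop(narrow, "192.168.100.7"))      # system route (VNet peering)
print(next_hop(narrow, "192.168.1.10"))       # AzFW internal LB
```

With the original table the reply to 192.168.100.7 is pulled into the internal load balancer, while both fixes let it reach the instance that originated the packet.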

Flow: On-premises to Application

Let’s change gears a bit. Let’s assume that there is an on-premises site connected via a site-to-site VPN tunnel to our virtual network. In order to support traffic from on-premises to our application a route table is needed in the GatewaySubnet where the Azure VPN Gateway is located:

Further UDRs required to inspect traffic from on-premises

We will not do the full packet walk here, but let’s highlight a couple of points:

  • When traffic leaves the VPN Gateway (packet 2), the UDR in the gateway subnet for 172.16.0.0/16 will send it to the Azure Firewall. The other UDR in the gateway subnet for 192.168.0.0/16 has been included to inspect traffic from on-premises to the Common Services subnet.
  • The Azure Firewall will not source-NAT traffic, so the inbound and outbound packets will traverse the same internal Load Balancer, and will have identical source and destination addresses and ports (but reversed). Hence the same Azure Firewall instance will be picked for packets 3 and 6 (instance 1 in this example).

So far so good! Imagine however in this case that the customer decides to deploy an ExpressRoute gateway into the design. Once the ExpressRoute Gateway has been deployed (even before connecting it to an ExpressRoute circuit), flows to on-premises will break. Why? What happened?

There is a longer explanation involved, but take my TL;DR here: if you have both types of gateways in an Azure Virtual Network, they need to speak internal BGP to each other. However, the route for 192.168.0.0/16 in the GatewaySubnet was sending traffic between the two gateways to the Azure Firewall, which will probably drop it:

UDRs can break the required iBGP adjacency between gateways

When that happens, the ExpressRoute gateway will not learn the prefixes from the VPN gateway, and will not inject anything into the Vnet. As a consequence, the Azure Firewall will not know how to go back to the on-premises network.

The fix for this is similar to the one for the asymmetric flow from earlier:

  • You can exclude the GatewaySubnet’s prefix from the existing UDR with a more specific route pointing to VnetLocal (this time you can use it because the destination is in the same Vnet)
  • Or you can make the initial route to 192.168.0.0/16 more specific so that it does not contain the GatewaySubnet. For example, reducing it to just the CommonServices subnet (192.168.1.0/24)

I prefer the second method, as I reflect in this diagram:

Traffic between gateways should stay in the GatewaySubnet
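
A quick way to catch this kind of mistake before it bites is to check whether any UDR prefix overlaps the subnet it should leave alone. The following Python sketch does that with the standard `ipaddress` module. The `risky_routes` helper is hypothetical, and the GatewaySubnet prefix 192.168.255.0/27 is an assumption made up for this example, since the actual prefix depends on your design:

```python
import ipaddress


def risky_routes(udr_prefixes, protected_subnets):
    """List (UDR prefix, subnet name) pairs where the UDR covers the subnet."""
    findings = []
    for prefix in udr_prefixes:
        udr = ipaddress.ip_network(prefix)
        for name, subnet in protected_subnets.items():
            if udr.overlaps(ipaddress.ip_network(subnet)):
                findings.append((prefix, name))
    return findings


# Assumed prefix for illustration only; replace it with your own.
protected = {"GatewaySubnet": "192.168.255.0/27"}

# Route table with the whole hub prefix: gateway-to-gateway traffic is caught.
print(risky_routes(["172.16.0.0/16", "192.168.0.0/16"], protected))
# [('192.168.0.0/16', 'GatewaySubnet')]

# Route table reduced to the CommonServices subnet: nothing overlaps.
print(risky_routes(["172.16.0.0/16", "192.168.1.0/24"], protected))
# []
```
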

And that’s it! In this blog I went over some intricacies of UDRs in Azure that can cause problems with SNAT and with the coexistence of VPN and ExpressRoute gateways. I hope it helped!
