Using Route Server to firewall onprem traffic with an NVA

In a previous blog post we had a setup with a Network Virtual Appliance (NVA) for Internet egress and hybrid connectivity based on Azure Virtual Network Gateways. There is another fairly typical use case for traffic between on-premises and Azure, firewalling it with an NVA:

Use case: Firewall onprem traffic with an NVA

In some situations customers will combine the roles of VPN termination and firewalling in the same NVA. However, I haven’t seen this pattern very often, since achieving an active/active high availability design in that scenario can be quite challenging. Hence I will not cover it in this post, but it should be similar to the design described in Azure Route Server multi-region design and Connecting your NVAs to ExpressRoute with Azure Route Server.

The topology I will use in this post is depicted next. I am not using a hub and spoke design because it does not make a huge difference for testing, and I am only using a site-to-site VPN connection, since the results should be valid for ExpressRoute too:

Test topology

Note that I am using a different ASN in the VPN Gateway than the default 65515. The interaction between the Route Server and the VNGs can be either iBGP or eBGP; I am not sure whether the behavior is any different with iBGP.

Our goal is sending traffic between Azure and the on-premises network through the NVA, without having to configure UDRs in the VM subnet (which could be either in the same VNet or in a spoke VNet).

In this post we will investigate some of the peculiarities of the Azure Route Server:

  • If the NVA advertises the exact VNet prefix to the Azure Route Server, it won’t be advertised to onprem (nor will it be plumbed into the VNets, but we saw that in previous posts already).
  • The same happens with longer VNet prefixes, so more specific routes cannot be used to attract onprem-to-Azure traffic. Note that this has an important impact if you have shared services subnets in the hub (not explored in this post).
  • There can be a routing loop situation because of prefixes injected by the Route Server in the NVA subnet, which is not easy to solve. In this article we will use an overlay tunnel over the Azure Virtual Network Gateways to solve this problem.

Initial Situation

The Azure NVA advertises the onprem and Azure prefixes over BGP to the RS:

❯ az network routeserver peering list-learned-routes -n nva --routeserver rs -g $rg --query 'RouteServiceRole_IN_0' -o table
LocalAddress    Network      NextHop    SourcePeer    Origin    AsPath    Weight
--------------  -----------  ---------  ------------  --------  --------  --------
10.1.1.4        10.1.0.0/16  10.1.2.4   10.1.2.4      EBgp      65001     32768
10.1.1.4        10.2.0.0/16  10.1.2.4   10.1.2.4      EBgp      65001     32768

Do these advertised networks have priority over the system routes? Let’s check the Azure VM:

❯ az network nic show-effective-route-table -n azurevmVMNic -g $rg -o table
Source                 State    Address Prefix    Next Hop Type          Next Hop IP
---------------------  -------  ----------------  ---------------------  -------------
Default                Active   10.1.0.0/16       VnetLocal
VirtualNetworkGateway  Active   10.2.0.0/16       VirtualNetworkGateway  10.1.2.4
Default                Active   0.0.0.0/0         Internet
Default                Active   10.0.0.0/8        None
Default                Active   100.64.0.0/10     None
Default                Active   192.168.0.0/16    None
Default                Active   25.33.80.0/20     None
Default                Active   25.41.3.0/25      None

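Yes for the onprem route! The 10.2.0.0/16 prefix now points to the NVA (10.1.2.4), while the VNet prefix 10.1.0.0/16 is still VnetLocal. This makes sense, because the AS path is shorter than what is coming from the VPN GW (“65001” for the route from the NVA, “65501 65002” for the route from the VPN Gateway).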
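The AS path comparison can be illustrated with a small Python sketch. It models only the "shortest AS path wins" step and none of the other BGP best-path criteria. The dictionary keys are labels I made up for illustration, and the AS numbers are the ones appearing in the outputs of this post.

```python
# Candidate routes for 10.2.0.0/16 as the Route Server might see them.
# Keys are illustrative labels, values are AS paths (nearest AS first).
candidates = {
    "NVA 10.1.2.4": [65001],
    "VPN Gateway": [65501, 65002],
}

def shortest_as_path(routes):
    """Return the label of the route with the fewest ASNs in its path."""
    return min(routes, key=lambda name: len(routes[name]))

print(shortest_as_path(candidates))  # prints "NVA 10.1.2.4"
```

Real BGP implementations evaluate several attributes before and after AS path length, so treat this as a reading aid rather than a description of what Azure does internally.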

Let’s check what the Azure VPN Gateway gets from the Route Server:

❯ az network vnet-gateway list-learned-routes -n vpngw -g $rg -o table
Network      Origin    SourcePeer    AsPath             Weight    NextHop
-----------  --------  ------------  -----------------  --------  ---------
10.1.0.0/16  Network   10.1.0.4                         32768
10.2.2.4/32  Network   10.1.0.4                         32768
10.2.2.4/32  IBgp      10.1.0.5                         32768     10.1.0.5
10.2.0.0/16  EBgp      10.2.2.4      65002              32768     10.2.2.4
10.2.0.0/16  IBgp      10.1.0.5      65002              32768     10.1.0.5
10.1.0.5/32  EBgp      10.2.2.4      65002              32768     10.2.2.4
10.2.0.0/16  EBgp      10.1.1.5      65515-65001        32768     10.1.1.5
10.2.0.0/16  EBgp      10.1.1.4      65515-65001        32768     10.1.1.4

10.1.0.0/16  Network   10.1.0.5                         32768
10.2.2.4/32  Network   10.1.0.5                         32768
10.2.2.4/32  IBgp      10.1.0.4                         32768     10.1.0.4
10.2.0.0/16  EBgp      10.2.2.4      65002              32768     10.2.2.4
10.2.0.0/16  IBgp      10.1.0.4      65002              32768     10.1.0.4
10.1.0.4/32  EBgp      10.2.2.4      65002              32768     10.2.2.4
10.2.0.0/16  EBgp      10.1.1.5      65515-65001        32768     10.1.1.5
10.2.0.0/16  EBgp      10.1.1.4      65515-65001        32768     10.1.1.4

First of all, the 10.1.0.0/16 prefix is not even learnt over BGP: its origin is “Network” (I think of that as a static route in the VNGs). Second, the 10.2.0.0/16 route that the NVA advertised to the Route Server arrives here too, with the AS path 65515-65001. In this case the route coming directly from onprem has a shorter AS path, but this is a dangerous game: if the route from onprem had a longer AS path, it might lose against the route from the NVA, and we would be looking at a routing loop.

So we have two problems:

  • How to attract traffic for the VNet into the NVA? We could try with longer prefixes, and split the /16 route into two /17 prefixes.
  • We shouldn’t advertise the onprem prefix to the VPN gateway. We can use the no-advertise BGP community as described in a previous post.
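The split from the first bullet can be computed with Python's standard ipaddress module. This is just a quick way of deriving the two halves of the VNet prefix used in this post, not part of the NVA configuration itself.

```python
import ipaddress

# VNet prefix used in this post
vnet = ipaddress.ip_network("10.1.0.0/16")

# Split into the two prefixes that are one bit longer (/17)
halves = [str(net) for net in vnet.subnets(prefixlen_diff=1)]

print(halves)  # prints ['10.1.0.0/17', '10.1.128.0/17']
```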

After both changes, here is what the VPN Gateway learns:

❯ az network vnet-gateway list-learned-routes -n vpngw -g $rg -o table
Network      Origin    SourcePeer    AsPath             Weight    NextHop
-----------  --------  ------------  -----------------  --------  ---------
10.1.0.0/16  Network   10.1.0.5                         32768
10.2.2.4/32  Network   10.1.0.5                         32768
10.2.2.4/32  IBgp      10.1.0.4                         32768     10.1.0.4
10.2.0.0/16  EBgp      10.2.2.4      65002              32768     10.2.2.4
10.2.0.0/16  IBgp      10.1.0.4      65002              32768     10.1.0.4
10.1.0.4/32  EBgp      10.2.2.4      65002              32768     10.2.2.4
10.2.0.0/16  EBgp      10.1.1.4      65515-65501-65002  32768     10.1.1.4

10.1.0.0/16  Network   10.1.0.4                         32768
10.2.2.4/32  Network   10.1.0.4                         32768
10.2.2.4/32  IBgp      10.1.0.5                         32768     10.1.0.5
10.2.0.0/16  EBgp      10.2.2.4      65002              32768     10.2.2.4
10.2.0.0/16  IBgp      10.1.0.5      65002              32768     10.1.0.5
10.1.0.5/32  EBgp      10.2.2.4      65002              32768     10.2.2.4
10.2.0.0/16  EBgp      10.1.1.5      65515-65501-65002  32768     10.1.1.5

As you can see, the Route Server is no longer advertising the NVA’s 10.2.0.0/16 route (the last 10.2.0.0/16 entry actually originates onprem, not in the Azure NVA, as its AS path ending in 65002 shows).

However, the more specific 10.1.0.0/17 and 10.1.128.0/17 prefixes are not there! Let’s verify that the Route Server is getting them from the NVA:

❯ az network routeserver peering list-learned-routes -n nva --routeserver rs -g $rg --query 'RouteServiceRole_IN_0' -o table
LocalAddress    Network        NextHop    SourcePeer    Origin    AsPath    Weight
--------------  -------------  ---------  ------------  --------  --------  --------
10.1.1.4        10.1.128.0/17  10.1.2.4   10.1.2.4      EBgp      65001     32768
10.1.1.4        10.1.0.0/17    10.1.2.4   10.1.2.4      EBgp      65001     32768
10.1.1.4        10.2.0.0/16    10.1.2.4   10.1.2.4      EBgp      65001     32768

So even if the Route Server gets these routes from the NVA, it will not readvertise them to onprem, or plumb them into any Azure subnet. Hence the only possibility is configuring UDRs in the GatewaySubnet. Configuring UDRs in the hub VNet is still in line with our main goal: not having to use UDRs for the Azure application workloads.
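The behaviour observed so far can be summarised in a few lines of Python. This is only a model of what the outputs above show (prefixes equal to or contained in the VNet range are dropped), not a description of how the Route Server is implemented. The function name is hypothetical, and BGP communities such as no-advertise are not modelled.

```python
import ipaddress

VNET = ipaddress.ip_network("10.1.0.0/16")

def passes_vnet_filter(prefix, vnet=VNET):
    """Model of the observed behaviour: a learned prefix is dropped
    when it is equal to, or more specific than, the VNet prefix."""
    return not ipaddress.ip_network(prefix).subnet_of(vnet)

# Prefixes the Route Server learned from the NVA in the output above
learned = ["10.1.128.0/17", "10.1.0.0/17", "10.2.0.0/16"]
for prefix in learned:
    print(prefix, passes_vnet_filter(prefix))
# prints:
# 10.1.128.0/17 False
# 10.1.0.0/17 False
# 10.2.0.0/16 True
```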

Routing Loop in the NVA NIC

However, we have a more critical issue. Let’s check the effective routes in the NVA NIC:

❯ az network nic show-effective-route-table -n nva-nic0 -g $rg -o table
Source                 State    Address Prefix    Next Hop Type          Next Hop IP
---------------------  -------  ----------------  ---------------------  -------------
Default                Active   10.1.0.0/16       VnetLocal
VirtualNetworkGateway  Active   10.2.0.0/16       VirtualNetworkGateway  10.1.2.4
Default                Active   0.0.0.0/0         Internet
Default                Active   10.0.0.0/8        None
Default                Active   100.64.0.0/10     None
Default                Active   192.168.0.0/16    None
Default                Active   25.33.80.0/20     None
Default                Active   25.41.3.0/25      None

Houston, we have a problem: traffic from Azure to onprem will be sent to the NVA. The NVA will then send it out to the Azure network, but when the packets hit the NIC, Azure SDN will return them to the NVA, because the 10.2.0.0/16 route in the NVA’s own NIC points to 10.1.2.4 (the NVA itself). The result is a routing loop.

How can we override that 10.2.0.0/16 route in the NVA subnet, so that it points to the Azure VNG instead of to the NVA? We could use the flag --disable-bgp-route-propagation in the route table, but that would remove every gateway-learned route, and traffic to onprem would hit the 10.0.0.0/8 route to None.
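To see why disabling route propagation blackholes the traffic, here is a minimal longest-prefix-match sketch over a subset of the effective routes shown above. It only models prefix length (Azure also takes the route source into account when prefixes tie), and the destination 10.2.0.10 is a hypothetical onprem host.

```python
import ipaddress

def lookup(routes, destination):
    """Longest-prefix match: return (prefix, next_hop) of the most
    specific route that contains the destination address."""
    dst = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), next_hop)
        for prefix, next_hop in routes
        if dst in ipaddress.ip_network(prefix)
    ]
    net, next_hop = max(matches, key=lambda m: m[0].prefixlen)
    return str(net), next_hop

# Subset of the NVA NIC effective routes
with_bgp = [
    ("10.1.0.0/16", "VnetLocal"),
    ("10.2.0.0/16", "VirtualNetworkGateway 10.1.2.4"),
    ("0.0.0.0/0", "Internet"),
    ("10.0.0.0/8", "None"),
]
# Same table with the gateway-learned routes removed
without_bgp = [r for r in with_bgp if not r[1].startswith("VirtualNetworkGateway")]

print(lookup(with_bgp, "10.2.0.10"))     # prints ('10.2.0.0/16', 'VirtualNetworkGateway 10.1.2.4')
print(lookup(without_bgp, "10.2.0.10"))  # prints ('10.0.0.0/8', 'None')
```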

Using a UDR with next-hop VirtualNetworkGateway might work, but I would not recommend it for two reasons. First, this type of route only works in the absence of an ExpressRoute Gateway. If you only have a VPN Gateway today it would work, but if six months from now somebody decides to deploy an ExpressRoute Gateway, it will not be pretty.

The second reason is that I am not sure how these routes with next-hop type VNG work. In my tests for this post I have observed they are effective only if BGP is disabled in the VPN S2S connection, which makes their usage pretty limited: every Azure prefix would have to be configured statically in the on-prem appliances.

Overlay over Azure VNGs

If we don’t want the onprem prefix programmed into the NVA’s NIC to come from the VPN, one option is to only advertise it from the NVA in the first place. How can we do that? By creating a tunnel between the onprem device and the Azure NVA:

Design with overlay

The tunnel will encapsulate all traffic flowing between the two appliances, hiding the actual source and destination IP addresses. Azure will only see the tunnel endpoint IP address of each appliance, but not those of the actual endpoints.
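The encapsulation idea can be sketched in Python, here using the VXLAN header format from RFC 7348. This is a conceptual illustration only: the appliances perform the real encapsulation, the VNI value and the inner addresses are made up, and the outer addresses are the NVA (10.1.2.4) and the onprem appliance (10.2.2.4) from this post.

```python
import struct

VXLAN_UDP_PORT = 4789  # IANA-assigned destination port for VXLAN

def vxlan_header(vni):
    """Build the 8-byte VXLAN header: flags with the I bit set,
    24 reserved bits, the 24-bit VNI and 8 more reserved bits."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!II", 0x08 << 24, vni << 8)

def encapsulate(inner_frame, vni, outer_src, outer_dst):
    """Describe the outer packet that the underlay (Azure) gets to see."""
    return {
        "outer_src": outer_src,
        "outer_dst": outer_dst,
        "udp_dst_port": VXLAN_UDP_PORT,
        "payload": vxlan_header(vni) + inner_frame,
    }

# Placeholder bytes standing in for a frame between hypothetical hosts
inner = b"inner frame: 10.1.10.4 -> 10.2.0.10"
packet = encapsulate(inner, vni=1, outer_src="10.1.2.4", outer_dst="10.2.2.4")

print(packet["outer_src"], "->", packet["outer_dst"])  # prints "10.1.2.4 -> 10.2.2.4"
```

Azure routes on the outer header only, which is why the 10.2.2.4/32 route to the VPN gateway is all that is needed to carry the tunnel.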

We don’t need BGP in the VPN connection any more, and after disabling it, the 10.2.2.4/32 route will appear in the effective routes:

❯ az network nic show-effective-route-table -n nva-nic0 -g $rg -o table
Source                 State    Address Prefix    Next Hop Type          Next Hop IP
---------------------  -------  ----------------  ---------------------  -------------
Default                Active   10.1.0.0/16       VnetLocal
VirtualNetworkGateway  Active   10.2.2.4/32       VirtualNetworkGateway  10.1.0.4
VirtualNetworkGateway  Active   10.2.2.4/32       VirtualNetworkGateway  10.1.0.5
Default                Active   0.0.0.0/0         Internet
Default                Active   10.0.0.0/8        None
Default                Active   100.64.0.0/10     None
Default                Active   192.168.0.0/16    None
Default                Active   25.33.80.0/20     None
Default                Active   25.41.3.0/25      None 

After creating a VXLAN tunnel between the onprem and Azure appliances, and enabling BGP on top of that tunnel, the onprem routes appear in the NVA’s NIC too (although they wouldn’t be needed, since traffic to onprem would flow inside of the VXLAN tunnel as described earlier):

❯ az network nic show-effective-route-table -n nva-nic0 -g $rg -o table
Source                 State    Address Prefix    Next Hop Type          Next Hop IP
---------------------  -------  ----------------  ---------------------  -------------
Default                Active   10.1.0.0/16       VnetLocal
VirtualNetworkGateway  Active   10.2.0.0/16       VirtualNetworkGateway  10.1.2.4
VirtualNetworkGateway  Active   10.2.2.4/32       VirtualNetworkGateway  10.1.0.4
VirtualNetworkGateway  Active   10.2.2.4/32       VirtualNetworkGateway  10.1.0.5
Default                Active   0.0.0.0/0         Internet
Default                Active   10.0.0.0/8        None
Default                Active   100.64.0.0/10     None
Default                Active   192.168.0.0/16    None
Default                Active   25.33.80.0/20     None
Default                Active   25.41.3.0/25      None

More importantly, the VM’s effective routes now point to the NVA in Azure for the onprem prefixes:

❯ az network nic show-effective-route-table -n azurevmVMNic -g $rg -o table
Source                 State    Address Prefix    Next Hop Type          Next Hop IP
---------------------  -------  ----------------  ---------------------  -------------
Default                Active   10.1.0.0/16       VnetLocal
VirtualNetworkGateway  Active   10.2.0.0/16       VirtualNetworkGateway  10.1.2.4
VirtualNetworkGateway  Active   10.2.2.4/32       VirtualNetworkGateway  10.1.0.4
VirtualNetworkGateway  Active   10.2.2.4/32       VirtualNetworkGateway  10.1.0.5
Default                Active   0.0.0.0/0         Internet
Default                Active   10.0.0.0/8        None
Default                Active   100.64.0.0/10     None
Default                Active   192.168.0.0/16    None
Default                Active   25.33.80.0/20     None
Default                Active   25.41.3.0/25      None

So we have achieved our goal! Most networking vendors support VXLAN-based tunnels, and they can be a very effective tool in your toolbox. Moreover, SDWAN technologies often rely on this type of functionality, so this pattern should be familiar to you if you have been exposed to those types of designs.
