Azure Route Server: super powers for your Network Virtual Appliance

Amongst the many Ignite announcements this year, my favourite is the new Azure Route Server, in public preview now, since it has the potential to dramatically change how networks are built in Azure. If you are thinking “here he comes with his BGP thing again”… You are right! Let me explain:

In public cloud there are typically two ways of doing things: the “managed” way, and the “DIY” (Do-It-Yourself) way. The managed way is architected to simplify operations, while in the DIY way you are willing to take on some complexity in exchange for higher flexibility. While I am a fan of the KISS principle, I have seen many customers going the DIY way in networking, because keeping all options open is of paramount importance for them in this area.

So far, if you wanted to go the DIY way with Azure Networking there were some serious limitations. For example, if you wanted to deploy your own firewall, you needed to rely on everybody configuring User-Defined Routes to send traffic to that firewall. Or if you deployed your own VPN appliance, you had no way of injecting your site-to-site or point-to-site prefixes into the Virtual Network, again relying on static UDRs. Let me go over some of those use cases a bit more in detail, and how the Azure Route Server will help here.

Firewall NVAs

Say you deploy the firewall of your vendor of choice. How are you going to attract traffic to it? You need to configure UDRs in all of your subnets. Let me repeat that: all of your subnets. If you forget one subnet, or if somebody overwrites your UDRs, chances are some traffic unexpectedly bypasses your firewall.

Enter Route Server: your firewall appliance can advertise some routes to the Route Server (even a 0.0.0.0/0), and those routes would be plumbed into each and every subnet of the VNet where the Route Server is deployed (typically your hub VNet), as well as into the directly peered spokes. So no chance for accidental misconfigurations! You might still want to couple that with some policies/RBAC configs to prevent people from adding UDRs, since those would override routes learnt from the Route Server.
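To make that precedence concrete, here is a minimal Python sketch of how a subnet's effective route could be picked: longest prefix match first, and for equal prefixes a UDR beats a BGP-learnt route, which beats a system route. The `Route` class, the `effective_route` function and the IP addresses are hypothetical names and values for illustration, not an Azure API.

```python
import ipaddress
from dataclasses import dataclass

# For routes with the same prefix, the lower number wins.
SOURCE_PRIORITY = {"udr": 0, "bgp": 1, "system": 2}


@dataclass(frozen=True)
class Route:
    prefix: str    # e.g. "0.0.0.0/0"
    next_hop: str  # illustrative next hop label or IP
    source: str    # "udr", "bgp" (learnt from the Route Server) or "system"


def effective_route(routes, destination):
    """Return the route a packet to `destination` would follow, or None."""
    dest = ipaddress.ip_address(destination)
    candidates = [r for r in routes if dest in ipaddress.ip_network(r.prefix)]
    if not candidates:
        return None
    # Longest prefix first, then source priority as the tie-breaker.
    return min(
        candidates,
        key=lambda r: (-ipaddress.ip_network(r.prefix).prefixlen,
                       SOURCE_PRIORITY[r.source]),
    )


subnet_routes = [
    Route("0.0.0.0/0", "Internet", "system"),
    Route("0.0.0.0/0", "10.0.1.4", "bgp"),  # default route from the firewall NVA
]
print(effective_route(subnet_routes, "8.8.8.8").next_hop)  # 10.0.1.4

# Somebody adds a UDR for the same prefix: the firewall is now bypassed.
subnet_routes.append(Route("0.0.0.0/0", "Internet", "udr"))
print(effective_route(subnet_routes, "8.8.8.8").next_hop)  # Internet
```

The second call shows why restricting who can create UDRs still matters even when the firewall advertises its routes dynamically.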

Easier Active/Passive NVA Clusters

There are essentially two ways of building an NVA cluster in Azure: either using Azure Load Balancers, or UDRs. The UDR way is mostly used in active/passive clusters, where you have a UDR pointing to the active NVA. “Something” would monitor the active NVA, and change that UDR to point to the passive NVA if need be.

This design has two main drawbacks. The first is deciding what that “something” monitoring the availability of the primary NVA should be: you could have external components (which should be redundant too), or an internal agent in the secondary NVA, but either way you introduce additional complexity. The second is the time it takes to detect the failure of the primary NVA, send the Azure API call to change the UDR, and have the change propagated to the platform, which often results in convergence times of over two minutes.

But now we can influence the VNet routing without UDRs! The primary NVA could send a preferred route to the Azure Route Server, and the secondary NVA a worse route for the same prefixes, for example spiced with some AS-path-prepending. In a normal scenario the primary NVA route would attract the traffic, but when that route disappears, the secondary route kicks in in a matter of seconds, with no agents or API calls involved.
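The failover logic can be sketched in a few lines of Python. This is a deliberately simplified model that compares AS path length only (real BGP best-path selection looks at more attributes), and the `Advertisement` class, the `best_paths` function, the ASN 65001 and the next-hop IPs are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advertisement:
    prefix: str
    next_hop: str
    as_path: tuple  # sequence of AS numbers; prepending makes it longer


def best_paths(advertisements):
    """Pick one advertisement per prefix, preferring the shortest AS path."""
    best = {}
    for adv in advertisements:
        current = best.get(adv.prefix)
        if current is None or len(adv.as_path) < len(current.as_path):
            best[adv.prefix] = adv
    return best


primary = Advertisement("0.0.0.0/0", "10.0.1.4", (65001,))
# The secondary NVA prepends its own ASN twice to look less attractive.
secondary = Advertisement("0.0.0.0/0", "10.0.1.5", (65001, 65001, 65001))

# Normal operation: both NVAs advertise, the primary wins.
print(best_paths([primary, secondary])["0.0.0.0/0"].next_hop)  # 10.0.1.4

# The primary fails and its route is withdrawn: the secondary takes over.
print(best_paths([secondary])["0.0.0.0/0"].next_hop)  # 10.0.1.5
```

Nothing has to watch the primary NVA from the outside here: the withdrawal of its route is the failure signal.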

VPN to ExpressRoute Transit Routing

Transit routing between VPN and ExpressRoute has been a limitation in many situations, where for example you want P2S users connected over VPN to Azure to get access to corporate resources over ExpressRoute. This functionality is available in Virtual WAN, but to my knowledge it is not available when using standalone Azure Virtual Network Gateways for VPN.

With the Azure Route Server you could terminate those P2S tunnels in the appliance of your favorite vendor, and announce those prefixes to the ExpressRoute gateway. The VPN NVA would learn the ExpressRoute prefixes from the Route Server as well, effectively providing transit routing between VPN and ExpressRoute.

In the paragraph above I have used the example of P2S because I see it quite often, but the same would be valid for S2S tunnels, where you have some branches connected over VPN, some over ExpressRoute, and you need branch-to-branch connectivity.

No More Static Routes in your VPN NVA

Another piece of configuration that had to be maintained manually is the static routing in VPN NVAs. Let me explain: when you have a Site-to-Site VPN, the Azure side often advertises the Azure prefixes to the on-premises side. Up to now there was no way for the NVA on the Azure side to “learn” which prefixes belong to Azure, so this was a static configuration. Whenever a new spoke was added to the setup, somebody had to go and add a static route to the NVA.

Now, however, the VPN NVA can learn all Azure VNet prefixes (both hubs and spokes) from the Azure Route Server dynamically, so there is no need for any manual intervention when spokes are added, changed or removed. For example, when a spoke is added, the VPN NVA will learn the new route and advertise it to the on-premises side automatically.
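As a rough illustration of what replaces the manual step, the following Python sketch compares what an NVA has learnt from the Route Server with what it currently advertises to on-premises, and derives the updates to send. The function name `sync_advertisements` and the prefixes are hypothetical; a real NVA does this inside its BGP implementation.

```python
def sync_advertisements(learned_from_route_server, advertised_to_onprem):
    """Return (to_announce, to_withdraw) so that on-prem matches Azure."""
    learned = set(learned_from_route_server)
    advertised = set(advertised_to_onprem)
    to_announce = sorted(learned - advertised)
    to_withdraw = sorted(advertised - learned)
    return to_announce, to_withdraw


# Hub and first spoke are already advertised; a second spoke has just been peered.
learned = ["10.0.0.0/16", "10.1.0.0/16", "10.2.0.0/16"]
advertised = ["10.0.0.0/16", "10.1.0.0/16"]

to_announce, to_withdraw = sync_advertisements(learned, advertised)
print(to_announce)  # ['10.2.0.0/16']
print(to_withdraw)  # []
```

With static routes, the equivalent of `to_announce` was a change request for whoever manages the NVA.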

No More Dummy VNets for Indirect Spokes

If you know what I am talking about, you might already be clapping. The problem here is that ExpressRoute gateways will only advertise to onprem the prefixes of the VNet where those gateways are, as well as those of directly peered VNets. If you have VNets more than one peering hop away, today there is no easy way of advertising those prefixes.

One trick is creating “dummy VNets” peered to the hub with the prefixes you want to advertise, and UDRs in the GatewaySubnet to make sure that traffic is not black-holed to those dummy VNets. This is quite a nice example of “duct tape networking”, and not something I would do in production.

But guess what: you could have an NVA in your hub that advertises the prefixes of the indirect spokes to ExpressRoute via the Azure Route Server. You still need UDRs in your indirect spokes, but getting rid of those pesky dummy VNets is quite a nice thing.

Example: Multi-Region NVA Designs

Multi-region designs are an example that combines some of the use cases described above. I am really curious to see how people are going to use the Route Server, but here is a variant I think would be feasible without too much effort:

Sample diagram for multi-region design

Please note that the overall diagram is superficial on purpose: things like ExpressRoute Global Reach or Site-to-Site tunnels are not represented, to keep it “simple” (it already has a lot going on).

So what do you think? Do you have any specific case I did not cover? Any experience with Azure Route Server already?

17 thoughts on “Azure Route Server: super powers for your Network Virtual Appliance”

  3. Rajesh Poojary

    Wow…you covered a lot about Azure Route Server 🙂

    1. Yes, I think it can help to unlock some scenarios that were very difficult to implement until now 🙂

  4. Sreenivasan

    Yes, Most awaited , I like to know whether router filtering can be done with this service.

    1. No, not yet. I agree that would be even better

  5. digiwhite

    Hi,
    In the peer connection from spoke to region you can use the “use remote gateway” setting. This will register the spoke network CIDR and will be advertised through BGP to the on-prem network. All you would have to do is set a default route to the NVA from the spoke’s perspective.

    So to me the interesting part would be:
    – Can the route server from Region 1 learn the full neighbour network (Region 2)?
    – Will the complete CIDR including spoke in Region 1 be propagated through the Router Server to Region 2?

    Reason for my questions is the fact that the Expressroute gateway does not Read/Support UDR on the subnet (GatewaySubnet). And we are not able to inject routes directly to the BGP configuration of the ER.
    Another problem I am facing is the 3 layered model. In this example you have a Hub / Spoke model. In my case I want to have a Hub / SubHub / Spoke model. Main reason is the limitation of 500 peer connections. So if the RouteServer is able to transport routes to its direct neighbours al the way up to the HUB where the ER is configured and have the routes injected into BGP I am happy 🙂

    Regard,

    Raymond

    1. Hey Raymond, maybe this post would answer some of those questions? https://blog.cloudtrooper.net/2021/03/06/route-server-multi-region-design/. It dives deep in the multi-region design, and some of the concepts can be used in the hub/hub/spoke design.

  6. ductapenetwork

    Good example, only the drawing looks a bit wrong. On the top the express route has a BGP relation with the route server not with the NVA right ? Like the bottom side of the drawing

    1. Spot on! I corrected the diagram, it is hopefully correct now. Thanks for that!

  7. Hello

    So I’ve been testing Azure route server with Fortigate NVA and Azure VNG to on premise network. I’m using hub and spoke topology.

    I was looking at a way to get rid of as many UDR as possible.

    I can get a VM in a spoke to use routes advertised by the route server, one issue is that the VM lose internet connectivity as the default route isn’t learned from the NVA. I’ve set default information originate on the Fortigate NVA but this will create a routing loop, the OUSID interface of the fortigate will learn by the route server that the default route is available from INSID interface of the NVA.

    The other issue I’m having at the moment is that the traffic from on premise will bypass the NVA as the VNG knows how to get to a spoke using the peering. I’m forced to use UDR with more specific prefix on the gateway subnet so it pass through the NVA.

    I’d also like to know if any of you have been able to use default information originate in your NVA so that on-premise internet traffic can route through your NVA in Azure when using the expressroute (without encapsulating the traffic in a VPN). The use case here is that all our offices (more than 50) are in a particular region and our mpls provider is providing direct connectivity to azure, there’s no firewall in most offices, currently all internet traffic gets routed through the 2 main offices in redundant fashion. We were looking at the possibility to route internet traffic to a NVA in azure and possibly get rid of internet connection at the main offices. I’m not set on doing this, I’m just exploring the possibilities.

    1. Hey there! That was a long post, let me tackle your 2 questions separately: for the routing loop issue (routes being plumbed in the NVA NIC too), please have a look at https://blog.cloudtrooper.net/2021/03/06/route-server-multi-region-design/. For the outbound NIC of the NVA you could use a Route Table that prevents the NIC from learning the routes from ARS. For the internal NIC it is a bit more complex, essentially you have 2 options:
      1. Encap traffic between the NVAs
      2. “Teach” the NIC where the internal prefixes are with UDRs, ideally summaries

      1. For the traffic to onprem, have a look here: https://blog.cloudtrooper.net/2021/03/29/using-route-server-to-firewall-onprem-traffic-with-an-nva/. Feel free to send me an email (jose.moreno@microsoft.com), I can put you in contact with some Fortinet engineers with experience on this design.

  8. Tomas

    Hi Jose,
    I’m equally thrilled for Azure Route Server, but what config must be done to make Azure Firewall in the hub advertise default route pointing to it? I couldn’t find a single article about that on the internet.
    Many thanks!
    Tomas

    1. Hey Tomas! Today it is unfortunately not possible, since the Azure Firewall does not support BGP.

      1. Tomas

        Thanks Jose, just found that out as well. That’s a bummer, but so is support for only 6000 VM IPs, which is a fraction of my spokes. Any background info on when this might get supported?

      2. No idea. The ARS programs the routes in the VMs’ NICs, hence there is a limit there. If you check with your Microsoft account team, they might have some visibility to the roadmap.
