One of the most common and yet complex networking designs in Azure is interconnecting Azure IaaS workloads deployed in a Virtual Network, vSphere virtual machines in an Azure VMware Solution private cloud, and on-premises networks. My esteemed colleague Robin Heringa kindly gave me access to an AVS cluster, so armed with the fantastic possibilities that Megaport offers around ExpressRoute connectivity to multiple clouds, I decided to give it a try.
I found interesting things along the way, such as how ExpressRoute Global Reach works, routes with the ASN 12076 unexpectedly prepended a couple of times, or Azure devices that seem to have been configured with the feature you might know as allowas-in. If I managed to make you curious, keep reading! (If I didn’t, this post is probably not for you.)
Topology
This is the topology I am testing, consisting of an Azure Virtual Network, an Azure VMware Solution (AVS) environment, and an on-premises network that I am simulating with a Google Cloud VPC. In the middle sits Megaport as the network provider, interconnecting everything via ExpressRoute:

The way you actually connect AVS with Azure VNets is by using an ExpressRoute circuit included in the AVS offering. If your on-prem network is also connected to Azure via ExpressRoute, on-prem and AVS are like two branch offices that talk to each other through ExpressRoute Global Reach (you might want to read my previous article ExpressRoute Global Reach under the covers):

The Azure VM’s perspective
Let’s start from the top: the Azure VM will correctly see routes from both environments:
❯ az network nic show-effective-route-table --ids $vm_nic_id -o table
Source                 State    Address Prefix    Next Hop Type          Next Hop IP
---------------------  -------  ----------------  ---------------------  -------------
Default                Active   192.168.1.0/24    VnetLocal
VirtualNetworkGateway  Active   10.2.253.128/25   VirtualNetworkGateway  10.24.133.24
VirtualNetworkGateway  Active   10.2.253.128/25   VirtualNetworkGateway  10.24.133.28
VirtualNetworkGateway  Active   10.2.253.0/25     VirtualNetworkGateway  10.24.133.24
VirtualNetworkGateway  Active   10.2.253.0/25     VirtualNetworkGateway  10.24.133.28
VirtualNetworkGateway  Active   192.168.5.0/24    VirtualNetworkGateway  10.24.133.24
VirtualNetworkGateway  Active   192.168.5.0/24    VirtualNetworkGateway  10.24.133.28
VirtualNetworkGateway  Active   10.2.252.64/26    VirtualNetworkGateway  10.24.133.24
VirtualNetworkGateway  Active   10.2.252.64/26    VirtualNetworkGateway  10.24.133.28
VirtualNetworkGateway  Active   10.2.252.0/26     VirtualNetworkGateway  10.24.133.24
VirtualNetworkGateway  Active   10.2.252.0/26     VirtualNetworkGateway  10.24.133.28
VirtualNetworkGateway  Active   10.2.254.0/25     VirtualNetworkGateway  10.24.133.24
VirtualNetworkGateway  Active   10.2.254.0/25     VirtualNetworkGateway  10.24.133.28
VirtualNetworkGateway  Active   10.2.255.0/26     VirtualNetworkGateway  10.24.133.24
VirtualNetworkGateway  Active   10.2.255.0/26     VirtualNetworkGateway  10.24.133.28
VirtualNetworkGateway  Active   10.2.252.192/32   VirtualNetworkGateway  10.24.133.24
VirtualNetworkGateway  Active   10.2.252.192/32   VirtualNetworkGateway  10.24.133.28
VirtualNetworkGateway  Active   192.168.3.0/24    VirtualNetworkGateway  10.24.133.24
VirtualNetworkGateway  Active   192.168.3.0/24    VirtualNetworkGateway  10.24.133.28
Default                Active   0.0.0.0/0         Internet
Default                Active   10.0.0.0/8        None
Default                Active   100.64.0.0/10     None
...
The first thing to highlight is that in the route table we can see two ECMP routes for each of the remote environments (192.168.3.0/24 is on-prem and 192.168.5.0/24 is AVS), so connectivity from here looks fine.
There might be some questions in your head though. For example, what are those 10.2 prefixes? They are coming from the infrastructure prefixes for AVS. In this case, my AVS private cloud has assigned the prefix 10.2.252.0/22, out of which it has carved out different subnets for the required infrastructure and control elements of the VMware environment, as the Azure portal shows:

Another interesting question is what are the next hop IPs 10.24.133.24 and 10.24.133.28? They are not the ExpressRoute Gateway IP addresses, as we will see later, but the IP addresses of the ExpressRoute Microsoft Edge Routers: for better performance, ExpressRoute egress traffic from a VNet does not traverse the gateway. So essentially they are IP addresses that you can safely ignore.
What does the ExpressRoute Gateway see?
In the ExpressRoute gateway we will start seeing more details about the setup. Let’s first have a look at its BGP neighbors:
❯ az network vnet-gateway list-bgp-peer-status -n $ergw_name -g $rg -o table
Neighbor     ASN    State      ConnectedDuration    RoutesReceived    MessagesSent    MessagesReceived
-----------  -----  ---------  -------------------  ----------------  --------------  ------------------
192.168.1.4  12076  Connected  00:39:21.0873422     11                92              85
192.168.1.5  12076  Connected  00:39:19.8925931     11                93              86
192.168.1.6  12076  Connected  00:00:12.2626719     12                3               7
192.168.1.7  12076  Connected  00:00:12.5151968     12                3               7
There are two neighbors for each ExpressRoute circuit: the pair 192.168.1.4 and .5 goes to the AVS cloud, and the pair 192.168.1.6 and .7 goes to the Megaport circuit that connects to on-premises.
Note that the IP addresses for the BGP peers are contiguous, and actually taken out of the GatewaySubnet. This is by the way one reason why Microsoft has those minimum size requirements for this subnet, and you don’t want to run out of IP addresses there.
The remote ASN for both neighbors is 12076, so out of that output you cannot really know which is which, and here you need to trust me (I will prove it to you in a second, do not worry). Before going into the mud though, let’s see what the gateway is advertising to each of those neighbors (the ExpressRoute edge routers):
❯ az network vnet-gateway list-advertised-routes -n $ergw_name -g $rg -o table --peer 192.168.1.4
Network         NextHop       Origin    AsPath    Weight
--------------  ------------  --------  --------  --------
192.168.1.0/24  192.168.1.13  Igp       65515     0
❯ az network vnet-gateway list-advertised-routes -n $ergw_name -g $rg -o table --peer 192.168.1.6
Network         NextHop       Origin    AsPath    Weight
--------------  ------------  --------  --------  --------
192.168.1.0/24  192.168.1.13  Igp       65515     0
So the ExpressRoute gateway is only advertising the local Azure prefixes. You might think that the gateway would advertise to on-prem the prefixes learnt from AVS and vice versa, but that is not the case. This is why ExpressRoute Global Reach is required.
Enter Global Reach for AVS-onprem
If Azure doesn’t transit traffic between AVS and on-prem, we need to connect them directly to each other, and that is exactly what Global Reach does. You do not have access to the AVS ExpressRoute edge router (sometimes called DMSEE or Dedicated Microsoft Enterprise Edge), but we can have a look at the ExpressRoute edge router connected to on-premises through Megaport (each ExpressRoute circuit has two connections, a primary and a secondary one):
❯ az network express-route list-route-tables-summary -g $rg -n $er_circuit_name --path primary --peering-name AzurePrivatePeering --query value -o table
Neighbor              V    AsProperty    UpDown    StatePfxRcd
--------------------  ---  ------------  --------  -------------
10.2.252.129+179      4    12076         0         9
169.254.21.201+15238  4    65001         0         4
192.168.1.12+52195    4    65515         0         1
192.168.1.13+179      4    65515         0         1
❯ az network express-route list-route-tables-summary -g $rg -n $er_circuit_name --path secondary --peering-name AzurePrivatePeering --query value -o table
Neighbor              V    AsProperty    UpDown    StatePfxRcd
--------------------  ---  ------------  --------  -------------
10.2.252.130+179      4    12076         0         9
169.254.21.205+50428  4    65001         0         13
192.168.1.12+52201    4    65515         0         1
192.168.1.13+179      4    65515         0         1
There are a couple of things to unpack here. If you focus on the AS number it is easy to tell what each neighbor actually is:
- The last two peers, in ASN 65515 (192.168.1.12 and 192.168.1.13), are the BGP connections to the ExpressRoute gateways in the Azure VNet (65515 is the ASN for Virtual Network Gateways)
- The Global Reach connection peers Microsoft routers to each other with iBGP on ASN 12076, hence the peer in ASN 12076 is the remote edge router connected to AVS. Note that the peer IP addresses 10.2.252.129 (primary) and 10.2.252.130 (secondary) come out of the AVS infra range 10.2.252.0/22.
- And finally, ASN 65001 is the Megaport router as we will see next
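The classification in the list above can be sketched as a tiny shell helper. This is only an illustration: the function name peer_role and the role labels are made up for this post, and the ASN 65001 mapping is specific to this lab's Megaport Cloud Router.

```shell
# Map a BGP peer ASN to its role in this lab (labels are illustrative only).
peer_role() {
  case "$1" in
    65515) echo "VNet ExpressRoute gateway" ;;
    12076) echo "Microsoft edge router (Global Reach)" ;;
    65001) echo "Megaport Cloud Router" ;;
    *)     echo "unknown" ;;
  esac
}

# Neighbor and ASN columns copied from the primary-path summary above.
while read -r neighbor asn; do
  # Strip the "+port" suffix from the neighbor before printing.
  printf '%s\t%s\n' "${neighbor%+*}" "$(peer_role "$asn")"
done <<'EOF'
10.2.252.129+179 12076
169.254.21.201+15238 65001
192.168.1.12+52195 65515
192.168.1.13+179 65515
EOF
```

In a different environment the provider ASN will differ, so the case statement would need to be adapted.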
Megaport Cloud Router
Since I am using Megaport as my ExpressRoute network provider, we can check at the Megaport Cloud Router what our BGP routes look like:

The output above shows the routes that the MCR sees (filtered to show only the ones containing “192”): from the Azure ExpressRoute circuit there are two connections, and Megaport prefers the primary one for both the Azure route (192.168.1.0/24, the AS path only includes 12076) and the route from AVS (192.168.5.0/24, the AS path includes 12076 and 398656, an ASN generated by AVS).
You probably know this already, but the MCR prefers only one of the routes because it follows the default eBGP behavior: it picks a single best route instead of doing Equal Cost Multi-Pathing (ECMP).
The remaining route is on-premises, which in this case I am simulating with a Google VPC connected to Megaport via Google Partner Interconnect. You can tell it is Google because of the ASN 16550.
So all looks good here. We can go further down the chain and check that the routes make it all the way to the Google VPC. This is not relevant for this exercise, since we are just using Google Cloud to simulate on-prem, but for the sake of being exhaustive, here we go:
❯ gcloud compute routers get-status $router_name --region=$region --format=json | jq -r '.result.bestRoutesForRouter[]|{destRange,routeType,nextHopIp} | join("\t")'
10.2.252.0/26      BGP  169.254.179.10
10.2.252.64/26     BGP  169.254.179.10
10.2.252.192/32    BGP  169.254.179.10
10.2.253.0/25      BGP  169.254.179.10
10.2.253.128/25    BGP  169.254.179.10
10.2.254.0/25      BGP  169.254.179.10
10.2.255.0/26      BGP  169.254.179.10
169.254.21.200/30  BGP  169.254.179.10
169.254.21.204/30  BGP  169.254.179.10
169.254.179.8/29   BGP  169.254.179.10
192.168.1.0/24     BGP  169.254.179.10
192.168.5.0/24     BGP  169.254.179.10
There you go, you see the infra prefixes from the AVS private cloud (that belong to the supernet 10.2.252.0/22), the actual NSX segment for VMs in AVS 192.168.5.0/24, and the Azure VNet prefix 192.168.1.0/24. All good! (If you are wondering about those 169.254 prefixes, those are transit networks of the Megaport Cloud Router.)
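If you want to double-check which of those prefixes fall inside the AVS infra supernet 10.2.252.0/22, the membership test is plain integer arithmetic. The following is a minimal sketch in POSIX shell; the helper names ip_to_int and in_supernet are invented for this example and are not part of any CLI.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  echo "$1" | awk -F. '{ printf "%.0f\n", ($1 * 16777216) + ($2 * 65536) + ($3 * 256) + $4 }'
}

# Return success (0) if prefix $1 (a.b.c.d/len) lies inside supernet $2.
in_supernet() {
  child_ip=${1%/*}; child_len=${1#*/}
  super_ip=${2%/*}; super_len=${2#*/}
  # A contained prefix needs a mask at least as long as the supernet's.
  [ "$child_len" -ge "$super_len" ] || return 1
  # Number of addresses covered by the supernet.
  block=$(( 1 << (32 - super_len) ))
  child=$(ip_to_int "$child_ip")
  super=$(ip_to_int "$super_ip")
  # Both addresses must land in the same supernet-sized block.
  [ $(( child / block )) -eq $(( super / block )) ]
}

for p in 10.2.252.0/26 10.2.252.64/26 10.2.252.192/32 10.2.253.0/25 \
         10.2.253.128/25 10.2.254.0/25 10.2.255.0/26 192.168.5.0/24; do
  if in_supernet "$p" 10.2.252.0/22; then
    echo "$p AVS infra"
  else
    echo "$p other"
  fi
done
```

Only 192.168.5.0/24 (the NSX workload segment) should be reported as "other"; everything in 10.2.252.0 through 10.2.255.255 belongs to the /22.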
The NSX side
Let’s go back to Azure. If you remember, the Megaport ExpressRoute circuit was connected via Global Reach to BGP neighbors 10.2.252.129 and 10.2.252.130 on ASN 12076. In AVS we can have a look first at the network topology in NSX Manager:

By the way, here you can see our 192.168.5.0/24 subnet defined as NSX segment at the bottom. Let’s check now the NSX T0 BGP configuration:

First surprise, ASN 398656 is not here, the Tier-0 Gateway has the ASN 64513 that we haven’t seen so far! We can have a look at its BGP neighbors:

Same here, its upstream neighbors (you can tell they are “upstream” because they are in the non-routable subnets 100.72.2.0/28 and 100.72.2.16/28, which you can see as the top subnets in the NSX topology) are in ASN 65100. What this tells us is that there is a BGP layer between the NSX Tier-0 gateways and ExpressRoute which is invisible to the AVS admin, and that we need to trust (Jedi hand move here).
So let’s trust, and see if the routes arrive. In the NSX Manager portal (I haven’t found a way to get to the T0 console in AVS) you can download the routes of the Tier-0 gateway. After importing to Excel and deleting some rows and columns that are not relevant for this exercise, here is what I have:
route_type | network | next_hop | admin_distance |
b | 192.168.1.0/24 | 100.72.2.1 | 20 |
b | 192.168.1.0/24 | 100.72.2.17 | 20 |
t1c | 192.168.5.0/24 | 100.64.96.1 | 3 |
b | 192.168.3.0/24 | 100.72.2.17 | 20 |
b | 192.168.3.0/24 | 100.72.2.1 | 20 |
In essence the routes from Azure (192.168.1.0/24) and on-prem (192.168.3.0/24) are coming in from the upstream BGP neighbors alright, and the downstream NSX segment 192.168.5.0/24 is visible from the tier-1 gateway.
That BGP triangle
Everything is working, but I am still a bit unsure of how routing works, especially in the BGP triangle that we have between the ExpressRoute gateway in Azure and the two ExpressRoute routers (one from the AVS circuit, the other one from the onprem circuit). To begin with, I will inspect the routes in the BGP table of the Virtual Network Gateway in Azure (note that I have removed the AVS infra prefixes and the MCR transit nets from this output for brevity):
❯ az network vnet-gateway list-learned-routes -n $ergw_name -g $rg --query 'value[].{LocalAddress:localAddress, Peer:sourcePeer, Network:network, NextHop:nextHop, ASPath: asPath, Origin:origin, Weight:weight}' -o table
LocalAddress    Peer          Network         ASPath                         Origin    Weight    NextHop
--------------  ------------  --------------  -----------------------------  --------  --------  -----------
192.168.1.13    192.168.1.13  192.168.1.0/24                                 Network   32768
192.168.1.13    192.168.1.4   192.168.5.0/24  12076-398656                   EBgp      32769     192.168.1.4
192.168.1.13    192.168.1.5   192.168.5.0/24  12076-398656                   EBgp      32769     192.168.1.5
192.168.1.13    192.168.1.6   192.168.3.0/24  12076-65001-16550              EBgp      32769     192.168.1.6
192.168.1.13    192.168.1.7   192.168.3.0/24  12076-65001-16550              EBgp      32769     192.168.1.7
192.168.1.13    192.168.1.7   192.168.5.0/24  12076-12076-12076-398656       EBgp      32769     192.168.1.7
192.168.1.13    192.168.1.6   192.168.5.0/24  12076-12076-12076-398656       EBgp      32769     192.168.1.6
192.168.1.13    192.168.1.4   192.168.3.0/24  12076-12076-12076-65001-16550  EBgp      32769     192.168.1.4
192.168.1.13    192.168.1.5   192.168.3.0/24  12076-12076-12076-65001-16550  EBgp      32769     192.168.1.5
[...]
What the hell is going on here? Let’s focus on the AVS routes (192.168.5.0/24): some of them come from 192.168.1.4 and .5, which are the Microsoft Edge routers of the AVS circuit. The AS path in those routes is what we would expect: “12076 398656”.
And then we see the same routes coming from 192.168.1.6 and .7, the Microsoft Edge Routers of the on-premises circuit, which prepend their ASN (12076) two additional times, so that the Azure ExpressRoute gateway prefers the direct routes coming from AVS. The same happens in the other direction for the on-premises prefix 192.168.3.0/24 learned via 192.168.1.4 and .5. The following diagram describes what is going on:

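To make the effect of the prepending concrete, here is a minimal shell sketch of the AS-path-length comparison. It assumes that every other BGP attribute is equal, which matches the identical weight of 32769 in the learned-routes output above; a real BGP speaker evaluates several attributes before it gets to the AS path. The function names as_path_len and shortest_as_path are invented for this example.

```shell
# Count the ASNs in a dash-separated AS path such as "12076-398656".
as_path_len() {
  echo "$1" | awk -F- '{ print NF }'
}

# Print the candidate with the shortest AS path (the first one wins a tie).
shortest_as_path() {
  best=""; best_len=0
  for path in "$@"; do
    len=$(as_path_len "$path")
    if [ -z "$best" ] || [ "$len" -lt "$best_len" ]; then
      best=$path; best_len=$len
    fi
  done
  echo "$best"
}

# The two AS paths the gateway learns for the AVS segment 192.168.5.0/24:
# via the on-prem circuit (prepended) and via the AVS circuit (direct).
shortest_as_path "12076-12076-12076-398656" "12076-398656"
```

The direct path has two ASNs against four for the prepended one, so the route learned from the AVS circuit is the one that wins.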
allowas-in 2
Now we can check the routes of the Megaport-facing ER edge router, since we don’t have access to the AVS-facing one:
❯ az network express-route list-route-tables -g $rg -n $er_circuit_name --path primary --peering-name AzurePrivatePeering --query value -o table
This command is in preview and under development. Reference and support levels: https://aka.ms/CLI_refstatus
Network            NextHop         LocPrf    Weight    Path
-----------------  --------------  --------  --------  -------------
10.2.252.0/26      10.2.252.129    100       0         398656 ?
10.2.252.64/26     10.2.252.129    100       0         398656 ?
10.2.252.192/32    10.2.252.129    100       0         398656 ?
10.2.253.0/25      10.2.252.129    100       0         398656 ?
10.2.253.128/25    10.2.252.129    100       0         398656 ?
10.2.254.0/25      10.2.252.129    100       0         398656 ?
10.2.255.0/26      10.2.252.129    100       0         398656 ?
169.254.21.200/30  169.254.21.201  100       0         65001 ?
169.254.21.204/30  169.254.21.201  100       0         65001 ?
169.254.179.8/29   169.254.21.201  100       0         65001 ?
192.168.1.0/24     192.168.1.13*   100       0         65515 I
192.168.1.0/24     192.168.1.12    100       0         65515 I
192.168.1.0/24     10.2.252.129    100       0         65515 I
192.168.3.0/24     169.254.21.201  100       0         65001 16550 ?
192.168.5.0/24     10.2.252.129    100       0         398656 ?
Looking good! (You can ignore the 10.2.252.0/22 infra routes from AVS as well as the 169.254.0.0/16 routes for the MCR transit networks.) And the secondary connection now:
❯ az network express-route list-route-tables -g $rg -n $er_circuit_name --path secondary --peering-name AzurePrivatePeering --query value -o table
This command is in preview and under development. Reference and support levels: https://aka.ms/CLI_refstatus
Network            NextHop         LocPrf    Weight    Path
-----------------  --------------  --------  --------  --------------------
10.2.252.0/26      10.2.252.130*   100       0         398656 ?
10.2.252.0/26      169.254.21.205  100       0         65001 12076 398656 ?
10.2.252.64/26     10.2.252.130*   100       0         398656 ?
10.2.252.64/26     169.254.21.205  100       0         65001 12076 398656 ?
10.2.252.192/32    10.2.252.130*   100       0         398656 ?
10.2.252.192/32    169.254.21.205  100       0         65001 12076 398656 ?
10.2.253.0/25      10.2.252.130*   100       0         398656 ?
10.2.253.0/25      169.254.21.205  100       0         65001 12076 398656 ?
10.2.253.128/25    10.2.252.130*   100       0         398656 ?
10.2.253.128/25    169.254.21.205  100       0         65001 12076 398656 ?
10.2.254.0/25      10.2.252.130*   100       0         398656 ?
10.2.254.0/25      169.254.21.205  100       0         65001 12076 398656 ?
10.2.255.0/26      10.2.252.130*   100       0         398656 ?
10.2.255.0/26      169.254.21.205  100       0         65001 12076 398656 ?
169.254.21.200/30  169.254.21.205  100       0         65001 ?
169.254.21.204/30  169.254.21.205  100       0         65001 ?
169.254.179.8/29   169.254.21.205  100       0         65001 ?
192.168.1.0/24     192.168.1.13*   100       0         65515 I
192.168.1.0/24     192.168.1.12    100       0         65515 I
192.168.1.0/24     10.2.252.130    100       0         65515 I
192.168.1.0/24     169.254.21.205  100       0         65001 12076 I
192.168.3.0/24     169.254.21.205  100       0         65001 16550 ?
192.168.5.0/24     10.2.252.130*   100       0         398656 ?
192.168.5.0/24     169.254.21.205  100       0         65001 12076 398656 ?
What the heck is this? Here it looks like the MCR is advertising all routes that are not generated on-premises back to Azure, and the Microsoft edge router is taking them happily! This shouldn’t happen, because BGP’s loop prevention mechanism requires routers to drop advertisements coming over eBGP that contain their own AS number. However, the Microsoft edge router seems not to be doing that. In Cisco routers you can achieve this behavior with the configuration command allowas-in, and that is exactly what seems to be happening here.
In spite of that weirdness, those routes are not preferred by the MSEE because they have a longer AS path, as the asterisk next to the other, shorter routes shows.
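The difference between default loop prevention and an allowas-in style relaxation can be sketched in a few lines of shell. This is a simplified model under my own assumptions, not the actual MSEE logic (which is not visible from the outside): an update is accepted if the local ASN appears in the received AS path at most a configured number of times, where 0 mimics the default behavior. The function names count_own_as and accept_update are made up for this example.

```shell
# Count how many times ASN $1 appears in the space-separated AS path $2.
count_own_as() {
  count=0
  for asn in $2; do
    if [ "$asn" = "$1" ]; then
      count=$((count + 1))
    fi
  done
  echo "$count"
}

# Decide whether an eBGP update is accepted.
# $1 = local ASN, $2 = received AS path,
# $3 = tolerated occurrences of the local ASN (0 = default loop prevention).
accept_update() {
  [ "$(count_own_as "$1" "$2")" -le "$3" ]
}

# AS path of the AVS segment as re-advertised by the MCR on the secondary link.
path="65001 12076 398656"

if accept_update 12076 "$path" 0; then
  echo "default loop prevention: accepted"
else
  echo "default loop prevention: dropped"
fi

if accept_update 12076 "$path" 1; then
  echo "one occurrence tolerated: accepted"
else
  echo "one occurrence tolerated: dropped"
fi
```

With the default setting the route reflected by the MCR is dropped, while tolerating a single occurrence of 12076 lets it into the table, which matches what the secondary connection shows.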
Another question is why this is only happening on the secondary connection. My guess is that the MCR is not advertising the routes back over the primary ER link because that is its preferred path. It would be interesting to verify whether this behavior changes if the MCR “preferred” both paths, taking both routes into the routing table (in Cisco parlance with maximum-paths eibgp 2).
What have we seen?
We have gone over the initially simple configuration of the BGP triangle over ExpressRoute between an Azure VNet, AVS and on-premises, and we have seen that all works out of the box.
I have illustrated some “fun facts” such as the 12076 prepending happening over Global Reach, and the weirdness of the Microsoft ExpressRoute edge router being configured with allowas-in (I am told it is in order to support certain advanced scenarios).
Hopefully the next time you look at a similar setup these things will not come as a surprise to you!