Azure Subnet Peering

First of all, my apologies for the radio silence: some private projects during the summer months have kept me away from blogging.

With that out of the way: what the heck is subnet peering? You probably know VNet peering, but is “subnet peering” now a thing? Well, not yet, but it is already available in the Azure API and the Azure CLI. Consequently, I couldn’t help testing it: in this post I will explain how this upcoming feature works. I will also confirm that it is not the feature I was looking for to simplify the administration of the GatewaySubnet route table, but I found this research to be a good exercise nevertheless, especially with a feature as fundamental as VNet peering.

Note: this feature is not documented yet, and it is not in public preview or General Availability. Do not use it in production, and test at your own risk. The behavior of this feature may change before it reaches General Availability.

VNet Peering Refresh

Let’s recap first to get everybody on the same page: Virtual Network peering (VNet peering for short) is a technology that connects two Virtual Networks, so that all systems in VNet A can reach the systems in VNet B, and vice versa.

Let me take you down the rabbit hole to understand how this works under the hood, because it will be useful to understand subnet peering later. As you might have read in my post Azure Networking is not Like your Onprem Network, virtual machines deployed in a VNet can communicate because their NICs know how to reach each other over the Azure Software-Defined Network:

Figure: Representation of VMs and their NICs in a VNet

When you look into the effective routes of the NICs, you will be able to see that they have a system route that actually lets the NICs in the VNet talk to each other. The following output shows what the effective routes of every NIC in a VNet with the prefix 10.13.76.0/24 look like:

❯ az network nic show-effective-route-table -n spokevmVMNic -g $rg -o table
Source    State    Address Prefix    Next Hop Type    Next Hop IP
--------  -------  ----------------  ---------------  -------------
Default   Active   10.13.76.0/24     VnetLocal
[...]   

Now, let’s assume that you peer that VNet to another one. The NICs in both VNets will know how to reach each other because they will get a second system route for the peered VNet. For example, if we peer the VNet of the example above with a second one with the prefix 10.13.0.0/24, the effective route table will look like this:

❯ az network nic show-effective-route-table -n spokevmVMNic -g $rg -o table
Source    State    Address Prefix    Next Hop Type    Next Hop IP
--------  -------  ----------------  ---------------  -------------
Default   Active   10.13.76.0/24     VnetLocal       
Default   Active   10.13.0.0/24      VNetPeering
[...]

Getting started with subnet peering

To use this new functionality you first need to enable the corresponding feature in your subscription and, once that process is complete, re-register the Microsoft.Network resource provider:

❯ az feature register --namespace Microsoft.Network --name AllowMultiplePeeringLinksBetweenVnets 
❯ az feature show --name AllowMultiplePeeringLinksBetweenVnets --namespace Microsoft.Network --query 'properties.state' -o tsv
Registering
[...] -> Go for a coffee or two here
❯ az feature show --name AllowMultiplePeeringLinksBetweenVnets --namespace Microsoft.Network --query 'properties.state' -o tsv
Registered
❯ az provider register --namespace Microsoft.Network

So what does it do exactly?

With subnet peering, Microsoft gives you finer control over the additional system routes that get programmed in the NICs. For example, what if you only wanted the VNet above to communicate with certain subnets of the peered VNet, but not with others?

The new options for creating VNet peerings now let you specify which subnets will participate in the peering. In the example below I am peering a single subnet in my spoke VNet with two subnets in the hub VNet:

❯ az network vnet peering create -n "hub-to-spoke" -g $rg -o none \
        --vnet-name hub --remote-vnet spoke \
        --allow-forwarded-traffic --allow-vnet-access --allow-gateway-transit \
        --peer-complete-vnet false \
        --local-subnet-names GatewaySubnet fw --remote-subnet-names vm
❯ az network vnet peering create -n "spoke-to-hub" -g $rg -o none \
        --vnet-name spoke --remote-vnet hub \
        --allow-forwarded-traffic --allow-vnet-access --use-remote-gateways \
        --peer-complete-vnet false \
        --remote-subnet-names GatewaySubnet fw --local-subnet-names vm

You need to disable the full VNet peering (through --peer-complete-vnet false), and then specify the local and remote subnets. The effect is that when inspecting the effective routes in my spoke VNet, I only see the /26 subnets inside of 10.13.0.0/24 that I specifically included in the commands above, but not the rest:

❯ az network nic show-effective-route-table -n spokevmVMNic -g $rg -o table
Source    State    Address Prefix    Next Hop Type     Next Hop IP
--------  -------  ----------------  ----------------  -------------
Default   Active   10.13.76.0/24     VnetLocal
Default   Active   10.13.0.0/26      VNetPeering
Default   Active   10.13.0.128/26    VNetPeering
[...]

Similarly, if I look into the effective routes in the firewall, I will only see the spoke subnet I included (/26), but not the whole VNet (/24):

❯ az network nic show-effective-route-table -n hubfwVMNic -g $rg -o table
Source    State    Address Prefix    Next Hop Type    Next Hop IP
--------  -------  ----------------  ---------------  -------------
Default   Active   10.13.0.0/24      VnetLocal
Default   Active   10.13.76.0/26     VNetPeering
[...]
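
To make the pattern in these route tables explicit, here is a minimal Python sketch (not Azure code, just my mental model of what the outputs above show): each side learns one VNetPeering route per peered subnet of the other side, instead of the whole VNet prefix. The subnet names and prefixes are the ones from the outputs in this post; the helper name learned_prefixes is made up for illustration.

```python
import ipaddress

# Subnets as they appear in the CLI outputs of this post.
HUB_SUBNETS = {
    "GatewaySubnet": "10.13.0.0/26",
    "nva": "10.13.0.64/26",
    "fw": "10.13.0.128/26",
}
SPOKE_SUBNETS = {"vm": "10.13.76.0/26"}


def learned_prefixes(remote_subnets, peered_names):
    """Prefixes a NIC would learn as VNetPeering routes (hypothetical helper)."""
    nets = [ipaddress.ip_network(remote_subnets[name]) for name in peered_names]
    return [str(net) for net in sorted(nets)]


# The spoke peers with GatewaySubnet and fw, the hub peers with vm.
spoke_learns = learned_prefixes(HUB_SUBNETS, ["GatewaySubnet", "fw"])
hub_learns = learned_prefixes(SPOKE_SUBNETS, ["vm"])
print(spoke_learns)  # ['10.13.0.0/26', '10.13.0.128/26']
print(hub_learns)    # ['10.13.76.0/26']
```

The nva subnet (10.13.0.64/26) is left out of the peering, so it does not show up in the prefixes the spoke learns.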

Let me exclude that GatewaySubnet!

One of the most tedious aspects of Azure networking is maintaining the route table that is needed in the GatewaySubnet to send traffic to a firewall. The ExpressRoute and VPN gateways live in the hub VNet like any other virtual machine, so they also learn the system routes from the peering. But wait, you don’t want those gateways to send traffic directly to the spokes, but only through the firewall. Consequently, you need to override the routes introduced by VNet peering one by one (and no, summaries will not do the trick, since in Azure, as in any other network, more specific routes always win).
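
Why summaries do not help can be sketched in a few lines of Python. This is a simplified model of route selection, not Azure code: the longest matching prefix wins, and only when prefixes are equal does a user-defined route take precedence over a BGP route, and a BGP route over a system route. The 10.13.0.0/16 summary and the function name select_route are assumptions for illustration.

```python
import ipaddress

# For equal prefixes: user-defined routes beat BGP routes, which beat system routes.
SOURCE_PRIORITY = {"User": 0, "VirtualNetworkGateway": 1, "Default": 2}


def select_route(routes, destination):
    """Pick a route: longest matching prefix first, then source priority."""
    dest = ipaddress.ip_address(destination)
    candidates = [r for r in routes if dest in ipaddress.ip_network(r["prefix"])]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda r: (
            -ipaddress.ip_network(r["prefix"]).prefixlen,
            SOURCE_PRIORITY[r["source"]],
        ),
    )


# System route introduced by a full VNet peering to the spoke.
peering_route = {"source": "Default", "prefix": "10.13.76.0/24", "next_hop": "VNetPeering"}
# Hypothetical summary UDR pointing at the firewall.
summary_udr = {"source": "User", "prefix": "10.13.0.0/16", "next_hop": "VirtualAppliance"}
# UDR with exactly the same prefix as the peering route.
exact_udr = {"source": "User", "prefix": "10.13.76.0/24", "next_hop": "VirtualAppliance"}

with_summary = select_route([peering_route, summary_udr], "10.13.76.4")
with_exact = select_route([peering_route, summary_udr, exact_udr], "10.13.76.4")
print(with_summary["next_hop"])  # VNetPeering: the /24 beats the /16 summary
print(with_exact["next_hop"])    # VirtualAppliance: same prefix, the UDR wins
```

In this model the summary never overrides the peering route, which is why every spoke prefix needs its own entry in the GatewaySubnet route table.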

So could we make the GatewaySubnet not learn the spoke prefixes? Unfortunately, subnet peering doesn’t work for this goal, since Azure will not let you exclude the GatewaySubnet from the peering if you are using the settings allowGatewayTransit and useRemoteGateways:

❯ az network vnet peering create -n "spoke-to-hub" -g $rg \
     --vnet-name spoke --remote-vnet hub \
     --allow-forwarded-traffic --allow-vnet-access --use-remote-gateways \
     --remote-subnet-names fw --local-subnet-names vm --peer-complete-vnet false 

(SubnetPeeringHasUseRemoteGatewaysSetButGatewaySubnetNotPeeredInRemoteSubnetNames) Subnet Peering link: {1} UseRemoteGateways Set but GatewaySubnet not peered in RemoteSubnetNames: {2}.

You could create the opposite peering (from hub to spoke, the one that really influences the routes programmed in the hub subnets) without the GatewaySubnet. However, that will not work: both peerings need to have matching subnets, otherwise they will not get synchronized and things will not be pretty.
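
As a rough illustration of what “matching subnets” means here, the following Python sketch compares the subnet lists of two peerings. This is my reading of the requirement, not the actual check that Azure performs: the local subnets of one peering should be the remote subnets of the other, and vice versa. The dictionary keys follow the property names in the peering JSON; the function name peerings_mirror is made up.

```python
def peerings_mirror(local_peering, remote_peering):
    """True if each side's local subnets equal the other side's remote subnets."""
    return (
        set(local_peering["localSubnetNames"]) == set(remote_peering["remoteSubnetNames"])
        and set(local_peering["remoteSubnetNames"]) == set(remote_peering["localSubnetNames"])
    )


# The two peerings created earlier in this post.
hub_to_spoke = {"localSubnetNames": ["GatewaySubnet", "fw"], "remoteSubnetNames": ["vm"]}
spoke_to_hub = {"localSubnetNames": ["vm"], "remoteSubnetNames": ["GatewaySubnet", "fw"]}
# Hypothetical variant: the hub side leaves out the GatewaySubnet.
hub_without_gw = {"localSubnetNames": ["fw"], "remoteSubnetNames": ["vm"]}

print(peerings_mirror(hub_to_spoke, spoke_to_hub))    # True
print(peerings_mirror(hub_without_gw, spoke_to_hub))  # False
```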

As a consequence, if you want to avoid maintaining the route table associated with the GatewaySubnet, your best bet is either using static routing instead of BGP (only possible with VPN, not with ExpressRoute connections) or using Azure Route Server, as explained in my post Hub and Spoke 2.0.

What about SDWAN NVAs?

Ah, those would be fair game, right? If you are using SDWAN Network Virtual Appliances (NVAs) instead of the Azure ExpressRoute and VPN gateways, then you don’t need to enable those “allow gateway transit” or “use remote gateways” flags in the VNet peering, so excluding the SDWAN NVA would be OK. In my setup I have an SDWAN NVA in the 10.13.0.64/26 subnet in the hub:

❯ az network vnet subnet list --vnet-name hub -g $rg -o table
AddressPrefix    Name           PrivateEndpointNetworkPolicies    PrivateLinkServiceNetworkPolicies    ProvisioningState    ResourceGroup
---------------  -------------  --------------------------------  -----------------------------------  -------------------  ---------------
10.13.0.0/26     GatewaySubnet  Disabled                          Enabled                              Succeeded            vnetpeering
10.13.0.64/26    nva            Disabled                          Enabled                              Succeeded            vnetpeering
10.13.0.128/26   fw             Disabled                          Enabled                              Succeeded            vnetpeering

❯ az vm list-ip-addresses -g $rg -o table
VirtualMachine    PrivateIPAddresses
----------------  --------------------
hubfw             10.13.0.132
sdwan             10.13.0.68
spokevm2          10.13.76.4

As we saw earlier, this 10.13.0.64/26 subnet is excluded from the peering and the spoke doesn’t learn it:

❯ az network nic show-effective-route-table -n spokevm2VMNic -g $rg -o table
Source    State    Address Prefix    Next Hop Type     Next Hop IP
--------  -------  ----------------  ----------------  -------------
Default   Active   10.13.76.0/24     VnetLocal
Default   Active   10.13.0.0/26      VNetPeering
Default   Active   10.13.0.128/26    VNetPeering

And yet, the SDWAN NIC seems to be learning the spoke subnet prefix!

❯ az network nic show-effective-route-table -n sdwanVMNic -g $rg -o table
Source    State    Address Prefix    Next Hop Type    Next Hop IP
--------  -------  ----------------  ---------------  -------------
Default   Active   10.13.0.0/24      VnetLocal
Default   Active   10.13.76.0/26     VNetPeering
[...]
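
The asymmetry is easy to see when you check the two route tables against the VM addresses. A small Python sketch, using only the prefixes listed in the outputs above (any elided rows are ignored) and a made-up helper called has_route:

```python
import ipaddress


def has_route(prefixes, destination):
    """True if any of the listed prefixes contains the destination address."""
    dest = ipaddress.ip_address(destination)
    return any(dest in ipaddress.ip_network(prefix) for prefix in prefixes)


# Prefixes shown in the effective routes of spokevm2 and of the sdwan NVA.
spoke_routes = ["10.13.76.0/24", "10.13.0.0/26", "10.13.0.128/26"]
sdwan_routes = ["10.13.0.0/24", "10.13.76.0/26"]

spoke_to_sdwan = has_route(spoke_routes, "10.13.0.68")  # sdwan NVA address
sdwan_to_spoke = has_route(sdwan_routes, "10.13.76.4")  # spokevm2 address
print(spoke_to_sdwan, sdwan_to_spoke)  # False True
```

The spoke VM has no listed route towards the NVA, while the NVA still has a peering route towards the spoke subnet.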

We can double-check in the hub peering that the SDWAN subnet is not included:

❯ az network vnet peering show -g $rg --vnet-name hub -n hub-to-spoke -o jsonc
{
  "allowForwardedTraffic": true,
  "allowGatewayTransit": true,
  "allowVirtualNetworkAccess": true,
  "doNotVerifyRemoteGateways": false,
  "id": "/subscriptions/blahblah/resourceGroups/vnetpeering/providers/Microsoft.Network/virtualNetworks/hub/virtualNetworkPeerings/hub-to-spoke",
  "localAddressSpace": {
    "addressPrefixes": [
      "10.13.0.0/26",
      "10.13.0.128/26"
    ]
  },
  "localSubnetNames": [
    "GatewaySubnet",
    "fw"
  ],
  "localVirtualNetworkAddressSpace": {
    "addressPrefixes": [
      "10.13.0.0/26",
      "10.13.0.128/26"
    ]
  },
  "name": "hub-to-spoke",
  "peerCompleteVnets": false,
  "peeringState": "Connected",
  "peeringSyncLevel": "FullyInSync",
  "provisioningState": "Succeeded",
  "remoteAddressSpace": {
    "addressPrefixes": [
      "10.13.76.0/26"
    ]
  },
  "remoteSubnetNames": [
    "vm"
  ],
  "remoteVirtualNetwork": {
    "id": "/subscriptions/blahblah/resourceGroups/vnetpeering/providers/Microsoft.Network/virtualNetworks/spoke",
    "resourceGroup": "vnetpeering"
  },
  "remoteVirtualNetworkAddressSpace": {
    "addressPrefixes": [
      "10.13.76.0/26"
    ]
  },
  "resourceGroup": "vnetpeering",
  "type": "Microsoft.Network/virtualNetworks/virtualNetworkPeerings",
  "useRemoteGateways": false
}

So I am afraid that in its current state we cannot use this feature to alleviate the administrative effort of maintaining the UDR table associated with the SDWAN subnet either. Note that this might change in the future as this feature matures.

Mix and Match

What if you want to selectively peer some subnets on one side, but the whole VNet on the other? Can you do that? Unfortunately I wasn’t successful here:

  • --local-subnet-names * doesn’t work.
  • As soon as you specify the --peer-complete-vnet true parameter, subnets are ignored.
  • You cannot leave out the local or remote subnet names parameter: you would get an error message like “Both Local and Remote Address Space in a Virtual Network Peering must contain either at least one IPv4 in both or IPv6 prefix in both.”

In my example this means that if I only peer certain subnets from the hub to the spoke, I need to also select the spoke subnets when peering to the hub. Adding new spoke subnets will force me to update the peering to include the new subnets as well.
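
Since both peerings have to be updated together, one way to keep them consistent is to generate both commands from a single pair of subnet lists. The following Python sketch only builds the argument lists and does not call Azure. It uses the flags shown earlier in this post and leaves out the gateway-related ones for brevity. The function names and the extra spoke subnet vm2 are hypothetical.

```python
def peering_args(name, resource_group, vnet, remote_vnet, local_subnets, remote_subnets):
    """Argument list for one subnet peering (gateway flags left out)."""
    return [
        "az", "network", "vnet", "peering", "create",
        "-n", name, "-g", resource_group,
        "--vnet-name", vnet, "--remote-vnet", remote_vnet,
        "--allow-forwarded-traffic", "--allow-vnet-access",
        "--peer-complete-vnet", "false",
        "--local-subnet-names", *local_subnets,
        "--remote-subnet-names", *remote_subnets,
    ]


def mirrored_peerings(resource_group, hub_subnets, spoke_subnets):
    """Build both directions from the same lists so that they cannot drift apart."""
    return (
        peering_args("hub-to-spoke", resource_group, "hub", "spoke", hub_subnets, spoke_subnets),
        peering_args("spoke-to-hub", resource_group, "spoke", "hub", spoke_subnets, hub_subnets),
    )


# vm2 is a hypothetical new spoke subnet added after the initial peering.
hub_cmd, spoke_cmd = mirrored_peerings("vnetpeering", ["GatewaySubnet", "fw"], ["vm", "vm2"])
print(" ".join(hub_cmd))
print(" ".join(spoke_cmd))
```

Adding a subnet then means changing one list and re-running both commands, instead of editing each peering by hand.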

Summing up

I set out to explore subnet peering as a way to avoid the administrative overhead of maintaining the route tables that need to be associated with the GatewaySubnet or the subnet where SD-WAN appliances are deployed, so that traffic flows through the firewall before being sent to the spoke workloads. However, my tests have shown that subnet peering is not usable for this.

What are your use cases for subnet peering? Please let me know in your comments below!

13 thoughts on “Azure Subnet Peering”

  1. Abdel

    Great article, as always, unfortunately i didn’t find any documentation on the topic.

    1. Yes, this feature hasn’t been announced yet.

  2. Szymon

    Great to have you back! Your blog is an invaluable source of information. Thanks for your work and sharing knowledge!

    1. You made my day Szymon!

  3. Chris Gibbs

    One possible use case I would like to explore this feature for is for selectivity peering between subnets to bypass the NVA inspection in the Hub.

    Large data flows springs to mind, ML, Databricks or Synapse mounting storage / SQL from different subnets in order to increase performance need a direct network path.

    1. True, although today you can already do that, albeit peering the whole VNets and then using NSGs. I agree that restricting the peering to individual subnets is cleaner.

  4. […] public IPs are now zone-redundant and VNet peerings can be created for selected subnets only. The peering still seems to have the new feature’s […]

  5. MATIAS OSCAR CASIVA

    And what about Vnet peerings to Azure Vwan Vhubs? will this subnet peering be possible? I see an impact here on the total amount of routes for bigger environments. Especially with Express Route.

    1. Exactly, that is why I see this feature more relevant for spoke-to-spoke peerings, rather than spoke-to-hub.

  6. GT

    Maybe this is a solution for the inital Azure Route Server Multi Region Design’s routing loop problem, without solving it via eGP over VXLAN? Has anyone had a chance of testing that and checking, if ARS is propagating even through subnet-level VNET peerings?

    1. I would have to think more about a possible design, but at first sight I don’t see how.

  7. […] but this solved our need. If I were to solve this now I would rather use Subnet Peering Azure Subnet Peering – Cloudtrooper or use NAT features on the VPN Gateway to hide the entire VNET behind a NAT address. Also since […]

  8. […] wrote a blog post some time ago here, feel free to revisit. If you want the short version, it is a way of restricting a peering between […]