If you have been reading some of my blog posts, you probably know that I have been working on Azure networking for a while. Part of that work has consisted of helping customers to create network architectures based on their requirements. Last week I got a similar ask from a colleague for a large-scale hub-and-spoke design, and I decided to do something I hadn’t done before: calculate the price of each option.
You might (rightfully) wonder why the heck I wouldn’t have done that before. The answer is easy: laziness. The Azure networking pricing model can be complex, and comparing different designs to each other is often an apples-to-oranges discussion. On top of that, the cost depends on the traffic flows that every organization has (Cynthia Treger has blogged about data transfer costs in Azure), and very often the details about expected flows are not available at all, so you cannot calculate the final price.
Still, I thought it would be interesting to make some assumptions, so I opened up the second best solution to every problem (of course I mean Excel) and I started putting numbers in a spreadsheet to generate cost comparison charts like this (we will dive deeper into the comparisons in the rest of the post):

Let me give you a quick summary of what I learned:
- You shouldn’t neglect the cost of data processing and traffic peering in your calculations, since it can be a significant chunk of your overall bill.
- From that perspective, designs with transit VNets will be more expensive, especially when you have heavy VNet-to-VNet flows.
- The structure of Virtual WAN traffic processing cost makes it cheaper than self-managed hub-and-spoke for many scenarios.
Disclaimer: price is not everything!
I would like to highlight that this post is all about pricing. The fact that one solution is more expensive doesn’t mean that nobody should pick it. You need to weigh its added value against its price, and decide whether it is worth it.
The different architectures evaluated here vary in functionality and flexibility, but mostly in their operational overhead. Price is going to be an important factor when deciding which one is best for your organization, but it shouldn’t be the only one, especially for a foundational technology such as networking, which can impact the rest of your environment positively or negatively.
But don’t let me stall any longer and let’s start our trip into the rabbit hole!
The design options
The initial question is simple enough: which Azure networking architecture is best for around 3,000 VNets over two regions with Azure Firewall and ExpressRoute? Different options exist for large-scale hub-and-spoke environments (don’t miss Adam Stuart’s blog on that topic). Out of those, I selected four options as the most attractive.
The first one is the traditional customer-managed hub-and-spoke environment, where you connect all of the hubs together via (global) VNet peering. The maximum number of spokes per environment is 400, which is the number of routes you can have in the route table associated to the GatewaySubnet:

The second option is very similar, but using Virtual WAN. Here the limit of spokes is 600, as stated in the documentation:

The third option is an indirect spoke design. The idea is to consolidate the ExpressRoute gateways on a core layer. This core layer, provided by Virtual WAN, will also interconnect all spoke blocks with each other, eliminating the need for a full mesh between the hub VNets:

And the fourth option is using Azure Virtual Network Manager to create a full mesh between the spokes in every spoke block and the hubs, and to manage routing in the GatewaySubnets. From a topology perspective it is identical to option 1, but with different scalability numbers:

Cost analysis with 3,000 spokes
So let’s start with the first cost analysis. I configured the calculation parameters to 3,000 spokes over 2 regions, and I set these traffic flow parameters:
- 100 MB per month between every two spokes.
- 1 GB per month from every spoke to onprem.
- 10 GB per month from onprem to every spoke.
The last parameter (traffic from onprem to Azure) will impact the size of the ExpressRoute gateways, since traffic in the opposite direction (from Azure to onprem) bypasses the gateways (with the exception of private endpoints, but I didn’t consider that here).
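To get a feel for the volumes behind these parameters, here is a minimal sketch that turns them into monthly totals. It assumes that "between every two spokes" means each unordered pair of spokes exchanges 100 MB in total, and that 1 GB = 1,000 MB. Both are assumptions, and the function name is hypothetical, not taken from the spreadsheet.

```python
from math import comb


def monthly_traffic_gb(spokes, v2v_mb_per_pair, spoke_to_onprem_gb, onprem_to_spoke_gb):
    """Return (spoke-to-spoke GB, spoke-to-onprem GB, onprem-to-spoke GB) per month."""
    pairs = comb(spokes, 2)                  # number of unordered spoke pairs
    v2v_gb = pairs * v2v_mb_per_pair / 1000  # ASSUMPTION: 1 GB = 1,000 MB
    return v2v_gb, spokes * spoke_to_onprem_gb, spokes * onprem_to_spoke_gb


v2v, to_onprem, from_onprem = monthly_traffic_gb(3000, 100, 1, 10)
print(f"spoke-to-spoke: {v2v:,.0f} GB")           # 4,498,500 pairs x 0.1 GB
print(f"spoke-to-onprem: {to_onprem:,} GB")
print(f"onprem-to-spoke: {from_onprem:,} GB")
```

Even a small per-pair figure adds up quickly, because the number of spoke pairs grows roughly with the square of the number of spokes.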
I also left some margin in the maximum number of spokes per block:
- For option 1 (hub and spoke) I used 390 instead of 400, in case you need for some reason to have additional routes in the GatewaySubnet route table.
- For option 2 (VWAN) I used 590 instead of 600.
- For option 3 (indirect spokes) I used 490 instead of 500, to save 10 peerings in every transit VNet for additional purposes (actually 9, since one peering is consumed by the connection to Virtual WAN).
- Finally, for option 4 (AVNM) I used 990 instead of 1,000, also to leave some room for additional routes or peerings.
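The margins above determine how many spoke blocks each design needs. The sketch below is one way to compute that, assuming the 3,000 spokes are split evenly across the 2 regions and that blocks never span regions. Both are assumptions, and `spoke_blocks` is a hypothetical helper, not the spreadsheet's formula.

```python
import math

# Maximum spokes per block used in the post, after leaving some margin
MAX_SPOKES_PER_BLOCK = {"1-HnS": 390, "2-VWAN": 590, "3-HnS+VWAN": 490, "4-AVNM": 990}


def spoke_blocks(total_spokes, regions, max_per_block):
    """Total number of spoke blocks, assuming an even split of spokes per region."""
    per_region = math.ceil(total_spokes / regions)
    return regions * math.ceil(per_region / max_per_block)


blocks = {name: spoke_blocks(3000, 2, limit) for name, limit in MAX_SPOKES_PER_BLOCK.items()}
print(blocks)  # {'1-HnS': 8, '2-VWAN': 6, '3-HnS+VWAN': 8, '4-AVNM': 4}
```

These counts are consistent with the fixed firewall costs shown later in the post if every block has one firewall: 7,300 / 8, 5,475 / 6 and 3,650 / 4 all give the same 912.50 per firewall.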
The first block of costs is the VNet peerings. They are identical in options 1, 3 and 4, but in Virtual WAN you don’t pay for the peering side at the hub:
| | 1-HnS | 2-VWAN | 3-HnS+VWAN | 4-AVNM |
| VNet peerings all spokes | 18,654 | 9,327 | 18,654 | 18,654 |
You might argue that the (lack of) VWAN peering costs is absorbed by the hub data processing, but let’s table that discussion for a bit later.
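The peering figures in the table can be reproduced with a small model. Note that the per-GB price and the billing logic below are assumptions that are not stated in the post: 0.01 per GB on each side of a peering, each spoke-to-spoke GB crossing two spoke peerings (source and destination), and each GB to or from onprem crossing one. The function name is hypothetical.

```python
from math import comb

PRICE_PER_GB_PER_SIDE = 0.01  # ASSUMPTION: not stated in the post


def spoke_peering_cost(spokes, v2v_mb_per_pair, onprem_gb_per_spoke, pay_hub_side):
    """Monthly cost of traffic crossing the spoke peerings."""
    v2v_gb = comb(spokes, 2) * v2v_mb_per_pair / 1000  # ASSUMPTION: 1 GB = 1,000 MB
    onprem_gb = spokes * onprem_gb_per_spoke           # both directions combined
    peering_crossings_gb = 2 * v2v_gb + onprem_gb      # V2V crosses two spoke peerings
    billed_sides = 2 if pay_hub_side else 1            # Virtual WAN: spoke side only
    return round(peering_crossings_gb * billed_sides * PRICE_PER_GB_PER_SIDE, 2)


hns = spoke_peering_cost(3000, 100, 1 + 10, pay_hub_side=True)
vwan = spoke_peering_cost(3000, 100, 1 + 10, pay_hub_side=False)
print(hns, vwan)  # matches the 18,654 and 9,327 in the table
```

Under these assumptions, the spoke-to-spoke flows account for nearly all of the peering cost, with onprem traffic contributing only a small remainder.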
The second block of costs is the firewall costs. Here you need to break them down into the fixed and variable (per GB) components:
| | 1-HnS | 2-VWAN | 3-HnS+VWAN | 4-AVNM |
| FW bandwidth | 27,528 | 26,064 | 26,424 | 19,824 |
| FW fixed | 7,300 | 5,475 | 7,300 | 3,650 |
| FW total | 34,828 | 31,539 | 33,724 | 23,474 |
As you can see, the variable costs because of bandwidth are similar in all designs except for AVNM, since the full-mesh interconnection of spokes inside of one block removes some traffic from reaching the firewall. The small variances across options 1, 2 and 3 are due to the size of the spoke blocks: the smaller the spoke blocks, the more flows need to traverse two firewalls instead of just one.
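To illustrate how block size shapes these flows, the sketch below computes the share of spoke pairs that sit in the same block. It assumes spokes are split evenly across regions and then evenly across the minimum number of blocks per region. That is a simplification and not necessarily how the spreadsheet distributes them, and the function name is hypothetical.

```python
from math import ceil, comb


def same_block_share(total_spokes, regions, max_per_block):
    """Fraction of spoke pairs whose two spokes are in the same spoke block."""
    per_region = total_spokes // regions  # ASSUMPTION: even split across regions
    blocks_per_region = ceil(per_region / max_per_block)
    base, extra = divmod(per_region, blocks_per_region)
    # ASSUMPTION: spokes are spread as evenly as possible across the blocks
    sizes = [base + 1] * extra + [base] * (blocks_per_region - extra)
    same_block_pairs = regions * sum(comb(size, 2) for size in sizes)
    return same_block_pairs / comb(total_spokes, 2)


for name, limit in {"1-HnS": 390, "2-VWAN": 590, "4-AVNM": 990}.items():
    print(f"{name}: {same_block_share(3000, 2, limit):.1%} of spoke pairs share a block")
```

With larger blocks, a larger share of pairs stays inside one block. In options 1 to 3 those flows cross one firewall instead of two, and in the AVNM design they bypass the firewall altogether thanks to the mesh.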
The lower cost of the AVNM option also for the fixed costs is easy to understand: if you have larger blocks (1,000 VNets), you will have fewer firewalls.
Let’s move on to the trickiest part, which I call the inter-block costs. These costs will depend on the topology:
| | 1-HnS | 2-VWAN | 3-HnS+VWAN | 4-AVNM |
| Transit-to-vhub peering | | | 7,860 | |
| HnS H2H peering (intra-region) | 3,375 | | | 1,530 |
| HnS H2H peering (cross-region) | 4,500 | | | 4,500 |
| VWAN H2H transfer (intra-region) | | 0 | 6,060 | |
| VWAN H2H transfer (cross-region) | | 9,000 | 27,000 | |
| Interblock total | 7,875 | 9,000 | 40,920 | 6,030 |
- The first row (transit-to-vhub) is only relevant for option 3, the indirect spoke design.
- Options 1 (hub and spoke) and 4 (AVNM) have the same structure, where you pay for traffic traversing the peerings between the hubs. These peerings can be local or global, so you need to differentiate between intra-region and cross-region. AVNM has less intra-region traffic between blocks, because the blocks are larger, but exactly the same cross-region traffic.
- For option 2 you don’t pay for intra-region hub-to-hub traffic, or at least I haven’t found a price for that. You do pay for cross-region data transfer, but not for the vHub data processing, since traffic goes through the firewalls and not through the virtual hub routers.
- In option 3 you pay for both vHub data processing and cross-region data transfer at the VWAN layer.
You can see how the indirect spoke option fares very badly here. Now that we are talking about virtual hubs, let’s have a look at their actual cost:
| | 1-HnS | 2-VWAN | 3-HnS+VWAN | 4-AVNM |
| vhub fixed | | 1,095 | 365 | |
| vhub RUs | | 438 | 146 | |
| Virtual hub total | 0 | 1,533 | 511 | 0 |
We only have virtual hub costs in options 2 and 3, because the other options do not include Virtual WAN. Option 3 has only two virtual hubs (one per region), so the costs are lower. However, in absolute terms these are mostly negligible compared to the cost of traffic that we saw earlier.
We are almost there. Let’s look at the costs for the ExpressRoute gateways, which are calculated differently in hub-and-spoke (per gateway) and Virtual WAN (scale units and connection units):
| | 1-HnS | 2-VWAN | 3-HnS+VWAN | 4-AVNM |
| ERGW SKU | ErGw1AZ | | | ErGw1AZ |
| ERGW cost | 2,108.24 | | | 1,054.12 |
| VWAN ER scale units per hub | | 1 | 1 | |
| VWAN ER conn. per hub | | 2 | 2 | |
| ER GW total | 2,108.24 | 2,277.60 | 759.20 | 1,054.12 |
Here again the indirect spoke model is the cheapest, since it only has gateways in 2 hubs. Next comes option 4, since the higher 1,000 VNet limit reduces the number of spoke blocks, and finally options 1 and 2, which are pretty close to each other. Still, these numbers are not going to make a dent in the overall costs.
And finally, the AVNM component, which is pretty straightforward (and significant):
| | 1-HnS | 2-VWAN | 3-HnS+VWAN | 4-AVNM |
| AVNM costs | 0 | 0 | 0 | 43,858 |
So what’s the final verdict? Here you go:

- The firewall costs are roughly the same in all options, slightly less for AVNM (due to the larger blocks and to the mesh between the spokes).
- The inter-block traffic cost makes option 3 (indirect spokes) unattractive, although it has operational advantages over options 1 and 2.
- AVNM costs are also significant. They might be justified by the automation that they offer, since they reduce the operational cost of the solution.
- Traditional Virtual WAN is cheaper than hub-and-spoke, mostly due to the lower cost of the spoke VNet peerings. Besides, it has lower administrative overhead than the customer-managed hub-and-spoke, so I would say it is the clear winner for this scenario.
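For reference, the per-option totals can be obtained by simply adding up the cost blocks from the tables in this section. The sketch below does only that. It assumes the six blocks are the complete list of cost items, which may not match exactly what the chart includes.

```python
OPTIONS = ("1-HnS", "2-VWAN", "3-HnS+VWAN", "4-AVNM")

# Figures copied from the tables in this section, one tuple entry per option
COST_BLOCKS = {
    "VNet peerings": (18654, 9327, 18654, 18654),
    "Firewall": (34828, 31539, 33724, 23474),
    "Inter-block": (7875, 9000, 40920, 6030),
    "Virtual hub": (0, 1533, 511, 0),
    "ER gateways": (2108.24, 2277.60, 759.20, 1054.12),
    "AVNM": (0, 0, 0, 43858),
}

totals = {
    option: round(sum(row[i] for row in COST_BLOCKS.values()), 2)
    for i, option in enumerate(OPTIONS)
}
cheapest = min(totals, key=totals.get)

for option, total in totals.items():
    print(f"{option:12} {total:>12,.2f}")
print("Cheapest:", cheapest)
```

With the numbers above, option 2 comes out lowest, and options 3 and 4 land well above the other two because of the inter-block and AVNM costs respectively.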
Other scenarios
As I mentioned at the top of the post, this analysis greatly depends on the input parameters. What if there is absolutely no VNet-to-VNet traffic in the design, as opposed to the 100 MB of traffic between the spokes I used earlier? Here is what you get:

The cost difference between the first three options almost disappears. The choice between them would probably come down to the complexity, functionality and operational overhead.
You might be asking yourself, what if I don’t have 2 regions, but 6? Let’s try that, leaving the rest of the parameters unchanged (3,000 VNets, no V2V traffic, 1GB spoke-to-onprem per spoke, 10GB onprem-to-spoke per spoke):

Option 2 (Virtual WAN) still in the (pricing) lead!
What if you significantly increase the traffic to/from onprem, for example because you have bandwidth-intensive private endpoints? We can go back to 2 regions and increase those parameters from 1GB/10GB to 10GB/100GB. There is no significant relative difference between the options, but all of them get more expensive:

You can see that the pricing for each option has risen again, but Virtual WAN stays as the most cost-effective choice.
So far we haven’t seen any scenario where AVNM comes out cheaper. The full mesh between spokes that AVNM provides helps to keep spoke-to-spoke traffic outside of the firewall, so AVNM will be cheaper when there is a lot of it. If we try 1,800 spokes spread over 2 regions with 10GB of monthly traffic between every pair of spokes, this is what we get:

Wrapping up
Congratulations on getting down here, this was a long post! The lesson for me out of this exercise is that you cannot ignore your traffic patterns when comparing different networking designs in Azure. Feel free to use the spreadsheet to simulate your own traffic, and if you happen to find a bug or you have an improvement suggestion, please let me know!
