Azure Kubernetes Fleet Manager is a very interesting solution that allows you to deploy code to multiple clusters at the same time. I am not going to stir up the debate here over whether this approach is better or worse than GitOps or CD pipeline parametrization; instead, I am going to look in detail at another Fleet feature: cross-cluster load balancing.
This feature is succinctly explained in the official docs, but I have seen quite a number of people confused about it: does it replace an Azure Traffic Manager or an Azure Front Door in front of my multiple clusters? Does it give me cluster resiliency? Does it optimize latency? This is the picture shown in the official documentation at the time of writing:

It would be so cool to have that load balancer (aka multi-cluster service) sitting in front of my clusters, giving me extra resiliency in case a cluster went down, right? So of course I decided to test it!
Kubernetes Fleet cluster LB: a “look inside”
The first alarm bell in my head rang when I read the configuration guide: “The target Azure Kubernetes Service (AKS) clusters on which the workloads are deployed need to be present on either the same virtual network or on peered virtual networks”. Why would this be? Explanation further down…
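In other words, if your member clusters sit in separate VNets, you need to peer them before any of this works. A minimal sketch with the Azure CLI; note that the VNet variable names below are placeholders from my lab, not something mandated by the official guide:

# Peer the member clusters' VNets in both directions (variable names are placeholders)
az network vnet peering create -g $rg -n aks1-to-aks2 --vnet-name $aks1_vnet_name --remote-vnet $aks2_vnet_id --allow-vnet-access
az network vnet peering create -g $rg -n aks2-to-aks1 --vnet-name $aks2_vnet_name --remote-vnet $aks1_vnet_id --allow-vnet-access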
The second alarm bell went off when I saw that the actual multi-cluster service resource gets deployed in one of the member clusters, not in the Kubernetes Fleet Manager (see step 2 of this part of the configuration guide). After deploying the multi-cluster service, you can check its status by going to the member cluster where you deployed it:
❯ KUBECONFIG=aks-member-1 kubectl get multiclusterservice $app_name -n $app_ns
NAME      SERVICE-IMPORT   EXTERNAL-IP     IS-VALID   AGE
yadaapi   yadaapi          20.255.187.59   True       11m
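For reference, the two custom resources involved look roughly like this. Treat this as a sketch based on my deployment: the ServiceExport is normally propagated to every member cluster together with the workload, the MultiClusterService is created in just one member cluster, and you should double-check the exact API versions against the configuration guide.

# Sketch of the two Fleet networking objects; apply the MultiClusterService to one member cluster only
cat <<EOF | kubectl --kubeconfig=$aks1_cred_file apply -f -
apiVersion: networking.fleet.azure.com/v1alpha1
kind: ServiceExport
metadata:
  name: $app_name
  namespace: $app_ns
---
apiVersion: networking.fleet.azure.com/v1alpha1
kind: MultiClusterService
metadata:
  name: $app_name
  namespace: $app_ns
spec:
  serviceImport:
    name: $app_name
EOF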
By the way, a quick tip: I don't normally type out commands with the full KUBECONFIG prefix like the one above; instead, I have defined some aliases to make my life easier:
az fleet get-credentials -g $rg -n $fleet_name --file $fleet_cred_file -o none
az aks get-credentials -n $aks1_name -g $rg --file $aks1_cred_file -o none
az aks get-credentials -n $aks2_name -g $rg --file $aks2_cred_file -o none
alias k1="kubectl --kubeconfig=$aks1_cred_file"
alias k2="kubectl --kubeconfig=$aks2_cred_file"
alias kf="kubectl --kubeconfig=$fleet_cred_file"
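With those aliases the check above becomes a one-liner per cluster, with kf talking to the Fleet hub and k1/k2 to the member clusters. For example:

# Same check as before, this time against member cluster 1 via its alias
k1 get multiclusterservice $app_name -n $app_ns
# Quick sanity check on the hub: list the member clusters joined to the fleet
kf get memberclusters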
But back to our core topic: if I look in Azure for where that IP address 20.255.187.59 is defined, I find it, surprise surprise, in the standard public load balancer that is created along with the AKS cluster:

So does that mean that there is a load balancing rule sending traffic to all of the other clusters? Let’s go down the rabbit hole: first we need to find the load balancing rule associated with the IP address 20.255.187.59:

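If you prefer the CLI to clicking through the portal, something along these lines should show the same frontend IP and rule (the AKS-managed load balancer is called kubernetes and lives in the node resource group):

# Find the node resource group and inspect the AKS-managed load balancer
node_rg=$(az aks show -n $aks1_name -g $rg --query nodeResourceGroup -o tsv)
az network lb frontend-ip list -g $node_rg --lb-name kubernetes -o table
az network lb rule list -g $node_rg --lb-name kubernetes -o table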
This makes sense: my application listens on port 8080. Now we check the backend pool, which is just kubernetes, as you can see in the screenshot above. However, this is the standard backend pool that gets created by default. Has it been modified to include nodes from the other clusters? Nope, it only contains the nodes (I only have one) of the local cluster:

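The same verification from the CLI, reusing the node_rg variable from before, in case you want to see exactly which node NICs sit behind the default pool (also called kubernetes):

# Show the members of the default backend pool of the AKS-managed load balancer
az network lb address-pool show -g $node_rg --lb-name kubernetes -n kubernetes -o yaml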
This means that all traffic first goes to the cluster where I defined my multi-cluster service, and from there it gets distributed to the other clusters. That is why the VNets need to be peered: the cluster where you created the multi-cluster service needs direct line of sight to all of the other clusters.
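You can see this distribution in action by sending a few requests to the external IP on port 8080. The snippet below assumes your application returns something in its response that identifies the serving pod or cluster, and that it answers on the root path; adjust the path to whatever your app actually serves:

# Hit the multi-cluster service IP a few times; answers may come from pods in different clusters,
# but every request enters through the load balancer of the cluster hosting the MultiClusterService
for i in $(seq 1 5); do curl -s http://20.255.187.59:8080/; echo; done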
And by the way, if you killed the cluster where the multi-cluster service is defined, your multi-cluster load balancer would stop working, since the only node in the load balancer’s backend pool would also become unavailable.
A more realistic representation
So how would I represent the architecture? Here we go:

In my opinion, the previous diagram makes it clear that this architecture does not increase the resiliency of the setup (AKS 1 is an obvious single point of failure), nor does it improve latency when your clusters are distributed across multiple geographic regions (it will actually probably make latency worse). Consequently, chances are that you are still going to look at “traditional” multi-region load balancers such as Azure Traffic Manager or Azure Front Door for some time.
Have I missed anything? Do you have a good use case for the Azure Kubernetes Fleet Manager cross-cluster load balancing feature? Let me know in the comments!

Really good insights, thanks for sharing your vision. You might look here for a DNS approach: https://learn.microsoft.com/en-us/azure/kubernetes-fleet/howto-dns-load-balancing
Hey Cedric, thanks for sharing! Yes, the DNS-based LB feature came out after I wrote this post.