A day in the life of a packet in Azure Red Hat OpenShift (part 2)

In this part 2 of my blog series on ARO networking we will have a look at how inbound and outbound Internet connectivity works, as well as at connectivity between different pods in the cluster.

If you remember, in part 1 of this blog we deployed a very simple app consisting of an API accessing a database. Let’s recap:

kubectl get pod -o wide
NAME              READY   STATUS      RESTARTS   AGE     IP            NODE                                   NOMINATED NODE   READINESS GATES
server-1-deploy   0/1     Completed   0          54m     10.131.0.32   aro2-p8bjm-worker-northeurope1-qt8l7   <none>           <none>
server-1-ppl25    1/1     Running     0          54m     10.128.2.24   aro2-p8bjm-worker-northeurope3-wl4vw   <none>           <none>
sqlapi-1-8jgx8    1/1     Running     0          40m     10.131.0.40   aro2-p8bjm-worker-northeurope1-qt8l7   <none>           <none>
sqlapi-1-deploy   0/1     Completed   0          40m     10.131.0.39   aro2-p8bjm-worker-northeurope1-qt8l7   <none>           <none>
 
kubectl get svc
NAME       TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)          AGE
server     ClusterIP      172.30.72.7     <none>         1433/TCP         57m
sqlapi     ClusterIP      172.30.82.94    <none>         8080/TCP         43m
sqlapilb   LoadBalancer   172.30.226.18   192.168.0.11   8080:30039/TCP   6h50m
 
kubectl get route
NAME       HOST/PORT                                               PATH   SERVICES   PORT   TERMINATION   WILDCARD
sqlapilb   sqlapilb-project1.apps.m50kgrxk.northeurope.aroapp.io          sqlapilb   8080                 None

Let’s start with outbound connectivity to the public Internet. The API pod we have deployed has a couple of interesting endpoints. We will start with the “ip” endpoint, which tells us a few things:

curl "http://sqlapilb-project1.apps.m50kgrxk.northeurope.aroapp.io/api/ip"
{
  "my_default_gateway": "10.131.0.1",
  "my_dns_servers": "['172.30.0.10']",
  "my_private_ip": "10.131.0.40",
  "my_public_ip": "40.127.221.40",
  "path_accessed": "sqlapilb-project1.apps.m50kgrxk.northeurope.aroapp.io/api/ip",
  "sql_server_fqdn": "server.project1.svc.cluster.local",
  "sql_server_ip": "172.30.72.7",
  "x-forwarded-for": "109.125.122.99",
  "your_address": "10.128.2.11",
  "your_browser": "None",
  "your_platform": "None"
}

The field “my_private_ip” is the IP address of the pod, 10.131.0.40, as shown in the output of “kubectl get pod -o wide” earlier. The field “my_public_ip” is the IP address with which this pod goes out to the public Internet, but where is it coming from? Let’s find out (sneak preview: if you know how AKS works, this is going to be very familiar).
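If you want to double-check where “my_private_ip” comes from, you can ask Kubernetes directly for the pod IP. A quick sketch, assuming kubectl is still pointed at the project1 namespace and the pod name has not changed:

kubectl get pod sqlapi-1-8jgx8 -o jsonpath='{.status.podIP}{"\n"}'
10.131.0.40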

The Azure resources that make up your cluster, such as virtual machines, virtual disks and load balancers, are located in a different resource group, typically called the “node resource group”. Let’s have a look at the load balancers there:

node_rg_id=$(az aro show -n $cluster_name -g $rg --query 'clusterProfile.resourceGroupId' -o tsv)
node_rg_name=$(echo $node_rg_id | cut -d/ -f 5)
az network lb list -g $node_rg_name -o table
Location     Name                    ProvisioningState    ResourceGroup    ResourceGuid
-----------  ----------------------  -------------------  ---------------  ------------------------------------
northeurope  aro2-p8bjm              Succeeded            aro2-resources   f809521e-6935-4066-8898-6ca1215e3d82
northeurope  aro2-p8bjm-internal     Succeeded            aro2-resources   45347122-98bc-49c5-9940-fc9713268bc2
northeurope  aro2-p8bjm-internal-lb  Succeeded            aro2-resources   1d809b47-5c87-4ac4-83b4-1e3e53ae8a67
northeurope  aro2-p8bjm-public-lb    Succeeded            aro2-resources   972600b0-2a96-4313-a0f2-6137c2043cfd 

As you can see there are four load balancers, but which one does what? One way of finding out is looking at the backend servers. For example, let us focus on the one called “aro2-p8bjm” and check whether it is associated with the master or with the worker nodes:

az network lb address-pool list --lb-name aro2-p8bjm -g $node_rg_name --query '[].backendIpConfigurations[].id' -o tsv | cut -d/ -f 9
aro2-p8bjm-worker-northeurope1-qt8l7-nic
aro2-p8bjm-worker-northeurope3-wl4vw-nic
aro2-p8bjm-worker-northeurope2-rbxzc-nic

The previous command is just a bit of filtering kung fu to display the names of the NICs associated with a certain load balancer, which as you can see correspond to the worker nodes. Let’s verify whether this load balancer has any frontend IP addresses associated with it:

az network lb frontend-ip list --lb-name aro2-p8bjm -g $node_rg_name --query '[].[name,publicIpAddress.id]' -o tsv
outbound /subscriptions/e7da9914-9b05-4891-893c-546cb7b0422e/resourceGroups/aro2-resources/providers/Microsoft.Network/publicIPAddresses/aro2-p8bjm-outbound-pip-v4
ab765e9911ee74b8d81280e60cfcea16 /subscriptions/e7da9914-9b05-4891-893c-546cb7b0422e/resourceGroups/aro2-resources/providers/Microsoft.Network/publicIPAddresses/aro2-p8bjm-ab765e9911ee74b8d81280e60cfcea16

Again, this command just selects some specific attributes of the frontend IP resources: their name and their associated public IP address. By the way, the name of the first frontend is “outbound”, which seems to indicate the presence of an outbound rule:

az network lb outbound-rule list --lb-name aro2-p8bjm -g $node_rg_name -o table
AllocatedOutboundPorts    EnableTcpReset    IdleTimeoutInMinutes    Name          Protocol    ProvisioningState    ResourceGroup
------------------------  ----------------  ----------------------  ------------  ----------  -------------------  ---------------
1024                      False             30                      outboundrule  All         Succeeded            aro2-resources

Yes! As we suspected, an outbound rule, with a fixed value for the Allocated Outbound Ports of 1,024. This means that each node gets a maximum of 1,024 ephemeral TCP ports (and as many UDP ports) for the outbound connectivity of the pods it contains. Normally this value is enough for most situations, but if you happen to see connection timeouts in your egress connectivity from the pods, make sure to look at the metrics of this Load Balancer to troubleshoot SNAT port exhaustion issues.
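One possible way of doing that is through the SNAT-related metrics of the Load Balancer. This is only a sketch: I am assuming the metric names AllocatedSnatPorts and UsedSnatPorts here, so check the output of the list-definitions command if they are named differently in your environment:

# Resource ID of the Load Balancer holding the outbound rule
lb_id=$(az network lb show -n aro2-p8bjm -g $node_rg_name --query id -o tsv)
# See which metrics the Load Balancer exposes
az monitor metrics list-definitions --resource $lb_id -o table
# Compare allocated vs used SNAT ports (assumed metric names)
az monitor metrics list --resource $lb_id --metric AllocatedSnatPorts UsedSnatPorts -o table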

But we were looking for a public IP. Let us have a look at all the public IPs in the node resource group:

az network public-ip list -g $node_rg_name -o table
Name                                         ResourceGroup    Location     Zones    Address        AddressVersion    AllocationMethod    IdleTimeoutInMinutes    ProvisioningState
-------------------------------------------  ---------------  -----------  -------  -------------  ----------------  ------------------  ----------------------  -------------------
aro2-p8bjm-ab765e9911ee74b8d81280e60cfcea16  aro2-resources   northeurope           51.104.142.63  IPv4              Static              4                       Succeeded
aro2-p8bjm-outbound-pip-v4                   aro2-resources   northeurope           40.127.221.40  IPv4              Static              4                       Succeeded
aro2-p8bjm-pip-v4                            aro2-resources   northeurope           40.127.221.38  IPv4              Static              4                       Succeeded

So there you go, 40.127.221.40 is the source IP address that our pod uses when going out to the Internet. This is actually the public IP address “aro2-p8bjm-outbound-pip-v4”, which is associated with an outbound rule in the Azure Load Balancer “aro2-p8bjm”, to which the worker nodes are connected. Easy, right?
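If you prefer to verify this from inside the cluster rather than through the API, you can run a quick check from the API pod itself. Just a sketch, assuming curl is installed in the container image and the external service ifconfig.co is reachable from the cluster:

# Ask an external service which source IP it sees for connections from our pod
kubectl exec sqlapi-1-8jgx8 -- curl -s4 ifconfig.co
40.127.221.40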

But let’s look at the other public IP associated with the LB, the IP “aro2-p8bjm-ab765e9911ee74b8d81280e60cfcea16” with the value 51.104.142.63. If in the same Load Balancer we have a look at the load balancing rules, we will find the two rules associated with the OpenShift router, with names that are very similar to the name of the IP:

az network lb rule list --lb-name aro2-p8bjm -g $node_rg_name -o table
BackendPort    DisableOutboundSnat    EnableFloatingIp    EnableTcpReset    FrontendPort    IdleTimeoutInMinutes    LoadDistribution    Name                                      Protocol    ProvisioningState    ResourceGroup
-------------  ---------------------  ------------------  ----------------  --------------  ----------------------  ------------------  ----------------------------------------  ----------  -------------------  ---------------
80             False                  True                True              80              4                       Default             ab765e9911ee74b8d81280e60cfcea16-TCP-80   Tcp         Succeeded            aro2-resources
443            False                  True                True              443             4                       Default             ab765e9911ee74b8d81280e60cfcea16-TCP-443  Tcp         Succeeded            aro2-resources
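If you want to double-check which frontend IP these rules are attached to, you can dump one of them and look at its frontend IP configuration. I am grepping the JSON instead of using a JMESPath query here, only because the exact property name can vary across CLI versions:

az network lb rule show --lb-name aro2-p8bjm -g $node_rg_name -n ab765e9911ee74b8d81280e60cfcea16-TCP-80 -o json | grep -i -A1 frontendipconfiguration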

The ports should make us suspect that these are the rules associated with the OpenShift router. Let’s verify the IP address that the router service has:

kubectl get svc -n openshift-ingress
NAME                      TYPE           CLUSTER-IP       EXTERNAL-IP     PORT(S)                      AGE
router-default            LoadBalancer   172.30.248.226   51.104.142.63   80:30602/TCP,443:31409/TCP   10h
router-internal-default   ClusterIP      172.30.126.81    <none>          80/TCP,443/TCP,1936/TCP      10h

And to be completely sure:

nslookup sqlapilb-project1.apps.m50kgrxk.northeurope.aroapp.io
Server: 172.24.208.1
Address: 172.24.208.1#53
Non-authoritative answer:
Name: sqlapilb-project1.apps.m50kgrxk.northeurope.aroapp.io
Address: 51.104.142.63

While we are at it: the source address that the pod is seeing from us (the field “your_address”) is actually 10.128.2.11. Where is this coming from? The OpenShift router is a full reverse proxy, so the connection should come from one of the router pods. Let’s verify:

kubectl get pod -n openshift-ingress -o wide
NAME                             READY   STATUS    RESTARTS   AGE   IP            NODE                                   NOMINATED NODE   READINESS GATES
router-default-cf4d7b6d5-lcxtl   1/1     Running   0          10h   10.128.2.11   aro2-p8bjm-worker-northeurope3-wl4vw   <none>           <none>
router-default-cf4d7b6d5-xlkxx   1/1     Running   0          10h   10.131.0.18   aro2-p8bjm-worker-northeurope1-qt8l7   <none>           <none>

There you go, it is the IP address of one of the routers, which incidentally was kind enough to put the original client IP address (109.125.122.99 in the output above) in the X-Forwarded-For HTTP header, so that this information does not get lost.
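A quick way of extracting just that header from the API output (your client IP will obviously differ from mine):

curl -s "http://sqlapilb-project1.apps.m50kgrxk.northeurope.aroapp.io/api/ip" | jq -r '."x-forwarded-for"'
109.125.122.99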

Awesome, we have seen how communication works from the pods to the Internet and from the Internet to the pods, so let us now have a look at the flow between the API and the SQL server. We saw in part 1 that the API will try to reach the server on a certain FQDN that we provided as an environment variable. We can inspect the environment variables inside our pod with the API endpoint “printenv”:

curl -s "http://sqlapilb-project1.apps.m50kgrxk.northeurope.aroapp.io/api/printenv" | jq -r '.SQL_SERVER_FQDN'
server.project1.svc.cluster.local
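Another way of checking the same thing, assuming the API was deployed through a DeploymentConfig called sqlapi (which the pod names above suggest), is reading the variable straight from the deployment configuration:

oc set env dc/sqlapi --list | grep SQL_SERVER_FQDN
SQL_SERVER_FQDN=server.project1.svc.cluster.local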

And we can see whether name resolution works for that FQDN:

curl -s "http://sqlapilb-project1.apps.m50kgrxk.northeurope.aroapp.io/api/dns?fqdn=server.project1.svc.cluster.local"
{
"fqdn": "server.project1.svc.cluster.local",
"ip": "172.30.72.7"
}
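You can run the same resolution test from a throwaway pod inside the cluster, to rule out anything specific to the API container. Again just a sketch, assuming you are allowed to run pods in the project and that the busybox image can be pulled from your cluster:

# Should return the ClusterIP of the service, 172.30.72.7
kubectl run dnstest -it --rm --restart=Never --image=busybox:1.28 -- nslookup server.project1.svc.cluster.local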

If you scroll up a bit, you can check that 172.30.72.7 was the ClusterIP address for the “server” service, which is load balancing to our SQL Server pod. We can now try to reach the SQL Server using two other API endpoints: the first one retrieves the SQL Server version, and the second one gets the source IP with which the SQL Server sees the client:

curl -s "http://sqlapilb-project1.apps.m50kgrxk.northeurope.aroapp.io/api/sqlversion"
{
"sql_output": "Microsoft SQL Server 2019 (RTM-CU4) (KB4548597) - 15.0.4033.1 (X64) \n\tMar 14 2020 16:10:35 \n\tCopyright (C) 2019 Microsoft Corporation\n\tDeveloper Edition (64-bit) on Linux (Ubuntu 18.04.4 LTS) "
} 
curl -s "http://sqlapilb-project1.apps.m50kgrxk.northeurope.aroapp.io/api/sqlsrcip"
{
"sql_output": "10.131.0.40"
}
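By the way, a quick way of confirming that 10.131.0.40 is indeed our API pod (again assuming kubectl is pointed at the project1 namespace):

kubectl get pod -o custom-columns=NAME:.metadata.name,IP:.status.podIP | grep 10.131.0.40
sqlapi-1-8jgx8    10.131.0.40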

As you can see, the source IP visible to the SQL Server is the pod IP address for the API pod. In the next post (part 3) we will have a look at inter-project communication.
