In this part 2 of my blog series around ARO networking we will have a look at how inbound and outbound Internet connectivity works, as well as connectivity between different pods in the cluster. Other posts in the series:
- Part 1: Intro and SDN Plugin
- Part 2: Internet and Intra-cluster Communication
- Part 3: Inter-Project and Vnet Communication
- Part 4: Private Link and DNS
- Part 5: Private and Public routers
If you remember, in part 1 of this blog we deployed a very simple app consisting of an API accessing a database. Let’s recap:
```
kubectl get pod -o wide
NAME              READY   STATUS      RESTARTS   AGE   IP            NODE                                   NOMINATED NODE   READINESS GATES
server-1-deploy   0/1     Completed   0          54m   10.131.0.32   aro2-p8bjm-worker-northeurope1-qt8l7   <none>           <none>
server-1-ppl25    1/1     Running     0          54m   10.128.2.24   aro2-p8bjm-worker-northeurope3-wl4vw   <none>           <none>
sqlapi-1-8jgx8    1/1     Running     0          40m   10.131.0.40   aro2-p8bjm-worker-northeurope1-qt8l7   <none>           <none>
sqlapi-1-deploy   0/1     Completed   0          40m   10.131.0.39   aro2-p8bjm-worker-northeurope1-qt8l7   <none>           <none>
kubectl get svc
NAME       TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)          AGE
server     ClusterIP      172.30.72.7     <none>         1433/TCP         57m
sqlapi     ClusterIP      172.30.82.94    <none>         8080/TCP         43m
sqlapilb   LoadBalancer   172.30.226.18   192.168.0.11   8080:30039/TCP   6h50m
kubectl get route
NAME       HOST/PORT                                               PATH   SERVICES   PORT   TERMINATION   WILDCARD
sqlapilb   sqlapilb-project1.apps.m50kgrxk.northeurope.aroapp.io          sqlapilb   8080                 None
```
Let’s start with outbound connectivity to the public Internet. The API pod we have deployed has a couple of interesting endpoints. We will start with the “ip” endpoint, that tells us a few things:
```
curl "http://sqlapilb-project1.apps.m50kgrxk.northeurope.aroapp.io/api/ip"
{
  "my_default_gateway": "10.131.0.1",
  "my_dns_servers": "['172.30.0.10']",
  "my_private_ip": "10.131.0.40",
  "my_public_ip": "40.127.221.40",
  "path_accessed": "sqlapilb-project1.apps.m50kgrxk.northeurope.aroapp.io/api/ip",
  "sql_server_fqdn": "server.project1.svc.cluster.local",
  "sql_server_ip": "172.30.72.7",
  "x-forwarded-for": "109.125.122.99",
  "your_address": "10.128.2.11",
  "your_browser": "None",
  "your_platform": "None"
}
```
The field “my_private_ip” is the IP address of the pod, 10.131.0.40, as was shown in the output of the “kubectl get pod -o wide” earlier. The field “my_public_ip” is the IP address with which this pod will go out to the public Internet, but where is it coming from? Let’s find out (sneak preview: if you know how AKS works, this is going to be very familiar).
The Azure resources that make up your cluster like virtual machines, virtual disks and load balancers are located in a different resource group, typically called “node resource group”. Let’s have a look at the load balancers there:
```
node_rg_id=$(az aro show -n $cluster_name -g $rg --query 'clusterProfile.resourceGroupId' -o tsv)
node_rg_name=$(echo $node_rg_id | cut -d/ -f 5)
az network lb list -g $node_rg_name -o table
Location     Name                    ProvisioningState    ResourceGroup    ResourceGuid
-----------  ----------------------  -------------------  ---------------  ------------------------------------
northeurope  aro2-p8bjm              Succeeded            aro2-resources   f809521e-6935-4066-8898-6ca1215e3d82
northeurope  aro2-p8bjm-internal     Succeeded            aro2-resources   45347122-98bc-49c5-9940-fc9713268bc2
northeurope  aro2-p8bjm-internal-lb  Succeeded            aro2-resources   1d809b47-5c87-4ac4-83b4-1e3e53ae8a67
northeurope  aro2-p8bjm-public-lb    Succeeded            aro2-resources   972600b0-2a96-4313-a0f2-6137c2043cfd
```
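By the way, the `cut -d/ -f 5` above works because Azure resource IDs have a fixed structure. Here is a minimal sketch with a made-up subscription ID showing which field lands where:

```shell
# Azure resource IDs follow /subscriptions/<sub>/resourceGroups/<rg>/providers/...
# Splitting on "/", field 1 is empty (leading slash), so the resource group
# name is field 5. The ID below is illustrative, not a real cluster.
sample_id="/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/aro2-resources/providers/Microsoft.RedHatOpenShift/openShiftClusters/aro2"
rg_name=$(echo "$sample_id" | cut -d/ -f 5)
echo "$rg_name"   # prints: aro2-resources
```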
As you can see there are four load balancers, but which one does what? One way of finding out is looking at the backend servers. For example, let us focus on the one called “aro2-p8bjm” and look whether it is associated to the master or to the worker nodes:
```
az network lb address-pool list --lb-name aro2-p8bjm -g $node_rg_name --query '[].backendIpConfigurations[].id' -o tsv | cut -d/ -f 9
aro2-p8bjm-worker-northeurope1-qt8l7-nic
aro2-p8bjm-worker-northeurope3-wl4vw-nic
aro2-p8bjm-worker-northeurope2-rbxzc-nic
```
The previous command is just a bit of filtering kung fu to display the names of the NICs associated with a certain load balancer, which as you can see correspond to the worker nodes. Let’s verify whether this load balancer has any frontend IP addresses associated with it:
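To demystify that filtering a bit: a backend IP configuration ID embeds the NIC name as the 9th slash-separated field, which is why `cut -d/ -f 9` extracts it. A minimal sketch with an illustrative ID:

```shell
# Backend IP configuration IDs look like:
# /subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.Network/networkInterfaces/<nic>/ipConfigurations/<cfg>
# Field 1 is empty (leading slash), so the NIC name is field 9.
ipcfg_id="/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/aro2-resources/providers/Microsoft.Network/networkInterfaces/aro2-p8bjm-worker-northeurope1-qt8l7-nic/ipConfigurations/pipConfig"
echo "$ipcfg_id" | cut -d/ -f 9   # prints: aro2-p8bjm-worker-northeurope1-qt8l7-nic
```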
```
az network lb frontend-ip list --lb-name aro2-p8bjm -g $node_rg_name --query '[].[name,publicIpAddress.id]' -o tsv
outbound                            /subscriptions/e7da9914-9b05-4891-893c-546cb7b0422e/resourceGroups/aro2-resources/providers/Microsoft.Network/publicIPAddresses/aro2-p8bjm-outbound-pip-v4
ab765e9911ee74b8d81280e60cfcea16    /subscriptions/e7da9914-9b05-4891-893c-546cb7b0422e/resourceGroups/aro2-resources/providers/Microsoft.Network/publicIPAddresses/aro2-p8bjm-ab765e9911ee74b8d81280e60cfcea16
```
Again, this command is about selecting some specific output of the frontend-ip resources: their name and their associated public IP address. By the way, the name of the first frontend is “outbound”, which seems to indicate the presence of an outbound rule:
```
az network lb outbound-rule list --lb-name aro2-p8bjm -g $node_rg_name -o table
AllocatedOutboundPorts    EnableTcpReset    IdleTimeoutInMinutes    Name          Protocol    ProvisioningState    ResourceGroup
------------------------  ----------------  ----------------------  ------------  ----------  -------------------  ---------------
1024                      False             30                      outboundrule  All         Succeeded            aro2-resources
```
Yes! As we suspected, an outbound rule, with a fixed value for the Allocated Outbound Ports of 1,024. This means that each node gets a maximum of 1,024 ephemeral TCP ports (and as many UDP ports) for outbound connectivity of the pods that it contains. Normally this value is enough for most situations, but if you happen to see connection timeouts in your egress connectivity from the pods, make sure to look at the metrics of this Load Balancer to troubleshoot SNAT port exhaustion issues.
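To get an intuition for why that fixed allocation matters, here is a back-of-the-envelope calculation. It assumes the commonly cited figure of roughly 64,000 usable SNAT ports per load balancer frontend IP, so treat the numbers as an approximation rather than a hard limit:

```shell
# Roughly 64,000 SNAT ports are available per LB frontend IP (approximate figure).
# With 1,024 ports pre-allocated per node, a single outbound IP can serve at most:
ports_per_ip=64000
ports_per_node=1024
echo $((ports_per_ip / ports_per_node))   # prints: 62  (nodes per outbound IP)
```

In other words, the fixed allocation trades per-node capacity for a predictable node count per outbound IP; more concurrent outbound connections per node would require either more outbound IPs or a smaller cluster.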
But we were looking for a public IP. Let us have a look at all the IPs in the node resource group:
```
az network public-ip list -g $node_rg_name -o table
Name                                         ResourceGroup    Location     Zones    Address        AddressVersion    AllocationMethod    IdleTimeoutInMinutes    ProvisioningState
-------------------------------------------  ---------------  -----------  -------  -------------  ----------------  ------------------  ----------------------  -------------------
aro2-p8bjm-ab765e9911ee74b8d81280e60cfcea16  aro2-resources   northeurope           51.104.142.63  IPv4              Static              4                       Succeeded
aro2-p8bjm-outbound-pip-v4                   aro2-resources   northeurope           40.127.221.40  IPv4              Static              4                       Succeeded
aro2-p8bjm-pip-v4                            aro2-resources   northeurope           40.127.221.38  IPv4              Static              4                       Succeeded
```
So there you go: 40.127.221.40 is the source IP address that our pod gets when going out to the Internet. This is actually the public IP address “aro2-p8bjm-outbound-pip-v4”, which is associated with an outbound rule in the Azure Load Balancer “aro2-p8bjm”, to which the worker nodes are connected. Easy, right?
But let’s look at the other public IP associated with the LB, the IP “aro2-p8bjm-ab765e9911ee74b8d81280e60cfcea16” with the value 51.104.142.63. If in the same Load Balancer we have a look at the load balancing rules, we will find the two rules associated with the OpenShift router, with names that are very similar to the name of that IP:
```
az network lb rule list --lb-name aro2-p8bjm -g $node_rg_name -o table
BackendPort    DisableOutboundSnat    EnableFloatingIp    EnableTcpReset    FrontendPort    IdleTimeoutInMinutes    LoadDistribution    Name                                      Protocol    ProvisioningState    ResourceGroup
-------------  ---------------------  ------------------  ----------------  --------------  ----------------------  ------------------  ----------------------------------------  ----------  -------------------  ---------------
80             False                  True                True              80              4                       Default             ab765e9911ee74b8d81280e60cfcea16-TCP-80   Tcp         Succeeded            aro2-resources
443            False                  True                True              443             4                       Default             ab765e9911ee74b8d81280e60cfcea16-TCP-443  Tcp         Succeeded            aro2-resources
```
The ports should make us suspect that these are the rules associated with the OpenShift router. Let’s verify the IP address that the router service has:
```
kubectl get svc -n openshift-ingress
NAME                      TYPE           CLUSTER-IP       EXTERNAL-IP     PORT(S)                      AGE
router-default            LoadBalancer   172.30.248.226   51.104.142.63   80:30602/TCP,443:31409/TCP   10h
router-internal-default   ClusterIP      172.30.126.81    <none>          80/TCP,443/TCP,1936/TCP      10h
```
And to be completely sure:
```
nslookup sqlapilb-project1.apps.m50kgrxk.northeurope.aroapp.io
Server:         172.24.208.1
Address:        172.24.208.1#53

Non-authoritative answer:
Name:   sqlapilb-project1.apps.m50kgrxk.northeurope.aroapp.io
Address: 51.104.142.63
```
While we are at it: the source address that the pod sees from us (the field “your_address”) is actually 10.128.2.11. Where is this coming from? The OpenShift router is a full reverse proxy, so the connection should come from one of the router pods. Let’s verify:
```
kubectl get pod -n openshift-ingress -o wide
NAME                             READY   STATUS    RESTARTS   AGE   IP            NODE                                   NOMINATED NODE   READINESS GATES
router-default-cf4d7b6d5-lcxtl   1/1     Running   0          10h   10.128.2.11   aro2-p8bjm-worker-northeurope3-wl4vw   <none>           <none>
router-default-cf4d7b6d5-xlkxx   1/1     Running   0          10h   10.131.0.18   aro2-p8bjm-worker-northeurope1-qt8l7   <none>           <none>
```
There you go: it is the IP address of one of the routers, which incidentally was kind enough to put the original client IP address (109.125.122.99 in the output above) in the X-Forwarded-For HTTP header, so that the information does not get lost.
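If your application needs to recover that client address, keep in mind that X-Forwarded-For can carry a comma-separated chain of proxies, with the original client first. A minimal extraction sketch (the header value below is illustrative, reusing the addresses from our example):

```shell
# X-Forwarded-For format: "client, proxy1, proxy2, ..."
# The leftmost entry is the original client; take field 1 and strip spaces.
xff="109.125.122.99, 10.128.2.11"
client_ip=$(echo "$xff" | cut -d, -f 1 | tr -d ' ')
echo "$client_ip"   # prints: 109.125.122.99
```

Note that anything in this header before the entry added by a proxy you trust is client-supplied and spoofable, so only the entries appended by your own infrastructure should be trusted.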
Awesome, we have seen how communication works from the pods to the Internet and from the Internet to the pods; let us now have a look at the flow between the API and the SQL Server. We saw in part 1 that the API will try to reach the server at a certain FQDN that we provided as an environment variable. We can inspect the environment variables inside our pod with the API endpoint “printenv”:
```
curl -s "http://sqlapilb-project1.apps.m50kgrxk.northeurope.aroapp.io/api/printenv" | jq -r '.SQL_SERVER_FQDN'
server.project1.svc.cluster.local
```
And we can see whether name resolution works for that FQDN:
```
curl -s "http://sqlapilb-project1.apps.m50kgrxk.northeurope.aroapp.io/api/dns?fqdn=server.project1.svc.cluster.local"
{
  "fqdn": "server.project1.svc.cluster.local",
  "ip": "172.30.72.7"
}
```
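The FQDN being resolved here is not arbitrary: Kubernetes services follow the naming scheme `<service>.<namespace>.svc.<cluster-domain>`, with `cluster.local` as the default cluster domain. A quick sketch of how the name for our service is composed:

```shell
# Kubernetes service DNS names: <service>.<namespace>.svc.<cluster-domain>
# "cluster.local" is the default cluster domain.
svc=server
ns=project1
echo "${svc}.${ns}.svc.cluster.local"   # prints: server.project1.svc.cluster.local
```

Within the same namespace, the short name `server` (or `server.project1`) would resolve too, thanks to the DNS search domains configured in each pod.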
If you scroll up a bit, you can check that 172.30.72.7 was the ClusterIP address for the “server” service, which is load balancing to our SQL Server pod. We can now try to reach the SQL Server using two other API endpoints: the first one retrieves the SQL Server version, the second one gets the source IP with which the SQL Server sees the client:
```
curl -s "http://sqlapilb-project1.apps.m50kgrxk.northeurope.aroapp.io/api/sqlversion"
{
  "sql_output": "Microsoft SQL Server 2019 (RTM-CU4) (KB4548597) - 15.0.4033.1 (X64) \n\tMar 14 2020 16:10:35 \n\tCopyright (C) 2019 Microsoft Corporation\n\tDeveloper Edition (64-bit) on Linux (Ubuntu 18.04.4 LTS) "
}
curl -s "http://sqlapilb-project1.apps.m50kgrxk.northeurope.aroapp.io/api/sqlsrcip"
{
  "sql_output": "10.131.0.40"
}
```
As you can see, the source IP visible to the SQL Server is the pod IP address for the API pod. In the next post (part 3) we will have a look at inter-project communication.