This post is a continuation from Part 5: Virtual Node. Other posts in this series:
- Part 1: deep dive in AKS with Azure CNI in your own vnet
- Part 2: deep dive in AKS with kubenet in your own vnet, and ingress controllers
- Part 3: outbound connectivity from AKS pods
- Part 4: NSGs with Azure CNI cluster
- Part 5: Virtual Node
- Part 6 (this one): Network Policy with Azure CNI
In Part 5 we had two deployments: one in the default namespace exposed through a LoadBalancer service, and another one in a different namespace exposed through the nginx-based ingress controller provided by the HTTP application routing add-on of AKS (again, remember that this add-on is not recommended for production workloads).
Network Policy
Let’s try to reach pods in the namespace “ingress” from the pods in the namespace “default”:
$ k -n ingress get pod -o wide
NAME                                   READY   STATUS    RESTARTS   AGE     IP            NODE                       NOMINATED NODE
kuard-vnode-ingress-84d8c9586f-5kf2p   1/1     Running   0          20h     10.13.76.52   aks-nodepool1-26711606-1   <none>
kuard-vnode-ingress-84d8c9586f-fgr4j   1/1     Running   0          6m53s   10.13.100.5   virtual-node-aci-linux     <none>
$ k -n default get pod -o wide
NAME                          READY   STATUS    RESTARTS   AGE   IP            NODE                       NOMINATED NODE
kuard-vnode-bd88cbf77-n8dzg   1/1     Running   0          28h   10.13.76.36   aks-nodepool1-26711606-1   <none>
kuard-vnode-bd88cbf77-rmr66   1/1     Running   0          20h   10.13.76.26   aks-nodepool1-26711606-0   <none>
$ k exec kuard-vnode-bd88cbf77-n8dzg -- wget -qO- --timeout=3 --server-response http://10.13.76.52:8080 2>&1 | grep "HTTP\/1.1 "
  HTTP/1.1 200 OK
Now let us apply a network policy. We will use this manifest for the policy:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: drop-inter-namespace
  namespace: ingress
spec:
  podSelector:
    matchLabels:
      app: kuard-vnode-ingress
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: kube-system
    ports:
    - protocol: TCP
      port: 8080
By the way, that namespace label is not created automatically; you need to add it yourself:
$ k get ns --show-labels
NAME          STATUS   AGE     LABELS
default       Active   3d1h    <none>
ingress       Active   2d15h   <none>
kube-public   Active   3d1h    <none>
kube-system   Active   3d1h    <none>
$ k label ns/kube-system name=kube-system
namespace/kube-system labeled
$ k get ns --show-labels
NAME          STATUS   AGE     LABELS
default       Active   3d1h    <none>
ingress       Active   2d15h   <none>
kube-public   Active   3d1h    <none>
kube-system   Active   3d1h    name=kube-system
Now we can apply the policy:
$ k apply -f ./isolate_ingress.yaml
networkpolicy.networking.k8s.io/drop-inter-namespace created
$ k exec kuard-vnode-bd88cbf77-n8dzg -- wget -qO- --timeout=3 --server-response http://10.13.76.52:8080 2>&1 | grep "HTTP\/1.1 "
$
As you can see, after applying the policy traffic is not allowed from the pod in the default namespace any more. We can have a look at the networkpolicy resource for more information (the short name for “networkpolicy” is “netpol”, if you are as lazy as I am):
$ k -n ingress get netpol
NAME                   POD-SELECTOR              AGE
drop-inter-namespace   app=kuard-vnode-ingress   6m17s
$ k -n ingress describe netpol/drop-inter-namespace
Name:         drop-inter-namespace
Namespace:    ingress
Created on:   2019-04-04 14:32:25 +0200 DST
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"networking.k8s.io/v1","kind":"NetworkPolicy","metadata":{"annotations":{},"name":"drop-inter-namespace","namespace":"ingress"},"spec":{"...
Spec:
  PodSelector:     app=kuard-vnode-ingress
  Allowing ingress traffic:
    To Port: 8080/TCP
    From:
      NamespaceSelector: name=kube-system
  Allowing egress traffic:
    <none> (Selected pods are isolated for egress connectivity)
  Policy Types: Ingress
As you can see, the ingress part of the policy allows only traffic coming from pods in the kube-system namespace, which includes the nginx ingress controller.
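The selector logic behind this is plain label matching. A minimal Python sketch of how a namespaceSelector with matchLabels admits or rejects a source namespace (the helper names are illustrative, not part of any Kubernetes library; the namespace labels mirror the cluster in this post):

```python
# Minimal sketch of matchLabels evaluation for a namespaceSelector.
# Helper names are illustrative; labels mirror "kubectl get ns --show-labels".

def selector_matches(match_labels, labels):
    """A matchLabels selector matches iff every key/value pair is present."""
    return all(labels.get(k) == v for k, v in match_labels.items())

# Namespace labels after running "k label ns/kube-system name=kube-system"
namespace_labels = {
    "default": {},
    "ingress": {},
    "kube-system": {"name": "kube-system"},
}

policy_from = {"name": "kube-system"}  # the policy's namespaceSelector

def ingress_allowed(src_namespace):
    return selector_matches(policy_from, namespace_labels[src_namespace])

print(ingress_allowed("kube-system"))  # True: nginx ingress pods may connect
print(ingress_allowed("default"))      # False: the kuard-vnode pods are blocked
```

This also shows why labeling kube-system was required: with no labels, the selector can never match that namespace.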
What does the Network Policy actually do? First of all, it changes how routing inside the node works. If you remember Part 1, routing with Azure CNI was fairly simple: a layer-2 bridge, azure0, interconnected the “physical” interface of the node with the “logical” interfaces of the pods. If we look at the node now, that bridge is not there any more (only docker0, the bridge used by the container runtime):
jose@aks-nodepool1-26711606-0:~$ brctl show
bridge name   bridge id           STP enabled   interfaces
docker0       8000.0242e0150395   no
So how does routing work now with network policy? Let’s have a look at the routing table:
jose@aks-nodepool1-26711606-0:~$ route -nv
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.13.76.1      0.0.0.0         UG    0      0        0 eth0
10.13.76.0      0.0.0.0         255.255.255.0   U     0      0        0 eth0
10.13.76.6      0.0.0.0         255.255.255.255 UH    0      0        0 caliacdcffac54f
10.13.76.7      0.0.0.0         255.255.255.255 UH    0      0        0 cali56442c847c1
10.13.76.9      0.0.0.0         255.255.255.255 UH    0      0        0 cali291386878c1
10.13.76.10     0.0.0.0         255.255.255.255 UH    0      0        0 calid6217ac985b
10.13.76.12     0.0.0.0         255.255.255.255 UH    0      0        0 cali7b64e095a5a
10.13.76.13     0.0.0.0         255.255.255.255 UH    0      0        0 cali90ac3745b59
10.13.76.14     0.0.0.0         255.255.255.255 UH    0      0        0 cali4fb414dbb9c
10.13.76.15     0.0.0.0         255.255.255.255 UH    0      0        0 cali80d9382b3e8
10.13.76.17     0.0.0.0         255.255.255.255 UH    0      0        0 cali836fd394022
10.13.76.19     0.0.0.0         255.255.255.255 UH    0      0        0 calid3b581beed8
10.13.76.20     0.0.0.0         255.255.255.255 UH    0      0        0 calicabd43ccde9
10.13.76.24     0.0.0.0         255.255.255.255 UH    0      0        0 cali21ea7a92e42
10.13.76.25     0.0.0.0         255.255.255.255 UH    0      0        0 calia9fadc8e1f1
10.13.76.26     0.0.0.0         255.255.255.255 UH    0      0        0 cali17b90f98476
10.13.76.27     0.0.0.0         255.255.255.255 UH    0      0        0 calic2613659d11
10.13.76.29     0.0.0.0         255.255.255.255 UH    0      0        0 cali7d43ea164cd
10.13.76.32     0.0.0.0         255.255.255.255 UH    0      0        0 cali1d965e35ba9
10.13.76.34     0.0.0.0         255.255.255.255 UH    0      0        0 cali2753053547f
168.63.129.16   10.13.76.1      255.255.255.255 UGH   0      0        0 eth0
169.254.169.254 10.13.76.1      255.255.255.255 UGH   0      0        0 eth0
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
Oh, this is very different from the implementation without Network Policy! Essentially this is a host-route model: one /32 route per pod, each pointing to a calixxxx interface (“cali” is short for Calico, the network policy engine used by AKS at the time of this writing), as opposed to the layer-2 model of plain Azure CNI.
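The effect of those /32 host routes follows from the kernel's longest-prefix-match lookup. A simplified model in Python, using a few routes from the table above, shows why traffic to a pod IP is handed to its cali interface instead of the node's subnet interface:

```python
# Simplified longest-prefix-match over a few routes from the table above,
# modeling why traffic to a pod IP goes to its "cali" interface instead
# of being bridged at layer 2.
import ipaddress

routes = [
    ("0.0.0.0/0",      "eth0"),             # default route via 10.13.76.1
    ("10.13.76.0/24",  "eth0"),             # the node's subnet
    ("10.13.76.17/32", "cali836fd394022"),  # nginx ingress controller pod
    ("10.13.76.19/32", "calid3b581beed8"),  # kuard-vnode-ingress pod
]
routes = [(ipaddress.ip_network(net), dev) for net, dev in routes]

def lookup(dst):
    """Return the outgoing interface: the most specific matching prefix wins."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, dev) for net, dev in routes if addr in net]
    return max(matches, key=lambda m: m[0].prefixlen)[1]

print(lookup("10.13.76.19"))  # calid3b581beed8 (the /32 beats the /24)
print(lookup("10.13.76.99"))  # eth0 (no host route, falls back to the /24)
print(lookup("8.8.8.8"))      # eth0 (default route)
```

Since every pod has its own /32, every pod-to-pod packet is routed (and can therefore be filtered by iptables), which is exactly what a network policy engine needs.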
Let us verify with our nginx pod. As the following commands show, the routes from the previous table point to the host-side interfaces plumbed into the pods’ network namespaces:
jose@aks-nodepool1-26711606-0:~$ sudo docker ps | grep nginx
ef7a7f64efa1   quayio.azureedge.net/kubernetes-ingress-controller/nginx-ingress-controller   "/entrypoint.sh /ngi…"   2 days ago   Up 2 days   k8s_addon-http-application-routing-nginx-ingress-controller_addon-http-application-routing-nginx-ingress-controller-8fx6v2r_kube-system_aa952aeb-5562-11e9-b161-9a6af760136f_0
eedee9b9618b   k8s.gcr.io/pause-amd64:3.1   "/pause"   2 days ago   Up 2 days   k8s_POD_addon-http-application-routing-nginx-ingress-controller-8fx6v2r_kube-system_aa952aeb-5562-11e9-b161-9a6af760136f_0
533ab64405f4   ebe2c7c61055   "nginx -g 'daemon of…"   3 days ago   Up 3 days   k8s_azureproxy_kube-svc-redirect-rkhpb_kube-system_20ebfe15-5517-11e9-b161-9a6af760136f_0
jose@aks-nodepool1-26711606-0:~$ sudo docker inspect --format '{{ .State.Pid }}' ef7a7f64efa1
17233
jose@aks-nodepool1-26711606-0:~$ sudo nsenter -t 17233 -n ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
3: eth0@if21: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether f6:27:71:75:8d:27 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.13.76.17/32 scope global eth0
       valid_lft forever preferred_lft forever
jose@aks-nodepool1-26711606-0:~$ ip a | grep 21:
21: cali836fd394022@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
jose@aks-nodepool1-26711606-0:~$ route -nv | grep 10.13.76.17
10.13.76.17     0.0.0.0         255.255.255.255 UH    0      0        0 cali836fd394022
Wow. But actually I like this, let me explain: it is now very easy to identify the interface for a specific pod, you just need to look at the routing table. For example, since we configured a Network Policy affecting the pods in our ingress namespace, let us find the “cali” interface for one of them:
$ k -n ingress get pod -o wide
NAME                                   READY   STATUS    RESTARTS   AGE   IP            NODE                       NOMINATED NODE
kuard-vnode-ingress-7b6868dd49-drldk   1/1     Running   0          29m   10.13.76.19   aks-nodepool1-26711606-0   <none>
kuard-vnode-ingress-7b6868dd49-q9tf4   1/1     Running   0          66m   10.13.76.6    aks-nodepool1-26711606-0   <none>
$ ssh -J $publicip 10.13.76.4
Welcome to Ubuntu 16.04.5 LTS (GNU/Linux 4.15.0-1037-azure x86_64)
[...]
jose@aks-nodepool1-26711606-0:~$ route -nv | grep 10.13.76.19
10.13.76.19     0.0.0.0         255.255.255.255 UH    0      0        0 calid3b581beed8
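That pod-IP-to-interface lookup can be automated. A small Python helper (names and sample data are illustrative, based on the `route -n` output format shown above) that extracts the interface behind a pod's /32 host route:

```python
# Map a pod IP to its host-side "cali" interface by parsing "route -n"
# style output. The sample below is an excerpt of the table shown earlier.
route_n_output = """\
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.13.76.1      0.0.0.0         UG    0      0        0 eth0
10.13.76.0      0.0.0.0         255.255.255.0   U     0      0        0 eth0
10.13.76.19     0.0.0.0         255.255.255.255 UH    0      0        0 calid3b581beed8
10.13.76.17     0.0.0.0         255.255.255.255 UH    0      0        0 cali836fd394022
"""

def pod_interface(pod_ip, output):
    """Return the interface of the /32 host route for pod_ip, or None."""
    for line in output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if fields[0] == pod_ip and fields[2] == "255.255.255.255":
            return fields[-1]
    return None

print(pod_interface("10.13.76.19", route_n_output))  # calid3b581beed8
```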
Let’s look for iptables rules matching on that interface:
jose@aks-nodepool1-26711606-0:~$ sudo iptables-save | grep calid3b581beed8
:cali-fw-calid3b581beed8 - [0:0]
:cali-tw-calid3b581beed8 - [0:0]
-A cali-from-wl-dispatch-d -i calid3b581beed8 -m comment --comment "cali:sisBAcnsda1tZ1I6" -g cali-fw-calid3b581beed8
-A cali-fw-calid3b581beed8 -m comment --comment "cali:eI0X4A8BlwA0sjko" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A cali-fw-calid3b581beed8 -m comment --comment "cali:qaBPX8HNrel-_TA3" -m conntrack --ctstate INVALID -j DROP
-A cali-fw-calid3b581beed8 -m comment --comment "cali:Jq_5ut2Y35OHAh7u" -j MARK --set-xmark 0x0/0x10000
-A cali-fw-calid3b581beed8 -m comment --comment "cali:zU6Qm0dPM2KuAPnY" -j cali-pro-kns.ingress
-A cali-fw-calid3b581beed8 -m comment --comment "cali:iwM9IW2Gzp_XhmTu" -m comment --comment "Return if profile accepted" -m mark --mark 0x10000/0x10000 -j RETURN
-A cali-fw-calid3b581beed8 -m comment --comment "cali:yTxfdXovOk_HY__Q" -j cali-pro-ksa.ingress.default
-A cali-fw-calid3b581beed8 -m comment --comment "cali:xz0HLpLrFN89BguB" -m comment --comment "Return if profile accepted" -m mark --mark 0x10000/0x10000 -j RETURN
-A cali-fw-calid3b581beed8 -m comment --comment "cali:rTeHOIDtEq3OBgQ-" -m comment --comment "Drop if no profiles matched" -j DROP
-A cali-to-wl-dispatch-d -o calid3b581beed8 -m comment --comment "cali:y3O234FtkeR9ducV" -g cali-tw-calid3b581beed8
-A cali-tw-calid3b581beed8 -m comment --comment "cali:7XHdmNu1zVtvWUdb" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A cali-tw-calid3b581beed8 -m comment --comment "cali:yzbqtmTdvcucYXRk" -m conntrack --ctstate INVALID -j DROP
-A cali-tw-calid3b581beed8 -m comment --comment "cali:hHFW5V6wHLc8mcEK" -j MARK --set-xmark 0x0/0x10000
-A cali-tw-calid3b581beed8 -m comment --comment "cali:W69tA-LPeC-XSNBQ" -m comment --comment "Start of policies" -j MARK --set-xmark 0x0/0x20000
-A cali-tw-calid3b581beed8 -m comment --comment "cali:kE02v2aA1qh8HxxR" -m mark --mark 0x0/0x20000 -j cali-pi-_FSXf14qKl1ZqjRcvUXU
-A cali-tw-calid3b581beed8 -m comment --comment "cali:2HPKw4B7yxk7mqMM" -m comment --comment "Return if policy accepted" -m mark --mark 0x10000/0x10000 -j RETURN
-A cali-tw-calid3b581beed8 -m comment --comment "cali:U1Sl_cfTEErYbbZL" -m comment --comment "Drop if no policies passed packet" -m mark --mark 0x0/0x20000 -j DROP
-A cali-tw-calid3b581beed8 -m comment --comment "cali:l-FpQiC4LxIZJMMQ" -j cali-pri-kns.ingress
-A cali-tw-calid3b581beed8 -m comment --comment "cali:jptOzFu3EdNgVAjj" -m comment --comment "Return if profile accepted" -m mark --mark 0x10000/0x10000 -j RETURN
-A cali-tw-calid3b581beed8 -m comment --comment "cali:dWzr_J7EfPumyCgs" -j cali-pri-ksa.ingress.default
-A cali-tw-calid3b581beed8 -m comment --comment "cali:WxUuMRbkpCDau6q1" -m comment --comment "Return if profile accepted" -m mark --mark 0x10000/0x10000 -j RETURN
-A cali-tw-calid3b581beed8 -m comment --comment "cali:TD9zkf0Ym9GktnsU" -m comment --comment "Drop if no profiles matched" -j DROP
The most relevant rule here is the one that jumps to the firewall chain “cali-pi-_FSXf14qKl1ZqjRcvUXU”. Let’s have a look at that chain:
jose@aks-nodepool1-26711606-0:~$ sudo iptables --list-rules cali-pi-_FSXf14qKl1ZqjRcvUXU
-N cali-pi-_FSXf14qKl1ZqjRcvUXU
-A cali-pi-_FSXf14qKl1ZqjRcvUXU -p tcp -m comment --comment "cali:grRCxE4BRKEgkFyx" -m set --match-set cali40s:d0vaXDV0OjdKq6czssWe9SI src -m multiport --dports 8080 -j MARK --set-xmark 0x10000/0x10000
-A cali-pi-_FSXf14qKl1ZqjRcvUXU -m comment --comment "cali:hEdW_nNIxpnlEO6A" -m mark --mark 0x10000/0x10000 -j RETURN
Mmmh, this looks interesting: a rule matching on port 8080! It matches as well on an IP set (“cali40s:d0vaXDV0OjdKq6czssWe9SI”). We can use the ipset tool to continue our investigation:
jose@aks-nodepool1-26711606-0:~$ sudo ipset list cali40s:d0vaXDV0OjdKq6czssWe9SI
Name: cali40s:d0vaXDV0OjdKq6czssWe9SI
Type: hash:net
Revision: 6
Header: family inet hashsize 1024 maxelem 1048576
Size in memory: 1304
References: 3
Number of entries: 15
Members:
10.13.76.13
10.13.76.20
10.13.76.32
10.13.76.34
10.13.76.17
10.13.76.25
10.13.76.7
10.13.76.10
10.13.76.9
10.13.76.12
10.13.76.27
10.13.76.24
10.13.76.29
10.13.76.14
10.13.76.15
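An ipset of type hash:net is conceptually just a set of prefixes; the membership test the kernel performs is easy to reproduce (a simplified model using the members listed above, where each member is a bare address, i.e. an implicit /32):

```python
# Reproduce the hash:net ipset membership test: a source IP matches if it
# falls within any member prefix (here all members are implicit /32s).
import ipaddress

members = [ipaddress.ip_network(m) for m in [
    "10.13.76.13", "10.13.76.20", "10.13.76.32", "10.13.76.34",
    "10.13.76.17", "10.13.76.25", "10.13.76.7",  "10.13.76.10",
    "10.13.76.9",  "10.13.76.12", "10.13.76.27", "10.13.76.24",
    "10.13.76.29", "10.13.76.14", "10.13.76.15",
]]

def in_set(src):
    return any(ipaddress.ip_address(src) in net for net in members)

print(in_set("10.13.76.17"))  # True: the nginx ingress controller pod
print(in_set("10.13.76.36"))  # False: a kuard pod in the default namespace
```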
Which happen to be the pods referenced by our policy, that is, all pods running in the kube-system namespace (except the ones running in the node’s network namespace with IP address 10.13.76.4):
$ k -n kube-system get pod -o wide
NAME                                                              READY   STATUS    RESTARTS   AGE     IP            NODE                       NOMINATED NODE
aci-connector-linux-7775fbcf5f-mhrjs                              1/1     Running   5          3d      10.13.76.13   aks-nodepool1-26711606-0   <none>
addon-http-application-routing-default-http-backend-8cdc9dzfwkw   1/1     Running   0          2d16h   10.13.76.34   aks-nodepool1-26711606-0   <none>
addon-http-application-routing-external-dns-6f9bb9b4bf-v4tk6      1/1     Running   0          2d16h   10.13.76.14   aks-nodepool1-26711606-0   <none>
addon-http-application-routing-nginx-ingress-controller-8fx6v2r   1/1     Running   0          2d16h   10.13.76.17   aks-nodepool1-26711606-0   <none>
azure-cni-networkmonitor-j2vlw                                    1/1     Running   0          3d1h    10.13.76.4    aks-nodepool1-26711606-0   <none>
azure-ip-masq-agent-8f7gw                                         1/1     Running   0          3d1h    10.13.76.4    aks-nodepool1-26711606-0   <none>
calico-node-snct2                                                 0/2     Pending   0          2d1h    <none>        <none>                     <none>
calico-node-zwt7d                                                 2/2     Running   0          3d1h    10.13.76.4    aks-nodepool1-26711606-0   <none>
calico-typha-74c44c79b5-99r2z                                     1/1     Running   0          3d1h    10.13.76.4    aks-nodepool1-26711606-0   <none>
calico-typha-horizontal-autoscaler-6b69cf54f-665cn                1/1     Running   0          3d1h    10.13.76.29   aks-nodepool1-26711606-0   <none>
coredns-754f947b4-4pl2g                                           1/1     Running   0          3d1h    10.13.76.9    aks-nodepool1-26711606-0   <none>
coredns-754f947b4-qbl4d                                           1/1     Running   0          47m     10.13.76.12   aks-nodepool1-26711606-0   <none>
coredns-754f947b4-tcct9                                           1/1     Running   0          3d1h    10.13.76.7    aks-nodepool1-26711606-0   <none>
coredns-autoscaler-6fcdb7d64-2tqsn                                1/1     Running   0          3d1h    10.13.76.25   aks-nodepool1-26711606-0   <none>
heapster-5fb7488d97-5rznw                                         2/2     Running   0          2d16h   10.13.76.27   aks-nodepool1-26711606-0   <none>
kube-proxy-fr62z                                                  1/1     Running   0          2d16h   10.13.76.4    aks-nodepool1-26711606-0   <none>
kube-svc-redirect-rkhpb                                           2/2     Running   0          3d1h    10.13.76.4    aks-nodepool1-26711606-0   <none>
kubernetes-dashboard-847bb4ddc6-gzddn                             1/1     Running   0          3d1h    10.13.76.15   aks-nodepool1-26711606-0   <none>
metrics-server-7b97f9cd9-92cdw                                    1/1     Running   0          3d1h    10.13.76.32   aks-nodepool1-26711606-0   <none>
omsagent-rs-ccd94f4cf-44df9                                       1/1     Running   0          3d1h    10.13.76.24   aks-nodepool1-26711606-0   <none>
omsagent-xmd6d                                                    1/1     Running   0          3d1h    10.13.76.20   aks-nodepool1-26711606-0   <none>
tunnelfront-6fff97b995-qvzfx                                      1/1     Running   0          3d1h    10.13.76.10   aks-nodepool1-26711606-0   <none>
Let’s have a look again at the relevant rules:
-A cali-tw-calid3b581beed8 -m comment --comment "cali:kE02v2aA1qh8HxxR" -m mark --mark 0x0/0x20000 -j cali-pi-_FSXf14qKl1ZqjRcvUXU
-A cali-tw-calid3b581beed8 -m comment --comment "cali:2HPKw4B7yxk7mqMM" -m comment --comment "Return if policy accepted" -m mark --mark 0x10000/0x10000 -j RETURN
-A cali-tw-calid3b581beed8 -m comment --comment "cali:U1Sl_cfTEErYbbZL" -m comment --comment "Drop if no policies passed packet" -m mark --mark 0x0/0x20000 -j DROP
So essentially the first rule (comment “cali:kE02v2aA1qh8HxxR”) sends packets arriving on the interface connected to the pod into the policy chain, which marks with the value 0x10000 all packets coming from pods in the kube-system namespace and addressed to port 8080.
The second rule matches on that mark and returns the packet to the normal flow: the policy was evaluated and the packet was compliant.
The third rule drops the packet, but only if the 0x20000 bit is still clear, i.e. if no policy accepted or passed it. These chains are only programmed for pods selected by a network policy, so pods without any policy keep the default permit-any-any behavior.
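Putting the three rules together, the packet walk can be sketched as a small state machine (a simplified model of the chains above; real Calico chains handle more cases, and the pod list here is just an excerpt):

```python
# Simplified model of the Calico per-interface ingress chain shown above:
#  - "Start of policies" clears the 0x20000 "pass" bit
#  - the policy chain sets the 0x10000 "accept" bit for matching traffic
#  - accepted packets RETURN; packets no policy passed are dropped
ACCEPT_BIT = 0x10000
PASS_BIT   = 0x20000

KUBE_SYSTEM_PODS = {"10.13.76.17", "10.13.76.9", "10.13.76.13"}  # excerpt

def policy_chain(mark, src, dport):
    """cali-pi-...: mark packets from the ipset addressed to port 8080."""
    if src in KUBE_SYSTEM_PODS and dport == 8080:
        mark |= ACCEPT_BIT
    return mark

def to_workload_chain(src, dport):
    mark = 0
    mark &= ~PASS_BIT            # "Start of policies"
    mark = policy_chain(mark, src, dport)
    if mark & ACCEPT_BIT:
        return "ACCEPT"          # "Return if policy accepted"
    if not (mark & PASS_BIT):
        return "DROP"            # "Drop if no policies passed packet"
    return "CONTINUE"            # a Pass action would fall through

print(to_workload_chain("10.13.76.17", 8080))  # ACCEPT: nginx ingress pod
print(to_workload_chain("10.13.76.36", 8080))  # DROP: pod in default namespace
print(to_workload_chain("10.13.76.17", 9090))  # DROP: wrong port
```

This matches what we observed earlier: after applying the policy, the wget from the kuard pod in the default namespace returns nothing, while the nginx ingress controller can still reach the pod on port 8080.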
And that concludes the walkthrough of Calico network policy for AKS clusters using advanced networking (Azure CNI in your own VNet).