KTHW - Kubernetes and networking, the basics

July 30, 2020 - Reading time: 5 minutes

The networking model helps us deal with the following problems:

  • Communication between containers.
  • Reaching containers on different worker nodes.
  • How to reach services.
  • What IP address / port will be assigned to a container.

The Kubernetes model was designed to overcome some of the limitations of the Docker model. With Docker, each host creates a virtual network bridge that allows containers on the same host to communicate with each other and to initiate outbound connections. For containers on different hosts, the administrator needs to create a proxy on the host that exposes a port and forwards the traffic to the container.

All this proxying of services can become very complicated when dealing with multiple containers.

The Kubernetes solution is to create one virtual network for the whole cluster.

  • Each pod has a unique IP address
  • Each service has a unique IP address (on a different range than pods)

Cluster CIDR

IP range used to assign IP addresses to pods in the cluster.
The kube-proxy service running on the worker nodes specifies clusterCIDR: "10.200.0.0/16" in its configuration.
The kube-controller-manager also includes the --cluster-cidr=10.200.0.0/16 flag.
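
As a reference, this is roughly what the kube-proxy configuration file looks like in this setup (a sketch assuming the kubeconfig path and proxy mode used earlier in the series; adjust them to your deployment):

kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
clientConnection:
  # kubeconfig path is an assumption based on the earlier kube-proxy setup
  kubeconfig: "/var/lib/kube-proxy/kubeconfig"
# the proxy mode can also be ipvs
mode: "iptables"
# pods get their IP addresses from this range
clusterCIDR: "10.200.0.0/16"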

Each pod gets an IP address assigned from the cluster CIDR subnet. All the containers inside a pod will share this IP address.
This means that containers inside the same pod can communicate via localhost.
The Container Network Interface (CNI) plugin reserves a subnet of the cluster CIDR for each worker node and assigns pod IP addresses from it.
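
To illustrate the shared network namespace, here is a minimal two-container pod sketch (the names and images are placeholders for this example). The sidecar reaches the web server on localhost because both containers share the pod's IP address:

apiVersion: v1
kind: Pod
metadata:
  name: shared-netns-demo
spec:
  containers:
  - name: web
    image: nginx
    ports:
    - containerPort: 80
  - name: sidecar
    image: busybox
    # talks to the web container over localhost, not over the pod IP
    command: ["sh", "-c", "while true; do wget -qO- http://127.0.0.1:80 > /dev/null; sleep 10; done"]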

The problem with this model is that if a pod gets restarted, the CNI plugin assigns it a new IP address. To keep a stable IP address in front of a group of pods, and to allow access from outside the cluster (for example via NodePort), we use services, which get their addresses from the service cluster IP range.

Service cluster IP range

IP range used for services in the cluster. This range MUST NOT overlap with the cluster CIDR range.

Two of the parameters we set when we created the systemd unit for kube-apiserver were --service-cluster-ip-range=10.32.0.0/24 and --service-node-port-range=30000-32767.

The node port range is used when exposing services of type NodePort. In this mode, a port from that range is opened on every worker node and kube-proxy redirects the traffic from there to the service (using iptables or IPVS).
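
A NodePort service sketch, assuming a hypothetical set of pods labelled app: web; the nodePort value must fall inside the 30000-32767 range configured above:

apiVersion: v1
kind: Service
metadata:
  name: web-nodeport
spec:
  type: NodePort
  selector:
    app: web
  ports:
  - port: 80          # port on the service cluster IP
    targetPort: 80    # port the containers listen on
    nodePort: 30080   # port opened on every worker node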

The kube-controller-manager also has the --service-cluster-ip-range=10.32.0.0/24 flag.

One of the SANs on the kubernetes.pem certificate was IP Address:10.32.0.1, because the built-in kubernetes service is assigned the first IP address of the service cluster range.

Pod CIDR

The specific IP range for pods on one worker node. This range shouldn't overlap between worker nodes. For example, 10.200.1.0/24 and 10.200.2.0/24. Some network plugins handle this assignment automatically.
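
When the kube-controller-manager runs with --allocate-node-cidrs=true, the range ends up recorded on the node object itself. A trimmed sketch of what that looks like (node name and range are examples):

apiVersion: v1
kind: Node
metadata:
  name: worker-1
spec:
  # subnet of the cluster CIDR reserved for pods on this node
  podCIDR: 10.200.1.0/24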

Types of networking and requirements

  • Communication between containers in a pod (handled by the container runtime) - Docker uses a virtual bridge named docker0. Each container gets a virtual Ethernet device (veth) attached to the bridge. Containers inside a pod can also communicate via localhost or inter-process communication.
  • Communication between pods (across nodes) - Known as East-west traffic - Implemented by the CNI plugin
  • Communication between pods happens without NAT
  • Exposure of services to external clients - Known as North-south traffic
  • Service discovery and load balancing
  • Segmenting networks for pod security (see the NetworkPolicy sketch after this list)
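
Network segmentation is typically implemented with NetworkPolicy objects, enforced by the CNI plugin. A minimal sketch, assuming hypothetical pods labelled app: web and app: backend:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-to-backend
spec:
  # the policy applies to the backend pods
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  # only pods labelled app: web may reach them
  - from:
    - podSelector:
        matchLabels:
          app: web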

CNI plugins

Used to implement pod-to-pod communication (Calico, Weave, Flannel). Currently there are three types of networking:

  • L2 (switching)
  • L3 (routing)
  • Overlay (tunneling)

L2

Easiest type of communication. All the pods and nodes are in the same L2 domain, and pod-to-pod communication happens through ARP. Bridge plugin example:

{
    "name": "kubenet",
    "type": "bridge",
    "bridge": "kube-bridge",
    "isDefaultGateway": true,
    "ipam": {
        "type": "host-local",
        "subnet": "10.1.0.0/16"
    }
}

L2 is not scalable: all the nodes and pods end up in a single broadcast/ARP domain.

L3

Flannel (with its host-gw backend) is an example of an L3 plugin: pod traffic is routed between nodes instead of being switched at L2.

Overlay configuration

An overlay is a software-defined network built on top of tunnels between the nodes.
Common encapsulation mechanisms such as VXLAN and GRE are available.
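
For example, Flannel's VXLAN backend is driven by a small JSON document (a sketch assuming the cluster CIDR used earlier; in a real deployment it usually lives in the kube-flannel ConfigMap as net-conf.json):

{
    "Network": "10.200.0.0/16",
    "Backend": {
        "Type": "vxlan"
    }
}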

Services

Used to expose functionality inside and outside the cluster.
A service refers to a set of pods, selected by labels.
A service gets a stable IP address from the service cluster range; types such as NodePort or LoadBalancer also make it reachable from outside the cluster.
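
A minimal sketch of that label-based selection (the app: api label and the port numbers are placeholders):

apiVersion: v1
kind: Service
metadata:
  name: api
spec:
  # every pod carrying this label becomes an endpoint of the service
  selector:
    app: api
  ports:
  - port: 80          # stable port on the service cluster IP
    targetPort: 8080  # port the containers actually listen on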