Setup nofile for Docker

Posted on 2019-09-06

Problem

Yesterday I ran into a weird problem: the mosquitto container failed to start with the error below. The error message indicates that the process failed to allocate memory for 1073741816 file descriptors.

1567622954: mosquitto version 1.6.4 starting
1567622954: Config loaded from /mosquitto/config/mosquitto.conf.
1567622954: Opening websockets listen socket on port 9001.
1567622954: Error: Unable to create websockets listener on port 9001.
1567623383: mosquitto version 1.6.4 starting
1567623383: Config loaded from /mosquitto/config/mosquitto.conf.
1567623383: Opening websockets listen socket on port 9001.
1567623383: libuv support not compiled in
1567623383: OOM allocating 1073741816 fds
1567623383: ZERO RANDOM FD
1567623383: Error: Unable to create websockets listener on port 9001.

Then I traced into the mosquitto dependency libwebsockets, which raised this error. Judging from its source code, the number 1073741816 is actually the container's max nofile. But the host system's max nofile is smaller than that value:

cat /proc/sys/fs/file-nr
992 0 1589038

Theoretically, each container inherits the host's max nofile by default. So where does this huge value come from?

Solution

  1. Check the max nofile value inside a container. The output shows that the default max nofile in a container is 1073741816.

    sudo docker run --rm debian sh -c "ulimit -n"
    1073741816
  2. Investigate where the value comes from. There are a couple of places to set it up.

    • vim /usr/lib/systemd/system/docker.service

      [Service]
      LimitNOFILE=infinity
      LimitNPROC=infinity
      LimitCORE=infinity
    • vim /etc/docker/daemon.json; the settings here override the defaults set in the docker service unit file

      {
        "default-ulimits": {
          "nofile": {
            "Name": "nofile",
            "Hard": 20000,
            "Soft": 20000
          }
        }
      }
    • Set it in the docker run command like this. The command-line configuration overrides all of the settings above.

      docker run --ulimit nofile=1024:1024 --rm debian sh -c "ulimit -n"
  3. The root cause is that the max nofile was set to infinity in /usr/lib/systemd/system/docker.service. In order to override the daemon config, I set the nofile limit in the docker-compose.yml file, and then the error was gone.

    ulimits:
      nofile:
        soft: 200000
        hard: 200000
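
    To confirm the new limit takes effect, you can check the ulimit inside the recreated container. A minimal sketch, assuming the compose service is named mosquitto:

    docker-compose up -d mosquitto
    docker-compose exec mosquitto sh -c "ulimit -n"   # should print 200000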

Reference

  1. https://docs.docker.com/engine/reference/commandline/run/
  2. https://sukbeta.github.io/docker-ulimit-configure/
  3. https://blog.csdn.net/signmem/article/details/51365006

Running the NCSDK examples in Docker

Posted on 2017-12-19

In Movidius’s official documentation, Docker container is not an explicitly supported platform (as of Dec, 2017). Then I decided to test whether the NCS (Neural Compute Stick) can be accessed from a container. Benefiting from this discussion thread, I was able to run the ncsdk examples in a Docker container. Here are the steps I took to make it happen.
Note: I am an Intel employee but all opinions are my own.

Prerequisites

My test environment:

  • Host: Ubuntu 16.04.2
  • Docker: docker-ce v17.09
  • Movidius compute stick attached

Build Docker image with ncsdk installed

The Dockerfile I used to install the ncsdk and make the examples is based on instructions from hughdbrown. I created a repo and uploaded the Dockerfile and its config files to it. You may check this branch, and build the Docker image yourself. Run:

git clone https://github.com/ichbinblau/ncsdk_container
cd ncsdk_container
sudo docker build -t the_image_name .

Note: The ncsdk installer currently does not honor any proxy setting options. Build the image without being behind a proxy or the build step may get stuck installing python dependencies from the internet.

Run ncsdk examples in Docker

The NCS forum provided a way to run a Docker container that can access the NCS. Host network mode is recommended to make the USB compute stick visible.

Some users have had USB related problems using the Intel NCS within a Docker environment. We have found that including the --net=host flag can help make the device manager events visible to libusb in a Docker environment.

Therefore, I used this command:

sudo docker run --rm --net=host -it --privileged -v /dev/bus/usb:/dev/bus/usb:shared -v /run/udev:/run/udev:ro -v /media/data2/NCS/:/media/data2/NCS/ the_image_name:the_image_tag /bin/bash

This leads to an interactive terminal in the container looking like this:

movidius@theresa-ubuntu:~/ncsdk$

Inside the container, you can run the examples to test access to the NCS.

cd examples/apps/hello_ncs_cpp/
make help
make hello_ncs_cpp
make run

Check the result by looking at the command outputs, such as this:

movidius@theresa-ubuntu:~/ncsdk/examples/apps/hello_ncs_cpp$ make run
making hello_ncs_cpp
g++ cpp/hello_ncs.cpp -o cpp/hello_ncs_cpp -lmvnc
Created cpp/hello_ncs_cpp executable
making run
cd cpp; ./hello_ncs_cpp; cd ..
Hello NCS! Device opened normally.
Goodbye NCS! Device Closed normally.
NCS device working.

You may continue to test the other models with different frameworks such as Caffe and TensorFlow under the examples folder.
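
For example, to run one of the Caffe examples (the exact example directory is an assumption based on the ncsdk repo layout at the time; adjust to whatever is present under examples/):

cd ~/ncsdk/examples/caffe/GoogLeNet
make run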

Comments welcome

This is a quick example of using Docker containers to access the NCS. I welcome your comments and corrections.


References:

  1. https://ncsforum.movidius.com/discussion/315/linux-virtual-environment-for-ncs
  2. https://github.com/hughdbrown/movidius/tree/master/docker

Bring up Kubernetes v1.8.4 on Ubuntu 16.04 LTS with kubeadm

Posted on 2017-12-06

Kubernetes v1.8 has been out since September 2017, and the former installation steps for v1.7.5 are not compatible with the new version. This article documents the steps I took to install a Kubernetes cluster on Ubuntu Server 16.04 LTS with kubeadm. The steps were tested by installing Kubernetes v1.8.4.

Prerequisites

Install Ubuntu Server 16.04 LTS using the HWE kernel option. The HWE kernel is version 4.10 and aims to support newer platforms, while the Ubuntu Server 16.04 LTS standard kernel is version 4.4.

Environment Preparation

Set up a proxy if you work behind a corporate network, as kubeadm uses the system proxy to download components. Put the following settings in the $HOME/.bashrc file. Be mindful to put the master host's IP address in the no_proxy list.

export http_proxy=http://proxy_ip:proxy_port/
export https_proxy=http://proxy_ip:proxy_port/
export ftp_proxy=http://proxy_ip:proxy_port/
export no_proxy=192.168.1.102,192.168.1.103,192.168.1.101,192.168.1.104,127.0.0.1,localhost,loadbalancer,gateway1,gateway2,gateway3

And check your proxy settings:

env | grep proxy

Next perform a software update for all packages before you continue with the cluster installation.

Install latest OS updates

sudo apt update && sudo apt upgrade

Disable Swap

Since v1.8, swap must be turned off; otherwise the kubelet service cannot start. Disable system swap by running this command:

sudo swapoff -a
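
Note that swapoff -a only disables swap until the next reboot. To make the change persistent, also comment out the swap entry in /etc/fstab; a sketch of a one-liner that does this (check your fstab afterwards):

sudo sed -i '/ swap / s/^/#/' /etc/fstab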

System Configuration

We assume that four servers are ready: one as the k8s master and the other three as k8s nodes.
Configure local DNS in /etc/hosts by mapping the IP addresses to host names.

192.168.1.102 loadbalancer
192.168.1.101 gateway1
192.168.1.103 gateway2
192.168.1.104 gateway3

Install the Kubernetes packages on each of your hosts. See the note after the command for adding the apt repository and pinning version 1.8.4.

sudo apt install -y kubeadm kubelet kubectl
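
The command above assumes the Kubernetes apt repository is already configured and pulls the latest packages. A sketch of adding the repository and pinning v1.8.4 (the -00 package suffix is an assumption; check apt-cache madison kubeadm for the exact version string):

curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt update
sudo apt install -y kubeadm=1.8.4-00 kubelet=1.8.4-00 kubectl=1.8.4-00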

Install Docker version 17.09 on each of your hosts

sudo apt-get update
sudo apt-get install -y \
apt-transport-https \
ca-certificates \
curl \
software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/$(. /etc/os-release; echo "$ID") \
$(lsb_release -cs) \
stable"
sudo apt-get update && sudo apt-get install -y docker-ce=$(apt-cache madison docker-ce | grep 17.09 | head -1 | awk '{print $3}')
sudo systemctl enable docker

And ensure that the service is up and running:

sudo systemctl status docker

Note: Make sure that the cgroup driver used by the kubelet is the same as the one used by Docker. To ensure compatibility, you can either update the Docker settings (as the official document recommends) or update the kubelet setting by adding the option below to /etc/systemd/system/kubelet.service.d/10-kubeadm.conf:

Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs"

After updating the service file (for either the kubelet or the docker service), remember to reload the configuration and restart the services:

sudo systemctl daemon-reload
sudo systemctl restart docker
sudo systemctl restart kubelet

Other than that, it is highly recommended to use the overlay2 storage driver, which is faster and more robust than the other Docker storage drivers. You may follow the instructions here to complete the installation.
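
One way to select the storage driver is again /etc/docker/daemon.json; a sketch (the backing filesystem must support overlay2, and existing images are not migrated):

{
  "storage-driver": "overlay2"
}

Restart the docker service after changing it.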

Initialize Kubernetes Master

On the master node (load balancer), if you run as root, do

kubeadm init --apiserver-advertise-address=192.168.1.102 --pod-network-cidr=10.244.0.0/16 --skip-preflight-checks

If you run as a normal user, do

sudo -E kubeadm init --apiserver-advertise-address=192.168.1.102 --pod-network-cidr=10.244.0.0/16 --skip-preflight-checks

If --apiserver-advertise-address is not specified, kubeadm auto-detects the network interface to advertise the master on. It is better to set the argument if there is more than one network interface.
--pod-network-cidr specifies the virtual IP range for the third-party network plugin.
Set --kubernetes-version if you want to use a specific Kubernetes version.
To start using your cluster, you need to run (as a regular user):

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

By default, your cluster will not schedule pods on the master for security reasons. Remove the master taint if you also want to schedule pods on your master:

kubectl taint nodes --all node-role.kubernetes.io/master-

Install WeaveNet Pod Network Plugin

A pod network add-on must be installed so that pods can communicate with each other.
Set /proc/sys/net/bridge/bridge-nf-call-iptables to 1 by adding net.bridge.bridge-nf-call-iptables=1 to /etc/sysctl.d/k8s.conf, so that bridged IPv4 traffic is passed to iptables' chains.
Then run the following command to make it take effect:

sudo sysctl -p /etc/sysctl.d/k8s.conf

Then run:

kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"

Check the status of the Weave pods and make sure they are in the Running state:

kubectl get pod --all-namespaces -o wide

Join Nodes to Cluster

Get the cluster token on master:

sudo kubeadm token list

Since Kubernetes v1.8, the token is only valid for 24 hours. You may generate another token if the previous one has expired (see the note after the command below).

sudo kubeadm token generate
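
Note that kubeadm token generate only prints a random token string without registering it with the cluster; to add a new usable token, pass it to kubeadm token create, roughly like this:

sudo kubeadm token create <token-from-generate>   # or simply: sudo kubeadm token create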

Run the commands below on each of the nodes:

sudo kubeadm join --token 15491e.0c0c9c99dfbbe690 192.168.1.102:6443 --discovery-token-ca-cert-hash sha256:82a08ef9c830f240e588a26a8ff0a311e6fe3127c1ee4c5fc019f1369007c0b7 --skip-preflight-checks

Replace the token 15491e.0c0c9c99dfbbe690 and the sha256 hash with the actual token and hash from the kubeadm init output.
Public key validation can be skipped by passing the --discovery-token-unsafe-skip-ca-verification flag instead of --discovery-token-ca-cert-hash, but that weakens security.
Then check whether the nodes joined the cluster successfully:

kubectl get nodes

Install Dashboard Add-on

Create the dashboard pod :

kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml

To start using the Dashboard, run the following command:

kubectl proxy --address="<ip-addr-listen-on>" -p <listening-port>
eg. kubectl proxy --address="192.168.1.102" -p 8001

Then access the dashboard at http://192.168.1.102:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/. Replace the IP address with the actual IP you are using.
There are a couple of ways to log in here. For development purposes, you may simply grant full admin privileges to the Dashboard's Service Account by creating a ClusterRoleBinding like the sketch below. Copy the contents, save them as a file named dashboard-admin.yaml, and deploy it with kubectl create -f dashboard-admin.yaml. Afterwards you can use the Skip option on the login page to access the Dashboard.
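
A sketch of such a binding, assuming the Dashboard's ServiceAccount is named kubernetes-dashboard in the kube-system namespace (which is what the recommended manifest creates):

apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard-admin
  labels:
    k8s-app: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: kubernetes-dashboard
  namespace: kube-system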

Tear Down

First, drain the node from the master (or wherever the credential is configured). This does a graceful termination and marks the node as unschedulable.

kubectl drain <node name> --delete-local-data --force --ignore-daemonsets
kubectl delete node <node name>

Then, on the node to be removed, remove all the configuration files and settings:

sudo kubeadm reset

Diagnose

  • Check service and pod status. kube-system is the default namespace for system-level pods. You may also pass other specific namespaces, or use --all-namespaces to check all namespaces.

    kubectl get po,svc -n kube-system

    The output looks like this:

    NAME                                      READY     STATUS    RESTARTS   AGE
    po/etcd-loadbalancer                      1/1       Running   0          9m
    po/kube-apiserver-loadbalancer            1/1       Running   0          9m
    po/kube-controller-manager-loadbalancer   1/1       Running   0          10m
    po/kube-dns-545bc4bfd4-2qvkk              3/3       Running   0          10m
    po/kube-proxy-6rk26                       1/1       Running   0          10m
    po/kube-proxy-qvhmw                       1/1       Running   0          1m
    po/kube-scheduler-loadbalancer            1/1       Running   0          9m
    po/kubernetes-dashboard-7486b894c6-dw8zz  1/1       Running   0          23s
    po/weave-net-s59fw                        2/2       Running   0          3m
    po/weave-net-zsfls                        2/2       Running   1          1m

    NAME                        TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)         AGE
    svc/kube-dns                ClusterIP   10.96.0.10     <none>        53/UDP,53/TCP   10m
    svc/kubernetes-dashboard    ClusterIP   10.110.76.10   <none>        443/TCP         23s
  • Check pod logs. Get the pod name from the command above (e.g. kubernetes-dashboard-7486b894c6-dw8zz). Use -c <container_name> if there is more than one container running in the pod.

    kubectl logs <pod_name> -f -n kube-system
  • Run commands in the container. Use -c <container_name> if there is more than one container running in the pod.
    Run a single command:

    kubectl exec <pod_name> -n <namespace> <command_to_run>

    Enter the container’s shell:

    kubectl exec -it <pod_name> -n <namespace> -- /bin/bash
  • Check Docker logs

    sudo journalctl -u docker.service -f
  • Check kubelet logs

    sudo journalctl -u kubelet.service -f

References:

  1. https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/
  2. https://kubernetes.io/docs/admin/kubeadm/

Run Applications in Kubernetes Cluster

Posted on 2017-09-21

Prerequisites

We assume that the Kubernetes cluster is up and running with at least one master and one node, and that credentials have been properly configured so that kubectl can be used on at least one of the hosts.
Download all the yaml files from the git repo and switch to the directory that contains the configuration files.

$ git clone https://github.com/ichbinblau/SmartHome-Demo.git
$ cd SmartHome-Demo
$ git checkout k8s
$ cd smarthome-web-portal/tools/yamls

In order to speed up downloading the Docker images, we set up a local Docker registry on the master host 192.168.1.102. Here is the command to start a local registry. Create a local directory to keep the Docker images persistent.

$ sudo mkdir -p /var/lib/registry
$ sudo docker run -d \
-p 5000:5000 \
--restart=always \
--name registry \
-v /var/lib/registry:/var/lib/registry \
registry:2

You can check the images in the registry by visiting http://192.168.1.102:5000/v2/_catalog

Create a New Namespace

There are a couple of pre-defined namespaces for different purposes (default, kube-system, kube-public). The namespace.yaml defines a new namespace named iot2cloud. Here is the namespace.yaml file:

$ cat namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: iot2cloud

We create a new namespace for our applications.

$ kubectl create -f namespace.yaml

You can check the namespaces:

$ kubectl get namespaces
NAME          STATUS    AGE
default       Active    1d
iot2cloud     Active    34s
kube-public   Active    1d
kube-system   Active    1d

Voilà, we have a new namespace. To simplify the commands below, we create an alias that bakes in the namespace. Alternatively, you can make it persistent by adding it to your $HOME/.bashrc file.

$ alias kub='kubectl --namespace iot2cloud'

Attach Labels to Nodes

In our scenario, some of the hosts are resource-constrained. We would like to assign all the cloud applications to the master and the rest to the nodes.
nodeSelector helps us make this happen. Add labels to categorize the nodes and check the results (see the sketch after the commands for how the label is consumed):

$ kubectl label nodes gateway1 platform=minnowboard
$ kubectl label nodes gateway2 platform=minnowboard
$ kubectl label nodes gateway3 platform=minnowboard
$ kubectl label nodes loadbalancer platform=nuc
$ kubectl get nodes --show-labels
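
In the deployment specs, the label is then referenced through a nodeSelector. A sketch of the relevant fragment (the actual yaml files in the repo may differ):

spec:
  template:
    spec:
      nodeSelector:
        platform: nuc    # schedule these pods only on the nuc-labeled master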

Set up MariaDB

Create secrets and configMaps

Kubernetes introduces the secret concept to hold sensitive information such as passwords, tokens, key pairs, etc. We keep our MariaDB password in a secret, and the MariaDB pod will use the password from the secret for the root user.

$ MYSQL_ROOT_PASS=your_password_to_use
$ echo -n $MYSQL_ROOT_PASS > mysql-root-pass.secret
$ kub create secret generic mysql-root-pass --from-file=mysql-root-pass.secret

You can check that the secret was created like this:

$ kub get secrets
NAME              TYPE      DATA      AGE
mysql-root-pass   Opaque    1         2m
$ kub describe secrets/mysql-root-pass
Name:         mysql-root-pass
Namespace:    iot2cloud
Labels:       <none>
Annotations:  <none>
Type:         Opaque

Data
====
mysql-root-pass.secret:  8 bytes
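
A sketch of how a deployment can consume this secret as the MariaDB root password (the actual mariadb-master.yaml in the repo may differ; the key name comes from the file name used above):

env:
- name: MYSQL_ROOT_PASSWORD
  valueFrom:
    secretKeyRef:
      name: mysql-root-pass
      key: mysql-root-pass.secret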

Create MariaDB Pod and Service

Remember to update image: 192.168.1.102:5000/mariadb:latest in the mariadb-master-service.yaml file to the actual repository you use. Make the same change to all the other yaml files.

$ kub create -f mariadb-master-service.yaml # service definition
$ kub create -f mariadb-master.yaml # deployment definition

Check the pods and service status:

$ kub get po,svc -o wide

Optionally, if you are going to run MariaDB in master-slave mode, ensure that there is more than one node labeled nuc in the cluster, and run:

$ kub create -f mariadb-slave-service.yaml
$ kub create -f mariadb-slave.yaml

Run RabbitMQ Service

The RabbitMQ service exposes port 5672 to serve requests.

$ kub create -f rabbitmq-service.yaml
$ kub create -f rabbitmq.yaml

Create IoT Rest API Service and Gateway Server

There are two containers running in one pod: one for the iot-rest-api-service and the other for the gateway server.
The replica count is set to 3, meaning 3 pods will be created, spread evenly across the gateway1 to gateway3 hosts.

$ kub create -f smarthome-gateway-service.yaml
$ kub create -f smarthome-gateway.yaml

You can get the NodePort of the REST API service:

$ kub describe svc/gateway | grep NodePort
Type: NodePort
NodePort: <unset> 32626/TCP

And browse the REST service via http://192.168.1.102:32626/api/oic/res

Run Home Dashboard

Two pods will be created for the home dashboard, and the database will be initialized after running the commands below:

$ kub create -f home-portal-service.yaml
$ kub create -f home-portal.yaml

Get the NodePort:

$ kub describe svc/home-portal | grep NodePort
Type: NodePort
NodePort: home 31328/TCP

Then you are able to log in to the home portal via http://192.168.1.102:31328/ (use the NodePort you got from the kubectl command instead).

Start Admin Portal

Run the following commands to start the admin portal. The admin portal can only run as a single pod because the trained models are stored in the local file system and are not yet shared between pods.
Update the http_proxy and https_proxy env vars to empty strings in admin-portal.yaml if they are not required. Then run:

$ kub create -f admin-portal-service.yaml
$ kub create -f admin-portal.yaml

Get the NodePort and visit the admin portal. Next, point the demo gateway to http://gateway.iot2cloud:8000/.

Run Celery Worker and Trigger Tasks

The Celery worker is simply a worker process, so no service definition is required. There are two containers in the pod: one for long-running tasks and the other for periodic tasks.
Run the command below to initialize the worker:

$ kub create -f celery-worker.yaml

And run this command to trigger the tasks (I will try to automate this manual step later):

$ kub exec $(kub get po -l app=celery-worker -o name | cut -d'/' -f2) -c long-task -- python CeleryTask/tasks.py

BKMs

Restart pods: in some cases, pods enter an error state or fail to restart, and I need to restart the pod to recover the application. There is no direct command to restart a pod, but you can do it like this:

$ kub delete pods <pod_to_delete>
$ kub create -f <yml_file_describing_pod>

If you cannot find the yaml file immediately, you can alternatively run:

$ kub get pod <pod_to_delete> -o yaml | kubectl replace --force -f -

Pod log outputs

$ kub logs -f <pod_name> -c <container_name>


References:

  1. http://agiletesting.blogspot.com/2016/11/running-application-using-kubernetes-on.html
  2. https://kubernetes.io/docs/concepts/configuration/secret/#overview-of-secrets
  3. https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#nodeselector

Setup Kubernetes Cluster on Centos 7.3 with Kubeadm

Posted on 2017-09-20

This article documents the steps I took to install a Kubernetes cluster on CentOS 7.3 with kubeadm.

Prerequisites

System Configuration

We assume that four servers are ready: one as the k8s master and the other three as k8s nodes.
Configure local DNS in /etc/hosts by mapping the IP addresses to host names.

192.168.1.102 loadbalancer
192.168.1.101 gateway1
192.168.1.103 gateway2
192.168.1.104 gateway3

Environment Preparation

We should have at least two servers with CentOS 7.3 pre-installed, kept in the same subnet.
Optionally, set up a proxy if you work behind a corporate network, as kubeadm uses the system proxy to download components. Put the following settings in the $HOME/.bashrc file. Be mindful to put the master host's IP address in the no_proxy list.

export http_proxy=http://proxy_ip:proxy_port/
export https_proxy=http://proxy_ip:proxy_port/
export ftp_proxy=http://proxy_ip:proxy_port/
export no_proxy=192.168.1.102,192.168.1.103,192.168.1.101,192.168.1.104,127.0.0.1,localhost,loadbalancer,gateway1,gateway2,gateway3

And check your proxy settings:

$ env | grep proxy

Install kubeadm and kubelet on each of your hosts

Add the k8s repo to the yum source list. We assume you run the command as root. For non-root users, please wrap the command with sudo bash -c '<command_to_run>'.

$ cat << EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
sslverify=0
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg
https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF

Install kubeadm and kubelet on each host.

$ sudo setenforce 0
$ sudo yum install -y kubelet kubeadm
$ sudo systemctl enable kubelet
$ sudo systemctl start kubelet

By default, the command above installs the latest kubelet and kubeadm. If you want to install a specific version, you can list the available versions:

$ sudo yum list kubeadm --showduplicates | sort -r
kubeadm.x86_64   1.7.5-0   kubernetes
kubeadm.x86_64   1.7.4-0   kubernetes

And install the specific version (e.g. v1.7.5):

$ sudo yum install kubeadm-1.7.5-0

Install Docker on each of your hosts

For the time being, Docker version 1.12.x is still the preferred and verified version that Kubernetes officially supports according to its docs. But this thread says that support for Docker 1.13 will be added very soon.
We will install Docker 1.12 for now. Use the following command to set up the repository:

$ cat << EOF > /etc/yum.repos.d/docker.repo
[dockerrepo]
name=Docker Repository
baseurl=https://yum.dockerproject.org/repo/main/centos/7
enabled=1
gpgcheck=1
gpgkey=https://yum.dockerproject.org/gpg
EOF

Build the yum cache and check the available Docker 1.12.x versions:

$ sudo yum makecache
$ sudo yum list docker-engine --showduplicates |sort -r
docker-engine.x86_64 1.12.6-1.el7.centos dockerrepo

Install and start the Docker service:

$ sudo yum install docker-engine-1.12.6 -y
$ sudo systemctl enable docker
$ sudo systemctl start docker

Verify that your Docker cgroup driver matches the kubelet config:

$ sudo docker info |grep -i cgroup
$ sudo cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

The default cgroup driver in the kubelet config is systemd. If Docker's cgroup driver is not systemd but cgroupfs, update the cgroup driver in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf to KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs.
Further, if your Docker runs behind a corporate network, set up the proxy in the Docker config:

$ cat << EOF >/etc/systemd/system/docker.service.d/http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://proxy_ip:proxy_port/" "NO_PROXY=localhost,127.0.0.1,192.168.1.102,192.168.1.103,192.168.1.101,192.168.1.104,loadbalancer,gateway1,gateway2,gateway3"
EOF

Then reload the config:

$ sudo systemctl daemon-reload
$ sudo systemctl restart kubelet
$ sudo systemctl restart docker # restart docker if you add proxy config

Initialize Master

On the master node (load balancer), if you run as root, do

$ kubeadm init --apiserver-advertise-address=192.168.1.102 --pod-network-cidr=10.244.0.0/16

If you run as a normal user, do

$ sudo -E bash -c "kubeadm init --apiserver-advertise-address=192.168.1.102 --pod-network-cidr=10.244.0.0/16"

If --apiserver-advertise-address is not specified, kubeadm auto-detects the network interface to advertise the master on. It is better to set the argument if there is more than one network interface.
--pod-network-cidr specifies the virtual IP range for the third-party network plugin. We use flannel as our network plugin here.
Set --kubernetes-version if you want to use a specific Kubernetes version.
To start using your cluster, you need to run (as a regular user):

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

By default, your cluster will not schedule pods on the master for security reasons. Remove the master taint if you also want to schedule pods on your master:

$ kubectl taint nodes --all node-role.kubernetes.io/master-

Install Flannel Pod Network Plugin

A pod network add-on must be installed so that pods can communicate with each other. Run:

kubectl create -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel-rbac.yml
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

If there is more than one NIC, refer to flannel issue 39701.
Next, configure Docker with the flannel IP range and settings:

$ sudo mkdir -p /etc/systemd/system/docker.service.d
$ cat << EOF | sudo tee /etc/systemd/system/docker.service.d/flannel.conf
[Service]
EnvironmentFile=-/run/flannel/docker
EOF
$ cat << EOF | sudo tee /run/flannel/docker
DOCKER_OPT_BIP="--bip=10.244.0.1/24"
DOCKER_OPT_IPMASQ="--ip-masq=false"
DOCKER_OPT_MTU="--mtu=1450"
DOCKER_NETWORK_OPTIONS=" --bip=10.244.0.1/24 --ip-masq=false --mtu=1450"
EOF

Reload the config:

$ sudo systemctl daemon-reload
$ sudo systemctl restart docker

Check the status of the flannel pods and make sure they are in the Running state:

$ kubectl get pod --all-namespaces -o wide

Join Nodes to Cluster

Get the cluster token on master:

$ sudo kubeadm token list

Run the commands below on each of the nodes:

$ kubeadm join --token e5e6d6.6710059ca7130394 192.168.1.102:6443

Replace e5e6d6.6710059ca7130394 with the token you got from the kubeadm command.
Then check whether the nodes joined the cluster successfully:

$ kubectl get nodes

Install Dashboard Add-on

Create the dashboard pod :

$ kubectl create -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/kubernetes-dashboard.yaml

Since Kubernetes v1.6, the API server uses the RBAC strategy. The kubernetes-dashboard.yaml does not define a valid ServiceAccount binding, so create a file dashboard-rbac.yaml that binds the account system:serviceaccount:kube-system:default to the ClusterRole cluster-admin:

$ cat << EOF > dashboard-rbac.yaml
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: dashboard-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: default
  namespace: kube-system
EOF

Apply the RBAC rules and check the pod state with the kubectl get po --all-namespaces command after that:

$ kubectl create -f dashboard-rbac.yaml

Configure the kubernetes-dashboard service to use NodePort:

$ cat << EOF > dashboard-svc.yaml
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 9090
  selector:
    k8s-app: kubernetes-dashboard
EOF
$ kubectl apply -f dashboard-svc.yaml

Then get the NodePort.

$ kubectl get svc kubernetes-dashboard -n kube-system
NAME                   CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
kubernetes-dashboard   10.102.129.68   <nodes>       80:32202/TCP   12h

32202 in the output is the NodePort. You can now visit the dashboard at http://<master-ip>:<node_port>. In our case, the URL is http://192.168.1.102:32202.

Tear Down

First, drain the node from the master (or wherever the credential is configured). This does a graceful termination and marks the node as unschedulable.

$ kubectl drain <node name> --delete-local-data --force --ignore-daemonsets
$ kubectl delete node <node name>

Then, on the node to be removed, remove all the configuration files and settings:

$ sudo kubeadm reset

Diagnose

  • Check service and pod status. kube-system is the default namespace for system-level pods. You may also pass other specific namespaces, or use --all-namespaces to check all namespaces.

    $ kubectl get po,svc -n kube-system

    The output looks like this:

    NAME                                      READY     STATUS    RESTARTS   AGE
    po/etcd-loadbalancer                      1/1       Running   0          1d
    po/kube-apiserver-loadbalancer            1/1       Running   0          1d
    po/kube-controller-manager-loadbalancer   1/1       Running   0          1d
    po/kube-dns-2425271678-zj91n              3/3       Running   0          1d
    po/kube-flannel-ds-w9dvz                  2/2       Running   0          1d
    po/kube-flannel-ds-zn6c4                  2/2       Running   1          1d
    po/kube-proxy-m6nvj                       1/1       Running   0          1d
    po/kube-proxy-w92kx                       1/1       Running   0          1d
    po/kube-scheduler-loadbalancer            1/1       Running   0          1d
    po/kubernetes-dashboard-3313488171-tkdtz  1/1       Running   0          1d

    NAME                        CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE
    svc/kube-dns                10.96.0.10      <none>        53/UDP,53/TCP   1d
    svc/kubernetes-dashboard    10.102.129.68   <nodes>       80:32202/TCP    1d
  • Check pod logs. Get the pod name from the command above (e.g. kubernetes-dashboard-3313488171-tkdtz). Use -c <container_name> if there is more than one container running in the pod.

    $ kubectl logs <pod_name> -f -n kube-system
  • Run commands in the container. Use -c <container_name> if there is more than one container running in the pod.
    Run a single command:

    $ kubectl exec <pod_name> -n <namespace> <command_to_run>

    Enter the container’s shell:

    $ kubectl exec -it <pod_name> -n <namespace> -- /bin/bash
  • Check Docker logs

    $ sudo journalctl -u docker.service -f
  • Check kubelet logs

    $ sudo journalctl -u kubelet.service -f

References:

  1. https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/
  2. https://kubernetes.io/docs/admin/kubeadm/
  3. https://github.com/coreos/flannel