An Introduction to Kubernetes

Kubernetes is a highly popular open-source container management system. The goal of the Kubernetes project is to make management of containers across multiple nodes as simple as managing containers on a single system. To accomplish this, it offers quite a few unique features such as Traffic Load Balancing, Self-Healing (automatic restarts), Scheduling, Scaling, and Rolling Updates.
In today’s article, we’ll learn about Kubernetes by deploying a simple web application across a multinode Kubernetes cluster. Before we can start deploying containers, however, we first need to set up a cluster.

Standing Up a Kubernetes Cluster

The official Getting Started guide walks you through deploying a Kubernetes cluster on Google’s Container Engine platform. While this is a quick and easy method to get up and running, for this article, we’ll be deploying Kubernetes with an alternative provider, specifically via Vagrant. We’re using Vagrant for a few reasons, but primarily because it shows how to deploy Kubernetes on a generic platform rather than GCE.
Kubernetes comes with several install scripts for various platforms. The majority of these scripts use Bash environmental variables to change what and how Kubernetes is installed.
For this installation, we’ll define two variables:
  • NUM_NODES – controls the number of nodes to deploy
  • KUBERNETES_PROVIDER – specifies the platform on which to install Kubernetes
Let’s tell the installation scripts to use the vagrant platform and to provision a two-node cluster.
$ export NUM_NODES=2
$ export KUBERNETES_PROVIDER=vagrant
If we wanted to deploy Kubernetes on AWS, for example, we could do so by changing the KUBERNETES_PROVIDER environmental variable to aws.
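Scripts driven by environment variables usually read each variable and fall back to a default when it is unset. The sketch below shows that general Bash pattern only. The `describe_cluster` function and both default values are hypothetical and are not taken from the Kubernetes install scripts.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of how a script can consume environment variables.
# The defaults are illustrative assumptions, not Kubernetes' real defaults.

describe_cluster() {
  # ${VAR:-default} expands to the default when VAR is unset or empty.
  local nodes="${NUM_NODES:-1}"
  local provider="${KUBERNETES_PROVIDER:-example-provider}"
  echo "provider=${provider} nodes=${nodes}"
}

export NUM_NODES=2
export KUBERNETES_PROVIDER=vagrant
describe_cluster
```

Because the variables are exported, any child process started from the same shell, including an install script, sees the same values.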

Using the Kubernetes install script

While there are many walk-throughs on how to install Kubernetes, I have found that the easiest method is to use the Kubernetes install script available at https://get.k8s.io.
This script is essentially a wrapper to the installation scripts distributed with Kubernetes, which makes the process quite a bit easier. One of my favorite things about this script is that it will also download Kubernetes for you.
To start using this script, we’ll need to download it; we can do this with a quick curl command. Once we’ve downloaded the script, we can execute it by running the bash command followed by the script name.
$ curl https://get.k8s.io > kubernetes_install.sh
$ bash kubernetes_install.sh
Downloading kubernetes release v1.2.2 to /var/tmp/kubernetes.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  426M  100  426M    0     0  12.2M      0  0:00:34  0:00:34 --:--:-- 6007k
Unpacking kubernetes release v1.2.2
Creating a kubernetes on vagrant...
<output truncated>
Kubernetes master is running at https://10.245.1.2
Heapster is running at https://10.245.1.2/api/v1/proxy/namespaces/kube-system/services/heapster
KubeDNS is running at https://10.245.1.2/api/v1/proxy/namespaces/kube-system/services/kube-dns
kubernetes-dashboard is running at https://10.245.1.2/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard
Grafana is running at https://10.245.1.2/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana
InfluxDB is running at https://10.245.1.2/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb
Installation successful!
After the script completes execution, we have a running Kubernetes cluster. However, we still have one more step before we can start to interact with this Kubernetes cluster; we need the kubectl command to be installed.

Setting up kubectl

The kubectl command exists for both Linux and Mac OS X. Since I’m running this installation from my MacBook, I’ll be installing the Mac OS X version of kubectl. This means I’ll be running the cluster via Vagrant but interacting with that cluster from my MacBook.
To download the kubectl command, we will once again use curl.
$ curl https://storage.googleapis.com/kubernetes-release/release/v1.2.0/bin/darwin/amd64/kubectl > kubectl
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 38.7M  100 38.7M    0     0  10.4M      0  0:00:03  0:00:03 --:--:-- 10.4M
$ chmod 750 kubectl
After the kubectl binary is downloaded and permissions are changed to allow execution, the kubectl command is almost ready. One more step is required before we can start interacting with our Kubernetes cluster. That step is to configure the kubectl command.
$ export KUBECONFIG=~/.kube/config
As with most Kubernetes scripts, the kubectl command’s configuration is driven by environmental variables. When we executed the cluster installation script above, that script created a .kube configuration directory in my user’s home directory. Within that directory, it also created a file named config. This file is used to store information about the Kubernetes cluster that was created.
By setting the KUBECONFIG environmental variable to ~/.kube/config, we are defining that the kubectl command should reference this configuration file. Let’s take a quick look at that file to get a better understanding of what is being set.
$ cat ~/.kube/config
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: <SSLKEYDATA>
    server: https://10.245.1.2
  name: vagrant
contexts:
- context:
    cluster: vagrant
    user: vagrant
  name: vagrant
current-context: vagrant
kind: Config
preferences: {}
users:
- name: vagrant
  user:
    client-certificate-data: <SSLKEYDATA>
    client-key-data: <SSLKEYDATA>
    token: sometoken
- name: vagrant-basic-auth
  user:
    password: somepassword
    username: admin
The .kube/config file sets two main pieces of information:
  • Location of the cluster
  • Authentication data for communicating with that cluster
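To make those two pieces of information concrete, here is a minimal sketch that pulls the cluster location and the active context out of a config file with awk. The `cluster_server` and `current_context` helpers are hypothetical names, and the sample data is a trimmed copy of the file shown above so the snippet runs without a cluster.

```shell
# Sample data: a trimmed copy of the config file shown above.
sample_config='apiVersion: v1
clusters:
- cluster:
    server: https://10.245.1.2
  name: vagrant
current-context: vagrant'

# Print the API server URL(s) found in a kubeconfig read from stdin.
cluster_server() {
  awk '$1 == "server:" { print $2 }'
}

# Print the context that is marked as current.
current_context() {
  awk '$1 == "current-context:" { print $2 }'
}

printf '%s\n' "$sample_config" | cluster_server
printf '%s\n' "$sample_config" | current_context
```

Against a real file, the same helpers could be fed with `cluster_server < ~/.kube/config`. This is a line-oriented shortcut rather than a YAML parser, so it assumes the simple layout shown here.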
With the .kube/config file defined, let’s attempt to execute a kubectl command against our Kubernetes cluster to verify everything is working.
$ ./kubectl get nodes
NAME                STATUS    AGE
kubernetes-node-1   Ready     31m
kubernetes-node-2   Ready     24m
The output of the ./kubectl get nodes command shows us that we were able to connect to our Kubernetes cluster and display the status of our two nodes kubernetes-node-1 and kubernetes-node-2. With this, we can move on as our installation is complete.
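If you want to script this check rather than read the table by eye, one option is to count the rows whose STATUS column is Ready. The sketch below assumes the three-column layout shown above (newer kubectl versions may print more columns), and `count_ready_nodes` is a hypothetical helper. It runs against a copy of the sample output so it needs no cluster.

```shell
# Sample data copied from the kubectl get nodes output above.
sample_nodes='NAME                STATUS    AGE
kubernetes-node-1   Ready     31m
kubernetes-node-2   Ready     24m'

# Count rows whose STATUS column is exactly "Ready" (header row skipped).
count_ready_nodes() {
  awk 'NR > 1 && $2 == "Ready" { n++ } END { print n + 0 }'
}

ready=$(printf '%s\n' "$sample_nodes" | count_ready_nodes)
echo "ready nodes: ${ready}"
```

With a live cluster, the input would come from `./kubectl get nodes | count_ready_nodes` instead of the sample variable.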

More About Kubernetes Nodes

In the command above, we used kubectl to show the status of the available Kubernetes Nodes on this cluster. However, we really didn’t explore what a node is or what role it plays within a cluster.
A Kubernetes Node is a physical or virtual (in our case, virtual) machine used to host application containers. In a traditional container-based environment, you would typically define that specific containers run on specified physical or virtual hosts. In a Kubernetes cluster, however, you simply define what application containers you wish to run. The Kubernetes master determines which node the application container will run on.
This methodology also enables the Kubernetes cluster to perform tasks such as automated restarts when containers or nodes die.

Deploying Our Application

With our Kubernetes cluster ready, we can now start deploying application containers. The application container we will be deploying today will be an instance of Ghost. Ghost is a popular JavaScript-based blogging platform, and with its official Docker image, it’s pretty simple to deploy.
Since we’ll be using a prebuilt Docker container, we won’t need to first build a Docker image. However, it is important to call out that in order to use custom-built containers on a Kubernetes cluster, you must first build the container and push it to a Docker repository such as Docker Hub.
To start our Ghost container, we will use the ./kubectl command with the run option.
$ ./kubectl run ghost --image=ghost --port=2368
deployment "ghost" created
In the command above, we created a deployment named ghost, using the image ghost, and specified that the ghost container requires the port 2368. Before going too far, let’s first verify that the container is running. We can verify this by executing the kubectl command with the get pods options.
$ ./kubectl get pods
NAME                     READY     STATUS    RESTARTS   AGE
ghost-3550706830-4c2n5   1/1       Running   0          20m
The get pods option will tell the kubectl command to list all of the Kubernetes Pods currently deployed to the cluster.
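The run command used earlier creates the deployment imperatively. The same deployment can also be described in a manifest file, which is easier to keep in version control. The sketch below writes such a file. It assumes the apps/v1 API group used by current Kubernetes releases, which the v1.2 cluster in this article predates, so treat the apiVersion as an assumption to adjust for your cluster. The file name is arbitrary.

```shell
# Write a manifest roughly equivalent to the run command above.
# Assumption: apps/v1 is available; older clusters need a different apiVersion.
cat > ghost-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ghost
  labels:
    run: ghost
spec:
  replicas: 1
  selector:
    matchLabels:
      run: ghost
  template:
    metadata:
      labels:
        run: ghost
    spec:
      containers:
      - name: ghost
        image: ghost
        ports:
        - containerPort: 2368
EOF

echo "wrote ghost-deployment.yaml"
# To submit it to a cluster (not run here): ./kubectl create -f ghost-deployment.yaml
```

The run=ghost label mirrors the label that the run command applied, so anything that selects on that label keeps working.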

What Are Pods and Deployments?

A Pod is a group of containers that can communicate with each other as though they are running within the same system. For those familiar with Docker, this may sound like linking containers, but there’s actually a bit more to it than that. Not only can containers within a Pod connect to each other through a localhost connection, but the processes running within those containers can also share memory segments with other containers.
The goal of a Pod is to allow applications running within the Pod to interact as though they were not running in containers at all but simply running on the same physical host. This ability makes it easy to deploy applications that are not specifically designed to run within containers.
A Deployment, or Deployment Object, is similar to the concept of a Desired State. Essentially, the Deployment is a high-level configuration around a desired function. For example, earlier when we started the Ghost container, we didn’t just launch a Ghost container. We actually configured Kubernetes to ensure that at least one copy of a Ghost container is running.
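The desired-state idea can be illustrated with a toy comparison between how many copies should be running and how many actually are. This is not Kubernetes code, and the `reconcile` function is purely hypothetical. It only shows the kind of decision a controller makes each time it observes the cluster.

```shell
# Toy illustration of desired-state reconciliation; not Kubernetes code.
# Prints the action needed to move from the actual count to the desired count.
reconcile() {
  local desired="$1" actual="$2"
  if [ "$actual" -lt "$desired" ]; then
    echo "start $((desired - actual))"
  elif [ "$actual" -gt "$desired" ]; then
    echo "stop $((actual - desired))"
  else
    echo "no change"
  fi
}

reconcile 1 0   # the only copy died: start a replacement
reconcile 1 1   # steady state: nothing to do
reconcile 4 1   # desired count was raised: start more copies
```

The first call corresponds to the self-healing behavior mentioned in the introduction: when the observed count drops below the desired count, a replacement is started.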

Creating a service for Ghost

While containers within Pods can connect to systems external to the cluster, external systems and even other Pods cannot communicate with them. This is because, by default, the port defined for the Ghost service is not exposed beyond this cluster. This is where Services come into play.
In order to make our Ghost application accessible outside the cluster, the deployment we just created needs to be exposed as a Kubernetes Service. To set our Ghost deployment as a service, we will use the kubectl command once again, this time using the expose option.
$ ./kubectl expose deployment ghost --type="NodePort"
service "ghost" exposed
In the above command, we used the flag --type with the argument of NodePort. This flag defines the service type to expose for this service, in this case a NodePort service type. The NodePort service type will set all nodes to listen on the specified port. We can see our change take effect if we use the kubectl command again, but this time with the get services option.
$ ./kubectl get services
NAME         CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
ghost        10.247.63.140   nodes         2368/TCP   55s
kubernetes   10.247.0.1      <none>        443/TCP    19m
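As with the deployment, the service created by expose can also be written down as a manifest. The sketch below is an approximation of what that command produced, not a dump from the cluster. The selector run=ghost is taken from the labels the deployment received, and the file name is arbitrary. No nodePort is set, so Kubernetes would still pick one.

```shell
# Write a Service manifest approximating the expose command above.
cat > ghost-service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: ghost
spec:
  type: NodePort
  selector:
    run: ghost
  ports:
  - port: 2368
    targetPort: 2368
EOF

echo "wrote ghost-service.yaml"
# To submit it to a cluster (not run here): ./kubectl create -f ghost-service.yaml
```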

Service types

At the moment, Kubernetes supports three service types:
  • ClusterIP
  • NodePort
  • LoadBalancer
If we wanted to expose this service only to other Pods within this cluster, we could use the ClusterIP service type, which is the default. This assigns the service a virtual IP address that is reachable only from inside the cluster, which is what Pod-to-Pod communication uses.
The LoadBalancer service type is designed to provision an external IP to act as a Load Balancer for the service. Since our deployment is leveraging Vagrant on a local laptop, this option does not work in our environment. It does work with Kubernetes clusters deployed in cloud environments like GCE or AWS.

Testing our Ghost instance

Since we did not specify a port to use when defining our NodePort service, Kubernetes randomly assigned a port. To see what port it assigned, we can use the kubectl command with the describe service option.
$ ./kubectl describe service ghost
Name:           ghost
Namespace:      default
Labels:         run=ghost
Selector:       run=ghost
Type:           NodePort
IP:         10.247.63.140
Port:           <unset> 2368/TCP
NodePort:       <unset> 32738/TCP
Endpoints:      10.246.3.3:2368
Session Affinity:   None
No events.
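For scripting, the assigned port can be extracted from this output instead of being read by eye. The sketch below parses a trimmed copy of the describe output with awk. The `node_port` helper is a hypothetical name, and the parsing assumes the line layout shown above, which may differ between kubectl versions.

```shell
# Sample data: a trimmed copy of the describe service output above.
sample_describe='Name:           ghost
Type:           NodePort
Port:           <unset> 2368/TCP
NodePort:       <unset> 32738/TCP
Endpoints:      10.246.3.3:2368'

# Print the numeric port from the "NodePort:" line of describe output on stdin.
node_port() {
  # The last field on that line looks like 32738/TCP; keep the part before "/".
  awk '$1 == "NodePort:" { split($NF, parts, "/"); print parts[1] }'
}

port=$(printf '%s\n' "$sample_describe" | node_port)
echo "http://10.245.1.3:${port}"
```

With a live cluster, the input would come from `./kubectl describe service ghost | node_port`.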
We can see that the port assigned is 32738. With this port, we can use the curl command to make an HTTP call to any of our Kubernetes Nodes and get redirected to port 2368 within our application’s container.
$ curl -vk http://10.245.1.3:32738
* Rebuilt URL to: http://10.245.1.3:32738/
*   Trying 10.245.1.3...
* Connected to 10.245.1.3 (10.245.1.3) port 32738 (#0)
> GET / HTTP/1.1
> Host: 10.245.1.3:32738
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 200 OK
< X-Powered-By: Express
From the output of the curl command, we can see that the connection was successful with a 200 OK response. What is interesting about this is that the request was to a node that wasn’t running the Ghost container. We can see this if we use kubectl to describe the Pod.
$ ./kubectl describe pod ghost-3550706830-ss4se
Name:       ghost-3550706830-ss4se
Namespace:  default
Node:       kubernetes-node-2/10.245.1.4
Start Time: Sat, 16 Apr 2016 21:13:20 -0700
Labels:     pod-template-hash=3550706830,run=ghost
Status:     Running
IP:     10.246.3.3
Controllers:    ReplicaSet/ghost-3550706830
Containers:
  ghost:
    Container ID:   docker://55ea497a166ff13a733d4ad3be3abe42a6d7f3d2c259f2653102fedda485e25d
    Image:      ghost
    Image ID:       docker://09849b7a78d3882afcd46f2310c8b972352bc36aaec9f7fe7771bbc86e5222b9
    Port:       2368/TCP
    QoS Tier:
      memory:   BestEffort
      cpu:  Burstable
    Requests:
      cpu:      100m
    State:      Running
      Started:      Sat, 16 Apr 2016 21:14:33 -0700
    Ready:      True
    Restart Count:  0
    Environment Variables:
Conditions:
  Type      Status
  Ready     True
Volumes:
  default-token-imnyi:
    Type:   Secret (a volume populated by a Secret)
    SecretName: default-token-imnyi
No events.
In the description above, we can see that the Ghost Pod is running on kubernetes-node-2. However, the HTTP request we just made was to kubernetes-node-1. This is made possible by a Kubernetes component called kube-proxy. With kube-proxy, whenever traffic arrives on a service’s port, the Kubernetes node will check if the service is running local to that node. If not, it will redirect the traffic to a node that is running that service.
In the case above, this means that even though the HTTP request was made to kubernetes-node-1, the kube-proxy service redirected that traffic to kubernetes-node-2, where the container is running.
This feature allows users to run services without having to worry about where the service is and whether or not it has moved from node to node. It is a very useful feature that reduces quite a bit of maintenance and headache.
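The routing decision described above can be modeled with a small toy script. It is not kube-proxy code. It extracts the hosting node from a trimmed copy of the describe pod output and compares it with the node that received the request. Both `pod_node` and `route` are hypothetical helpers.

```shell
# Sample data: a trimmed copy of the describe pod output above.
sample_pod='Name:       ghost-3550706830-ss4se
Namespace:  default
Node:       kubernetes-node-2/10.245.1.4
Status:     Running'

# Print the node name from the "Node:" line (the part before the "/").
pod_node() {
  awk '$1 == "Node:" { split($2, parts, "/"); print parts[1] }'
}

# Toy model of the routing decision; real kube-proxy works at the network level.
route() {
  local requested="$1" hosting="$2"
  if [ "$requested" = "$hosting" ]; then
    echo "served locally on ${hosting}"
  else
    echo "forwarded from ${requested} to ${hosting}"
  fi
}

hosting=$(printf '%s\n' "$sample_pod" | pod_node)
route kubernetes-node-1 "$hosting"
```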

Scaling a Deployment

Now that our Ghost service is running and accessible to the outside world, we need to perform our last task: scaling out our Ghost application across multiple instances. To do this, we can simply call the kubectl command again, this time with the scale option.
$ ./kubectl scale deployment ghost --replicas=4
deployment "ghost" scaled
We’ve specified that there should be 4 replicas of the ghost deployment. If we execute kubectl get pods again we should see 3 additional pods for the ghost deployment.
$ ./kubectl get pods
NAME                     READY     STATUS    RESTARTS   AGE
ghost-3550706830-49r81   1/1       Running   0          7h
ghost-3550706830-4c2n5   1/1       Running   0          8h
ghost-3550706830-7o2fs   1/1       Running   0          7h
ghost-3550706830-f3dae   1/1       Running   0          7h
With this last step, we now have our Ghost service running across multiple nodes and multiple pods. As requests are made to our Ghost service, those requests will be load balanced to our various Ghost instances.
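A scripted check that the scale-out has finished could compare the number of Running pods against the requested replica count. The sketch below does that against a copy of the sample output above. The `running_pods` helper is hypothetical, and matching pods by name prefix is a simplifying assumption that relies on the naming shown in this article.

```shell
# Sample data copied from the kubectl get pods output above.
sample_pods='NAME                     READY     STATUS    RESTARTS   AGE
ghost-3550706830-49r81   1/1       Running   0          7h
ghost-3550706830-4c2n5   1/1       Running   0          8h
ghost-3550706830-7o2fs   1/1       Running   0          7h
ghost-3550706830-f3dae   1/1       Running   0          7h'

# Count pods whose name starts with "<deployment>-" and whose STATUS is Running.
running_pods() {
  awk -v prefix="$1-" 'index($1, prefix) == 1 && $3 == "Running" { n++ } END { print n + 0 }'
}

desired=4
running=$(printf '%s\n' "$sample_pods" | running_pods ghost)
if [ "$running" -eq "$desired" ]; then
  echo "scaled: ${running}/${desired} running"
else
  echo "waiting: ${running}/${desired} running"
fi
```

With a live cluster, the input would come from `./kubectl get pods | running_pods ghost`, repeated until the counts match.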