
Notes on setting up Elasticsearch, Kibana and Fluentd on Ubuntu

Agile Testing - Grig Gheorghiu - Wed, 02/01/2017 - 00:28
I've been experimenting with an EFK stack (with Fluentd replacing Logstash) and I hasten to write down some of my notes. I could have just as well used Logstash, but my goal is to also use the EFK stack for capturing logs out of Kubernetes clusters, and I wanted to become familiar with Fluentd, which is a Cloud Native Computing Foundation project.

1) Install Java 8

On Ubuntu 16.04:

# apt-get install openjdk-8-jre-headless

On Ubuntu 14.04:

# add-apt-repository -y ppa:webupd8team/java
# apt-get update
# apt-get -y install oracle-java8-installer

2) Download and install Elasticsearch (5.1.2 at the time of writing)

# wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.1.2.deb
# dpkg -i elasticsearch-5.1.2.deb

Edit /etc/elasticsearch/elasticsearch.yml and set

network.host: 0.0.0.0

# service elasticsearch restart
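
To verify that Elasticsearch came back up and is reachable, query its HTTP endpoint (9200 is the default port); this should return a small JSON document with the node name and version:

# curl http://localhost:9200
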
3) Download and install Kibana

# wget https://artifacts.elastic.co/downloads/kibana/kibana-5.1.2-amd64.deb
# dpkg -i kibana-5.1.2-amd64.deb

Edit /etc/kibana/kibana.yml and set

server.host: "local_ip_address"

# service kibana restart
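
Kibana listens on port 5601 by default, so at this point you should be able to browse to http://local_ip_address:5601 (using the address you set in kibana.yml) and see the Kibana UI. A quick sanity check from the command line:

# curl -I http://local_ip_address:5601
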
4) Install Fluentd agent (td-agent)
On Ubuntu 16.04:
# curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-xenial-td-agent2.sh | sh
On Ubuntu 14.04:
# curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-trusty-td-agent2.sh | sh

Install Fluentd elasticsearch plugin (note that td-agent comes with its own gem installer):
# td-agent-gem install fluent-plugin-elasticsearch
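
To confirm the plugin landed in td-agent's embedded Ruby rather than in a system Ruby, you can list td-agent's gems:

# td-agent-gem list | grep fluent-plugin-elasticsearch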

5) Configure Fluentd agent
To specify the Elasticsearch server to send the local logs to, use a match stanza in /etc/td-agent/td-agent.conf:
<match **>
  @type elasticsearch
  logstash_format true
  host IP_ADDRESS_OF_ELASTICSEARCH_SERVER
  port 9200
  index_name fluentd
  type_name fluentd.project.stage.web01
</match>
Note that the Fluentd Elasticsearch plugin can emit records in a Logstash-compatible format: if you set logstash_format true, Elasticsearch will create indices named logstash-*. Also, port 9200 needs to be open from the client to the Elasticsearch server.
I found it useful to set the type_name property to a name specific to the client running the Fluentd agent. For example, if you have several projects/tenants, each with multiple environments (dev, stage, prod) and each environment with multiple servers, you could use something like type_name fluentd.project.stage.web01. This label will then be parsed and shown in Kibana and will allow you to easily tell the source of a given log entry.
If you want Fluentd to parse Apache logs and send the log entries to Elasticsearch, use stanzas of this form in td-agent.conf:
<source>
  type tail
  format apache2
  path /var/log/apache2/mysite.com-access.log
  pos_file /var/log/td-agent/mysite.com-access.pos
  tag apache.access
</source>

<source>
  type tail
  format apache2
  path /var/log/apache2/mysite.com-ssl-access.log
  pos_file /var/log/td-agent/mysite.com-ssl-access.pos
  tag apache.ssl.access
</source>
For syslog logs, use:
<source>
  @type syslog
  port 5140
  bind 0.0.0.0
  tag system.local
</source>
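Note that this source only opens a listener on port 5140; the local syslog daemon still has to forward logs to it. A minimal sketch assuming the stock rsyslog on Ubuntu (the drop-in file name is my choice; the @ prefix means forward over UDP):

# cat /etc/rsyslog.d/99-fluentd.conf
*.* @127.0.0.1:5140
# service rsyslog restart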
Restart td-agent:
# service td-agent restart
Inspect the td-agent log file:
# tail -f /var/log/td-agent/td-agent.log
Some things I've had to do to fix errors emitted by td-agent:
  • change permissions on the Apache log directory and log files so they are readable by user td-agent (see the sketch after this list)
  • make sure port 9200 is open from the client to the Elasticsearch server
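
For the permissions issue, here is a sketch of one possible fix: on a stock Ubuntu install, /var/log/apache2 is owned by root:adm with mode 750, so adding the td-agent user to the adm group is usually enough (the group name is an assumption; check yours with ls -ld /var/log/apache2):

# usermod -aG adm td-agent
# service td-agent restart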

That's it in a nutshell. In the next installment, I'll show how to secure the communication between the Fluentd agent and the Elasticsearch server.

Using Helm to install Traefik as an Ingress Controller in Kubernetes

Agile Testing - Grig Gheorghiu - Tue, 12/06/2016 - 23:15
That was a mouthful of a title...Hope this post lives up to it :)

First of all, just a bit of theory. If you want to expose your application running on Kubernetes to the outside world, you have several choices.

One choice you have is to expose the pods running your application via a Service of type NodePort or LoadBalancer. If you run your service as a NodePort, Kubernetes will allocate a random high port on every node in the cluster, and will proxy traffic arriving at that port to your service. Services of type LoadBalancer are only supported if you run your Kubernetes cluster on certain cloud providers such as AWS and GCE. In this case, the cloud provider will create a load balancer resource, for example an Elastic Load Balancer in AWS, which will then forward traffic to the pods comprising your service. Either way, the load balancing you get by exposing a service is fairly crude: it operates at the TCP layer and uses a round-robin algorithm.
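For illustration, here is a minimal sketch of a NodePort Service manifest (the name my-app and the port numbers are hypothetical placeholders, not from this post):

apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  type: NodePort
  selector:
    app: my-app        # route to pods carrying this label
  ports:
  - port: 80           # cluster-internal port of the service
    targetPort: 8080   # port the application container listens on

Kubernetes will then pick a port in the 30000-32767 range (unless you pin one with nodePort) and open it on every node.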

A better choice for exposing your Kubernetes application is to use Ingress resources together with Ingress Controllers. An ingress resource is a fancy name for a set of layer 7 load balancing rules, as you might be familiar with if you use HAProxy or Pound as a software load balancer. An Ingress Controller is a piece of software that actually implements those rules by watching the Kubernetes API for requests to Ingress resources. Here is a fragment from the Ingress Controller documentation on GitHub:

What is an Ingress Controller?

An Ingress Controller is a daemon, deployed as a Kubernetes Pod, that watches the ApiServer's /ingresses endpoint for updates to the Ingress resource. Its job is to satisfy requests for ingress.
Writing an Ingress Controller

Writing an Ingress controller is simple. By way of example, the nginx controller does the following:
  • Poll until apiserver reports a new Ingress
  • Write the nginx config file based on a go text/template
  • Reload nginx
As I mentioned in a previous post, I warmly recommend watching a KubeCon presentation from Gerred Dillon on "Kubernetes Ingress: Your Router, Your Rules" if you want to further delve into the advantages of using Ingress Controllers as opposed to plain Services.
While nginx is the only software currently included in the Kubernetes source code as an Ingress Controller, I wanted to experiment with a full-fledged HTTP reverse proxy such as Traefik. I should note upfront that only the nginx controller currently implements the TLS section of the Ingress resource spec. Traefik can terminate SSL of course, and I'll show how you can do that, but it happens outside of the Ingress resource spec.

I've also been looking at Helm, the Kubernetes package manager, and I noticed that Traefik is one of the 'stable' packages (or Charts as they are called) currently offered by Helm, so I went the Helm route in order to install Traefik. In the following instructions I will assume that you are already running a Kubernetes cluster in AWS and that your local kubectl environment is configured to talk to that cluster.

Install Helm

This is pretty easy. Follow the instructions on GitHub to download or install a binary for your OS.

Initialize Helm

Run helm init in order to install the server component of Helm, called tiller, which will be run as a Kubernetes Deployment in the kube-system namespace of your cluster.
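
You can verify that tiller came up by checking its deployment in the kube-system namespace (the deployment is named tiller-deploy):

# kubectl get deployment tiller-deploy --namespace kube-system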

Get the Traefik Helm chart from GitHub

I git cloned the entire kubernetes/charts repo, then copied the traefik directory locally under my own source code repo which contains the rest of the yaml files for my Kubernetes resource manifests.

# git clone https://github.com/kubernetes/charts.git helmcharts
# cp -r helmcharts/stable/traefik traefik-helm-chart
It is instructive to look at the contents of a Helm chart. The main advantage of a chart in my view is the bundling together of all the Kubernetes resources necessary to run a specific set of services. The other advantage is that you can use Go-style templates for the resource manifests, and the variables in those template files can be passed to helm via a values.yaml file or via the command line.
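For reference, a chart is just a directory with a conventional layout; the Traefik chart looks roughly like this (abridged):

traefik-helm-chart/
  Chart.yaml          # chart name and version
  values.yaml         # default values consumed by the templates
  templates/          # Go-templated Kubernetes resource manifests
    deployment.yaml
    service.yaml
    dashboard-ingress.yaml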
For more details on Helm charts and templates, I recommend this linux.com article.
Create an Ingress resource for your application service
I copied the dashboard-ingress.yaml template file from the Traefik chart and customized it so as to refer to my application's web service, which is running in a Kubernetes namespace called tenant1.

# cd traefik-helm-chart/templates
# cp dashboard-ingress.yaml web-ingress.yaml
# cat web-ingress.yaml
{{- if .Values.tenant1.enabled }}
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  namespace: {{ .Values.tenant1.namespace }}
  name: {{ template "fullname" . }}-web-ingress
  labels:
    app: {{ template "fullname" . }}
    chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
    release: "{{ .Release.Name }}"
    heritage: "{{ .Release.Service }}"
spec:
  rules:
  - host: {{ .Values.tenant1.domain }}
    http:
      paths:
      - path: /
        backend:
          serviceName: {{ .Values.tenant1.serviceName }}
          servicePort: {{ .Values.tenant1.servicePort }}
{{- end }}
The variables referenced in the template above are defined in the values.yaml file in the Helm chart. I started with the variables in the values.yaml file that came with the Traefik chart and added my own customizations:
# vi traefik-helm-chart/values.yaml
ssl:
  enabled: true
acme:
  enabled: true
  email: admin@mydomain.com
  staging: false
  # Save ACME certs to a persistent volume. WARNING: If you do not do this, you will re-request
  # certs every time a pod (re-)starts and you WILL be rate limited!
  persistence:
    enabled: true
    storageClass: kubernetes.io/aws-ebs
    accessMode: ReadWriteOnce
    size: 1Gi
dashboard:
  enabled: true
  domain: tenant1-lb.dev.mydomain.com
gzip:
  enabled: false
tenant1:
  enabled: true
  namespace: tenant1
  domain: tenant1.dev.mydomain.com
  serviceName: web
  servicePort: http
Note that I added a section called tenant1, where I defined the variables referenced in the web-ingress.yaml template above. I also enabled the ssl and acme sections, so that Traefik can automatically install SSL certificates from Let's Encrypt via the ACME protocol.
Install your customized Helm chart for Traefik
With these modifications done, I ran 'helm install' to actually deploy the various Kubernetes resources included in the Traefik chart. 
I specified the directory containing my Traefik chart files (traefik-helm-chart) as the last argument passed to helm install:
# helm install --name tenant1-lb --namespace tenant1 traefik-helm-chart/
NAME: tenant1-lb
LAST DEPLOYED: Tue Nov 29 09:51:12 2016
NAMESPACE: tenant1
STATUS: DEPLOYED

RESOURCES:
==> extensions/Ingress
NAME                             HOSTS                         ADDRESS   PORTS     AGE
tenant1-lb-traefik-web-ingress   tenant1.dev.mydomain.com                80        1s
tenant1-lb-traefik-dashboard     tenant1-lb.dev.mydomain.com             80        0s

==> v1/PersistentVolumeClaim
NAME                      STATUS    VOLUME    CAPACITY   ACCESSMODES   AGE
tenant1-lb-traefik-acme   Pending                                      0s

==> v1/Secret
NAME                              TYPE      DATA      AGE
tenant1-lb-traefik-default-cert   Opaque    2         1s

==> v1/ConfigMap
NAME                 DATA      AGE
tenant1-lb-traefik   1         1s

==> v1/Service
NAME                           CLUSTER-IP   EXTERNAL-IP   PORT(S)          AGE
tenant1-lb-traefik-dashboard   10.3.0.15    <none>        80/TCP           1s
tenant1-lb-traefik             10.3.0.215   <pending>     80/TCP,443/TCP   1s

==> extensions/Deployment
NAME                 DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
tenant1-lb-traefik   1         1         1            0           1s

NOTES:
1. Get Traefik's load balancer IP/hostname:
    NOTE: It may take a few minutes for this to become available.
    You can watch the status by running:
        $ kubectl get svc tenant1-lb-traefik --namespace tenant1 -w
    Once 'EXTERNAL-IP' is no longer '<pending>':
        $ kubectl describe svc tenant1-lb-traefik --namespace tenant1 | grep Ingress | awk '{print $3}'
2. Configure DNS records corresponding to Kubernetes ingress resources to point to the load balancer IP/hostname found in step 1
At this point you should see two Ingress resources, one for the Traefik dashboard and one for the custom web ingress resource:
# kubectl --namespace tenant1 get ingress
NAME                             HOSTS                         ADDRESS   PORTS     AGE
tenant1-lb-traefik-dashboard     tenant1-lb.dev.mydomain.com             80        50s
tenant1-lb-traefik-web-ingress   tenant1.dev.mydomain.com                80        51s
As per the Helm notes above (shown as part of the output of helm install), run this command to figure out the CNAME of the AWS ELB created by Kubernetes during the creation of the tenant1-lb-traefik service of type LoadBalancer:
# kubectl describe svc tenant1-lb-traefik --namespace tenant1 | grep Ingress | awk '{print $3}'
a5be275d8b65c11e685a402e9ec69178-91587212.us-west-2.elb.amazonaws.com
Create tenant1.dev.mydomain.com and tenant1-lb.dev.mydomain.com as DNS CNAME records pointing to a5be275d8b65c11e685a402e9ec69178-91587212.us-west-2.elb.amazonaws.com.

Now, if you hit http://tenant1-lb.dev.mydomain.com you should see the Traefik dashboard showing the frontends on the left and the backends on the right:
If you hit http://tenant1.dev.mydomain.com you should see your web service in action.
You can also inspect the logs of the tenant1-lb-traefik pod to see what's going on under the covers when Traefik is launched and to verify that the Let's Encrypt SSL certificates were properly downloaded via ACME:
# kubectl --namespace tenant1 logs tenant1-lb-traefik-3710322105-o2887
time="2016-11-29T00:03:51Z" level=info msg="Traefik version v1.1.0 built on 2016-11-18_09:20:46AM"
time="2016-11-29T00:03:51Z" level=info msg="Using TOML configuration file /config/traefik.toml"
time="2016-11-29T00:03:51Z" level=info msg="Preparing server http &{Network: Address::80 TLS:<nil> Redirect:<nil> Auth:<nil> Compress:false}"
time="2016-11-29T00:03:51Z" level=info msg="Preparing server https &{Network: Address::443 TLS:0xc4201b1800 Redirect:<nil> Auth:<nil> Compress:false}"
time="2016-11-29T00:03:51Z" level=info msg="Starting server on :80"
time="2016-11-29T00:03:58Z" level=info msg="Loading ACME Account..."
time="2016-11-29T00:03:59Z" level=info msg="Loaded ACME config from store /acme/acme.json"
time="2016-11-29T00:04:01Z" level=info msg="Starting provider *main.WebProvider {\"Address\":\":8080\",\"CertFile\":\"\",\"KeyFile\":\"\",\"ReadOnly\":false,\"Auth\":null}"
time="2016-11-29T00:04:01Z" level=info msg="Starting provider *provider.Kubernetes {\"Watch\":true,\"Filename\":\"\",\"Constraints\":[],\"Endpoint\":\"\",\"DisablePassHostHeaders\":false,\"Namespaces\":null,\"LabelSelector\":\"\"}"
time="2016-11-29T00:04:01Z" level=info msg="Retrieving ACME certificates..."
time="2016-11-29T00:04:01Z" level=info msg="Retrieved ACME certificates"
time="2016-11-29T00:04:01Z" level=info msg="Starting server on :443"
time="2016-11-29T00:04:01Z" level=info msg="Server configuration reloaded on :80"
time="2016-11-29T00:04:01Z" level=info msg="Server configuration reloaded on :443"
To get an even better warm and fuzzy feeling about the SSL certificates installed via ACME, you can run this command against the live endpoint tenant1.dev.mydomain.com:
# echo | openssl s_client -showcerts -servername tenant1.dev.mydomain.com -connect tenant1.dev.mydomain.com:443 2>/dev/null
CONNECTED(00000003)
---
Certificate chain
 0 s:/CN=tenant1.dev.mydomain.com
   i:/C=US/O=Let's Encrypt/CN=Let's Encrypt Authority X3
-----BEGIN CERTIFICATE-----
MIIGEDCCBPigAwIBAgISAwNwBNVU7ZHlRtPxBBOPPVXkMA0GCSqGSIb3DQEBCwUA
-----END CERTIFICATE-----
 1 s:/C=US/O=Let's Encrypt/CN=Let's Encrypt Authority X3
   i:/O=Digital Signature Trust Co./CN=DST Root CA X3
-----BEGIN CERTIFICATE-----
uM2VcGfl96S8TihRzZvoroed6ti6WqEBmtzw3Wodatg+VyOeph4EYpr/1wXKtx8/
KOqkqm57TH2H3eDJAkSnh6/DNFu0Qg==
-----END CERTIFICATE-----
---
Server certificate
subject=/CN=tenant1.dev.mydomain.com
issuer=/C=US/O=Let's Encrypt/CN=Let's Encrypt Authority X3
---
No client certificate CA names sent
---
SSL handshake has read 3009 bytes and written 713 bytes
---
New, TLSv1/SSLv3, Cipher is AES128-SHA
Server public key is 4096 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
SSL-Session:
    Protocol  : TLSv1
    Cipher    : AES128-SHA
    Start Time: 1480456552
    Timeout   : 300 (sec)
    Verify return code: 0 (ok)
etc.
Other Helm commands
You can list the Helm releases that are currently running (a Helm release is a particular versioned instance of a Helm chart) with helm list:
# helm list
NAME         REVISION   UPDATED                    STATUS     CHART
tenant1-lb   1          Tue Nov 29 10:13:47 2016   DEPLOYED   traefik-1.1.0-a

If you change any files or values in a Helm chart, you can apply the changes by means of the 'helm upgrade' command:

# helm upgrade tenant1-lb traefik-helm-chart
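You can also override individual chart values at upgrade time without editing values.yaml, using --set (dashboard.enabled here is just an example key from the values file shown earlier):

# helm upgrade tenant1-lb traefik-helm-chart --set dashboard.enabled=false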
You can see the status of a release with helm status:
# helm status tenant1-lb
LAST DEPLOYED: Tue Nov 29 10:13:47 2016
NAMESPACE: tenant1
STATUS: DEPLOYED

RESOURCES:
==> v1/Service
NAME                           CLUSTER-IP   EXTERNAL-IP        PORT(S)          AGE
tenant1-lb-traefik             10.3.0.76    a92601b47b65f...   80/TCP,443/TCP   35m
tenant1-lb-traefik-dashboard   10.3.0.36    <none>             80/TCP           35m

==> extensions/Deployment
NAME                 DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
tenant1-lb-traefik   1         1         1            1           35m

==> extensions/Ingress
NAME                             HOSTS                         ADDRESS   PORTS     AGE
tenant1-lb-traefik-web-ingress   tenant1.dev.mydomain.com                80        35m
tenant1-lb-traefik-dashboard     tenant1-lb.dev.mydomain.com             80        35m

==> v1/PersistentVolumeClaim
NAME                      STATUS    VOLUME                                     CAPACITY   ACCESSMODES   AGE
tenant1-lb-traefik-acme   Bound     pvc-927df794-b65f-11e6-85a4-02e9ec69178b   1Gi        RWO           35m

==> v1/Secret
NAME                              TYPE      DATA      AGE
tenant1-lb-traefik-default-cert   Opaque    2         35m

==> v1/ConfigMap
NAME                 DATA      AGE
tenant1-lb-traefik   1         35m



