tick-charts: InfluxData Stack on Kubernetes
In my day job I work at InfluxData. We build a full stack of monitoring products on top of our best-in-class time series database:
- Telegraf - Collection agent with a killer library of plugins
- InfluxDB - The highest-rated time series database
- Chronograf - A visualization and administration tool for InfluxDB and Kapacitor
- Kapacitor - Alerting, ETL and cluster autoscaling
Given my interest in Kubernetes, and the recent work we have done to make it easy to use InfluxDB monitoring in a Kubernetes cluster (the Kubernetes Telegraf plugin, the Kapacitor Kubernetes output, the Chronograf OSS release), I decided to put together an easy way to spin up the full stack on a Kubernetes cluster.
Motivation#
You might ask, doesn’t Kubernetes come with built-in monitoring? Why yes, it does! That monitoring and log aggregation is handled differently depending on your cluster configuration and provenance. The metrics monitoring is done by default with Heapster, which is sometimes backed by InfluxDB.
Heapster doesn’t do the best job of following InfluxDB schema best practices. It’s also sparsely documented and has stopped adding new features. I wanted an easy way to create demos and POCs with rich live data. With the release of the new Chronograf and the canned dashboards it offers, this became possible.
To make it easy to deploy on different Kubernetes configurations, I decided to write the project as a series of Helm charts. tick-charts is the result.
.
├── LICENSE
├── README.md
├── chronograf
│   ├── Chart.yaml
│   ├── README.md
│   ├── templates
│   │   ├── NOTES.txt
│   │   ├── _helpers.tpl
│   │   ├── deployment.yaml
│   │   ├── pvc.yaml
│   │   └── service.yaml
│   └── values.yaml
├── influxdb
│   ├── Chart.yaml
│   ├── README.md
│   ├── templates
│   │   ├── NOTES.txt
│   │   ├── _helpers.tpl
│   │   ├── config.yaml
│   │   ├── deployment.yaml
│   │   ├── pvc.yaml
│   │   └── service.yaml
│   └── values.yaml
├── kapacitor
│   ├── Chart.yaml
│   ├── README.md
│   ├── templates
│   │   ├── NOTES.txt
│   │   ├── _helpers.tpl
│   │   ├── deployment.yaml
│   │   ├── pvc.yaml
│   │   └── service.yaml
│   └── values.yaml
└── telegraf
    ├── Chart.yaml
    ├── README.md
    ├── docs
    │   ├── all-config-values.toml
    │   └── all-config-values.yaml
    ├── templates
    │   ├── _helpers.tpl
    │   ├── configmap-single.yaml
    │   ├── conifgmap-ds.yaml
    │   ├── daemonset.yaml
    │   ├── deployment.yaml
    │   └── service.yaml
    └── values.yaml
Helm#
Before deploying the members of the stack, we need to package the charts for Helm:
$ helm package influxdb chronograf telegraf kapacitor
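Each chart’s Chart.yaml supplies the name and version that end up in the packaged archive’s filename (hence influxdb-0.1.0.tgz below). As a sketch, it looks something like this, where the description text is a placeholder rather than the chart’s actual wording:
# influxdb/Chart.yaml -- illustrative; the description is a placeholder
name: influxdb
version: 0.1.0
description: An open source time series database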
Telegraf#
$ helm install telegraf-0.1.0.tgz --name telegraf --namespace tick
At a high level, the system starts with the Telegraf DaemonSet:
# telegraf/daemonset.yaml
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: telegraf
  namespace: tick
spec:
  template:
    metadata:
      labels:
        app: telegraf
    spec:
      containers:
      - name: telegraf
        image: telegraf:1.1.0-alpine
        imagePullPolicy: Always
...
This runs a pod on each of the nodes that make up the Kubernetes cluster. If you check out the full file you will see a number of mounted volumes; these enable the containerized agent to collect various host-level statistics, along the lines of the sketch below.
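The exact list of mounts lives in telegraf/templates/daemonset.yaml; the sketch below only shows the general shape (volume names and paths here are examples, not copied from the chart):
# Illustrative hostPath mounts for host-level collection (names and paths are
# examples, not the chart's exact contents). volumeMounts sits under the
# container spec, volumes under the pod spec.
volumeMounts:
- name: docker-socket              # lets the docker input plugin talk to the daemon
  mountPath: /var/run/docker.sock
- name: proc                       # host /proc for cpu, memory, and process stats
  mountPath: /rootfs/proc
  readOnly: true
- name: sys                        # host /sys for disk and network stats
  mountPath: /rootfs/sys
  readOnly: true
volumes:
- name: docker-socket
  hostPath:
    path: /var/run/docker.sock
- name: proc
  hostPath:
    path: /proc
- name: sys
  hostPath:
    path: /sys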
The telegraf chart also creates a single, standalone Telegraf instance alongside the daemonset. Which input plugins are enabled, for both the daemonset and that single instance, is driven by the chart’s values.yaml (docs/all-config-values.yaml lists every available option).
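As a rough sketch of how that configuration reads (the plugin names and keys below are illustrative, not the chart’s actual defaults):
# Hypothetical values.yaml excerpt -- plugin names and keys are illustrative
config:
  outputs:
    influxdb:
      urls:
        - "http://influxdb-influxdb.tick:8086"
      database: "telegraf"
  inputs:
    cpu:
      percpu: false
      totalcpu: true
    kubernetes:
      url: "http://127.0.0.1:10255"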
InfluxDB#
$ helm install influxdb-0.1.0.tgz --name influxdb --namespace tick
The data that telegraf produces is then forwarded to InfluxDB:
# influxdb/deployment.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: influxdb
  namespace: tick
...
spec:
  containers:
  - image: influxdb:1.1.0-rc2-alpine
    ...
    volumeMounts:
    - name: config-volume
      mountPath: /etc/influxdb
    - name: influxdb
      mountPath: /var/lib/influxdb
  ...
  volumes:
  - name: config-volume
    configMap:
      name: influxdb-config
  - name: influxdb
    emptyDir: {}
Note the emptyDir does not provide persistence for your InfluxDB instance. You can enable persistence in the InfluxDB chart, and Kapacitor and Chronograf also need persistent storage. Persistence is disabled by default for all products. This project uses PersistentVolumeClaims (PVCs) to create the volumes when they are desired; just set persistence: "true" in the associated values.yaml file.
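As an illustration, enabling it for InfluxDB might look like the following (the persistence flag is the one described above; the other keys here are assumptions and may be named differently in the actual chart):
# influxdb/values.yaml -- sketch; keys other than `persistence` are assumptions
persistence: "true"
storage: 16Gi        # hypothetical PVC size request
storageClass: ""     # hypothetical storage class; empty falls back to the cluster default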
To make InfluxDB addressable we need to expose its API via a service. This creates an entry in the internal cluster DNS of the form http://{service_name}.{namespace}:{service_port}, making the InfluxDB API addressable from anywhere in the cluster at http://influxdb-influxdb.tick:8086.
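Roughly what the chart’s influxdb/templates/service.yaml renders to (the real template derives the influxdb-influxdb name and labels from the release and chart names via _helpers.tpl, so treat this as a simplified sketch):
apiVersion: v1
kind: Service
metadata:
  name: influxdb-influxdb
  namespace: tick
spec:
  type: ClusterIP
  ports:
  - name: api
    port: 8086
    targetPort: 8086
  selector:
    app: influxdb-influxdb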
If you would like to access the database from your local machine using the influx CLI, try the following:
# Forward the InfluxDB API to localhost:8086
# Adjust the namespace and app label to match your release
# This attaches to the first matching pod if multiple exist
$ kubectl --namespace tick port-forward $(kubectl get pods --namespace tick -l app=influxdb-influxdb -o jsonpath='{ .items[0].metadata.name }') 8086:8086
# In another terminal window:
$ influx
> show databases
name: databases
name
----
telegraf
_internal
Chronograf#
$ helm install chronograf-0.1.0.tgz --name chronograf --namespace tick
Chronograf is the new open source time series visualization and database management tool from InfluxData. It includes some canned dashboards for Kubernetes. It also requires some persistent storage for the BoltDB file it uses to store state. The only new Kubernetes feature we are using in this app is the LoadBalancer service:
apiVersion: v1
kind: Service
metadata:
  name: chronograf
  namespace: tick
spec:
  type: LoadBalancer
  ports:
  - port: 80
    name: http
    targetPort: 8888
    protocol: TCP
  selector:
    app: chronograf
This takes Chronograf’s service port (:8888) and exposes it via a load balancer (an ELB on AWS, a Cloud Load Balancer on GCE) at a public IP. This can take a minute to provision, so after the service is created run kubectl get -w svc --namespace tick chronograf-chronograf to watch for the IP to become available.
Before we configure the web console we need to deploy the last member of the stack, Kapacitor. There is nothing new in its set of deployment files, but Chronograf’s web console exposes some of its functionality.
$ helm install kapacitor-0.1.0.tgz --name kapacitor --namespace tick
Configuring Chronograf#
Once you have the public IP from the step above, plug it into your web browser. You should reach a screen that prompts you for a URL and a name for your InfluxDB instance. Because all requests are proxied through the Chronograf server, we can use internal cluster addressing to reach InfluxDB and Kapacitor at the following URLs:
- InfluxDB: http://influxdb-influxdb.tick:8086
- Kapacitor: http://kapacitor-kapacitor.tick:9092
Ready to go!#
Now you are ready to explore what data is available! This post is more of a description of the different technologies and strategies used in the example. I’ll get into what you can do with the full stack in another post.