ELK Stack Post — 3

Elasticsearch-Kibana Cluster on Kubernetes using ECK - Operator 101

Elasticsearch deployment on container-based platforms is continuously evolving. Elastic recently came up with an operator-based deployment of the ELK stack on a K8s cluster: Elastic Cloud on Kubernetes (ECK). This operator automates the deployment, provisioning, management, and orchestration of Elasticsearch, Kibana, and APM Server on Kubernetes.

In this blog post we are going to create an Elasticsearch cluster on the Kubernetes platform using Elastic's K8s operator packaging. Before proceeding further, let's review a few concepts in Elasticsearch.

Elasticsearch is an open-source, broadly distributable, readily scalable, enterprise-grade search engine. Types of nodes in an Elasticsearch cluster (a sample node-role configuration follows the list):

Data nodes — store data and execute data-related operations such as search and aggregations
Master nodes — in charge of cluster-wide management and configuration
Client nodes — forward cluster requests to the master node and data-related requests to data nodes
Ingest nodes — pre-process documents before indexing; comparable to Logstash
Machine learning nodes — run machine learning tasks; these nodes have xpack.ml.enabled and node.ml set to true
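
A node's roles are set in its elasticsearch.yml. As a minimal sketch (7.x settings; the exact combination depends on your topology), a dedicated master node could look like this:

# elasticsearch.yml — dedicated master node (illustrative)
node.master: true
node.data: false
node.ingest: false
node.ml: false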

An index is a data-organization mechanism that allows the user to partition data a certain way.

Documents are JSON objects that are stored within an Elasticsearch index. The document is the unit of search and indexing. An index consists of one or more documents, and a document consists of one or more fields. A document contains a few reserved fields (shown in the sample response after this list):

  • _index — the index where the document resides
  • _type — the type that the document represents. Deprecated in 7.x
  • _id — the unique identifier for the document
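
For example, fetching a single document returns these reserved fields alongside the source (a representative response; the values are illustrative and other fields such as _version are omitted):

{
  "_index" : "test_index",
  "_type" : "_doc",
  "_id" : "1",
  "_source" : { "name" : "John Doe" }
}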

Replicas and shards are the mechanisms Elasticsearch uses to distribute data around the cluster. Split your indices horizontally into shards for better performance. Replicas are Elasticsearch's fail-safe mechanism: they are copies of your index's shards.

A very common analogy available on the Internet:

MySQL => Databases => Tables => Rows/Columns

ElasticSearch => Indices => Types => Documents with Properties

An Elasticsearch cluster can contain multiple indices (databases), which in turn contain multiple types (tables). These types hold multiple documents (rows), and each document has properties (columns).

Indices organize data logically, but they also organize data physically through the underlying shards.

Other components of ELK

Elasticsearch: store, index, search, and analyze data
Kibana: data visualization
Logstash: data aggregation
Beats: lightweight agents used for data ingestion
X-Pack: extra features

What we need to deploy the ELK cluster:

  1. A K8s cluster; I am using Azure Kubernetes Service (AKS) with 2 nodes.
  2. We will follow this topology:
    one node with the Elasticsearch master and Kibana
    one node with the Elasticsearch data node

To implement the above mechanism we will use the K8s nodeSelector concept.

K8s cluster with 2 nodes

Label the nodes; we will use these labels as a nodeSelector:

kubectl label nodes aks-agentpool-24475697-vmss000000 xnum=node-1
kubectl label nodes aks-agentpool-24475697-vmss000000 environment=production
kubectl label nodes aks-agentpool-24475697-vmss000001 xnum=node-2
kubectl label nodes aks-agentpool-24475697-vmss000001 environment=production
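
To confirm the labels landed where expected:

kubectl get nodes --show-labels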

Deploy K8s Operator for Elasticsearch cluster

kubectl apply -f https://download.elastic.co/downloads/eck/1.1.0/all-in-one.yaml

customresourcedefinition.apiextensions.k8s.io/apmservers.apm.k8s.elastic.co created
customresourcedefinition.apiextensions.k8s.io/elasticsearches.elasticsearch.k8s.elastic.co created
customresourcedefinition.apiextensions.k8s.io/kibanas.kibana.k8s.elastic.co created
clusterrole.rbac.authorization.k8s.io/elastic-operator created
clusterrolebinding.rbac.authorization.k8s.io/elastic-operator created
namespace/elastic-system created
statefulset.apps/elastic-operator created
serviceaccount/elastic-operator created
validatingwebhookconfiguration.admissionregistration.k8s.io/elastic-webhook.k8s.elastic.co created
service/elastic-webhook-server created
secret/elastic-webhook-server-cert created
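
You can watch the operator come up and tail its logs (it runs as a StatefulSet in the elastic-system namespace):

kubectl -n elastic-system get pods
kubectl -n elastic-system logs -f statefulset.apps/elastic-operator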

Now you can create the Elasticsearch cluster. I have customized a few settings to configure the cluster: I am running data and master nodes separately and using a nodeSelector for affinity-based placement.
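
The manifest isn't shown inline, so here is a minimal sketch of what elasticsearch.yaml could look like for this topology. The cluster name (elastic-cluster), namespace, and nodeSet names match the commands used later in this post; the version, storage size, and label values are assumptions to adapt:

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elastic-cluster
  namespace: elastic-system
spec:
  version: 7.6.2            # assumption; pick the version you need
  nodeSets:
  - name: master
    count: 1
    config:
      node.master: true
      node.data: false
    podTemplate:
      spec:
        nodeSelector:
          xnum: node-1      # lands on the node we labeled earlier
  - name: data
    count: 1
    config:
      node.master: false
      node.data: true
    podTemplate:
      spec:
        nodeSelector:
          xnum: node-2
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi   # assumption; uses the AKS default storage class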

Deploy the Elasticsearch cluster.

kubectl apply -f elasticsearch.yaml
# validate the health
kubectl get elastic -n elastic-system
# To find the full description of the Elasticsearch resource
kubectl describe crd elasticsearch

After this, validate the cluster health.

When you deploy the Elasticsearch cluster it uses a PVC and PV for storage. You can view the details via the K8s command line, as shown below.
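
For example, to list the claims created for the cluster and the volumes backing them:

kubectl get pvc -n elastic-system
kubectl get pv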

From the Kubernetes documentation: "The reclaim policy for a PersistentVolume tells the cluster what to do with the volume after it has been released of its claim. Currently, volumes can either be Retained, Recycled, or Deleted."

A PersistentVolume can be mounted on a host in any way supported by the resource provider.

  • ReadWriteOnce — the volume can be mounted as read-write by a single node
  • ReadOnlyMany — the volume can be mounted read-only by many nodes
  • ReadWriteMany — the volume can be mounted as read-write by many nodes

Reference: https://kubernetes.io/docs/concepts/storage/persistent-volumes/

Please note we are using the default storage class provided by AKS.
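
To see which storage class is marked as the default on your cluster:

kubectl get storageclass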

Then I scaled up the Elasticsearch data nodes by simply updating the node count value in the file; a sketch of the change follows. You can see in the following image that the other pod is coming up.
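
Assuming the elasticsearch.yaml sketch above, the change is just the count on the data nodeSet:

  - name: data
    count: 2   # was 1; then re-apply with: kubectl apply -f elasticsearch.yaml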

Please note existing volume claims cannot be resized.

# You can log in to the pod to verify details about the mounted filesystem
kubectl exec -it elastic-cluster-es-data-0 -n elastic-system -- /bin/bash

When you log in on both pods, you can see the mounted data volume in each.

Now it's time to access the Elasticsearch APIs.

kubectl get service elastic-cluster-es-http -n elastic-system
# The password is stored in a Secret
PASSWORD=$(kubectl get secret elastic-cluster-es-elastic-user -n elastic-system -o=jsonpath='{.data.elastic}' | base64 --decode)
echo $PASSWORD
# This service is not exposed outside the cluster, so to access it from the local system:
kubectl port-forward service/elastic-cluster-es-http 9200 -n elastic-system

Elasticsearch has been installed correctly and is working as expected. To validate this, we can send a curl request to its APIs.

curl -u "elastic:<password>" -k "https://localhost:9200"  | jq# for more about Jq: [https://stedolan.github.io/jq/]
Elastic cluster details

To get more details about the cluster:

# cluster health
curl -u "elastic:<password>" -k "https://localhost:9200/_cluster/health?level=indices&pretty" | jq
curl -XGET -u "elastic:<password>" -k "https://localhost:9200/_cluster/health?level=shards&pretty"

Few details on REST API Requests:

REST API format: http://host:port/[index]/[type]/[_action/id]

HTTP Methods used: GET, POST, PUT, DELETE

# list all indices
http://localhost:9200/_cat/indices
# status of an index
http://localhost:9200/<index_name>?pretty
# search in an index
http://localhost:9200/<index_name>/<type>/_search

Elasticsearch lets you use HTTP methods such as GET, POST, DELETE, and PUT along with a payload in a JSON structure.

Kibana

Kibana is an open source frontend application that sits on top of the Elastic Stack, providing search and data visualization capabilities for data indexed in Elasticsearch.

We are going to run Kibana as well, but with only 1 node.

kubectl create -f kibana.yaml
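
Again, the manifest isn't shown inline; here is a minimal sketch of what kibana.yaml could contain, assuming the resource name kibana-cluster used by the service commands below and a LoadBalancer service for external access (the version is an assumption):

apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: kibana-cluster
  namespace: elastic-system
spec:
  version: 7.6.2            # assumption; must match the Elasticsearch version
  count: 1
  elasticsearchRef:
    name: elastic-cluster   # connects Kibana to the cluster created above
  http:
    service:
      spec:
        type: LoadBalancer  # exposed externally, per the access steps below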

We can see the details in the K8s events.

To access Kibana we created a LoadBalancer service. Fetch the details of the service to get the host IP.

kubectl get svc kibana-cluster-kb-http -n elastic-system

The above command will give you the Kibana service details. Access it in a browser:

http://<hostname or IP>:5601/
All pods running in the K8s environment

Go to the Kibana settings and turn on monitoring.

Now you can see the Elasticsearch cluster in a much better way.

# Let's create an index via a curl request and validate that things are
# working fine in the cluster. You can do the same via curl or the Kibana Dev Tools.
curl -v -u "elastic:<password>" -k -XPUT "https://localhost:9200/test_index" -H 'Content-Type: application/json' -d'{ "settings" : { "number_of_shards" : 1, "number_of_replicas" : 1 } }'
# please note we are creating the index with 1 shard and 1 replica

Add some data to the index.

# Add the data
curl -k -u "elastic:<password>" -XPUT "https://localhost:9200/test_index/_doc/1" -H 'Content-Type: application/json' -d'{ "name": "John Doe" }'
# validate the data
curl -k -u "elastic:<password>" -XGET "https://localhost:9200/test_index/_doc/1"
# I have downloaded a JSON file with dummy data to load into this cluster
curl -v -k -u "elastic:<password>" -H "Content-Type: application/json" -XPOST "https://localhost:9200/test_index/_bulk?pretty&refresh" --data-binary "@accounts.json"

Once the data is uploaded, let's validate the index shards and other details.
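
The _cat APIs give a quick view of index health and shard placement; for example:

curl -k -u "elastic:<password>" "https://localhost:9200/_cat/indices?v"
curl -k -u "elastic:<password>" "https://localhost:9200/_cat/shards/test_index?v"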

In quest of understanding how systems work!
