Our services
Get in touch
August 9, 2023

Kubernetes Monitoring with Datadog

Time needed: 


Kubernetes monitoring is crucial for ensuring the optimal performance, availability, and reliability of your containerized applications running in a Kubernetes cluster. With the complexity and scale of Kubernetes deployments, effective monitoring becomes essential for identifying and resolving issues quickly. Datadog, a popular monitoring platform, provides comprehensive Kubernetes monitoring capabilities. It offers real-time visibility into the health and performance of your cluster, including metrics, logs, and traces. With Datadog, you can gain insights into resource utilization, application performance, and container orchestration. This enables proactive troubleshooting, efficient resource allocation, and effective capacity planning, ensuring the smooth operation of your Kubernetes environment and facilitating application scalability and stability.

1. Prerequisite.

. Install Datadog Agent on EKS

• Install Datadog Cluster Agent

• Configure permissions and secrets

a. Creating ClusterRole, ClusterRoleBinding, and ServiceAccount for allowing permission to cluster agent and datadog agent to collect metrices.

• Creating Kubernetes Secret to provide your Datadog API key

• Deploy the datadog-cluster-agent and datadog-agent on EKS using yaml files. (datadog-cluster-agent.yaml, datadog-agent.yaml.)

Install Datadog Agent on EKS:-

The Datadog Agent is free software that enables you to observe and manage your complete infrastructure in one location by gathering metrics, distributed traces, and logs from each of your nodes and reporting them.

The Agent automatically gathers and provides resource measurements (such as CPU, memory, and network traffic) from your nodes, regardless of the underlying infrastructure platform, in addition to gathering telemetry data from Kubernetes, Docker, and other infrastructure technologies.

Install Datadog Cluster Agent:-

By acting as a proxy between the API server and the node-based Agents, the Datadog Cluster Agent reduces the load on the Kubernetes API server for collecting cluster-level data. It also adds security by lowering the permissions required for the node-based Agents, and it allows Kubernetes workloads to be automatically scaled using any metric that Datadog collects.

Configure permissions and secrets:-

The following manifests can be deployed to create the permissions that the node-based Agent and Cluster Agent will need to function in your Kubernetes cluster if it implements role-based access control. The following manifests provide two sets of permissions: one for the node-based Agent and one for the cluster agent. The cluster agent has rights specifically for gathering cluster-level metrics and Kubernetes events via the Kubernetes API. For each type of Agent, deploying these two manifests will result in the creation of a ClusterRole, ClusterRoleBinding, and ServiceAccount.

2. Tasks To Do.

The Github repository URL is below. In that repository, you can find the configuration file. You must run that file on the EKS Cluster so that the node-based Agent and Cluster Agent may conduct tasks.


kubectl create -f https://raw.githubusercontent.com/DataDog/datadog-agent/master/Dockerfiles/manifests/cluster-agent/cluster-agent-rbac.yaml

kubectl create -f https://raw.githubusercontent.com/DataDog/datad og-agent/master/Dockerfiles/manifests/cluster-agent/rbac.yaml

Create a Kubernetes secret next so you may give the Agent your Datadog API key without include it in your deployment manifests.

kubectl create secret generic datadog-secret — from-literal api-key=”<YOUR_API_KEY>”

In order to provide secure Agent-to-Agent communication between the Cluster Agent and the node-based Agents, construct a secret token as follows:

Create a 32-bit long string using the link below, then use it in the command below.

echo -n <32 String long password> | base64

Use the resulting token to create a Kubernetes secret that both flavors of Agent will use to authenticate with each other:

Use the token in below command which is generated from above command.

Kubectl create secret generic datadog-auth-token –from literal=token=<TOKEN_FROM_PREVIOUS_STEP>

Deploy the Cluster Agent:-

You’re prepared to deploy the Cluster Agent now that you’ve created Kubernetes secrets using your Datadog API key and an authentication token. Copy the manifest file from the aforementioned Gitub repo link to your local computer and save it there as datadog-cluster-agent.yaml:

After copy that file to local you have to run that file on cluster so it will deploy that cluster agent on node, so use below command to deploy agent.

kubectl apply -f datadog-cluster-agent.yaml

For the Cluster Agent, the manifest establishes a Kubernetes deployment and service. The Service offers a consistent endpoint within the cluster so that node-based Agents can communicate with the Cluster Agent, wherever it may be running, while the Deployment ensures that a single Cluster Agent is always running somewhere in the cluster. It should be noted that rather than being saved in plaintext in the manifest itself, the Datadog API key and authentication token are obtained through Kubernetes secrets.

check the status of cluster agent using below command:

kubectl get pods -l app=datadog-cluster-agent

Deploy the node-based Agent:

The node-based Datadog Agent is easy to install to your cluster once the required permissions and secrets have been generated. DD_CLUSTER_AGENT_ENABLED (set to true) and DD_CLUSTER_AGENT_AUTH_TOKEN (set using Kubernetes secrets, much like in the Cluster Agent manifest) are two additional environment variables that are set in the manifest that follows the normal Kubernetes Agent manifest. Save the following manifest as datadog-agent.yaml and copy it to a local file.

Similar to the previous cluster agent, you must copy the datadog-agent file to local storage before deploying the datadog agent on the cluster. Once the node-based Agent is deployed as a DaemonSet, use the following command to make sure that one instance of the Agent is running on each node in the cluster.

kubectl apply -f -f datadog-agent.yaml

To verify that the node-based Datadog Agent is running on your cluster, run the following command:

kubectl get daemonset datadog-agent

After this all the configuration you will be able to see the resources and metrices in the datadog console.

Dive into the metrics

The resource measurements and events from your cluster should be streaming into Datadog after the Datadog Agent has been successfully deployed. The built-in Kubernetes dashboard allows you to view the data you’ve already started gathering.

You might remember from earlier in this series that an optional cluster add-on called kube-state-metrics offers specific cluster-level metrics, more specifically the counts of Kubernetes objects like the count of desired, available, and unavailable pods. If you notice that this information is missing from the dashboard, it indicates that the kube-state-metrics service has not yet been installed. You only need to deploy kube-state-metrics to your cluster to start collecting these statistics in addition to the lower-level resource metrics that the Agent already gathers.

Deploy kube-state-metrics

You may rapidly deploy the add-on and its related resources by using a set of manifests from the official kube-state-metrics project, as was discussed in Part 3 of this series. Run the following commands to get the manifests and apply them to your cluster:

git clone https://github.com/kubernetes/kube-state-metrics.git

cd kube-state-metrics

kubectl apply -f examples/standard

Below are the screenshots from datadog console.