The agent gathers metrics related to a node and the containers running on it, and exposes them in the Prometheus format.
It uses eBPF to trace container-related events such as TCP connects, so the minimum supported Linux kernel version is 4.16.
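As a quick pre-flight check for the kernel requirement above, the following is a minimal sketch that compares a kernel release string against 4.16. The helper names (parse_kernel_release, kernel_supported) are illustrative and not part of the agent; the script only makes sense on Linux, where platform.release() returns a string such as 5.15.0-91-generic.

```python
import platform
import re

MIN_KERNEL = (4, 16)  # minimum version stated in the text above


def parse_kernel_release(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_supported(release: str) -> bool:
    """Return True if the release is at least MIN_KERNEL."""
    return parse_kernel_release(release) >= MIN_KERNEL


if __name__ == "__main__":
    release = platform.release()
    print(release, "supported" if kernel_supported(release) else "too old")
```

Comparing (major, minor) tuples avoids the usual string-comparison trap, where "4.9" would sort after "4.16".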
Features
TCP connection tracing
To provide visibility into the relationships between services, the agent traces containers' TCP events, such as connect() and listen().
Exported metrics are useful for:
- Obtaining an actual map of inter-service communications. It doesn't require integrating distributed tracing frameworks into your code.
- Detecting connection errors from one service to another.
- Measuring network latency between containers, nodes and availability zones.
Log patterns extraction
Log management is usually quite expensive. In most cases, you do not need to analyze each event individually.
It is enough to extract recurring patterns and the number of related events.
This approach drastically reduces the amount of data required for express log analysis.
The agent discovers container logs and parses them right on the node.
At the moment the following sources are supported:
- Direct logging to files in /var/log/
- Journald
- Dockerd (JSON file driver)
- Containerd (CRI logs)
To learn more about automatic log clustering, check out the blog post “Mining metrics from unstructured logs”.
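The idea of counting recurring patterns instead of storing every event can be illustrated with a small sketch. This is not the agent's clustering algorithm, only a simplified stand-in: it masks variable-looking tokens (IP addresses, hex values, numbers) with placeholders and counts the resulting templates. The placeholder names and the choice of masks are assumptions made for the example.

```python
import re
from collections import Counter

# Order matters: more specific token shapes are masked first.
_MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<ip>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<hex>"),
    (re.compile(r"\b\d+\b"), "<num>"),
]


def to_pattern(line: str) -> str:
    """Replace variable tokens with placeholders and normalize whitespace."""
    for regex, placeholder in _MASKS:
        line = regex.sub(placeholder, line)
    return " ".join(line.split())


def count_patterns(lines):
    """Return a Counter mapping each pattern to the number of matching lines."""
    return Counter(to_pattern(line) for line in lines)


if __name__ == "__main__":
    sample = [
        "connection to 10.0.0.1 failed after 3 retries",
        "connection to 10.0.0.2 failed after 5 retries",
        "worker 7 started",
    ]
    for pattern, count in count_patterns(sample).most_common():
        print(count, pattern)
```

With this kind of reduction, only the pattern and its counter need to leave the node, rather than every log line.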
Delay accounting
Delay accounting allows engineers to accurately identify situations where a container is experiencing a lack of CPU time or waiting for I/O.
The agent gathers per-process counters through Netlink and aggregates them into per-container metrics:
- container_resources_cpu_delay_seconds_total
- container_resources_disk_delay_seconds_total
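Because both metrics are cumulative counters of seconds, the useful signal is how fast they grow. The sketch below parses two scrapes in the Prometheus text format and computes the delay accumulated per second of wall-clock time. It is a simplified illustration: the container_id label and its value are hypothetical, timestamps on sample lines are not handled, and counter resets are ignored. In practice a PromQL rate() over the same counter serves the same purpose.

```python
def parse_samples(text: str) -> dict[str, float]:
    """Parse Prometheus text-format lines into {series: value}.

    Comment lines (# HELP / # TYPE) and blank lines are skipped. The series
    key is the metric name together with its label set, exactly as written.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


def delay_ratio(before, after, series, interval_seconds):
    """Seconds of delay accumulated per second of wall-clock time."""
    return (after[series] - before[series]) / interval_seconds


if __name__ == "__main__":
    # Hypothetical label set, used only for illustration.
    series = 'container_resources_cpu_delay_seconds_total{container_id="/docker/app"}'
    first = parse_samples(f"{series} 12.5\n")
    second = parse_samples(f"{series} 15.5\n")
    print(delay_ratio(first, second, series, interval_seconds=15.0))
```

A ratio close to zero means the container rarely waits; a ratio near or above one means that, on average, at least one of its processes was waiting for the resource during the interval.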
Out-of-memory events tracing
The container_oom_kills_total metric reveals that a container has been terminated by the OOM killer.
Instance metadata
If a node is a cloud instance, the agent identifies the cloud provider and collects additional information using the related metadata services.
Supported cloud providers: AWS, GCP, Azure
Collected info:
- AccountID
- InstanceID
- Instance/machine type
- Region
- AvailabilityZone + AvailabilityZoneId (AWS only)
- LifeCycle: on-demand/spot (AWS and GCP only)
- Private & Public IP addresses
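To make the list above more concrete: on AWS, the instance identity document served by the instance metadata service is a JSON object that carries several of these fields (accountId, instanceId, instanceType, region, availabilityZone, privateIp). The sketch below parses such a document into a small record. Whether the agent reads this particular document is an assumption, the InstanceInfo type is hypothetical, and the sample values are made up; no network request is performed.

```python
import json
from dataclasses import dataclass


@dataclass
class InstanceInfo:
    account_id: str
    instance_id: str
    instance_type: str
    region: str
    availability_zone: str
    private_ip: str


def parse_identity_document(raw: str) -> InstanceInfo:
    """Map an AWS instance identity document (JSON text) onto InstanceInfo."""
    doc = json.loads(raw)
    return InstanceInfo(
        account_id=doc["accountId"],
        instance_id=doc["instanceId"],
        instance_type=doc["instanceType"],
        region=doc["region"],
        availability_zone=doc["availabilityZone"],
        private_ip=doc["privateIp"],
    )


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample = json.dumps({
        "accountId": "123456789012",
        "instanceId": "i-0123456789abcdef0",
        "instanceType": "t3.medium",
        "region": "us-east-1",
        "availabilityZone": "us-east-1a",
        "privateIp": "10.0.1.15",
    })
    print(parse_identity_document(sample))
```

Fields that are not part of this document, such as the public IP address or the lifecycle (on-demand/spot), would have to come from other metadata endpoints and are left out of the sketch.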
Run
Requirements
The agent requires some privileges to access container data, such as logs, performance counters and TCP sockets:
- privileged mode (securityContext.privileged: true)
- the host process ID namespace (hostPID: true)
- /sys/fs/cgroup and /sys/kernel/debug should be mounted to the agent’s container
Kubernetes
apiVersion: v1
kind: Namespace
metadata:
  name: coroot
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: coroot-node-agent
  name: coroot-node-agent
  namespace: coroot
spec:
  selector:
    matchLabels:
      app: coroot-node-agent
  template:
    metadata:
      labels:
        app: coroot-node-agent
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '80'
    spec:
      tolerations:
        - operator: Exists
      hostPID: true
      containers:
        - name: coroot-node-agent
          image: ghcr.io/coroot/coroot-node-agent:latest
          args: ["--cgroupfs-root", "/host/sys/fs/cgroup"]
          ports:
            - containerPort: 80
              name: http
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /host/sys/fs/cgroup
              name: cgroupfs
              readOnly: true
            - mountPath: /sys/kernel/debug
              name: debugfs
              readOnly: false
      volumes:
        - hostPath:
            path: /sys/fs/cgroup
          name: cgroupfs
        - hostPath:
            path: /sys/kernel/debug
          name: debugfs
If you use Prometheus Operator, you will also need to create a PodMonitor:
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: coroot-node-agent
  namespace: coroot
spec:
  selector:
    matchLabels:
      app: coroot-node-agent
  podMetricsEndpoints:
    - port: http
Make sure the PodMonitor matches the podMonitorSelector defined in your Prometheus:
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
...
spec:
  ...
  podMonitorNamespaceSelector: {}
  podMonitorSelector: {}
  ...
The special value {} allows Prometheus to watch all the PodMonitors from all namespaces.
Docker
docker run --detach --name coroot-node-agent \
    --privileged --pid host \
    -v /sys/kernel/debug:/sys/kernel/debug:rw \
    -v /sys/fs/cgroup:/host/sys/fs/cgroup:ro \
    ghcr.io/coroot/coroot-node-agent --cgroupfs-root=/host/sys/fs/cgroup
Flags