Monitoring of Data Workload Operators

Introduction
Monitoring is a critical part of running data workloads on Kubernetes. As we develop the plugin ecosystem for OpenEverest, we are researching how existing operators handle monitoring so that our integrations follow industry best practices. Operators differ in how they expose metrics and how they integrate with monitoring stacks. This blog post looks at how several of them implement monitoring and observability for their data workloads. We focus on metrics collection and monitoring integration; distributed tracing may be covered in a future post.
Monitoring Integration Patterns
Many Kubernetes data workload operators follow similar patterns for monitoring:
- Metrics Exporters: Dedicated containers or sidecars that expose metrics
- Prometheus Integration: The de facto standard for metrics collection in Kubernetes
- Service Discovery: Prometheus finds scrape endpoints automatically through the Kubernetes API instead of relying on static target lists
- Grafana Dashboards: Pre-built dashboards for visualizing metrics
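To make the exporter pattern concrete, here is a minimal sketch of what a metrics exporter does: it renders current values in the Prometheus text exposition format and serves them over HTTP at `/metrics`. The metric names, values, and port are hypothetical, and a real exporter would query the data store rather than return constants.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(samples):
    """Render gauge samples in the Prometheus text exposition format.

    `samples` maps a metric name to a (help_text, value) pair.
    """
    lines = []
    for name, (help_text, value) in sorted(samples.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical values; a real exporter would read these from the workload.
SAMPLES = {
    "demo_db_connections": ("Open client connections.", 12),
    "demo_db_up": ("Whether the database answered the last probe.", 1),
}


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the /metrics path is served; everything else is a 404.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(SAMPLES).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Serve /metrics until interrupted (hypothetical port)."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` from a sidecar container's entrypoint would give Prometheus an endpoint to scrape; operators with built-in metrics do the equivalent inside the workload itself.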
Operator Comparison
The following table summarizes monitoring capabilities across different operators:
| Operator | Metrics Exposure | Monitoring Backend | Dashboards |
|---|---|---|---|
| ClickHouse Operator | Built-in | Prometheus | Grafana |
| Milvus Operator | Built-in | Prometheus | Grafana |
| Kafka Operator (Strimzi) | JMX Exporter | Prometheus | Grafana |
| Redis Operator | Redis Exporter (sidecar) | Prometheus | Grafana |
| CloudNativePG Operator | Built-in | Prometheus | Grafana |
| TiDB Operator | Built-in | Prometheus / VictoriaMetrics | Grafana + TiDB Dashboard |
Understanding Prometheus Operator Custom Resources
The Prometheus Operator introduces Custom Resources (CRs) that simplify the configuration of Prometheus monitoring in Kubernetes. Two key resources are ServiceMonitor and PodMonitor. Both CRs provide automatic service discovery, eliminating the need to manually update Prometheus configuration files when pods or services are added or removed.
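Both resources pick their targets by label. As a rough illustration of why no manual reconfiguration is needed, the sketch below mimics equality-based (`matchLabels`) selection in plain Python: any object whose labels contain every pair in the selector is discovered. The service names and labels are hypothetical, and set-based `matchExpressions` are left out.

```python
def matches(selector, labels):
    """Return True if every key/value pair in `selector` appears in `labels`."""
    return all(labels.get(key) == value for key, value in selector.items())


def discover(selector, objects):
    """Return the names of objects whose labels satisfy the selector."""
    return [o["name"] for o in objects if matches(selector, o["labels"])]


# Hypothetical services as they might exist in a namespace.
services = [
    {"name": "orders-db-metrics", "labels": {"app": "orders-db", "tier": "data"}},
    {"name": "web-frontend", "labels": {"app": "web"}},
]

print(discover({"app": "orders-db"}, services))  # ['orders-db-metrics']
```

A new service carrying the `app: orders-db` label would be picked up on the next evaluation without touching the selector.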
ServiceMonitor
ServiceMonitor is a CR that declaratively specifies how groups of Kubernetes services should be monitored. Instead of manually configuring Prometheus scrape targets, you define a ServiceMonitor that references services using label selectors.
ServiceMonitor is ideal when:
- Metrics are exposed via Kubernetes Services
- You want to monitor all pods behind a service uniformly
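A ServiceMonitor for this case can be quite small. The sketch below builds one as a plain dictionary and prints it as JSON, which `kubectl apply -f -` accepts alongside YAML. The resource name, namespace, labels, and port name are hypothetical; `port` must match the name of a port on the selected Service.

```python
import json


def service_monitor(name, namespace, match_labels, port,
                    path="/metrics", interval="30s"):
    """Build a ServiceMonitor manifest as a plain dict."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Services carrying these labels become scrape targets.
            "selector": {"matchLabels": match_labels},
            # One entry per named Service port to scrape.
            "endpoints": [{"port": port, "path": path, "interval": interval}],
        },
    }


# Hypothetical names for illustration only.
manifest = service_monitor(
    name="orders-db",
    namespace="databases",
    match_labels={"app": "orders-db"},
    port="metrics",
)
print(json.dumps(manifest, indent=2))
```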
PodMonitor
PodMonitor is similar to ServiceMonitor but directly targets pods instead of services. This is useful when you need to scrape metrics from pods that don’t have a corresponding service, or when you need more granular control over individual pod monitoring.
PodMonitor is ideal when:
- Pods expose metrics without going through a service
- Metrics endpoints are pod-specific (e.g., individual database instances)
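The PodMonitor equivalent differs mainly in that the selector matches pod labels and the scrape settings live under `podMetricsEndpoints`. The sketch below assumes a hypothetical set of database pods labelled `app: orders-db` that expose a container port named `metrics`.

```python
import json


def pod_monitor(name, namespace, match_labels, port, path="/metrics"):
    """Build a PodMonitor manifest as a plain dict."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PodMonitor",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Pods (not Services) carrying these labels become scrape targets.
            "selector": {"matchLabels": match_labels},
            # `port` is the name of a container port on the selected pods.
            "podMetricsEndpoints": [{"port": port, "path": path}],
        },
    }


# Hypothetical names for illustration only.
manifest = pod_monitor(
    name="orders-db-instances",
    namespace="databases",
    match_labels={"app": "orders-db"},
    port="metrics",
)
print(json.dumps(manifest, indent=2))
```

Each matching pod is scraped as its own target, which is what makes this a good fit for per-instance database metrics.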
Details of Operators
ClickHouse Operator
The ClickHouse Operator exposes metrics directly from ClickHouse pods. It integrates with Prometheus Operator using Kubernetes service discovery and supports Grafana for visualization.
- Metrics exposure: Built-in metrics
- Monitoring: Prometheus Operator (config template provided by the project)
- Dashboards: Set up using the Grafana Operator
Milvus Operator
The Milvus Operator exposes metrics from each Milvus component. It integrates with Prometheus Operator using ServiceMonitor CR for component discovery.
- Metrics exposure: Built-in metrics
- Monitoring: Prometheus Operator using ServiceMonitor CR (see the Milvus docs)
- Dashboards: Visualize metrics using Grafana
Kafka Operator (Strimzi)
The Strimzi Kafka Operator exports metrics via the Prometheus JMX Exporter. Its example Prometheus Operator configuration uses PodMonitor resources for discovery, and the project provides example Grafana dashboards.
- Metrics exposure: JMX Exporter (Java agent)
- Monitoring: Prometheus Operator using PodMonitor CRs (see the Strimzi docs)
- Dashboards: Example Grafana dashboards
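Enabling the JMX Exporter in Strimzi comes down to pointing the Kafka resource at a ConfigMap that holds the exporter rules. The sketch below builds that fragment as a JSON merge patch. The `metricsConfig` layout reflects our reading of the Strimzi Kafka resource and should be checked against the docs for your version; the ConfigMap name and key are hypothetical.

```python
import json


def kafka_metrics_patch(configmap_name, key):
    """Build a merge patch that enables the JMX Prometheus Exporter on a Kafka CR."""
    return {
        "spec": {
            "kafka": {
                "metricsConfig": {
                    "type": "jmxPrometheusExporter",
                    # The exporter rules are read from this ConfigMap key.
                    "valueFrom": {
                        "configMapKeyRef": {"name": configmap_name, "key": key}
                    },
                }
            }
        }
    }


# Hypothetical ConfigMap holding the JMX Exporter rules.
patch = kafka_metrics_patch("kafka-metrics", "kafka-metrics-config.yml")
print(json.dumps(patch))
```

The printed JSON could be passed to `kubectl patch` with `--type merge` against an existing Kafka resource, or the same structure can be written directly into the resource's YAML.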
Redis Operator
The Redis Operator by Opstree Solutions uses a sidecar exporter. It integrates with Prometheus Operator via PodMonitor CR. Metrics can be visualized in Grafana.
- Metrics exposure: Redis Exporter sidecar
- Monitoring: Prometheus Operator + PodMonitor (see the Redis Operator docs)
- Dashboards: Grafana dashboards
CloudNativePG Operator
CloudNativePG exposes metrics from each PostgreSQL instance. It works with Prometheus Operator using PodMonitor CR.
- Metrics exposure: Built-in
- Monitoring: Prometheus Operator + PodMonitor (see the CloudNativePG docs)
- Dashboards: Set up a Grafana dashboard to monitor CloudNativePG
TiDB Operator
The TiDB Operator exposes metrics from each component. It supports both Prometheus Operator and VictoriaMetrics Operator for flexible monitoring backend selection.
- Metrics exposure: Built-in
- Monitoring: Prometheus Operator or VictoriaMetrics via custom resources (see the TiDB Operator docs)
- Dashboards: TiDB Dashboard and Grafana
Best Practices
When implementing monitoring for operator-managed workloads, consider these best practices:
- Enable Service Discovery: Automatic endpoint discovery reduces manual configuration
- Deploy Grafana Dashboards: Pre-built dashboards provide immediate visibility
Conclusion
The operators surveyed here have all converged on Prometheus as the standard for metrics collection, and most provide native integration through Prometheus Operator. Service discovery, ready-made exporters, and pre-built Grafana dashboards mean that basic observability for data workloads on Kubernetes takes little manual configuration.
By understanding the monitoring capabilities of each operator, you can make informed decisions about which solution best fits your observability requirements and existing monitoring infrastructure.
