Druid Monitoring Page Tutorial: A Comprehensive Guide


Introduction

Apache Druid is an open-source, real-time analytics database designed to handle large volumes of time-series data. The Druid monitoring page gives an overview of the health and performance of your Druid cluster. This tutorial walks through each section of the monitoring page and explains how to use it to troubleshoot and optimize your cluster.

Druid Cluster Overview

The cluster overview section provides a high-level view of the health and performance of your Druid cluster. It includes the following information:
Cluster name: The name of your Druid cluster.
Number of brokers: The number of brokers in your cluster.
Number of coordinators: The number of coordinators in your cluster.
Number of historicals: The number of historical nodes in your cluster.
Number of middle managers: The number of middle managers in your cluster.
Number of overlords: The number of overlords in your cluster.
Number of tasks: The total number of tasks running on your cluster.
Query time: The average query time for the past minute.
Segment count: The total number of segments in your cluster.
Segment size: The average size of segments in your cluster.
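
As a rough illustration of where such counts could come from, the sketch below tallies services by type from rows shaped like Druid's `sys.servers` system table, and includes the SQL that could be posted to the `/druid/v2/sql` endpoint. The address `localhost:8888`, the helper names, and the sample rows are assumptions for illustration only; the monitoring page itself may gather these numbers differently.

```python
import json
import urllib.request
from collections import Counter

# Assumed router address; adjust for your own deployment.
DRUID_SQL_URL = "http://localhost:8888/druid/v2/sql"

# Queries against Druid's system tables.
SERVERS_SQL = "SELECT server, server_type FROM sys.servers"
SEGMENTS_SQL = (
    'SELECT COUNT(*) AS segment_count, AVG("size") AS avg_segment_bytes '
    "FROM sys.segments WHERE is_published = 1"
)


def run_sql(query, url=DRUID_SQL_URL, timeout=10):
    """POST a SQL query to Druid and return the rows as a list of dicts."""
    body = json.dumps({"query": query}).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def count_services(rows):
    """Count services per server_type from sys.servers-style rows."""
    return dict(Counter(row["server_type"] for row in rows))


# Hypothetical sample rows, shaped like the result of SERVERS_SQL.
sample_rows = [
    {"server": "broker1:8082", "server_type": "broker"},
    {"server": "broker2:8082", "server_type": "broker"},
    {"server": "coord1:8081", "server_type": "coordinator"},
    {"server": "hist1:8083", "server_type": "historical"},
    {"server": "hist2:8083", "server_type": "historical"},
]
print(count_services(sample_rows))
```

Against a live cluster, `count_services(run_sql(SERVERS_SQL))` would produce the per-type counts, and `run_sql(SEGMENTS_SQL)` the segment count and average segment size.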

Broker Monitoring

The broker monitoring section provides insights into the health and performance of your Druid brokers. It includes the following information:
Broker status: The status of each broker in your cluster, that is, whether it is up or down. Brokers have no standby mode; every healthy broker serves queries.
Number of requests: The number of requests processed by each broker in the past minute.
Request latency: The average latency of requests processed by each broker in the past minute.
Number of errors: The number of errors encountered by each broker in the past minute.
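
The per-broker figures above could be derived from Druid metric events. The sketch below assumes events in the JSON shape Druid's emitters produce (`service`, `host`, `metric`, `value`), that each `query/time` event records one query's latency in milliseconds, and that `query/failed/count` is being emitted (which requires the query count monitor to be enabled). The function name and the sample events are hypothetical.

```python
from collections import defaultdict


def summarize_broker_events(events):
    """Aggregate request count, mean latency, and failures per broker host."""
    latencies = defaultdict(list)
    failures = defaultdict(int)
    for event in events:
        # Ignore metrics from other services (historicals, coordinators, ...).
        if event.get("service") != "druid/broker":
            continue
        host = event["host"]
        if event["metric"] == "query/time":
            latencies[host].append(event["value"])
        elif event["metric"] == "query/failed/count":
            failures[host] += event["value"]

    summary = {}
    for host in sorted(set(latencies) | set(failures)):
        times = latencies.get(host, [])
        summary[host] = {
            "requests": len(times),
            "avg_latency_ms": sum(times) / len(times) if times else None,
            "errors": failures.get(host, 0),
        }
    return summary


# Hypothetical events covering one minute.
sample_events = [
    {"service": "druid/broker", "host": "broker1:8082", "metric": "query/time", "value": 120},
    {"service": "druid/broker", "host": "broker1:8082", "metric": "query/time", "value": 80},
    {"service": "druid/broker", "host": "broker1:8082", "metric": "query/failed/count", "value": 1},
    {"service": "druid/broker", "host": "broker2:8082", "metric": "query/time", "value": 50},
    {"service": "druid/historical", "host": "hist1:8083", "metric": "query/time", "value": 30},
]
broker_summary = summarize_broker_events(sample_events)
```

A broker whose error count rises while its request count stays flat is usually a better first suspect than one that is simply busy.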

Coordinator Monitoring

The coordinator monitoring section provides insights into the health and performance of your Druid coordinators. It includes the following information:
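Coordinator status: The status of each coordinator in your cluster, including whether it is active (the elected leader), standby, or down.
Number of segments: The number of segments managed by each coordinator. Only the active coordinator manages segments; a standby takes over if it fails.
Number of tasks: The number of tasks, such as compaction tasks, submitted by each coordinator. Coordinators hand these tasks to the overlord rather than running them.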
Number of errors: The number of errors encountered by each coordinator in the past minute.
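
To make the active/standby/down distinction concrete, here is a minimal sketch that classifies a coordinator from its `/druid/coordinator/v1/isLeader` response and lists datasources that are not fully loaded according to `/druid/coordinator/v1/loadstatus`. It assumes the leader answers with HTTP 200 and `{"leader": true}` while a non-leader answers with HTTP 404; the function names and sample values are hypothetical.

```python
import json
import urllib.error
import urllib.request


def role_from_response(status_code, payload):
    """Map an isLeader response to 'active', 'standby', or 'down'."""
    if status_code == 200 and payload.get("leader") is True:
        return "active"
    if status_code in (200, 404):
        return "standby"
    return "down"


def probe_coordinator(base_url, timeout=5):
    """Query one coordinator, e.g. base_url='http://coord1:8081' (assumed)."""
    url = f"{base_url}/druid/coordinator/v1/isLeader"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            payload = json.loads(response.read().decode("utf-8"))
            return role_from_response(response.status, payload)
    except urllib.error.HTTPError as error:
        # The process answered, but with an error status such as 404.
        return role_from_response(error.code, {})
    except (OSError, ValueError):
        # Connection refused, timeout, or an unparseable body.
        return "down"


def incomplete_datasources(load_status):
    """Return datasources whose loaded percentage is below 100."""
    return {name: pct for name, pct in load_status.items() if pct < 100}


# Hypothetical loadstatus payload: datasource name -> percent loaded.
sample_load_status = {"wikipedia": 100.0, "events": 87.5}
lagging = incomplete_datasources(sample_load_status)
```

Exactly one coordinator should report "active" at any time; zero or several is worth investigating before looking at segment counts.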

Historical Monitoring

The historical monitoring section provides insights into the health and performance of your Druid historical nodes. It includes the following information:
Historical status: The status of each historical node in your cluster, that is, whether it is up or down. Historical nodes have no standby mode.
Number of segments: The number of segments stored on each historical node.
Segment size: The average size of segments stored on each historical node.
Number of queries: The number of queries processed by each historical node in the past minute.
Query latency: The average latency of queries processed by each historical node in the past minute.
Number of errors: The number of errors encountered by each historical node in the past minute.
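
Segment counts and sizes per historical node could be read from Druid's system tables. The sketch below is one way to do it, assuming rows from `sys.servers` (with its `curr_size` and `max_size` columns, in bytes) and flagging nodes whose segment cache is nearly full. The 80 percent threshold, the function name, and the sample rows are assumptions for illustration.

```python
# SQL that could be sent to /druid/v2/sql to obtain the input rows.
HISTORICAL_USAGE_SQL = (
    "SELECT server, curr_size, max_size FROM sys.servers "
    "WHERE server_type = 'historical'"
)
SEGMENTS_PER_SERVER_SQL = (
    "SELECT server, COUNT(*) AS segment_count "
    "FROM sys.server_segments GROUP BY server"
)


def fill_report(rows, warn_percent=80.0):
    """Return (server, percent_full, is_warning) tuples, fullest first."""
    report = []
    for row in rows:
        max_size = row["max_size"]
        # Guard against nodes that report no capacity.
        percent = round(100 * row["curr_size"] / max_size, 1) if max_size else 0.0
        report.append((row["server"], percent, percent >= warn_percent))
    return sorted(report, key=lambda item: item[1], reverse=True)


# Hypothetical rows shaped like the result of HISTORICAL_USAGE_SQL.
sample_historicals = [
    {"server": "hist2:8083", "curr_size": 200, "max_size": 1000},
    {"server": "hist1:8083", "curr_size": 850, "max_size": 1000},
]
usage = fill_report(sample_historicals)
```

A large spread between the fullest and emptiest node suggests that segment balancing is lagging behind.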

Middle Manager Monitoring

The middle manager monitoring section provides insights into the health and performance of your Druid middle managers. It includes the following information:
Middle manager status: The status of each middle manager in your cluster, that is, whether it is up or down. Middle managers have no standby mode.
Number of segments: The number of segments being built by tasks on each middle manager.
Number of tasks: The number of tasks running on each middle manager.
Number of errors: The number of errors encountered by each middle manager in the past minute.
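
The task load on each middle manager can be sketched from the overlord's `/druid/indexer/v1/workers` response, which lists each worker with its task capacity and the capacity currently in use. The example below assumes the `worker.host`, `worker.capacity`, `currCapacityUsed`, and `runningTasks` fields of that response; the function name and the sample payload are hypothetical.

```python
def slot_usage(workers):
    """Summarize task slots per middle manager from a workers payload."""
    usage = {}
    for entry in workers:
        capacity = entry["worker"]["capacity"]
        used = entry["currCapacityUsed"]
        usage[entry["worker"]["host"]] = {
            "capacity": capacity,
            "used": used,
            "free": capacity - used,
            "running_tasks": len(entry.get("runningTasks", [])),
        }
    return usage


# Hypothetical payload shaped like the /druid/indexer/v1/workers response.
sample_workers = [
    {
        "worker": {"host": "mm1:8091", "capacity": 4},
        "currCapacityUsed": 3,
        "runningTasks": ["task_a", "task_b", "task_c"],
    },
    {
        "worker": {"host": "mm2:8091", "capacity": 4},
        "currCapacityUsed": 0,
        "runningTasks": [],
    },
]
slots = slot_usage(sample_workers)
total_free = sum(item["free"] for item in slots.values())
```

When `total_free` reaches zero, newly submitted tasks wait in the pending state until a slot opens.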

Overlord Monitoring

The overlord monitoring section provides insights into the health and performance of your Druid overlords. It includes the following information:
Overlord status: The status of each overlord in your cluster, including whether it is active (the elected leader), standby, or down.
Number of segments: The number of segments published by tasks under each overlord.
Number of tasks: The number of tasks managed by each overlord. In a typical deployment, the tasks themselves run on middle managers.
Number of errors: The number of errors encountered by each overlord in the past minute.

Task Monitoring

The task monitoring section provides insights into the health and performance of your Druid tasks. It includes the following information:
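Task status: The status of each task in your cluster, including whether it is running, pending, waiting, successful, or failed.
Task type: The type of each task, such as ingestion, compaction, or kill (segment deletion). Ordinary queries are served by brokers and historicals rather than run as tasks.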
Task progress: The progress of each task, expressed as a percentage.
Task duration: The duration of each task, expressed in milliseconds.
Task errors: The number of errors encountered by each task.
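
A similar summary of task state could be built from Druid's `sys.tasks` system table. The sketch below assumes rows with the `task_id`, `type`, `status`, `duration`, and `error_msg` columns and groups them by status, collecting the error message of each failed task. The function name and the sample rows are hypothetical.

```python
from collections import Counter

# SQL that could be sent to /druid/v2/sql to obtain the input rows.
TASKS_SQL = (
    'SELECT task_id, "type", datasource, status, duration, error_msg '
    "FROM sys.tasks"
)


def task_report(rows):
    """Return (counts by status, list of (task_id, error_msg) for failures)."""
    counts = dict(Counter(row["status"] for row in rows))
    failed = [
        (row["task_id"], row.get("error_msg"))
        for row in rows
        if row["status"] == "FAILED"
    ]
    return counts, failed


# Hypothetical rows shaped like the result of TASKS_SQL.
sample_tasks = [
    {"task_id": "task_a", "type": "index_kafka", "status": "RUNNING",
     "duration": -1, "error_msg": None},
    {"task_id": "task_b", "type": "compact", "status": "SUCCESS",
     "duration": 42000, "error_msg": None},
    {"task_id": "task_c", "type": "index_parallel", "status": "FAILED",
     "duration": 1500, "error_msg": "Example error message"},
]
status_counts, failed_tasks = task_report(sample_tasks)
```

Reading the error message of the first failed task is often quicker than scanning the task list by eye.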

Conclusion

The Druid monitoring page helps you troubleshoot and optimize your Druid cluster. Once you understand its sections, you can quickly spot issues that affect cluster performance and take steps to resolve them.

2025-01-06

