Introduction
Overview
This article describes the APIs for checking the health status of Elasticsearch.
Official documentation
https://www.elastic.co/guide/en/elasticsearch/reference/7.1/cluster-health.html
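Checking status with the Head plugin
The Head plugin can display the status of ES.
The Head plugin shows the health color at a glance, but it has two drawbacks:
- In production, ES ports are usually not exposed to the public network, so the Head plugin cannot be used to check the status.
- The Head plugin shows only the color, not the detailed cause of an error.
So, to check the status of a production ES cluster or to see the detailed cause of an error, use the APIs below.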
API reference
_cat/health
API | Purpose |
GET _cat/health | Shows the health information of the cluster. |
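_cat/shards
API | Purpose |
GET _cat/shards | Lists the shards held by each node, including whether a shard is a primary or a replica, its document count, the bytes it occupies on disk, and the node where it is located. |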
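_cluster/health
API | Purpose |
GET _cluster/health | Status of the cluster (use it to check the number of nodes). |
GET _cluster/health?level=indices | Health of all indices (use it to find the indices that have problems). |
GET _cluster/health/my_index | Health of a single index (use it to inspect a specific index). |
GET _cluster/health?level=shards | Health reported down to the shard level. |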
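_cluster/allocation/explain
API | Purpose |
GET _cluster/allocation/explain | Returns the reason why the first unassigned shard it finds is unassigned. |
GET /_cluster/allocation/explain { "index": "myindex", "shard": 0, "primary": true } | Shows why a specific shard is unassigned. index: the index name. shard: the shard number, counted from 0. primary: true for the primary shard, false for a replica. |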
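The three _cluster/health variants listed above differ only in an optional index name and an optional level parameter. A minimal sketch of a URL builder follows; the function name cluster_health_url and the base URL http://IP:9200 are illustrative assumptions, not part of any Elasticsearch client.

```python
from typing import Optional
from urllib.parse import quote, urlencode

# Values accepted by the `level` query parameter of _cluster/health.
VALID_LEVELS = ("cluster", "indices", "shards")


def cluster_health_url(base_url: str,
                       index: Optional[str] = None,
                       level: Optional[str] = None) -> str:
    """Build a _cluster/health URL (hypothetical helper for illustration)."""
    url = f"{base_url}/_cluster/health"
    if index:
        # Restrict the report to one index, e.g. /_cluster/health/my_index
        url += "/" + quote(index, safe="")
    if level:
        if level not in VALID_LEVELS:
            raise ValueError(f"level must be one of {VALID_LEVELS}")
        url += "?" + urlencode({"level": level})
    return url


print(cluster_health_url("http://IP:9200"))
print(cluster_health_url("http://IP:9200", level="indices"))
print(cluster_health_url("http://IP:9200", index="my_index"))
```

The resulting strings can be passed to any HTTP client, or pasted into a browser that can reach the cluster.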
Example 1: cluster health summary (_cat/health)
Method
http://IP:9200/_cat/health
Healthy result
1635328870 10:01:10 kubernetes-logging green 15 10 2160 1080 2 0 0 0 - 100.0%
Unhealthy result
1635313779 05:49:39 kubernetes-logging red 15 10 2128 1064 0 0 32 0 - 98.5%
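The two result lines above are easier to read once each value is paired with its column name. The sketch below assumes the default column order of _cat/health in Elasticsearch 7.x (the order printed when v=true is added); the helper name parse_cat_health is made up for illustration.

```python
# Default _cat/health columns in Elasticsearch 7.x (assumed order).
CAT_HEALTH_COLUMNS = [
    "epoch", "timestamp", "cluster", "status",
    "node.total", "node.data", "shards", "pri",
    "relo", "init", "unassign", "pending_tasks",
    "max_task_wait_time", "active_shards_percent",
]


def parse_cat_health(line: str) -> dict:
    """Pair each whitespace-separated value with its column name."""
    fields = line.split()
    if len(fields) != len(CAT_HEALTH_COLUMNS):
        raise ValueError(
            f"expected {len(CAT_HEALTH_COLUMNS)} fields, got {len(fields)}")
    return dict(zip(CAT_HEALTH_COLUMNS, fields))


healthy = parse_cat_health(
    "1635328870 10:01:10 kubernetes-logging green 15 10 2160 1080 2 0 0 0 - 100.0%")
unhealthy = parse_cat_health(
    "1635313779 05:49:39 kubernetes-logging red 15 10 2128 1064 0 0 32 0 - 98.5%")

# Values are kept as strings, exactly as they appear in the output.
print(healthy["status"], healthy["unassign"])
print(unhealthy["status"], unhealthy["unassign"])
```

Read this way, the unhealthy line reports a red status with 32 unassigned shards, while the healthy line has 2 relocating shards and none unassigned.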
Example 2: shard status
Method
http://IP:9200/_cat/shards?v=true&h=index,shard,prirep,state,node,unassigned.reason&s=state
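- v=true prints the header row with the column names; without it, the header row is omitted.
- h=<columns> selects the columns to display, separated by commas.
- s=state sorts the rows by state. It is equivalent to s=state:asc, because ascending order is the default.
- prirep is the shard type: p means a primary shard and r means a replica shard.
Result
Indices such as order_info and test_data contain unassigned replica shards, so their health status is yellow. The cluster status is therefore yellow as well, unless a primary shard of some index is also unassigned, in which case it is red.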
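To pick out the unassigned shards programmatically, one option is to request the same columns with format=json. In the plain-text output the node cell of an unassigned shard is blank, which makes whitespace splitting unreliable. The sketch below builds such a URL and filters the rows; the helper names and the sample rows are hypothetical, not real output from the cluster described here.

```python
import json
from urllib.parse import urlencode


def cat_shards_url(base_url: str) -> str:
    """Build the _cat/shards URL from this example, asking for JSON output."""
    params = {
        "format": "json",
        "h": "index,shard,prirep,state,node,unassigned.reason",
        "s": "state",
    }
    # safe="," keeps the column list readable instead of encoding commas.
    return f"{base_url}/_cat/shards?{urlencode(params, safe=',')}"


def unassigned_shards(rows: list) -> list:
    """Keep only the rows whose state is UNASSIGNED."""
    return [row for row in rows if row.get("state") == "UNASSIGNED"]


# Hypothetical response body, shaped like _cat/shards?format=json output.
sample_body = """[
  {"index": "order_info", "shard": "0", "prirep": "p", "state": "STARTED",
   "node": "node-1", "unassigned.reason": null},
  {"index": "order_info", "shard": "0", "prirep": "r", "state": "UNASSIGNED",
   "node": null, "unassigned.reason": "INDEX_CREATED"}
]"""

rows = json.loads(sample_body)
for row in unassigned_shards(rows):
    print(row["index"], row["shard"], row["prirep"], row["unassigned.reason"])
```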
Example 3: cluster health
Method
http://IP:9200/_cluster/health
Result
{
  "cluster_name": "kubernetes-logging",
  "status": "red",
  "timed_out": false,
  "number_of_nodes": 15,
  "number_of_data_nodes": 10,
  "active_primary_shards": 1064,
  "active_shards": 2128,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 32,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 98.51851851851852
}
"unassigned_shards": the number of shards that are not assigned to any node.
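The percentage in this response can be reproduced from the shard counts, which helps when reading the numbers. The sketch below assumes the total is the sum of active, initializing, and unassigned shards. That assumption matches the figures above (2128 active out of 2160), but it is an illustration rather than a statement of how Elasticsearch computes the value internally.

```python
def active_shards_percent(health: dict) -> float:
    """Recompute the active-shard percentage from a _cluster/health response.

    Assumption: total = active + initializing + unassigned shards.
    """
    active = health["active_shards"]
    total = active + health["initializing_shards"] + health["unassigned_shards"]
    if total == 0:
        return 100.0  # no shards at all: nothing is inactive
    return 100.0 * active / total


# Shard counts copied from the response above.
health = {
    "status": "red",
    "active_shards": 2128,
    "initializing_shards": 0,
    "unassigned_shards": 32,
}

print(health["status"], active_shards_percent(health))
```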
Example 4: health of all indices
Method
http://IP:9200/_cluster/health?level=indices
Result
{
  // other data
  "bj-task-hdfs-rpc-2021.11.24" : {
    "status" : "red",              // the status is red
    "number_of_shards" : 5,        // number of primary shards
    "number_of_replicas" : 1,      // number of replicas per primary shard
    "active_primary_shards" : 4,   // active primary shards: 1 primary has failed
    "active_shards" : 7,           // active shards in total: 3 shards have failed
    "relocating_shards" : 0,
    "initializing_shards" : 0,
    "unassigned_shards" : 3        // 3 unassigned shards (1 primary + 2 replicas)
  }
}
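With many indices, scanning this output by eye is slow. The sketch below lists the non-green indices and, for each, how many shard copies are missing, computed as number_of_shards × (1 + number_of_replicas) − active_shards. It assumes the per-index entries sit under the indices key of the full level=indices response; the function name and the green sample index are hypothetical.

```python
def problem_indices(health: dict) -> dict:
    """Map each non-green index to (status, number of inactive shard copies)."""
    problems = {}
    for name, info in health.get("indices", {}).items():
        if info["status"] == "green":
            continue
        # Every primary has `number_of_replicas` copies in addition to itself.
        expected = info["number_of_shards"] * (1 + info["number_of_replicas"])
        problems[name] = (info["status"], expected - info["active_shards"])
    return problems


sample = {
    "status": "red",
    "indices": {
        # Values copied from the result above.
        "bj-task-hdfs-rpc-2021.11.24": {
            "status": "red",
            "number_of_shards": 5,
            "number_of_replicas": 1,
            "active_primary_shards": 4,
            "active_shards": 7,
            "unassigned_shards": 3,
        },
        # Hypothetical healthy index, added only for contrast.
        "example-green-index": {
            "status": "green",
            "number_of_shards": 1,
            "number_of_replicas": 1,
            "active_primary_shards": 1,
            "active_shards": 2,
            "unassigned_shards": 0,
        },
    },
}

for name, (status, missing) in problem_indices(sample).items():
    print(name, status, missing)
```

For the red index this gives 5 × 2 − 7 = 3 missing copies, which agrees with its unassigned_shards value.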
Example 5: health of a single index
Method
http://IP:9200/_cluster/health/dev-tool-deployment-service
Result
{
  "cluster_name": "kubernetes-logging",
  "status": "red",
  "timed_out": false,
  "number_of_nodes": 15,
  "number_of_data_nodes": 10,
  "active_primary_shards": 2,
  "active_shards": 4,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 6,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 98.52534562211981
}
Example 6: finding the cause of a failure
GET /_cluster/allocation/explain
Result
{
  "index" : "idx",
  "shard" : 0,
  "primary" : true,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "INDEX_CREATED",
    "at" : "2017-01-04T18:08:16.600Z",
    "last_allocation_status" : "no"
  },
  "can_allocate" : "no",
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions" : [
    {
      "node_id" : "8qt2rY-pT6KNZB3-hGfLnw",
      "node_name" : "node-0",
      "transport_address" : "127.0.0.1:9401",
      "node_attributes" : {},
      "node_decision" : "no",
      "weight_ranking" : 1,
      "deciders" : [
        {
          "decider" : "filter",
          "decision" : "NO",
          "explanation" : "node does not match index setting [index.routing.allocation.include] filters [_name:\"non_existent_node\"]"
        }
      ]
    }
  ]
}
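The useful part of this response is usually the list of deciders that answered NO on each node. A minimal sketch of extracting them follows. The function name blocking_deciders is hypothetical, and the sample is a trimmed copy of the response above.

```python
def blocking_deciders(explain: dict) -> list:
    """Collect (node_name, decider, explanation) for every NO decision."""
    blockers = []
    for node in explain.get("node_allocation_decisions", []):
        for decider in node.get("deciders", []):
            if decider["decision"] == "NO":
                blockers.append(
                    (node["node_name"], decider["decider"], decider["explanation"]))
    return blockers


# Trimmed copy of the allocation explain response shown above.
explain = {
    "index": "idx",
    "shard": 0,
    "primary": True,
    "current_state": "unassigned",
    "can_allocate": "no",
    "node_allocation_decisions": [
        {
            "node_name": "node-0",
            "node_decision": "no",
            "deciders": [
                {
                    "decider": "filter",
                    "decision": "NO",
                    "explanation": "node does not match index setting "
                                   "[index.routing.allocation.include] filters "
                                   "[_name:\"non_existent_node\"]",
                }
            ],
        }
    ],
}

for node_name, decider, explanation in blocking_deciders(explain):
    print(f"{node_name}: {decider}: {explanation}")
```

In this response the only blocker is the filter decider on node-0: the index setting index.routing.allocation.include requires a node named non_existent_node, and node-0 does not match it.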