Elasticsearch: a newly added node joins the cluster, but shards cannot be allocated to it and stay in RELOCATING state

A new node was added to an existing cluster, but shards cannot be allocated to it; they stay in RELOCATING state.

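Checks done so far:

1. iptables and SELinux are disabled.

2. The network between the master and the new node is fine: ping works, and telnet to ports 9200 and 9300 succeeds.

3. The master log shows an "added" entry for the new node, but shards are still not synced to it.

4. Shard allocation is confirmed to be enabled.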
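
The telnet check in item 2 can be scripted so that it is easy to repeat from every node toward every other node. This is only a minimal sketch: `port_open` is a hypothetical helper name, and a successful TCP connect only shows that the port accepts connections, not that the Elasticsearch transport handshake on 9300 works.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a plain TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Addresses taken from the unicast host list in the posted config.
    for host in ["10.0.32.3", "10.0.4.37", "10.0.40.159", "10.0.107.116", "10.0.126.203"]:
        for port in (9200, 9300):
            print(host, port, "open" if port_open(host, port) else "closed")
```

Running it on the new node and on the master covers both directions, which a one-way telnet test does not.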

Elasticsearch configuration file:

cluster.name: data-cluster
node.name: "data-es-05"
#node.data: false

# Indexing & Cache config
index.number_of_shards: 5
index.number_of_replicas: 1
index.cache.field.type: soft
index.cache.field.expire: 10m
index.cache.query.enable: true
indices.cache.query.size: 2%
indices.fielddata.cache.size: 35%
indices.fielddata.cache.expire: 10m
index.search.slowlog.level: INFO
#indices.recovery.max_size_per_sec: 1gb
index.merge.scheduler.max_thread_count: 2    # Only for spinning media. 

# Refresh config
index.refresh_interval: 300s

# Translog config
index.translog.flush_threshold_ops:  100000

# Paths config
path.data: /data/esData
path.plugins: /usr/share/elasticsearch/plugins

# Network And HTTP
network.bind_host: 10.0.126.203
network.publish_host: 10.0.126.203
transport.tcp.port: 9300
transport.tcp.compress: true
http.port: 9200

# Discovery
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.timeout: 10s
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["10.0.32.3:9300", "10.0.4.37:9300", "10.0.40.159:9300", "10.0.107.116:9300" , "10.0.126.203:9300"]
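
A side note on the Discovery section above: the usual guidance for zen discovery is to set discovery.zen.minimum_master_nodes to a majority of the master-eligible nodes, i.e. (N / 2) + 1, to reduce the risk of a split brain. The sketch below just does that arithmetic. It assumes that all five hosts in the unicast list are master-eligible, which the post does not state, and `quorum` is a hypothetical helper name. Whether this setting has anything to do with the problem described here is not established.

```python
def quorum(master_eligible_nodes):
    """Majority of master-eligible nodes: floor(N / 2) + 1."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


if __name__ == "__main__":
    # Five hosts are listed in discovery.zen.ping.unicast.hosts.
    print(quorum(5))
```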


Log from the new node:


[2015-10-27 09:18:20,565][DEBUG][action.admin.cluster.state] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-27 09:18:20,589][DEBUG][action.admin.cluster.health] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-27 09:18:40,583][DEBUG][action.admin.cluster.state] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-27 09:18:40,583][DEBUG][action.admin.cluster.health] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-27 09:19:00,530][DEBUG][action.admin.indices.get ] [data-es-05] no known master node, scheduling a retry
[2015-10-27 09:19:03,962][DEBUG][action.admin.cluster.state] [data-es-05] no known master node, scheduling a retry
[2015-10-27 09:19:30,531][DEBUG][action.admin.indices.get ] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-27 09:19:33,963][DEBUG][action.admin.cluster.state] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-27 09:19:34,178][DEBUG][action.admin.indices.get ] [data-es-05] no known master node, scheduling a retry
[2015-10-27 09:20:04,178][DEBUG][action.admin.indices.get ] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]

Master log:

[2015-10-27 08:55:40,689][INFO ][cluster.service          ] [data-es-01] added {[data-es-05][K1ffqB4zRCuufso-VrD9EA][data-es-05][inet[/10.0.126.203:9300]],}, reason: zen-disco-receive(join from node[[data-es-05][K1ffqB4zRCuufso-VrD9EA][data-es-05][inet[/10.0.126.203:9300]]])




Waiting online, this is urgent...

4 answers in total. Last answer: 3 years ago
{
"persistent": {},
"transient": {
"cluster": {
"routing": {
"allocation": {
"enable": "all"
}
}
}
}
}
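
The JSON above looks like the cluster settings response. A minimal sketch of reading the effective value of cluster.routing.allocation.enable from such a response, assuming the nested layout shown here (some responses use flat dotted keys instead, which this does not handle). `allocation_enable` is a hypothetical helper name; transient settings are checked first because they take precedence over persistent ones, and "all" is the documented default.

```python
import json


def allocation_enable(settings):
    """Return the effective cluster.routing.allocation.enable value."""
    for scope in ("transient", "persistent"):
        value = (
            settings.get(scope, {})
            .get("cluster", {})
            .get("routing", {})
            .get("allocation", {})
            .get("enable")
        )
        if value is not None:
            return value
    return "all"  # default when the setting is absent


if __name__ == "__main__":
    posted = json.loads(
        '{"persistent": {}, "transient": {"cluster": {"routing": '
        '{"allocation": {"enable": "all"}}}}}'
    )
    print(allocation_enable(posted))
```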

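Below is the log from the new node. According to the log it never finds the master, yet the whole cluster is in the same VLAN and both ping and telnet work... strange.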

[2015-10-27 09:18:14,460][DEBUG][action.admin.cluster.state] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-27 09:18:14,546][DEBUG][action.admin.cluster.health] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-27 09:18:20,565][DEBUG][action.admin.cluster.state] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-27 09:18:20,589][DEBUG][action.admin.cluster.health] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-27 09:18:40,583][DEBUG][action.admin.cluster.state] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-27 09:18:40,583][DEBUG][action.admin.cluster.health] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-27 09:19:00,530][DEBUG][action.admin.indices.get ] [data-es-05] no known master node, scheduling a retry
[2015-10-27 09:19:03,962][DEBUG][action.admin.cluster.state] [data-es-05] no known master node, scheduling a retry
[2015-10-27 09:19:30,531][DEBUG][action.admin.indices.get ] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-27 09:19:33,963][DEBUG][action.admin.cluster.state] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-27 09:19:34,178][DEBUG][action.admin.indices.get ] [data-es-05] no known master node, scheduling a retry
[2015-10-27 09:20:04,178][DEBUG][action.admin.indices.get ] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-27 14:49:19,577][DEBUG][action.admin.cluster.health] [data-es-05] no known master node, scheduling a retry
[2015-10-27 14:49:49,578][DEBUG][action.admin.cluster.health] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-27 14:49:49,775][DEBUG][action.admin.indices.get ] [data-es-05] no known master node, scheduling a retry
[2015-10-27 14:50:19,775][DEBUG][action.admin.indices.get ] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-27 14:50:47,535][DEBUG][action.admin.indices.get ] [data-es-05] no known master node, scheduling a retry
[2015-10-27 14:51:17,536][DEBUG][action.admin.indices.get ] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-27 15:00:21,485][DEBUG][action.admin.cluster.state] [data-es-05] no known master node, scheduling a retry
[2015-10-27 15:00:51,486][DEBUG][action.admin.cluster.state] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-27 15:00:51,690][DEBUG][action.admin.indices.get ] [data-es-05] no known master node, scheduling a retry
[2015-10-27 15:01:01,667][DEBUG][action.admin.cluster.state] [data-es-05] no known master node, scheduling a retry
[2015-10-27 15:01:21,690][DEBUG][action.admin.indices.get ] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-27 15:01:31,668][DEBUG][action.admin.cluster.state] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-27 15:01:31,797][DEBUG][action.admin.indices.get ] [data-es-05] no known master node, scheduling a retry
[2015-10-27 15:02:01,798][DEBUG][action.admin.indices.get ] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
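
Logs like the one above are easier to reason about when the two recurring messages are counted instead of read line by line. The following is a rough sketch that assumes the bracketed line layout seen in this thread; `LOG_LINE` and `summarize` are hypothetical names, and lines that do not match the layout are skipped.

```python
import re
from collections import Counter

# [timestamp][LEVEL][logger] [node] message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]"
    r"\[(?P<level>\w+)\s*\]"
    r"\[(?P<logger>[^\]]+?)\s*\]\s+"
    r"\[(?P<node>[^\]]+)\]\s+"
    r"(?P<msg>.*)$"
)


def summarize(lines):
    """Count 'no known master node' and observer-timeout messages per category."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # not in the expected layout
        msg = match.group("msg")
        if msg.startswith("no known master node"):
            counts["no_master"] += 1
        elif msg.startswith("observer: timeout notification"):
            counts["timeout"] += 1
        else:
            counts["other"] += 1
    return counts
```

Feeding it the log file of the new node (for example `summarize(open(path))`, with `path` pointing at that file) shows how often the node reports having no master.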



Shard allocation is confirmed to be enabled.

Has really nobody run into this problem?

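Elasticsearch takes care of load balancing, relocating shards, merging the results from the individual nodes, and so on. Isn't this node a non-data node? With #node.data: false it would hold no shards. Remove the comment marker from that line, then restart and see.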
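--- 4 comments in total ---
寻梦2012 @扁豆焖面先生: I suggest downloading the config files from the other nodes and comparing them with this node's config file to see what differs. (3 years ago)
扁豆焖面先生 replying to @寻梦2012: Yes, the node.data: true case was tested as well. (3 years ago)
寻梦2012 @扁豆焖面先生: node.data: false means the node is not a data node. It is a data node only when the value is true. (3 years ago)
扁豆焖面先生: #node.data: false has been checked many times. That line is there precisely to mark that the data role is enabled. (3 years ago)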
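
On the node.data exchange above: a line that starts with # is a comment, so the setting falls back to its default, and node.data defaults to true. A small sketch that makes this explicit for a flat key: value config like the one posted. `effective_setting` is a hypothetical helper, and the comment stripping is simplified (a # inside a quoted value would be cut off too).

```python
def effective_setting(config_text, key, default):
    """Return the value of key in a flat 'key: value' config, ignoring comments."""
    value = default
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        name, _, rest = line.partition(":")
        if name.strip() == key:
            value = rest.strip().strip('"')
    return value


if __name__ == "__main__":
    posted = 'node.name: "data-es-05"\n#node.data: false\n'
    # The commented-out line is ignored, so the default applies.
    print(effective_setting(posted, "node.data", "true"))
```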
The new node keeps logging the errors below, which is a real headache; the other machines on the same LAN do not have this problem...
[2015-10-28 13:10:15,297][DEBUG][action.admin.cluster.state] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-28 13:10:18,996][DEBUG][action.admin.cluster.state] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-28 13:10:19,033][DEBUG][action.admin.cluster.state] [data-es-05] no known master node, scheduling a retry
[2015-10-28 13:10:27,322][DEBUG][action.admin.cluster.state] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-28 13:10:27,412][DEBUG][action.admin.cluster.state] [data-es-05] no known master node, scheduling a retry
[2015-10-28 13:10:33,288][DEBUG][action.admin.cluster.state] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-28 13:10:33,393][DEBUG][action.admin.cluster.state] [data-es-05] no known master node, scheduling a retry
[2015-10-28 13:10:39,296][DEBUG][action.admin.cluster.state] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-28 13:10:39,398][DEBUG][action.admin.cluster.state] [data-es-05] no known master node, scheduling a retry
[2015-10-28 13:10:45,283][DEBUG][action.admin.cluster.state] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-28 13:10:45,387][DEBUG][action.admin.cluster.state] [data-es-05] no known master node, scheduling a retry
[2015-10-28 13:10:49,034][DEBUG][action.admin.cluster.state] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-28 13:10:49,070][DEBUG][action.admin.cluster.state] [data-es-05] no known master node, scheduling a retry
[2015-10-28 13:10:57,412][DEBUG][action.admin.cluster.state] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-28 13:10:57,423][DEBUG][action.admin.cluster.state] [data-es-05] no known master node, scheduling a retry
[2015-10-28 13:11:03,394][DEBUG][action.admin.cluster.state] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-28 13:11:03,417][DEBUG][action.admin.cluster.state] [data-es-05] no known master node, scheduling a retry
[2015-10-28 13:11:09,281][DEBUG][action.admin.cluster.state] [data-es-05] no known master node, scheduling a retry
[2015-10-28 13:11:09,399][DEBUG][action.admin.cluster.state] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-28 13:11:15,282][DEBUG][action.admin.cluster.state] [data-es-05] no known master node, scheduling a retry
[2015-10-28 13:11:15,388][DEBUG][action.admin.cluster.state] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-28 13:11:19,070][DEBUG][action.admin.cluster.state] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-28 13:11:19,715][DEBUG][action.admin.cluster.state] [data-es-05] no known master node, scheduling a retry
[2015-10-28 13:11:27,424][DEBUG][action.admin.cluster.state] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-28 13:11:27,426][DEBUG][action.admin.cluster.state] [data-es-05] no known master node, scheduling a retry
[2015-10-28 13:11:33,418][DEBUG][action.admin.cluster.state] [data-es-05] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-28 13:11:33,452][DEBUG][action.admin.cluster.state] [data-es-05] no known master node, scheduling a retry



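--- 4 comments in total ---
寻梦2012 replying to @扁豆焖面先生: This is a data node, so its config file is certainly different from the master's. It should match the config file of the slave nodes. (3 years ago)
扁豆焖面先生 replying to @寻梦2012: The new node's config file was copied from the master; only the bind address and the multicast address were changed. I just checked again and there is no problem. (3 years ago)
寻梦2012 replying to @寻梦2012: Download the config files from the other nodes and compare them with this one. (3 years ago)
寻梦2012: "no known master node, scheduling a retry" means the node cannot find the master node. (3 years ago)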
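
The config comparison suggested in the comments can be done key by key rather than by eye. A minimal sketch, assuming flat key: value files like the one posted; `parse_flat_config` and `diff_configs` are hypothetical helpers, and the sample values for the second node are made up for illustration.

```python
def parse_flat_config(text):
    """Parse a flat 'key: value' config into a dict, ignoring comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        key, _, value = line.partition(":")
        settings[key.strip()] = value.strip()
    return settings


def diff_configs(a_text, b_text):
    """Return {key: (value_in_a, value_in_b)} for every key that differs."""
    a, b = parse_flat_config(a_text), parse_flat_config(b_text)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


if __name__ == "__main__":
    new_node = 'cluster.name: data-cluster\nnode.name: "data-es-05"\n'
    other = 'cluster.name: data-cluster\nnode.name: "data-es-01"\n'  # illustrative
    for key, (left, right) in diff_configs(new_node, other).items():
        print(key, left, right)
```

Only node.name and the network addresses would be expected to differ between nodes of the same cluster; anything else in the output is worth a closer look.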