Hadoop 2.4.0 fully distributed cluster: NodeManager fails to start

hyacinth_z posted on 2014/10/23 01:32

1. Environment: CentOS 7 + Hadoop 2.4.0

A test environment of three machines:

IP              Hostname    Role
192.168.1.10    Hadoop1     Master
192.168.1.11    Hadoop2     Slave1
192.168.1.12    Hadoop3     Slave2

2. Symptoms

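On Hadoop1, the web page on port 50070 shows 0 for both live nodes and dead nodes. Right after running start-yarn.sh on Hadoop1, jps on Hadoop2 and Hadoop3 shows that the NodeManager has started, but about one minute later jps no longer lists it.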

The logs contain the following warnings and error:


WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: NodeManager configured with 8 G physical memory allocated to containers, which is more than 80% of the total physical memory available (1.8 G). Thrashing might happen.
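The second warning can be checked with simple arithmetic. The 8 G figure is consistent with the stock default of 8192 MB for `yarn.nodemanager.resource.memory-mb`, while the machine reports only 1.8 G. The sketch below reproduces the 80% comparison named in the log line. The function names and the 512 MB reserve are hypothetical choices for illustration, not values from the post or from Hadoop. This warning on its own does not appear to be what stops the NodeManager; the ERROR that follows it in the log is the one reported at startup.

```python
def exceeds_memory_threshold(configured_mb, physical_mb, threshold=0.8):
    """Return True when the memory offered to containers is above
    `threshold` times the physical memory (the condition in the WARN line)."""
    return configured_mb > threshold * physical_mb


def suggested_container_memory_mb(physical_mb, reserved_mb=512):
    """Illustrative sizing only: keep `reserved_mb` back for the OS and daemons.
    The 512 MB reserve is an assumption, not a Hadoop recommendation."""
    return max(physical_mb - reserved_mb, 0)


physical_mb = int(1.8 * 1024)   # "1.8 G" from the log, about 1843 MB
configured_mb = 8 * 1024        # "8 G" from the log

print(exceeds_memory_threshold(configured_mb, physical_mb))
print(suggested_container_memory_mb(physical_mb))
```

A value derived this way would go into `yarn.nodemanager.resource.memory-mb` in yarn-site.xml on each slave.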



ERROR org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Unexpected error starting NodeStatusUpdater
java.net.NoRouteToHostException: No Route to Host from  Hadoop2.whutDM/192.168.1.11 to Hadoop1:8031 failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see:  http://wiki.apache.org/hadoop/NoRouteToHost
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:756)
        at org.apache.hadoop.ipc.Client.call(Client.java:1414)
        at org.apache.hadoop.ipc.Client.call(Client.java:1363)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
        at com.sun.proxy.$Proxy28.registerNodeManager(Unknown Source)
        at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:68)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
        at com.sun.proxy.$Proxy29.registerNodeManager(Unknown Source)
        at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:257)
        at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:190)
        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
        at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:197)
        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:358)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:404)
Caused by: java.net.NoRouteToHostException: No route to host
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)



3. What I have tried so far


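    Network and firewall: the three machines can SSH into each other without a password, iptables and SELinux are both disabled, and IPv6 is disabled. After editing the slaves file, Hadoop1 runs successfully in pseudo-distributed mode. (Hadoop2 and Hadoop3 were copied from Hadoop1.)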
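Passwordless SSH only proves that port 22 is reachable, so it may help to test port 8031 directly from a slave. The sketch below makes one TCP connection attempt and classifies the result. The operating system error behind Java's NoRouteToHostException is "No route to host" (EHOSTUNREACH), which on Linux can also be produced by a firewall rule that rejects the packet, even when ping and SSH work. One possibility worth ruling out (an assumption, the post does not confirm it) is that firewalld, the default firewall service on CentOS 7, is still running on Hadoop1 even though the iptables service is off. The function name `probe_tcp` is hypothetical.

```python
import errno
import socket


def probe_tcp(host, port, timeout=3.0):
    """Try one TCP connection and classify the outcome as a short string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        return "timeout"
    except socket.gaierror:
        return "name-resolution-failed"
    except ConnectionRefusedError:
        return "refused"
    except OSError as exc:
        if exc.errno == errno.EHOSTUNREACH:
            # Same OS-level condition as the NoRouteToHostException in the log.
            return "no-route-to-host"
        return "error: {}".format(exc)


# Example, to be run on Hadoop2 or Hadoop3:
#   print(probe_tcp("Hadoop1", 8031))
```

If the probe reports "no-route-to-host" for 8031 while port 22 is open, a packet filter on Hadoop1 is a more likely cause than routing, since both hosts are on the same subnet.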

    Hadoop: configuration related to port 8031:


    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>Hadoop1:8031</value>
    </property>
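Because Hadoop2 and Hadoop3 were copied from Hadoop1, it can be worth confirming that each node reads the same address from yarn-site.xml and that the hostname in it resolves to 192.168.1.10 on the slaves. A minimal sketch follows, assuming the property sits inside the usual `<configuration>` root element of a Hadoop site file. `SAMPLE`, `get_property`, and `split_host_port` are hypothetical names for illustration.

```python
import xml.etree.ElementTree as ET

# Sample built from the snippet in the post; the <configuration> wrapper is assumed.
SAMPLE = """<configuration>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>Hadoop1:8031</value>
    </property>
</configuration>"""


def get_property(xml_text, name):
    """Return the value of the named Hadoop property, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def split_host_port(address):
    """Split 'host:port' into (host, port) with the port as an int."""
    host, _, port = address.rpartition(":")
    return host, int(port)


address = get_property(SAMPLE, "yarn.resourcemanager.resource-tracker.address")
print(split_host_port(address))
```

On a slave, passing the returned host to `socket.gethostbyname` shows which IP the name maps to there; a result other than 192.168.1.10 would point at /etc/hosts rather than at YARN.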



    Checking the state of port 8031:
[hadoop@Hadoop1 hadoop]$ netstat -ntual |grep 8031
tcp        0      0 192.168.1.10:8031       0.0.0.0:*               LISTEN     
tcp        0      0 192.168.1.10:8031       192.168.1.10:46736      ESTABLISHED
tcp        0      0 192.168.1.10:46736      192.168.1.10:8031       ESTABLISHED
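The netstat output above carries two useful facts: which address the ResourceManager is listening on, and which peers have actually connected. The sketch below extracts both from the pasted text. The parsing assumes the six-column `netstat -nt` layout shown in the post, and the function names are hypothetical.

```python
NETSTAT_OUTPUT = """\
tcp        0      0 192.168.1.10:8031       0.0.0.0:*               LISTEN
tcp        0      0 192.168.1.10:8031       192.168.1.10:46736      ESTABLISHED
tcp        0      0 192.168.1.10:46736      192.168.1.10:8031       ESTABLISHED
"""


def parse_netstat(text):
    """Turn tcp lines into dicts with local address, remote address, and state."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            rows.append({"local": fields[3], "remote": fields[4], "state": fields[5]})
    return rows


def listen_addresses(rows, port):
    """Local addresses that are in LISTEN state on the given port."""
    suffix = ":{}".format(port)
    return [r["local"] for r in rows
            if r["state"] == "LISTEN" and r["local"].endswith(suffix)]


def remote_peers(rows, port):
    """Remote hosts with an ESTABLISHED connection to the given local port."""
    suffix = ":{}".format(port)
    return sorted({r["remote"].rsplit(":", 1)[0] for r in rows
                   if r["state"] == "ESTABLISHED" and r["local"].endswith(suffix)})


rows = parse_netstat(NETSTAT_OUTPUT)
print(listen_addresses(rows, 8031))
print(remote_peers(rows, 8031))
```

Read this way, the listener is bound to the LAN address 192.168.1.10 rather than to 127.0.0.1, and the only connected peer is Hadoop1 itself, which is consistent with the slaves never reaching the port.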


I am new to Hadoop. Thanks to everyone who has read this far; any pointers would be greatly appreciated.





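梦里花落知多少
Hi, I have run into the same problem and cannot solve it. Did you manage to fix yours?
hyacinth_z
I struggled with it for a long time without solving it. Since it was only a personal experiment, I later switched to a single-machine pseudo-distributed setup.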