elastic-job cloud version: job execution fails

Posted by kaoyan2010 on 2017/11/06 11:26

After deploying the cloud version of elastic-job, I published the app and the jobs by following the README.txt in elastic-job-example-cloud, which is part of the elastic-job-example module.

Following the documentation, I published three jobs: test_job_simple, test_job_dataflow, and test_job_script. In the operations console, however, only test_job_script runs successfully; the other two jobs fail.

The screenshot is as follows:

Logging in to http://xxx.xxx.xxx.xxx:5050 to view the Mesos web UI shows the following:

Opening stderr shows the following:

I1106 02:22:11.525714  4852 logging.cpp:194] INFO level logging started!
I1106 02:22:11.530445  4852 fetcher.cpp:533] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/9bb739ac-820e-4438-90d1-a772bda02e6f-S0\/root","items":[{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c1-elastic-jo_1.5.tar.gz","uri":{"cache":true,"extract":true,"value":"http:\/\/192.168.34.131:8080\/examples\/elastic-job-example-cloud-2.1.5.tar.gz"}}],"sandbox_directory":"\/var\/mesos\/work\/slaves\/9bb739ac-820e-4438-90d1-a772bda02e6f-S0\/frameworks\/ec8c6489-432f-4dd2-869b-a423b6224ad1-0000\/executors\/exampleApp@-@9bb739ac-820e-4438-90d1-a772bda02e6f-S0\/runs\/97e1844f-78fe-466a-a3a0-fa63c58384d2","user":"root"}
I1106 02:22:11.533766  4852 fetcher.cpp:444] Fetching URI 'http://192.168.34.131:8080/examples/elastic-job-example-cloud-2.1.5.tar.gz'
I1106 02:22:11.533799  4852 fetcher.cpp:341] Fetching from cache
I1106 02:22:11.867213  4852 fetcher.cpp:123] Extracted '/tmp/mesos/fetch/slaves/9bb739ac-820e-4438-90d1-a772bda02e6f-S0/root/c1-elastic-jo_1.5.tar.gz' into '/var/mesos/work/slaves/9bb739ac-820e-4438-90d1-a772bda02e6f-S0/frameworks/ec8c6489-432f-4dd2-869b-a423b6224ad1-0000/executors/exampleApp@-@9bb739ac-820e-4438-90d1-a772bda02e6f-S0/runs/97e1844f-78fe-466a-a3a0-fa63c58384d2'
I1106 02:22:11.867307  4852 fetcher.cpp:582] Fetched 'http://192.168.34.131:8080/examples/elastic-job-example-cloud-2.1.5.tar.gz' to '/var/mesos/work/slaves/9bb739ac-820e-4438-90d1-a772bda02e6f-S0/frameworks/ec8c6489-432f-4dd2-869b-a423b6224ad1-0000/executors/exampleApp@-@9bb739ac-820e-4438-90d1-a772bda02e6f-S0/runs/97e1844f-78fe-466a-a3a0-fa63c58384d2'
bin/start.sh: line 2: java: command not found
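
The last stderr line suggests that the shell running bin/start.sh could not find a `java` executable on its PATH. A minimal sketch of a check for this, assuming it is run on the agent machine as the same user that appears in the fetcher log (root); the function name `check_java` is only illustrative:

```shell
# Report whether `java` can be resolved on the current PATH.
# check_java is a hypothetical helper name, not part of elastic-job or Mesos.
check_java() {
  if command -v java >/dev/null 2>&1; then
    echo "java found at: $(command -v java)"
  else
    echo "java not found on PATH: $PATH"
    return 1
  fi
}
```

Calling `check_java` prints either the resolved location or the PATH that was searched. If it reports that java is missing, installing a JDK or making `java` visible on the PATH of the Mesos agent process would be the first thing to try. Note that the agent may run with a different PATH than an interactive login shell.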

 

Following that hint, I then went into the directory

/var/mesos/work/slaves/9bb739ac-820e-4438-90d1-a772bda02e6f-S0/frameworks/ec8c6489-432f-4dd2-869b-a423b6224ad1-0000/executors/exampleApp@-@9bb739ac-820e-4438-90d1-a772bda02e6f-S0/runs/97e1844f-78fe-466a-a3a0-fa63c58384d2

and ran bin/start.sh manually. The result was:

[root@bogon 97e1844f-78fe-466a-a3a0-fa63c58384d2]# pwd
/var/mesos/work/slaves/9bb739ac-820e-4438-90d1-a772bda02e6f-S0/frameworks/ec8c6489-432f-4dd2-869b-a423b6224ad1-0000/executors/exampleApp@-@9bb739ac-820e-4438-90d1-a772bda02e6f-S0/runs/97e1844f-78fe-466a-a3a0-fa63c58384d2
[root@bogon 97e1844f-78fe-466a-a3a0-fa63c58384d2]# ll
total 8
drwxrwxrwx. 2 root root   21 Nov  6 02:37 bin
drwxrwxrwx. 2 root root 4096 Nov  6 02:22 lib
drwxrwxrwx. 2 root root   20 Oct 27 09:33 script
-rw-r--r--. 1 root root 1690 Nov  6 02:22 stderr
-rw-r--r--. 1 root root    0 Nov  6 02:22 stdout
[root@bogon 97e1844f-78fe-466a-a3a0-fa63c58384d2]# bin/start.sh    
I1106 03:23:46.122083 11789 logging.cpp:194] INFO level logging started!
Expecting 'MESOS_SLAVE_PID' to be set in the environment
[root@bogon 97e1844f-78fe-466a-a3a0-fa63c58384d2]# 

I have thought about this for a long time and still cannot figure out where MESOS_SLAVE_PID is supposed to be set.
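
As far as I understand Mesos, variables such as MESOS_SLAVE_PID are injected by the agent into the environment of the executor it launches, so they would not be present when bin/start.sh is started by hand from a login shell. A small sketch that reports which of these variables are exported in the current environment; the list of names is an assumption based on what the Mesos executor driver usually expects, and `report_mesos_env` is a hypothetical helper name:

```shell
# Print, for each executor-related variable, whether it is exported
# in the current environment. The variable list is an assumption.
report_mesos_env() {
  for name in MESOS_SLAVE_PID MESOS_SLAVE_ID MESOS_FRAMEWORK_ID \
              MESOS_EXECUTOR_ID MESOS_DIRECTORY; do
    if printenv "$name" >/dev/null; then
      echo "$name is set"
    else
      echo "$name is NOT set"
    fi
  done
}
```

Running `report_mesos_env` in the same shell as the manual bin/start.sh run would likely show every variable as not set. That would make the 'MESOS_SLAVE_PID' message a side effect of the manual run rather than the cause of the original failure.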

 

Could anyone tell me how to get my test_job_simple job to run successfully?
