PipeMapRed.waitOutputThreads(): subprocess failed with code 1 at org.apache.hadoop.streaming.PipeMapRed

Posted by 图数据库猫 on 2016/01/19 14:43
java.lang.Exception: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
16/01/19 14:37:29 INFO mapreduce.Job: Job job_local1011267275_0001 failed with state FAILED due to: NA
16/01/19 14:37:29 INFO mapreduce.Job: Counters: 0
16/01/19 14:37:29 ERROR streaming.StreamJob: Job not Successful!
Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce, :
hadoop streaming failed with error code 1
图数据库猫:
The ddply() function lives in the plyr package. Install plyr in advance, and once the package is loaded with library(plyr) the job will run.
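The root cause is visible in the mapper log: `Error in map(keys(kv), values(kv)) : could not find function "ddply"`. The rmr2 map function runs inside a fresh `Rscript --vanilla` subprocess on each task (see the `PipeMapRed exec` line in the log), so packages attached in the submitting R session are not available there; plyr has to be loaded inside the map function itself. A minimal sketch of the fix, assuming plyr and rmr2 are installed on every node, with the original `ddply` call simplified to use `nrow` for the per-(k, v) count:

```r
library(rmr2)

step2.mr <- mapreduce(
    train.mr,
    map = function(k, v) {
        # The map function executes in a fresh Rscript subprocess on
        # each task, so plyr must be attached here, not only in the
        # interactive session that submits the job.
        library(plyr)
        d  <- data.frame(k, v)
        d2 <- ddply(d, .(k, v), nrow)  # one row per (k, v) pair with its count
        keyval(d2$k, d2)
    }
)
```

The same rule applies to any package the map or reduce function depends on: attach it inside the function body so the streaming subprocess can find it.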
图数据库猫:
Adding the code and logs for reference:
> step2.mr<-mapreduce(
+     train.mr,
+     map = function(k, v) {
+         d<-data.frame(k,v)
+         d2<-ddply(d,.(k,v),count)
+         
+         key<-d2$k
+         val<-d2
+         keyval(key,val)
+     }
+ )
16/01/19 14:37:22 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
16/01/19 14:37:22 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
16/01/19 14:37:22 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
16/01/19 14:37:22 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
16/01/19 14:37:23 INFO mapred.FileInputFormat: Total input paths to process : 1
16/01/19 14:37:23 INFO mapreduce.JobSubmitter: number of splits:1
16/01/19 14:37:24 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1011267275_0001
16/01/19 14:37:24 WARN conf.Configuration: file:/bigdata/hadoop/tmp/mapred/staging/hadoop1011267275/.staging/job_local1011267275_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
16/01/19 14:37:24 WARN conf.Configuration: file:/bigdata/hadoop/tmp/mapred/staging/hadoop1011267275/.staging/job_local1011267275_0001/job.xml:an attempt to override final parameter: dfs.namenode.name.dir;  Ignoring.
16/01/19 14:37:24 WARN conf.Configuration: file:/bigdata/hadoop/tmp/mapred/staging/hadoop1011267275/.staging/job_local1011267275_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
16/01/19 14:37:24 INFO mapred.LocalDistributedCacheManager: Creating symlink: /bigdata/hadoop/tmp/mapred/local/1453185444286/rmr-local-env2c502eec5e73 <- /home/hadoop/rmr-local-env2c502eec5e73
16/01/19 14:37:24 INFO mapred.LocalDistributedCacheManager: Localized file:/tmp/Rtmpva8uMS/rmr-local-env2c502eec5e73 as file:/bigdata/hadoop/tmp/mapred/local/1453185444286/rmr-local-env2c502eec5e73
16/01/19 14:37:24 INFO mapred.LocalDistributedCacheManager: Creating symlink: /bigdata/hadoop/tmp/mapred/local/1453185444287/rmr-global-env2c5013e025a6 <- /home/hadoop/rmr-global-env2c5013e025a6
16/01/19 14:37:24 INFO mapred.LocalDistributedCacheManager: Localized file:/tmp/Rtmpva8uMS/rmr-global-env2c5013e025a6 as file:/bigdata/hadoop/tmp/mapred/local/1453185444287/rmr-global-env2c5013e025a6
16/01/19 14:37:24 INFO mapred.LocalDistributedCacheManager: Creating symlink: /bigdata/hadoop/tmp/mapred/local/1453185444288/rmr-streaming-map2c50204d933b <- /home/hadoop/rmr-streaming-map2c50204d933b
16/01/19 14:37:24 INFO mapred.LocalDistributedCacheManager: Localized file:/tmp/Rtmpva8uMS/rmr-streaming-map2c50204d933b as file:/bigdata/hadoop/tmp/mapred/local/1453185444288/rmr-streaming-map2c50204d933b
16/01/19 14:37:24 WARN conf.Configuration: file:/bigdata/hadoop/tmp/mapred/local/localRunner/hadoop/job_local1011267275_0001/job_local1011267275_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
16/01/19 14:37:24 WARN conf.Configuration: file:/bigdata/hadoop/tmp/mapred/local/localRunner/hadoop/job_local1011267275_0001/job_local1011267275_0001.xml:an attempt to override final parameter: dfs.namenode.name.dir;  Ignoring.
16/01/19 14:37:24 WARN conf.Configuration: file:/bigdata/hadoop/tmp/mapred/local/localRunner/hadoop/job_local1011267275_0001/job_local1011267275_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
16/01/19 14:37:24 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
16/01/19 14:37:24 INFO mapred.LocalJobRunner: OutputCommitter set in config null
16/01/19 14:37:24 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapred.FileOutputCommitter
16/01/19 14:37:24 INFO mapreduce.Job: Running job: job_local1011267275_0001
16/01/19 14:37:25 INFO mapred.LocalJobRunner: Waiting for map tasks
16/01/19 14:37:25 INFO mapred.LocalJobRunner: Starting task: attempt_local1011267275_0001_m_000000_0
16/01/19 14:37:25 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
16/01/19 14:37:25 INFO mapred.MapTask: Processing split: hdfs://master:9000/tmp/file2c506162708b/part-00000:0+2790
16/01/19 14:37:25 INFO mapred.MapTask: numReduceTasks: 0
16/01/19 14:37:25 INFO streaming.PipeMapRed: PipeMapRed exec [/usr/local/bin/Rscript, --vanilla, ./rmr-streaming-map2c50204d933b]
16/01/19 14:37:25 INFO streaming.PipeMapRed: R/W/S=1/0/0 in:NA [rec/s] out:NA [rec/s]
16/01/19 14:37:25 INFO streaming.PipeMapRed: R/W/S=10/0/0 in:NA [rec/s] out:NA [rec/s]
Loading objects:
  asset
  cl
  Cluster1
  coeffs
  con
  conMatrix
  cor_x
  Fac
  FAC
  factor1
  factor2
  FAO
  F_score
  GCtorture
  graph
  iris
  iris_ctree
16/01/19 14:37:25 INFO mapreduce.Job: Job job_local1011267275_0001 running in uber mode : false
16/01/19 14:37:25 INFO mapreduce.Job:  map 0% reduce 0%
  json_str
  localDF
  model
  multiple_regerss1
  myll
  name1
  name2
  name3
  pamx
  path
  pch1
  pch2
  predRsult
  .Random.seed
  rank
  resid
  result
  r_log
  s
  stock
  train
  train2.mr
Warning: S3 methods ‘gorder.default’, ‘gorder.factor’, ‘gorder.data.frame’, ‘gorder.matrix’, ‘gorder.raw’ were declared in NAMESPACE but not found
Please review your hadoop settings. See help(hadoop.settings)
  train.hdfs
  train.mr
  x
  X_data
  xtest
  xtrain
  y
  ytest
  ytrain
Loading objects:
  backend.parameters
  combine
  combine.file
  combine.line
  debug
  default.input.format
  default.output.format
  in.folder
  in.memory.combine
  input.format
  libs
  map
  map.file
  map.line
  out.folder
  output.format
  pkg.opts
  postamble
  preamble
  profile.nodes
  reduce
  reduce.file
  reduce.line
  rmr.global.env
  rmr.local.env
  save.env
  tempfile
  vectorized.reduce
  verbose
  work.dir
Loading required package: methods
Loading required package: DBI
Loading required package: RMySQL
Loading required package: rmr2
Loading objects:
  backend.parameters
  combine
  combine.file
  combine.line
  debug
  default.input.format
  default.output.format
  in.folder
  in.memory.combine
  input.format
  libs
  map
  map.file
  map.line
  out.folder
  output.format
  pkg.opts
  postamble
  preamble
  profile.nodes
  reduce
  reduce.file
  reduce.line
  rmr.global.env
  rmr.local.env
  save.env
  tempfile
  vectorized.reduce
  verbose
  work.dir
Error in map(keys(kv), values(kv)) : could not find function "ddply"
Calls: <Anonymous> -> <Anonymous> -> as.keyval -> is.keyval -> map
No traceback available 
Error during wrapup: 
Execution halted
16/01/19 14:37:28 INFO streaming.PipeMapRed: MRErrorThread done
16/01/19 14:37:28 INFO streaming.PipeMapRed: PipeMapRed failed!
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
	at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
	at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
	at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
	at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
16/01/19 14:37:28 INFO mapred.LocalJobRunner: map task executor complete.
16/01/19 14:37:28 WARN mapred.LocalJobRunner: job_local1011267275_0001
java.lang.Exception: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
	at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
	at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
	at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
	at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
16/01/19 14:37:29 INFO mapreduce.Job: Job job_local1011267275_0001 failed with state FAILED due to: NA
16/01/19 14:37:29 INFO mapreduce.Job: Counters: 0
16/01/19 14:37:29 ERROR streaming.StreamJob: Job not Successful!
Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce,  : 
  hadoop streaming failed with error code 1


