Error when writing a Hive UDF in Python

如梦似幻梦幻泡影 posted on 2017/03/02 20:23

SQL statement:

select transform(*) 
    using 'python interest.py' as (device_id,device_type,interestOne,interestTwo) 
    from dm_interest_tag 
    where year='2017' and  
    month='02' and  
    day='23' and 
    business='all' limit 100;

Python:

#!/usr/bin/python
# -*- coding: utf-8 -*-
# Python 2 script: reads tab-separated rows from Hive on stdin and
# writes one tab-separated output row per interest tag to stdout.

import sys
import json

def main():
    for line in sys.stdin:
        line = line.strip()
        fields = line.split('\t')
        # With transform(*), the fourth column (index 3) is the tags JSON string.
        tags = json.loads(fields[3])
        for tag in tags:
            interests = tag["tag"]
            for interest in interests:
                interestOne = interest["1"]
                interestTwo = ""
                # The second-level category under "2" is read only for "Games".
                if interestOne == "Games":
                    interestTwo = interest["2"]
                try:
                    print '\t'.join([fields[0], fields[1], interestOne.encode('utf-8'), interestTwo.encode('utf-8')])
                except ValueError:
                    continue

if __name__ == "__main__":
    main()
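
For comparison, here is a sketch of a more defensive variant of the parsing logic above. It is not the original script: the names `rows_for_line` and `process` are made up for illustration, and it assumes Python 3 with UTF-8 standard streams. The idea is that a row with too few columns, a NULL tags column (Hive streams NULL to the script as the literal `\N`), or a missing "tag", "1" or "2" key gets skipped instead of raising an uncaught exception and ending the process.

```python
import io
import json


def rows_for_line(line):
    """Return the output rows (lists of text fields) for one input line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 4:
        return []  # too few columns; skip instead of raising IndexError
    try:
        tags = json.loads(fields[3])
    except ValueError:
        return []  # \N (NULL) or malformed JSON in the tags column
    if not isinstance(tags, list):
        return []
    rows = []
    for tag in tags:
        if not isinstance(tag, dict):
            continue
        for interest in tag.get("tag") or []:
            if not isinstance(interest, dict):
                continue
            interest_one = interest.get("1")
            if not isinstance(interest_one, str):
                continue  # no first-level category; nothing to emit
            interest_two = ""
            if interest_one == "Games":
                interest_two = str(interest.get("2") or "")
            rows.append([fields[0], fields[1], interest_one, interest_two])
    return rows


def process(in_stream, out_stream):
    """Read tab-separated lines from in_stream, write result rows to out_stream."""
    for line in in_stream:
        for row in rows_for_line(line):
            out_stream.write("\t".join(row) + "\n")


# Small demonstration on in-memory streams; in the Hive script itself one
# would call process(sys.stdin, sys.stdout) instead.
tags_json = json.dumps([
    {"package_name": "com.shazam.android", "tag": [{"1": "Music", "id": "21"}]},
    {"package_name": "com.miniclip.plagueinc",
     "tag": [{"1": "Games", "2": "Simulation", "id": "47"}]},
])
good_line = "dev-1\tgaid\tandroid\t" + tags_json + "\t2017\t02\t23\tall\n"
null_line = "dev-2\tgaid\tandroid\t\\N\t2017\t02\t23\tall\n"
out = io.StringIO()
process(io.StringIO(good_line + null_line), out)
print(out.getvalue(), end="")
```

Keeping the per-line logic in a function that returns rows, separate from the stdin/stdout loop, makes it possible to check problem rows locally before running the query.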

Error message:

Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"device_id":"fbb61bf8-af24-4226-9a22-7912cefd3fad","device_type":"gaid","platform":"android","tags":"[{\"package_name\":\"com.shazam.android\",\"tag\":[{\"1\":\"Music\",\"id\":\"21\"}]},{\"package_name\":\"com.adeco.christmas.sweet.story.pro\",\"tag\":[{\"1\":\"Games\",\"2\":\"Casual\",\"id\":\"41\"}]},{\"package_name\":\"com.forshared\",\"tag\":[{\"1\":\"Entertainment\",\"id\":\"11\"}]},{\"package_name\":\"com.zentertain.photoeditor\",\"tag\":[{\"1\":\"Photography\",\"id\":\"25\"}]},{\"package_name\":\"cn.wps.moffice_eng\",\"tag\":[{\"1\":\"Business\",\"id\":\"6\"}]},{\"package_name\":\"com.miniclip.plagueinc\",\"tag\":[{\"1\":\"Games\",\"2\":\"Simulation\",\"id\":\"47\"}]}]","year":"2017","month":"02","day":"23","business":"all"}
02-03-2017 20:19:34 CST interest ERROR - 	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
02-03-2017 20:19:34 CST interest ERROR - 	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
02-03-2017 20:19:34 CST interest ERROR - 	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455)
02-03-2017 20:19:34 CST interest ERROR - 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
02-03-2017 20:19:34 CST interest ERROR - 	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
02-03-2017 20:19:34 CST interest ERROR - 	at java.security.AccessController.doPrivileged(Native Method)
02-03-2017 20:19:34 CST interest ERROR - 	at javax.security.auth.Subject.doAs(Subject.java:422)
02-03-2017 20:19:34 CST interest ERROR - 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
02-03-2017 20:19:34 CST interest ERROR - 	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
02-03-2017 20:19:34 CST interest ERROR - Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"device_id":"fbb61bf8-af24-4226-9a22-7912cefd3fad","device_type":"gaid","platform":"android","tags":"[{\"package_name\":\"com.shazam.android\",\"tag\":[{\"1\":\"Music\",\"id\":\"21\"}]},{\"package_name\":\"com.adeco.christmas.sweet.story.pro\",\"tag\":[{\"1\":\"Games\",\"2\":\"Casual\",\"id\":\"41\"}]},{\"package_name\":\"com.forshared\",\"tag\":[{\"1\":\"Entertainment\",\"id\":\"11\"}]},{\"package_name\":\"com.zentertain.photoeditor\",\"tag\":[{\"1\":\"Photography\",\"id\":\"25\"}]},{\"package_name\":\"cn.wps.moffice_eng\",\"tag\":[{\"1\":\"Business\",\"id\":\"6\"}]},{\"package_name\":\"com.miniclip.plagueinc\",\"tag\":[{\"1\":\"Games\",\"2\":\"Simulation\",\"id\":\"47\"}]}]","year":"2017","month":"02","day":"23","business":"all"}
02-03-2017 20:19:34 CST interest ERROR - 	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550)
02-03-2017 20:19:34 CST interest ERROR - 	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
02-03-2017 20:19:34 CST interest ERROR - 	... 8 more
02-03-2017 20:19:34 CST interest ERROR - Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20001]: An error occurred while reading or writing to your custom script. It may have crashed with an error.
02-03-2017 20:19:34 CST interest ERROR - 	at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:410)
02-03-2017 20:19:34 CST interest ERROR - 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
02-03-2017 20:19:34 CST interest ERROR - 	at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
02-03-2017 20:19:34 CST interest ERROR - 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
02-03-2017 20:19:34 CST interest ERROR - 	at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
02-03-2017 20:19:34 CST interest ERROR - 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
02-03-2017 20:19:34 CST interest ERROR - 	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:540)
02-03-2017 20:19:34 CST interest ERROR - 	... 9 more
02-03-2017 20:19:34 CST interest ERROR - Caused by: java.io.IOException: Stream closed
02-03-2017 20:19:34 CST interest ERROR - 	at java.lang.ProcessBuilder$NullOutputStream.write(ProcessBuilder.java:433)
02-03-2017 20:19:34 CST interest ERROR - 	at java.io.OutputStream.write(OutputStream.java:116)
02-03-2017 20:19:34 CST interest ERROR - 	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
02-03-2017 20:19:34 CST interest ERROR - 	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
02-03-2017 20:19:34 CST interest ERROR - 	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
02-03-2017 20:19:34 CST interest ERROR - 	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
02-03-2017 20:19:34 CST interest ERROR - 	at java.io.DataOutputStream.write(DataOutputStream.java:107)
02-03-2017 20:19:34 CST interest ERROR - 	at org.apache.hadoop.hive.ql.exec.TextRecordWriter.write(TextRecordWriter.java:53)
02-03-2017 20:19:34 CST interest ERROR - 	at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:378)
02-03-2017 20:19:34 CST interest ERROR - 	... 15 more

What is going wrong here? Any pointers from the experts would be much appreciated.
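
A note on reading the trace: the innermost cause is `java.io.IOException: Stream closed`, thrown from `ProcessBuilder$NullOutputStream.write`. That suggests the script's process had already exited when Hive tried to write the next row to it, so the row printed in the message is not necessarily the one that broke the script. The trace alone does not say why the process exited. Some possibilities worth checking: the `python` on the worker nodes is not Python 2 (the `print` statement would then be a syntax error), `interest.py` was not shipped to the nodes (for example with `ADD FILE`), or some row contains data the script does not handle. The script's own traceback, if there is one, would be in the task's stderr log.

One way to narrow this down is to run the script outside Hive with a row piped to its stdin. The sketch below assumes Python 3 on the local machine. `run_transform` is a hypothetical helper, the tags value is shortened from the row in the error message, and a deliberately failing stand-in child is used so the example is self-contained.

```python
import subprocess
import sys


def run_transform(command, rows):
    """Pipe tab-separated rows to `command`; return (exit code, stdout, stderr)."""
    payload = "".join("\t".join(row) + "\n" for row in rows)
    proc = subprocess.run(
        command,
        input=payload.encode("utf-8"),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return proc.returncode, proc.stdout.decode("utf-8"), proc.stderr.decode("utf-8")


# Columns in table order, as transform(*) would send them.
row = [
    "fbb61bf8-af24-4226-9a22-7912cefd3fad", "gaid", "android",
    '[{"package_name":"com.shazam.android","tag":[{"1":"Music","id":"21"}]}]',
    "2017", "02", "23", "all",
]

# Stand-in child that reads its input and then fails on purpose, to show
# what a crashing script looks like from the caller's side.
# Replace it with ["python", "interest.py"] to exercise the real script.
crashing_child = [
    sys.executable, "-c",
    "import sys; sys.stdin.read(); raise SystemExit('boom')",
]
code, out, err = run_transform(crashing_child, [row])
print("exit code:", code)
print("stderr:", err.strip())
```

A non-zero exit code together with a traceback on stderr would point at the script itself. A clean local run would point more toward the environment on the cluster nodes.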
