Nutch proxy configuration problem

疯疯持刀 posted on 2013/11/15 13:59

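My company's network goes through a proxy. When I run bin/nutch... commands under Cygwin, the fetch cannot connect. On a network without a proxy (at home, for example) the same commands work.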

Part of the log follows:

Generator: jobtracker is 'local', generating exactly one partition.
Generator: Partitioning selected urls for politeness.
Generator: segment: nutchDirs/163/segments/20131115113044
Generator: finished at 2013-11-15 11:30:47, elapsed: 00:00:06
Fetcher: Your 'http.agent.name' value should be listed first in 'http.robots.agents' property.
Fetcher: starting at 2013-11-15 11:30:47
Fetcher: segment: nutchDirs/163/segments/20131115113044
Fetcher: threads: 5
QueueFeeder finished: total 1 records + hit by time limit :0
-finishing thread FetcherThread, activeThreads=2
-finishing thread FetcherThread, activeThreads=1
fetching http://www.163.com/
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
fetch of http://www.163.com/ failed with: Http code=407, url=http://www.163.com/

My nutch-site.xml is configured as follows (I have already added the proxy settings):

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
 <configuration>
 <property>
  <name>http.agent.name</name>
  <value>mynutch</value>
  <description>test
  </description>
</property>
<property>
  <name>searcher.dir</name>
  <value>E:/cygwin/home/nutch-1.2/163</value>
  <description></description>
</property>
<property>
   <name>http.agent.version</name>
   <value>1.0</value>
</property>
<property>
   <name>http.proxy.host</name>
   <value>proxy.asiainfo-linkage.com</value>
   <description> </description>
 </property>
 <property>
   <name>http.proxy.port</name>
   <value>8080</value>
   <description></description>
 </property>
 
 <property>
   <name>http.proxy.username</name>
   <value>ailk\mapl</value>
   <description> </description>
</property>
 <property>
   <name>http.proxy.password</name>
   <value>lang%681</value>
   <description> </description>
</property>
 </configuration>
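
For reference, HTTP code 407 in the log means "Proxy Authentication Required": the proxy was reached, but it did not accept the request without valid credentials. Below is a rough sketch of how the proxy-related properties might look if the proxy uses NTLM authentication. That is only a guess based on the domain-style username `ailk\mapl`; the post does not confirm it. The sketch also assumes that the `protocol-httpclient` plugin is the one that handles proxy credentials, that the NTLM domain goes into `http.proxy.realm` while the username is given without the domain prefix, and that the rest of the `plugin.includes` value matches your own nutch-default.xml (copy it from there and only swap `protocol-http` for `protocol-httpclient`).

```xml
<!-- Sketch only. Assumes an NTLM proxy and the protocol-httpclient plugin. -->
<property>
  <name>plugin.includes</name>
  <!-- Assumption: everything after protocol-httpclient should be copied
       from the plugin.includes value in your own nutch-default.xml. -->
  <value>protocol-httpclient|urlfilter-regex|parse-(text|html|js|tika)|index-(basic|anchor)|query-(basic|site|url)|response-(json|xml)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>
<property>
  <name>http.proxy.username</name>
  <!-- Username without the domain prefix (assumed NTLM convention). -->
  <value>mapl</value>
</property>
<property>
  <name>http.proxy.realm</name>
  <!-- Assumption: for NTLM, the Windows domain is given here. -->
  <value>ailk</value>
</property>
```

The `http.proxy.host`, `http.proxy.port`, and `http.proxy.password` properties would stay as they are in the configuration above. If the proxy uses basic or digest authentication instead, the realm property may not be needed at all.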

 
