Nutch 2.3.1 crawl fetches no data

加州肥猫 posted on 2016/04/25 10:51
[root@localhost local]# bin/crawl url/ g1 5
No SOLRURL specified. Skipping indexing.
Injecting seed URLs
/root/nutch/nutch/runtime/local/bin/nutch inject url/ -crawlId g1
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/root/nutch/nutch/runtime/local/lib/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/nutch/nutch/runtime/local/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
InjectorJob: starting at 2016-04-24 22:38:15
InjectorJob: Injecting urlDir: url
InjectorJob: Using class org.apache.gora.hbase.store.HBaseStore as the Gora storage class.
InjectorJob: total number of urls rejected by filters: 0
InjectorJob: total number of urls injected after normalization and filtering: 1
Injector: finished at 2016-04-24 22:38:19, elapsed: 00:00:03
Sun Apr 24 22:38:19 EDT 2016 : Iteration 1 of 5
Generating batchId
Generating a new fetchlist
/root/nutch/nutch/runtime/local/bin/nutch generate -D mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D mapred.reduce.tasks.speculative.execution=false -D mapred.map.tasks.speculative.execution=false -D mapred.compress.map.output=true -topN 50000 -noNorm -noFilter -adddays 0 -crawlId g1 -batchId 1461551899-24095
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/root/nutch/nutch/runtime/local/lib/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/nutch/nutch/runtime/local/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
GeneratorJob: starting at 2016-04-24 22:38:20
GeneratorJob: Selecting best-scoring urls due for fetch.
GeneratorJob: starting
GeneratorJob: filtering: false
GeneratorJob: normalizing: false
GeneratorJob: topN: 50000
GeneratorJob: finished at 2016-04-24 22:38:23, time elapsed: 00:00:03
GeneratorJob: generated batch id: 1461551899-24095 containing 1 URLs
Fetching : 
/root/nutch/nutch/runtime/local/bin/nutch fetch -D mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D mapred.reduce.tasks.speculative.execution=false -D mapred.map.tasks.speculative.execution=false -D mapred.compress.map.output=true -D fetcher.timelimit.mins=180 1461551899-24095 -crawlId g1 -threads 50
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/root/nutch/nutch/runtime/local/lib/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/nutch/nutch/runtime/local/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
FetcherJob: starting at 2016-04-24 22:38:25
FetcherJob: batchId: 1461551899-24095
FetcherJob: threads: 50
FetcherJob: parsing: false
FetcherJob: resuming: false
FetcherJob : timelimit set for : 1461562705299
Using queue mode : byHost
Fetcher: threads: 50
QueueFeeder finished: total 0 records. Hit by time limit :0
-finishing thread FetcherThread0, activeThreads=0
-finishing thread FetcherThread1, activeThreads=0
-finishing thread FetcherThread2, activeThreads=0
-finishing thread FetcherThread3, activeThreads=0
-finishing thread FetcherThread4, activeThreads=0
-finishing thread FetcherThread5, activeThreads=0
-finishing thread FetcherThread6, activeThreads=0
-finishing thread FetcherThread7, activeThreads=0
-finishing thread FetcherThread8, activeThreads=0
-finishing thread FetcherThread9, activeThreads=0
-finishing thread FetcherThread10, activeThreads=0
-finishing thread FetcherThread11, activeThreads=0
-finishing thread FetcherThread12, activeThreads=0
-finishing thread FetcherThread13, activeThreads=0
-finishing thread FetcherThread14, activeThreads=0
-finishing thread FetcherThread16, activeThreads=0
-finishing thread FetcherThread17, activeThreads=0
-finishing thread FetcherThread18, activeThreads=0
-finishing thread FetcherThread19, activeThreads=0
-finishing thread FetcherThread20, activeThreads=0
-finishing thread FetcherThread21, activeThreads=0
-finishing thread FetcherThread22, activeThreads=0
-finishing thread FetcherThread23, activeThreads=0
-finishing thread FetcherThread15, activeThreads=0
-finishing thread FetcherThread25, activeThreads=0
-finishing thread FetcherThread26, activeThreads=0
-finishing thread FetcherThread27, activeThreads=0
-finishing thread FetcherThread28, activeThreads=0
-finishing thread FetcherThread29, activeThreads=0
-finishing thread FetcherThread30, activeThreads=0
-finishing thread FetcherThread31, activeThreads=0
-finishing thread FetcherThread32, activeThreads=0
-finishing thread FetcherThread33, activeThreads=0
-finishing thread FetcherThread34, activeThreads=0
-finishing thread FetcherThread35, activeThreads=0
-finishing thread FetcherThread36, activeThreads=0
-finishing thread FetcherThread24, activeThreads=0
-finishing thread FetcherThread38, activeThreads=0
-finishing thread FetcherThread37, activeThreads=0
-finishing thread FetcherThread40, activeThreads=0
-finishing thread FetcherThread39, activeThreads=0
-finishing thread FetcherThread42, activeThreads=0
-finishing thread FetcherThread43, activeThreads=0
-finishing thread FetcherThread44, activeThreads=0
-finishing thread FetcherThread41, activeThreads=0
-finishing thread FetcherThread46, activeThreads=0
-finishing thread FetcherThread47, activeThreads=0
-finishing thread FetcherThread48, activeThreads=0
Fetcher: throughput threshold: -1
Fetcher: throughput threshold sequence: 5
-finishing thread FetcherThread45, activeThreads=0
-finishing thread FetcherThread49, activeThreads=0
0/0 spinwaiting/active, 0 pages, 0 errors, 0.0 0 pages/s, 0 0 kb/s, 0 URLs in 0 queues
-activeThreads=0
Using queue mode : byHost
Fetcher: threads: 50
QueueFeeder finished: total 0 records. Hit by time limit :0
-finishing thread FetcherThread28, activeThreads=27
-finishing thread FetcherThread29, activeThreads=27
-finishing thread FetcherThread30, activeThreads=27
-finishing thread FetcherThread31, activeThreads=27
-finishing thread FetcherThread32, activeThreads=27
-finishing thread FetcherThread33, activeThreads=27
-finishing thread FetcherThread34, activeThreads=27
-finishing thread FetcherThread35, activeThreads=27
-finishing thread FetcherThread36, activeThreads=27
-finishing thread FetcherThread37, activeThreads=27
-finishing thread FetcherThread38, activeThreads=27
-finishing thread FetcherThread39, activeThreads=27
-finishing thread FetcherThread40, activeThreads=27
-finishing thread FetcherThread41, activeThreads=27
-finishing thread FetcherThread42, activeThreads=27
-finishing thread FetcherThread43, activeThreads=27
-finishing thread FetcherThread44, activeThreads=27
-finishing thread FetcherThread45, activeThreads=27
-finishing thread FetcherThread46, activeThreads=27
-finishing thread FetcherThread47, activeThreads=27
-finishing thread FetcherThread48, activeThreads=27
Fetcher: throughput threshold: -1
Fetcher: throughput threshold sequence: 5
-finishing thread FetcherThread27, activeThreads=27
-finishing thread FetcherThread49, activeThreads=27
-finishing thread FetcherThread0, activeThreads=26
-finishing thread FetcherThread1, activeThreads=25
-finishing thread FetcherThread2, activeThreads=24
-finishing thread FetcherThread3, activeThreads=23
-finishing thread FetcherThread4, activeThreads=22
-finishing thread FetcherThread5, activeThreads=21
-finishing thread FetcherThread6, activeThreads=20
-finishing thread FetcherThread7, activeThreads=19
-finishing thread FetcherThread9, activeThreads=18
-finishing thread FetcherThread10, activeThreads=17
-finishing thread FetcherThread11, activeThreads=16
-finishing thread FetcherThread12, activeThreads=15
-finishing thread FetcherThread13, activeThreads=14
-finishing thread FetcherThread14, activeThreads=13
-finishing thread FetcherThread15, activeThreads=12
-finishing thread FetcherThread16, activeThreads=11
-finishing thread FetcherThread17, activeThreads=10
-finishing thread FetcherThread18, activeThreads=9
-finishing thread FetcherThread19, activeThreads=8
-finishing thread FetcherThread20, activeThreads=7
-finishing thread FetcherThread21, activeThreads=6
-finishing thread FetcherThread8, activeThreads=5
-finishing thread FetcherThread23, activeThreads=4
-finishing thread FetcherThread24, activeThreads=3
-finishing thread FetcherThread25, activeThreads=2
-finishing thread FetcherThread26, activeThreads=1
-finishing thread FetcherThread22, activeThreads=0
0/0 spinwaiting/active, 0 pages, 0 errors, 0.0 0 pages/s, 0 0 kb/s, 0 URLs in 0 queues
-activeThreads=0
FetcherJob: finished at 2016-04-24 22:38:39, time elapsed: 00:00:13
Parsing : 
/root/nutch/nutch/runtime/local/bin/nutch parse -D mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D mapred.reduce.tasks.speculative.execution=false -D mapred.map.tasks.speculative.execution=false -D mapred.compress.map.output=true -D mapred.skip.attempts.to.start.skipping=2 -D mapred.skip.map.max.skip.records=1 1461551899-24095 -crawlId g1
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/root/nutch/nutch/runtime/local/lib/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/nutch/nutch/runtime/local/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
ParserJob: starting at 2016-04-24 22:38:40
ParserJob: resuming: false
ParserJob: forced reparse: false
ParserJob: batchId: 1461551899-24095
ParserJob: success
ParserJob: finished at 2016-04-24 22:38:44, time elapsed: 00:00:03
CrawlDB update for g1
/root/nutch/nutch/runtime/local/bin/nutch updatedb -D mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D mapred.reduce.tasks.speculative.execution=false -D mapred.map.tasks.speculative.execution=false -D mapred.compress.map.output=true 1461551899-24095 -crawlId g1
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/root/nutch/nutch/runtime/local/lib/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/nutch/nutch/runtime/local/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
DbUpdaterJob: starting at 2016-04-24 22:38:45
DbUpdaterJob: batchId: 1461551899-24095
DbUpdaterJob: finished at 2016-04-24 22:38:49, time elapsed: 00:00:03
Skipping indexing tasks: no SOLR url provided.
Sun Apr 24 22:38:49 EDT 2016 : Iteration 2 of 5
Generating batchId
Generating a new fetchlist
/root/nutch/nutch/runtime/local/bin/nutch generate -D mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D mapred.reduce.tasks.speculative.execution=false -D mapred.map.tasks.speculative.execution=false -D mapred.compress.map.output=true -topN 50000 -noNorm -noFilter -adddays 0 -crawlId g1 -batchId 1461551929-31942
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/root/nutch/nutch/runtime/local/lib/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/nutch/nutch/runtime/local/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
GeneratorJob: starting at 2016-04-24 22:38:50
GeneratorJob: Selecting best-scoring urls due for fetch.
GeneratorJob: starting
GeneratorJob: filtering: false
GeneratorJob: normalizing: false
GeneratorJob: topN: 50000
GeneratorJob: finished at 2016-04-24 22:38:53, time elapsed: 00:00:03
GeneratorJob: generated batch id: 1461551929-31942 containing 0 URLs
Generate returned 1 (no new segments created)
Escaping loop: no more URLs to fetch now
[root@localhost local]# ping www.Garfields.cc
PING d67ccee59d3f5407c48f4b00fc759bce.360safedns.com (61.160.224.174) 56(84) bytes of data.
64 bytes from 61.160.224.174: icmp_seq=1 ttl=49 time=198 ms
64 bytes from 61.160.224.174: icmp_seq=2 ttl=49 time=197 ms
64 bytes from 61.160.224.174: icmp_seq=3 ttl=49 time=200 ms
64 bytes from 61.160.224.174: icmp_seq=4 ttl=49 time=198 ms
64 bytes from 61.160.224.174: icmp_seq=5 ttl=49 time=197 ms
64 bytes from 61.160.224.174: icmp_seq=6 ttl=49 time=209 ms
64 bytes from 61.160.224.174: icmp_seq=7 ttl=49 time=197 ms
64 bytes from 61.160.224.174: icmp_seq=8 ttl=49 time=197 ms
64 bytes from 61.160.224.174: icmp_seq=10 ttl=49 time=197 ms
64 bytes from 61.160.224.174: icmp_seq=11 ttl=49 time=197 ms
64 bytes from 61.160.224.174: icmp_seq=12 ttl=49 time=197 ms
64 bytes from 61.160.224.174: icmp_seq=13 ttl=49 time=197 ms
64 bytes from 61.160.224.174: icmp_seq=14 ttl=49 time=198 ms
64 bytes from 61.160.224.174: icmp_seq=15 ttl=49 time=198 ms
64 bytes from 61.160.224.174: icmp_seq=16 ttl=49 time=197 ms
64 bytes from 61.160.224.174: icmp_seq=17 ttl=49 time=198 ms
64 bytes from 61.160.224.174: icmp_seq=18 ttl=49 time=198 ms
64 bytes from 61.160.224.174: icmp_seq=19 ttl=49 time=197 ms
64 bytes from 61.160.224.174: icmp_seq=20 ttl=49 time=198 ms
64 bytes from 61.160.224.174: icmp_seq=21 ttl=49 time=199 ms
64 bytes from 61.160.224.174: icmp_seq=22 ttl=49 time=207 ms
64 bytes from 61.160.224.174: icmp_seq=23 ttl=49 time=197 ms
64 bytes from 61.160.224.174: icmp_seq=24 ttl=49 time=197 ms
64 bytes from 61.160.224.174: icmp_seq=25 ttl=49 time=197 ms
64 bytes from 61.160.224.174: icmp_seq=26 ttl=49 time=197 ms
64 bytes from 61.160.224.174: icmp_seq=27 ttl=49 time=198 ms
64 bytes from 61.160.224.174: icmp_seq=28 ttl=49 time=197 ms
64 bytes from 61.160.224.174: icmp_seq=29 ttl=49 time=198 ms
64 bytes from 61.160.224.174: icmp_seq=30 ttl=49 time=220 ms
64 bytes from 61.160.224.174: icmp_seq=31 ttl=49 time=197 ms
64 bytes from 61.160.224.174: icmp_seq=32 ttl=49 time=198 ms
64 bytes from 61.160.224.174: icmp_seq=33 ttl=49 time=198 ms
64 bytes from 61.160.224.174: icmp_seq=34 ttl=49 time=197 ms
64 bytes from 61.160.224.174: icmp_seq=35 ttl=49 time=197 ms
64 bytes from 61.160.224.174: icmp_seq=36 ttl=49 time=197 ms
64 bytes from 61.160.224.174: icmp_seq=37 ttl=49 time=197 ms
64 bytes from 61.160.224.174: icmp_seq=38 ttl=49 time=197 ms
64 bytes from 61.160.224.174: icmp_seq=39 ttl=49 time=198 ms
64 bytes from 61.160.224.174: icmp_seq=40 ttl=49 time=197 ms
64 bytes from 61.160.224.174: icmp_seq=41 ttl=49 time=197 ms
64 bytes from 61.160.224.174: icmp_seq=42 ttl=49 time=197 ms
^C
--- d67ccee59d3f5407c48f4b00fc759bce.360safedns.com ping statistics ---
42 packets transmitted, 41 received, 2% packet loss, time 41537ms
rtt min/avg/max/mdev = 197.165/199.038/220.489/4.173 ms
[root@localhost local]# bin/crawl url/ g1 10
No SOLRURL specified. Skipping indexing.
Injecting seed URLs
/root/nutch/nutch/runtime/local/bin/nutch inject url/ -crawlId g1
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/root/nutch/nutch/runtime/local/lib/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/nutch/nutch/runtime/local/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
InjectorJob: starting at 2016-04-24 22:40:36
InjectorJob: Injecting urlDir: url
InjectorJob: Using class org.apache.gora.hbase.store.HBaseStore as the Gora storage class.
InjectorJob: total number of urls rejected by filters: 0
InjectorJob: total number of urls injected after normalization and filtering: 1
Injector: finished at 2016-04-24 22:40:40, elapsed: 00:00:03
Sun Apr 24 22:40:40 EDT 2016 : Iteration 1 of 10
Generating batchId
Generating a new fetchlist
/root/nutch/nutch/runtime/local/bin/nutch generate -D mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D mapred.reduce.tasks.speculative.execution=false -D mapred.map.tasks.speculative.execution=false -D mapred.compress.map.output=true -topN 50000 -noNorm -noFilter -adddays 0 -crawlId g1 -batchId 1461552040-7753
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/root/nutch/nutch/runtime/local/lib/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/nutch/nutch/runtime/local/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
GeneratorJob: starting at 2016-04-24 22:40:41
GeneratorJob: Selecting best-scoring urls due for fetch.
GeneratorJob: starting
GeneratorJob: filtering: false
GeneratorJob: normalizing: false
GeneratorJob: topN: 50000
GeneratorJob: finished at 2016-04-24 22:40:45, time elapsed: 00:00:03
GeneratorJob: generated batch id: 1461552040-7753 containing 1 URLs
Fetching : 
/root/nutch/nutch/runtime/local/bin/nutch fetch -D mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D mapred.reduce.tasks.speculative.execution=false -D mapred.map.tasks.speculative.execution=false -D mapred.compress.map.output=true -D fetcher.timelimit.mins=180 1461552040-7753 -crawlId g1 -threads 50
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/root/nutch/nutch/runtime/local/lib/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/nutch/nutch/runtime/local/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
FetcherJob: starting at 2016-04-24 22:40:46
FetcherJob: batchId: 1461552040-7753
FetcherJob: threads: 50
FetcherJob: parsing: false
FetcherJob: resuming: false
FetcherJob : timelimit set for : 1461562846381
Using queue mode : byHost
Fetcher: threads: 50
QueueFeeder finished: total 0 records. Hit by time limit :0
-finishing thread FetcherThread0, activeThreads=0
-finishing thread FetcherThread1, activeThreads=0
-finishing thread FetcherThread2, activeThreads=0
-finishing thread FetcherThread3, activeThreads=0
-finishing thread FetcherThread4, activeThreads=0
-finishing thread FetcherThread5, activeThreads=0
-finishing thread FetcherThread6, activeThreads=0
-finishing thread FetcherThread7, activeThreads=0
-finishing thread FetcherThread8, activeThreads=0
-finishing thread FetcherThread9, activeThreads=0
-finishing thread FetcherThread10, activeThreads=0
-finishing thread FetcherThread11, activeThreads=0
-finishing thread FetcherThread12, activeThreads=0
-finishing thread FetcherThread13, activeThreads=0
-finishing thread FetcherThread14, activeThreads=0
-finishing thread FetcherThread15, activeThreads=0
-finishing thread FetcherThread16, activeThreads=0
-finishing thread FetcherThread17, activeThreads=0
-finishing thread FetcherThread18, activeThreads=0
-finishing thread FetcherThread19, activeThreads=0
-finishing thread FetcherThread20, activeThreads=0
-finishing thread FetcherThread21, activeThreads=0
-finishing thread FetcherThread22, activeThreads=0
-finishing thread FetcherThread23, activeThreads=0
-finishing thread FetcherThread24, activeThreads=0
-finishing thread FetcherThread25, activeThreads=0
-finishing thread FetcherThread26, activeThreads=0
-finishing thread FetcherThread27, activeThreads=0
-finishing thread FetcherThread28, activeThreads=0
-finishing thread FetcherThread29, activeThreads=0
-finishing thread FetcherThread30, activeThreads=0
-finishing thread FetcherThread31, activeThreads=0
-finishing thread FetcherThread32, activeThreads=0
-finishing thread FetcherThread33, activeThreads=0
-finishing thread FetcherThread34, activeThreads=0
-finishing thread FetcherThread35, activeThreads=0
-finishing thread FetcherThread36, activeThreads=0
-finishing thread FetcherThread37, activeThreads=0
-finishing thread FetcherThread39, activeThreads=0
-finishing thread FetcherThread40, activeThreads=0
-finishing thread FetcherThread41, activeThreads=0
-finishing thread FetcherThread42, activeThreads=0
-finishing thread FetcherThread43, activeThreads=0
-finishing thread FetcherThread44, activeThreads=0
-finishing thread FetcherThread45, activeThreads=0
-finishing thread FetcherThread46, activeThreads=0
-finishing thread FetcherThread47, activeThreads=0
-finishing thread FetcherThread48, activeThreads=0
Fetcher: throughput threshold: -1
Fetcher: throughput threshold sequence: 5
-finishing thread FetcherThread38, activeThreads=0
-finishing thread FetcherThread49, activeThreads=0
0/0 spinwaiting/active, 0 pages, 0 errors, 0.0 0 pages/s, 0 0 kb/s, 0 URLs in 0 queues
-activeThreads=0
Using queue mode : byHost
Fetcher: threads: 50
QueueFeeder finished: total 0 records. Hit by time limit :0
Fetcher: throughput threshold: -1
Fetcher: throughput threshold sequence: 5
-finishing thread FetcherThread49, activeThreads=49
-finishing thread FetcherThread0, activeThreads=48
-finishing thread FetcherThread1, activeThreads=47
-finishing thread FetcherThread2, activeThreads=46
-finishing thread FetcherThread3, activeThreads=45
-finishing thread FetcherThread4, activeThreads=44
-finishing thread FetcherThread5, activeThreads=43
-finishing thread FetcherThread6, activeThreads=42
-finishing thread FetcherThread7, activeThreads=41
-finishing thread FetcherThread8, activeThreads=40
-finishing thread FetcherThread9, activeThreads=39
-finishing thread FetcherThread10, activeThreads=38
-finishing thread FetcherThread11, activeThreads=37
-finishing thread FetcherThread12, activeThreads=36
-finishing thread FetcherThread13, activeThreads=35
-finishing thread FetcherThread14, activeThreads=34
-finishing thread FetcherThread15, activeThreads=33
-finishing thread FetcherThread16, activeThreads=32
-finishing thread FetcherThread18, activeThreads=31
-finishing thread FetcherThread19, activeThreads=30
-finishing thread FetcherThread20, activeThreads=29
-finishing thread FetcherThread21, activeThreads=28
-finishing thread FetcherThread22, activeThreads=27
-finishing thread FetcherThread23, activeThreads=26
-finishing thread FetcherThread24, activeThreads=25
-finishing thread FetcherThread25, activeThreads=24
-finishing thread FetcherThread26, activeThreads=23
-finishing thread FetcherThread29, activeThreads=22
-finishing thread FetcherThread30, activeThreads=21
-finishing thread FetcherThread28, activeThreads=20
-finishing thread FetcherThread31, activeThreads=19
-finishing thread FetcherThread32, activeThreads=18
-finishing thread FetcherThread27, activeThreads=17
-finishing thread FetcherThread33, activeThreads=16
-finishing thread FetcherThread34, activeThreads=15
-finishing thread FetcherThread35, activeThreads=14
-finishing thread FetcherThread36, activeThreads=13
-finishing thread FetcherThread37, activeThreads=12
-finishing thread FetcherThread38, activeThreads=11
-finishing thread FetcherThread39, activeThreads=10
-finishing thread FetcherThread40, activeThreads=9
-finishing thread FetcherThread41, activeThreads=8
-finishing thread FetcherThread42, activeThreads=7
-finishing thread FetcherThread43, activeThreads=6
-finishing thread FetcherThread44, activeThreads=5
-finishing thread FetcherThread45, activeThreads=4
-finishing thread FetcherThread17, activeThreads=3
-finishing thread FetcherThread47, activeThreads=2
-finishing thread FetcherThread48, activeThreads=1
-finishing thread FetcherThread46, activeThreads=0
0/0 spinwaiting/active, 0 pages, 0 errors, 0.0 0 pages/s, 0 0 kb/s, 0 URLs in 0 queues
-activeThreads=0
FetcherJob: finished at 2016-04-24 22:41:00, time elapsed: 00:00:13
Parsing : 
/root/nutch/nutch/runtime/local/bin/nutch parse -D mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D mapred.reduce.tasks.speculative.execution=false -D mapred.map.tasks.speculative.execution=false -D mapred.compress.map.output=true -D mapred.skip.attempts.to.start.skipping=2 -D mapred.skip.map.max.skip.records=1 1461552040-7753 -crawlId g1
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/root/nutch/nutch/runtime/local/lib/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/nutch/nutch/runtime/local/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
ParserJob: starting at 2016-04-24 22:41:01
ParserJob: resuming: false
ParserJob: forced reparse: false
ParserJob: batchId: 1461552040-7753
ParserJob: success
ParserJob: finished at 2016-04-24 22:41:05, time elapsed: 00:00:04
CrawlDB update for g1
/root/nutch/nutch/runtime/local/bin/nutch updatedb -D mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D mapred.reduce.tasks.speculative.execution=false -D mapred.map.tasks.speculative.execution=false -D mapred.compress.map.output=true 1461552040-7753 -crawlId g1
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/root/nutch/nutch/runtime/local/lib/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/nutch/nutch/runtime/local/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
DbUpdaterJob: starting at 2016-04-24 22:41:06
DbUpdaterJob: batchId: 1461552040-7753
DbUpdaterJob: finished at 2016-04-24 22:41:10, time elapsed: 00:00:03
Skipping indexing tasks: no SOLR url provided.
Sun Apr 24 22:41:11 EDT 2016 : Iteration 2 of 10
Generating batchId
Generating a new fetchlist
/root/nutch/nutch/runtime/local/bin/nutch generate -D mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D mapred.reduce.tasks.speculative.execution=false -D mapred.map.tasks.speculative.execution=false -D mapred.compress.map.output=true -topN 50000 -noNorm -noFilter -adddays 0 -crawlId g1 -batchId 1461552071-22998
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/root/nutch/nutch/runtime/local/lib/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/nutch/nutch/runtime/local/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
GeneratorJob: starting at 2016-04-24 22:41:12
GeneratorJob: Selecting best-scoring urls due for fetch.
GeneratorJob: starting
GeneratorJob: filtering: false
GeneratorJob: normalizing: false
GeneratorJob: topN: 50000
GeneratorJob: finished at 2016-04-24 22:41:15, time elapsed: 00:00:03
GeneratorJob: generated batch id: 1461552071-22998 containing 0 URLs
Generate returned 1 (no new segments created)
Escaping loop: no more URLs to fetch now
[root@localhost local]# 
Connection closed by foreign host.
