Developing a Java web crawler with HTTPclient+htmlparser

chunshui posted on 2014/07/16 17:28

While developing a Java web crawler with HTTPclient+htmlparser, I get the following error:

org.htmlparser.util.ParserException: Connection refused: connect;

java.net.ConnectException: Connection refused: connect
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
at java.net.Socket.connect(Socket.java:519)
at java.net.Socket.connect(Socket.java:469)
at sun.net.NetworkClient.doConnect(NetworkClient.java:163)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:394)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:529)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:233)
at sun.net.www.http.HttpClient.New(HttpClient.java:306)
at sun.net.www.http.HttpClient.New(HttpClient.java:323)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:852)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:793)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:718)
at org.htmlparser.http.ConnectionManager.openConnection(ConnectionManager.java:643)
at org.htmlparser.http.ConnectionManager.openConnection(ConnectionManager.java:841)
at org.htmlparser.Parser.setResource(Parser.java:398)
at org.htmlparser.Parser.<init>(Parser.java:317)
at org.htmlparser.Parser.<init>(Parser.java:331)
at com.sohu.crawler.LinkParser.extracLinks(LinkParser.java:27)
at com.sohu.crawler.LinkParser.doParser(LinkParser.java:78)
at com.sohu.servlet.GetNewsServlet$1.run(GetNewsServlet.java:41)
at java.lang.Thread.run(Thread.java:619)

skanda
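You are sending requests too frequently, so the remote server is refusing the connections. Try lowering the request rate.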
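
If the refusals really are caused by request rate, as suggested above, one way to lower it is to enforce a minimum gap between requests and to back off before retrying a failed one. The following is only a minimal sketch under that assumption: `ThrottledFetcher`, its parameters, and the interval values are hypothetical names chosen for illustration, not part of HttpClient or htmlparser. It retries on any exception because, as the stack trace shows, htmlparser reports the underlying `java.net.ConnectException` wrapped in a `ParserException`.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper: spaces out requests and retries failed ones with backoff. */
class ThrottledFetcher {
    private final long minIntervalMillis; // minimum gap between two requests
    private final int maxAttempts;        // total tries per request, including the first
    private long lastRequestAt = 0L;

    ThrottledFetcher(long minIntervalMillis, int maxAttempts) {
        if (minIntervalMillis < 0 || maxAttempts < 1) {
            throw new IllegalArgumentException("interval must be >= 0 and attempts >= 1");
        }
        this.minIntervalMillis = minIntervalMillis;
        this.maxAttempts = maxAttempts;
    }

    /** Blocks until at least minIntervalMillis has passed since the previous request. */
    private synchronized void waitForTurn() throws InterruptedException {
        long wait = lastRequestAt + minIntervalMillis - System.currentTimeMillis();
        if (wait > 0) {
            Thread.sleep(wait);
        }
        lastRequestAt = System.currentTimeMillis();
    }

    /**
     * Runs the request, throttled. On failure it sleeps (doubling the delay
     * each time) and tries again, up to maxAttempts; the last failure is rethrown.
     */
    <T> T fetch(Callable<T> request) throws Exception {
        long backoff = minIntervalMillis;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            waitForTurn();
            try {
                return request.call();
            } catch (InterruptedException e) {
                throw e; // do not retry when the thread is being stopped
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoff);
                    backoff *= 2;
                }
            }
        }
        throw last; // non-null here because maxAttempts >= 1
    }
}
```

With Java 8 or later, the crawler could share one instance across its worker threads and wrap each parser creation, for example `fetcher.fetch(() -> new Parser(url))`, so that all threads together stay below the chosen rate. Suitable values for the interval and the number of attempts depend on the target site and would need to be found by experiment.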
chuntianhao
Web crawler tutorial: http://my.oschina.net/youmumzyx/blog?fromerr=lT0iOFNE