HttpClient 4.x: simulating a login, then requesting a protected URL

雷超林, posted 2013/11/22 23:10

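    I'd like to ask everyone for some help. I need to use HttpClient 4.x to simulate logging in to a website and then open one of its internal links to scrape data. How should the HttpClient policies and related settings be configured? I instantiated one HttpClient, used it to POST the login, and then used the same HttpClient to request the restricted resource, but the site reports that I am not logged in.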

    The HttpClient is configured like this:

// Protocol parameters: HTTP version (1.1/1.0/0.9), user agent, Expect: 100-continue
HttpParams params = new BasicHttpParams();
HttpProtocolParams.setVersion(params, HttpVersion.HTTP_1_1);
HttpProtocolParams.setUserAgent(params, "HttpComponents/1.1");
HttpProtocolParams.setUseExpectContinue(params, true);

// Timeouts
int REQUEST_TIMEOUT = 10 * 1000; // connection timeout: 10 seconds
int SO_TIMEOUT = 10 * 1000;      // socket (read) timeout: 10 seconds
//HttpConnectionParams.setConnectionTimeout(params, REQUEST_TIMEOUT);
//HttpConnectionParams.setSoTimeout(params, SO_TIMEOUT);
params.setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, REQUEST_TIMEOUT);
params.setParameter(CoreConnectionPNames.SO_TIMEOUT, SO_TIMEOUT);

// Supported schemes
SchemeRegistry schreg = new SchemeRegistry();
schreg.register(new Scheme("http", 80, PlainSocketFactory.getSocketFactory()));
schreg.register(new Scheme("https", 443, SSLSocketFactory.getSocketFactory()));

// Thread-safe pooling connection manager
PoolingClientConnectionManager pccm = new PoolingClientConnectionManager(schreg);
pccm.setDefaultMaxPerRoute(20); // maximum concurrent connections per route (host)
pccm.setMaxTotal(100);          // maximum concurrent connections for the whole client

HttpClient httpClient = new DefaultHttpClient(pccm, params);

// I tried both of these cookie policies; neither worked.
//httpClient.getParams().setParameter(ClientPNames.COOKIE_POLICY, CookiePolicy.BROWSER_COMPATIBILITY);
httpClient.getParams().setParameter(ClientPNames.COOKIE_POLICY, CookiePolicy.BEST_MATCH);


Could someone please share a demo, or explain how HttpClient should be configured?

开源中国时时彩理财师

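No special configuration is needed. After a successful login, httpClient keeps the cookie that holds the session and sends it automatically with later requests.

I have used it like this, and scraping works: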

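import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.List;

import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.NameValuePair;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.params.HttpConnectionParams;
import org.apache.http.params.HttpParams;
import org.apache.http.protocol.HTTP;
import org.apache.http.util.EntityUtils;

public class Spider {
	// One client instance is reused for every request, so its cookies carry over.
	private DefaultHttpClient httpClient;
	private HttpResponse response;
	private HttpEntity entity;

	public Spider() {
		this.httpClient = new DefaultHttpClient();
		HttpParams params = httpClient.getParams();
		/* Connection timeout */
		HttpConnectionParams.setConnectionTimeout(params, 30000);
		/* Read timeout */
		HttpConnectionParams.setSoTimeout(params, 30000);
	}

	public void post(String url, List<NameValuePair> nameValuePair) throws IOException {
		HttpPost httpost = new HttpPost(url);
		if (nameValuePair != null) {
			httpost.setEntity(new UrlEncodedFormEntity(nameValuePair, HTTP.UTF_8));
		}
		this.response = this.httpClient.execute(httpost);
		this.entity = response.getEntity();
	}

	public void get(String url) throws IOException {
		HttpGet httpGet = new HttpGet(url);
		this.response = this.httpClient.execute(httpGet);
		this.entity = response.getEntity();
	}

	public void readResponseContent() throws IOException {
		BufferedReader reader = new BufferedReader(new InputStreamReader(this.entity.getContent(), "utf-8"));
		try {
			// Read whatever information you need from the reader here.
		} finally {
			// Always release the entity so the connection can be reused.
			releaseEntity();
		}
	}

	private void releaseEntity() throws IOException {
		// EntityUtils.consume is null-safe and replaces the deprecated consumeContent().
		EntityUtils.consume(this.entity);
	}
}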

 

雷超林
Thanks, the problem is already solved. I used HttpContext localContext = new BasicHttpContext(); together with a cookieStore.
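The reply above names the fix but does not show it. One plausible reading, sketched here under assumptions, is to create a single BasicCookieStore, bind it to a BasicHttpContext, and pass that same context to every execute call so the login cookie is replayed on the protected request. The class name LoginSession, the form field names "username" and "password", and the helper methods are hypothetical; ClientContext.COOKIE_STORE is the HttpClient 4.0 to 4.2 way of binding the store (4.3 and later offer HttpClientContext instead).

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.http.HttpResponse;
import org.apache.http.NameValuePair;
import org.apache.http.client.CookieStore;
import org.apache.http.client.HttpClient;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.client.protocol.ClientContext;
import org.apache.http.impl.client.BasicCookieStore;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.protocol.BasicHttpContext;
import org.apache.http.protocol.HttpContext;
import org.apache.http.util.EntityUtils;

// Hypothetical helper: one cookie store and one context shared by all requests.
final class LoginSession {
    private final HttpClient httpClient = new DefaultHttpClient();
    private final CookieStore cookieStore = new BasicCookieStore();
    private final HttpContext localContext = new BasicHttpContext();

    LoginSession() {
        // Bind the store to the context; requests executed with this context
        // read cookies from it and write Set-Cookie results back into it.
        localContext.setAttribute(ClientContext.COOKIE_STORE, cookieStore);
    }

    // POST the login form. The field names are assumptions; use the ones
    // the target site's login form actually submits.
    void login(String loginUrl, String user, String password) throws IOException {
        List<NameValuePair> form = new ArrayList<NameValuePair>();
        form.add(new BasicNameValuePair("username", user));
        form.add(new BasicNameValuePair("password", password));
        HttpPost post = new HttpPost(loginUrl);
        post.setEntity(new UrlEncodedFormEntity(form, "UTF-8"));
        HttpResponse response = httpClient.execute(post, localContext);
        // Drain the body so the pooled connection is released.
        EntityUtils.consume(response.getEntity());
    }

    // GET a protected page with the same context, so the session cookie is sent.
    String fetch(String url) throws IOException {
        HttpResponse response = httpClient.execute(new HttpGet(url), localContext);
        return EntityUtils.toString(response.getEntity(), "UTF-8");
    }

    CookieStore cookies() {
        return cookieStore;
    }

    HttpContext context() {
        return localContext;
    }
}
```

Usage would be along the lines of creating one LoginSession, calling login(...) once, and then calling fetch(...) for each protected URL. If the site still answers "not logged in", printing cookies().getCookies() after the login shows whether a session cookie was accepted at all.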
华兹格

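Does httpClient really submit the session cookie automatically after a successful login?

My feeling is that this is a session problem. Can httpClient support sessions and cookies the way a browser does?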
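On the question above: from the client's side, a server session is normally nothing more than a cookie (for example JSESSIONID) that the client stores from a Set-Cookie response header and sends back in a Cookie request header. The sketch below illustrates only that mechanism, and it deliberately uses the JDK's own java.net.CookieManager rather than HttpClient so that it runs without a network or extra libraries. The class name, the host example.com, and the cookie value are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration of the cookie round trip behind a "session" (JDK classes only).
final class SessionCookieDemo {

    // Returns the Cookie header value a client would send to a protected URL
    // after the login response has set a session cookie.
    static String cookieHeaderAfterLogin() {
        try {
            // An in-memory cookie jar that accepts every cookie.
            CookieManager jar = new CookieManager(null, CookiePolicy.ACCEPT_ALL);

            // Simulated headers of the login response.
            Map<String, List<String>> loginResponse = new HashMap<String, List<String>>();
            loginResponse.put("Set-Cookie", Arrays.asList("JSESSIONID=abc123; Path=/"));
            jar.put(URI.create("http://example.com/login"), loginResponse);

            // Ask the jar which cookies apply to a later request on the same host.
            Map<String, List<String>> outgoing = jar.get(
                    URI.create("http://example.com/protected/data"),
                    Collections.<String, List<String>>emptyMap());
            List<String> cookies = outgoing.get("Cookie");
            return cookies == null ? "" : String.join("; ", cookies);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

HttpClient 4.x does the equivalent bookkeeping with its own CookieStore, which is why reusing one client (or one context with one cookie store) across the login and the later requests matters.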

Timco
It is usually a problem with the HTTP headers, especially Referer. That was the cause in my earlier case.
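If a site does check the Referer header as suggested above, the header can be set on the request before it is executed. This is a minimal sketch using HttpClient 4.x's HttpGet; the class and method names are hypothetical, and whether a given site needs this header at all is an assumption to verify against the requests a real browser sends.

```java
import org.apache.http.client.methods.HttpGet;

// Hypothetical helper that builds a GET request carrying a Referer header.
final class RefererRequests {

    static HttpGet getWithReferer(String url, String referer) {
        HttpGet get = new HttpGet(url);
        // setHeader replaces any existing header with the same name.
        get.setHeader("Referer", referer);
        return get;
    }
}
```

The returned request is then passed to httpClient.execute(...) as usual, for example with the login page URL as the referer of the first protected request.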