On encoding Chinese parameters with HttpClient

红薯, posted 2008/10/05 17:13

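Today, while submitting Chinese parameters with HttpClient, I found that no matter how the server processed them, the resulting string was always garbled. I figured the problem had to be on the client side, so I checked the HttpClient documentation, which includes this passage: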

The standard for URLs ( RFC1738 ) explicitly states that URLs may only contain graphic printable characters of the US-ASCII coded character set and is defined in terms of octets. The octets 80-FF hexadecimal are not used in US-ASCII and the octets 00-1F hexadecimal represent control characters; characters in these ranges must be encoded.

Characters which cannot be represented by an 8-bit ASCII code, can not be used in an URL as there is no way to reliably encode them (the encoding scheme for URLs is based off of octets). Despite this, some servers do support varying means of encoding double byte characters in URLs, the most common technique seems to be to use UTF-8 encoding and encode each octet separately even if a pair of octets represents one character. This however, is not specified by the standard and is highly prone to error, so it is recommended that URLs be restricted to the 8-bit ASCII range.
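
To make the "UTF-8, one octet at a time" technique from the passage concrete, here is a minimal sketch using only the JDK's java.net.URLEncoder and URLDecoder. The class name UrlParamEncoding and its two helper methods are made up for illustration and are not part of HttpClient. The Unicode escapes in the code spell 你好中国.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper class, for illustration only.
final class UrlParamEncoding {

    // Encode the value as UTF-8 and percent-encode every non-ASCII octet separately.
    static String encodeUtf8(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Reverse step: turn the percent-encoded octets back into text, reading them as UTF-8.
    static String decodeUtf8(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String text = "\u4F60\u597D\u4E2D\u56FD";
        String encoded = encodeUtf8(text);
        System.out.println(encoded); // %E4%BD%A0%E5%A5%BD%E4%B8%AD%E5%9B%BD
        System.out.println(decodeUtf8(encoded).equals(text)); // true
    }
}
```

Each character here occupies three UTF-8 octets, so four characters become twelve %XX groups. Note that URLEncoder produces form encoding (a space becomes "+"), and the server only recovers the text if it also decodes the octets as UTF-8.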


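So Chinese parameters have to be transcoded before they are submitted:
// Reinterpret the string's bytes (in the platform default charset) as ISO-8859-1 characters, one per byte
NameValuePair content = new NameValuePair("content",new String(" 你好中国".getBytes(),"ISO-8859-1"));
Done!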
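
Why the transcoding line above can work at all: ISO-8859-1 maps every byte value to exactly one character, so wrapping bytes in an ISO-8859-1 string loses nothing, and the receiving side can get the original bytes back. The following is a minimal sketch of that round trip with the JDK alone. It assumes UTF-8 as the underlying charset (the line above uses the platform default instead), and the class and method names are made up for illustration. The Unicode escapes spell 你好中国.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class Latin1RoundTrip {

    // Client side: take the UTF-8 bytes and view each byte as one ISO-8859-1 character.
    static String wrapAsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Receiving side: recover the raw bytes, then decode them with the real charset.
    static String unwrapFromLatin1(String wrapped) {
        byte[] raw = wrapped.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "\u4F60\u597D\u4E2D\u56FD";
        String wrapped = wrapAsLatin1(text);
        System.out.println(wrapped.length()); // 12, one character per UTF-8 byte
        System.out.println(unwrapFromLatin1(wrapped).equals(text)); // true
    }
}
```

The catch is that both ends have to agree on the underlying charset. With a bare getBytes() that charset is whatever the client machine's default happens to be, so naming it explicitly is the safer choice.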

崔钢
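Some browsers, such as IE and Firefox, use UTF-8 encoding in URLs by default. If you submit Chinese parameters through the URL, this is still something to watch out for.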