Garbled text when reading a web page's source
        String url = "http://roll.sohu.com/20111026/n323511012.shtml";

        String str = getHttp(url);
        System.out.println(str);

    public String getHttp(String url) {
        try {
            URL u = new URL(url);
            HttpURLConnection http = (HttpURLConnection) u.openConnection();
            BufferedReader in = new BufferedReader(new InputStreamReader(http.getInputStream(), "gbk"));
            StringBuilder sb = new StringBuilder();
            String line = "";
            while ((line = in.readLine()) != null) {
                sb.append(line).append("\n");
            }
            in.close();
            http.disconnect();
            return sb.toString();
        } catch (Exception ex) {
            Logger.getLogger(Http.class.getName()).log(Level.SEVERE, null, ex);
            return null;
        }
    }

http://roll.sohu.com/20111026/n323511012.shtml 

The page is clearly GBK, so why does the text I read come out garbled?
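
The post does not confirm the cause, but one common reason for this symptom is that the server returns a gzip-compressed body, so decoding the raw bytes as GBK yields garbage even though the page itself is GBK. The sketch below checks for that case. `BodyDecoder`, `looksGzipped`, `readAll`, and `decode` are hypothetical names introduced here for illustration and are not part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not from the original post: decodes a response body
// that may or may not be gzip-compressed.
class BodyDecoder {

    // Returns true if the bytes start with the gzip magic number 0x1f 0x8b.
    static boolean looksGzipped(byte[] body) {
        return body.length >= 2
                && (body[0] & 0xff) == 0x1f
                && (body[1] & 0xff) == 0x8b;
    }

    // Reads a stream to the end and returns everything as a byte array.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    // Decompresses the body if it looks gzipped, then decodes it
    // with the given charset (for example "gbk").
    static String decode(byte[] body, String charsetName) throws IOException {
        byte[] raw = body;
        if (looksGzipped(body)) {
            try (InputStream gz = new GZIPInputStream(new ByteArrayInputStream(body))) {
                raw = readAll(gz);
            }
        }
        return new String(raw, Charset.forName(charsetName));
    }
}
```

Inside `getHttp`, this could replace the `BufferedReader` loop with something like `BodyDecoder.decode(BodyDecoder.readAll(http.getInputStream()), "gbk")`. Printing `http.getContentEncoding()` first would show whether the server actually declared `gzip` for this URL.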


cooc123
Posted 7 years ago · 8 replies / 1K+ views