How to scrape specific content from a web page with Java
1. Define the goal: we want to scrape the GIF animation on the "Baidu" home page and download it.

3. With the dependency in place, we can start coding. Step one: fetch the full HTML content of the "Baidu" home page.
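As a rough illustration of this step, here is a minimal sketch that fetches a page's HTML using only the JDK's `java.net.http.HttpClient` (Java 11+). This is a stand-in for the jsoup call used in the full code, not the article's own method. The class name `PageFetcher`, the 10-second timeout, and the `User-Agent` value are assumptions chosen for the example.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

/** Hypothetical helper (not part of the original article). */
class PageFetcher {

    /** Builds a GET request for the given page URL. */
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(10))      // assumed timeout
                .header("User-Agent", "Mozilla/5.0")  // assumed header value
                .GET()
                .build();
    }

    /** Sends the request and returns the response body as a string. */
    static String fetchHtml(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpResponse<String> response =
                client.send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

Calling `PageFetcher.fetchHtml("http://www.baidu.com/")` should return the raw HTML as a string. Unlike jsoup, this does not parse the markup, so selecting elements by CSS class still needs an HTML parser.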

5. Step three: get the network path (URL) of the image.
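The `src` attribute of an image is often not a complete URL: it may be protocol-relative (`//host/path`) or relative to the page. A minimal sketch of turning it into an absolute URL with the JDK's `URI.resolve` follows. The class name `ImageUrlResolver` and the `www.example.com` paths are hypothetical, used only for illustration.

```java
import java.net.URI;

/** Hypothetical helper (not part of the original article). */
class ImageUrlResolver {

    /**
     * Resolves a raw src attribute value against the URL of the page it came from.
     * Protocol-relative values inherit the page's scheme, relative values are
     * resolved against the page path, and absolute URLs are returned unchanged.
     */
    static String resolve(String pageUrl, String src) {
        return URI.create(pageUrl).resolve(src.trim()).toString();
    }

    public static void main(String[] args) {
        String page = "http://www.baidu.com/";
        // Example inputs below are made up for demonstration.
        System.out.println(resolve(page, "//www.example.com/img/logo.gif"));
        System.out.println(resolve(page, "img/logo.gif"));
    }
}
```

When jsoup is already in use, `element.absUrl("src")` offers a similar result without a separate helper, provided the document was loaded with a base URL (as it is with `Jsoup.connect(...).get()`).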


7. Here is the complete code:

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// The class name is a placeholder; the original listing shows only main().
public class BaiduGifDownloader {
    public static void main(String[] args) throws IOException {
        // Fetch and parse the "Baidu" home page.
        Document doc = Jsoup.connect("http://www.baidu.com/").get();
        // Select the logo image elements by their CSS class.
        Elements select = doc.select(".index-logo-src");
        int i = 1;
        for (Element element : select) {
            // The src is protocol-relative ("//host/path"):
            // drop the leading "//" and prepend "http://".
            String src = element.attr("src");
            src = "http://" + src.substring(2);
            URL url = new URL(src);
            // The directory e:/img must already exist.
            // try-with-resources closes both streams even if the copy fails.
            try (InputStream in = url.openStream();
                 FileOutputStream out = new FileOutputStream(new File("e:/img/" + i + ".gif"))) {
                byte[] buffer = new byte[1024];
                int length;
                while ((length = in.read(buffer)) != -1) {
                    out.write(buffer, 0, length);
                }
            }
            i++;
        }
    }
}
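The download loop above copies bytes by hand into a hard-coded directory. As an alternative sketch, the same save step can be written with `java.nio.file.Files.copy`, which also makes it easy to create the target directory first. The class name `StreamSaver` and the temporary directory used in `main` are assumptions for this example; the in-memory bytes stand in for a real network stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper (not part of the original article). */
class StreamSaver {

    /**
     * Writes everything from the input stream to the target file, creating
     * missing parent directories and overwriting an existing file.
     * Returns the number of bytes written and closes the stream.
     */
    static long save(InputStream in, Path target) throws IOException {
        Path parent = target.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        try (InputStream source = in) {
            return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Fake image data in place of url.openStream(), so the demo runs offline.
        byte[] fake = "GIF89a".getBytes(StandardCharsets.US_ASCII);
        Path target = Files.createTempDirectory("img").resolve("1.gif");
        long written = save(new ByteArrayInputStream(fake), target);
        System.out.println(written + " bytes written to " + target);
    }
}
```

In the scraper, `save(url.openStream(), Path.of("e:/img", i + ".gif"))` could replace the manual read/write loop.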