Published 2019-08-06 10:19
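Written by imitating an existing example.
Image URLs are located with plain string searches and then downloaded.
Images are saved to the current directory.
Downloaded GIFs do not animate.....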
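The string-search approach described above comes down to locating a marker with `str.find` and slicing out whatever sits between two delimiters. Here is a minimal sketch of that idea; the helper name `between` and the sample fragment are made up for illustration and are not part of the original script.

```python
def between(text, marker, open_delim, close_delim):
    """Return the text between open_delim and close_delim that follows marker.

    Returns None if the marker or either delimiter is missing.
    """
    start = text.find(marker)
    if start == -1:
        return None
    begin = text.find(open_delim, start)
    if begin == -1:
        return None
    end = text.find(close_delim, begin + len(open_delim))
    if end == -1:
        return None
    return text[begin + len(open_delim):end]


# Made-up fragment resembling the page-number element the script looks for.
sample = '<span class="current-comment-page">[87]</span>'
page = between(sample, 'span class="current-comment-page"', "[", "]")
print(page)
```

Checking for `-1` matters: `str.find` does not raise when the marker is absent, so an unchecked slice would quietly return the wrong text if the page layout changed.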
import time
import urllib.request


def open_url(url):
    # Fetch the URL with a browser-like User-Agent and return the raw bytes.
    print(url)
    req = urllib.request.Request(url)
    req.add_header("User-Agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36")
    response = urllib.request.urlopen(req)
    return response.read()


def getInitialpage():
    # Return the number of the newest page (as a string), read from the
    # "[N]" text inside the current-comment-page span.
    url = "http://jandan.net/ooxx"
    html = open_url(url)
    html = html.decode("utf-8")
    index = html.find("span class=\"current-comment-page\"")
    beginindex = html.find("[", index)
    endindex = html.find("]", beginindex)
    initialpage = html[(beginindex + 1):endindex]
    return initialpage


def getpiclist(pageurl):
    # Collect the image URL that follows each "[查看原图]" link on the page.
    html = open_url(pageurl)
    html = html.decode("utf-8")
    marker = "[查看原图]</a><br /><img"
    piclist = list()
    for i in range(html.count(marker)):
        index = html.find(marker)
        html = html[index:]
        # The URL is the first double-quoted value after the marker.
        beginindex = html.find("\"")
        endindex = html.find("\"", (beginindex + 1))
        picurl = html[beginindex + 1:endindex]
        html = html[endindex:]
        piclist.append(picurl)
    return piclist


def savepic(piclist):
    # Download each image into the current directory, one per second.
    for picurl in piclist:
        # The page uses protocol-relative URLs ("//host/path").
        html = open_url("http:{}".format(picurl))
        filename = picurl.split("/")[-1]
        print(filename)
        with open(filename, "wb") as f:
            f.write(html)
        time.sleep(1)


def test(page):
    # Download the newest page plus the `page` pages before it.
    initialpage = int(getInitialpage())
    for i in range((initialpage - page), (initialpage + 1)):
        pageurl = "http://jandan.net/ooxx/page-{}#comments".format(i)
        piclist = getpiclist(pageurl)
        savepic(piclist)


if __name__ == "__main__":
    test(1)
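As a possible alternative to the manual slicing in `getpiclist` and the `"http:{}".format(...)` step in `savepic`, the extraction could be written with a regular expression and `urllib.parse.urljoin`. This is only a sketch under assumptions: it assumes `src` is the first attribute of the `<img>` tag that follows each "[查看原图]" link, and the names `PIC_PATTERN`, `extract_pic_urls`, `filename_from_url` and the sample HTML (with the placeholder host `img.example.com`) are hypothetical, not taken from the site.

```python
import posixpath
import re
from urllib.parse import urljoin, urlsplit

# Assumed markup: the "[查看原图]" link is followed directly by <img src="...">.
PIC_PATTERN = re.compile(r'\[查看原图\]</a><br /><img src="([^"]+)"')


def extract_pic_urls(html, base_url="http://jandan.net/ooxx"):
    """Return absolute image URLs found after each "[查看原图]" link."""
    # urljoin resolves protocol-relative URLs ("//host/path") against base_url.
    return [urljoin(base_url, src) for src in PIC_PATTERN.findall(html)]


def filename_from_url(url):
    """Return the last path component of url, ignoring any query string."""
    return posixpath.basename(urlsplit(url).path)


# Made-up fragment for illustration; the real page markup may differ.
sample_html = (
    '<a href="//img.example.com/large/a.jpg">[查看原图]</a><br />'
    '<img src="//img.example.com/mw600/a.jpg" />'
    '<a href="//img.example.com/large/b.gif">[查看原图]</a><br />'
    '<img src="//img.example.com/mw600/b.gif?x=1" />'
)
urls = extract_pic_urls(sample_html)
print(urls)
print([filename_from_url(u) for u in urls])
```

Keeping the parsing separate from the network calls means it can be tried on a saved copy of the page without sending any requests.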
Author: 短发越来越短
Link: https://www.pythonheidong.com/blog/article/7575/909a60805ccf2d37b1fb/
Source: python黑洞网
Any repost, in any form, must credit the source; infringement will be pursued legally once discovered.