Spider framework for winndoo.

Project description

## Release notes

  • 5.2.5: add ua list
  • 5.2.6: add table create function
  • 5.2.7: add ua list again
  • 5.2.8: add a method for fetching the Baidu cookie
  • 5.2.9: change where the Baidu cookie is stored
  • 5.2.10: add *.txt file
  • 5.2.11: Baidu cookie storage uses a sorted set
  • 5.2.12: add Baidu mb cookie
  • 5.2.13: fix garbled Chinese characters in local environments
  • 5.2.14: downloader fail
  • 5.2.15: downloader fail

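New download center, 2021-08 edition

The urls parameter is an array of strings. Each string is the JSON serialization of one crawl object. The main fields of a crawl object are: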

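  • u, str: request URL
  • hs, dict: request headers
  • sleep, int: number of seconds to sleep; if this field is present, adsl does not crawl and only sleeps for the given number of seconds
  • debug: whether to run in debug mode [if a request needs to carry custom parameters, this field can be used for now]
  • et, extract_type: extraction type; 0 fetches the page
    • 0 -> no parsing; 1 -> parse Baidu PC ranking results; 2 -> parse Baidu mobile ranking results; 3 -> resolve the real URL behind a Baidu link; 4 -> check whether a URL is indexed by Baidu PC; 5 -> parse 360 PC ranking results; 6 -> parse 360 mobile ranking results; 7 -> parse Sogou PC ranking results; 9 -> parse Sogou mobile ranking results; 9 -> parse the page TDK (title, description, keywords)
  • cu, cookie_url: URL of the cookie
  • tid: task id
  • uid: request id, usually the MD5 of the URL
  • d: request data, an application-defined extension field -> raises an error in testing
  • r, redirect: whether to follow redirects
  • v, verify: verification information
  • ih, is_head
  • rh, return header
  • rt, retry times: number of retries
  • e, encoding: character encoding
  • t, timeout: timeout in seconds
  • sgg: temporary URL of a Sogou micro-article (搜狗微文章)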
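The field list above can be turned into a small helper that builds the urls array. This is only a sketch under assumptions: `build_task` is a hypothetical helper that is not part of this package, the integer `tid` and the example URLs are made up, and the extra fields are passed through without validation.

```python
import hashlib
import json


def build_task(url, task_id, headers=None, extract_type=0, **extra):
    """Return one crawl object serialized as a JSON string (hypothetical helper)."""
    task = {
        "u": url,                                             # request URL
        "uid": hashlib.md5(url.encode("utf-8")).hexdigest(),  # request id: MD5 of the URL
        "tid": task_id,                                       # task id (type assumed)
        "hs": headers or {},                                  # request headers
        "et": extract_type,                                   # 0 -> no parsing
    }
    task.update(extra)  # optional fields such as t (timeout) or rt (retry times)
    return json.dumps(task, ensure_ascii=False)


# urls is an array of strings; each string is the JSON of one crawl object.
urls = [
    build_task("https://example.com/", task_id=1, headers={"User-Agent": "Mozilla/5.0"}),
    build_task("https://example.com/about", task_id=1, t=10, rt=2),
]
```

Each element of `urls` is a plain string, so the list can be handed to whatever call sends the request to the download center.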

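Note:

  • Usually only the tid, uid, u, hs, and et fields are set; the other fields can be left unset.
  • Client code should make full use of multithreading, because the server side is asynchronous.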
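One way to follow the multithreading advice is to split the urls array into batches and submit them from a thread pool. The sketch below uses only the standard library. `submit_batch` is a hypothetical placeholder for the real call to the download center, which this page does not document, and the batch size and worker count are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def chunked(items, size):
    """Yield consecutive slices of `items` holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def submit_batch(batch):
    """Hypothetical placeholder: replace the body with the real request
    that sends `batch` as the urls parameter. Here it only counts items."""
    return len(batch)


def submit_all(urls, batch_size=50, workers=8):
    """Submit every batch from a thread pool and collect the results."""
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(submit_batch, batch)
                   for batch in chunked(urls, batch_size)]
        # as_completed yields futures as they finish, not in submission order.
        for future in as_completed(futures):
            results.append(future.result())
    return results
```

Results arrive in completion order, so a caller that needs to match responses to requests should rely on the uid field rather than on position.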

Download files

Source Distribution

wd_download_center-1.0.0.tar.gz (42.6 kB view details)

Uploaded Source

Built Distribution

wd_download_center-1.0.0-py2.py3-none-any.whl (77.1 kB view details)

Uploaded Python 2 Python 3

File details

Details for the file wd_download_center-1.0.0.tar.gz.

File metadata

  • Download URL: wd_download_center-1.0.0.tar.gz
  • Upload date:
  • Size: 42.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/42.0.2 requests-toolbelt/0.9.1 tqdm/4.44.1 CPython/3.7.6

File hashes

Hashes for wd_download_center-1.0.0.tar.gz
Algorithm Hash digest
SHA256 a0b26def7bd552b09385d7c6578e8b8296fb4c97b950352ca1f8c6bb08f90d03
MD5 324ee2dded549f5757ee49ff55c96bc8
BLAKE2b-256 c7d12a1c52158492fd23e35dd5ada76c22923d6b2f24c0ee485bed7ab786d6c9

File details

Details for the file wd_download_center-1.0.0-py2.py3-none-any.whl.

File metadata

  • Download URL: wd_download_center-1.0.0-py2.py3-none-any.whl
  • Upload date:
  • Size: 77.1 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/42.0.2 requests-toolbelt/0.9.1 tqdm/4.44.1 CPython/3.7.6

File hashes

Hashes for wd_download_center-1.0.0-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 99cc2185b239567d21a4ac048562f056738667dee2bcbe8f45201ef48047b5d4
MD5 ce0542fea6d473d5bbdf57071723babc
BLAKE2b-256 6664dabdee64ba657f68efb6ca51ba707b72007e5a11892c2c7b59fadf3173e8
