A handy helper for web scraper authors

Project description

Overview

  • A spider class built as a wrapper around requests

Python interpreter

  • Python 3

How to use

from wauo import WauoSpider

spider = WauoSpider()

GET

url = 'https://github.com/markadc'
resp = spider.send(url)
print(resp.text)

POST

Using the data parameter

api = 'https://github.com/markadc'
data = {
    'key1': 'value1',
    'key2': 'value2'
}
resp = spider.send(api, data=data)

Using the json parameter

api = 'https://github.com/markadc'
json = {
    'key1': 'value1',
    'key2': 'value2'
}
resp = spider.send(api, json=json)
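
Because wauo wraps requests, the two parameters presumably behave as they do in requests: data sends a form-encoded body, while json sends a JSON body. The sketch below reproduces both encodings with the standard library only. The helper names encode_form and encode_json are hypothetical and not part of wauo, and the exact bytes wauo puts on the wire are an assumption.

```python
import json
from urllib.parse import urlencode

payload = {'key1': 'value1', 'key2': 'value2'}


def encode_form(fields):
    """Hypothetical helper: body roughly as sent for data=... (form-encoded)."""
    return urlencode(fields)


def encode_json(obj):
    """Hypothetical helper: body roughly as sent for json=... (JSON text)."""
    return json.dumps(obj)


form_body = encode_form(payload)
json_body = encode_json(payload)

print(form_body)  # key1=value1&key2=value2
print(json_body)
```

Which parameter to use depends on what the target endpoint expects: HTML form handlers usually want the form-encoded body, while JSON APIs usually want the JSON body.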

Restricting responses

Restricting status codes

  • If the status code is not in codes, the response is discarded
resp = spider.send('https://github.com/markadc', codes=[200, 301, 302])
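
The rule above can be pictured as a simple membership test. This is a minimal sketch of the idea, not wauo's actual implementation: accept_status is a hypothetical name, and treating a missing codes list as "no restriction" is an assumption.

```python
def accept_status(status_code, codes=None):
    """Hypothetical helper: return True if a response with this status is kept."""
    # Assumption: when no codes list is given, every status is accepted.
    if codes is None:
        return True
    return status_code in codes


allowed = [200, 301, 302]
# Keep only the statuses that pass the filter.
kept = [code for code in (200, 302, 404, 500) if accept_status(code, allowed)]
print(kept)  # [200, 302]
```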

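Restricting response content

  • If checker returns False, the response is discarded
def is_ok(response):
    html = response.text
    # '验证码' is Chinese for "CAPTCHA"; finding it suggests an anti-bot page
    if html.find('验证码') != -1:
        return False
    return True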


resp = spider.send('https://github.com/markadc', checker=is_ok)
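
To see how a checker decides without sending a request, the sketch below applies one to stand-in response objects. filter_response is a hypothetical helper, not part of wauo, and returning None for a discarded response is an assumption, since the description does not say what send does after discarding.

```python
from types import SimpleNamespace


def is_ok(response):
    # Reject pages that contain the Chinese word for "CAPTCHA".
    return '验证码' not in response.text


def filter_response(response, checker=None):
    """Hypothetical helper: return the response, or None if the checker rejects it."""
    # Only an explicit False discards the response, as the rule above states.
    if checker is not None and checker(response) is False:
        return None
    return response


# Stand-in objects that expose only the .text attribute the checker reads.
good = SimpleNamespace(text='<html>hello</html>')
blocked = SimpleNamespace(text='<html>请输入验证码</html>')

print(filter_response(good, is_ok) is good)  # True
print(filter_response(blocked, is_ok))       # None
```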

Adding default header fields

  • Pass the default_headers parameter when instantiating
Example 1
  • Send the cookie in the headers of every request
spider = WauoSpider(default_headers={'Cookie': 'Your Cookies'})
resp = spider.send('https://github.com/markadc')
print(resp.request.headers)
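
Conceptually, default headers are merged into the headers of each request. The sketch below shows one plausible merge rule. merge_headers is a hypothetical name, and letting per-request headers override the defaults is an assumption about wauo, not documented behavior.

```python
def merge_headers(default_headers, headers=None):
    """Hypothetical helper: combine defaults with per-request headers."""
    merged = dict(default_headers)  # copy so the defaults are never mutated
    merged.update(headers or {})    # assumption: per-request values win
    return merged


defaults = {'Cookie': 'Your Cookies', 'User-Agent': 'demo-agent'}
merged = merge_headers(defaults, {'User-Agent': 'custom-agent'})
print(merged)
```

A plain dict merge is case-sensitive, whereas HTTP header names are not, so a real implementation would likely normalize the keys first.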

Download files

Source Distribution

wauo-0.5.2.tar.gz (4.3 kB view details)

Uploaded Source

File details

Details for the file wauo-0.5.2.tar.gz.

File metadata

  • Download URL: wauo-0.5.2.tar.gz
  • Upload date:
  • Size: 4.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.10.13

File hashes

Hashes for wauo-0.5.2.tar.gz
  • SHA256: 938c8d31ada4221ce0a5d6f2f7e61d8a4b1ff796e3b6ff0447b156b1e86fd658
  • MD5: 847530556dfff5602c5d24d2978f27b2
  • BLAKE2b-256: efa70e4c832853d81c05a5ad7cbcddbb50cfbf3e9b9873dc9b384724cab7d23a
