kkrobots: a guardian for safe web crawling

Project description

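About this project

kkrobots is a tool for keeping a crawler compliant: before fetching any URL, call the can_crawl() method of a Parse object to check whether the request is allowed by the site's robots.txt rules.

Usage

Usage is simple. Call can_crawl() before each request your crawler makes: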

from kkrobots import Parse

if __name__ == '__main__':
    parse = Parse(
        user_agent='your spider',
        # Any URL on the target site will do
        test_url='https://xxxx.com/xxx/xxx/xxx'
    )

    can_crawl = parse.can_crawl('https://xxxx.com/xxx/xxx')

    # Run your crawling logic below
    if can_crawl:
        pass
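
To illustrate what a robots.txt check of this kind involves, here is a minimal sketch built on Python's standard-library urllib.robotparser. It is not kkrobots' implementation, which this page does not describe. The robots.txt content, the example.com URLs, and the helper names build_parser and is_allowed are assumptions made for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch needs no network access.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text that is already in memory."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the parsed rules permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# Illustrative URLs: one outside the disallowed path, one inside it.
allowed = is_allowed(parser, "your spider", "https://example.com/public/page")
blocked = is_allowed(parser, "your spider", "https://example.com/private/page")

print(allowed, blocked)
```

In a real crawler the rules would be fetched from the site's /robots.txt rather than inlined. With kkrobots, the Parse object shown above is what you would call instead of these helpers.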

About the author

WeChat official account: Python卡皮巴拉

🌟 [Python卡皮巴拉]: your Python training handbook. The "mythical beast" of the coding world has arrived! 🌟

Download files

Source Distribution

kkrobots-1.0.1.tar.gz (4.6 kB)

Uploaded Source

Built Distribution

kkrobots-1.0.1-py3-none-any.whl (7.3 kB)

Uploaded Python 3

File details

Details for the file kkrobots-1.0.1.tar.gz.

File metadata

  • Download URL: kkrobots-1.0.1.tar.gz
  • Upload date:
  • Size: 4.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.5

File hashes

Hashes for kkrobots-1.0.1.tar.gz
Algorithm Hash digest
SHA256 7afddd96714cec3c18a88749ff927ebc8a5f7eb8bb8b4f29168614374379b3b4
MD5 a81407f5ca9cec1105c6b73b6d30673d
BLAKE2b-256 dbea56d7df88019a6fd22b5643305ee3565c4517874b33a69b3f27fc0d1fc545

File details

Details for the file kkrobots-1.0.1-py3-none-any.whl.

File metadata

  • Download URL: kkrobots-1.0.1-py3-none-any.whl
  • Upload date:
  • Size: 7.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.5

File hashes

Hashes for kkrobots-1.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 0f7c59436f14e3800989796816d7e636eaa1d6c998afcb05298f47014e6657fb
MD5 aa3de55a83a155a0bd03d01cbaa4133e
BLAKE2b-256 25c8460d0730f334085c98a239550b6268dcd9f68bd1ffb1e3de55e0cec69e8e
