A simple crawler for Baidu Baike

Project description

baike-spider

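⚠️ This crawler is for learning purposes only. It must not be used for any illegal purpose or to infringe on the lawful rights and interests of others. ⚠️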

Procuratorate Daily (检察日报): requirements to comply with when crawling data


Installation

pip install baike-spider
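
After installing, it can be useful to confirm that the module is importable before running anything else. The snippet below is a minimal sketch using only the standard library; the helper name `is_installed` is made up for illustration, and the module name `baikes` is taken from the import statements in the usage examples.

```python
import importlib.util


def is_installed(module_name: str = "baikes") -> bool:
    """Return True if the named top-level module can be located for import."""
    # find_spec returns None when no such top-level module is found.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("baikes importable:", is_installed())
```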

Usage


Module

Parse everything at once: this mode parses all of the data in a single pass and stores it in the object's attributes.

from baikes import Baike

# "网络爬虫" is the entry title ("web crawler")
baike = Baike("网络爬虫")

print(baike.album)
print(baike.intro)
print(baike.paragraphs)
# ...

Partial parsing: when you only need part of the data, this mode avoids some of the performance overhead.

from baikes import Baike

baike = Baike("网络爬虫", once=False)
intro = baike.get_intro()

print(intro)
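
The difference between the two modes can be pictured with a small stand-alone sketch. The class below is hypothetical and is not the `baikes` implementation: `LazyPage`, `parse_calls`, and the line-based "parsing" are assumptions chosen only to show how an `once` flag can switch between parsing everything in the constructor and parsing a field only when its getter is called.

```python
class LazyPage:
    """Hypothetical stand-in for a page object; not the baikes implementation."""

    def __init__(self, text: str, once: bool = True):
        self._text = text
        self.parse_calls = 0  # counts how many fields were actually parsed
        self.intro = None
        self.paragraphs = None
        if once:
            # Eager mode: parse every field up front and keep it on the object.
            self.intro = self.get_intro()
            self.paragraphs = self.get_paragraphs()

    def _lines(self) -> list[str]:
        # Toy "parser": every non-empty line is one unit of content.
        return [line.strip() for line in self._text.splitlines() if line.strip()]

    def get_intro(self) -> str:
        """Parse and return only the introduction (the first non-empty line)."""
        self.parse_calls += 1
        lines = self._lines()
        return lines[0] if lines else ""

    def get_paragraphs(self) -> list[str]:
        """Parse and return only the body paragraphs (everything after the intro)."""
        self.parse_calls += 1
        return self._lines()[1:]


sample = "Intro line\nParagraph one\nParagraph two"

eager = LazyPage(sample)             # both fields parsed immediately
lazy = LazyPage(sample, once=False)  # nothing parsed yet
print(eager.parse_calls, lazy.parse_calls)
print(lazy.get_intro(), lazy.parse_calls)
```

In the lazy case only the getter that is actually called does any work, which is where the saving comes from when a single field is needed.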

Sometimes several entries share the same name. The category parameter restricts the lookup to a given entry category:

from baikes import Baike

# Look up "黄蜂" (wasp), restricted to the "动物" (animal) category
baike = Baike("黄蜂", category="动物")
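
As a rough illustration of what restricting by category means, the sketch below picks one entry out of several same-named candidates. It is an assumption about the general idea, not how `baikes` resolves entries: the function `pick_entry`, the dictionary layout, and the sample candidates are all made up for this example.

```python
from typing import Optional


def pick_entry(candidates: list[dict], category: Optional[str] = None) -> Optional[dict]:
    """Return the first candidate whose description mentions `category`.

    With no category given, fall back to the first candidate.
    Returns None when nothing matches.
    """
    if not candidates:
        return None
    if category is None:
        return candidates[0]
    for entry in candidates:
        if category in entry["description"]:
            return entry
    return None


# Made-up sample data: two entries that share the title "黄蜂".
candidates = [
    {"title": "黄蜂", "description": "球队"},  # "sports team"
    {"title": "黄蜂", "description": "动物"},  # "animal"
]

print(pick_entry(candidates, category="动物"))
```

Without a category the first candidate wins, which may not be the one you meant; passing the category makes the choice explicit.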

Command line

The crawler can also be invoked from the command line.

Example:

# Fetch everything
python -m baikes -n "网络爬虫"

# Restrict the entry category
python -m baikes -n "黄蜂" -c "动物"

# Parse everything at once:
# get the Baike card
python -m baikes -n "网络爬虫" card

# Partial parsing:
# get the Baike card via get_card
python -m baikes -n "网络爬虫" -o False get_card
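
To make the shape of these commands concrete, here is a minimal `argparse` sketch that accepts the same short flags (`-n`, `-c`, `-o`) and one optional positional argument. It is not the actual `baikes` command-line code: the long option names, the `target` positional, and the `str_to_bool` helper are assumptions introduced for illustration.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Convert text such as "False" into a real boolean.

    Using type=bool would not work here: bool("False") is True,
    because any non-empty string is truthy.
    """
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser that mirrors the flags shown in the examples above.
    parser = argparse.ArgumentParser(prog="baikes-sketch")
    parser.add_argument("-n", "--name", required=True, help="entry name to look up")
    parser.add_argument("-c", "--category", default=None, help="entry category")
    parser.add_argument("-o", "--once", type=str_to_bool, default=True,
                        help="parse everything at once (True) or on demand (False)")
    parser.add_argument("target", nargs="?", default=None,
                        help="attribute or getter to print, e.g. card or get_card")
    return parser


args = build_parser().parse_args(["-n", "网络爬虫", "-o", "False", "get_card"])
print(args.name, args.once, args.target)
```

The explicit string-to-boolean conversion is the design point worth noting: a flag written as `-o False` arrives as text, so it has to be interpreted rather than passed straight to `bool`.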

Download files

Source Distribution

baikes-0.1.1.tar.gz (6.6 kB)

Uploaded Source

Built Distribution

baikes-0.1.1-py3-none-any.whl (6.6 kB)

Uploaded Python 3

File details

Details for the file baikes-0.1.1.tar.gz.

File metadata

  • Download URL: baikes-0.1.1.tar.gz
  • Upload date:
  • Size: 6.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.10

File hashes

Hashes for baikes-0.1.1.tar.gz
Algorithm Hash digest
SHA256 d1f5a43318d79d264806bfa2078bba750f621946836b868a7b675a2f5eff1442
MD5 f02f25891dd0402d0f9897a8c1b7ea5b
BLAKE2b-256 8edaf9c7f319727b59241ebcc169f9c15ca36b05e2c031152347bc2c763caf93


File details

Details for the file baikes-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: baikes-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 6.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.10

File hashes

Hashes for baikes-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 a15ae7aeee5372ab67d887c0703a531bd9021e205ab42317feed91e8f69616b8
MD5 07f5580be84961775840ad37081bebbe
BLAKE2b-256 b4b6ad4c4f1ccc07a080307ce56c7d9d63981e1b1083d5415acb0ba939614b56

