A simple crawler for Baidu Baike

baikeS

baidu-baike-spider

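⚠️ This crawler is for learning purposes only. It must not be used for any illegal purpose or to infringe on the lawful rights and interests of others. ⚠️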

Procuratorate Daily (检察日报): data crawling must comply with


Installation

pip install baikes

Usage

Module

Fetch the data you want:

from baikes import Baike

baike = Baike("网络爬虫")

print(baike.album)
print(baike.intro)
print(baike.paragraphs)
# ...

A Baike object has the following attributes:

Attribute   Type         Description
name        str          Entry name
title       str          Entry title
album       str          URL of the overview image
intro       str          Summary
card        OrderedDict  Knowledge card
paragraphs  OrderedDict  Descriptive paragraphs
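
The attributes above can be gathered into a plain dictionary, for example to save an entry as JSON. The sketch below is only an illustration and not part of baikes: the helper name `entry_to_dict` and the sample values are hypothetical, and a `SimpleNamespace` stands in for a real `Baike` object so the snippet runs without network access. It assumes the values stored in `card` and `paragraphs` are JSON-serializable, which this page does not confirm.

```python
import json
from collections import OrderedDict
from types import SimpleNamespace

# Attribute names taken from the table above.
FIELDS = ("name", "title", "album", "intro", "card", "paragraphs")


def entry_to_dict(entry):
    """Copy the documented attributes of a Baike-like object into a dict."""
    return {field: getattr(entry, field) for field in FIELDS}


# Stand-in object with made-up values (hypothetical data); a real Baike
# instance would be passed to entry_to_dict() in the same way.
sample = SimpleNamespace(
    name="example",
    title="Example entry",
    album="https://example.com/cover.jpg",
    intro="A short summary.",
    card=OrderedDict([("Type", "demo")]),
    paragraphs=OrderedDict([("Overview", "First paragraph.")]),
)

data = entry_to_dict(sample)
# ensure_ascii=False keeps Chinese text readable instead of \u escapes.
text = json.dumps(data, ensure_ascii=False, indent=2)
print(text)
```

With a real entry, `entry_to_dict(Baike("网络爬虫"))` would presumably work the same way, provided the assumption about value types holds.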

Sometimes several entries share the same name. The category parameter restricts the lookup to a specific entry category:

from baikes import Baike

baike = Baike("黄蜂", category="动物")
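
When several terms need to be looked up, some with a category and some without, the two call forms shown above can be wrapped in one loop. This is a rough sketch under stated assumptions: `load_entries` and its `factory` parameter are hypothetical names introduced here, not part of baikes. The `factory` argument exists only so the helper can be exercised with a stand-in; by default it falls back to `Baike`.

```python
def load_entries(queries, factory=None):
    """Build one entry per (name, category) pair; category may be None."""
    if factory is None:
        # Imported lazily so the helper can be tried without baikes installed.
        from baikes import Baike
        factory = Baike

    entries = {}
    for name, category in queries:
        if category is None:
            # Plain lookup, as in Baike("网络爬虫").
            entries[(name, category)] = factory(name)
        else:
            # Lookup restricted to one category, as in
            # Baike("黄蜂", category="动物").
            entries[(name, category)] = factory(name, category=category)
    return entries


# Example call (requires baikes and network access):
# entries = load_entries([("网络爬虫", None), ("黄蜂", "动物")])
```

Each call creates a separate `Baike` object, so a long list of queries means a correspondingly long series of lookups; keep the list short and stay within the usage warning at the top of this page.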

Command line

The crawler can also be invoked from the command line.

Examples:

# Fetch everything
python -m baikes -n "网络爬虫"

# Restrict the entry category
python -m baikes -n "黄蜂" -c "动物"

# Fetch the knowledge card
python -m baikes -n "网络爬虫" card
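
The same commands can be driven from a Python script through `subprocess`. The following is a minimal sketch, assuming only the `-n` and `-c` options and the trailing `card` argument shown above. The helper names `build_command` and `run_baikes` are hypothetical, and the format of the CLI's output is not documented here, so the sketch returns it as raw text.

```python
import subprocess
import sys


def build_command(name, category=None, field=None):
    """Build the argument list for `python -m baikes`."""
    # sys.executable runs the module with the current interpreter.
    cmd = [sys.executable, "-m", "baikes", "-n", name]
    if category is not None:
        cmd += ["-c", category]
    if field is not None:
        # "card" is the only trailing argument shown in the examples above;
        # other values are untested assumptions.
        cmd.append(field)
    return cmd


def run_baikes(name, category=None, field=None):
    """Run the CLI and return its standard output as text."""
    result = subprocess.run(
        build_command(name, category, field),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


# Show the command that would be run, without running it.
print(build_command("网络爬虫", field="card"))
```

Passing the arguments as a list avoids shell quoting issues with non-ASCII entry names.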
