
A simple Baidu Baike crawler


baike-spider


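⚠️ This crawler is for learning purposes only. It must not be used for any illegal purpose or to infringe on the lawful rights and interests of others. ⚠️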

Procuratorial Daily (检察日报): crawling data must comply with


Installation

pip install baike-spider

Usage


Module

Fetch the data you want:

from baikes import Baike

baike = Baike("网络爬虫")  # look up the entry "网络爬虫" (web crawler)

print(baike.album)       # overview image URL
print(baike.intro)       # summary
print(baike.paragraphs)  # body paragraphs
# ...

A Baike object has the following attributes:

Attribute    Type          Description
name         str           Entry name
title        str           Entry title
album        str           Overview image URL
intro        str           Summary
card         OrderedDict   Knowledge card
paragraphs   OrderedDict   Body paragraphs
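As a minimal sketch of how these attributes might be consumed together, the helper below renders any object exposing them as plain text. The function name `summarize`, the `max_intro` parameter, and the stand-in `demo` object with its values are all hypothetical. The sketch also assumes that `card` maps a field label to its value and that `paragraphs` maps a section heading to its text, which the table above does not spell out.

```python
from collections import OrderedDict
from types import SimpleNamespace


def summarize(entry, max_intro=80):
    """Render a Baike-like object as plain text using only the documented attributes."""
    intro = entry.intro or ""
    if len(intro) > max_intro:
        intro = intro[:max_intro] + "..."
    lines = [f"{entry.title} ({entry.name})", intro]
    if entry.album:
        lines.append(f"image: {entry.album}")
    # Assumption: card maps a field label to its value.
    for label, value in entry.card.items():
        lines.append(f"{label}: {value}")
    # Assumption: paragraphs maps a section heading to its text.
    for heading in entry.paragraphs:
        lines.append(f"section: {heading}")
    return "\n".join(lines)


# Stand-in object with made-up values so the sketch runs offline.
# With the real package this would be: entry = Baike("网络爬虫")
demo = SimpleNamespace(
    name="demo",
    title="Demo entry",
    album="",
    intro="A short summary.",
    card=OrderedDict([("Field", "value")]),
    paragraphs=OrderedDict([("History", "text"), ("Usage", "text")]),
)
print(summarize(demo))
```

Because the helper only reads attributes, it can be exercised offline with a stand-in and then handed a real `Baike` instance unchanged, provided the assumptions about `card` and `paragraphs` hold.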

Sometimes several entries share the same name. The category parameter restricts the lookup to a specific entry category:

from baikes import Baike

# "黄蜂" (wasp) is ambiguous, so restrict the lookup to the "动物" (animal) category
baike = Baike("黄蜂", category="动物")

Command line

The crawler can also be invoked from the command line.

Example:

# Fetch everything
python -m baikes -n "网络爬虫"

# Restrict the entry category
python -m baikes -n "黄蜂" -c "动物"

# Fetch only the knowledge card
python -m baikes -n "网络爬虫" card
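The same invocations could also be driven from a Python script. The sketch below builds the argument list using only the options shown above (`-n`, `-c`, and a trailing field name such as `card`) and runs it with the standard `subprocess` module. The helper names `build_command` and `run_baikes` are hypothetical, and the sketch assumes that `-c` and a field name can be combined and that the command prints its result to standard output, neither of which the examples above confirm.

```python
import subprocess
import sys


def build_command(name, category=None, field=None):
    """Build the argument list for `python -m baikes` from the documented options."""
    cmd = [sys.executable, "-m", "baikes", "-n", name]
    if category is not None:
        cmd += ["-c", category]  # restrict the entry category
    if field is not None:
        cmd.append(field)        # e.g. "card" to fetch only the knowledge card
    return cmd


def run_baikes(name, category=None, field=None):
    """Run the CLI and return whatever it writes to stdout (format not documented here)."""
    result = subprocess.run(
        build_command(name, category, field),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


# Show the arguments without running anything (no network access needed).
print(build_command("黄蜂", category="动物")[1:])
```

Passing the arguments as a list avoids shell quoting problems with non-ASCII entry names, and `sys.executable` keeps the call in the same interpreter where the package is installed.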


