baike-spider

A simple crawler for Baidu Baike (百度百科)

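⚠️ This crawler is for learning purposes only. It must not be used for any illegal purpose or to infringe on the lawful rights and interests of others. ⚠️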

Procuratorate Daily (检察日报): data crawling requires compliance


Installation

pip install baike-spider

Usage


Module

Retrieve the data you want:

from baikes import Baike

baike = Baike("网络爬虫")

print(baike.album)
print(baike.intro)
print(baike.paragraphs)
# ...

A Baike object exposes the following attributes:

Attribute    Type          Description
name         str           Entry name
title        str           Entry title
album        str           Overview image URL
intro        str           Summary
card         OrderedDict   Knowledge card
paragraphs   OrderedDict   Descriptive paragraphs
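
As a rough illustration of working with these attributes, the sketch below copies them into a plain dict and serializes the result as JSON. The helper names `baike_to_dict` and `baike_to_json` are hypothetical and not part of the package, and the `sample` object is a stand-in with made-up values so the sketch runs without network access. It only assumes the attribute names listed in the table.

```python
import json
from collections import OrderedDict
from types import SimpleNamespace

# Attribute names taken from the table above.
BAIKE_FIELDS = ("name", "title", "album", "intro", "card", "paragraphs")


def baike_to_dict(entry):
    """Copy the documented attributes of a Baike-like object into a plain dict."""
    return {field: getattr(entry, field, None) for field in BAIKE_FIELDS}


def baike_to_json(entry):
    """Serialize the entry; ensure_ascii=False keeps Chinese text readable."""
    return json.dumps(baike_to_dict(entry), ensure_ascii=False, indent=2)


# Stand-in object with made-up values (an assumption, not real crawler output).
# With the package installed, Baike("网络爬虫") would be passed instead.
sample = SimpleNamespace(
    name="网络爬虫",
    title="网络爬虫",
    album="https://example.com/cover.jpg",
    intro="placeholder summary",
    card=OrderedDict([("中文名", "网络爬虫")]),
    paragraphs=OrderedDict([("section", "placeholder text")]),
)

print(baike_to_json(sample))
```

Reading the attributes through `getattr` with a default keeps the helper tolerant of a missing attribute, which may or may not happen with the real class.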

Sometimes several entries share the same name. The category parameter restricts the lookup to a particular entry category:

from baikes import Baike

baike = Baike("黄蜂", category="动物")
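
When the same entry and category are looked up repeatedly, it may be worth caching the result so the site is not requested again. The following is a minimal sketch of that idea using the standard library's `functools.lru_cache`. `make_cached_lookup` and `fake_baike` are hypothetical names: the former wraps any constructor with the same call shape as `Baike`, and the latter is an offline stand-in used only to show the behavior.

```python
from functools import lru_cache


def make_cached_lookup(factory, maxsize=128):
    """Wrap a Baike-like constructor so repeated lookups reuse the same object."""

    @lru_cache(maxsize=maxsize)
    def lookup(name, category=None):
        # Only pass category when it is given, mirroring the two call forms above.
        if category is None:
            return factory(name)
        return factory(name, category=category)

    return lookup


# Hypothetical stand-in for the real class so the sketch runs offline.
calls = []


def fake_baike(name, category=None):
    calls.append((name, category))
    return {"name": name, "category": category}


lookup = make_cached_lookup(fake_baike)
lookup("黄蜂", category="动物")
lookup("黄蜂", category="动物")  # same arguments: served from the cache
lookup("黄蜂")                   # different arguments: the factory runs again
print(calls)
```

With the package installed, `make_cached_lookup(Baike)` would be the corresponding call, assuming the constructor accepts the arguments shown in the examples above.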

Command line

The crawler can also be invoked from the command line.

Example:

# Fetch everything
python -m baikes -n "网络爬虫"

# Restrict the entry category
python -m baikes -n "黄蜂" -c "动物"

# Fetch the knowledge card
python -m baikes -n "网络爬虫" card
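
The same commands can be driven from a Python script. The sketch below builds the argument list from the `-n` and `-c` flags and the trailing field name shown in the examples above, then runs it with `subprocess`. `build_command` and `run_baikes` are hypothetical helpers, and placing the field name after `-c` is an assumption, since the examples do not show both together.

```python
import subprocess
import sys


def build_command(name, category=None, field=None):
    """Build the argv list for `python -m baikes` from the documented options."""
    argv = [sys.executable, "-m", "baikes", "-n", name]
    if category is not None:
        argv += ["-c", category]
    if field is not None:
        # Assumption: the field name (e.g. "card") goes last.
        argv.append(field)
    return argv


def run_baikes(name, category=None, field=None):
    """Run the CLI and return its standard output (requires the package)."""
    result = subprocess.run(
        build_command(name, category, field),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Show the arguments without running anything; the interpreter path is skipped.
print(build_command("网络爬虫", field="card")[1:])
```

Passing the arguments as a list avoids shell quoting problems with non-ASCII entry names, and `check=True` raises `subprocess.CalledProcessError` if the command exits with a non-zero status.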
