Project description

DataHarvest

DataHarvest is a tool for searching, crawling, and cleaning data.

Data crawling & cleaning

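| Site | Content | URL pattern | Crawl | Clean |
|---|---|---|---|---|
| Baidu Baike | Entry | baike.baidu.com/item | | |
| Baidu Baijiahao | Article | baijiahao.baidu.com/s | | |
| Bilibili | Article | www.bilibili.com/read | | |
| Tencent News | Article | new.qq.com/rain/a | | |
| 360doc Personal Library | Article | www.360doc.com/content | | |
| 360 Baike | Entry | baike.so.com/doc | | |
| Sogou Baike | Entry | baike.sogou.com/v | | |
| Sohu | Article | www.sohu.com/a | | |
| Toutiao | Article | www.toutiao.com/article | | |
| NetEase | Article | www.163.com/\w+/article/.+ | | |
| WeChat Official Accounts | Article | weixin.qq.com/s | | |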
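As an illustration of how the URL patterns in the table above could be used to route a URL to the right site handler, here is a minimal sketch. It is not DataHarvest's own API: the `URL_PATTERNS` dictionary, the `match_site` function, and the English site labels are hypothetical names chosen for this example, and escaping the literal dots in the patterns is an assumption about how they are meant to be read.

```python
import re
from typing import Optional

# URL patterns taken from the table above. Literal dots are escaped here,
# which is an assumption; the NetEase entry is already a regular expression.
# The site labels are illustrative keys, not identifiers from DataHarvest.
URL_PATTERNS = {
    "Baidu Baike": r"baike\.baidu\.com/item",
    "Baidu Baijiahao": r"baijiahao\.baidu\.com/s",
    "Bilibili": r"www\.bilibili\.com/read",
    "Tencent News": r"new\.qq\.com/rain/a",
    "360doc Personal Library": r"www\.360doc\.com/content",
    "360 Baike": r"baike\.so\.com/doc",
    "Sogou Baike": r"baike\.sogou\.com/v",
    "Sohu": r"www\.sohu\.com/a",
    "Toutiao": r"www\.toutiao\.com/article",
    "NetEase": r"www\.163\.com/\w+/article/.+",
    "WeChat Official Accounts": r"weixin\.qq\.com/s",
}


def match_site(url: str) -> Optional[str]:
    """Return the label of the first pattern found in `url`, or None."""
    for site, pattern in URL_PATTERNS.items():
        if re.search(pattern, url):
            return site
    return None
```

For example, `match_site("https://baike.baidu.com/item/Python")` would select the Baidu Baike entry, while a URL that matches none of the patterns yields `None`.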

Installation and usage

pip install DataHarvest

Download files

Source Distribution

dataharvest-0.1.7.tar.gz (7.9 kB)

Uploaded Source

Built Distribution

dataharvest-0.1.7-py3-none-any.whl (14.4 kB)

Uploaded Python 3

File details

Details for the file dataharvest-0.1.7.tar.gz.

File metadata

  • Download URL: dataharvest-0.1.7.tar.gz
  • Size: 7.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.10

File hashes

Hashes for dataharvest-0.1.7.tar.gz
| Algorithm | Hash digest |
|---|---|
| SHA256 | ce1884efb151cf3cb864cc7defc47bf0268a1053621fccff22f2ea28a1e8ec73 |
| MD5 | 7d40e0d76bc3dcc3c7ca197052994989 |
| BLAKE2b-256 | 1285cd026923d2142aa058038cea4d2997cafe8e4ab8427f96a30a0b2742b134 |

File details

Details for the file dataharvest-0.1.7-py3-none-any.whl.

File metadata

  • Download URL: dataharvest-0.1.7-py3-none-any.whl
  • Size: 14.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.10

File hashes

Hashes for dataharvest-0.1.7-py3-none-any.whl
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8e4e3cd637b8c1bf7bac5e0713a0ed3cbfa205fb6173d10a6beb57fddc89f62d |
| MD5 | e03269d512b315e8f80e51108b482ee6 |
| BLAKE2b-256 | d2afabfe252dc562a2f5b382c8c6ac22065535bcfcb736a8454a191021614d30 |
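
The SHA256 digests listed above can be used to check that a downloaded file is intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `sha256_of` and `verify` and the `EXPECTED_SHA256` mapping are hypothetical, and the sketch assumes the files have already been downloaded under their original names.

```python
import hashlib

# SHA256 digests copied from the file listings above.
EXPECTED_SHA256 = {
    "dataharvest-0.1.7.tar.gz":
        "ce1884efb151cf3cb864cc7defc47bf0268a1053621fccff22f2ea28a1e8ec73",
    "dataharvest-0.1.7-py3-none-any.whl":
        "8e4e3cd637b8c1bf7bac5e0713a0ed3cbfa205fb6173d10a6beb57fddc89f62d",
}


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected_hex: str) -> bool:
    """Return True if the file's SHA256 digest equals the expected value."""
    return sha256_of(path) == expected_hex.strip().lower()
```

A call such as `verify("dataharvest-0.1.7.tar.gz", EXPECTED_SHA256["dataharvest-0.1.7.tar.gz"])` would then report whether the local copy matches the published digest.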
