An efficient information processing program.

Project description

cmip

Installation

pip install cmip

Usage

1. Asynchronous web scraper for dynamically rendered pages

Example:

import asyncio

from cmip.web import web_scraping

urls = [
    "https://baidu.com",
    "https://qq.com",
    # ... more URLs
]

# Scrape every URL concurrently and write the results under "output"
asyncio.run(web_scraping(urls, output_path="output", max_concurrent_tasks=10, save_image=True, min_img_size=200))

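Parameters:

urls: page URLs to scrape (each must include the protocol scheme, e.g. https://)
output_path: output path
max_concurrent_tasks: maximum number of tasks executed at the same time; adjust it to your machine's resources and network conditions
save_image: whether to save images
min_img_size: images smaller than this value are not scraped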
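
To illustrate what a setting like max_concurrent_tasks controls, here is a minimal sketch of bounded concurrency using asyncio.Semaphore. It is not cmip's implementation, which this page does not show; run_bounded, PeakTracker, and fake_fetch are hypothetical names introduced only for this example, and fake_fetch stands in for a real page fetch.

```python
import asyncio


async def run_bounded(items, worker, max_concurrent_tasks=10):
    """Run worker(item) for every item with at most max_concurrent_tasks in flight."""
    semaphore = asyncio.Semaphore(max_concurrent_tasks)

    async def guarded(item):
        # Each task waits here until one of the slots is free.
        async with semaphore:
            return await worker(item)

    # gather preserves the order of the inputs in its result list.
    return await asyncio.gather(*(guarded(item) for item in items))


class PeakTracker:
    """Hypothetical stand-in for a fetcher; records the peak number of active calls."""

    def __init__(self):
        self.active = 0
        self.peak = 0

    async def fake_fetch(self, url):
        self.active += 1
        self.peak = max(self.peak, self.active)
        await asyncio.sleep(0.01)  # simulate network latency
        self.active -= 1
        return url


if __name__ == "__main__":
    tracker = PeakTracker()
    demo_urls = [f"https://example.com/{i}" for i in range(10)]
    asyncio.run(run_bounded(demo_urls, tracker.fake_fetch, max_concurrent_tasks=3))
    print("peak concurrency:", tracker.peak)
```

A lower limit reduces memory and bandwidth pressure at the cost of total run time, which is why the value is best tuned per machine and network.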

Download files

Source Distribution

cmip-0.0.5.tar.gz (23.0 kB)

Uploaded Source

Built Distribution

cmip-0.0.5-py3-none-any.whl (32.0 kB)

Uploaded Python 3

File details

Details for the file cmip-0.0.5.tar.gz.

File metadata

  • Download URL: cmip-0.0.5.tar.gz
  • Upload date:
  • Size: 23.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.6

File hashes

Hashes for cmip-0.0.5.tar.gz
Algorithm    Hash digest
SHA256       fdd41e44634850e120025bd8c2460f112b3459b4ae43ee3c9f37adf06d9e2935
MD5          6dc82c782ac8f816ef3e11f8bd1632cd
BLAKE2b-256  5d8fe74db412699ac00595f85ab4492e357fece7a67efbfa7d22c3c664ad018c
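
A downloaded file can be checked against a published digest before it is installed. The following is a minimal sketch using Python's standard hashlib module; sha256_of_file and matches_published_hash are hypothetical helper names, and the local path in the usage lines assumes the archive was saved to the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed local file name; the digest is the SHA256 value listed for the sdist.
    expected = "fdd41e44634850e120025bd8c2460f112b3459b4ae43ee3c9f37adf06d9e2935"
    print(matches_published_hash("cmip-0.0.5.tar.gz", expected))
```

SHA256 is the digest to prefer here; MD5 is useful only for detecting accidental corruption.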

File details

Details for the file cmip-0.0.5-py3-none-any.whl.

File metadata

  • Download URL: cmip-0.0.5-py3-none-any.whl
  • Upload date:
  • Size: 32.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.6

File hashes

Hashes for cmip-0.0.5-py3-none-any.whl
Algorithm    Hash digest
SHA256       78a31425436e22c7564d7d27ff11a36f7ca54bd1bcc111db2f93beb14fbce499
MD5          ee452d91d237fc3d911c890a1f4d2665
BLAKE2b-256  433ad95431900ef6dec02aff2581988a8fc64eaabd35ce2480e0e3e2419c964f
