bfg-crawl

A simple web crawler that fetches pages and stores them in SQLite.

Features

  • Multi-threaded crawling
  • Rate limiting support
  • SQLite database storage
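
As a rough illustration of how multi-threaded crawling and rate limiting can fit together, here is a minimal sketch using only the standard library. It is not bfg-crawl's implementation: ThrottledLoader, fake_load, crawl_all, and the min_interval parameter are hypothetical names chosen for this example, and the callable-loader shape is an assumption.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class ThrottledLoader:
    """Hypothetical wrapper (not bfg-crawl's RateLimitingLoader):
    spaces out calls to an inner load function across threads."""

    def __init__(self, load, min_interval=0.05):
        self._load = load
        self._min_interval = min_interval
        self._lock = threading.Lock()
        self._next_slot = 0.0  # monotonic time at which the next call may start

    def __call__(self, url):
        # Reserve a start slot under the lock, then sleep outside it so
        # other threads can reserve their own slots in the meantime.
        with self._lock:
            now = time.monotonic()
            start = max(now, self._next_slot)
            self._next_slot = start + self._min_interval
        delay = start - now
        if delay > 0:
            time.sleep(delay)
        return self._load(url)


def fake_load(url):
    # Stand-in for a real HTTP fetch so the sketch runs offline.
    return f"<html>{url}</html>"


def crawl_all(urls, loader, concurrency=5):
    # Fan the URLs out over a thread pool; results keep input order.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(loader, urls))
```

Reserving a slot under the lock and sleeping outside it keeps the worker threads from serializing on the lock while still enforcing a minimum gap between request starts.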

Install

pip install bfg-crawl

Usage

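from crawl import Crawler, RateLimitingLoader

# ALoaderForThisWebsite is a placeholder: substitute your own loader for the site being crawled.
loader = ALoaderForThisWebsite()
# "pages.db" is the SQLite database file; the loader is wrapped so requests are rate limited.
crawler = Crawler("pages.db", RateLimitingLoader(loader), concurrency=5)

crawler.run()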

The crawler reads URLs from a SQLite database and saves the page content back to the database.
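
The read-then-write-back pattern described above can be sketched with the standard sqlite3 module. The schema here (a pages table with url and content columns, content left NULL until fetched) and the helper names init_db, crawl_pending, and fake_load are assumptions made for illustration; the actual schema bfg-crawl expects is not documented on this page.

```python
import sqlite3


def init_db(conn, urls):
    # Hypothetical schema: one row per URL; content stays NULL until fetched.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, content TEXT)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO pages (url) VALUES (?)", [(u,) for u in urls]
    )
    conn.commit()


def crawl_pending(conn, load):
    # Read every URL that has no content yet, fetch it, and write it back.
    pending = [
        row[0]
        for row in conn.execute(
            "SELECT url FROM pages WHERE content IS NULL ORDER BY url"
        )
    ]
    for url in pending:
        conn.execute(
            "UPDATE pages SET content = ? WHERE url = ?", (load(url), url)
        )
    conn.commit()
    return len(pending)


def fake_load(url):
    # Stand-in for a real HTTP fetch so the sketch runs offline.
    return f"<html>{url}</html>"
```

Because fetched rows no longer match the NULL filter, running crawl_pending a second time does no work, which makes an interrupted crawl easy to resume.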
