The Crawler
A web crawling utility for downloading files from an exposed filesystem, such as a web file explorer or directory listing.
Installation
From PyPI
This assumes you have Python 3.10+ installed and pip3 is on your path:
~$ pip3 install the-crawler
...
~$ the-crawler -h
usage: the-crawler [-h] [--recurse] [--output-directory OUTPUT_DIRECTORY] [--extensions EXTENSIONS [EXTENSIONS ...]] base_url
Crawls given url for content
positional arguments:
base_url
options:
-h, --help show this help message and exit
--recurse, -r
--output-directory OUTPUT_DIRECTORY, -o OUTPUT_DIRECTORY
--extensions EXTENSIONS [EXTENSIONS ...], -e EXTENSIONS [EXTENSIONS ...]
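To illustrate what the tool does, here is a minimal sketch of the crawl-and-filter step: fetch a directory-listing page, extract its links, and keep only the files whose extensions match. This is an illustration using only the standard library, not the package's actual implementation; the listing HTML and URL below are made up.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects href targets from anchor tags in a directory listing."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def find_files(base_url, listing_html, extensions):
    """Return absolute URLs of links whose extension matches."""
    parser = LinkExtractor()
    parser.feed(listing_html)
    wanted = tuple("." + ext.lstrip(".") for ext in extensions)
    return [
        urljoin(base_url, href)
        for href in parser.links
        if href.lower().endswith(wanted)
    ]


# Hypothetical listing page, as served by a bare web file explorer:
listing = (
    '<a href="../">../</a>'
    '<a href="report.pdf">report.pdf</a>'
    '<a href="notes.txt">notes.txt</a>'
)
print(find_files("http://example.com/files/", listing, ["pdf"]))
# → ['http://example.com/files/report.pdf']
```

A real run would then download each matched URL into `--output-directory`, and with `--recurse` would repeat the process on subdirectory links.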
From Source
This assumes you have git, Python 3.10+, and poetry installed already.
~$ git clone git@gitlab.com:woodforsheep/the-crawler.git
...
~$ cd the-crawler
the-crawler$ poetry install
...
the-crawler$ poetry run the-crawler -h
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
the_crawler-0.2.0.tar.gz (2.6 kB)
Built Distribution
Hashes for the_crawler-0.2.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 5b9b2f0f7d48d326ea84f5860e5cb79f5b41d41571308da450780026f74436e1
MD5 | b7b35895e4f32673a9f05b3f269b0345
BLAKE2b-256 | 8ebb6fc9d97841a0903b420f31a950b0881dabfb65727d402c67f326735558a3
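To check a downloaded distribution against the published hashes, you can compute its digest locally and compare. A minimal sketch using the standard library's hashlib; the sample bytes below are placeholders, not the real wheel's contents:

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Compute the SHA256 hex digest of downloaded bytes."""
    return hashlib.sha256(data).hexdigest()


# Placeholder check: this digest is for the bytes b"hello", not the wheel.
# In practice, read the downloaded file and compare against the table above.
print(sha256_hex(b"hello"))
# → 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
```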