For crawling web file explorers for content
The Crawler
Web crawling utility for downloading files from an exposed filesystem.
Installation
From PyPI
This assumes you have Python 3.10+ installed and pip3 is on your path:
```
~$ pip3 install the-crawler
...
~$ the-crawler -h
usage: the-crawler [-h] [--recurse] [--output-directory OUTPUT_DIRECTORY]
                   [--extensions EXTENSIONS [EXTENSIONS ...]]
                   base_url

Crawls given url for content

positional arguments:
  base_url

options:
  -h, --help            show this help message and exit
  --recurse, -r
  --output-directory OUTPUT_DIRECTORY, -o OUTPUT_DIRECTORY
  --extensions EXTENSIONS [EXTENSIONS ...], -e EXTENSIONS [EXTENSIONS ...]
```
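The core idea behind a tool like this can be sketched in a few lines: fetch an exposed directory-listing page, collect the links it contains, and keep only those matching the requested extensions. The sketch below is illustrative only and is not the package's actual implementation; the class and function names, and the sample HTML, are invented for the example.

```python
# Hypothetical sketch of crawling an exposed directory listing.
# LinkExtractor, filter_by_extension, and the sample HTML are
# illustrative names, not the-crawler's real API.
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects href attributes from anchor tags in a listing page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Directory listings expose their files as plain <a href="..."> links.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def filter_by_extension(links, extensions):
    """Keep only links whose file extension is in the given set."""
    return [link for link in links
            if link.rsplit(".", 1)[-1].lower() in extensions]


# A made-up listing page standing in for a real server response.
listing = """
<html><body>
<a href="../">Parent Directory</a>
<a href="report.pdf">report.pdf</a>
<a href="data.zip">data.zip</a>
<a href="notes.txt">notes.txt</a>
</body></html>
"""

parser = LinkExtractor()
parser.feed(listing)
print(filter_by_extension(parser.links, {"pdf", "zip"}))
# → ['report.pdf', 'data.zip']
```

In terms of the CLI above, a typical invocation along these lines might look like `the-crawler http://example.com/files/ -r -e pdf zip -o downloads` (the URL here is a placeholder).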
From Source
This assumes you already have git, Python 3.10+, and poetry installed:
```
~$ git clone git@gitlab.com:woodforsheep/the-crawler.git
...
~$ cd the-crawler
the-crawler$ poetry install
...
the-crawler$ poetry run the-crawler -h
```