grepurl

extract URLs from websites or local HTML files

Project description

grepurl is a command line tool that extracts URLs from a website (or a local HTML file).

Usage

grepurl http://example.com/ # extract all URLs from links and images
grepurl -a http://example.com/foo.htm # only extract from <a> tags (i.e. links)
grepurl -i http://example.com/bar.htm # only extract from <img> tags (i.e. images)
grepurl -r "\.py$" http://example.com/ # only extract links that end in '.py'
grepurl -r "\.zip$" -d http://example.com/ # download all zip files
grepurl -r "\.zip$" -d -o download_dir http://example.com/ # download all zip files into download_dir
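The behaviour of the options above can be illustrated with a short standard-library sketch. This is not grepurl's actual implementation: the class `URLCollector`, the function `extract_urls`, and their parameters are hypothetical names chosen for illustration, and resolving relative URLs against the page address is an assumption about how such a tool would typically behave.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class URLCollector(HTMLParser):
    """Collect href values from <a> tags and src values from <img> tags."""

    def __init__(self, base_url, links=True, images=True):
        super().__init__()
        self.base_url = base_url
        self.links = links      # roughly what -a selects (assumption)
        self.images = images    # roughly what -i selects (assumption)
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.links and tag == "a" and attrs.get("href"):
            # Resolve relative links against the page URL (assumption).
            self.urls.append(urljoin(self.base_url, attrs["href"]))
        elif self.images and tag == "img" and attrs.get("src"):
            self.urls.append(urljoin(self.base_url, attrs["src"]))


def extract_urls(html, base_url, links=True, images=True, pattern=None):
    """Return URLs found in html, optionally filtered by a regex (like -r)."""
    collector = URLCollector(base_url, links, images)
    collector.feed(html)
    if pattern is None:
        return collector.urls
    regex = re.compile(pattern)
    return [url for url in collector.urls if regex.search(url)]


sample = '<a href="/setup.py">code</a> <img src="logo.png"> <a href="data.zip">zip</a>'
print(extract_urls(sample, "http://example.com/"))
print(extract_urls(sample, "http://example.com/", pattern=r"\.py$"))
```

In this sketch the regular expression is matched with `search`, so a pattern such as `\.py$` only needs to match the end of the URL rather than the whole string.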

Installation using pip

pip install grepurl

Installation from repository

git clone https://github.com/arne-cl/grepurl
cd grepurl
pip install -e .

License

GPLv2 or later.

Authors

Gerome Fournier (original author). His implementation is only available via the Internet Archive.

Arne Neumann (added -l option for local files, minor changes).

GPT-4 (rewrote the script for Python 3 compatibility).

Project details

Release history

0.2.0 (this version): Apr 5, 2024

0.1.1: Aug 21, 2014

my-version-number (pre-release): Aug 21, 2014

Download files

Source Distribution

grepurl-0.2.0.tar.gz (3.1 kB)

Uploaded Apr 5, 2024 Source

Built Distribution

grepurl-0.2.0-py3-none-any.whl (3.6 kB)

Uploaded Apr 5, 2024 Python 3

File details

Details for the file grepurl-0.2.0.tar.gz.

File metadata

Download URL: grepurl-0.2.0.tar.gz
Upload date: Apr 5, 2024
Size: 3.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.8.0 pkginfo/1.10.0 readme-renderer/34.0 requests/2.27.1 requests-toolbelt/1.0.0 urllib3/1.26.18 tqdm/4.64.1 importlib-metadata/4.8.3 keyring/23.4.1 rfc3986/1.5.0 colorama/0.4.5 CPython/3.6.15

File hashes

Hashes for grepurl-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`91249b7020229b5a975b07b7316a2f64e54ed1d3ba367efb532f1d6ba39b5244`
MD5	`b5cd6eac118efff1df21f12b63a711b5`
BLAKE2b-256	`e8cc98dc4a7db995a3c5dbfb82358c8c727435796e0c2033ab7e860ea060e806`
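A downloaded copy of the source distribution can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the standard library; the helper names `sha256_of` and `matches_published_hash` are hypothetical, and the file is assumed to have been saved locally under its published name.

```python
import hashlib

# SHA256 digest published above for grepurl-0.2.0.tar.gz
EXPECTED_SHA256 = "91249b7020229b5a975b07b7316a2f64e54ed1d3ba367efb532f1d6ba39b5244"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected digest."""
    return sha256_of(path) == expected.lower()
```

Calling `matches_published_hash("grepurl-0.2.0.tar.gz")` on the downloaded file should return `True` if the file is intact. pip can perform a comparable check itself when a requirements file pins hashes and is installed with `--require-hashes`.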

File details

Details for the file grepurl-0.2.0-py3-none-any.whl.

File metadata

Download URL: grepurl-0.2.0-py3-none-any.whl
Upload date: Apr 5, 2024
Size: 3.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.8.0 pkginfo/1.10.0 readme-renderer/34.0 requests/2.27.1 requests-toolbelt/1.0.0 urllib3/1.26.18 tqdm/4.64.1 importlib-metadata/4.8.3 keyring/23.4.1 rfc3986/1.5.0 colorama/0.4.5 CPython/3.6.15

File hashes

Hashes for grepurl-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1c9e83a313f7b7b6639f8cf15654d840999ae6fc2baba728cb914c5687107327`
MD5	`be1003554df2a9a904ce052f669603cc`
BLAKE2b-256	`95993e2acca72e817ef4c501e83933c6fce31c332a8501714dbc99e26a2fe5d2`
