Strip tags from HTML, optionally from areas identified by CSS selectors

strip-tags

See llm, ttok and strip-tags—CLI tools for working with ChatGPT and other LLMs for more on this project.

Installation

Install this tool using pip:

pip install strip-tags

Usage

Pipe content into this tool to strip tags from it:

cat input.html | strip-tags > output.txt

Or pass a filename:

strip-tags -i input.html > output.txt
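
For readers who want a feel for what stripping tags involves, here is a minimal sketch using only the Python standard library. It is not the tool's own implementation; the names `TextExtractor` and `strip_html_tags` are hypothetical, and the sketch keeps every text node (including the contents of script and style elements), which may differ from what strip-tags does.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect every text node and ignore the tags around them."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into plain characters.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html_tags(html):
    """Return the text content of an HTML string (hypothetical helper)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


if __name__ == "__main__":
    print(strip_html_tags("<p>Hello <b>world</b></p>"))
```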

To run against just specific areas identified by CSS selectors:

strip-tags '.content' -i input.html > output.txt

This can be called with multiple selectors:

cat input.html | strip-tags '.content' '.sidebar' > output.txt
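
The idea of restricting output to selected areas can be sketched in the same way. The example below is an illustration only, under two loudly stated assumptions: it supports nothing beyond simple class selectors such as '.content', and it expects well-formed HTML in which every non-void element has a closing tag. The names `ClassTextExtractor` and `text_for_class_selectors` are hypothetical, and the tool itself accepts full CSS selectors that this sketch does not handle.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ClassTextExtractor(HTMLParser):
    """Collect text found inside elements carrying one of the given classes."""

    def __init__(self, class_names):
        super().__init__(convert_charrefs=True)
        self.class_names = set(class_names)
        self.chunks = []   # one string per matched top-level element
        self._depth = 0    # nesting depth inside a match; 0 means outside

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.class_names.intersection(classes):
            self._depth = 1
            self.chunks.append("")

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.chunks[-1] += data


def text_for_class_selectors(html, selectors):
    """Return the text of each element matching a '.name' selector (assumption: class selectors only)."""
    names = set()
    for selector in selectors:
        if not selector.startswith(".") or len(selector) < 2:
            raise ValueError(f"only simple class selectors are supported: {selector!r}")
        names.add(selector[1:])
    parser = ClassTextExtractor(names)
    parser.feed(html)
    parser.close()
    return parser.chunks
```

Elements nested inside an already matched element are folded into the outer match rather than reported twice.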

To minify whitespace, add -m or --minify. This reduces runs of space and tab characters to a single space, and runs of newlines and spaces to at most two newlines:

cat input.html | strip-tags -m > output.txt
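
The whitespace rules described for -m can be approximated with two regular expressions. This is a sketch based only on the description above, not the tool's actual code; the function name `minify_whitespace` is hypothetical, and edge cases may behave differently from the real option.

```python
import re


def minify_whitespace(text):
    """Approximate the documented minify behaviour (hypothetical helper)."""
    # Collapse each run of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Collapse any whitespace run containing two or more newlines into exactly two.
    text = re.sub(r"\s*\n\s*\n\s*", "\n\n", text)
    return text


if __name__ == "__main__":
    print(repr(minify_whitespace("a  \t b\n\n\n\nc")))
```

Single newlines are left alone, so ordinary line breaks inside a paragraph survive.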

You can also run this command using python -m like this:

python -m strip_tags --help

Development

To contribute to this tool, first check out the code. Then create a new virtual environment:

cd strip-tags
python -m venv venv
source venv/bin/activate

Now install the dependencies and test dependencies:

pip install -e '.[test]'

To run the tests:

pytest
