Strip tags from HTML, optionally from areas identified by CSS selectors
strip-tags
See llm, ttok and strip-tags—CLI tools for working with ChatGPT and other LLMs for more on this project.
Installation
Install this tool using pip:
pip install strip-tags
Usage
Pipe content into this tool to strip tags from it:
cat input.html | strip-tags > output.txt
Or pass a filename:
strip-tags -i input.html > output.txt
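To make the basic operation concrete, here is a minimal sketch of tag stripping using only Python's standard library. It is not how strip-tags itself is implemented; the names TextExtractor and strip_tags_sketch are hypothetical, chosen for illustration only.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collect text nodes and ignore every tag."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Called for text between tags. Note that the contents of
        # <script> and <style> elements also arrive here and are kept.
        self.parts.append(data)


def strip_tags_sketch(html):
    """Return the text content of an HTML string with all tags removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


if __name__ == "__main__":
    print(strip_tags_sketch("<p>Hello <b>world</b></p>"))
```

This sketch always processes the whole document and has no notion of CSS selectors.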
To run against just specific areas identified by CSS selectors:
strip-tags '.content' -i input.html > output.txt
This can be called with multiple selectors:
cat input.html | strip-tags '.content' '.sidebar' > output.txt
To minify whitespace (reducing multiple space and tab characters to a single space, and multiple newlines and spaces to a maximum of two newlines), add -m or --minify:
cat input.html | strip-tags -m > output.txt
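The whitespace rules described for -m can be approximated with two regular expressions. This is a sketch of one reading of that description, not the tool's actual code; the function name minify_whitespace_sketch is hypothetical, and the exact edge-case behavior of strip-tags may differ.

```python
import re


def minify_whitespace_sketch(text):
    """Approximate the -m rules as described in this README (assumed reading)."""
    # Rule 1: any run of spaces and tabs becomes a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Rule 2: a newline followed by one or more further newlines, each
    # optionally preceded by a single space, becomes exactly two newlines.
    text = re.sub(r"\n(?: ?\n)+", "\n\n", text)
    return text


if __name__ == "__main__":
    print(repr(minify_whitespace_sketch("a  \t b\n\n\n\nc")))
```

Running the space and tab rule first means the second pattern only has to account for single spaces between newlines.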
You can also run this command using python -m like this:
python -m strip_tags --help
Development
To contribute to this tool, first check out the code. Then create a new virtual environment:
cd strip-tags
python -m venv venv
source venv/bin/activate
Now install the dependencies and test dependencies:
pip install -e '.[test]'
To run the tests:
pytest