strip-hidden-form-values
CLI tool for stripping hidden form values from an HTML document
Why would you need this? Imagine you're running a Git scraper against a website that includes hidden form fields (such as __VIEWSTATE fields) whose values change on every request. You can pipe the HTML through this tool to blank those hidden form values, so that a change is only recorded when the rest of the page is modified in some way.
scrape-ca-wildlife-rules is an example of a repository that uses this tool for this purpose; see the scrape.yml workflow there for details.
Installation
Install this tool using pip:
pip install strip-hidden-form-values
Usage
You can pipe HTML into this tool:
curl http://... | strip-hidden-form-values > output.html
Or pass it a filename:
strip-hidden-form-values input.html > output.html
The tool will replace the value= attribute of any hidden form fields with a blank string, so the following:
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="p8nVm4PgVPA" />
Will be replaced with:
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="" />
All other HTML will remain unchanged.
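To make the transformation concrete, here is a minimal Python sketch of the same idea. It is an illustration only, not the tool's actual implementation: the function name blank_hidden_values and the regular expressions are assumptions made for this example, and it only handles quoted attributes inside tags that contain no ">" character in an attribute value.

```python
import re

# Hypothetical helper patterns for this sketch (not taken from the tool itself).
# Matches a whole <input ...> tag, assuming no ">" inside attribute values.
INPUT_TAG = re.compile(r"<input\b[^>]*>", re.IGNORECASE)
# Matches a quoted type="hidden" attribute preceded by whitespace.
HIDDEN_TYPE = re.compile(r"""\stype\s*=\s*["']hidden["']""", re.IGNORECASE)
# Matches a quoted value attribute; group 2 captures the quote character used.
VALUE_ATTR = re.compile(r"""(\svalue\s*=\s*)(["'])(.*?)\2""", re.IGNORECASE | re.DOTALL)


def blank_hidden_values(html: str) -> str:
    """Return html with the value of every hidden <input> set to an empty string."""

    def fix_tag(match: re.Match) -> str:
        tag = match.group(0)
        if not HIDDEN_TYPE.search(tag):
            return tag  # not a hidden field: leave the tag untouched
        # Keep the attribute name and both quotes, drop what was between them.
        return VALUE_ATTR.sub(r"\1\2\2", tag, count=1)

    return INPUT_TAG.sub(fix_tag, html)


if __name__ == "__main__":
    before = '<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="p8nVm4PgVPA" />'
    print(blank_hidden_values(before))
```

Because only the text between the quotes of the value attribute is removed, two downloads of a page that differ solely in their hidden field values produce identical output, which is what keeps a Git scraper from recording a spurious change.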
Development
To contribute to this tool, first check out the code. Then create a new virtual environment:
cd strip-hidden-form-values
python -m venv venv
source venv/bin/activate
Or if you are using pipenv:
pipenv shell
Now install the dependencies and test dependencies:
pip install -e '.[test]'
To run the tests:
pytest