strip-hidden-form-values

CLI tool for stripping hidden form values from an HTML document

Why would you need this? Imagine you're running a Git scraper against a website that includes hidden form fields (such as ASP.NET's __VIEWSTATE) whose values change on every request. You can pipe the HTML through this tool to blank those hidden form values, so that a change is only recorded when the rest of the page is actually modified.

scrape-ca-wildlife-rules is an example of a repository that uses this tool for that purpose; see the scrape.yml workflow there for details.

Installation

Install this tool using pip:

$ pip install strip-hidden-form-values

Usage

You can pipe HTML into this tool:

curl http://... | strip-hidden-form-values > output.html

Or pass it a filename:

strip-hidden-form-values input.html > output.html
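
If you are calling the tool from a Python scraper rather than a shell pipeline, the stdin form above can be wrapped with subprocess. This is a minimal sketch, assuming the strip-hidden-form-values command is installed and on your PATH; the helper name strip_via_cli and its command parameter are hypothetical and not part of the tool.

```python
import subprocess


def strip_via_cli(html: str, command=("strip-hidden-form-values",)) -> str:
    """Pipe HTML to the CLI on stdin and return whatever it writes to stdout.

    `command` is a hypothetical parameter added for illustration; by default
    it assumes the strip-hidden-form-values executable is on PATH.
    """
    result = subprocess.run(
        list(command),
        input=html,           # sent to the child process on stdin
        capture_output=True,  # collect stdout and stderr
        text=True,            # work with str instead of bytes
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout
```

Keeping the command as a parameter makes it possible to substitute another executable when the tool is not installed, for example in a test environment.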

The tool replaces the contents of the value= attribute of every hidden form field with an empty string, so the following:

<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="p8nVm4PgVPA" />

will be replaced with:

<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="" />

All other HTML will remain unchanged.
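
To make the transformation concrete, here is a rough Python sketch of the same idea using regular expressions. It is not the tool's actual implementation, which may work differently; the function name blank_hidden_values is hypothetical, and the sketch assumes quoted attribute values and no ">" character inside an input tag's attributes.

```python
import re

# A whole <input ...> tag. Assumes no ">" appears inside attribute values.
INPUT_TAG = re.compile(r"<input\b[^>]*>", re.IGNORECASE)

# type="hidden" or type='hidden', but not attributes such as data-type.
HIDDEN_TYPE = re.compile(r"""(?<![\w-])type\s*=\s*["']hidden["']""", re.IGNORECASE)

# value="..." or value='...'; group 2 is the quote character, reused to close.
VALUE_ATTR = re.compile(
    r"""((?<![\w-])value\s*=\s*)(["'])(.*?)\2""",
    re.IGNORECASE | re.DOTALL,
)


def blank_hidden_values(html: str) -> str:
    """Return html with the value of every hidden input set to an empty string."""

    def blank(match: re.Match) -> str:
        tag = match.group(0)
        if not HIDDEN_TYPE.search(tag):
            return tag  # not a hidden field: leave the tag untouched
        # Keep `value=` and both quotes, drop what was between them.
        return VALUE_ATTR.sub(r"\1\2\2", tag, count=1)

    return INPUT_TAG.sub(blank, html)
```

Applied to the hidden __VIEWSTATE field shown above, this sketch produces the same blanked tag, while a visible field such as a text input passes through unchanged.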

Development

To contribute to this tool, first check out the code. Then create a new virtual environment:

cd strip-hidden-form-values
python -m venv venv
source venv/bin/activate

Or if you are using pipenv:

pipenv shell

Now install the dependencies and test dependencies:

pip install -e '.[test]'

To run the tests:

pytest
