Parse HTML to a Python Dictionary
HTML Form to Dict
This is a tiny library which provides a method called html_form_to_dict().
This method takes a string containing HTML and returns a dictionary of the values of the first form.
The data returned by html_form_to_dict() is a FormDict, which has the method submit(). This way you can submit the data like a real browser would.
This means you can do simple end-to-end testing of form handling without a real browser (driven by a tool like Selenium, Puppeteer or Playwright).
The submit() method supports the "action" and "method" attributes of forms and additionally the htmx attributes hx-get and hx-post.
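To illustrate how the target of a submit could be derived from those attributes, here is a minimal sketch. It is not the library's implementation: the function name resolve_submit_target, the current_url parameter, and the rule that htmx attributes take precedence over "action"/"method" are assumptions made for this example. The fallbacks (GET when "method" is missing, the current URL when "action" is missing) follow standard HTML form behaviour.

```python
def resolve_submit_target(form_attrs, current_url="/"):
    """Return (http_method, url) for a dict of <form> attributes.

    Hypothetical helper, not part of html_form_to_dict.
    Assumption: hx-post / hx-get win over action / method.
    """
    if "hx-post" in form_attrs:
        return "POST", form_attrs["hx-post"]
    if "hx-get" in form_attrs:
        return "GET", form_attrs["hx-get"]
    # Plain HTML: a missing method means GET,
    # a missing action means "submit to the current URL".
    method = (form_attrs.get("method") or "get").upper()
    url = form_attrs.get("action") or current_url
    return method, url


print(resolve_submit_target({"action": "/save/", "method": "post"}))
print(resolve_submit_target({"hx-get": "/search/"}))
```

How the real submit() resolves conflicts between these attributes is best checked in the library's source.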
Example:
    from html_form_to_dict import html_form_to_dict


    def test_foo(client):
        ...
        # client is a Django test client, but you could use
        # python-requests or a different URL library, too.
        response = client.get(url)

        # This method parses the HTML in response.content to a dictionary.
        # This dictionary is like request.POST or request.GET.
        # It is a flat mapping from the input elements of the form
        # to their values.
        data = html_form_to_dict(response.content)

        # Now you can test the default values of the form.
        assert data == {'city': 'Chemnitz', 'name': 'Mr. X'}

        # You can edit the data. This is like a human (or Playwright/Selenium)
        # altering the HTML input fields.
        data['name'] = 'Mrs. Y'

        # This submits the data to the server.
        # This method uses the "action" attribute of the form.
        # The hx-get and hx-post attributes of htmx are supported, too.
        response = data.submit(client)

        # If you use the Post/Redirect/Get pattern:
        assert response.status_code == 302, response.context['form'].errors
The code above uses pytest-django; see its client fixture.
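To make the "flat mapping from the input elements of the form to their values" more concrete, the following sketch builds such a mapping with the standard library's html.parser. It is an illustration only, not how html_form_to_dict is implemented: the names FirstFormParser, first_form_to_dict and EXAMPLE_HTML are made up for this example, and it handles only input elements (no select, textarea, checkbox or radio logic).

```python
from html.parser import HTMLParser

# Hypothetical page with two forms; only the first one should be parsed.
EXAMPLE_HTML = """
<form action="/edit/" method="post">
  <input type="text" name="city" value="Chemnitz">
  <input type="text" name="name" value="Mr. X">
</form>
<form><input name="other" value="ignored"></form>
"""


class FirstFormParser(HTMLParser):
    """Collect name/value pairs of <input> elements in the first <form>."""

    def __init__(self):
        super().__init__()
        self.in_form = False  # True while inside the first <form>
        self.done = False     # True once the first form has been closed
        self.data = {}

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        attrs = dict(attrs)
        if tag == "form":
            self.in_form = True
        elif tag == "input" and self.in_form and attrs.get("name"):
            # An input without a value attribute is treated as empty.
            self.data[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self.in_form:
            self.in_form = False
            self.done = True


def first_form_to_dict(html):
    parser = FirstFormParser()
    parser.feed(html)
    return parser.data


print(first_form_to_dict(EXAMPLE_HTML))
```

With the example page above, the result matches the dictionary asserted in the test: the two inputs of the first form, and nothing from the second form.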
The FormDict returned by html_form_to_dict() does not allow adding new keys which are not in the dictionary yet. This way you get an error if your test sets the value for an input which (maybe due to refactoring) does not exist.
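The idea of rejecting unknown keys can be sketched with a small dict subclass. This is a hedged illustration, not the actual FormDict: the class name StrictDict and the choice of KeyError as the exception are assumptions for this example.

```python
class StrictDict(dict):
    """A dict that allows changing existing keys but not adding new ones.

    Hypothetical stand-in for the behaviour described above.
    """

    def __setitem__(self, key, value):
        if key not in self:
            # Assumption: KeyError; the real FormDict may raise something else.
            raise KeyError(f"unknown form field: {key!r}")
        super().__setitem__(key, value)


data = StrictDict({"city": "Chemnitz", "name": "Mr. X"})
data["name"] = "Mrs. Y"      # fine: the key already exists

try:
    data["nmae"] = "Mrs. Y"  # typo in the field name
except KeyError as exc:
    print("rejected:", exc)
```

Note that this sketch only guards item assignment; dict methods such as update() and setdefault() bypass __setitem__ on a dict subclass and would need the same check.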
The example above uses Django, but the library is pure Python and does not depend on any particular web framework.
This library was built for testing, but you can use it for any task where you want to parse and submit HTML forms.
This library does not evaluate JavaScript. If you need JS support, please use Playwright (or a similar tool).
Install
pip install html_form_to_dict
Development
You need to upload your SSH public key to GitHub first:
pip install -e git+ssh://git@github.com/guettli/html_form_to_dict#egg=html_form_to_dict
edit-the-code
pip install pytest
pytest
create Pull-Request
Alternatives
- Mechanize: this library is like a browser without JS support.
- You could use BeautifulSoup, as explained in this Stack Overflow answer.
- Use Playwright for browser-based end-to-end tests.
Deploy
via deploy-library.py
for py2 tgz package: python -m twine upload dist/html_form_to_dict-*.tar.gz
File details
Details for the file html_form_to_dict-2022.10.1.tar.gz.
File metadata
- Download URL: html_form_to_dict-2022.10.1.tar.gz
- Upload date:
- Size: 5.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.10.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 974dbf0481a0b11ac3d8c4ae73b792a0d509a94d4a0b526cb6d9a779fa30da6a
MD5 | 1106267c557b854fa08d497882ae64ff
BLAKE2b-256 | ffcd80203077f6afaaead328eaaa8c1c8e393b019e85dbc3007432aba02662ac
File details
Details for the file html_form_to_dict-2022.10.1-py3-none-any.whl.
File metadata
- Download URL: html_form_to_dict-2022.10.1-py3-none-any.whl
- Upload date:
- Size: 6.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.10.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 7d447930da03659d52e4b4994faf270ed749c5184cb10c9b17986845c83456e2
MD5 | edd517b19d219d9bc85e915ea7d4dafd
BLAKE2b-256 | 75d54159d93df599d14ac7899225243cdeb50d8966e1c21049e61b3e3ed3e541