Effortlessly convert HTML tables to JSON with this Python-based tool.
HTML Table to JSON Converter
Description
This project is a converter that reads a table from an HTML file and transforms it into a JSON file. It uses the pandas and BeautifulSoup libraries to perform the conversion efficiently and in a structured manner. The project follows Clean Architecture, Clean Code, and SOLID principles, ensuring modular, readable, and maintainable code.
Project Structure
html-table-to-json/
├── src/
│   ├── main.py
│   ├── services/
│   │   ├── html_parser.py
│   │   ├── json_converter.py
│   │   └── file_handler.py
│   └── utils/
│       └── logger.py
├── requirements.txt
├── README.md
├── LICENSE.md
└── .gitignore
Installation
Prerequisites
- Python 3.6 or higher
- Pip (Python package manager)
Steps
- Clone the repository:
  git clone https://github.com/your-username/html-table-to-json.git
  cd html-table-to-json
- Create a virtual environment:
  python -m venv venv
- Activate the virtual environment:
  - On Windows:
    venv\Scripts\activate
  - On macOS/Linux:
    source venv/bin/activate
- Install dependencies:
  pip install -r requirements.txt
Usage
To convert a table from an HTML file to JSON, follow these steps:
- Navigate to the src directory:
  cd src
- Run the main.py script, passing the path of the HTML file as an argument:
  python main.py path/to/your/file.html
- The resulting JSON will be saved to output/out.json.
Example
If you have an HTML file named table.html in the root of the project, run:
python main.py ../table.html
The JSON will be saved in output/out.json and logs will be stored in logs/html_table_to_json.log.
Code Structure
main.py
Entry point of the program. Coordinates reading the HTML file, parsing, conversion, and writing the output JSON.
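The coordination described above can be sketched as follows. This is a hypothetical illustration, not the project's actual main.py: the names convert and parse_args, and the idea of passing each stage in as a callable, are assumptions made so the example stands on its own.

```python
import sys
from pathlib import Path
from typing import Callable, Dict, List

Rows = List[Dict[str, str]]


def convert(
    html_path: Path,
    output_path: Path,
    read_file: Callable[[Path], str],
    parse_table: Callable[[str], Rows],
    to_json: Callable[[Rows], str],
    write_file: Callable[[Path, str], None],
) -> Path:
    """Run the four stages in order: read, parse, convert, write."""
    html = read_file(html_path)        # raw HTML text
    rows = parse_table(html)           # one dict per table row
    payload = to_json(rows)            # JSON string
    write_file(output_path, payload)   # persist the result
    return output_path


def parse_args(argv: List[str]) -> Path:
    """Return the HTML path given as the single command-line argument."""
    if len(argv) != 2:
        raise SystemExit("usage: python main.py path/to/your/file.html")
    return Path(argv[1])
```

Passing the stages in as arguments keeps the entry point free of parsing or I/O details, which is one way to read the Clean Architecture goal stated in the description.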
services/file_handler.py
Contains functions for reading and writing files.
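A minimal sketch of what such read and write helpers might look like, using only the standard library. The function names read_text_file and write_text_file are hypothetical, and creating the output directory on demand is an assumption about how output/out.json comes to exist.

```python
from pathlib import Path


def read_text_file(path: Path) -> str:
    """Return the full contents of a UTF-8 text file."""
    return Path(path).read_text(encoding="utf-8")


def write_text_file(path: Path, content: str) -> None:
    """Write content to path, creating parent directories if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
```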
services/html_parser.py
Contains functions for parsing HTML using BeautifulSoup.
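As an illustration of table parsing with BeautifulSoup, the sketch below extracts the first table in a document as a list of row dictionaries. The function name parse_first_table is hypothetical, and treating the first row as the header is an assumption; the project's own parser may handle headers, nested tables, or multiple tables differently.

```python
from typing import Dict, List

from bs4 import BeautifulSoup


def parse_first_table(html: str) -> List[Dict[str, str]]:
    """Return the first <table> as a list of {header: cell text} dicts."""
    soup = BeautifulSoup(html, "html.parser")  # parser bundled with Python
    table = soup.find("table")
    if table is None:
        raise ValueError("no <table> element found")

    rows = table.find_all("tr")
    if not rows:
        return []

    # Assumption: the first row holds the column names.
    headers = [cell.get_text(strip=True) for cell in rows[0].find_all(["th", "td"])]

    records = []
    for row in rows[1:]:
        values = [cell.get_text(strip=True) for cell in row.find_all(["td", "th"])]
        if values:  # skip rows without cells
            records.append(dict(zip(headers, values)))
    return records
```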
services/json_converter.py
Contains functions for converting the HTML table to JSON using Pandas.
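One plausible shape for the pandas step is sketched here: parsed rows are loaded into a DataFrame and serialized as a JSON array of objects. The function name rows_to_json and the choice of orient="records" are assumptions; the source does not say which JSON layout the project emits.

```python
from typing import Dict, List

import pandas as pd


def rows_to_json(rows: List[Dict[str, str]]) -> str:
    """Serialize row dictionaries as a JSON array with one object per row."""
    frame = pd.DataFrame(rows)
    # force_ascii=False keeps non-ASCII cell text readable in the output.
    return frame.to_json(orient="records", force_ascii=False)
```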
utils/logger.py
Configures and initializes the logger to record important events and errors.
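A logger of this kind could be configured roughly as below, using the standard logging module. The function name get_logger, the message format, and the INFO level are assumptions; only the log file location (logs/html_table_to_json.log) comes from the usage example in this README.

```python
import logging
from pathlib import Path


def get_logger(name: str = "html_table_to_json", log_dir: Path = Path("logs")) -> logging.Logger:
    """Return a logger that writes to <log_dir>/<name>.log."""
    logger = logging.getLogger(name)
    if logger.handlers:  # already configured, avoid duplicate handlers
        return logger

    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)

    handler = logging.FileHandler(log_dir / f"{name}.log", encoding="utf-8")
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```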
Contribution
- Fork the project
- Create a branch for your feature (git checkout -b feature/new-feature)
- Commit your changes (git commit -m 'Add new feature')
- Push to the branch (git push origin feature/new-feature)
- Open a Pull Request
License
This project is licensed under the MIT License. See the LICENSE.md file for more details.
Contact
For more information, contact via email at thiagoarturschumann@gmail.com.
File details
Details for the file html-table-to-json-0.1.0.tar.gz.
File metadata
- Download URL: html-table-to-json-0.1.0.tar.gz
- Upload date:
- Size: 4.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.10.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | 8a2647408c8c127d6ecdd49712c822b47e6ab39b8840fe47f2cb2ccdee253d5b
MD5 | f8f82c09b7920e38de20ad936836dbd2
BLAKE2b-256 | 1dc2835fb4e0bf880fa0c02774039e9700c158ad6f2f30128831c3d911bba621
File details
Details for the file html_table_to_json-0.1.0-py3-none-any.whl.
File metadata
- Download URL: html_table_to_json-0.1.0-py3-none-any.whl
- Upload date:
- Size: 6.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.10.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | 0823c9ff8415a3a4e603875dc802da56f4a49856d84ca725c124cc2be42ffa8a
MD5 | 21790517016ea9511e0d4f549cca7d00
BLAKE2b-256 | b4cd2087787e39e67e615b2fdc89c6c7d1a1a2f8af0b694086c139bebbdb8465