perse converts HTML content into structured JSON data
Perse
Perse converts HTML to JSON using a mix of traditional HTML parsing and LLM-based data extraction.
Installation
pip install zf-perse
Usage
export PERSE_OPENAI_API_KEY="your-openai-api-key"
import requests
from perse import perse

# Fetch the page, then convert its HTML to JSON
url = "https://example.com"
html = requests.get(url).text
j = perse(html)
print(j)
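The snippet above assumes the API key is set and the request succeeds. A slightly more defensive sketch follows; the helper names `require_api_key` and `page_to_json` are hypothetical, and it assumes that perse reads `PERSE_OPENAI_API_KEY` from the environment, as the export line suggests. It is an illustration, not part of the package.

```python
import os


def require_api_key(env=os.environ):
    """Return the key from PERSE_OPENAI_API_KEY, or fail early with a clear message."""
    key = env.get("PERSE_OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("PERSE_OPENAI_API_KEY is not set")
    return key


def page_to_json(url, timeout=10.0):
    """Fetch a page and pass its HTML to perse (hypothetical wrapper)."""
    require_api_key()
    # Imported lazily so the key check can run without these packages installed.
    import requests
    from perse import perse

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # surface HTTP errors instead of parsing an error page
    return perse(response.text)
```

Calling `page_to_json("https://example.com")` then behaves like the usage snippet, but stops with an explicit error when the key is missing or the server returns a failure status.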
Example
Input
<!-- taken from https://zeffmuks.com -->
Output
{
  "title": "Zeff Muks",
  "description": "Antifragile Entropy Assassin 🥷",
  "og": {
    "type": "website",
    "title": "Zeff Muks",
    "description": "Antifragile Entropy Assassin 🥷",
    "url": "https://www.zeffmuks.com/",
    "image": "https://www.zeffmuks.com/images/ZeffMuks-1920.png",
    "site_name": "Zeff Muks",
  },
  "twitter": {
    "card": "summary_large_image",
    "site": "@zeffmuks",
    "title": "Zeff Muks",
    "description": "Antifragile Entropy Assassin 🥷",
    "image": "https://www.zeffmuks.com/images/ZeffMuks-1920.png",
  },
  "main_header": "Antifragile Entropy Assassin 🥷🏻",
  "header_link": "https://x.com/zeffmuks",
  "builds": [
    {
      "date": "08/30/2024",
      "project": {
        "name": "Cursor Git",
        "description": "Enhanced Git for Cursor AI Editor",
        "logo_url": "https://zf-static.s3.us-west-1.amazonaws.com/cursor-git-logo128.png",
        "download_link": "https://zf-static.s3.us-west-1.amazonaws.com/cursor-git-0.1.12.vsix",
        "external_link": "",
      },
    },
    {
      "date": "08/18/2024",
      "project": {
        "name": "PyZF",
        "description": "Enhancements for Python",
        "logo_url": "https://zf-static.s3.us-west-1.amazonaws.com/pyzf-logo128.png",
        "download_link": "",
        "external_link": "https://pypi.org/project/PyZF",
      },
    },
    {
      "date": "08/05/2024",
      "project": {
        "name": "Xanthus",
        "description": "X (formerly Twitter) Assistant",
        "logo_url": "https://zf-static.s3.us-west-1.amazonaws.com/xanthus-logo128.png",
        "download_link": "",
        "external_link": "https://pypi.org/project/zf-xanthus",
      },
    },
    {
      "date": "07/24/2024",
      "project": {
        "name": "Jenga",
        "description": "Fast JSON5 Python Library",
        "logo_url": "",
        "download_link": "https://pypi.org/project/zf-jenga",
        "external_link": "",
      },
    },
    ...
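The `title`, `description`, `og`, and `twitter` fields in the output above map onto tags in a page's head, which is the kind of data conventional parsing can recover without an LLM. The sketch below illustrates that idea using only the Python standard library. It is not perse's actual implementation; the class `HeadMetadataParser`, the function `extract_head_metadata`, and the sample HTML are all hypothetical.

```python
import json
from html.parser import HTMLParser


class HeadMetadataParser(HTMLParser):
    """Collect <title> text plus description, og:* and twitter:* <meta> tags."""

    def __init__(self):
        super().__init__()
        self.data = {"title": "", "description": "", "og": {}, "twitter": {}}
        self._in_title = False
        self._title_parts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Open Graph tags use "property"; most others use "name".
            key = attrs.get("property") or attrs.get("name") or ""
            content = attrs.get("content") or ""
            if key == "description":
                self.data["description"] = content
            elif key.startswith("og:"):
                self.data["og"][key[len("og:"):]] = content
            elif key.startswith("twitter:"):
                self.data["twitter"][key[len("twitter:"):]] = content

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)


def extract_head_metadata(html):
    """Return a dict shaped like the first few keys of the example output."""
    parser = HeadMetadataParser()
    parser.feed(html)
    parser.close()
    parser.data["title"] = "".join(parser._title_parts).strip()
    return parser.data


# Hypothetical sample page, not the real source of the example above.
sample = """<html><head>
<title>Example Page</title>
<meta name="description" content="A demo page">
<meta property="og:type" content="website">
<meta name="twitter:card" content="summary_large_image">
</head><body></body></html>"""

print(json.dumps(extract_head_metadata(sample), indent=2))
```

Fields such as `builds` in the example have no fixed tag to read from, which is presumably where the LLM-based extraction comes in.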
License