Turns your BeautifulSoup4 soup into a Python dictionary or JSON
soup2dict
BeautifulSoup4 to Python dictionary converter
Why
It's nice to have a convenient way to turn your soup into a dict.
Installation
Get the package with pip or poetry:
pip install soup2dict
poetry add soup2dict
Example
import simplejson
from bs4 import BeautifulSoup
from soup2dict import convert
html_doc = """
<html>
hei
<head>
<title>The Dormouse's story</title>
<title>bob</title>
</head>
<body>
<p class="title">The <b>Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters;
and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p class="story">...</p>
"""
# Create soup from html_doc data
soup = BeautifulSoup(html_doc, 'html.parser')
# Convert it to a dictionary with convert()
dict_result = convert(soup)
# Serialize the dictionary to JSON and write it to output.json
with open('output.json', 'w') as output_file:
    output_file.write(
        simplejson.dumps(dict_result, indent=2),
    )
Output
{
  "html": [
    {
      "#text": "hei The Dormouse's story bob The Dormouse's story Once upon a time there were three little sisters; and their names were Elsie , Lacie and Tillie ; and they lived at the bottom of a well. ...",
      "navigablestring": [
        "hei"
      ],
      "head": [
        {
          "#text": "The Dormouse's story bob",
          "title": [
            {
              "#text": "The Dormouse's story",
              "navigablestring": [
                "The Dormouse's story"
              ]
            },
            {
              "#text": "bob",
              "navigablestring": [
                "bob"
              ]
            }
          ]
        }
      ],
      "body": [
        {
          "#text": "The Dormouse's story Once upon a time there were three little sisters; and their names were Elsie , Lacie and Tillie ; and they lived at the bottom of a well. ...",
          "p": [
            {
              "@class": [
                "title"
              ],
              "#text": "The Dormouse's story",
              "navigablestring": [
                "The"
              ],
              "b": [
                {
                  "#text": "Dormouse's story",
                  "navigablestring": [
                    "Dormouse's story"
                  ]
                }
              ]
            },
            {
              "@class": [
                "story"
              ],
              "#text": "Once upon a time there were three little sisters; and their names were Elsie , Lacie and Tillie ; and they lived at the bottom of a well.",
              "navigablestring": [
                "Once upon a time there were three little sisters;\n and their names were",
                ",",
                "and",
                ";\n and they lived at the bottom of a well."
              ],
              "a": [
                {
                  "@href": "http://example.com/elsie",
                  "@class": [
                    "sister"
                  ],
                  "@id": "link1",
                  "#text": "Elsie",
                  "navigablestring": [
                    "Elsie"
                  ]
                },
                {
                  "@href": "http://example.com/lacie",
                  "@class": [
                    "sister"
                  ],
                  "@id": "link2",
                  "#text": "Lacie",
                  "navigablestring": [
                    "Lacie"
                  ]
                },
                {
                  "@href": "http://example.com/tillie",
                  "@class": [
                    "sister"
                  ],
                  "@id": "link3",
                  "#text": "Tillie",
                  "navigablestring": [
                    "Tillie"
                  ]
                }
              ]
            },
            {
              "@class": [
                "story"
              ],
              "#text": "...",
              "navigablestring": [
                "..."
              ]
            }
          ]
        }
      ]
    }
  ]
}
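Reading values back out
Judging from the output above, attributes appear under keys prefixed with "@", the collapsed text of an element under "#text", and child elements as lists keyed by tag name. The sketch below walks such a dictionary and collects every link. It assumes those conventions hold in general, which this page does not state explicitly. `iter_elements` is a hypothetical helper written for this example, not part of soup2dict, and `converted` is an abridged, hand-copied excerpt of the output rather than a live call to `convert()`.

```python
from typing import Any, Iterator

# Abridged copy of the converted output shown above (one paragraph only).
converted = {
    "html": [
        {
            "body": [
                {
                    "p": [
                        {
                            "@class": ["story"],
                            "a": [
                                {
                                    "@href": "http://example.com/elsie",
                                    "@class": ["sister"],
                                    "@id": "link1",
                                    "#text": "Elsie",
                                },
                                {
                                    "@href": "http://example.com/lacie",
                                    "@class": ["sister"],
                                    "@id": "link2",
                                    "#text": "Lacie",
                                },
                                {
                                    "@href": "http://example.com/tillie",
                                    "@class": ["sister"],
                                    "@id": "link3",
                                    "#text": "Tillie",
                                },
                            ],
                        }
                    ]
                }
            ]
        }
    ]
}


def iter_elements(node: Any, tag: str) -> Iterator[dict]:
    """Yield every element dict stored under the key `tag`, depth-first.

    Hypothetical helper: relies on the key layout seen in the sample output.
    """
    if isinstance(node, dict):
        for key, value in node.items():
            # Attribute keys ("@...") and "#text" do not hold child elements.
            if key.startswith("@") or key == "#text":
                continue
            if isinstance(value, list):
                for child in value:
                    if key == tag and isinstance(child, dict):
                        yield child
                    yield from iter_elements(child, tag)
    elif isinstance(node, list):
        for item in node:
            yield from iter_elements(item, tag)


# Collect (id, href) pairs for every <a> element in document order.
links = [(a["@id"], a["@href"]) for a in iter_elements(converted, "a")]
for link_id, href in links:
    print(link_id, href)
```

Because every tag name maps to a list, the same helper works whether an element occurs once or many times, at the cost of always indexing or iterating even for single children.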