Microformats parser

Welcome 👋

mf2py is a Python microformats parser with full support for microformats2, backwards-compatible support for microformats1 and experimental support for metaformats.

Installation 💻

To install mf2py run the following command:

$ pip install mf2py

Quickstart 🚀

Import the library:

>>> import mf2py

Parse an HTML Document from a file or string

>>> with open("test/examples/eras.html") as fp:
...     mf2json = mf2py.parse(doc=fp)
>>> mf2json
{'items': [{'type': ['h-entry'],
            'properties': {'name': ['Excited for the Taylor Swift Eras Tour'],
                           'author': [{'type': ['h-card'],
                                       'properties': {'name': ['James'],
                                                      'url': ['https://example.com/']},
                                       'value': 'James',
                                       'lang': 'en-us'}],
                           'published': ['2023-11-30T19:08:09'],
                           'featured': [{'value': 'https://example.com/eras.jpg',
                                         'alt': 'Eras tour poster'}],
                           'content': [{'value': "I can't decide which era is my favorite.",
                                        'lang': 'en-us',
                                        'html': "<p>I can't decide which era is my favorite.</p>"}],
                           'category': ['music', 'Taylor Swift']},
            'lang': 'en-us'}],
 'rels': {'webmention': ['https://example.com/mentions']},
 'rel-urls': {'https://example.com/mentions': {'text': '',
                                               'rels': ['webmention']}},
 'debug': {'description': 'mf2py - microformats2 parser for python',
           'source': 'https://github.com/microformats/mf2py',
           'version': '2.0.0',
           'markup parser': 'html5lib'}}
>>> mf2json = mf2py.parse(doc="<a class=h-card href=https://example.com>James</a>")
>>> mf2json["items"]
[{'type': ['h-card'],
  'properties': {'name': ['James'],
                 'url': ['https://example.com']}}]
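
Property values in the parsed output come in two shapes, as the examples above show: plain strings (name, published) and dictionaries that carry a 'value' key (author, featured, content). A minimal sketch of a reader that handles both shapes follows. first_value is a hypothetical helper, not part of mf2py, and the item literal is trimmed from the output above.

```python
# Hypothetical helper, not part of mf2py: return the first value of a property
# from one parsed item, whether it is a plain string or a dict with "value".
def first_value(item, prop, default=None):
    values = item.get("properties", {}).get(prop, [])
    if not values:
        return default
    first = values[0]
    if isinstance(first, dict):
        return first.get("value", default)
    return first


# Sample item trimmed from the parse output shown above.
item = {
    "type": ["h-entry"],
    "properties": {
        "name": ["Excited for the Taylor Swift Eras Tour"],
        "author": [{"type": ["h-card"],
                    "properties": {"name": ["James"],
                                   "url": ["https://example.com/"]},
                    "value": "James"}],
        "featured": [{"value": "https://example.com/eras.jpg",
                      "alt": "Eras tour poster"}],
    },
}

title = first_value(item, "name")          # plain string value
author = first_value(item, "author")       # embedded h-card, read via "value"
image = first_value(item, "featured")      # image dict, read via "value"
missing = first_value(item, "summary", default="")  # property not present
```

Returning a default for absent properties keeps calling code free of KeyError handling, since any property may be missing from a given page.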

Parse an HTML Document from a URL

>>> mf2json = mf2py.parse(url="https://events.indieweb.org")
>>> mf2json["items"][0]["type"]
['h-feed']
>>> mf2json["items"][0]["children"][0]["type"]
['h-event']
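
As the h-feed example shows, items can nest under a 'children' key, and the earlier h-entry example shows items embedded as property values. A sketch of one way to collect every item of a given type at any depth follows. iter_items is a hypothetical helper, not part of mf2py, and the sample data is made up to resemble the structures above.

```python
# Hypothetical helper, not part of mf2py: yield every item of a given type,
# descending into "children" and into items embedded as property values.
def iter_items(items, wanted_type):
    for item in items:
        if wanted_type in item.get("type", []):
            yield item
        yield from iter_items(item.get("children", []), wanted_type)
        for values in item.get("properties", {}).values():
            embedded = [v for v in values if isinstance(v, dict) and "type" in v]
            yield from iter_items(embedded, wanted_type)


# Made-up sample data shaped like the parser output shown above.
mf2json = {
    "items": [{
        "type": ["h-feed"],
        "properties": {"name": ["Events"]},
        "children": [
            {"type": ["h-event"], "properties": {"name": ["Meetup A"]}},
            {"type": ["h-event"],
             "properties": {"name": ["Meetup B"],
                            "location": [{"type": ["h-card"],
                                          "properties": {"name": ["Cafe"]},
                                          "value": "Cafe"}]}},
        ],
    }]
}

event_names = [e["properties"]["name"][0]
               for e in iter_items(mf2json["items"], "h-event")]
card_names = [c["properties"]["name"][0]
              for c in iter_items(mf2json["items"], "h-card")]
```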

Experimental Options

The following options can be invoked via keyword arguments to parse() and Parser().

expose_dom

Use expose_dom=True to expose the DOM of embedded properties.

metaformats

Use metaformats=True to include any metaformats found.

filter_roots

Use filter_roots=True to filter out class names known to conflict with microformats root class names (e.g. those used by Tailwind). Alternatively, provide a custom list of class names to filter.
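
To see why root filtering can matter: utility CSS frameworks such as Tailwind use class names like h-full or h-screen, which have the same shape as microformats2 root class names (h-card, h-entry). The sketch below illustrates the collision with a regular expression based on the root class name pattern described in the microformats2 parsing specification. It is an illustration only, not mf2py's implementation; looks_like_root and the sample class names are assumptions chosen for the example.

```python
import re

# Shape of a root class name: "h-", an optional vendor prefix, then one or
# more hyphen-separated lowercase words. Illustration only.
ROOT_CLASS = re.compile(r"h(-[0-9a-z]+)?(-[a-z]+)+")


def looks_like_root(class_name):
    return ROOT_CLASS.fullmatch(class_name) is not None


# A mix of real microformats roots and utility-style class names.
classes = ["h-card", "h-entry", "h-full", "h-screen", "h-12", "p-4"]
candidates = [c for c in classes if looks_like_root(c)]
```

Here h-full and h-screen cannot be told apart from real roots by name alone, which is the kind of conflict filter_roots is meant to address.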

Advanced Usage

parse() is a convenience function that wraps Parser. For more control, create a Parser object and call its methods directly.

>>> with open("test/examples/festivus.html") as fp:
...     mf2parser = mf2py.Parser(doc=fp)

Filter by Microformat Type

>>> mf2json = mf2parser.to_dict()
>>> len(mf2json["items"])
7
>>> len(mf2parser.to_dict(filter_by_type="h-card"))
3
>>> len(mf2parser.to_dict(filter_by_type="h-entry"))
4

JSON Output

>>> json_all = mf2parser.to_json()  # avoid shadowing the standard json module
>>> json_cards = mf2parser.to_json(filter_by_type="h-card")
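
Assuming to_json() returns a JSON string, the result can be loaded back with the standard json module. The sketch below uses a hand-written string in place of real parser output, and assumes that filtered output is a list of items, as the len() calls above suggest.

```python
import json

# Hand-written stand-in for the string returned by
# mf2parser.to_json(filter_by_type="h-card"); not real parser output.
json_cards = (
    '[{"type": ["h-card"], '
    '"properties": {"name": ["James"], "url": ["https://example.com"]}}]'
)

# Load the string back into Python objects and read a property.
cards = json.loads(json_cards)
names = [card["properties"]["name"][0] for card in cards]

# Re-serialize with indentation, e.g. for logging or saving to a file.
pretty = json.dumps(cards, indent=2, sort_keys=True)
```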

Breaking Changes in mf2py 2.0

  • Image alt support is now on by default.

Notes 📝

  • If you pass a BeautifulSoup document as doc, the parser may modify it. Pass a copy if you need the original left intact.
  • A hosted version of mf2py is available at python.microformats.io.

Contributing 🛠️

We welcome contributions and bug reports via GitHub.

This project follows the IndieWeb code of conduct. Please be respectful of other contributors and forge a spirit of positive co-operation without discrimination or disrespect.

License 🧑‍⚖️

mf2py is licensed under an MIT License.
