Framework
- Pydantic :: 2
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Natural Language
- English
Operating System
- OS Independent
Programming Language
Topic
- Text Processing :: Markup
Typing
- Typed

Project description

🌌 GravitasML

Lightweight Markup Parsing for Python - Perfect for LLMs

A lightweight Python library for parsing custom markup languages, built and used by AutoGPT.

🤔 Why use GravitasML?

GravitasML is purpose-built for parsing simple markup structures, particularly LLM-generated outputs.

By design, it excludes XML features that can introduce security risks:

- No DTD processing - Prevents billion laughs and quadratic blowup attacks
- No external entities - Prevents XXE attacks
- No entity expansion or decompression - Prevents decompression bombs
- Simple and predictable - No namespaces, no attributes, just tags and content

Perfect for:

- Parsing LLM outputs that use XML-style tags
- Simple configuration formats
- Data extraction from controlled markup
- Any scenario where you need safe, simple markup parsing

🛡️ Security by Design

GravitasML is immune to common XML vulnerabilities because it simply doesn't implement the features that enable them:

Attack Type	GravitasML
Billion Laughs	✅ Safe (no entity support)
Quadratic Blowup	✅ Safe (no entity expansion)
External Entity Expansion (XXE)	✅ Safe (no external resources)
DTD Retrieval	✅ Safe (no DTD support)
Decompression Bomb	✅ Safe (no decompression)

Perfect for parsing LLM outputs and other scenarios where you need simple, secure markup processing.
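
To make the entity-expansion risk concrete, the sketch below uses only Python's standard-library `xml.etree.ElementTree` (not GravitasML) to show a small, harmless internal entity being expanded during parsing. The document string and variable names are illustrative assumptions; the attacks listed above scale the same mechanism up by nesting or repeating entities.

```python
import xml.etree.ElementTree as ET

# A tiny, harmless document that declares one internal entity in a DTD.
# Real "billion laughs" payloads nest many such entities; this one does not.
doc = '<!DOCTYPE d [<!ENTITY a "0123456789">]><d>&a;&a;</d>'

root = ET.fromstring(doc)

# The standard-library parser substitutes each &a; reference with the
# entity's value, so the parsed text is longer than what was written.
expanded = root.text
print(expanded)
print(len("&a;&a;"), "characters written ->", len(expanded), "characters parsed")
```

A parser with no DTD or entity support has nothing to substitute, which is the property the table above relies on.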

✨ Features

GravitasML transforms custom markup into Python data structures:

- Simple API - Parse markup to dictionaries with just a few lines of code
- Pydantic Integration - Convert parsed data directly to Pydantic models for validation
- Nested Structure Support - Handles nested tags, multiple roots, and repeated elements
- Tag Normalization - Automatic whitespace handling and case conversion
- Error Detection - Syntax error detection for unmatched or improperly nested tags

📦 Installation

pip install gravitasml

Or with Poetry:

poetry add gravitasml

🚀 Quick Start

Basic Usage

from gravitasml.token import tokenize
from gravitasml.parser import Parser

# Parse simple markup
markup = "<name>GravitasML</name>"
tokens = tokenize(markup)
parser = Parser(tokens)
result = parser.parse()

print(result)  # {'name': 'GravitasML'}

Nested Structure Example

from gravitasml.token import tokenize
from gravitasml.parser import Parser

markup = """
<person>
    <name>John Doe</name>
    <contact>
        <email>john@example.com</email>
        <phone>555-0123</phone>
    </contact>
</person>
"""

tokens = tokenize(markup)
result = Parser(tokens).parse()

# Result: {
#     'person': {
#         'name': 'John Doe',
#         'contact': {
#             'email': 'john@example.com',
#             'phone': '555-0123'
#         }
#     }
# }
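
Once the result is a plain nested dictionary, ordinary dictionary access applies. The sketch below copies the result shown above into a literal and walks it with a small helper; `dig` is a hypothetical convenience function written for this example, not part of GravitasML.

```python
from typing import Any

# The dictionary below copies the result shown in the nested example.
result = {
    "person": {
        "name": "John Doe",
        "contact": {"email": "john@example.com", "phone": "555-0123"},
    }
}


def dig(data: Any, *keys: str, default: Any = None) -> Any:
    """Follow keys through nested dicts; return default if a key is missing."""
    current = data
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


email = dig(result, "person", "contact", "email")
fax = dig(result, "person", "contact", "fax", default="n/a")
print(email)  # john@example.com
print(fax)    # n/a
```

Returning a default instead of raising `KeyError` can be convenient when an LLM omits an optional tag.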

🎓 Advanced Usage

Pydantic Model Integration

Transform your markup directly into validated Pydantic models:

from pydantic import BaseModel
from gravitasml.token import tokenize
from gravitasml.parser import Parser

class Contact(BaseModel):
    email: str
    phone: str

class Person(BaseModel):
    name: str
    contact: Contact

markup = """
<person>
    <name>Jane Smith</name>
    <contact>
        <email>jane@example.com</email>
        <phone>555-9876</phone>
    </contact>
</person>
"""

tokens = tokenize(markup)
parser = Parser(tokens)
person = parser.parse_to_pydantic(Person)

print(person.name)  # Jane Smith
print(person.contact.email)  # jane@example.com

Handling Repeated Tags

GravitasML automatically converts repeated tags into lists:

from gravitasml.token import tokenize
from gravitasml.parser import Parser

markup = "<tag><a>value1</a><a>value2</a></tag>"
tokens = tokenize(markup)
result = Parser(tokens).parse()
# Result: {'tag': [{'a': 'value1'}, {'a': 'value2'}]}

# Multiple root tags with the same name also become a list
markup2 = "<tag>content1</tag><tag>content2</tag>"
tokens2 = tokenize(markup2)
result2 = Parser(tokens2).parse()
# Result: [{'tag': 'content1'}, {'tag': 'content2'}]
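
Because the result may be either a dictionary or a list depending on how many tags appear, calling code may want to treat both shapes uniformly. The sketch below is one possible approach; `as_list` is a hypothetical helper, and the `single` shape is an assumption made by analogy with the Basic Usage result.

```python
from typing import Any


def as_list(value: Any) -> list:
    """Wrap a single parsed value in a list; pass lists through unchanged."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


# Assumed shape for one root tag, by analogy with the Basic Usage result.
single = {"tag": "content1"}
# Shape copied from the repeated-root example above.
repeated = [{"tag": "content1"}, {"tag": "content2"}]

single_contents = [item["tag"] for item in as_list(single)]
repeated_contents = [item["tag"] for item in as_list(repeated)]
print(single_contents)    # ['content1']
print(repeated_contents)  # ['content1', 'content2']
```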

Tag Name Normalization

Tag names are automatically normalized - spaces become underscores and names are lowercased:

from gravitasml.token import tokenize
from gravitasml.parser import Parser

# Spaces in tag names are converted to underscores
markup = "<User Profile><First Name>Alice</First Name></User Profile>"
tokens = tokenize(markup)
result = Parser(tokens).parse()
# Result: {'user_profile': {'first_name': 'Alice'}}
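
The rule described above can be approximated in a few lines. This is a minimal sketch, not the library's implementation: `normalize_tag_name` is a hypothetical function, and collapsing runs of whitespace into a single underscore is an assumption that GravitasML may handle differently.

```python
def normalize_tag_name(raw: str) -> str:
    """Trim, lowercase, and join whitespace-separated words with underscores."""
    # Assumption: runs of whitespace collapse to one underscore.
    return "_".join(raw.strip().lower().split())


print(normalize_tag_name("User Profile"))  # user_profile
print(normalize_tag_name("First Name"))    # first_name
```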

🏗️ Architecture

GravitasML uses a two-stage parsing approach:

- Tokenization (gravitasml.token) - Converts raw markup into a stream of tokens
- Parsing (gravitasml.parser) - Builds a tree structure and converts to Python objects
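
The two stages can be illustrated with a toy tokenizer and parser. This is a simplified sketch of the general technique, not GravitasML's actual code: `toy_tokenize`, `toy_parse`, the token tuples, and the regular expression are all assumptions made for illustration, and repeated tags and name normalization are deliberately left out.

```python
import re
from typing import Any

# Stage 1: split markup into ("open", name), ("close", name), ("text", value).
TOKEN_RE = re.compile(r"<(/?)([^<>/]+)>|([^<]+)")


def toy_tokenize(markup: str) -> list[tuple[str, str]]:
    tokens = []
    for closing, name, text in TOKEN_RE.findall(markup):
        if name:
            tokens.append(("close" if closing else "open", name.strip()))
        elif text.strip():
            # Whitespace-only text between tags is dropped.
            tokens.append(("text", text.strip()))
    return tokens


# Stage 2: use a stack of (tag name, children, text parts) frames to build
# nested dictionaries. A repeated tag simply overwrites the earlier one here.
def toy_parse(tokens: list[tuple[str, str]]) -> dict[str, Any]:
    root: dict[str, Any] = {}
    stack: list[tuple[str, dict[str, Any], list[str]]] = [("", root, [])]
    for kind, value in tokens:
        if kind == "open":
            stack.append((value, {}, []))
        elif kind == "text":
            stack[-1][2].append(value)
        else:
            name, children, text = stack.pop()
            if name != value or not stack:
                raise SyntaxError(f"unexpected closing tag </{value}>")
            # An element with child tags becomes a dict, otherwise its text.
            stack[-1][1][name] = children if children else " ".join(text)
    if len(stack) != 1:
        raise SyntaxError(f"unclosed tag <{stack[-1][0]}>")
    return root


markup = """
<person>
    <name>John Doe</name>
</person>
"""
print(toy_tokenize(markup))
print(toy_parse(toy_tokenize(markup)))  # {'person': {'name': 'John Doe'}}
```

Keeping the stages separate means the parser only reasons about a flat token list, which makes mismatched or unclosed tags straightforward to detect.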

🧪 Testing

GravitasML comes with a test suite. To run the tests, execute the following command:

python -m unittest discover -v

📊 Dependencies

GravitasML has minimal dependencies:

- Python 3.10, 3.11, or 3.12 (tested in CI)
- Pydantic 2.x (for model validation features)
- Black (development dependency for code formatting)
- Pytest (development dependency)

🤝 Contributing

We welcome contributions! GravitasML uses:

- Poetry for dependency management
- Black for code formatting
- GitHub Actions for CI/CD
- unittest for testing

To contribute:

- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Make your changes and add tests
- Ensure all tests pass and code is formatted with Black
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request

See our CI/CD workflow for the automated checks your PR must pass.

📝 Current Limitations

GravitasML is designed for simplicity. It currently does not support:

- XML namespaces or schema validation
- Tag attributes (e.g., <tag attr="value">)
- Processing instructions or CDATA sections
- Writing/generating markup (parsing only)
- Streaming parsing for very large documents
- Self-closing tags (e.g., <tag />)

These limitations are intentional to keep the library focused and easy to use. If you need these features, consider using Python's built-in xml.etree.ElementTree or third-party libraries like lxml.
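
For comparison, here is a short sketch of how the standard-library `xml.etree.ElementTree` handles two of the features listed above, attributes and self-closing tags. The `config`, `server`, and `debug` element names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative document using attributes and self-closing tags.
doc = '<config><server host="localhost" port="8080" /><debug /></config>'
root = ET.fromstring(doc)

server = root.find("server")
debug = root.find("debug")

print(server.attrib)  # {'host': 'localhost', 'port': '8080'}
print(debug.text)     # None
```

Note that the Python documentation warns that `xml.etree.ElementTree` is not secure against maliciously constructed data, so this route trades away the guarantees described under Security by Design.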

🎯 Philosophy

GravitasML is built on the principle that not every markup parsing task needs the complexity of full XML processing. Sometimes you just want to convert simple markup to Python dictionaries without the overhead of namespaces, DTDs, or complex validation rules.

📄 License

GravitasML is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Built by the AutoGPT Team and used in the AutoGPT project.

Simple markup parsing for modern Python applications.

Project details

Release history

- 0.1.4 (this version): Dec 26, 2025
- 0.1.3: Feb 12, 2025
- 0.1.2: Feb 8, 2025
- 0.1.0: Nov 10, 2023

Download files

Source Distribution

gravitasml-0.1.4.tar.gz (7.4 kB)

Uploaded Dec 26, 2025 Source

Built Distribution

gravitasml-0.1.4-py3-none-any.whl (8.4 kB)

Uploaded Dec 26, 2025 Python 3

File details

Details for the file gravitasml-0.1.4.tar.gz.

File metadata

Download URL: gravitasml-0.1.4.tar.gz
Upload date: Dec 26, 2025
Size: 7.4 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for gravitasml-0.1.4.tar.gz
Algorithm	Hash digest
SHA256	`35d0d9fec7431817482d53d9c976e375557c3e041d1eb6928e809324a8c866e3`
MD5	`8911a2e2451f3270f7723dda7fd21bb2`
BLAKE2b-256	`031489ec16093615cb9b3f6902879140c8ae0895b8133726dfbd78f3fb55a9b5`
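
A downloaded archive can be checked against the SHA256 digest above with the standard library. This is a minimal sketch: `sha256_of` is a hypothetical helper, and the local path assumes the file was saved to the current directory under its published name.

```python
import hashlib
from pathlib import Path

# SHA256 digest copied from the table above.
EXPECTED_SHA256 = "35d0d9fec7431817482d53d9c976e375557c3e041d1eb6928e809324a8c866e3"


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Assumed download location; adjust to wherever the file was saved.
archive = Path("gravitasml-0.1.4.tar.gz")
if archive.exists():
    print("match" if sha256_of(archive) == EXPECTED_SHA256 else "MISMATCH")
else:
    print(f"{archive} not found; download it first")
```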

Provenance

The following attestation bundles were made for gravitasml-0.1.4.tar.gz:

Publisher: cicd.yml on Significant-Gravitas/gravitasml

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: gravitasml-0.1.4.tar.gz
- Subject digest: 35d0d9fec7431817482d53d9c976e375557c3e041d1eb6928e809324a8c866e3
- Sigstore transparency entry: 779878016
- Sigstore integration time: Dec 26, 2025
Source repository:
- Permalink: Significant-Gravitas/gravitasml@9eaa339a2c65e9df32415169fb086d6c0145c5be
- Branch / Tag: refs/tags/v0.1.4
- Owner: https://github.com/Significant-Gravitas
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: cicd.yml@9eaa339a2c65e9df32415169fb086d6c0145c5be
- Trigger Event: release

File details

Details for the file gravitasml-0.1.4-py3-none-any.whl.

File metadata

Download URL: gravitasml-0.1.4-py3-none-any.whl
Upload date: Dec 26, 2025
Size: 8.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for gravitasml-0.1.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`671a18b11d3d8a0e270c6a80c72cd058458b18d5ef7560d00010e962ab1bca74`
MD5	`4b2a81b7501dbfca7be3b5bdcec998da`
BLAKE2b-256	`e1d64fdcac30962e243b7ec5793661ac589b95ca0295808b4a6e89a3aca99b1e`

Provenance

The following attestation bundles were made for gravitasml-0.1.4-py3-none-any.whl:

Publisher: cicd.yml on Significant-Gravitas/gravitasml

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: gravitasml-0.1.4-py3-none-any.whl
- Subject digest: 671a18b11d3d8a0e270c6a80c72cd058458b18d5ef7560d00010e962ab1bca74
- Sigstore transparency entry: 779878017
- Sigstore integration time: Dec 26, 2025
Source repository:
- Permalink: Significant-Gravitas/gravitasml@9eaa339a2c65e9df32415169fb086d6c0145c5be
- Branch / Tag: refs/tags/v0.1.4
- Owner: https://github.com/Significant-Gravitas
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: cicd.yml@9eaa339a2c65e9df32415169fb086d6c0145c5be
- Trigger Event: release