Python library for SOLI data generation

SOLI Data Generator

SOLI Data Generator is a Python package for generating synthetic legal data using the SOLI (Standards for Open Legal Information) knowledge graph. It provides both procedural and LLM-based generation techniques to create realistic legal text and data.

Features

  • Procedural generation using templates with SOLI and Faker tags
  • LLM-based text generation using various AI models
  • Easy integration with the SOLI knowledge graph
  • Flexible and extensible architecture

Installation

You can install SOLI Data Generator using pip:

pip install soli-data-generator
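
To confirm that the install succeeded, you can ask Python for the installed version. This is a minimal sketch that uses only the standard library; the helper name `installed_version` is made up for this example and is not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("soli-data-generator"))
```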

Usage

Procedural Template Generation

from soli import SOLI
from soli_data_generator.procedural.template import TemplateFormatter

# Initialize the SOLI graph (not referenced again in this snippet)
soli_graph = SOLI()

# Initialize the TemplateFormatter
formatter = TemplateFormatter()

# Define a template with SOLI and Faker tags
template = """
Company: <|company|>
Industry: <|industry|>
Legal Issue: <|area_of_law|>
Document Type: <|document_artifact|>
"""

# Format the template
formatted_text = formatter.format(template)

print(formatted_text)
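
To illustrate the general idea behind `<|tag|>` substitution, here is a small self-contained sketch. It is not the package's implementation: the `TAG_VALUES` dictionary, the `fill_template` function, and the sample values are all assumptions made for this example. The real `TemplateFormatter` resolves tags through the SOLI knowledge graph and Faker, whereas this sketch draws from a hard-coded dictionary.

```python
import random
import re

# Hypothetical tag vocabulary, hard-coded for illustration only.
TAG_VALUES = {
    "company": ["Acme Holdings LLC", "Northwind Partners"],
    "industry": ["Insurance", "Software"],
    "area_of_law": ["Contract Law", "Employment Law"],
    "document_artifact": ["Complaint", "Lease Agreement"],
}

# Matches tags written as <|name|>, capturing the name.
TAG_PATTERN = re.compile(r"<\|([a-z_]+)\|>")


def fill_template(template: str, rng: random.Random) -> str:
    """Replace each known <|tag|> with a random value; leave unknown tags as-is."""

    def substitute(match: re.Match) -> str:
        values = TAG_VALUES.get(match.group(1))
        return rng.choice(values) if values else match.group(0)

    return TAG_PATTERN.sub(substitute, template)


example = fill_template(
    "Company: <|company|>\nLegal Issue: <|area_of_law|>\nCourt: <|court|>",
    random.Random(0),  # seeded so repeated runs give the same text
)
print(example)
```

In this sketch, unknown tags such as `<|court|>` are left in place rather than raising an error, which makes missing vocabulary easy to spot in the output. Whether the package behaves the same way is not stated here.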

LLM-based Text Generation

from alea_llm_client import VLLMModel
from soli_data_generator.llm.text import TextGenerator

# Initialize the VLLM model
model = VLLMModel()

# Initialize the TextGenerator
generator = TextGenerator(model)

# Generate text
generated_text = generator()

print(generated_text)
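
The snippet above follows a simple pattern: a generator object wraps a model and is then called like a function. The sketch below shows that pattern in isolation so it can run without a model server. Every name in it (`CompletionModel`, `EchoModel`, `SimpleTextGenerator`, and the `complete` method) is hypothetical and does not reflect the actual `alea_llm_client` or `TextGenerator` APIs.

```python
from typing import Optional, Protocol


class CompletionModel(Protocol):
    """Hypothetical interface: any object with complete(prompt) -> str."""

    def complete(self, prompt: str) -> str: ...


class EchoModel:
    """Stand-in model so the sketch runs without a server or API key."""

    def complete(self, prompt: str) -> str:
        return f"[generated from: {prompt}]"


class SimpleTextGenerator:
    """Callable wrapper that holds a model and a default prompt."""

    def __init__(self, model: CompletionModel, prompt: str = "Write a short legal notice.") -> None:
        self.model = model
        self.prompt = prompt

    def __call__(self, prompt: Optional[str] = None) -> str:
        # Fall back to the default prompt when none is supplied.
        return self.model.complete(prompt if prompt is not None else self.prompt)


generator = SimpleTextGenerator(EchoModel())
text = generator()
print(text)
```

Keeping the model behind a small interface like this means a stub can replace the real model in tests.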

Examples

For more detailed examples, please check the examples/ directory in this repository.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License.

Contact

For any questions or concerns, please open an issue on the GitHub repository.
