Python library for SOLI data generation
SOLI Data Generator
SOLI Data Generator is a Python package for generating synthetic legal data using the SOLI (Standards for Open Legal Information) knowledge graph. It provides both procedural and LLM-based generation techniques to create realistic legal text and data.
Features
- Procedural generation using templates with SOLI and Faker tags
- LLM-based text generation using various AI models
- Easy integration with the SOLI knowledge graph
- Flexible and extensible architecture
Installation
You can install SOLI Data Generator using pip:
pip install soli-data-generator
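To confirm that the install succeeded, the installed version can be looked up through the standard library. This is a minimal sketch, not part of the package: `installed_version` is a hypothetical helper name, and the only assumption about the package is the distribution name `soli-data-generator` used in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("soli-data-generator")
    print(found if found else "soli-data-generator is not installed")
```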
Usage
Procedural Template Generation
from soli import SOLI
from soli_data_generator.procedural.template import TemplateFormatter
# Initialize the SOLI graph
soli_graph = SOLI()
# Initialize the TemplateFormatter
formatter = TemplateFormatter()
# Define a template with SOLI and Faker tags
template = """
Company: <|company|>
Industry: <|industry|>
Legal Issue: <|area_of_law|>
Document Type: <|document_artifact|>
"""
# Format the template
formatted_text = formatter.format(template)
print(formatted_text)
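To illustrate the idea behind `<|tag|>` substitution, here is a minimal standalone sketch. It is not the TemplateFormatter implementation: `fill_template`, `SAMPLE_VALUES`, and the sample strings are hypothetical, and the real package draws its values from the SOLI knowledge graph and Faker rather than from a hard-coded dictionary.

```python
import random
import re

# Hypothetical sample values keyed by tag name (an assumption for illustration;
# the real package resolves tags against the SOLI graph and Faker).
SAMPLE_VALUES = {
    "company": ["Acme Holdings LLC", "Northwind Traders Inc."],
    "industry": ["Insurance", "Construction"],
    "area_of_law": ["Contract Law", "Employment Law"],
    "document_artifact": ["Complaint", "Settlement Agreement"],
}

# Matches tags of the form <|name|>, as used in the template above.
TAG_PATTERN = re.compile(r"<\|([a-z_]+)\|>")


def fill_template(template: str, values: dict, seed=None) -> str:
    """Replace each known <|tag|> with a randomly chosen sample value."""
    rng = random.Random(seed)

    def replace(match: re.Match) -> str:
        options = values.get(match.group(1))
        if not options:
            # Leave unknown tags untouched so they are easy to spot.
            return match.group(0)
        return rng.choice(options)

    return TAG_PATTERN.sub(replace, template)


if __name__ == "__main__":
    print(fill_template("Company: <|company|>\nLegal Issue: <|area_of_law|>", SAMPLE_VALUES))
```

Passing a `seed` makes the output reproducible, which can help when generated data feeds into tests.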
LLM-based Text Generation
from alea_llm_client import VLLMModel
from soli_data_generator.llm.text import TextGenerator
# Initialize the VLLM model
model = VLLMModel()
# Initialize the TextGenerator
generator = TextGenerator(model)
# Generate text
generated_text = generator()
print(generated_text)
Examples
For more detailed examples, see the examples/ directory in this repository.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is licensed under the MIT License.
Contact
For any questions or concerns, please open an issue on the GitHub repository.
File details
Details for the file soli_data_generator-0.1.0.tar.gz.
File metadata
- Download URL: soli_data_generator-0.1.0.tar.gz
- Upload date:
- Size: 8.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.3 CPython/3.12.3 Linux/6.8.0-41-generic
File hashes
Algorithm | Hash digest
---|---
SHA256 | 425dbc64f1804e0dbff0b5c7641e11690dc14469c65e8335eb7d3c5be5a561f3
MD5 | cf79937c99e7aab873ba19deb9b6ff15
BLAKE2b-256 | e946fe4c49a9badaf506d8eff4252f7ec07728aa7daa64eed33cf3fb86cac9c0
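A downloaded archive can be checked against the SHA256 digest listed above for soli_data_generator-0.1.0.tar.gz. The following is a minimal sketch using only the standard library; the helper names `sha256_of` and `matches_published_hash` are hypothetical, and the local file path is an assumption about where the download was saved.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for the source distribution.
EXPECTED_SHA256 = "425dbc64f1804e0dbff0b5c7641e11690dc14469c65e8335eb7d3c5be5a561f3"


def sha256_of(path, chunk_size: int = 65536) -> str:
    """Compute the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected: str = EXPECTED_SHA256) -> bool:
    """Return True if the file's SHA256 digest equals the expected digest."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    # Assumed download location; adjust to wherever the file was saved.
    archive = Path("soli_data_generator-0.1.0.tar.gz")
    if archive.exists():
        print("hash OK" if matches_published_hash(archive) else "hash MISMATCH")
    else:
        print(f"{archive} not found")
```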
File details
Details for the file soli_data_generator-0.1.0-py3-none-any.whl.
File metadata
- Download URL: soli_data_generator-0.1.0-py3-none-any.whl
- Upload date:
- Size: 9.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.3 CPython/3.12.3 Linux/6.8.0-41-generic
File hashes
Algorithm | Hash digest
---|---
SHA256 | bfe07fe06bd8c5e1cacc7bae5a0a00b1a1e9c00b812ccfa07dbafaabf39075f3
MD5 | 5b6b24558d60eb8e8ba65838d2f399df
BLAKE2b-256 | f6838acb2b7e7f9868a8998c6e901e199d3499eadd9170639251d54414738ea4