Rudof Python bindings

The Python bindings for rudof are called pyrudof. They are available on PyPI.

For more information, see the Read the Docs documentation. Several rudof tutorials are kept as Jupyter notebooks at https://rudof-project.github.io/tutorials.

After compiling and installing this module, a Python library called pyrudof should be available.

Build the development version

This module is based on pyo3 and maturin.

To build and install the development version of pyrudof you need to clone this git repository, go to the python directory (the one this README is in) and run:

pip install maturin

followed by:

pip install .

If you use a virtual environment (.venv), create it with:

python3 -m venv .venv

followed by:

source .venv/bin/activate

or, if you use the fish shell:

source .venv/bin/activate.fish

Once the environment is active, you can install the package locally in editable mode with:

pip install -e .
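
After either install command finishes, a quick way to confirm that the module is visible to the active interpreter is to look it up without importing it. The following is a minimal sketch using only the standard library; the helper name `is_importable` is hypothetical and not part of pyrudof.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if the active interpreter can locate the module."""
    # For a top-level name, find_spec locates the module without importing it.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("pyrudof importable:", is_importable("pyrudof"))
```

If this reports False inside a virtual environment, the install most likely ran against a different interpreter than the one executing the check.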

Running the tests

Go to the tests folder:

cd tests

and run:

python3 -m unittest discover -vvv

Using rudof_generate

The pyrudof package includes bindings for rudof_generate, which allows you to generate synthetic RDF data from ShEx or SHACL schemas.

Basic Example

import pyrudof

# Create configuration
config = pyrudof.GeneratorConfig()
config.set_entity_count(100)
config.set_output_path("output.ttl")
config.set_output_format(pyrudof.OutputFormat.Turtle)

# Create generator
generator = pyrudof.DataGenerator(config)

# Load schema and generate data
generator.run("schema.shex")
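
The basic example can be wrapped in a small function so that the entity count and the paths become parameters. This is a sketch that uses only the calls shown above; the function name `generate_turtle` and its `rudof` parameter (which lets a stand-in module be passed in for testing) are assumptions for illustration, not part of pyrudof.

```python
def generate_turtle(schema_path, output_path, entity_count=100, rudof=None):
    """Generate Turtle data from a schema using the calls from the basic example."""
    if rudof is None:
        # Imported lazily so the function can be defined without pyrudof installed.
        import pyrudof as rudof

    # Same configuration steps as the basic example, with parameters.
    config = rudof.GeneratorConfig()
    config.set_entity_count(entity_count)
    config.set_output_path(output_path)
    config.set_output_format(rudof.OutputFormat.Turtle)

    # run() loads the schema and generates the data in one step.
    generator = rudof.DataGenerator(config)
    generator.run(schema_path)
    return output_path
```

With pyrudof installed, `generate_turtle("schema.shex", "output.ttl")` would perform the same steps as the basic example.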

Configuration Options

The GeneratorConfig class provides many configuration options:

config = pyrudof.GeneratorConfig()

# Generation parameters
config.set_entity_count(1000)           # Number of entities to generate
config.set_seed(42)                     # Random seed for reproducibility

# Schema format
config.set_schema_format(pyrudof.SchemaFormat.ShEx)  # or SchemaFormat.SHACL

# Output configuration
config.set_output_path("data.ttl")
config.set_output_format(pyrudof.OutputFormat.Turtle)  # or OutputFormat.NTriples
config.set_compress(False)              # Whether to compress output
config.set_write_stats(True)            # Write generation statistics

# Cardinality strategy
config.set_cardinality_strategy(pyrudof.CardinalityStrategy.Balanced)
# Options: Minimum, Maximum, Random, Balanced

# Parallel processing
config.set_worker_threads(4)            # Number of worker threads
config.set_batch_size(100)              # Batch size for processing
config.set_parallel_writing(True)       # Enable parallel file writing
config.set_parallel_file_count(4)       # Number of output files (when parallel)
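
Because every option above is set through a `set_<name>` method, a plain dictionary of settings can be applied in a loop. This is a minimal sketch, assuming that naming pattern holds for each key passed in; the helper `apply_options` is hypothetical and not part of pyrudof.

```python
def apply_options(config, options):
    """Call config.set_<name>(value) for every entry in options."""
    for name, value in options.items():
        setter = getattr(config, f"set_{name}", None)
        if not callable(setter):
            # Fail loudly on a misspelled or unsupported option name.
            raise AttributeError(f"config has no setter for option {name!r}")
        setter(value)
    return config
```

For example, `apply_options(pyrudof.GeneratorConfig(), {"entity_count": 1000, "seed": 42})` would call `set_entity_count(1000)` and `set_seed(42)` in turn.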

Loading Schemas

You can load schemas in different ways:

# Load ShEx schema
generator.load_shex_schema("schema.shex")

# Load SHACL schema
generator.load_shacl_schema("shapes.ttl")

# Auto-detect schema format
generator.load_schema_auto("schema_file")

# Then generate data
generator.generate()
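
When the schema path is only known at run time, the choice between the three loaders can be made from the file extension. This is a minimal sketch: the helper `load_schema_by_extension` is hypothetical, and the extension mapping (`.shex` for ShEx, `.ttl` for SHACL) is an assumption taken from the file names used in the examples above. Any other extension falls back to `load_schema_auto`.

```python
from pathlib import Path


def load_schema_by_extension(generator, schema_path):
    """Pick one of the generator's schema loaders from the file extension."""
    suffix = Path(schema_path).suffix.lower()
    if suffix == ".shex":
        generator.load_shex_schema(schema_path)
        return "shex"
    if suffix == ".ttl":
        # Assumption: a Turtle file passed as a schema holds SHACL shapes.
        generator.load_shacl_schema(schema_path)
        return "shacl"
    # Unknown extension: let the library detect the format.
    generator.load_schema_auto(schema_path)
    return "auto"
```

The return value only reports which loader was chosen; `generator.generate()` still has to be called afterwards.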

Complete Workflow

The run() method provides a convenient way to load a schema and generate data in one step:

# Auto-detect format
generator.run("schema.shex")

# Specify format explicitly
generator.run_with_format("shapes.ttl", pyrudof.SchemaFormat.SHACL)

Configuration Files

You can also load configuration from TOML or JSON files:

# Load from TOML
config = pyrudof.GeneratorConfig.from_toml_file("config.toml")

# Load from JSON
config = pyrudof.GeneratorConfig.from_json_file("config.json")

# Save configuration
config.to_toml_file("saved_config.toml")

Available Enums

  • SchemaFormat: ShEx, SHACL
  • OutputFormat: Turtle, NTriples
  • CardinalityStrategy: Minimum, Maximum, Random, Balanced
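
When enum values arrive as strings, for instance from command-line arguments, they can be mapped to the members listed above by attribute lookup. This is a minimal sketch: the helper `resolve_enum` and the `KNOWN_MEMBERS` table are hypothetical, and the table only repeats the member names listed in this README.

```python
# Member names as listed in this README.
KNOWN_MEMBERS = {
    "SchemaFormat": ("ShEx", "SHACL"),
    "OutputFormat": ("Turtle", "NTriples"),
    "CardinalityStrategy": ("Minimum", "Maximum", "Random", "Balanced"),
}


def resolve_enum(rudof, enum_name, member_name):
    """Return rudof.<enum_name>.<member>, matching the member name case-insensitively."""
    members = KNOWN_MEMBERS[enum_name]
    for candidate in members:
        if candidate.lower() == member_name.lower():
            return getattr(getattr(rudof, enum_name), candidate)
    raise ValueError(f"{member_name!r} is not one of: {', '.join(members)}")
```

With pyrudof imported, `resolve_enum(pyrudof, "OutputFormat", "turtle")` would look up `pyrudof.OutputFormat.Turtle`.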

For more examples, see the examples/generate_example.py file.
