Python bindings for Rudof

Rudof Python bindings

The Python bindings for rudof are called pyrudof. They are available on PyPI.

For more information, see the Read the Docs documentation. Several rudof tutorials are kept as Jupyter notebooks at https://rudof-project.github.io/tutorials.

After compiling and installing this module, a Python library called pyrudof should be available.
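One way to confirm that the installation worked is to ask Python whether the module can be found. The sketch below uses only the standard library; the helper name describe_install is an illustrative assumption and is not part of pyrudof.

```python
import importlib.metadata
import importlib.util


def describe_install(module_name="pyrudof"):
    """Return a short message saying whether `module_name` can be imported."""
    if importlib.util.find_spec(module_name) is None:
        return f"{module_name} is not installed in this environment"
    try:
        version = importlib.metadata.version(module_name)
    except importlib.metadata.PackageNotFoundError:
        # Importable, but no distribution metadata is registered under this name.
        version = "unknown"
    return f"{module_name} is installed (version {version})"


print(describe_install())
```

Run it with the same interpreter (or virtual environment) you used for pip install, since each environment has its own set of installed packages.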

Build the development version

This module is based on pyo3 and maturin.

To build and install the development version of pyrudof you need to clone this git repository, go to the python directory (the one this README is in) and run:

pip install maturin

followed by:

pip install .

If you are using a virtual environment (.venv), you can create one as follows:

python3 -m venv .venv

followed by:

source .venv/bin/activate

or, if you use the fish shell:

source .venv/bin/activate.fish

Once the environment is active, you can install the package locally in editable mode with:

pip install -e .

Running the tests

Go to the tests folder:

cd tests

and run:

python3 -m unittest discover -vvv
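As a rough idea of what a test picked up by unittest discovery might look like, here is a minimal smoke test. It is a sketch, not one of the project's actual tests: the class name and the suggested file name test_smoke.py are assumptions, and the test is skipped when pyrudof is not installed.

```python
import importlib.util
import unittest

# True only when the pyrudof module can be found in the current environment.
HAS_PYRUDOF = importlib.util.find_spec("pyrudof") is not None


@unittest.skipUnless(HAS_PYRUDOF, "pyrudof is not installed")
class GeneratorConfigSmokeTest(unittest.TestCase):
    def test_config_can_be_created(self):
        import pyrudof

        # Both calls appear in the examples of this README.
        config = pyrudof.GeneratorConfig()
        config.set_entity_count(10)
        self.assertIsNotNone(config)
```

Saved under a name matching test*.py, a file like this would be found by the discover command above.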

Using rudof_generate

The pyrudof package includes bindings for rudof_generate, which allows you to generate synthetic RDF data from ShEx or SHACL schemas.

Basic Example

import pyrudof

# Create configuration
config = pyrudof.GeneratorConfig()
config.set_entity_count(100)
config.set_output_path("output.ttl")
config.set_output_format(pyrudof.OutputFormat.Turtle)

# Create generator
generator = pyrudof.DataGenerator(config)

# Load schema and generate data
generator.run("schema.shex")
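The basic example expects a schema.shex file to exist. A minimal sketch for creating one from Python is shown here. The :Person shape, the example.org prefix and the helper name write_example_schema are illustrative assumptions, not part of pyrudof, and the schema has not been checked against rudof_generate.

```python
import tempfile
from pathlib import Path

# A small ShEx schema in compact syntax; the shape and prefixes are illustrative.
EXAMPLE_SHEX = """\
PREFIX : <http://example.org/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

:Person {
  :name xsd:string ;
  :age  xsd:integer ?
}
"""


def write_example_schema(directory):
    """Write the example schema to `directory/schema.shex` and return its path."""
    path = Path(directory) / "schema.shex"
    path.write_text(EXAMPLE_SHEX, encoding="utf-8")
    return path


# Demonstrate the helper in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    schema_path = write_example_schema(tmp)
    print(schema_path.read_text(encoding="utf-8"))
```

To use the file with the basic example, write it to a directory you keep and pass str(schema_path) to generator.run().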

Configuration Options

The GeneratorConfig class provides many configuration options:

config = pyrudof.GeneratorConfig()

# Generation parameters
config.set_entity_count(1000)           # Number of entities to generate
config.set_seed(42)                     # Random seed for reproducibility

# Schema format
config.set_schema_format(pyrudof.SchemaFormat.ShEx)  # or SchemaFormat.SHACL

# Output configuration
config.set_output_path("data.ttl")
config.set_output_format(pyrudof.OutputFormat.Turtle)  # or OutputFormat.NTriples
config.set_compress(False)              # Whether to compress output
config.set_write_stats(True)            # Write generation statistics

# Cardinality strategy
config.set_cardinality_strategy(pyrudof.CardinalityStrategy.Balanced)
# Options: Minimum, Maximum, Random, Balanced

# Parallel processing
config.set_worker_threads(4)            # Number of worker threads
config.set_batch_size(100)              # Batch size for processing
config.set_parallel_writing(True)       # Enable parallel file writing
config.set_parallel_file_count(4)       # Number of output files (when parallel)
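Because every option above follows the set_<name> pattern, a group of settings can be applied from a dictionary. The helper below is a sketch and not part of pyrudof. The RecordingConfig class is a hypothetical stand-in for pyrudof.GeneratorConfig, so the example runs without the package installed.

```python
def apply_options(config, options):
    """Call config.set_<name>(value) for each entry in `options`."""
    for name, value in options.items():
        setter = getattr(config, f"set_{name}", None)
        if setter is None:
            raise AttributeError(
                f"no setter named set_{name} on {type(config).__name__}"
            )
        setter(value)
    return config


class RecordingConfig:
    """Hypothetical stand-in for pyrudof.GeneratorConfig that records setter calls."""

    def __init__(self):
        self.values = {}

    def set_entity_count(self, count):
        self.values["entity_count"] = count

    def set_seed(self, seed):
        self.values["seed"] = seed


config = apply_options(RecordingConfig(), {"entity_count": 1000, "seed": 42})
print(config.values)  # {'entity_count': 1000, 'seed': 42}
```

With pyrudof installed, the same helper could presumably be called as apply_options(pyrudof.GeneratorConfig(), {...}). A misspelled option name fails with an AttributeError.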

Loading Schemas

You can load schemas in different ways:

# Load ShEx schema
generator.load_shex_schema("schema.shex")

# Load SHACL schema
generator.load_shacl_schema("shapes.ttl")

# Auto-detect schema format
generator.load_schema_auto("schema_file")

# Then generate data
generator.generate()
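If a script handles both kinds of schema, the loader can be chosen from the file extension. This is a sketch under assumptions: the helper load_schema is not part of pyrudof, and treating every .ttl file as SHACL only mirrors the file names used in this README. RecordingGenerator is a hypothetical stand-in for pyrudof.DataGenerator, so the example runs on its own.

```python
from pathlib import Path


def load_schema(generator, schema_path):
    """Pick a loader method from the file extension and call it."""
    suffix = Path(schema_path).suffix.lower()
    if suffix == ".shex":
        generator.load_shex_schema(schema_path)
        return "shex"
    if suffix == ".ttl":
        # Assumption: .ttl files hold SHACL shapes, as in this README's examples.
        generator.load_shacl_schema(schema_path)
        return "shacl"
    # Unknown extension: fall back to format auto-detection.
    generator.load_schema_auto(schema_path)
    return "auto"


class RecordingGenerator:
    """Hypothetical stand-in for pyrudof.DataGenerator that records calls."""

    def __init__(self):
        self.calls = []

    def load_shex_schema(self, path):
        self.calls.append(("load_shex_schema", path))

    def load_shacl_schema(self, path):
        self.calls.append(("load_shacl_schema", path))

    def load_schema_auto(self, path):
        self.calls.append(("load_schema_auto", path))


generator = RecordingGenerator()
for name in ("schema.shex", "shapes.ttl", "schema_file"):
    load_schema(generator, name)
print(generator.calls)
```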

Complete Workflow

The run() method provides a convenient way to load a schema and generate data in one step:

# Auto-detect format
generator.run("schema.shex")

# Specify format explicitly
generator.run_with_format("shapes.ttl", pyrudof.SchemaFormat.SHACL)

Configuration Files

You can also load configuration from TOML or JSON files:

# Load from TOML
config = pyrudof.GeneratorConfig.from_toml_file("config.toml")

# Load from JSON
config = pyrudof.GeneratorConfig.from_json_file("config.json")

# Save configuration
config.to_toml_file("saved_config.toml")

Available Enums

  • SchemaFormat: ShEx, SHACL
  • OutputFormat: Turtle, NTriples
  • CardinalityStrategy: Minimum, Maximum, Random, Balanced

For more examples, see the examples/generate_example.py file.

Project details


This version

0.2.6

Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

  • pyrudof-0.2.6-cp37-abi3-win_amd64.whl (8.6 MB): CPython 3.7+, Windows x86-64
  • pyrudof-0.2.6-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.9 MB): CPython 3.7+, manylinux: glibc 2.17+ x86-64
  • pyrudof-0.2.6-cp37-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (9.7 MB): CPython 3.7+, manylinux: glibc 2.17+ ARM64
  • pyrudof-0.2.6-cp37-abi3-macosx_11_0_arm64.whl (8.9 MB): CPython 3.7+, macOS 11.0+ ARM64
  • pyrudof-0.2.6-cp37-abi3-macosx_10_12_x86_64.whl (9.2 MB): CPython 3.7+, macOS 10.12+ x86-64

File details

Details for the file pyrudof-0.2.6-cp37-abi3-win_amd64.whl.

File metadata

  • Download URL: pyrudof-0.2.6-cp37-abi3-win_amd64.whl
  • Upload date:
  • Size: 8.6 MB
  • Tags: CPython 3.7+, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: maturin/1.12.6

File hashes

Hashes for pyrudof-0.2.6-cp37-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 807398fe690e6dade3d82be6ec2a8829ae9ba661036836e0e5ae5a6900c4e12e
MD5 bc06af4cf56bae2ce0b79b6075284a43
BLAKE2b-256 72dc6b83426f70f1edcbdeaa854fbfb24a084ce18a89dfe92b170c1f1119095b

See more details on using hashes here.
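A downloaded wheel can be checked against the SHA256 digest listed for it. The sketch below uses only the standard library hashlib module; the function names sha256_of_file and matches_expected are illustrative assumptions.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare the file's digest with a published digest, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example call, assuming the wheel has been downloaded to the current directory:
# matches_expected(
#     "pyrudof-0.2.6-cp37-abi3-win_amd64.whl",
#     "807398fe690e6dade3d82be6ec2a8829ae9ba661036836e0e5ae5a6900c4e12e",
# )
```

Reading in chunks keeps memory use low for files of several megabytes, such as these wheels.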

File details

Details for the file pyrudof-0.2.6-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for pyrudof-0.2.6-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 18ef3aeb63db41cdc7fd2629ad25e9b9ae79ae04b59080ea97e54c2ce9bc2a9e
MD5 c1a39ffab51b9799e24ee552a48b4474
BLAKE2b-256 6b87dc50b66a21d813dc7ceb642ed3cd9378b21b8e9823ce9b8c0d52542c53ec

File details

Details for the file pyrudof-0.2.6-cp37-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for pyrudof-0.2.6-cp37-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 6f13d8062c31974193695ae2bf5bded3760a044d62771a3b32374c3f40a2070f
MD5 3ac7112e377ad1b770bfa26304a8462f
BLAKE2b-256 f02b5e886c8c254abcc9a56d94e75afed71d985ea955c70ac95b767879d36913

File details

Details for the file pyrudof-0.2.6-cp37-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for pyrudof-0.2.6-cp37-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 2e744a1d313e3e7d8edc8be37e5f63c6184432ff57fe8397f8d34dae4872c19b
MD5 8048dbaea42df8115f3faa97300f1f13
BLAKE2b-256 0efb894e1127e747c5da491adaf924302e1899033e1e80ac5c92d124c1390195

File details

Details for the file pyrudof-0.2.6-cp37-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for pyrudof-0.2.6-cp37-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 1e25b55952194f0545fa5f85b08bccdcf0ac14546923092f3ae3885191b031ac
MD5 03a0d7c973b9d999a8a225cc1fbc41ad
BLAKE2b-256 e9e0177330302d968b2bdaeab32709fe0f37255b84ce9220afc55b7aaabf7887
