A collection of interop, core, and orchestration services for the bclearer framework

bCLEARer Pipeline Development Kit (PDK)

The bCLEARer Pipeline Development Kit (PDK) bundles the libraries, scaffolding tools, and reference assets used to build semantic data pipelines on the bCLEARer platform. It delivers the core building blocks for configuration, data interoperability, orchestration, and ontology modelling so you can go from a pipeline blueprint to a running implementation quickly.

Highlights

  • Generate complete pipeline skeletons with the bclearer-pipeline-builder CLI (interactive authoring, JSON-driven creation, structural updates, and template extraction).
  • Connect to the ecosystems your pipelines touch: CSV/Excel/JSON, Delta Lake, PySpark, HDF5, MongoDB, MS Access, PostgreSQL, SQL Server, Neo4j, CozoDB, Raphtory, Enterprise Architect, and more.
  • Operate pipelines confidently with orchestration helpers covering app lifecycle management, UUID/identity services, logging, reporting, static analysis, version-control utilities, and unit-of-measure management.
  • Model your universe with the BNOP ontology module—our BORO Native Objects implementation featuring factories, relationship management, and XML migrations.

Workspace Packages

  • bclearer-core (libraries/core) – Configuration managers, canonical identifiers (CKIDs), pipeline stage definitions, and the pipeline builder engine and CLI.
  • bclearer-interop-services (libraries/interop_services) – Data I/O adapters and transformations spanning DataFrames, Parquet/Delta, document stores, graph backends (Neo4j, Raphtory, CozoDB), RDBMS connectors, EA integrations, and session orchestration.
  • bclearer-orchestration-services (libraries/orchestration_services) – Application runner wrappers, logging and reporting helpers, UUID generation, static code analysis, string/Unicode tooling, unit-of-measure libraries, and version-control services.
  • bnop (libraries/ontology) – BORO Native Objects (Python) ontology runtime with factories, facades, migrations, and serializers used across bCLEARer pipelines.

Repository Layout

  • pipelines/ – reference pipelines generated by the builder (template domain, BOSON, CFI, Uniclass).
  • documentation/ – architecture notes and feature blueprints (pipeline framework, Neo4j, Raphtory, universe designer, RDF/Jena, and more).
  • docker/ – container recipes for running services locally.
  • release_management/ – scripts supporting builds and releases.
  • ui/ – the React-based tooling used to drive pipeline authoring experiences.

Getting Started

Requires Python 3.12+.

Install the workspace (recommended)

pip install uv
uv sync
source .venv/bin/activate

uv sync installs all workspace members (bclearer-core, bclearer-interop-services, bclearer-orchestration-services, bnop) in editable mode.

Alternative: standard pip

python -m venv .venv
source .venv/bin/activate
pip install -e .

Install individual packages from PyPI if you only need a subset, for example pip install bclearer-core.

Pipeline builder CLI

The pipeline builder turns JSON (or interactive prompts) into a fully structured bCLEARer pipeline: domains, pipelines, thin slices, stages, sub-stages, orchestrators, and b-units.

# Generate a sample configuration file
bclearer-pipeline-builder sample --output pipeline_config.json

# Create a pipeline in the current directory
bclearer-pipeline-builder create --config pipeline_config.json --output ./pipelines

# Update an existing pipeline from configuration
bclearer-pipeline-builder update --config pipeline_config.json --pipeline ./pipelines/example_domain

# Extract templates from a curated pipeline
bclearer-pipeline-builder update-templates --template-path pipelines/template_pipeline

Run bclearer-pipeline-builder help or python -m bclearer_core.pipeline_builder help for the full command reference. The generated pipelines follow the bCLEARer pipeline framework.
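The configuration consumed by create and update is plain JSON. The authoritative schema comes from bclearer-pipeline-builder sample; the sketch below only illustrates the general shape, and every field name in it is an assumption, not the documented schema.

```python
# Illustrative only: the field names below are guesses, NOT the documented
# schema. Run `bclearer-pipeline-builder sample` to get the real template.
import json

config = {
    "domain_name": "example_domain",
    "pipelines": [
        {
            "pipeline_name": "ingest_products",
            "stages": ["load", "clean", "export"],
        }
    ],
}

# Write the configuration where the CLI's --config flag can pick it up.
with open("pipeline_config.json", "w") as config_file:
    json.dump(config, config_file, indent=2)
```

Once the file exists, it can be passed to the create command shown above.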

Working with the libraries

Data interchange

from bclearer_interop_services.b_dictionary_service.table_as_dictionary_service import (
    TableAsDictionaryFromCsvFileReader,
    TableAsDictionaryToDataFrameConverter,
)

reader = TableAsDictionaryFromCsvFileReader()
table_dict = reader.read("data/example.csv")

converter = TableAsDictionaryToDataFrameConverter()
dataframe = converter.convert(table_dict)

Beyond CSV and DataFrames you will find adapters for Excel, JSON, XML, HDF5, Parquet/Delta Lake, PySpark sessions, MongoDB, MS Access, PostgreSQL, SQL Server, CozoDB, Neo4j, Raphtory, Enterprise Architect, filesystem snapshots, and more.
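The "table as dictionary" shape used above maps each column name to a list of cell values. To see the idea without the library, here is a minimal standalone sketch; it is an illustration of the structure, not the PDK implementation.

```python
# Minimal illustration of the table-as-dictionary idea. This is NOT the
# PDK implementation, only the column-oriented structure it works with.
import csv
import io

def read_table_as_dictionary(csv_text: str) -> dict[str, list[str]]:
    """Read CSV text into a {column_name: [cell values]} mapping."""
    reader = csv.DictReader(io.StringIO(csv_text))
    table: dict[str, list[str]] = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            table[name].append(value)
    return table

table = read_table_as_dictionary("sku,price\nA1,9.99\nB2,4.50\n")
# table == {"sku": ["A1", "B2"], "price": ["9.99", "4.50"]}
```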

Ontology modelling

from bnop.bnop_facades import BnopFacades
from bclearer_orchestration_services.identification_services.uuid_service.uuid_helpers.uuid_factory import (
    create_new_uuid,
)

repository_uuid = create_new_uuid()

product_type = BnopFacades.create_new_bnop_type(repository_uuid)
product = BnopFacades.create_bnop_object(
    object_uuid=create_new_uuid(),
    owning_repository_uuid=repository_uuid,
    presentation_name="Example Product",
)

BnopFacades.write_bnop_object_to_xml("bnop_snapshot.xml")

Use CKIDs from bclearer_core.ckids to classify tuples and relationships when you need richer BORO semantics.

Orchestration helpers

from bclearer_orchestration_services.b_app_runner_service.b_application_runner import run_b_application

def bootstrap():
    print("hello bCLEARer")

run_b_application(bootstrap)
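Conceptually, an application runner wraps your entry point with lifecycle concerns such as start/finish logging and error capture. The sketch below shows that shape in plain Python; it is an illustration of the pattern, not the run_b_application implementation.

```python
# Illustrative wrapper, NOT the PDK's run_b_application: it only shows the
# lifecycle shape (log start, run the entry point, capture failure, log end).
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("app_runner_sketch")

def run_application(entry_point) -> bool:
    """Run entry_point with basic lifecycle logging; return a success flag."""
    log.info("application starting")
    try:
        entry_point()
        return True
    except Exception:
        log.exception("application failed")
        return False
    finally:
        log.info("application finished")

def bootstrap():
    print("hello bCLEARer")

succeeded = run_application(bootstrap)
# succeeded is True; a raising entry point would yield False instead
```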

Complement this with utilities from identification_services, log_environment_utility_service, static_code_analysis_service, and version_control_services to manage runtime behaviour and governance.

Testing & quality gates

pytest                     # run the complete suite
pytest -m "not heavy"      # skip connectors that rely on external services
ruff check                 # lint
black .                    # format

Most tests live under libraries/*/tests. Heavy tests target databases or graph backends and are opt-in via the heavy marker.
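The opt-in mechanism is pytest's standard marker machinery. Assuming the heavy marker is registered in the project's pytest configuration, a test that needs an external service would be tagged like this (the test name and body are hypothetical):

```python
import pytest

# Deselected by `pytest -m "not heavy"`; needs a running database to pass.
@pytest.mark.heavy
def test_postgres_connector_roundtrip():
    ...  # hypothetical: connect, write a row, read it back
```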

Documentation & next steps

  • Architecture overview: documentation/bclearer_pipeline_framework.md
  • Feature workstreams: documentation/features/
  • UI tooling walkthroughs: documentation/ui/

Contributing

  1. Fork the repository and create a feature branch.
  2. Sync dependencies (uv sync or pip install -e .).
  3. Add tests where sensible and run the quality gates.
  4. Submit a pull request with context on the change.

We welcome issues and ideas: open a discussion on GitHub or drop us a line.

License

MIT License. See LICENSE.

Contact

Mesbah Khan — khanm@ontoledgy.io
