Cogniflow basic sink StepPackage with C++ data hive parquet sink.

cf-basic-sinks

cf-basic-sinks provides native sink steps for pipeline outputs.

Steps

Step | Category | Description
cfsink:DataHiveParquetSinkStep | sink | Write canonical data hive parquet output through the cf_datahive_cpp gatekeeper

The published distribution name is cf-basic-sinks. Install it with:

pip install cf-basic-sinks

Native build prerequisites

cf_basic_sinks is built with scikit-build-core and CMake and requires:

  • CPython 3.13
  • CMake on PATH
  • a Windows C++ toolchain compatible with that CMake installation
  • access to cf-pipeline-sdk from the package index
  • access to cf-datahive from the package index for the owner-provided native cf_datahive_cpp source surface
  • DuckDB amalgamation source (duckdb.cpp) for default static-link builds
  • DuckDB native headers/library only when forcing shared-link fallback mode
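Two of the prerequisites above (the interpreter version and CMake on PATH) can be checked before a build is attempted. The following is a minimal sketch, not part of cf_basic_sinks: the function name `missing_prerequisites` and its injectable parameters are hypothetical, and it only checks the version number and the presence of a `cmake` executable, not the C++ toolchain or package index access.

```python
import shutil
import sys


def missing_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return human-readable problems; an empty list means both checks passed.

    Hypothetical helper: only the interpreter version and CMake are checked.
    """
    problems = []
    # The README pins Python 3.13; only major.minor is compared here.
    if tuple(version_info[:2]) != (3, 13):
        problems.append(
            f"CPython 3.13 required, found {version_info[0]}.{version_info[1]}"
        )
    # CMake must be discoverable on PATH.
    if which("cmake") is None:
        problems.append("cmake not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

The parameters default to the running interpreter and `shutil.which`, so calling `missing_prerequisites()` with no arguments checks the current environment.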

This package does not publish cf_datahive_cpp separately. The step package consumes the owner-provided native source surface exposed by the installed cf_datahive package. Native data hive write ownership remains under cf_datahive.

Local builds can satisfy the default static-link DuckDB requirement through:

  • .native_deps/duckdb/src/duckdb.cpp staged by scripts/setup_native_deps_v2.ps1
  • CF_DATAHIVE_CPP_DUCKDB_SOURCE pointing at a duckdb.cpp amalgamation file

Shared-link fallback mode can be selected with CF_DATAHIVE_CPP_DUCKDB_LINKAGE=shared. In that mode, DuckDB inputs are satisfied through either:

  • the repo-local .native_deps/duckdb layout created by the setup scripts
  • CF_DATAHIVE_CPP_DUCKDB_INCLUDE and CF_DATAHIVE_CPP_DUCKDB_LIB
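The two linkage modes and their inputs can be summarized as a small preflight check. This is a sketch under stated assumptions, not the package's actual CMake logic: the function name `resolve_duckdb_inputs` is hypothetical, the explicit environment variables are assumed to take precedence over the repo-local layout, and the shared-mode layout (.native_deps/duckdb for headers, .native_deps/duckdb/lib/duckdb.lib for the library) is inferred from the paths used in the local preflight commands.

```python
import os
from pathlib import Path


def resolve_duckdb_inputs(env=None, repo_root="."):
    """Return the DuckDB inputs a build would need, or raise if they are missing."""
    env = os.environ if env is None else env
    deps = Path(repo_root) / ".native_deps" / "duckdb"
    # Static linking is the documented default when the variable is unset.
    linkage = env.get("CF_DATAHIVE_CPP_DUCKDB_LINKAGE", "static")

    if linkage == "static":
        # Assumption: the explicit variable is consulted before the staged file.
        source = Path(env.get("CF_DATAHIVE_CPP_DUCKDB_SOURCE") or deps / "src" / "duckdb.cpp")
        if not source.is_file():
            raise FileNotFoundError(f"duckdb.cpp amalgamation not found: {source}")
        return {"linkage": "static", "source": source}

    if linkage == "shared":
        # Assumption: the repo-local layout matches the local preflight paths.
        include = Path(env.get("CF_DATAHIVE_CPP_DUCKDB_INCLUDE") or deps)
        lib = Path(env.get("CF_DATAHIVE_CPP_DUCKDB_LIB") or deps / "lib" / "duckdb.lib")
        if not include.is_dir() or not lib.is_file():
            raise FileNotFoundError(f"DuckDB headers/library not found: {include}, {lib}")
        return {"linkage": "shared", "include": include, "lib": lib}

    raise ValueError(f"unsupported linkage: {linkage!r}")
```

Calling `resolve_duckdb_inputs()` from the repository root before a build reports a missing amalgamation or library early, instead of partway through the CMake configure step.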

Publishing

cf_basic_sinks is published with the dedicated Windows workflow:

  • Workflow: .github/workflows/cf_basic_sinks_windows_publish.yml
  • Package directory: sandcastle/cf_basic_steps/cf_basic_sinks
  • PyPI tag: cf-basic-sinks-v<version>
  • TestPyPI tag: cf-basic-sinks-v<version>-test
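The tag patterns above can be expressed as a small helper. This is an illustrative sketch only: the function name `publish_tag` is hypothetical, and the target name "pypi" is an assumption ("testpypi" matches the -PublishTarget value used in the dry-run dispatch command).

```python
def publish_tag(version, target="pypi"):
    """Build the release tag for a version, following the documented patterns."""
    base = f"cf-basic-sinks-v{version}"
    if target == "pypi":  # assumed name for the production index target
        return base
    if target == "testpypi":
        # TestPyPI tags carry a "-test" suffix.
        return f"{base}-test"
    raise ValueError(f"unknown publish target: {target!r}")


if __name__ == "__main__":
    print(publish_tag("0.1.2"))
    print(publish_tag("0.1.2", "testpypi"))
```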

Workflow notes:

  • test/build jobs provision native deps via scripts/setup_native_deps_v2.ps1
  • test/build jobs preinstall published cf-pipeline-sdk and cf-datahive into the job interpreter before invoking the shared publish helper, so the CMake Python subprocesses can discover the owner-provided SDK and data-hive surfaces
  • test/build jobs export CF_DATAHIVE_CPP_DUCKDB_LINKAGE=static and CF_DATAHIVE_CPP_DUCKDB_SOURCE from the repo-local .native_deps/duckdb/src layout
  • test/build jobs also export CF_DATAHIVE_CPP_DUCKDB_INCLUDE and CF_DATAHIVE_CPP_DUCKDB_LIB to preserve shared-link fallback compatibility

Local preflight:

$env:CF_DATAHIVE_CPP_DUCKDB_LINKAGE = "static"
$env:CF_DATAHIVE_CPP_DUCKDB_SOURCE = (Resolve-Path .native_deps/duckdb/src/duckdb.cpp).Path
$env:CF_DATAHIVE_CPP_DUCKDB_INCLUDE = (Resolve-Path .native_deps/duckdb).Path
$env:CF_DATAHIVE_CPP_DUCKDB_LIB = (Resolve-Path .native_deps/duckdb/lib/duckdb.lib).Path
powershell -ExecutionPolicy Bypass -File scripts/mimic_windows_python_publish_workflow.ps1 `
  -WorkflowFile .github/workflows/cf_basic_sinks_windows_publish.yml `
  -PackageDir sandcastle/cf_basic_steps/cf_basic_sinks `
  -PythonExe py `
  -PythonVersion 3.13

Queue a dry-run dispatch:

powershell -ExecutionPolicy Bypass -File scripts/queue_windows_python_publish_workflow.ps1 `
  -WorkflowFile .github/workflows/cf_basic_sinks_windows_publish.yml `
  -PackageDir sandcastle/cf_basic_steps/cf_basic_sinks `
  -PublishTarget testpypi `
  -Ref main `
  -RequireLocalPass `
  -DryRun

Download files

Source Distribution

cf_basic_sinks-0.1.2.tar.gz (7.0 MB)

Uploaded Source

Built Distribution

cf_basic_sinks-0.1.2-cp313-cp313-win_amd64.whl (7.0 MB)

Uploaded CPython 3.13, Windows x86-64

File details

Details for the file cf_basic_sinks-0.1.2.tar.gz.

File metadata

  • Download URL: cf_basic_sinks-0.1.2.tar.gz
  • Upload date:
  • Size: 7.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.2.0 CPython/3.13.12

File hashes

Hashes for cf_basic_sinks-0.1.2.tar.gz
Algorithm Hash digest
SHA256 b5b33817dee5c4f43ce174cca91a337c2ef9c297423b599428978af5df20b4f2
MD5 ac78fbe5b1786bb2d89f5693ec36a4fa
BLAKE2b-256 cd5a1ce3925be9b7c300b1c8c2b6b638ef972235162554cb665039b0fc5f5f11
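A downloaded file can be compared against the SHA256 digest listed above using Python's standard hashlib module. This is a minimal sketch; the function name `sha256_of` and the local file path in the usage section are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large archives are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Digest published for cf_basic_sinks-0.1.2.tar.gz.
    expected = "b5b33817dee5c4f43ce174cca91a337c2ef9c297423b599428978af5df20b4f2"
    # Hypothetical path: wherever the sdist was downloaded.
    actual = sha256_of("cf_basic_sinks-0.1.2.tar.gz")
    print("match" if actual == expected else "MISMATCH")
```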

File details

Details for the file cf_basic_sinks-0.1.2-cp313-cp313-win_amd64.whl.

File hashes

Hashes for cf_basic_sinks-0.1.2-cp313-cp313-win_amd64.whl
Algorithm Hash digest
SHA256 fa16101f453e5e1a7845c010dc4fc0ba2a2fdab6284a66e98a5d8f5d6dab54ac
MD5 e761883dfec3daa674727eceb354a626
BLAKE2b-256 8e0f1e158b74b68f83b77a81be8dd9fdb0b82df9263385314bfeedc5a78ac488
