dagster-mssql-bcp

ODBC is slow 🐢 bcp is fast! 🐰

This is a custom Dagster IO manager that loads data into SQL Server using the bcp utility.

What you need to run it

PyPI

pip install dagster-mssql-bcp

BCP Utility

The bcp utility must be installed on the machine that is running the dagster pipeline.

See Microsoft's documentation for more information.

Ideally, bcp should be on your PATH. If it is not, set its location with the bcp_path option in the IO manager configuration.
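Before wiring up the IO manager, it can help to confirm that bcp is actually reachable. The sketch below is not part of dagster-mssql-bcp; `find_bcp` is a hypothetical helper that only uses the Python standard library to check an explicit path first and then fall back to PATH.

```python
import shutil
from pathlib import Path
from typing import Optional


def find_bcp(explicit_path: Optional[str] = None) -> Optional[str]:
    """Return a usable path to bcp, or None if it cannot be found.

    Hypothetical helper for illustration; not part of dagster-mssql-bcp.
    """
    if explicit_path:
        # An explicit location takes priority over PATH.
        return explicit_path if Path(explicit_path).is_file() else None
    # Fall back to searching the directories listed on PATH.
    return shutil.which("bcp")
```

If `find_bcp()` returns None, either install the utility or pass the full location as bcp_path.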

ODBC Drivers

You need the ODBC drivers installed on the machine that is running the dagster pipeline.

See Microsoft's documentation for more information.
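A quick way to see whether a suitable driver is present is to inspect the installed driver names. The sketch below assumes the usual Microsoft naming pattern ("ODBC Driver 18 for SQL Server"); `newest_sql_server_driver` is a hypothetical helper, and whether the package selects a driver this way is not stated here.

```python
import re
from typing import Iterable, Optional


def newest_sql_server_driver(driver_names: Iterable[str]) -> Optional[str]:
    """Pick the highest-numbered 'ODBC Driver N for SQL Server' entry.

    Hypothetical helper for illustration; returns None when no match exists.
    """
    pattern = re.compile(r"^ODBC Driver (\d+) for SQL Server$")
    best_name = None
    best_version = -1
    for name in driver_names:
        match = pattern.match(name)
        if match and int(match.group(1)) > best_version:
            best_version = int(match.group(1))
            best_name = name
    return best_name
```

If pyodbc is installed, the list of names can come from `pyodbc.drivers()`, for example `newest_sql_server_driver(pyodbc.drivers())`.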

Permissions

The user running the dagster pipeline must have permission to load data into the SQL Server database, including:

  • CREATE SCHEMA
  • CREATE/ALTER TABLES
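One way to read the list above is as a set of T-SQL grants for the database user. The sketch below only builds the statement strings; `grant_statements`, the user name, and the schema name are illustrative assumptions, and your environment may call for different or additional grants.

```python
from typing import List


def grant_statements(user: str, schema: str) -> List[str]:
    """Build T-SQL GRANT statements matching the permissions listed above.

    Hypothetical helper for illustration; review with your DBA before running.
    """

    def quote(identifier: str) -> str:
        # Bracket-quote the identifier, escaping any closing bracket.
        return "[" + identifier.replace("]", "]]") + "]"

    return [
        f"GRANT CREATE SCHEMA TO {quote(user)};",
        f"GRANT CREATE TABLE TO {quote(user)};",
        f"GRANT ALTER ON SCHEMA::{quote(schema)} TO {quote(user)};",
    ]
```

For example, `print("\n".join(grant_statements("dagster_user", "my_schema")))` produces a script that can be reviewed before it is executed.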

Basic Usage

Polars

Polars data is processed as a LazyFrame. Your asset can return either a DataFrame or a LazyFrame; a DataFrame is cast to lazy automatically.

from dagster import asset, Definitions
from dagster_mssql_bcp import PolarsBCPIOManager, PolarsBCPResource
import polars as pl

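io_manager = PolarsBCPIOManager(
    resource=PolarsBCPResource(
        host="my_mssql_server",
        database="my_database",
        port="1433",
        username="username",
        password="password",
        # Extra connection properties; here the server certificate is trusted.
        query_props={
            "TrustServerCertificate": "yes",
        },
        # Extra flags passed to bcp; -u trusts the server certificate.
        bcp_arguments={"-u": ""},
        # Only needed when bcp is not on your PATH.
        bcp_path="/opt/mssql-tools18/bin/bcp",
    )
)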

@asset(
    metadata={
        "asset_schema": [
            {"name": "id", "type": "INT"},
        ],
        "schema": "my_schema",
    }
)
def my_polars_asset(context):
    return pl.DataFrame({"id": [1, 2, 3]})


@asset(
    metadata={
        "asset_schema": [
            {"name": "id", "type": "INT"},
        ],
        "schema": "my_schema",
    }
)
def my_polars_asset_lazy(context):
    return pl.LazyFrame({"id": [1, 2, 3]})

defs = Definitions(
    assets=[my_polars_asset, my_polars_asset_lazy],
    resources={
        "io_manager": io_manager,
    },
)
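The example above hardcodes the connection details. A minimal sketch of reading them from environment variables instead is shown here; the variable names (MSSQL_HOST and so on) and the helper `bcp_resource_kwargs` are assumptions made for illustration, while the keyword names are taken from the example above.

```python
import os
from typing import Any, Dict, Mapping


def bcp_resource_kwargs(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Collect resource arguments from environment variables.

    Hypothetical helper; the environment variable names are illustrative.
    """
    return {
        "host": env["MSSQL_HOST"],
        "database": env["MSSQL_DATABASE"],
        # Fall back to the default SQL Server port when unset.
        "port": env.get("MSSQL_PORT", "1433"),
        "username": env["MSSQL_USERNAME"],
        "password": env["MSSQL_PASSWORD"],
        "query_props": {"TrustServerCertificate": "yes"},
        "bcp_arguments": {"-u": ""},
        # Same default location as in the example above.
        "bcp_path": env.get("BCP_PATH", "/opt/mssql-tools18/bin/bcp"),
    }
```

The result can then be unpacked into the resource, for example `PolarsBCPResource(**bcp_resource_kwargs())`. A missing required variable raises KeyError at startup rather than failing later during a load.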

Pandas

from dagster import asset, Definitions
from dagster_mssql_bcp import PandasBCPIOManager, PandasBCPResource
import pandas as pd

io_manager = PandasBCPIOManager(
    resource=PandasBCPResource(
        host="my_mssql_server",
        database="my_database",
        port="1433",
        username="username",
        password="password",
        query_props={
            "TrustServerCertificate": "yes",
        },
        bcp_arguments={"-u": ""},
        bcp_path="/opt/mssql-tools18/bin/bcp",
    )
)


@asset(
    metadata={
        "asset_schema": [
            {"name": "id", "type": "INT"},
        ],
        "schema": "my_schema",
    }
)
def my_pandas_asset(context):
    return pd.DataFrame({"id": [1, 2, 3]})


defs = Definitions(
    assets=[my_pandas_asset],
    resources={
        "io_manager": io_manager,
    },
)

The asset_schema metadata defines the table structure, and the value your asset returns is the data to load.
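To make the relationship between asset_schema and the table concrete, the sketch below renders a schema list as a CREATE TABLE statement. This is not how dagster-mssql-bcp is implemented; `sketch_create_table` is a hypothetical helper, and the table name is passed in explicitly because the naming rule is not described here.

```python
from typing import Dict, List


def sketch_create_table(
    schema: str, table: str, asset_schema: List[Dict[str, str]]
) -> str:
    """Render an asset_schema list as a CREATE TABLE statement.

    Illustration only; the real IO manager may build its DDL differently.
    """
    # One "[name] TYPE" line per column, in the order given.
    columns = ",\n".join(
        f"    [{col['name']}] {col['type']}" for col in asset_schema
    )
    return f"CREATE TABLE [{schema}].[{table}] (\n{columns}\n);"
```

Calling `sketch_create_table("my_schema", "my_polars_asset", [{"name": "id", "type": "INT"}])` yields a one-column table definition that mirrors the metadata in the examples above.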

Docs

For more details, see the assets doc and the IO manager doc. For how it is implemented, see the dev doc.
