Dagster extra utilities for data processing
Dxtr Dagster Library
A Python library that provides utilities and components for data engineering workflows using Dagster. The library focuses on data processing capabilities including downloading data from Sharepoint, loading to PostgreSQL, and performing data transformations.
Project Structure
The library is organized into the following components:
dxtr/
├── dxtr/ # Main library package
│ ├── dagster/ # Dagster-specific components and resources
│ └── utils/ # Utility functions
├── pyproject.toml # Project configuration and dependencies
└── README.md # This file
Features
- Sharepoint data file downloading
- SQLAlchemy data loading
- Data transformation capabilities
- Integration with Dagster for workflow orchestration
Dependencies
The library requires Python 3.11.8 or higher and includes key dependencies such as:
- polars
- google-cloud-storage
- requests
- msal
- pandas
- sqlalchemy
- psycopg2-binary
- and more (see pyproject.toml for complete list)
Development
Installation
For development purposes, install the package in editable mode:
pip install -e ".[dev]" --config-settings editable_mode=compat
For a more convenient workflow, see the Wiki for how to use ./dxtrx.sh to set up the environment and start the Dagster code server.
The library requires several environment variables to be set:
- Sharepoint credentials
- Database credentials
- Other configuration variables
Please refer to the Wiki for detailed setup instructions using ./dxtrx.sh to configure the environment and start the Dagster code server.
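Because missing credentials tend to surface late as confusing connection errors, it can help to check the environment up front. The following is a minimal sketch only: the variable names are hypothetical placeholders (the real names are documented in the Wiki), and `missing_env_vars` is an illustrative helper, not part of this library.

```python
import os

# Hypothetical variable names, for illustration only.
# The names the library actually reads are documented in the Wiki.
REQUIRED_ENV_VARS = (
    "SHAREPOINT_CLIENT_ID",
    "SHAREPOINT_CLIENT_SECRET",
    "DATABASE_URL",
)


def missing_env_vars(required=REQUIRED_ENV_VARS, environ=None):
    """Return the names in `required` that are unset or empty in `environ`.

    `environ` defaults to the process environment; passing a plain dict
    makes the check easy to exercise in tests.
    """
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]
```

Calling `missing_env_vars()` before starting the Dagster code server returns an empty list when everything is set, or the names that still need a value.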
Contributing Guidelines
When contributing to this library:
- Follow the existing code structure and naming conventions
- Add new components in the appropriate directories
- Update documentation as needed
- Test changes locally
- Submit PRs with evidence of testing and team review
Running tests
To run the tests, use the following command:
pytest
Alternatively, run them in watch mode, which re-runs the tests when files change:
ptw
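For contributors unsure what pytest will pick up, here is a minimal sketch of a test module. The file name and the `normalize_column_name` helper are hypothetical and exist only for illustration; they are not part of this library. pytest collects functions whose names start with `test_` and treats a failing `assert` as a test failure.

```python
# tests/test_columns.py (hypothetical file name)


def normalize_column_name(name: str) -> str:
    """Hypothetical helper: trim, lowercase, and join words with underscores."""
    return "_".join(name.strip().lower().split())


def test_normalize_column_name_collapses_whitespace():
    # Leading, trailing, and repeated inner spaces should all be handled.
    assert normalize_column_name("  Order   Date ") == "order_date"


def test_normalize_column_name_is_idempotent():
    # Normalizing an already-normalized name should change nothing.
    once = normalize_column_name("Customer ID")
    assert normalize_column_name(once) == once
```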
File details
Details for the file dxtrx-0.0.8.tar.gz.
File metadata
- Download URL: dxtrx-0.0.8.tar.gz
- Upload date:
- Size: 44.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2c9d6acbb0dab5a0e6bcb7ae1490c5269ec8e5a1d244ea6eda1ef66647878a6f |
| MD5 | f522208df5e7afdb46c5889e8ab296cd |
| BLAKE2b-256 | 58dcd470c5a43ff804209dd092e3a01e220933120d74a3da1fb57fc29788a800 |
File details
Details for the file dxtrx-0.0.8-py3-none-any.whl.
File metadata
- Download URL: dxtrx-0.0.8-py3-none-any.whl
- Upload date:
- Size: 53.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4c87c15e710080c34a6035c55c9158e79bf713d0ba4c12fd383e1093daf6b1c4 |
| MD5 | 68e82875d769f1cb5b69ee2360842260 |
| BLAKE2b-256 | bdab3093f6bff70c270303b837df211dd90dfeff9112eb1ad6bddfbec4fbfe8f |