A framework to manage data continuously


CDF (Continuous Data Framework)

Craft end-to-end data pipelines and manage them continuously


📍 Overview

CDF (Continuous Data Framework) is an integrated framework for managing data across its entire lifecycle, from ingestion through transformation to publishing. It is built on top of two open-source projects, sqlmesh and dlt, and provides a unified interface for complex data operations. CDF simplifies data engineering workflows and scales from small to large projects through an opinionated project structure that supports both multi-workspace and single-workspace layouts.
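To make the three lifecycle stages concrete, here is a minimal conceptual sketch using only Python's standard library (`sqlite3`). It does not use CDF, sqlmesh, or dlt, and it does not reflect CDF's actual API. The table names (`raw_orders`, `completed_orders`) and the function names are hypothetical, chosen only for illustration.

```python
import sqlite3


def ingest(conn, records):
    """Ingestion: land raw records in a staging table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_orders (id INTEGER, amount REAL, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO raw_orders VALUES (:id, :amount, :status)", records
    )


def transform(conn):
    """Transformation: derive a cleaned model from the raw table with SQL."""
    conn.execute("DROP TABLE IF EXISTS completed_orders")
    conn.execute(
        "CREATE TABLE completed_orders AS "
        "SELECT id, amount FROM raw_orders WHERE status = 'completed'"
    )


def publish(conn):
    """Publishing: hand the modeled rows to a downstream consumer as dicts."""
    rows = conn.execute(
        "SELECT id, amount FROM completed_orders ORDER BY id"
    ).fetchall()
    return [{"id": row[0], "amount": row[1]} for row in rows]


def run_pipeline(records):
    """Run the three stages in order against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    try:
        ingest(conn, records)
        transform(conn)
        return publish(conn)
    finally:
        conn.close()
```

Calling `run_pipeline` with a list of dicts that have `id`, `amount`, and `status` keys returns only the completed orders. In a real CDF project, ingestion and transformation would presumably be handled by dlt and sqlmesh respectively; consult the code or tests for the actual interfaces.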

[!WARNING] This repo is under ACTIVE development, and several large refactors have already been completed. The codebase is not yet stable and is subject to change. Until this disclaimer is removed, treat the code (or tests) as the most accurate and up-to-date source of information.

Features

...

Getting Started

  1. Installation:

    (NOT YET PUBLISHED ON PYPI, INSTALLATION INSTRUCTIONS WILL BE UPDATED SOON)

    CDF requires Python 3.9 or newer. Install CDF using pip:

    pip install python-cdf
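As a quick sanity check after installation, the following sketch uses only the standard library to confirm that the interpreter meets the stated Python 3.9 minimum and to look up the installed version of the `python-cdf` distribution. The helper names (`python_is_supported`, `installed_version`) are hypothetical and not part of CDF.

```python
import sys
from importlib import metadata

# Minimum version stated in the installation instructions above.
MIN_PYTHON = (3, 9)


def python_is_supported(version_info=None):
    """Return True if the given (or current) interpreter is 3.9 or newer."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


def installed_version(dist_name="python-cdf"):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python supported:", python_is_supported())
    print("python-cdf version:", installed_version() or "not installed")
```

If `installed_version()` returns `None`, the package is not installed in the active environment.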
    

Documentation

For detailed documentation, including API references and tutorials, visit CDF Documentation.

Contributing

Contributions to CDF are welcome! Please refer to the contributing guidelines for more information on how to submit pull requests, report issues, or suggest enhancements.

License

CDF is licensed under the Apache 2.0 License.


This README gives an overview of CDF, its installation steps, and its contribution guidelines. It is a starting point for understanding what CDF can do and how it can be integrated into your data engineering workflows.

🧪 Tests

Run the tests with pytest:

pytest tests

🛣 Project Roadmap

TODO: Add a roadmap for the project.

🤝 Contributing

Contributions are welcome! Follow the guidelines below to submit a change:

Contributing Guidelines

  1. Fork the Repository: Start by forking the project repository to your GitHub account.
  2. Clone Locally: Clone the forked repository to your local machine using a Git client.
    git clone <your-forked-repo-url>
    
  3. Create a New Branch: Always work on a new branch, giving it a descriptive name.
    git checkout -b new-feature-x
    
  4. Make Your Changes: Develop and test your changes locally.
  5. Commit Your Changes: Commit with a clear and concise message describing your updates.
    git commit -m 'Implemented new feature x.'
    
  6. Push to GitHub: Push the changes to your forked repository.
    git push origin new-feature-x
    

  7. Submit a Pull Request: Create a PR against the original project repository. Clearly describe the changes and their motivations.

Once your PR is reviewed and approved, it will be merged into the main branch.


📄 License

This project is distributed under the Apache 2.0 License. For more details, refer to the LICENSE file.


👏 Acknowledgments

  • Harness (https://harness.io/) for being the proving ground in which the initial concept of this project was born.
  • SQLMesh (https://sqlmesh.com) for being a foundational pillar of this project, and its team for their support, advice, and guidance.
  • dlt (https://dlthub.com) for being the other foundational pillar of this project, and its team for their support, advice, and guidance.
