
A minimalist, agnostic Python framework to standardize data engineering pipelines.


Data Engineering Experience 🚀


Flint is a minimalist, agnostic Python framework designed to streamline and standardize data engineering pipelines. By embracing Convention over Configuration, Flint aims to remove common sources of friction: environment-specific setup, hardcoded absolute paths, and manual PySpark session management.


✨ Key Features

  • Zero-Config File Discovery: Automatic tree-walking directory resolution anchors your data catalog using your local pyproject.toml file.
  • Decentralized Catalog: Declare your metadata layouts inside modular, self-contained mini-YAML files.
  • Elastic Processing Runtimes: Switch between the pandas and PySpark execution engines through a single, unified interface.
  • Interactive CLI Scaffolding: Spin up a new production-ready data directory structure instantly with flint init.
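
The tree-walking discovery described above can be pictured with a small standalone sketch. This is not Flint's actual implementation: the function name `find_project_root` and its `marker` parameter are hypothetical, and the only detail taken from this README is that the project root is anchored on a `pyproject.toml` file.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Path, marker: str = "pyproject.toml") -> Optional[Path]:
    """Walk upward from `start` and return the first directory containing `marker`.

    Hypothetical helper for illustration; Flint's own resolver may differ.
    """
    current = start.resolve()
    if current.is_file():
        # Start from the containing directory when given a file path.
        current = current.parent
    # Check the starting directory first, then each ancestor up to the filesystem root.
    for candidate in (current, *current.parents):
        if (candidate / marker).is_file():
            return candidate
    return None
```

Calling `find_project_root(Path.cwd())` from a notebook under src/notebooks/ would return the directory that holds pyproject.toml, or `None` when no ancestor contains one, which is why relative paths in a catalog can be resolved without hardcoding absolute locations.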

📦 Installation

Install from PyPI:

pip install flint-core

Or install it directly from the source repository using Poetry:

poetry add git+https://github.com/idperez720/data-engineering-exp.git
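
After either install path, a quick check from Python can confirm which version ended up in the active environment. This sketch relies only on the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical, and the distribution name `flint-core` is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str = "flint-core") -> Optional[str]:
    """Return the installed version of `distribution`, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


print(installed_version() or "flint-core is not installed in this environment")
```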

🏁 Quick Start

1. Initialize your workspace

Navigate to an empty directory and let the interactive wizard scaffold the workspace conventions:

flint init

2. Declare a dataset

Add a specification block inside conf/catalog/sample_dataset.yaml:

customers:
  description: "Main production customer data"
  format: "csv"
  engine: "pandas"
  storage_path: "data/sample_table.csv"
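
To make the shape of a catalog entry concrete, here is a minimal sketch of how one parsed entry could be validated and its `storage_path` resolved against the project root. It is an illustration under assumptions, not Flint's API: `CatalogEntry`, `parse_entry`, and the choice of which keys are required are all hypothetical. The `raw` mapping stands for what a YAML parser would return for the `customers` block above.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Mapping

# Assumption: these keys are treated as mandatory; the README does not confirm this.
REQUIRED_KEYS = ("format", "engine", "storage_path")


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical in-memory form of one dataset declaration."""

    name: str
    format: str
    engine: str
    storage_path: Path
    description: str = ""


def parse_entry(name: str, raw: Mapping[str, Any], project_root: Path) -> CatalogEntry:
    """Validate one raw catalog mapping and resolve its path against the project root."""
    missing = [key for key in REQUIRED_KEYS if key not in raw]
    if missing:
        raise ValueError(f"catalog entry {name!r} is missing keys: {', '.join(missing)}")
    path = Path(raw["storage_path"])
    if not path.is_absolute():
        # Relative paths are interpreted from the project root, not the current directory.
        path = project_root / path
    return CatalogEntry(
        name=name,
        format=str(raw["format"]).lower(),
        engine=str(raw["engine"]).lower(),
        storage_path=path,
        description=str(raw.get("description", "")),
    )
```

For example, `parse_entry("customers", {"format": "csv", "engine": "pandas", "storage_path": "data/sample_table.csv"}, Path("/project"))` yields an entry whose `storage_path` is `/project/data/sample_table.csv`.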

3. Load data anywhere

Create a Python script or open a Jupyter Notebook inside src/notebooks/ and fetch your data instantly:

from flint_core.core.io import DataLoader

# Autodiscovers your project root boundaries and settings
loader = DataLoader()

# Loads the dataset as a pandas DataFrame, per the "engine" set in its catalog entry
df = loader.load("customers")
df.head()

📖 Complete Documentation

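For comprehensive guides, testing architecture deep-dives, and complete API references, see the project documentation site: 👉 http://127.0.0.1:8000/. This is a local development address that resolves only while the docs are being served on your own machine; it stands in for a deployed docs URL (e.g., GitHub Pages).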


⚖️ License

Distributed under the MIT License. Copies or substantial portions of the software, including forks, must retain the original copyright notice and the permission notice, which contains the warranty and liability disclaimer. See LICENSE for more information.


Source Distribution

flint_core-0.1.1.tar.gz (21.8 kB view details)

Uploaded Source

Built Distribution


flint_core-0.1.1-py3-none-any.whl (30.6 kB view details)

Uploaded Python 3

File details

Details for the file flint_core-0.1.1.tar.gz.

File metadata

  • Download URL: flint_core-0.1.1.tar.gz
  • Upload date:
  • Size: 21.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.4.1 CPython/3.12.13 Linux/6.17.0-1018-azure

File hashes

Hashes for flint_core-0.1.1.tar.gz

  • SHA256: 8b839ea55c8d9a72afd69bc6df53c84c9c1066894596e836d9dc3d4154c205a8
  • MD5: c0a83f57e45c583c10cf19f45990cdfa
  • BLAKE2b-256: 4f01b5b712028481ea85c71a0823d9544739af0abf3012d6bf86b60c7dedbb99


File details

Details for the file flint_core-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: flint_core-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 30.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.4.1 CPython/3.12.13 Linux/6.17.0-1018-azure

File hashes

Hashes for flint_core-0.1.1-py3-none-any.whl

  • SHA256: 348a057e8a858c2c709f8f2d04491f5e09860e64c8a5aa4662c1f24365780af0
  • MD5: f38c29f11b11de0251da46ca6f31d983
  • BLAKE2b-256: 62d6ed942084602c87606cf3619d33f99fcd33b8b1f60330b12fa953e53e5b4c

