A minimalist, agnostic Python framework to standardize data engineering pipelines.

Data Engineering Experience 🚀

License: MIT

Flint is a minimalist, agnostic Python framework designed to streamline and standardize data engineering pipelines. By embracing Convention over Configuration, Flint eliminates environment friction, hardcoded absolute paths, and complex PySpark session management.


✨ Key Features

  • Zero-Config File Discovery: Flint walks up the directory tree to find your local pyproject.toml file and anchors your data catalog at that project root.
  • Decentralized Catalog: Declare your metadata layouts inside modular, self-contained mini-YAML files.
  • Elastic Processing Runtimes: Switch dynamically between Pandas and PySpark execution engines using exactly the same unified interface.
  • Interactive CLI Scaffolding: Spin up a new production-ready data directory structure instantly with flint init.
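
The tree-walking discovery described above can be illustrated with a minimal sketch. This is not Flint's actual implementation; the function name `find_project_root` and its parameters are hypothetical, and the only assumption taken from this README is that `pyproject.toml` marks the project root.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Optional[Path] = None,
                      marker: str = "pyproject.toml") -> Path:
    """Walk upward from `start` until a directory containing `marker` is found.

    Hypothetical helper: Flint's real resolver may behave differently.
    """
    current = (start or Path.cwd()).resolve()
    # Check the starting directory first, then each ancestor up to the filesystem root.
    for candidate in (current, *current.parents):
        if (candidate / marker).is_file():
            return candidate
    raise FileNotFoundError(
        f"No {marker} found in {current} or any parent directory"
    )
```

Called from a notebook under src/notebooks/, a helper like this would return the directory that holds pyproject.toml, so relative catalog paths can be resolved without hardcoding absolute paths.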

📦 Installation

(Once published to PyPI)

pip install flint-core

Or install it directly from the source repository using Poetry:

poetry add git+https://github.com/idperez720/data-engineering-exp.git

🏁 Quick Start

1. Initialize your workspace

Navigate to an empty directory and let the interactive wizard scaffold the workspace conventions:

flint init

2. Declare a dataset

Add a specification block inside conf/catalog/sample_dataset.yaml:

customers:
  description: "Main production customer data"
  format: "csv"
  engine: "pandas"
  storage_path: "data/sample_table.csv"
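
As a rough illustration of how an entry like this might be interpreted once the YAML has been parsed into a dictionary, the sketch below validates the keys shown above and resolves `storage_path` against the project root. The `CatalogEntry` class, its `from_mapping` method, and the set of required keys are assumptions made for illustration, not part of Flint's documented API.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Mapping

# Engines named in this README; the real framework may accept others.
SUPPORTED_ENGINES = {"pandas", "pyspark"}
# Assumed to be mandatory for illustration only.
REQUIRED_KEYS = ("format", "engine", "storage_path")


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical in-memory view of one dataset declaration."""
    name: str
    format: str
    engine: str
    storage_path: Path
    description: str = ""

    @classmethod
    def from_mapping(cls, name: str, spec: Mapping[str, Any],
                     project_root: Path) -> "CatalogEntry":
        missing = [key for key in REQUIRED_KEYS if key not in spec]
        if missing:
            raise ValueError(f"Dataset {name!r} is missing keys: {missing}")
        if spec["engine"] not in SUPPORTED_ENGINES:
            raise ValueError(f"Unsupported engine: {spec['engine']!r}")
        path = Path(spec["storage_path"])
        # Relative paths are interpreted relative to the project root.
        if not path.is_absolute():
            path = project_root / path
        return cls(name, spec["format"], spec["engine"], path,
                   spec.get("description", ""))
```

Resolving relative paths against the project root at this stage is one way to keep catalog files free of machine-specific absolute paths.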

3. Load data anywhere

Create a Python script or open a Jupyter Notebook inside src/notebooks/ and fetch your data instantly:

from flint_core.core.io import DataLoader

# Autodiscovers your project root boundaries and settings
loader = DataLoader()

# Loads the dataset securely as a Pandas DataFrame
df = loader.load("customers")
df.head()

📖 Complete Documentation

For comprehensive guides, testing architecture deep-dives, and complete API references, visit the documentation site. The URL currently listed, http://127.0.0.1:8000/, is a local placeholder to be replaced with the deployed docs URL (e.g., GitHub Pages).


⚖️ License

Distributed under the MIT License. Any modification or distribution (including forks) must include the original copyright notice and liability waiver. See LICENSE for more information.

Download files

Source Distribution

flint_core-0.1.0.tar.gz (14.1 kB)

Uploaded Source

Built Distribution

flint_core-0.1.0-py3-none-any.whl (17.2 kB)

Uploaded Python 3

File details

Details for the file flint_core-0.1.0.tar.gz.

File metadata

  • Download URL: flint_core-0.1.0.tar.gz
  • Upload date:
  • Size: 14.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.10.18 Darwin/25.2.0

File hashes

Hashes for flint_core-0.1.0.tar.gz
Algorithm Hash digest
SHA256 eeb49bb3d987199be771007cd378fbb510dfa57011cfdc7aaaebefed01bb1dfc
MD5 ccbcc43f5e1c560d5fe6089f7b994f24
BLAKE2b-256 5b75bfc812bb3ca124c5b52e5128d4d5f65f09c1b67f75e6585b547936fb8eec

File details

Details for the file flint_core-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: flint_core-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 17.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.10.18 Darwin/25.2.0

File hashes

Hashes for flint_core-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 211f17b8d1849f78ca65337b50f8ad2779adbe80e5b5c29ba83bbc215c18f4b2
MD5 673ef9a09d0f7adba1e650a416f499eb
BLAKE2b-256 c7f0e8038fef7b97a6e1f49a96d23c8db5ff74a80f5354ac7eaf276419ccc6f9
