Package for representation of numerical data in format of integers or strings.
A Modular Machine Learning Framework for Industrial Tasks
Welcome to Neuralk Foundry
Neuralk Foundry is a lightweight yet powerful framework for building modular machine learning pipelines, particularly well suited to industrial tasks and representation learning. Whether you're prototyping or scaling up, Foundry helps you build, combine, and orchestrate steps cleanly and efficiently.
Foundry is also the engine behind TabBench, Neuralk's internal benchmark for evaluating ML models on real-world tabular datasets.
Why Foundry?
Most ML frameworks fall into one of two camps:
- Rigid benchmarks and academic pipelines: great for simple supervised learning tasks, but brittle or limited when adapting to more complex use cases.
- Heavyweight MLOps frameworks (e.g., ZenML, Metaflow): offer full orchestration but at the cost of steep setup and reduced flexibility.
Foundry sits in between. It gives you enough structure to scale from prototype to production without locking you into opinionated tooling.
Key Features
- Composable Workflows: Define steps in terms of their inputs and outputs, with no black boxes.
- Supports Heterogeneous Tasks: Classification, regression, ranking, record linkage, and more.
- Customizable & Extensible: Plug in your own logic or replace any step with a variant.
- Built-in Caching & Logging: Avoid recomputation and keep track of metrics automatically.
- Workflow Explorer UI: Inspect and debug workflows through an interactive, visual interface.
- Reproducibility by Design: Strong separation between configuration, code, and data.
How Things Are Organized
Foundry is a modular framework. Its codebase is split into submodules that reflect each phase of the ML pipeline:
neuralk_foundry_ce/
├── datasets/               # Dataset loading utilities
├── sample_selection/
│   ├── splitter/           # Data splitting strategies (e.g., stratified shuffle)
│   └── blocking/           # Candidate pair selection (e.g., for deduplication)
├── feature_engineering/
│   ├── preprocessing/      # Traditional preprocessing for tabular data
│   ├── vectorizer/         # Text and other unstructured data vectorization
│   └── blocking/           # Pair processing modules for matching/merging
├── models/
│   ├── classifier/         # Classification models
│   ├── regressor/          # Regression models
│   ├── embedder/           # Embedding/representation learning
│   └── clustering/         # Clustering and unsupervised methods
├── workflow/               # Core execution engine: Step, Workflow, etc.
└── utils/                  # Helper functions and shared infrastructure
Each component (e.g., a model or preprocessing step) inherits from a base Step class and declares:
- Its expected inputs
- The outputs it produces
- Any configurable parameters
Steps can then be connected into a Workflow, either manually or through a task-specific template (e.g., Classification).
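To make the idea concrete, here is a minimal, self-contained sketch of steps that declare their inputs, outputs, and parameters and are then chained together. The `Step`, `Scale`, `Mean`, and `Workflow` classes below are hypothetical stand-ins written for illustration only; they do not reproduce the actual classes or signatures in `neuralk_foundry_ce`, so refer to the tutorials for the real API.

```python
from typing import Any


class Step:
    """Hypothetical base class (NOT the neuralk_foundry_ce API)."""

    name: str = "step"
    inputs: tuple[str, ...] = ()   # names of the fields this step reads
    outputs: tuple[str, ...] = ()  # names of the fields this step writes

    def __init__(self, **params: Any) -> None:
        # Configurable parameters are kept separate from the data.
        self.params = params

    def run(self, **inputs: Any) -> dict[str, Any]:
        raise NotImplementedError


class Scale(Step):
    name = "scale"
    inputs = ("X",)
    outputs = ("X_scaled",)

    def run(self, X: list[float]) -> dict[str, Any]:
        factor = self.params.get("factor", 1.0)
        return {"X_scaled": [x * factor for x in X]}


class Mean(Step):
    name = "mean"
    inputs = ("X_scaled",)
    outputs = ("mean",)

    def run(self, X_scaled: list[float]) -> dict[str, Any]:
        return {"mean": sum(X_scaled) / len(X_scaled)}


class Workflow:
    """Hypothetical runner: executes steps in order over a shared dict."""

    def __init__(self, steps: list[Step]) -> None:
        self.steps = list(steps)

    def run(self, **data: Any) -> dict[str, Any]:
        data = dict(data)
        for step in self.steps:
            # Fail early if a declared input has not been produced yet.
            missing = [key for key in step.inputs if key not in data]
            if missing:
                raise KeyError(f"{step.name} is missing inputs: {missing}")
            produced = step.run(**{key: data[key] for key in step.inputs})
            # Only keep what the step declared as its outputs.
            data.update({key: produced[key] for key in step.outputs})
        return data


result = Workflow([Scale(factor=2.0), Mean()]).run(X=[1.0, 2.0, 3.0])
print(result["mean"])  # 4.0
```

Because every step names what it reads and writes, the runner can detect a mis-wired pipeline before executing the offending step, and any step can be swapped for a variant that declares the same inputs and outputs.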
Quick-Start Installation
Install the package from PyPI:
pip install neuralk_foundry_ce
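After installing, a quick way to confirm the package is visible to your interpreter is to look it up without importing it. This sketch assumes the import name matches the top-level directory `neuralk_foundry_ce` shown in the layout above; `is_installed` is a hypothetical helper, not part of Foundry.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located by the import system."""
    return importlib.util.find_spec(module_name) is not None


# Assumed import name, taken from the package directory in the layout above.
if is_installed("neuralk_foundry_ce"):
    print("neuralk_foundry_ce is importable")
else:
    print("neuralk_foundry_ce not found; try: pip install neuralk_foundry_ce")
```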
Development Installation
Clone the Repository
git clone https://github.com/Neuralk-AI/NeuralkFoundry-CE
cd NeuralkFoundry-CE
Create a Dedicated Environment (recommended)
Neuralk Foundry relies on a variety of external machine learning libraries. As a result, managing package versions can be delicate. To avoid compatibility issues, we strongly recommend installing Foundry in a dedicated virtual environment (e.g., using conda or venv).
conda create -n foundry python=3.11
conda activate foundry
Install the Package
pip install -e .
Examples and tutorials
- Getting Started with Neuralk Foundry: A gentle introduction to the framework and how to run your first workflow.
- Three Levels of Workflows: Understand how Foundry supports simple pipelines, reusable workflows, and specialized task flows.
- Use a Custom Model: Learn how to plug in and use your own ML model within a Foundry pipeline.
Citing Foundry
If you incorporate any part of this repository into your work, please reference it using the following citation:
@article{neuralk2025foundry,
title={Foundry: A Modular Machine Learning Framework for Industrial Tasks},
author={Neuralk-AI},
year={2025},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/Neuralk-AI/NeuralkFoundry-CE}},
}
Contact
If you have any questions or wish to propose new features, please feel free to open an issue or contact us at alex@neuralk-ai.com.
For collaborations, please contact us at antoine@neuralk-ai.com.
Source Distribution
Built Distribution
File details
Details for the file neuralk_foundry_ce-0.0.2.tar.gz.
File metadata
- Download URL: neuralk_foundry_ce-0.0.2.tar.gz
- Upload date:
- Size: 61.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fe943499a791d7ad67c18e52f10ed866940e0eead2c8f53fd16fb43439657739 |
| MD5 | 2dc8eb78f37c8d10fec51959095960ed |
| BLAKE2b-256 | d6f07877defc874601a4823e9916e54a3488ca5fd8efda627bf55367ff30c7a1 |
File details
Details for the file neuralk_foundry_ce-0.0.2-py3-none-any.whl.
File metadata
- Download URL: neuralk_foundry_ce-0.0.2-py3-none-any.whl
- Upload date:
- Size: 79.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 509a154f92c7b35bd7eddfecf7f1b626d5040f9ca05055f513581fece2922d0c |
| MD5 | 13b79defbe07e05a8d481292a607c207 |
| BLAKE2b-256 | 9afa5bbdda99741737e9506836375e1f798b25e8476e331fbec810a0d58caf42 |