
Package for representation of numerical data in format of integers or strings.


Neuralk Foundry

A Modular Machine Learning Framework for Industrial Tasks

[Dashboard] [Examples & Tutorials]


🎉 Welcome to Neuralk Foundry

Neuralk Foundry is a lightweight yet powerful framework for building modular machine learning pipelines, particularly well-suited for industrial tasks and representation learning. Whether you're prototyping or scaling up, Foundry helps you build, combine, and orchestrate steps cleanly and efficiently.

Foundry is also the engine behind TabBench, Neuralk's internal benchmark for evaluating ML models on real-world tabular datasets.

Why Foundry?

Most ML frameworks fall into one of two camps:

  • Rigid benchmarks and academic pipelines: great for simple supervised learning tasks, but brittle or limited when adapting to more complex use cases.
  • Heavyweight MLOps frameworks (e.g., ZenML, Metaflow): offer full orchestration but at the cost of steep setup and reduced flexibility.

Foundry sits in between. It gives you just the right level of structure to scale from prototype to production, without locking you into opinionated tooling.


🚀 Key Features

Composable Workflows : Define steps in terms of their inputs and outputs, with no black boxes.

Supports Heterogeneous Tasks : Classification, regression, ranking, record linkage, and more.

Customizable & Extensible : Plug in your own logic or replace any step with a variant.

Built-in Caching & Logging : Avoid recomputation and keep track of metrics automatically.

Workflow Explorer UI : Inspect and debug workflows through an interactive, visual interface.

Reproducibility by Design : Strong separation between configuration, code, and data.
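To make the caching idea concrete, here is a minimal sketch of how recomputation can be avoided by keying a step's result on its name, parameters, and inputs. It illustrates the general technique only and is not Foundry's implementation; `cache_key`, `run_cached`, and `scale` are hypothetical names introduced for this example.

```python
import hashlib
import json

# Hypothetical in-memory cache for illustration; not part of neuralk_foundry_ce.
_cache = {}


def cache_key(step_name, params, inputs):
    """Build a deterministic key from the step name, its parameters, and its inputs.

    Assumes params and inputs are JSON-serializable.
    """
    payload = json.dumps(
        {"step": step_name, "params": params, "inputs": inputs},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_cached(step_name, func, params, inputs):
    """Run func only if this exact (step, params, inputs) combination is new."""
    key = cache_key(step_name, params, inputs)
    if key not in _cache:
        _cache[key] = func(**inputs, **params)
    return _cache[key]


calls = []  # records every real execution so cache hits are visible


def scale(X, factor):
    calls.append(factor)
    return [x * factor for x in X]


first = run_cached("scale", scale, {"factor": 2}, {"X": [1, 2, 3]})
second = run_cached("scale", scale, {"factor": 2}, {"X": [1, 2, 3]})  # cache hit
third = run_cached("scale", scale, {"factor": 3}, {"X": [1, 2, 3]})  # new parameters
```

Because the key depends on configuration and data rather than on code state, changing a parameter triggers a fresh run while repeating an identical call reuses the stored result.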


🧠 How Things Are Organized

Foundry is a modular framework. Its codebase is split into submodules that reflect each phase of the ML pipeline:

neuralk_foundry_ce/
├── datasets/               # Dataset loading utilities
├── sample_selection/
│   ├── splitter/           # Data splitting strategies (e.g., stratified shuffle)
│   └── blocking/           # Candidate pair selection (e.g., for deduplication)
├── feature_engineering/
│   ├── preprocessing/      # Traditional preprocessing for tabular data
│   ├── vectorizer/         # Text and other unstructured data vectorization
│   └── blocking/           # Pair processing modules for matching/merging
├── models/
│   ├── classifier/         # Classification models
│   ├── regressor/          # Regression models
│   ├── embedder/           # Embedding/representation learning
│   └── clustering/         # Clustering and unsupervised methods
├── workflow/               # Core execution engine: Step, Workflow, etc.
└── utils/                  # Helper functions and shared infrastructure

Each component (e.g., a model or preprocessing step) inherits from a base Step class and declares:

  • Its expected inputs
  • The outputs it produces
  • Any configurable parameters

Steps can then be connected into a Workflow, either manually or through a task-specific template (e.g., Classification).
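The following is a minimal, self-contained sketch of the pattern described above: steps that declare inputs, outputs, and parameters, chained by a workflow that checks those declarations. The `Step` and `Workflow` classes here are simplified stand-ins written for this example and are not the classes shipped in neuralk_foundry_ce; the real signatures may differ. `Scale` and `Mean` are likewise hypothetical.

```python
class Step:
    """Illustrative base class; NOT the Step class from neuralk_foundry_ce."""

    name = "step"
    inputs = ()   # names of the data fields this step reads
    outputs = ()  # names of the data fields this step writes

    def __init__(self, **params):
        self.params = params  # configurable parameters

    def run(self, **inputs):
        raise NotImplementedError


class Scale(Step):
    name = "scale"
    inputs = ("X",)
    outputs = ("X_scaled",)

    def run(self, X):
        factor = self.params.get("factor", 1.0)
        return {"X_scaled": [x * factor for x in X]}


class Mean(Step):
    name = "mean"
    inputs = ("X_scaled",)
    outputs = ("mean",)

    def run(self, X_scaled):
        return {"mean": sum(X_scaled) / len(X_scaled)}


class Workflow:
    """Runs steps in order, passing each one only the inputs it declared."""

    def __init__(self, steps):
        self.steps = list(steps)

    def run(self, **data):
        data = dict(data)
        for step in self.steps:
            missing = [key for key in step.inputs if key not in data]
            if missing:
                raise KeyError(f"{step.name} is missing inputs: {missing}")
            produced = step.run(**{key: data[key] for key in step.inputs})
            undeclared = set(produced) - set(step.outputs)
            if undeclared:
                raise ValueError(f"{step.name} produced undeclared outputs: {undeclared}")
            data.update(produced)
        return data


result = Workflow([Scale(factor=2.0), Mean()]).run(X=[1.0, 2.0, 3.0])
print(result["mean"])
```

Declaring inputs and outputs up front lets the workflow detect a miswired pipeline (for example, running `Mean` before `Scale`) before any computation on that step starts.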


⚙️ Quick-Start Installation

Install the package from PyPI:

pip install neuralk_foundry_ce

🔬 Development Installation

Clone the Repository

git clone https://github.com/Neuralk-AI/NeuralkFoundry-CE
cd NeuralkFoundry-CE

Create a Dedicated Environment (recommended)

Neuralk Foundry relies on a variety of external machine learning libraries. As a result, managing package versions can be delicate. To avoid compatibility issues, we strongly recommend installing Foundry in a dedicated virtual environment (e.g., using conda or venv).

conda create -n foundry python=3.11
conda activate foundry

Install the Package

pip install -e .
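After either installation route, a quick way to confirm that the package is visible in the active environment is to query its installed version with the standard library. This is a small sketch; `installed_version` is a helper defined here, not something Foundry provides, and the distribution name is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version("neuralk_foundry_ce")
if found is None:
    print("neuralk_foundry_ce is not installed in this environment")
else:
    print(f"neuralk_foundry_ce {found} is installed")
```

If the package is reported as missing, check that the intended environment (for example, the `foundry` conda environment) is activated.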

Examples and tutorials

Citing Foundry

If you incorporate any part of this repository into your work, please reference it using the following citation:

@article{neuralk2025foundry,
         title={Foundry: A Modular Machine Learning Framework for Industrial Tasks}, 
         author={Neuralk-AI},
         year={2025},
         publisher = {GitHub},
         journal = {GitHub repository},
         howpublished = {\url{https://github.com/Neuralk-AI/NeuralkFoundry-CE}},
}

Contact

If you have any questions or wish to propose new features, please feel free to open an issue or contact us at alex@neuralk-ai.com.

For collaborations, please contact us at antoine@neuralk-ai.com.
