
FM Training Estimator

Estimators for Large Language Model Training.

Estimate resource consumption (memory, tokens, time, etc.) for training and fine-tuning jobs using a hybrid of theory and learned regression models.

Feature Matrix and Roadmap

| Technique | Support |
|---|---|
| Full (1 GPU) | :heavy_check_mark: |
| FSDP (multi-GPU) | :heavy_check_mark: |
| LoRA (1 GPU) | :heavy_check_mark: |
| QLoRA (1 GPU) | Planned |
| Speculators | Planned |
| Tensor Parallelism | Planned |

Time

Fully learned approach. Coverage depends on availability of training data.

Memory

Hybrid theory + learned. Coverage of learned approach is subject to availability of training data.

Tokens

Fully theory-based. Simulation-based models are available.

| Technique | Explanation | Availability |
|---|---|---|
| TE0 | Simulation based - slow but accurate | :heavy_check_mark: |
| TE1 | Statistical | Planned |
| TE2 | Approximate - fast, light, reasonably accurate | Coming soon |

Usage

You can use the fm_training_estimator library as a Python package by installing it via pip (see Install below), building a regression model, and using the library. If you'd like to run the estimator service with a Web UI via FastAPI, or build a Docker image, clone the repository to your local machine before following the instructions in those sections.

Within your working directory, it is recommended to create a virtual environment to avoid dependency conflicts.

python -m venv .venv
source .venv/bin/activate

Install

pip install fm_training_estimator

Build a regression model for learned prediction method

Now, prepare data in the expected format for lookup and regression. Example data CSV files are provided in the repository. Save your data file as ./workdir/data.csv.

mkdir workdir
mv <data file> ./workdir/data.csv
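If you don't yet have benchmark data, you can generate a tiny placeholder file just to exercise the pipeline. Note this is only a sketch: the target columns tokens_per_second, memory, and memory_act come from the train() call below, but the feature columns here are illustrative assumptions, not the estimator's real schema (refer to the example CSV files in the repository for that).

```python
import csv
import os

# Hypothetical rows: feature column names are assumptions for illustration;
# only the three target columns are taken from the train() call below.
rows = [
    {"model_name": "ibm-granite/granite-7b-base", "gpus": 1,
     "batch_size": 4, "seq_len": 1024,
     "tokens_per_second": 2500.0, "memory": 36.4, "memory_act": 5.1},
    {"model_name": "ibm-granite/granite-7b-base", "gpus": 1,
     "batch_size": 8, "seq_len": 1024,
     "tokens_per_second": 4100.0, "memory": 52.8, "memory_act": 10.2},
]

os.makedirs("workdir", exist_ok=True)
with open("workdir/data.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
```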

Now, build a regression model from this data:

from fm_training_estimator.regressor.xgboost.train import train 
train("./workdir/data.csv", "./workdir/model.json", ["tokens_per_second","memory","memory_act"])

This will create a model file at ./workdir/model.json, which you can then use to estimate resource consumption.

You can now run the estimator library, see below.

Use the library to get estimates

For a full API reference, visit our readthedocs.

Example code:

# Standard
import os

# First Party
from fm_training_estimator.config.arguments import (
    DataArguments,
    EstimateInput,
    EstimatorMetadata,
    FMArguments,
    HFTrainingArguments,
    InfraArguments,
    JobConfig,
)
from fm_training_estimator.sdk import (
    estimate_cost,
    estimate_memory,
    estimate_time,
    estimate_tokens,
)

workdir_path = os.path.join(os.path.abspath(os.curdir), "workdir")

model_path = os.path.join(workdir_path, "model.json")
lookup_data_path = os.path.join(workdir_path, "data.csv")

estimator_metadata = EstimatorMetadata(base_data_path=lookup_data_path)

fm = FMArguments(
    base_model_path="ibm-granite/granite-7b-base",
    torch_dtype="bfloat16",
    block_size=1024,
)
hf_training = HFTrainingArguments(
    per_device_train_batch_size=1, gradient_checkpointing=False
)
data = DataArguments(dataset="imdb", te_approach=0)
infra = InfraArguments(numGpusPerPod=1)
job_conf = JobConfig(hf_training, fm, data, infra)
est_input = EstimateInput(estimator_metadata=estimator_metadata, job_configs=[job_conf])

print("Estimating Memory:....")

print("With only theory: ", estimate_memory(est_input))
print("With reg model: ", estimate_memory(est_input, model_path))

hf_training.fsdp = "full_shard"

print("Using fsdp full shard")
print("With only theory: ", estimate_memory(est_input))
print("With reg model: ", estimate_memory(est_input, model_path))


print("Estimating Time:....")
print("With only theory: ", estimate_time(est_input))
print("With reg model: ", estimate_time(est_input, model_path))

print("Estimating Tokens:....")
print("With only theory: ", estimate_tokens(est_input))
print("With reg model: ", estimate_tokens(est_input, model_path))

Make estimates via a Web UI

To do this, first create a text file called model_whitelist.txt in workdir/ with a list of model names, one per line. These are the models for which the estimator will estimate resource consumption. You can use the provided example, place it in your workdir, and modify the list as needed.
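For example, a minimal whitelist could be created like so (the model names below are placeholders; list whichever models you want to estimate):

```shell
mkdir -p workdir
cat > workdir/model_whitelist.txt <<'EOF'
ibm-granite/granite-7b-base
ibm-granite/granite-8b-code-base
EOF
```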

Now, run the ui:

make run-web-ui

This will start the UI on localhost, port 3000.

(The Web UI has other options not covered in this simple setup. If you want to skip model whitelisting or change the port, run the UI directly as described in the README in the ./fm_training_estimator/ui folder.)

Build a Docker Container Image

To build the estimator container image:

  1. Make sure both model.json and data.csv files are present in the workdir folder.

  2. Use this command to build and push the image:

make cbuild
make cpush # If you want to push to the container registry

  3. Use this command to run the image:

docker run --rm -it -v "/path/to/input.json:/app/input.json" icr.io/ftplatform/fm_training_estimator:latest
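The mounted input.json mirrors the EstimateInput structure used in the Python example above. The sketch below is an assumption about the serialized field names (derived from the Python argument classes), not a documented schema:

```json
{
  "estimator_metadata": { "base_data_path": "/app/data.csv" },
  "job_configs": [
    {
      "fm": {
        "base_model_path": "ibm-granite/granite-7b-base",
        "torch_dtype": "bfloat16",
        "block_size": 1024
      },
      "hf_training": {
        "per_device_train_batch_size": 1,
        "gradient_checkpointing": false
      },
      "data": { "dataset": "imdb", "te_approach": 0 },
      "infra": { "numGpusPerPod": 1 }
    }
  ]
}
```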
