Shared database, model-loading, and vacua-vault infrastructure for string compactifications.

Reason this release was yanked: upgrade jaxvacua

StringForge

Shared database, model-loading, and vacua-vault infrastructure for string-compactification workflows.

StringForge is the infrastructure layer for the StringJAX ecosystem of JAX-based string-compactification packages. It provides reproducible access to Calabi-Yau geometry databases, bridges those data into physics engines such as JAXVacua, and manages persistent vacuum-solution storage with provenance.

The package is intentionally solver-light. It does not replace JAXVacua, KahlerJAX, JAXiverse, or CYTools. Instead, it standardises the shared conventions that those packages and downstream scans need: catalogue queries, lazy downloads, cache/offline workflows, model loading, vault layout, validation, and curation.

What StringForge owns

  • Geometry databases. Unified access to hosted TDF/Kreuzer-Skarke and CICY datasets through CYDatabase, TDFDatabase, CICYDatabase, and LCSDatabase.
  • Lazy local caching. Catalogues and parquet shards are downloaded on demand and cached under a configurable data directory, with explicit offline mode for HPC jobs.
  • Model-loading bridges. LCSDatabase loads database rows as jaxvacua.lcs.lcs_tree objects or fully initialised JAXVacua FluxVacuaFinder models when JAXVacua is installed.
  • Vacua vault. VacuaWriter designates, validates, queries, uploads, fetches, retracts, and purges vacuum-solution parquet files in a shared vault layout.
  • Vault validation tools. stringforge.vacuavault validates parquet submissions, rebuilds catalogues, and supports curation workflows without importing physics solvers.
  • Advanced curated indices. KKLTDatabase exposes a specialised KKLT subset, indexed by conifold class, that is used for KKLT-style searches, tags, and TDF hand-off.
  • Production vacuum forging. Vulcan is the cluster-side, append-only counterpart to VacuaWriter. Workers stage validated parquet shards locally, and a head node batches them into one HfApi.create_commit call per max_batch-sized chunk (default 500 files per commit), with a rolling-window budget that respects HuggingFace's 100-commit-per-hour cap. VulcanReader and VulcanMLView give downstream consumers queryable rows and deterministic, geometry-disjoint train/val/test splits: rows sharing a geometry_id always land in the same split, regardless of process or seed.
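
The geometry-disjoint split described in the last item can be illustrated with a stable hash of geometry_id. The following is a minimal sketch, not VulcanMLView's actual implementation: the function name assign_split, the split fractions, and the geometry_id string format are all assumptions made for illustration.

```python
import hashlib


def assign_split(geometry_id: str, val_frac: float = 0.1, test_frac: float = 0.1) -> str:
    """Map a geometry_id to 'train', 'val', or 'test' deterministically.

    Hypothetical helper: the name and the default fractions are assumptions.
    """
    # SHA-256 is stable across processes and machines, unlike Python's
    # built-in hash(), which is salted per process for strings.
    digest = hashlib.sha256(geometry_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an approximately uniform number in [0, 1].
    u = int.from_bytes(digest[:8], "big") / 2**64
    if u < test_frac:
        return "test"
    if u < test_frac + val_frac:
        return "val"
    return "train"


# Hypothetical geometry_id strings; two rows share the first geometry.
rows = [
    {"geometry_id": "h11=3/h12=2/ks=384564/t=0", "tau_im": 4.0},
    {"geometry_id": "h11=3/h12=2/ks=384564/t=0", "tau_im": 2.1},
    {"geometry_id": "h11=4/h12=2/ks=17/t=1", "tau_im": 3.3},
]

splits = {"train": [], "val": [], "test": []}
for row in rows:
    splits[assign_split(row["geometry_id"])].append(row)
```

Because the assignment depends only on geometry_id, no random seed is involved and every worker reproduces the same partition, so vacua from one geometry never leak across splits.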

What StringForge does not own

  • It is not the flux-vacuum solver. Vacuum search, period calculations, ISD sampling, flux bounding, and stability analysis live in JAXVacua.
  • It is not a public release of KahlerJAX or JAXiverse. Those packages remain planned ecosystem consumers until their own releases are ready.
  • It is not the owner of every derived dataset used by collaborators. Public pages distinguish hosted StringForge datasets from collaborator-generated or paper-specific data.
  • It is not a monolithic umbrella package that imports every physics engine on startup. Imports stay lightweight and optional physics packages are loaded only when a workflow needs them.
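
One common way to keep imports lightweight, as the last item describes, is to resolve optional packages only at the point of use. The sketch below shows that general pattern with the standard library; the helper names optional_import and require are hypothetical and are not part of the StringForge API.

```python
import importlib
import importlib.util


def optional_import(name: str):
    """Return the top-level module `name`, or None if it is not installed."""
    # find_spec checks for availability without importing the package.
    if importlib.util.find_spec(name) is None:
        return None
    return importlib.import_module(name)


def require(name: str, purpose: str):
    """Import `name` on demand, with a readable error if it is missing."""
    module = optional_import(name)
    if module is None:
        raise ImportError(f"'{name}' is required for {purpose} but is not installed")
    return module


# A standard-library module stands in for a physics engine here.
json_module = require("json", "the demonstration")
missing = optional_import("surely_not_an_installed_package_xyz")
```

With this pattern, importing the top-level package stays cheap, and the cost of loading a physics engine is paid only by workflows that call into it.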

Architecture

CYDatabase      -> pure I/O, HuggingFace downloads, cache, catalogue queries
    |
LCSDatabase     -> mirror-convention model loading for JAXVacua workflows
    |
VacuaWriter     -> designated vacua, vault catalogues, push/fetch workflows

Vulcan          -> cluster-side production: stage locally, batch-commit
                   via HfApi.create_commit, query/stream as ML dataset
VulcanReader    -> read-side catalogue scan, run / shard fetch
VulcanMLView    -> geometry-disjoint train/val/test splits for ML

KKLTDatabase is an advanced LCSDatabase-style interface for a curated TDF subset. It does not duplicate the TDF geometry data; it stores logical links, conifold-class provenance, and curation tags. Actual KKLT vacuum records belong in the shared vacua_vault infrastructure.

Vulcan deliberately complements VacuaWriter rather than replacing it. VacuaWriter designates curated low-volume vacua into the paper-aligned vacua_vault, while Vulcan forges high-volume cluster output into a separate production repo. That repo uses a uniform parquet schema that combines fixed-shape columns with pad-to-tensor list columns (flux, moduli_re, moduli_im, F_terms_*), and VulcanMLView provides deterministic geometry-disjoint train/val/test splits on top of it. The two share the parquet floor (flux, moduli_re, moduli_im, tau_re, tau_im), so a future promotion step can lift production runs into the curated vault.
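
The term "pad-to-tensor list columns" presumably refers to variable-length list columns that are padded to a common width so that rows from different geometries can be stacked into one rectangular array. A minimal sketch of that idea follows, assuming zero padding and a boolean mask; the helper pad_to_tensor is hypothetical and the real Vulcan schema may differ.

```python
def pad_to_tensor(rows, pad_value=0.0):
    """Pad variable-length rows to a rectangular list of lists plus a mask.

    Hypothetical helper: the padding value and the mask layout are assumptions.
    """
    width = max((len(r) for r in rows), default=0)
    padded = [list(r) + [pad_value] * (width - len(r)) for r in rows]
    # True marks real entries, False marks padding.
    mask = [[True] * len(r) + [False] * (width - len(r)) for r in rows]
    return padded, mask


# moduli_im values for one geometry with two moduli and one with three.
moduli_im = [[2.5, 3.0], [1.2, 0.9, 4.1]]
padded, mask = pad_to_tensor(moduli_im)
```

Keeping the mask next to the padded values lets downstream code ignore the padding when it computes losses or statistics.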

Ecosystem packages

| Package | Role | Release status |
| --- | --- | --- |
| JAXVacua (docs) | Type IIB flux vacua, complex-structure/axio-dilaton EFTs, vacuum finding, stability analysis | Public |
| JAXPolyLog (docs) | JAX-compatible polylogarithms with autodiff support | Public |
| KahlerJAX | Kähler-moduli stabilisation for 4D N=1 EFTs | Planned; not a StringForge dependency |
| JAXiverse | Multi-axion EFT spectra, decay constants, and couplings | Planned; not a StringForge dependency |
| CYTools | External toric Calabi-Yau geometry package | Public external dependency for selected workflows |

Quick start

from stringforge import LCSDatabase

# Query the hosted TDF catalogue. The constructor itself performs no network I/O;
# the catalogue is fetched lazily on first query.
db = LCSDatabase(dataset="tdf", cache_dir=".stringforge_cache")
models = db.query(h12=2, has_conifolds=True).head(5)
print(models[["h11", "h12", "ks_id", "triang_id", "n_conifolds"]])

# Load one catalogue row as JAXVacua-compatible data.
row = models.iloc[0]
tree = db.load(
    h11=int(row["h11"]),
    h12=int(row["h12"]),
    ks_id=int(row["ks_id"]),
    triang_id=int(row["triang_id"]),
    include_gv=False,
)

# Or construct the corresponding JAXVacua FluxVacuaFinder directly.
finder = db.load_model(
    h11=int(row["h11"]),
    h12=int(row["h12"]),
    ks_id=int(row["ks_id"]),
    triang_id=int(row["triang_id"]),
    include_gv=False,
)

The returned finder is a JAXVacua FluxVacuaFinder. Use the JAXVacua documentation for vacuum-search, flux-sampling, period-calculation, and stability-analysis workflows.
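
The Quick start notes that the constructor performs no network I/O and that the catalogue is fetched lazily on first query. The sketch below illustrates that pattern, together with an offline mode, using only the standard library. The class LazyCatalogue and its fetch callback are hypothetical stand-ins and do not reflect how CYDatabase is actually implemented.

```python
import tempfile
from pathlib import Path
from typing import Callable


class LazyCatalogue:
    """Download-on-first-use file cache (illustrative only)."""

    def __init__(self, cache_dir, filename: str,
                 fetch: Callable[[str], bytes], offline: bool = False):
        # The constructor only records settings; it performs no I/O.
        self.path = Path(cache_dir) / filename
        self.filename = filename
        self._fetch = fetch
        self.offline = offline

    def read(self) -> bytes:
        if self.path.exists():
            return self.path.read_bytes()  # cache hit: no download
        if self.offline:
            raise FileNotFoundError(f"{self.filename} is not cached and offline mode is on")
        data = self._fetch(self.filename)  # cache miss: download once
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_bytes(data)
        return data


calls = []


def fake_fetch(name: str) -> bytes:
    """Stand-in for a real download; records each request."""
    calls.append(name)
    return b"catalogue-bytes"


with tempfile.TemporaryDirectory() as tmp:
    catalogue = LazyCatalogue(tmp, "catalogue.parquet", fake_fetch)
    first = catalogue.read()   # triggers the single download
    second = catalogue.read()  # served from the cache
```

On an HPC node without network access, the same object constructed with offline=True either reads from a pre-populated cache or fails immediately, rather than hanging on a download.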

Vulcan: production vacuum forging

For high-volume cluster workloads, use Vulcan instead of VacuaWriter to publish runs to a separate production repo without tripping over HuggingFace's commit-rate cap.

from stringforge.vulcan import Vulcan

# On a cluster worker: stage a batch locally; no HuggingFace I/O.
forge = Vulcan.from_env()                      # reads STRINGFORGE_VULCAN_*
forge.write(
    vacua_df,
    geometry={"h11": 3, "h12": 2, "ks_id": 384564, "triang_id": 0},
    tadpole_charge=12,
    solver={"name": "newton", "config_hash": "abc123"},
    provenance={"git_sha": "deadbeef", "seed": 42, "wall_clock_s": 3.4},
)

# On the head node (cron, daemon, or manual): drain pending shards
# into batched commits respecting the 90/hour budget.
report = forge.sync(max_batch=500)             # one create_commit per chunk of up to 500 files

# After sync: query, fetch a specific run, or build an ML view.
susy = forge.query(h12=2, solver_name="newton", is_susy=True)
train = forge.ml_view().as_dataframe("train")  # deterministic, geometry-disjoint split

CLI: python -m stringforge.vulcan {status,sync}. See "Vulcan cluster runs" in the documentation for the full cluster best-practice walkthrough.
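
To make the batching and rate budget concrete, the sketch below chunks pending shards into groups of at most max_batch files and spends one unit of a rolling one-hour budget per chunk. It is an illustration under stated assumptions, not Vulcan's implementation: RollingBudget, chunked, and drain are hypothetical names, and the commit callback stands in for a real HfApi.create_commit call.

```python
import time
from collections import deque


class RollingBudget:
    """Allow at most max_events within any window of window_s seconds."""

    def __init__(self, max_events=90, window_s=3600.0, clock=time.monotonic):
        self.max_events = max_events
        self.window_s = window_s
        self._clock = clock
        self._stamps = deque()

    def try_acquire(self) -> bool:
        now = self._clock()
        # Forget events that have left the rolling window.
        while self._stamps and now - self._stamps[0] >= self.window_s:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_events:
            return False
        self._stamps.append(now)
        return True


def chunked(items, max_batch=500):
    """Split items into consecutive chunks of at most max_batch elements."""
    return [items[i:i + max_batch] for i in range(0, len(items), max_batch)]


def drain(pending, budget, commit, max_batch=500):
    """Commit chunks while the budget allows; return the files left over."""
    chunks = chunked(pending, max_batch)
    for i, chunk in enumerate(chunks):
        if not budget.try_acquire():
            return [f for c in chunks[i:] for f in c]
        commit(chunk)  # stand-in for one batched commit
    return []


# 1200 staged shards and a deliberately tiny budget of 2 commits per hour.
now = [0.0]
budget = RollingBudget(max_events=2, clock=lambda: now[0])
commits = []
pending = [f"shard_{i}.parquet" for i in range(1200)]
leftover = drain(pending, budget, commits.append)
```

Files that do not fit into the current window remain pending and are picked up by a later drain call, which matches the append-only, retry-friendly staging described above.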

Vacua vault workflow

import pandas as pd

vacua = pd.DataFrame({
    "flux": [[1, 0, -2, 3, 0, 1]],
    "moduli_re": [[0.0, 0.0]],
    "moduli_im": [[2.5, 3.0]],
    "tau_re": [0.0],
    "tau_im": [4.0],
    "is_susy": [True],
})

# Reuses db and row from the Quick start example above.
db.designate_vacua(
    vacua,
    label="example_run",
    committed_by="A. Schachner",
    h11=int(row["h11"]),
    h12=int(row["h12"]),
    ks_id=int(row["ks_id"]),
    triang_id=int(row["triang_id"]),
)

designated = db.query_vacua(label="example_run")
print(designated[["label", "n_vacua", "created"]])
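
Before designating vacua, it can help to check that each record carries the shared parquet floor columns (flux, moduli_re, moduli_im, tau_re, tau_im). The sketch below is a plain-Python pre-check under stated assumptions. The helper validate_row and its specific rules (integer flux quanta, equal-length moduli lists) are illustrative and are not the checks performed by stringforge.vacuavault.

```python
REQUIRED = ("flux", "moduli_re", "moduli_im", "tau_re", "tau_im")


def validate_row(row: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    missing = [c for c in REQUIRED if c not in row]
    if missing:
        return ["missing columns: " + ", ".join(missing)]
    problems = []
    # Flux quanta are integers; bool is excluded because it subclasses int.
    if not all(isinstance(n, int) and not isinstance(n, bool) for n in row["flux"]):
        problems.append("flux entries must be integers")
    # Real and imaginary parts must describe the same number of moduli.
    if len(row["moduli_re"]) != len(row["moduli_im"]):
        problems.append("moduli_re and moduli_im differ in length")
    return problems


good = {
    "flux": [1, 0, -2, 3, 0, 1],
    "moduli_re": [0.0, 0.0],
    "moduli_im": [2.5, 3.0],
    "tau_re": 0.0,
    "tau_im": 4.0,
}
bad = dict(good, moduli_im=[2.5])
good_problems = validate_row(good)
bad_problems = validate_row(bad)
```

Returning a list of problems instead of raising on the first one makes it easy to report every issue in a submission at once.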

Installation

Prerequisites: Python >= 3.12. If GPU acceleration is needed, install JAX with CUDA support first.

# Recommended once the package is public on PyPI
pip install stringforge

# Development install from a local clone
git clone https://github.com/AndreasSchachner/stringforge.git
cd stringforge
pip install -e .

Caution: StringForge workflows that construct JAX models require float64 precision. JAX Metal on macOS does not support the required complex float64 operations; use the CPU backend on Mac.
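
JAX defaults to 32-bit precision, so float64 has to be enabled explicitly unless a package does it for you. The sketch below shows the standard JAX settings for 64-bit precision and for selecting the CPU backend. Whether StringForge or JAXVacua already applies these settings on import is not stated here, so treat this as a precaution rather than a requirement.

```python
import os

# Select the CPU backend. Set this before importing JAX so that it is in
# place when the backend is initialised; setdefault keeps any existing value.
os.environ.setdefault("JAX_PLATFORMS", "cpu")

import jax
import jax.numpy as jnp

# Enable 64-bit types; do this at startup, before creating any arrays.
jax.config.update("jax_enable_x64", True)

x = jnp.ones(3)
z = jnp.asarray(1.0 + 2.0j)
print(x.dtype, z.dtype)
```

If the printed dtypes are float32 and complex64, the x64 flag was not applied early enough in the process.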

Documentation

Build the documentation locally with:

cd documentation
pip install -r requirements.txt
make html

The full JAXVacua API reference is available at jaxvacua.readthedocs.io.

Requirements

Core dependencies installed by pip:

  • NumPy
  • Pandas and PyArrow
  • HuggingFace Hub
  • JAX and jaxlib
  • JAXPolyLog
  • JAXVacua

Optional workflow dependencies:

  • CYTools for constructing models from Kreuzer-Skarke polytopes.
  • python-flint for exact arithmetic in selected downstream routines.

Citation

If you find this work useful, please cite the companion paper as the primary reference and the software release as the secondary reference. When StringForge is used to drive a JAXVacua flux-vacuum search, please additionally cite the JAXVacua framework paper.

Companion paper (preferred, in preparation). A single-author paper describing the conventions, dataset structure and capabilities of StringForge is in preparation. The arXiv identifier and journal/DOI will be added here at submission; until then, cite the temporary manuscript entry below together with the software release.

@article{Schachner:2026stringforge,
    author = "Schachner, Andreas",
    title = "{StringForge: shared database and vacuum-storage infrastructure for differentiable type IIB flux-compactification workflows}",
    note = "In preparation; arXiv ID and journal/DOI to be added at submission.",
    year = "2026"
}

Software release.

@software{schachner_2026_stringforge,
  author = {Schachner, Andreas},
  title = {StringForge: shared infrastructure for string-compactification workflows},
  year = {2026},
  version = {0.1.0},
  url = {https://github.com/AndreasSchachner/stringforge}
}

JAXVacua framework (cite when relevant). The upstream physics engine that StringForge provides infrastructure for was introduced in:

@article{Dubey:2023dvu,
    author = "Dubey, Abhishek and Krippendorf, Sven and Schachner, Andreas",
    title = "{JAXVacua --- a framework for sampling string vacua}",
    eprint = "2306.06160",
    archivePrefix = "arXiv",
    primaryClass = "hep-th",
    doi = "10.1007/JHEP12(2023)146",
    journal = "JHEP",
    volume = "12",
    pages = "146",
    year = "2023"
}

License

StringForge is released under the GNU General Public License v3.0.

Contact

Andreas Schachner
