StringForge

Shared database, model-loading, and vacua-vault infrastructure for string-compactification workflows.

StringForge is the infrastructure layer for the StringJAX ecosystem of JAX-based string-compactification packages. It provides reproducible access to Calabi-Yau geometry databases, bridges those data into physics engines such as JAXVacua, and manages persistent vacuum-solution storage with provenance.

The package is intentionally solver-light. It does not replace JAXVacua, KahlerJAX, JAXiverse, or CYTools. Instead, it standardises the shared conventions that those packages and downstream scans need: catalogue queries, lazy downloads, cache/offline workflows, model loading, vault layout, validation, and curation.

What StringForge owns

  • Geometry databases. Unified access to hosted TDF/Kreuzer-Skarke and CICY datasets through CYDatabase, TDFDatabase, CICYDatabase, and LCSDatabase.
  • Lazy local caching. Catalogues and parquet shards are downloaded on demand and cached under a configurable data directory, with explicit offline mode for HPC jobs.
  • Model-loading bridges. LCSDatabase loads database rows as jaxvacua.lcs.lcs_tree objects or fully initialised JAXVacua FluxVacuaFinder models when JAXVacua is installed.
  • Vacua vault. VacuaWriter designates, validates, queries, uploads, fetches, retracts, and purges vacuum-solution parquet files in a shared vault layout.
  • Vault validation tools. stringforge.vacuavault validates parquet submissions, rebuilds catalogues, and supports curation workflows without importing physics solvers.
  • Advanced curated indices. KKLTDatabase exposes a specialised kklt subset, indexed by conifold class, that is used for KKLT-style searches, tags, and TDF hand-off.
  • Production vacuum forging. Vulcan is the cluster-side, append-only counterpart to VacuaWriter. Workers stage validated parquet shards locally, and a head node batches them into one HfApi.create_commit call per max_batch-sized chunk (default 500 files per commit). A rolling-window budget keeps commits within HuggingFace's 100-commit-per-hour cap. VulcanReader and VulcanMLView give downstream consumers queryable rows and deterministic, geometry-disjoint train/val/test splits: rows sharing a geometry_id always land in the same split, regardless of process or seed.
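The geometry-disjoint split described in the last bullet can be sketched with a stable hash of the geometry identifier. This is a minimal illustration, not VulcanMLView's actual implementation: the function name assign_split, the split fractions, and the geometry_id string format are all assumptions made for the example. It uses hashlib rather than Python's built-in hash(), because the latter is salted per process and would not give the same answer across workers.

```python
import hashlib


def assign_split(geometry_id: str, val_frac: float = 0.1, test_frac: float = 0.1) -> str:
    """Map a geometry_id to 'train', 'val', or 'test' using a stable hash.

    Hypothetical helper: the split depends only on the identifier, so it is
    identical in every process and needs no random seed.
    """
    digest = hashlib.sha256(geometry_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and scale it to the unit interval.
    u = int.from_bytes(digest[:8], "big") / 2**64
    if u < test_frac:
        return "test"
    if u < test_frac + val_frac:
        return "val"
    return "train"


# Illustrative rows; the geometry_id format is made up for this sketch.
rows = [
    {"geometry_id": "h11=3/h12=2/ks=384564/triang=0", "tau_im": 4.0},
    {"geometry_id": "h11=3/h12=2/ks=384564/triang=0", "tau_im": 2.7},
    {"geometry_id": "h11=4/h12=2/ks=17/triang=1", "tau_im": 3.1},
]

splits = {"train": [], "val": [], "test": []}
for row in rows:
    splits[assign_split(row["geometry_id"])].append(row)
```

Because the assignment is a pure function of geometry_id, two vacua found on the same geometry can never end up on opposite sides of a train/test boundary.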

What StringForge does not own

  • It is not the flux-vacuum solver. Vacuum search, period calculations, ISD sampling, flux bounding, and stability analysis live in JAXVacua.
  • It is not a public release of KahlerJAX or JAXiverse. Those packages remain planned ecosystem consumers until their own releases are ready.
  • It is not the owner of every derived dataset used by collaborators. Public pages distinguish hosted StringForge datasets from collaborator-generated or paper-specific data.
  • It is not a monolithic umbrella package that imports every physics engine on startup. Imports stay lightweight and optional physics packages are loaded only when a workflow needs them.
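The lightweight-import policy in the last bullet is commonly implemented by deferring the import of an optional package until a workflow asks for it. The sketch below shows that pattern with the standard library only. The helper name require_optional and its error message are hypothetical and are not taken from the StringForge source.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, purpose: str) -> ModuleType:
    """Import an optional dependency on first use, with a readable error.

    Hypothetical helper: nothing is imported at package start-up, so users
    who only query catalogues never pay for (or need) the physics engines.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is required for {purpose}; "
            "install it to use this workflow."
        ) from exc


# Usage: a stdlib module stands in for an optional physics package here.
json_module = require_optional("json", "the demonstration")

try:
    require_optional("a_package_that_is_not_installed_xyz", "model loading")
    missing_raised = False
except ImportError:
    missing_raised = True
```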

Architecture

CYDatabase      -> pure I/O, HuggingFace downloads, cache, catalog queries
    |
LCSDatabase     -> mirror-convention model loading for JAXVacua workflows
    |
VacuaWriter     -> designated vacua, vault catalogues, push/fetch workflows

Vulcan          -> cluster-side production: stage locally, batch-commit
                   via HfApi.create_commit, query/stream as ML dataset
VulcanReader    -> read-side catalogue scan, run / shard fetch
VulcanMLView    -> geometry-disjoint train/val/test splits for ML
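The download-on-demand and offline behaviour attributed to CYDatabase above can be sketched as a small file cache. This is an illustration under stated assumptions, not the package's code: the class LazyFileCache, its fetch callback, and the file names are hypothetical, and a stand-in function replaces the real HuggingFace download.

```python
import tempfile
from pathlib import Path
from typing import Callable


class LazyFileCache:
    """Fetch files on first request and reuse the local copy afterwards."""

    def __init__(self, cache_dir, fetch: Callable[[str], bytes], offline: bool = False):
        self.cache_dir = Path(cache_dir)
        self.fetch = fetch          # hypothetical downloader: name -> file bytes
        self.offline = offline      # offline mode: never touch the network

    def get(self, relative_name: str) -> Path:
        target = self.cache_dir / relative_name
        if target.exists():
            return target           # cache hit, no download
        if self.offline:
            raise FileNotFoundError(
                f"{relative_name} is not cached and offline mode is enabled"
            )
        target.parent.mkdir(parents=True, exist_ok=True)
        # Write to a temporary name first so an interrupted download
        # never leaves a truncated file under the final name.
        partial = target.with_suffix(target.suffix + ".part")
        partial.write_bytes(self.fetch(relative_name))
        partial.replace(target)
        return target


# Demo with a stand-in fetcher that records how often it is called.
calls = []

def fake_fetch(name: str) -> bytes:
    calls.append(name)
    return b"catalogue bytes"

with tempfile.TemporaryDirectory() as tmp:
    cache = LazyFileCache(tmp, fake_fetch)
    first = cache.get("tdf/catalogue.parquet")
    second = cache.get("tdf/catalogue.parquet")    # served from disk
    offline_cache = LazyFileCache(tmp, fake_fetch, offline=True)
    cached_ok = offline_cache.get("tdf/catalogue.parquet").exists()
    try:
        offline_cache.get("tdf/missing.parquet")
        offline_raised = False
    except FileNotFoundError:
        offline_raised = True
```

On an HPC cluster without outbound network access, the usual approach with a cache like this is to warm it on a login node and run the compute jobs in offline mode.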

KKLTDatabase is an advanced LCSDatabase-style interface for a curated TDF subset. It does not duplicate the TDF geometry data; it stores logical links, conifold-class provenance, and curation tags. Actual KKLT vacuum records belong in the shared vacua_vault infrastructure.

Vulcan deliberately complements VacuaWriter rather than replacing it. VacuaWriter designates curated, low-volume vacua into the paper-aligned vacua_vault. Vulcan forges high-volume cluster output into a separate production repo whose uniform parquet schema combines fixed-shape columns with pad-to-tensor list columns (flux, moduli_re, moduli_im, F_terms_*), and VulcanMLView provides deterministic, geometry-disjoint train/val/test splits on top of it. The two share the parquet floor (flux, moduli_re, moduli_im, tau_re, tau_im), so a future promotion step can lift production runs into the curated vault.
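To make the "pad-to-tensor" idea concrete: list columns such as moduli_im hold a different number of entries per geometry, so they must be padded to a common width before they can be stacked into a rectangular array. The sketch below does this in plain Python and also returns a mask that marks real entries. The function name, the zero pad value, and the mask are assumptions for illustration; how Vulcan pads in practice is not specified here.

```python
def pad_list_column(values, pad_value=0.0):
    """Pad variable-length rows to a rectangle and return (padded, mask).

    Hypothetical helper: mask[i][j] is True where padded[i][j] is real data
    and False where it is padding.
    """
    width = max((len(v) for v in values), default=0)
    padded = [list(v) + [pad_value] * (width - len(v)) for v in values]
    mask = [[True] * len(v) + [False] * (width - len(v)) for v in values]
    return padded, mask


# Three vacua on geometries with 2, 1, and 3 complex-structure moduli.
moduli_im = [[2.5, 3.0], [1.2], [0.7, 0.9, 1.1]]
padded, mask = pad_list_column(moduli_im)
# padded is now 3 x 3 and can be handed to numpy.asarray or jax.numpy.asarray.
```

Keeping the mask alongside the padded values lets a downstream model ignore padding instead of treating it as a modulus with value zero.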

Ecosystem packages

Package | Role | Release status
JAXVacua | Type IIB flux vacua, complex-structure/axio-dilaton EFTs, vacuum finding, stability analysis | Public
JAXPolyLog | JAX-compatible polylogarithms with autodiff support | Public
KahlerJAX | Kähler-moduli stabilisation for 4D N=1 EFTs | Planned; not a StringForge dependency
JAXiverse | Multi-axion EFT spectra, decay constants, and couplings | Planned; not a StringForge dependency
CYTools | External toric Calabi-Yau geometry package | Public external dependency for selected workflows

Quick start

from stringforge import LCSDatabase

# Query the hosted TDF catalogue. The constructor itself performs no network I/O;
# the catalogue is fetched lazily on first query.
db = LCSDatabase(dataset="tdf", cache_dir=".stringforge_cache")
models = db.query(h12=2, has_conifolds=True).head(5)
print(models[["h11", "h12", "ks_id", "triang_id", "n_conifolds"]])

# Load one catalogue row as JAXVacua-compatible data.
row = models.iloc[0]
tree = db.load(
    h11=int(row["h11"]),
    h12=int(row["h12"]),
    ks_id=int(row["ks_id"]),
    triang_id=int(row["triang_id"]),
    include_gv=False,
)

# Or construct the corresponding JAXVacua FluxVacuaFinder directly.
finder = db.load_model(
    h11=int(row["h11"]),
    h12=int(row["h12"]),
    ks_id=int(row["ks_id"]),
    triang_id=int(row["triang_id"]),
    include_gv=False,
)

The returned finder is a JAXVacua FluxVacuaFinder. Use the JAXVacua documentation for vacuum-search, flux-sampling, period-calculation, and stability-analysis workflows.

Vulcan: production vacuum forging

For high-volume cluster workloads, use Vulcan instead of VacuaWriter to publish runs to a separate production repo without tripping over HuggingFace's commit-rate cap.

from stringforge.vulcan import Vulcan

# On a cluster worker: stage a batch locally; no HuggingFace I/O.
forge = Vulcan.from_env()                      # reads STRINGFORGE_VULCAN_*
forge.write(
    vacua_df,
    geometry={"h11": 3, "h12": 2, "ks_id": 384564, "triang_id": 0},
    tadpole_charge=12,
    solver={"name": "newton", "config_hash": "abc123"},
    provenance={"git_sha": "deadbeef", "seed": 42, "wall_clock_s": 3.4},
)

# On the head node (cron, daemon, or manual): drain pending shards
# into batched commits within the 90-commit/hour rolling budget.
report = forge.sync(max_batch=500)             # one create_commit per chunk of up to 500 files

# After sync: query, fetch a specific run, or build an ML view.
susy = forge.query(h12=2, solver_name="newton", is_susy=True)
train = forge.ml_view().as_dataframe("train")  # deterministic, geometry-disjoint split

CLI: python -m stringforge.vulcan {status,sync}. See Vulcan cluster runs for the full cluster best-practice walkthrough.
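The batching and rate limiting described above can be sketched as follows. This is not Vulcan's implementation: the names RollingCommitBudget, chunked, and plan_commits are hypothetical, and no HuggingFace call is made. The sketch only shows how a rolling one-hour window and max_batch-sized chunks might interact, with an injectable clock so the behaviour can be checked without waiting.

```python
import time
from collections import deque


class RollingCommitBudget:
    """Allow at most max_commits within any window of window_s seconds."""

    def __init__(self, max_commits=90, window_s=3600.0, clock=time.monotonic):
        self.max_commits = max_commits
        self.window_s = window_s
        self.clock = clock
        self._stamps = deque()      # times of commits still inside the window

    def _evict(self, now):
        # Forget commits that have aged out of the rolling window.
        while self._stamps and now - self._stamps[0] >= self.window_s:
            self._stamps.popleft()

    def remaining(self):
        self._evict(self.clock())
        return self.max_commits - len(self._stamps)

    def try_acquire(self):
        now = self.clock()
        self._evict(now)
        if len(self._stamps) >= self.max_commits:
            return False
        self._stamps.append(now)
        return True


def chunked(paths, max_batch=500):
    """Yield consecutive slices of at most max_batch paths."""
    for start in range(0, len(paths), max_batch):
        yield paths[start:start + max_batch]


def plan_commits(paths, budget, max_batch=500):
    """Return (chunks to commit now, paths left for a later sync)."""
    chunks = list(chunked(paths, max_batch))
    ready = []
    for index, chunk in enumerate(chunks):
        if not budget.try_acquire():
            leftover = [p for later in chunks[index:] for p in later]
            return ready, leftover
        ready.append(chunk)         # one commit would be created per chunk
    return ready, []


# Demo with a fake clock and a deliberately tiny budget of 2 commits.
now = [0.0]
budget = RollingCommitBudget(max_commits=2, window_s=3600.0, clock=lambda: now[0])
pending = [f"shard_{i:05d}.parquet" for i in range(1200)]
ready, leftover = plan_commits(pending, budget, max_batch=500)
now[0] += 3600.0                    # an hour later the budget is free again
remaining_after_hour = budget.remaining()
```

With 500 files per commit, a budget of 90 commits per hour still allows up to 45,000 shards per hour, which is why batching rather than per-file commits is the relevant design choice for cluster output.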

Vacua vault workflow

import pandas as pd

vacua = pd.DataFrame({
    "flux": [[1, 0, -2, 3, 0, 1]],
    "moduli_re": [[0.0, 0.0]],
    "moduli_im": [[2.5, 3.0]],
    "tau_re": [0.0],
    "tau_im": [4.0],
    "is_susy": [True],
})

# db and row are the LCSDatabase instance and catalogue row from the Quick start.
db.designate_vacua(
    vacua,
    label="example_run",
    committed_by="A. Schachner",
    h11=int(row["h11"]),
    h12=int(row["h12"]),
    ks_id=int(row["ks_id"]),
    triang_id=int(row["triang_id"]),
)

designated = db.query_vacua(label="example_run")
print(designated[["label", "n_vacua", "created"]])
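Before designating vacua, it can help to check a submission locally. The sketch below is a hypothetical pre-flight check, not the validation performed by stringforge.vacuavault: it only verifies that the shared parquet floor columns (flux, moduli_re, moduli_im, tau_re, tau_im) are present, that they have the same number of rows, and that the real and imaginary moduli lists match in length. It works on a plain dict of lists, which pandas can produce with DataFrame.to_dict(orient="list").

```python
REQUIRED_COLUMNS = ("flux", "moduli_re", "moduli_im", "tau_re", "tau_im")


def validate_vacua(columns: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper working on a mapping of column name -> list of values.
    """
    missing = [name for name in REQUIRED_COLUMNS if name not in columns]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]

    row_counts = {name: len(columns[name]) for name in REQUIRED_COLUMNS}
    if len(set(row_counts.values())) != 1:
        return [f"columns differ in length: {row_counts}"]

    problems = []
    pairs = zip(columns["moduli_re"], columns["moduli_im"])
    for index, (re_part, im_part) in enumerate(pairs):
        if len(re_part) != len(im_part):
            problems.append(f"row {index}: moduli_re and moduli_im differ in length")
    return problems


# The same single-row example as above, written as a dict of lists.
example = {
    "flux": [[1, 0, -2, 3, 0, 1]],
    "moduli_re": [[0.0, 0.0]],
    "moduli_im": [[2.5, 3.0]],
    "tau_re": [0.0],
    "tau_im": [4.0],
    "is_susy": [True],
}
problems = validate_vacua(example)

broken = dict(example, moduli_im=[[2.5]])
broken_problems = validate_vacua(broken)
```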

Installation

Prerequisites: Python >= 3.12. If GPU acceleration is needed, install JAX with CUDA support first.

# Recommended once the package is public on PyPI
pip install stringforge

# Development install from a local clone
git clone https://github.com/AndreasSchachner/stringforge.git
cd stringforge
pip install -e .

Caution: StringForge workflows that construct JAX models require float64 precision. JAX Metal on macOS does not support the required complex float64 operations; use the CPU backend on Mac.
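One way to satisfy both requirements is to set JAX's standard environment variables before JAX is first imported. The sketch below is a suggestion rather than something StringForge is documented to do for you, and the helper name is hypothetical. JAX_ENABLE_X64 and JAX_PLATFORMS are JAX's own settings; the equivalent in-process call for the first is jax.config.update("jax_enable_x64", True).

```python
import os


def configure_jax_for_cpu_float64() -> None:
    """Request 64-bit precision and the CPU backend via environment variables.

    Must run before the first `import jax` in the process; settings applied
    after JAX has initialised its backend may not take effect.
    """
    os.environ["JAX_ENABLE_X64"] = "1"     # enable float64 / complex128
    os.environ["JAX_PLATFORMS"] = "cpu"    # avoid the Metal backend on macOS


configure_jax_for_cpu_float64()

# Optional check, skipped when JAX is not installed in this environment.
try:
    import jax.numpy as jnp
except ImportError:
    jnp = None

if jnp is not None:
    z = jnp.asarray(1.0 + 2.0j)
    print("complex dtype in use:", z.dtype)
```

Forcing the CPU backend is only needed where the default accelerator lacks complex float64 support, as on Mac; on CUDA machines the JAX_PLATFORMS line can be dropped.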

Documentation

Build the documentation locally with:

cd documentation
pip install -r requirements.txt
make html

The full JAXVacua API reference is available at jaxvacua.readthedocs.io.

Requirements

Core dependencies installed by pip:

  • NumPy
  • Pandas and PyArrow
  • HuggingFace Hub
  • JAX and jaxlib
  • JAXPolyLog
  • JAXVacua

Optional workflow dependencies:

  • CYTools for constructing models from Kreuzer-Skarke polytopes.
  • python-flint for exact arithmetic in selected downstream routines.

Citation

If you find this work useful, please cite the companion paper as the primary reference and the software release as the secondary reference. When StringForge is used to drive a JAXVacua flux-vacuum search, please additionally cite the JAXVacua framework paper.

Companion paper (preferred, in preparation). A single-author paper describing the conventions, dataset structure and capabilities of StringForge is in preparation. The arXiv identifier and journal/DOI will be added here at submission; until then, cite the temporary manuscript entry below together with the software release.

@article{Schachner:2026stringforge,
    author = "Schachner, Andreas",
    title = "{StringForge: shared database and vacuum-storage infrastructure for differentiable type IIB flux-compactification workflows}",
    note = "In preparation; arXiv ID and journal/DOI to be added at submission.",
    year = "2026"
}

Software release.

@software{schachner_2026_stringforge,
  author = {Schachner, Andreas},
  title = {StringForge: shared infrastructure for string-compactification workflows},
  year = {2026},
  version = {0.1.0},
  url = {https://github.com/AndreasSchachner/stringforge}
}

JAXVacua framework (cite when relevant). The upstream physics engine that StringForge provides infrastructure for was introduced in:

@article{Dubey:2023dvu,
    author = "Dubey, Abhishek and Krippendorf, Sven and Schachner, Andreas",
    title = "{JAXVacua --- a framework for sampling string vacua}",
    eprint = "2306.06160",
    archivePrefix = "arXiv",
    primaryClass = "hep-th",
    doi = "10.1007/JHEP12(2023)146",
    journal = "JHEP",
    volume = "12",
    pages = "146",
    year = "2023"
}

License

StringForge is released under the GNU General Public License v3.0.

Contact

Andreas Schachner
