StringForge
Shared database, model-loading, and vacua-vault infrastructure for string-compactification workflows.
StringForge is the infrastructure layer for the StringJAX ecosystem of JAX-based string-compactification packages. It provides reproducible access to Calabi-Yau geometry databases, bridges those data into physics engines such as JAXVacua, and manages persistent vacuum-solution storage with provenance.
The package is intentionally solver-light. It does not replace JAXVacua, KahlerJAX, JAXiverse, or CYTools. Instead, it standardises the shared conventions that those packages and downstream scans need: catalogue queries, lazy downloads, cache/offline workflows, model loading, vault layout, validation, and curation.
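The lazy-download and offline conventions mentioned above can be pictured with a small sketch. Nothing here is StringForge's actual API: `cached_fetch`, the `fetch` callback, and the `offline` flag are hypothetical names chosen for illustration, and only the standard library is used.

```python
from pathlib import Path
import tempfile


def cached_fetch(name, cache_dir, fetch, offline=False):
    """Return the local path for `name`, downloading it only on a cache miss.

    `fetch` is a hypothetical callable returning the file contents as bytes;
    in offline mode it is never called and a cache miss raises instead.
    """
    path = Path(cache_dir) / name
    if path.exists():
        return path  # cache hit: no network access
    if offline:
        raise FileNotFoundError(f"{name} is not cached and offline mode is on")
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(fetch(name))
    return path


# Usage: the second call is served from the cache, so `fake_fetch` runs once.
calls = []


def fake_fetch(name):
    calls.append(name)
    return b"catalogue bytes"


with tempfile.TemporaryDirectory() as tmp:
    first = cached_fetch("tdf/catalogue.parquet", tmp, fake_fetch)
    second = cached_fetch("tdf/catalogue.parquet", tmp, fake_fetch, offline=True)
    same_path = first == second
    try:
        cached_fetch("cicy/catalogue.parquet", tmp, fake_fetch, offline=True)
        offline_miss_raised = False
    except FileNotFoundError:
        offline_miss_raised = True
```

Failing loudly on an offline cache miss, rather than silently attempting a download, is the behaviour one would usually want on HPC nodes without network access.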
What StringForge owns
- Geometry databases. Unified access to hosted TDF/Kreuzer-Skarke and CICY datasets through CYDatabase, TDFDatabase, CICYDatabase, and LCSDatabase.
- Lazy local caching. Catalogues and parquet shards are downloaded on demand and cached under a configurable data directory, with explicit offline mode for HPC jobs.
- Model-loading bridges. LCSDatabase loads database rows as jaxvacua.lcs.lcs_tree objects or fully initialised JAXVacua FluxVacuaFinder models when JAXVacua is installed.
- Vacua vault. VacuaWriter designates, validates, queries, uploads, fetches, retracts, and purges vacuum-solution parquet files in a shared vault layout.
- Vault validation tools. stringforge.vacuavault validates parquet submissions, rebuilds catalogues, and supports curation workflows without importing physics solvers.
- Advanced curated indices. KKLTDatabase exposes a specialised conifold-class indexed kklt subset used for KKLT-style searches, tags, and TDF hand-off.
- Production vacuum forging. Vulcan is the cluster-side, append-only counterpart to VacuaWriter: workers stage validated parquet shards locally, a head node batches them into one HfApi.create_commit call per max_batch-sized chunk (default 500 files per commit), the rolling-window budget respects HuggingFace's 100-commit-per-hour cap, and VulcanReader/VulcanMLView give downstream consumers queryable rows and deterministic, geometry-disjoint train/val/test splits (rows sharing a geometry_id always land in the same split, regardless of process or seed).
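The batching and rolling-window budget described in the last item can be sketched in a few lines. This is an illustration under stated assumptions, not Vulcan's implementation: `chunk` and `RollingBudget` are hypothetical names, and the only numbers taken from the text are the default of 500 files per commit and the 100-commit-per-hour cap.

```python
from collections import deque


def chunk(files, max_batch=500):
    """Split staged files into commit-sized chunks (one commit per chunk)."""
    return [files[i:i + max_batch] for i in range(0, len(files), max_batch)]


class RollingBudget:
    """Allow at most `limit` commits in any sliding `window_s`-second window."""

    def __init__(self, limit=100, window_s=3600.0):
        self.limit = limit
        self.window_s = window_s
        self._stamps = deque()

    def try_acquire(self, now):
        # Forget commits that have left the window.
        while self._stamps and now - self._stamps[0] >= self.window_s:
            self._stamps.popleft()
        if len(self._stamps) >= self.limit:
            return False
        self._stamps.append(now)
        return True


# 1,200 staged shards become three commits: 500 + 500 + 200 files.
chunks = chunk([f"shard-{i:05d}.parquet" for i in range(1200)])
sizes = [len(c) for c in chunks]

# With a toy limit of 2 per hour, the third commit is refused until the
# first one has aged out of the window.
budget = RollingBudget(limit=2, window_s=3600.0)
decisions = [budget.try_acquire(t) for t in (0.0, 10.0, 20.0, 3600.0)]
```

With the stated defaults, the ceiling is 500 files per commit times 100 commits per hour, i.e. 50,000 files per hour.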
What StringForge does not own
- It is not the flux-vacuum solver. Vacuum search, period calculations, ISD sampling, flux bounding, and stability analysis live in JAXVacua.
- It is not a public release of KahlerJAX or JAXiverse. Those packages remain planned ecosystem consumers until their own releases are ready.
- It is not the owner of every derived dataset used by collaborators. Public pages distinguish hosted StringForge datasets from collaborator-generated or paper-specific data.
- It is not a monolithic umbrella package that imports every physics engine on startup. Imports stay lightweight and optional physics packages are loaded only when a workflow needs them.
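The last point, loading optional physics packages only when a workflow needs them, is commonly implemented with a deferred import. The helper below is a generic sketch of that pattern; `optional_import` is a hypothetical name and is not claimed to exist in StringForge.

```python
import importlib


def optional_import(module_name, needed_for):
    """Import `module_name` on demand, with a clear error if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name} is required for {needed_for}; "
            "install it to use this workflow"
        ) from exc


# A module that is present imports normally ...
json_mod = optional_import("json", "demo purposes")

# ... and a missing one fails only when the workflow asks for it.
try:
    optional_import("not_a_real_physics_engine", "flux-vacuum search")
    message = ""
except ImportError as err:
    message = str(err)
```

Calling such a helper inside the function that needs the solver, rather than at module top level, keeps the package import itself cheap.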
Architecture
CYDatabase   -> pure I/O, HuggingFace downloads, cache, catalog queries
    |
LCSDatabase  -> mirror-convention model loading for JAXVacua workflows
    |
VacuaWriter  -> designated vacua, vault catalogues, push/fetch workflows

Vulcan       -> cluster-side production: stage locally, batch-commit
                via HfApi.create_commit, query/stream as ML dataset
VulcanReader -> read-side catalogue scan, run / shard fetch
VulcanMLView -> geometry-disjoint train/val/test splits for ML
KKLTDatabase is an advanced LCSDatabase-style interface for a curated TDF subset. It does not duplicate the TDF geometry data; it stores logical links, conifold-class provenance, and curation tags. Actual KKLT vacuum records belong in the shared vacua_vault infrastructure.
Vulcan deliberately complements VacuaWriter rather than replacing it. VacuaWriter designates curated, low-volume vacua into the paper-aligned vacua_vault, while Vulcan forges high-volume cluster output into a separate production repo. That repo uses a uniform parquet schema, with fixed-shape columns plus pad-to-tensor list columns (flux, moduli_re, moduli_im, F_terms_*), and VulcanMLView provides deterministic, geometry-disjoint train/val/test splits over it. The two share the parquet floor (flux, moduli_re, moduli_im, tau_re, tau_im), so a future promotion step can lift production runs into the curated vault.
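One common way to obtain splits that are both deterministic and geometry-disjoint is to hash the geometry identifier and bucket the result. The sketch below shows that technique only; it is not VulcanMLView's code, and the `split_for` name, the split fractions, and the `geometry_id` string format are all assumptions made for illustration.

```python
import hashlib


def split_for(geometry_id, val_frac=0.1, test_frac=0.1):
    """Map a geometry_id to 'train', 'val', or 'test' via a stable hash."""
    digest = hashlib.sha256(str(geometry_id).encode("utf-8")).digest()
    # First 8 bytes as a fraction in [0, 1); independent of process or seed.
    u = int.from_bytes(digest[:8], "big") / 2**64
    if u < test_frac:
        return "test"
    if u < test_frac + val_frac:
        return "val"
    return "train"


# Hypothetical rows: the first two share a geometry, so they share a split.
rows = [
    {"geometry_id": "h11=3/h12=2/ks=384564/t=0", "tau_im": 4.0},
    {"geometry_id": "h11=3/h12=2/ks=384564/t=0", "tau_im": 5.5},
    {"geometry_id": "h11=4/h12=2/ks=17/t=1", "tau_im": 3.2},
]
splits = [split_for(r["geometry_id"]) for r in rows]
```

A cryptographic digest is used instead of Python's built-in `hash()` because string hashing is randomised per process by default, which would break reproducibility across workers.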
Ecosystem packages
| Package | Role | Release status |
|---|---|---|
| JAXVacua (docs) | Type IIB flux vacua, complex-structure/axio-dilaton EFTs, vacuum finding, stability analysis | Public |
| JAXPolyLog (docs) | JAX-compatible polylogarithms with autodiff support | Public |
| KahlerJAX | Kähler-moduli stabilisation for 4D N=1 EFTs | Planned; not a StringForge dependency |
| JAXiverse | Multi-axion EFT spectra, decay constants, and couplings | Planned; not a StringForge dependency |
| CYTools | External toric Calabi-Yau geometry package | Public external dependency for selected workflows |
Quick start
from stringforge import LCSDatabase

# Query the hosted TDF catalogue. The constructor itself performs no network I/O;
# the catalogue is fetched lazily on first query.
db = LCSDatabase(dataset="tdf", cache_dir=".stringforge_cache")
models = db.query(h12=2, has_conifolds=True).head(5)
print(models[["h11", "h12", "ks_id", "triang_id", "n_conifolds"]])

# Load one catalogue row as JAXVacua-compatible data.
row = models.iloc[0]
tree = db.load(
    h11=int(row["h11"]),
    h12=int(row["h12"]),
    ks_id=int(row["ks_id"]),
    triang_id=int(row["triang_id"]),
    include_gv=False,
)

# Or construct the corresponding JAXVacua FluxVacuaFinder directly.
finder = db.load_model(
    h11=int(row["h11"]),
    h12=int(row["h12"]),
    ks_id=int(row["ks_id"]),
    triang_id=int(row["triang_id"]),
    include_gv=False,
)
The returned finder is a JAXVacua FluxVacuaFinder. Use the JAXVacua documentation for vacuum-search, flux-sampling, period-calculation, and stability-analysis workflows.
Vulcan: production vacuum forging
For high-volume cluster workloads, use Vulcan instead of VacuaWriter to publish runs to a separate production repo without tripping over HuggingFace's commit-rate cap.
from stringforge.vulcan import Vulcan

# On a cluster worker: stage a batch locally; no HuggingFace I/O.
# vacua_df is a pandas DataFrame of solver output, assumed to exist already.
forge = Vulcan.from_env()  # reads STRINGFORGE_VULCAN_*
forge.write(
    vacua_df,
    geometry={"h11": 3, "h12": 2, "ks_id": 384564, "triang_id": 0},
    tadpole_charge=12,
    solver={"name": "newton", "config_hash": "abc123"},
    provenance={"git_sha": "deadbeef", "seed": 42, "wall_clock_s": 3.4},
)

# On the head node (cron, daemon, or manual): drain pending shards
# into batched commits respecting the 90/hour budget.
report = forge.sync(max_batch=500)  # one create_commit, many files

# After sync: query, fetch a specific run, or build an ML view.
susy = forge.query(h12=2, solver_name="newton", is_susy=True)
train = forge.ml_view().as_dataframe("train")  # deterministic, geometry-disjoint split
CLI: python -m stringforge.vulcan {status,sync}. See Vulcan cluster runs for the full cluster best-practice walkthrough.
Vacua vault workflow
import pandas as pd

# Reuses db and row from the Quick start example above.
vacua = pd.DataFrame({
    "flux": [[1, 0, -2, 3, 0, 1]],
    "moduli_re": [[0.0, 0.0]],
    "moduli_im": [[2.5, 3.0]],
    "tau_re": [0.0],
    "tau_im": [4.0],
    "is_susy": [True],
})

# Designate the vacua for the geometry identified by the catalogue row.
db.designate_vacua(
    vacua,
    label="example_run",
    committed_by="A. Schachner",
    h11=int(row["h11"]),
    h12=int(row["h12"]),
    ks_id=int(row["ks_id"]),
    triang_id=int(row["triang_id"]),
)

# Read the designated entry back from the vault catalogue.
designated = db.query_vacua(label="example_run")
print(designated[["label", "n_vacua", "created"]])
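Before designating vacua it can help to check a batch against the shared column set locally. The sketch below is a hypothetical pre-flight check, not the validator shipped in stringforge.vacuavault: the required columns are taken from the example above, while the specific rules (matching moduli lengths, integer flux entries) are assumptions about what a reasonable check might enforce.

```python
REQUIRED = ("flux", "moduli_re", "moduli_im", "tau_re", "tau_im")


def validate_records(records):
    """Return a list of human-readable problems; empty means the batch passes."""
    problems = []
    for i, rec in enumerate(records):
        missing = [c for c in REQUIRED if c not in rec]
        if missing:
            problems.append(f"row {i}: missing columns {missing}")
            continue
        if len(rec["moduli_re"]) != len(rec["moduli_im"]):
            problems.append(f"row {i}: moduli_re and moduli_im differ in length")
        if not all(isinstance(n, int) for n in rec["flux"]):
            problems.append(f"row {i}: flux entries must be integers")
    return problems


good = {"flux": [1, 0, -2, 3, 0, 1], "moduli_re": [0.0, 0.0],
        "moduli_im": [2.5, 3.0], "tau_re": 0.0, "tau_im": 4.0}
bad = {"flux": [1, 0.5], "moduli_re": [0.0], "moduli_im": [2.5, 3.0],
       "tau_re": 0.0, "tau_im": 4.0}
report = validate_records([good, bad, {"flux": [1]}])
```

Returning a list of problems instead of raising on the first one lets a curator see every issue in a submission at once.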
Installation
Prerequisites: Python >= 3.12. If GPU acceleration is needed, install JAX with CUDA support first.
# Recommended once the package is public on PyPI
pip install stringforge
# Development install from a local clone
git clone https://github.com/AndreasSchachner/stringforge.git
cd stringforge
pip install -e .
[!CAUTION] StringForge workflows that construct JAX models require float64 precision. JAX Metal on macOS does not support the required complex float64 operations; use the CPU backend on Mac.
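One way to satisfy both requirements is to set JAX's standard environment variables before JAX is imported. This is a generic JAX configuration sketch; whether StringForge sets any of this automatically is not stated here, and `configure_jax_env` is a hypothetical helper name.

```python
import os


def configure_jax_env():
    """Request 64-bit precision and the CPU backend via JAX's environment variables.

    Call this before `import jax`, since JAX reads these settings at import time.
    """
    os.environ["JAX_ENABLE_X64"] = "1"   # enable float64/complex128 support
    os.environ["JAX_PLATFORMS"] = "cpu"  # use the CPU backend instead of Metal
    return {k: os.environ[k] for k in ("JAX_ENABLE_X64", "JAX_PLATFORMS")}


settings = configure_jax_env()
```

The in-code equivalent for precision is `jax.config.update("jax_enable_x64", True)`, called once at startup before any arrays are created.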
Documentation
Build the documentation locally with:
cd documentation
pip install -r requirements.txt
make html
The full JAXVacua API reference is available at jaxvacua.readthedocs.io.
Requirements
Core dependencies installed by pip:
- NumPy
- Pandas and PyArrow
- HuggingFace Hub
- JAX and jaxlib
- JAXPolyLog
- JAXVacua
Optional workflow dependencies:
- CYTools for constructing models from Kreuzer-Skarke polytopes.
- python-flint for exact arithmetic in selected downstream routines.
Citation
If you find this work useful, please cite the companion paper as the primary reference and the software release as the secondary reference. When StringForge is used to drive a JAXVacua flux-vacuum search, please additionally cite the JAXVacua framework paper.
Companion paper (preferred, in preparation). A single-author paper describing the conventions, dataset structure and capabilities of StringForge is in preparation. The arXiv identifier and journal/DOI will be added here at submission; until then, cite the temporary manuscript entry below together with the software release.
@article{Schachner:2026stringforge,
author = "Schachner, Andreas",
title = "{StringForge: shared database and vacuum-storage infrastructure for differentiable type IIB flux-compactification workflows}",
note = "In preparation; arXiv ID and journal/DOI to be added at submission.",
year = "2026"
}
Software release.
@software{schachner_2026_stringforge,
author = {Schachner, Andreas},
title = {StringForge: shared infrastructure for string-compactification workflows},
year = {2026},
version = {0.1.0},
url = {https://github.com/AndreasSchachner/stringforge}
}
JAXVacua framework (cite when relevant). The upstream physics engine that StringForge provides infrastructure for was introduced in:
@article{Dubey:2023dvu,
author = "Dubey, Abhishek and Krippendorf, Sven and Schachner, Andreas",
title = "{JAXVacua --- a framework for sampling string vacua}",
eprint = "2306.06160",
archivePrefix = "arXiv",
primaryClass = "hep-th",
doi = "10.1007/JHEP12(2023)146",
journal = "JHEP",
volume = "12",
pages = "146",
year = "2023"
}
License
StringForge is released under the GNU General Public License v3.0.
Contact
Andreas Schachner
- Email: as3475@cornell.edu
- GitHub: github.com/AndreasSchachner
- Website: andreasschachner.github.io