CSP5: pip-installable NMR predictor for 13C and 1H.

CSP5

CSP5 is a pip-installable NMR predictor package with:

  • batched 13C and 1H prediction
  • prediction from precomputed geometries
  • shift-matching utilities with three solvers: dp (default), scipy, and murty (k-best)

Bundled defaults:

  • 13C model: CSP5-13C (model_id: csp5-13c)
  • 1H model: CSP5-1H (model_id: csp5-1h)

Install

Requires Python 3.9 or newer.

pip install CSP5

Prediction CLI

In interactive terminals, csp5 prints status lines to stderr before and after prediction. If a run is slow, it also notes that the first invocation can take longer while dependencies and model weights initialize, and it prints periodic "still working" updates during long runs. Use --no-status to silence these messages.

From SMILES

csp5 --smiles "CCO" --nucleus 1H
csp5 --smiles "CCO" --nucleus both
csp5 --smiles-file smiles.txt --nucleus 13C --batch-size 64
csp5 --smiles "CCO" --nucleus 13C --num-conformers 8
csp5 --smiles "CCO" --nucleus both --num-conformers 8 --output-conformers-json cco_conformers.json
csp5 --smiles "CCO" --nucleus 13C --num-conformers 8 --output-conformers-sdf cco_conformers.sdf
csp5 --smiles "CCO" --nucleus 13C --output-svg cco_13c.svg
csp5 --smiles "CCO" --nucleus 13C --output-svg cco_13c.svg --svg-bond-length 72 --svg-shift-font-scale 1.1
csp5 --smiles "CCO" --nucleus 1H --output-html cco_1h.html
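
When the CLI is driven from a script, building the argument list in one place can help keep flag combinations consistent. The sketch below uses only flags shown above; build_csp5_command is a hypothetical helper for illustration and is not part of CSP5.

```python
import shlex
from typing import List, Optional


def build_csp5_command(
    smiles: str,
    nucleus: str = "13C",
    num_conformers: Optional[int] = None,
    output_svg: Optional[str] = None,
    no_status: bool = False,
) -> List[str]:
    """Assemble an argv list for the csp5 CLI (hypothetical helper, not part of CSP5)."""
    if nucleus not in {"13C", "1H", "both"}:
        raise ValueError(f"unsupported nucleus: {nucleus!r}")
    argv = ["csp5", "--smiles", smiles, "--nucleus", nucleus]
    if num_conformers is not None:
        argv += ["--num-conformers", str(num_conformers)]
    if output_svg is not None:
        argv += ["--output-svg", output_svg]
    if no_status:
        argv.append("--no-status")
    return argv


cmd = build_csp5_command("CCO", nucleus="13C", num_conformers=8, output_svg="cco_13c.svg")
# Print a shell-quoted form for logging; the list itself is what subprocess expects.
print(shlex.join(cmd))
```

With csp5 installed, the list can be passed to subprocess.run(cmd, check=True). Passing a list rather than a shell string avoids quoting problems with SMILES characters such as parentheses and #.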

From molecule files (molfile or SDF)

By default, molecule-file input uses the coordinates embedded in the file. Add --regenerate-geometry to keep the input atom order/numbering while generating fresh ETKDG + MMFF/UFF coordinates for prediction.

csp5 --molecule-file input.mol --nucleus 13C
csp5 --molecule-file input.sdf --nucleus 1H --regenerate-geometry

From precomputed geometries (parquet structures dataset)

Input dataset requirements:

  • required columns: smiles, molblock
  • optional columns: conformer_rank, conformer_id, energy, energy_method
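
Before handing a dataset to csp5, it can be useful to check its column names against the lists above. The sketch below is a standalone check on a list of column names; check_structure_columns is a hypothetical helper, not a CSP5 function, and how the column names are read from the parquet file is left to whatever parquet library is in use.

```python
from typing import Iterable, List, Tuple

# Column names taken from the requirements listed above.
REQUIRED_COLUMNS = ("smiles", "molblock")
OPTIONAL_COLUMNS = ("conformer_rank", "conformer_id", "energy", "energy_method")


def check_structure_columns(columns: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (missing required columns, optional columns that are present)."""
    present = set(columns)
    missing = [name for name in REQUIRED_COLUMNS if name not in present]
    optional_present = [name for name in OPTIONAL_COLUMNS if name in present]
    return missing, optional_present


missing, optional_present = check_structure_columns(["smiles", "molblock", "energy"])
if missing:
    raise ValueError(f"structures dataset is missing required columns: {missing}")
print("optional columns found:", optional_present)
```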

Predict only rank-0 conformers:

csp5 \
  --structures-path /path/to/structures.parquet \
  --conformer-rank 0 \
  --nucleus 1H \
  --batch-size 64

Predict using all conformers in the dataset:

csp5 \
  --structures-path /path/to/structures.parquet \
  --use-all-conformers \
  --nucleus 13C

Prediction Python API

from csp5 import (
    draw_prediction,
    draw_prediction_html,
    predict_molecule_file,
    predict_sdf,
    predict_smiles,
    predict_structures,
)

# Standard SMILES mode
res = predict_smiles(["CCO", "c1ccccc1"], nucleus="1H", batch_size=32)
print(res.predictions.head())
svg = draw_prediction(res)
html = draw_prediction_html(res)

# Precomputed-geometry parquet mode
res2 = predict_structures(
    "/path/to/structures.parquet",
    nucleus="1H",
    conformer_rank=0,
    use_all_conformers=False,
)

# Precomputed-geometry SDF mode
res3 = predict_sdf("/path/to/embedded.sdf", nucleus="13C")

# Molfile/SDF mode with freshly generated geometry, preserving the input atom order
res4 = predict_molecule_file("/path/to/input.mol", nucleus="13C", regenerate_geometry=True)

Matching CLI

csp5-match expects one shift per line in each file.
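
A minimal sketch of producing such files from Python lists follows. write_shifts and read_shifts are hypothetical helpers, not part of CSP5; the reader skips blank lines as its own convenience, which says nothing about how csp5-match itself treats them.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def write_shifts(path: Path, shifts: Iterable[float]) -> None:
    """Write one chemical shift (ppm) per line."""
    path.write_text("".join(f"{value}\n" for value in shifts), encoding="utf-8")


def read_shifts(path: Path) -> List[float]:
    """Read one shift per line, skipping blank lines."""
    lines = path.read_text(encoding="utf-8").splitlines()
    return [float(line) for line in lines if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    predicted_path = Path(tmp) / "predicted.txt"
    write_shifts(predicted_path, [7.35, 7.30, 1.25])
    round_trip = read_shifts(predicted_path)
    line_count = len(predicted_path.read_text(encoding="utf-8").splitlines())

print(round_trip, line_count)
```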

Default fast path (dp)

csp5-match \
  --predicted-file predicted.txt \
  --experimental-file experimental.txt \
  --solver dp

SciPy Hungarian option

csp5-match \
  --predicted-file predicted.txt \
  --experimental-file experimental.txt \
  --solver scipy

Murty k-best option

csp5-match \
  --predicted-file predicted.txt \
  --experimental-file experimental.txt \
  --solver murty \
  --k-best-policy clip \
  --k-best 25 \
  --temperature 0.5 \
  --mae-delta-threshold 0.2

Matching Python API

from csp5 import match_shifts

pred = [7.35, 7.30, 1.25]
exp = [7.34, 7.31, 1.20]

# DP (default)
r1 = match_shifts(pred, exp, solver="dp")

# SciPy Hungarian
r2 = match_shifts(pred, exp, solver="scipy")

# Murty k-best
r3 = match_shifts(pred, exp, solver="murty", k_best=10, k_best_policy="clip")
print(r3.assignment_entropy, r3.num_competing_assignments)
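
How CSP5 computes assignment_entropy is not spelled out here. As an illustration of the general idea only, the sketch below assumes each of the k-best assignments has a total cost, converts the costs to normalized weights with a softmax at a given temperature, and reports the Shannon entropy of those weights. The function names, the weighting scheme, and the use of natural logarithms are all assumptions, not CSP5's definition.

```python
import math
from typing import List, Sequence


def softmax_weights(costs: Sequence[float], temperature: float) -> List[float]:
    """Weights proportional to exp(-cost / temperature), normalized to sum to 1."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if not costs:
        raise ValueError("at least one assignment cost is required")
    best = min(costs)  # subtract the minimum for numerical stability
    raw = [math.exp(-(cost - best) / temperature) for cost in costs]
    total = sum(raw)
    return [value / total for value in raw]


def shannon_entropy(weights: Sequence[float]) -> float:
    """Entropy in nats; 0 when a single assignment carries all the weight."""
    return -sum(w * math.log(w) for w in weights if w > 0.0)


# Hypothetical total costs for three candidate assignments.
weights = softmax_weights([0.07, 0.09, 0.50], temperature=0.5)
print(weights, shannon_entropy(weights))
```

Under this assumed definition, low entropy means one assignment dominates, while entropy near log(k) means the k candidates are nearly indistinguishable by cost.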

Solver Notes

  • dp is the default and is intended for the standard 1D shift objective.
  • scipy uses Hungarian assignment on the full padded cost matrix.
  • murty is the k-best solver; use this when you need assignment ambiguity analysis.
  • For murty, k_best_policy="clip" (default) returns every feasible unique assignment when k_best exceeds the number that exist. Use k_best_policy="strict" to fail instead.
  • dp and scipy are top-1 only (k_best must be 1).
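
To show why a dynamic program suits a 1D shift objective, here is a minimal sketch that minimizes the total absolute difference between paired shifts. It relies on the fact that, for absolute-difference costs on a line, an optimal pairing exists that preserves sorted order, so only order-preserving pairings need to be searched. This is an illustration under that assumption, not CSP5's dp implementation; match_sorted_dp is a hypothetical name, and the rule that every shift in the shorter list must be paired is a choice made for this example.

```python
from typing import List, Sequence, Tuple


def match_sorted_dp(
    predicted: Sequence[float], experimental: Sequence[float]
) -> Tuple[float, List[Tuple[int, int]]]:
    """Return (total_cost, pairs) with pairs as (predicted_index, experimental_index)."""
    swap = len(predicted) > len(experimental)
    short, long_ = (experimental, predicted) if swap else (predicted, experimental)
    s_idx = sorted(range(len(short)), key=short.__getitem__)
    l_idx = sorted(range(len(long_)), key=long_.__getitem__)
    n, m = len(s_idx), len(l_idx)
    inf = float("inf")

    def diff(i: int, j: int) -> float:
        return abs(short[s_idx[i - 1]] - long_[l_idx[j - 1]])

    # cost[i][j]: best cost of pairing the first i short shifts
    # using only the first j long shifts (both in sorted order).
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    for j in range(m + 1):
        cost[0][j] = 0.0
    for i in range(1, n + 1):
        for j in range(i, m + 1):
            pair = cost[i - 1][j - 1] + diff(i, j)
            skip = cost[i][j - 1]  # leave long shift j unpaired
            cost[i][j] = min(pair, skip)

    # Walk back through the table to recover the pairs in original indices.
    pairs: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0:
        if cost[i][j] == cost[i - 1][j - 1] + diff(i, j):
            a, b = s_idx[i - 1], l_idx[j - 1]
            pairs.append((b, a) if swap else (a, b))
            i -= 1
        j -= 1
    pairs.reverse()
    return cost[n][m], pairs


total, pairs = match_sorted_dp([7.35, 7.30, 1.25], [7.34, 7.31, 1.20])
print(total, pairs)
```

The table has (n + 1) × (m + 1) entries, so the sketch runs in O(n·m) time after sorting, compared with the cubic cost of a general Hungarian solve on the full cost matrix.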

Output Notes

  • Prediction failures are returned explicitly (failures) with reason tags.
  • Prediction output always includes nucleus, model_id, and model_name.
  • For structures-mode predictions, conformer metadata columns are propagated when available.
  • CLI JSON is molecule-oriented, with top-level model metadata, per-molecule prediction lists, and atom-map numbers matching mapped_smiles_explicit_h.
  • Use --nucleus both to write 13C and 1H predictions in one JSON, grouped by nucleus under each molecule's predictions.
  • In SMILES mode, --num-conformers N predicts generated conformers and returns Boltzmann-averaged shifts at 298.15 K (--boltzmann-temperature-k changes the temperature). The default remains one conformer.
  • In structures mode, --use-all-conformers also returns Boltzmann-averaged shifts. Use --output-conformers-json to save individual conformer predictions separately.
  • Use --output-conformers-sdf to save the exact conformer geometry or geometries used for prediction.
  • Use --molecule-file path.mol or --molecule-file path.sdf for molfile/SDF input. Add --regenerate-geometry to discard embedded coordinates and create fresh geometry without changing the input atom order used for atom maps.
  • Use --output-svg path.svg or draw_prediction(result) to create an RDKit-native SVG drawing with atom labels (C4, H9) and shift notes. SVGs auto-size by default. Use both --svg-width and --svg-height to force a fixed canvas; tune with --svg-bond-length, --svg-atom-font-size, --svg-shift-font-scale, and --svg-padding.
  • Use --output-html path.html or draw_prediction_html(result) to create a self-contained interactive 3D HTML viewer.
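
The Boltzmann averaging mentioned above can be sketched as follows. Each conformer receives a weight proportional to exp(-ΔE / RT), and the reported shift for an atom is the weighted mean over conformers. The energy unit (kcal/mol here) and the function names are assumptions made for illustration; the units and code CSP5 uses internally are not stated in this document.

```python
import math
from typing import List, Sequence

GAS_CONSTANT_KCAL = 0.0019872041  # kcal/(mol*K); assumes energies are in kcal/mol


def boltzmann_weights(energies: Sequence[float], temperature_k: float = 298.15) -> List[float]:
    """Normalized conformer populations from relative energies."""
    if not energies:
        raise ValueError("at least one conformer energy is required")
    if temperature_k <= 0:
        raise ValueError("temperature must be positive")
    lowest = min(energies)  # only energy differences matter
    rt = GAS_CONSTANT_KCAL * temperature_k
    raw = [math.exp(-(energy - lowest) / rt) for energy in energies]
    total = sum(raw)
    return [value / total for value in raw]


def boltzmann_average(
    per_conformer_shifts: Sequence[Sequence[float]],
    energies: Sequence[float],
    temperature_k: float = 298.15,
) -> List[float]:
    """Population-weighted mean shift per atom across conformers."""
    if len(per_conformer_shifts) != len(energies):
        raise ValueError("need one energy per conformer")
    weights = boltzmann_weights(energies, temperature_k)
    n_atoms = len(per_conformer_shifts[0])
    return [
        sum(w * shifts[atom] for w, shifts in zip(weights, per_conformer_shifts))
        for atom in range(n_atoms)
    ]


# Two hypothetical conformers, two atoms each; the second lies 1.0 kcal/mol higher.
averaged = boltzmann_average([[18.2, 57.9], [18.6, 58.3]], energies=[0.0, 1.0])
print(averaged)
```

Raising the temperature flattens the weights toward a plain mean, while lowering it pushes the result toward the lowest-energy conformer.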

Download files

Source distribution:

  • csp5-0.2.12.tar.gz (33.9 MB)

Built distributions (34.0 MB each):

  • csp5-0.2.12-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl: CPython 3.13, manylinux (glibc 2.17+), x86-64
  • csp5-0.2.12-cp313-cp313-macosx_11_0_universal2.whl: CPython 3.13, macOS 11.0+ universal2 (ARM64, x86-64)
  • csp5-0.2.12-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl: CPython 3.12, manylinux (glibc 2.17+), x86-64
  • csp5-0.2.12-cp312-cp312-macosx_11_0_universal2.whl: CPython 3.12, macOS 11.0+ universal2 (ARM64, x86-64)
  • csp5-0.2.12-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl: CPython 3.11, manylinux (glibc 2.17+), x86-64
  • csp5-0.2.12-cp311-cp311-macosx_11_0_universal2.whl: CPython 3.11, macOS 11.0+ universal2 (ARM64, x86-64)
  • csp5-0.2.12-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl: CPython 3.10, manylinux (glibc 2.17+), x86-64
  • csp5-0.2.12-cp310-cp310-macosx_11_0_universal2.whl: CPython 3.10, macOS 11.0+ universal2 (ARM64, x86-64)
  • csp5-0.2.12-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl: CPython 3.9, manylinux (glibc 2.17+), x86-64
  • csp5-0.2.12-cp39-cp39-macosx_11_0_universal2.whl: CPython 3.9, macOS 11.0+ universal2 (ARM64, x86-64)
