CSP5: pip-installable NMR predictor for 13C and 1H.

CSP5

CSP5 is a pip-installable NMR predictor package with:

  • batched 13C and 1H prediction
  • prediction from precomputed geometries
  • shift-matching utilities with three solvers: dp (default), scipy, and murty (k-best)

Bundled defaults:

  • 13C model: CSP5-13C (model_id: csp5-13c)
  • 1H model: CSP5-1H (model_id: csp5-1h)

Install

Requires Python 3.9 or newer.

pip install CSP5

Prediction CLI

In interactive terminals, csp5 prints status lines to stderr before and after prediction. If a run is slow, it also notes that the first invocation can take longer while dependencies and model weights initialize, and it prints periodic "still working" updates during long runs. Use --no-status to silence all of these messages.
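
Status output of this kind usually follows a common CLI pattern: progress messages go to stderr so that stdout stays clean for results, and they appear only when the stream is a terminal. The sketch below illustrates that pattern and is not CSP5's actual code. The helper emit_status and its enabled argument (standing in for the absence of --no-status) are hypothetical names.

```python
import io
import sys


def emit_status(message, stream=None, enabled=True):
    """Write a status line only if enabled and the stream is a terminal.

    Returns True when the message was written. Hypothetical helper, not part of CSP5.
    """
    stream = sys.stderr if stream is None else stream
    if not enabled or not stream.isatty():
        return False
    print(message, file=stream, flush=True)
    return True


# A non-interactive stream (a pipe, a file, or this in-memory buffer)
# receives no status lines.
buffer = io.StringIO()
wrote = emit_status("predicting...", stream=buffer)
```

With this design, redirecting stderr to a file or running in a batch job silences the messages without any extra flag.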

From SMILES

csp5 --smiles "CCO" --nucleus 1H
csp5 --smiles "CCO" --nucleus both
csp5 --smiles-file smiles.txt --nucleus 13C --batch-size 64
csp5 --smiles "CCO" --nucleus 13C --num-conformers 8
csp5 --smiles "CCO" --nucleus both --num-conformers 8 --output-conformers-json cco_conformers.json
csp5 --smiles "CCO" --nucleus 13C --num-conformers 8 --output-conformers-sdf cco_conformers.sdf
csp5 --smiles "CCO" --nucleus 13C --output-svg cco_13c.svg
csp5 --smiles "CCO" --nucleus 13C --output-svg cco_13c.svg --svg-bond-length 72 --svg-shift-font-scale 1.1

From molecule files (molfile or SDF)

By default, molecule-file input uses the coordinates embedded in the file. Add --regenerate-geometry to keep the input atom order/numbering while generating fresh ETKDG + MMFF/UFF coordinates for prediction.

csp5 --molecule-file input.mol --nucleus 13C
csp5 --molecule-file input.sdf --nucleus 1H --regenerate-geometry

From precomputed geometries (parquet structures dataset)

Input dataset requirements:

  • required columns: smiles, molblock
  • optional columns: conformer_rank, conformer_id, energy, energy_method
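
A quick pre-flight check of the column names can catch a malformed dataset before a long run. The sketch below works on plain column names. The function check_structure_columns is a hypothetical helper, and how CSP5 itself treats missing or extra columns is not stated here.

```python
REQUIRED_COLUMNS = {"smiles", "molblock"}
OPTIONAL_COLUMNS = {"conformer_rank", "conformer_id", "energy", "energy_method"}


def check_structure_columns(columns):
    """Return (missing_required, unrecognized) for a sequence of column names.

    Hypothetical helper, not part of CSP5.
    """
    present = set(columns)
    missing = sorted(REQUIRED_COLUMNS - present)
    unrecognized = sorted(present - REQUIRED_COLUMNS - OPTIONAL_COLUMNS)
    return missing, unrecognized


missing, unrecognized = check_structure_columns(["smiles", "conformer_rank", "notes"])
print(missing)       # ['molblock']
print(unrecognized)  # ['notes']
```

With pandas, the column names would come from df.columns after reading the parquet file.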

Predict only rank-0 conformers:

csp5 \
  --structures-path /path/to/structures.parquet \
  --conformer-rank 0 \
  --nucleus 1H \
  --batch-size 64

Predict using all conformers in the dataset:

csp5 \
  --structures-path /path/to/structures.parquet \
  --use-all-conformers \
  --nucleus 13C

Prediction Python API

from csp5 import draw_prediction, predict_molecule_file, predict_smiles, predict_structures, predict_sdf

# Standard SMILES mode
res = predict_smiles(["CCO", "c1ccccc1"], nucleus="1H", batch_size=32)
print(res.predictions.head())
svg = draw_prediction(res)

# Precomputed-geometry parquet mode
res2 = predict_structures(
    "/path/to/structures.parquet",
    nucleus="1H",
    conformer_rank=0,
    use_all_conformers=False,
)

# Precomputed-geometry SDF mode
res3 = predict_sdf("/path/to/embedded.sdf", nucleus="13C")

# Molfile/SDF mode with fresh generated geometry while preserving atom order
res4 = predict_molecule_file("/path/to/input.mol", nucleus="13C", regenerate_geometry=True)

Matching CLI

csp5-match expects one shift per line in each file.
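
The input files can be produced with a few lines of Python. The sketch below writes and re-reads a one-shift-per-line file. The helpers write_shifts and read_shifts are hypothetical, and whether csp5-match tolerates blank lines or comments is not stated here, so the writer emits neither.

```python
from pathlib import Path
import tempfile


def write_shifts(path, shifts):
    """Write chemical shifts (ppm) to a text file, one value per line."""
    Path(path).write_text("".join(f"{s}\n" for s in shifts), encoding="utf-8")


def read_shifts(path):
    """Read one shift per line, skipping blank lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [float(line) for line in lines if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    predicted_path = Path(tmp) / "predicted.txt"
    write_shifts(predicted_path, [7.35, 7.30, 1.25])
    file_text = predicted_path.read_text(encoding="utf-8")
    round_trip = read_shifts(predicted_path)
```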

Default fast path (dp)

csp5-match \
  --predicted-file predicted.txt \
  --experimental-file experimental.txt \
  --solver dp

SciPy Hungarian option

csp5-match \
  --predicted-file predicted.txt \
  --experimental-file experimental.txt \
  --solver scipy

Murty k-best option

csp5-match \
  --predicted-file predicted.txt \
  --experimental-file experimental.txt \
  --solver murty \
  --k-best-policy clip \
  --k-best 25 \
  --temperature 0.5 \
  --mae-delta-threshold 0.2

Matching Python API

from csp5 import match_shifts

pred = [7.35, 7.30, 1.25]
exp = [7.34, 7.31, 1.20]

# DP (default)
r1 = match_shifts(pred, exp, solver="dp")

# SciPy Hungarian
r2 = match_shifts(pred, exp, solver="scipy")

# Murty k-best
r3 = match_shifts(pred, exp, solver="murty", k_best=10, k_best_policy="clip")
print(r3.assignment_entropy, r3.num_competing_assignments)
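
This page does not define assignment_entropy or num_competing_assignments. One plausible reading is sketched below as an assumption, not as CSP5's implementation: rank all one-to-one assignments by mean absolute error (MAE), weight them with exp(-MAE / temperature), and report the Shannon entropy of those weights together with the number of assignments whose MAE lies within a threshold of the best. The sketch enumerates permutations by brute force instead of using Murty's algorithm, so it only suits a handful of shifts. All function names are hypothetical.

```python
import math
from itertools import permutations


def k_best_assignments(pred, exp, k_best, policy="clip"):
    """Return up to k_best (mae, mapping) pairs, best first.

    mapping[i] is the index in exp matched to pred[i]. Brute force.
    """
    if len(pred) != len(exp):
        raise ValueError("this sketch only handles equal-length lists")
    scored = []
    for mapping in permutations(range(len(exp))):
        mae = sum(abs(p - exp[j]) for p, j in zip(pred, mapping)) / len(pred)
        scored.append((mae, mapping))
    scored.sort()
    if k_best > len(scored) and policy == "strict":
        raise ValueError(f"only {len(scored)} assignments exist")
    # With policy "clip", the slice simply returns everything available.
    return scored[:k_best]


def entropy_of_assignments(scored, temperature):
    """Shannon entropy (nats) of weights proportional to exp(-mae / temperature)."""
    best = scored[0][0]
    weights = [math.exp(-(mae - best) / temperature) for mae, _ in scored]
    total = sum(weights)
    return -sum((w / total) * math.log(w / total) for w in weights if w > 0)


def count_competing(scored, mae_delta_threshold):
    """Number of assignments whose MAE is within the threshold of the best one."""
    best = scored[0][0]
    return sum(1 for mae, _ in scored if mae - best <= mae_delta_threshold)


pred = [7.35, 7.30, 1.25]
exp = [7.34, 7.31, 1.20]
top = k_best_assignments(pred, exp, k_best=10)  # only 3! = 6 exist, so 6 are returned
entropy = entropy_of_assignments(top, temperature=0.5)
competing = count_competing(top, mae_delta_threshold=0.2)
```

Here the two aromatic shifts are nearly interchangeable, so the best and second-best assignments differ only by swapping them, and both fall within the 0.2 ppm threshold.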

Solver Notes

  • dp is the default and is intended for the standard 1D shift objective.
  • scipy uses Hungarian assignment on the full padded cost matrix.
  • murty is the k-best solver; use this when you need assignment ambiguity analysis.
  • For murty, k_best_policy="clip" (default) returns all feasible unique assignments when k_best is larger than what exists. Use k_best_policy="strict" to fail instead.
  • dp and scipy are top-1 only (k_best must be 1).
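
Why a dynamic program suffices for the 1D case: when the cost of pairing two shifts is their absolute difference, some optimal matching always preserves sorted order, so there is no need to search a full cost matrix. The sketch below illustrates that idea under stated assumptions and is not CSP5's dp solver. It returns only the minimum total cost, leaves surplus values in the longer list unmatched at zero cost, and uses the hypothetical name dp_match_cost.

```python
def dp_match_cost(pred, exp):
    """Minimum total |pred - exp| over one-to-one matchings of two shift lists.

    Every value in the shorter list is matched. Extras in the longer list
    stay unmatched at zero cost (an assumption of this sketch).
    """
    short, long_ = sorted(pred), sorted(exp)
    if len(short) > len(long_):
        short, long_ = long_, short
    inf = float("inf")
    # prev[j]: best cost of matching the first i-1 short values
    # using only the first j long values.
    prev = [0.0] * (len(long_) + 1)
    for i in range(1, len(short) + 1):
        curr = [inf] * (len(long_) + 1)
        for j in range(i, len(long_) + 1):
            skip = curr[j - 1]  # leave long_[j-1] unmatched
            take = prev[j - 1] + abs(short[i - 1] - long_[j - 1])
            curr[j] = min(skip, take)
        prev = curr
    return prev[len(long_)]


total = dp_match_cost([7.35, 7.30, 1.25], [7.34, 7.31, 1.20])
```

The table has len(short) x len(long_) entries, so the run time grows with the product of the list lengths, whereas a general Hungarian solver is cubic in the matrix size.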

Output Notes

  • Prediction failures are returned explicitly (failures) with reason tags.
  • Prediction output always includes nucleus, model_id, and model_name.
  • For structures-mode predictions, conformer metadata columns are propagated when available.
  • CLI JSON is molecule-oriented, with top-level model metadata, per-molecule prediction lists, and atom-map numbers matching mapped_smiles_explicit_h.
  • Use --nucleus both to write 13C and 1H predictions in one JSON, grouped by nucleus under each molecule's predictions.
  • In SMILES mode, --num-conformers N predicts generated conformers and returns Boltzmann-averaged shifts at 298.15 K (--boltzmann-temperature-k changes the temperature). The default remains one conformer.
  • In structures mode, --use-all-conformers also returns Boltzmann-averaged shifts. Use --output-conformers-json to save individual conformer predictions separately.
  • Use --output-conformers-sdf to save the exact conformer geometry or geometries used for prediction.
  • Use --molecule-file path.mol or --molecule-file path.sdf for molfile/SDF input. Add --regenerate-geometry to discard embedded coordinates and create fresh geometry without changing the input atom order used for atom maps.
  • Use --output-svg path.svg or draw_prediction(result) to create an RDKit-native SVG drawing with atom labels (C4, H9) and shift notes. SVGs auto-size by default. Use both --svg-width and --svg-height to force a fixed canvas; tune with --svg-bond-length, --svg-atom-font-size, --svg-shift-font-scale, and --svg-padding.
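
Boltzmann averaging, as mentioned for --num-conformers and --use-all-conformers, weights each conformer's predicted shifts by its thermal population, w_i = exp(-(E_i - E_min) / (R T)) / sum_j exp(-(E_j - E_min) / (R T)). The sketch below shows the arithmetic only. It assumes energies in kcal/mol, which this page does not state, and the function names are hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 0.0019872041  # R in kcal/(mol*K)


def boltzmann_weights(energies_kcal, temperature_k=298.15):
    """Normalized Boltzmann populations from conformer energies (kcal/mol assumed)."""
    lowest = min(energies_kcal)
    rt = GAS_CONSTANT_KCAL * temperature_k
    factors = [math.exp(-(e - lowest) / rt) for e in energies_kcal]
    total = sum(factors)
    return [f / total for f in factors]


def boltzmann_average(shifts_per_conformer, energies_kcal, temperature_k=298.15):
    """Population-weighted mean shift for each atom across conformers."""
    weights = boltzmann_weights(energies_kcal, temperature_k)
    n_atoms = len(shifts_per_conformer[0])
    return [
        sum(w * conf[a] for w, conf in zip(weights, shifts_per_conformer))
        for a in range(n_atoms)
    ]


# Two conformers, two atoms each; the second conformer lies 1 kcal/mol higher.
shifts = [[58.0, 18.0], [60.0, 17.0]]
energies = [0.0, 1.0]
weights = boltzmann_weights(energies)
averaged = boltzmann_average(shifts, energies)
```

Subtracting the lowest energy before exponentiating leaves the weights unchanged and avoids overflow when absolute energies are large. Raising the temperature flattens the weights toward a plain mean.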

Download files

Source Distribution

  • csp5-0.2.10.tar.gz (33.9 MB): source

Built Distributions

  • csp5-0.2.10-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (34.0 MB): CPython 3.13, manylinux (glibc 2.17+), x86-64
  • csp5-0.2.10-cp313-cp313-macosx_11_0_universal2.whl (34.0 MB): CPython 3.13, macOS 11.0+, universal2 (ARM64, x86-64)
  • csp5-0.2.10-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (34.0 MB): CPython 3.12, manylinux (glibc 2.17+), x86-64
  • csp5-0.2.10-cp312-cp312-macosx_11_0_universal2.whl (34.0 MB): CPython 3.12, macOS 11.0+, universal2 (ARM64, x86-64)
  • csp5-0.2.10-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (34.0 MB): CPython 3.11, manylinux (glibc 2.17+), x86-64
  • csp5-0.2.10-cp311-cp311-macosx_11_0_universal2.whl (34.0 MB): CPython 3.11, macOS 11.0+, universal2 (ARM64, x86-64)
  • csp5-0.2.10-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (34.0 MB): CPython 3.10, manylinux (glibc 2.17+), x86-64
  • csp5-0.2.10-cp310-cp310-macosx_11_0_universal2.whl (34.0 MB): CPython 3.10, macOS 11.0+, universal2 (ARM64, x86-64)
  • csp5-0.2.10-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (34.0 MB): CPython 3.9, manylinux (glibc 2.17+), x86-64
  • csp5-0.2.10-cp39-cp39-macosx_11_0_universal2.whl (34.0 MB): CPython 3.9, macOS 11.0+, universal2 (ARM64, x86-64)
