CSP5: pip-installable NMR predictor for 13C and 1H.

CSP5

CSP5 is a pip-installable NMR predictor package with:

  • batched 13C and 1H prediction
  • prediction from precomputed geometries
  • shift matching utilities with three solvers: dp (default), scipy, and murty (k-best)

Bundled defaults:

  • 13C model: CSP5-13C (model_id: csp5-13c)
  • 1H model: CSP5-1H (model_id: csp5-1h)

Install

Requires Python 3.9 or newer.

pip install CSP5

Prediction CLI

In interactive terminals, csp5 prints status lines to stderr before and after prediction. If a run is slow, it also notes that the first invocation can take longer while dependencies and model weights initialize, and it prints periodic "still working" updates during long runs. Use --no-status to silence these messages.

From SMILES

csp5 --smiles "CCO" --nucleus 1H
csp5 --smiles "CCO" --nucleus both
csp5 --smiles-file smiles.txt --nucleus 13C --batch-size 64
csp5 --smiles "CCO" --nucleus 13C --num-conformers 8
csp5 --smiles "CCO" --nucleus both --num-conformers 8 --output-conformers-json cco_conformers.json
csp5 --smiles "CCO" --nucleus 13C --num-conformers 8 --output-conformers-sdf cco_conformers.sdf
csp5 --smiles "CCO" --nucleus 13C --output-svg cco_13c.svg
csp5 --smiles "CCO" --nucleus 13C --output-svg cco_13c.svg --svg-bond-length 72 --svg-shift-font-scale 1.1
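
When the CLI is driven from a script, it can help to assemble the argument list programmatically. The sketch below is a hypothetical helper (build_csp5_command is not part of CSP5); it uses only the --smiles, --nucleus, --num-conformers, and --no-status flags shown in this section and does not run the command itself.

```python
import shlex
from typing import List, Optional


def build_csp5_command(
    smiles: str,
    nucleus: str = "13C",
    num_conformers: Optional[int] = None,
    no_status: bool = True,
) -> List[str]:
    """Assemble an argv list for the csp5 CLI (hypothetical helper, not part of CSP5)."""
    if nucleus not in {"13C", "1H", "both"}:
        raise ValueError(f"unsupported nucleus: {nucleus!r}")
    cmd = ["csp5", "--smiles", smiles, "--nucleus", nucleus]
    if num_conformers is not None:
        cmd += ["--num-conformers", str(num_conformers)]
    if no_status:
        # Silence the stderr status lines, which is usually what a script wants.
        cmd.append("--no-status")
    return cmd


cmd = build_csp5_command("CCO", nucleus="both", num_conformers=8)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True); passing a list rather than a shell string avoids quoting problems with SMILES characters such as parentheses and "#".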

From molecule files (molfile or SDF)

By default, molecule-file input uses the coordinates embedded in the file. Add --regenerate-geometry to keep the input atom order/numbering while generating fresh ETKDG + MMFF/UFF coordinates for prediction.

csp5 --molecule-file input.mol --nucleus 13C
csp5 --molecule-file input.sdf --nucleus 1H --regenerate-geometry

From precomputed geometries (parquet structures dataset)

Input dataset requirements:

  • required columns: smiles, molblock
  • optional columns: conformer_rank, conformer_id, energy, energy_method
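
Before running a long prediction job, a quick check of the dataset's column names against the lists above can catch schema mistakes early. This is a minimal sketch with a hypothetical helper (check_structure_columns is not part of CSP5). Whether CSP5 accepts or ignores columns outside these two lists is not stated here, so the helper only reports them.

```python
from typing import Iterable, List, Tuple

# Column names taken from the dataset requirements listed above.
REQUIRED_COLUMNS = {"smiles", "molblock"}
OPTIONAL_COLUMNS = {"conformer_rank", "conformer_id", "energy", "energy_method"}


def check_structure_columns(columns: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (missing required columns, columns not named in either list)."""
    cols = set(columns)
    missing = sorted(REQUIRED_COLUMNS - cols)
    unrecognized = sorted(cols - REQUIRED_COLUMNS - OPTIONAL_COLUMNS)
    return missing, unrecognized


missing, unrecognized = check_structure_columns(["smiles", "molblock", "energy"])
print("missing:", missing, "unrecognized:", unrecognized)
```

With pandas installed, the column names of an existing file could be obtained from pd.read_parquet(path).columns and passed to the helper.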

Predict only rank-0 conformers:

csp5 \
  --structures-path /path/to/structures.parquet \
  --conformer-rank 0 \
  --nucleus 1H \
  --batch-size 64

Predict using all conformers in the dataset:

csp5 \
  --structures-path /path/to/structures.parquet \
  --use-all-conformers \
  --nucleus 13C

Prediction Python API

from csp5 import draw_prediction, predict_molecule_file, predict_smiles, predict_structures, predict_sdf

# Standard SMILES mode
res = predict_smiles(["CCO", "c1ccccc1"], nucleus="1H", batch_size=32)
print(res.predictions.head())
svg = draw_prediction(res)

# Precomputed-geometry parquet mode
res2 = predict_structures(
    "/path/to/structures.parquet",
    nucleus="1H",
    conformer_rank=0,
    use_all_conformers=False,
)

# Precomputed-geometry SDF mode
res3 = predict_sdf("/path/to/embedded.sdf", nucleus="13C")

# Molfile/SDF mode with fresh generated geometry while preserving atom order
res4 = predict_molecule_file("/path/to/input.mol", nucleus="13C", regenerate_geometry=True)

Matching CLI

csp5-match expects one shift per line in each file.
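
A minimal sketch of producing such files from Python lists follows. The helper names write_shifts and read_shifts are hypothetical and not part of CSP5; the only format assumption is the one stated above, one numeric shift per line.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def write_shifts(path, shifts: Iterable[float]) -> None:
    """Write one shift per line, the layout csp5-match expects."""
    Path(path).write_text("".join(f"{s}\n" for s in shifts), encoding="utf-8")


def read_shifts(path) -> List[float]:
    """Read a one-shift-per-line file back into floats, skipping blank lines."""
    text = Path(path).read_text(encoding="utf-8")
    return [float(line) for line in text.splitlines() if line.strip()]


# Demo in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    predicted_path = Path(tmp) / "predicted.txt"
    write_shifts(predicted_path, [7.35, 7.30, 1.25])
    print(read_shifts(predicted_path))
```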

Default fast path (dp)

csp5-match \
  --predicted-file predicted.txt \
  --experimental-file experimental.txt \
  --solver dp

SciPy Hungarian option

csp5-match \
  --predicted-file predicted.txt \
  --experimental-file experimental.txt \
  --solver scipy

Murty k-best option

csp5-match \
  --predicted-file predicted.txt \
  --experimental-file experimental.txt \
  --solver murty \
  --k-best-policy clip \
  --k-best 25 \
  --temperature 0.5 \
  --mae-delta-threshold 0.2

Matching Python API

from csp5 import match_shifts

pred = [7.35, 7.30, 1.25]
exp = [7.34, 7.31, 1.20]

# DP (default)
r1 = match_shifts(pred, exp, solver="dp")

# SciPy Hungarian
r2 = match_shifts(pred, exp, solver="scipy")

# Murty k-best
r3 = match_shifts(pred, exp, solver="murty", k_best=10, k_best_policy="clip")
print(r3.assignment_entropy, r3.num_competing_assignments)
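
To give a feel for what a k-best ambiguity analysis measures, the sketch below ranks every assignment of a small equal-length example by mean absolute error, then summarizes how close the runners-up are. It is an illustration only: it enumerates permutations by brute force rather than using Murty's algorithm, and the softmax weighting, the entropy formula, and the way temperature and mae_delta_threshold are applied are assumptions made for this example, not CSP5's documented definitions of assignment_entropy or num_competing_assignments.

```python
import math
from itertools import permutations


def rank_assignments(pred, exp, k_best=10):
    """Brute-force stand-in for a k-best solver (equal-length lists only)."""
    if len(pred) != len(exp):
        raise ValueError("this sketch requires equal-length lists")
    n = len(pred)
    scored = []
    for perm in permutations(range(n)):
        mae = sum(abs(pred[i] - exp[perm[i]]) for i in range(n)) / n
        scored.append((mae, perm))
    scored.sort()
    # Slicing returns fewer than k_best entries when fewer exist ("clip"-like).
    return scored[:k_best]


def ambiguity_summary(ranked, temperature=0.5, mae_delta_threshold=0.2):
    """Assumed definitions: softmax over MAE gaps, then Shannon entropy."""
    best = ranked[0][0]
    weights = [math.exp(-(mae - best) / temperature) for mae, _ in ranked]
    total = sum(weights)
    probs = [w / total for w in weights]
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    # Count assignments (including the best) within the MAE gap threshold.
    competing = sum(1 for mae, _ in ranked if mae - best <= mae_delta_threshold)
    return entropy, competing


pred = [7.35, 7.30, 1.25]
exp = [7.34, 7.31, 1.20]
ranked = rank_assignments(pred, exp, k_best=10)
entropy, competing = ambiguity_summary(ranked)
print(len(ranked), competing, round(entropy, 3))
```

In this example only the two aromatic shifts are plausibly interchangeable, so two assignments fall within the threshold while the remaining four are far worse.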

Solver Notes

  • dp is the default and is intended for the standard 1D shift objective.
  • scipy uses Hungarian assignment on the full padded cost matrix.
  • murty is the k-best solver; use this when you need assignment ambiguity analysis.
  • For murty, k_best_policy="clip" (default) returns every feasible unique assignment when k_best exceeds the number that exist. Use k_best_policy="strict" to fail instead.
  • dp and scipy are top-1 only (k_best must be 1).
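
One reason a specialized solver can be fast for one-dimensional shifts: if the cost is assumed to be the sum of absolute differences and both lists have the same length, pairing the sorted predictions with the sorted experimental values in order is already optimal. The sketch below checks this against a brute-force search. It is not CSP5's dp implementation, and the cost function is an assumption for illustration.

```python
from itertools import permutations


def sorted_pairing(pred, exp):
    """Pair the i-th smallest prediction with the i-th smallest experimental shift."""
    if len(pred) != len(exp):
        raise ValueError("this sketch requires equal-length lists")
    p_order = sorted(range(len(pred)), key=pred.__getitem__)
    e_order = sorted(range(len(exp)), key=exp.__getitem__)
    # Each pair is (index into pred, index into exp).
    return sorted(zip(p_order, e_order))


def total_cost(pairs, pred, exp):
    """Sum of absolute differences over the chosen pairs."""
    return sum(abs(pred[i] - exp[j]) for i, j in pairs)


def brute_force_cost(pred, exp):
    """Minimum cost over all permutations; only feasible for small inputs."""
    n = len(pred)
    return min(
        sum(abs(pred[i] - exp[perm[i]]) for i in range(n))
        for perm in permutations(range(n))
    )


pred = [7.35, 7.30, 1.25]
exp = [7.34, 7.31, 1.20]
pairs = sorted_pairing(pred, exp)
print(pairs)
print(abs(total_cost(pairs, pred, exp) - brute_force_cost(pred, exp)) < 1e-9)
```

Unequal list lengths, where some shifts must stay unmatched, need a more general method such as dynamic programming or a padded assignment problem.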

Output Notes

  • Prediction failures are returned explicitly (failures) with reason tags.
  • Prediction output always includes nucleus, model_id, and model_name.
  • For structures-mode predictions, conformer metadata columns are propagated when available.
  • CLI JSON is molecule-oriented, with top-level model metadata, per-molecule prediction lists, and atom-map numbers matching mapped_smiles_explicit_h.
  • Use --nucleus both to write 13C and 1H predictions in one JSON, grouped by nucleus under each molecule's predictions.
  • In SMILES mode, --num-conformers N predicts generated conformers and returns Boltzmann-averaged shifts at 298.15 K (--boltzmann-temperature-k changes the temperature). The default remains one conformer.
  • In structures mode, --use-all-conformers also returns Boltzmann-averaged shifts. Use --output-conformers-json to save individual conformer predictions separately.
  • Use --output-conformers-sdf to save the exact conformer geometry or geometries used for prediction.
  • Use --molecule-file path.mol or --molecule-file path.sdf for molfile/SDF input. Add --regenerate-geometry to discard embedded coordinates and create fresh geometry without changing the input atom order used for atom maps.
  • Use --output-svg path.svg or draw_prediction(result) to create an RDKit-native SVG drawing with atom labels (C4, H9) and shift notes. SVGs auto-size by default. Use both --svg-width and --svg-height to force a fixed canvas; tune with --svg-bond-length, --svg-atom-font-size, --svg-shift-font-scale, and --svg-padding.
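
For readers unfamiliar with Boltzmann averaging, the sketch below shows the standard weighting applied to per-conformer shifts. It is a generic illustration, not CSP5's code: the helper names are hypothetical, and the energy unit (kcal/mol) is an assumption, since the unit CSP5 uses internally is not stated here.

```python
import math

# Gas constant in kcal/(mol*K); assumes energies are given in kcal/mol.
GAS_CONSTANT_KCAL = 0.0019872041


def boltzmann_weights(energies_kcal, temperature_k=298.15):
    """Normalized Boltzmann weights relative to the lowest-energy conformer."""
    e_min = min(energies_kcal)
    factors = [
        math.exp(-(e - e_min) / (GAS_CONSTANT_KCAL * temperature_k))
        for e in energies_kcal
    ]
    total = sum(factors)
    return [f / total for f in factors]


def boltzmann_average(shifts_per_conformer, energies_kcal, temperature_k=298.15):
    """Weighted mean shift per atom; each inner list holds one conformer's shifts."""
    weights = boltzmann_weights(energies_kcal, temperature_k)
    n_atoms = len(shifts_per_conformer[0])
    return [
        sum(w * conf[a] for w, conf in zip(weights, shifts_per_conformer))
        for a in range(n_atoms)
    ]


# Two conformers, two atoms each; the second conformer is 1 kcal/mol higher.
shifts = [[10.0, 20.0], [12.0, 22.0]]
print(boltzmann_weights([0.0, 1.0]))
print(boltzmann_average(shifts, [0.0, 1.0]))
```

Subtracting the minimum energy before exponentiating leaves the weights unchanged but avoids overflow or underflow for large absolute energies.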
