CSP5: pip-installable NMR predictor for 13C and 1H.

CSP5

CSP5 is a pip-installable NMR predictor package with:

  • batched 13C and 1H prediction
  • prediction from precomputed geometries
  • shift matching utilities with three solvers: dp (default), scipy, and murty (k-best)

Bundled defaults:

  • 13C model: CSP5-13C (model_id: csp5-13c)
  • 1H model: CSP5-1H (model_id: csp5-1h)

Install

pip install CSP5

Prediction CLI

In interactive terminals, csp5 prints status lines to stderr before and after prediction. If a run is slow, it also notes that the first invocation can take longer while dependencies and model weights initialize, and it prints periodic "still working" updates during long runs. Use --no-status to silence these messages.

From SMILES

csp5 --smiles "CCO" --nucleus 1H
csp5 --smiles-file smiles.txt --nucleus 13C --batch-size 64

From precomputed geometries (parquet structures dataset)

Input dataset requirements:

  • required columns: smiles, molblock
  • optional columns: conformer_rank, conformer_id, energy, energy_method
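
Before running a long prediction job, it can help to check a dataset's column names against the lists above. The following is a minimal sketch, not part of CSP5: the helper name check_structure_columns is hypothetical, and treating any other column as "unrecognized" is only informational, since the README does not say how CSP5 handles extra columns.

```python
# Column names taken from the requirements listed above.
REQUIRED_COLUMNS = {"smiles", "molblock"}
OPTIONAL_COLUMNS = {"conformer_rank", "conformer_id", "energy", "energy_method"}


def check_structure_columns(columns):
    """Hypothetical helper: return (missing_required, unrecognized) column names."""
    present = set(columns)
    missing = sorted(REQUIRED_COLUMNS - present)
    # Columns outside both lists are only reported; CSP5's handling of them is not documented here.
    unrecognized = sorted(present - REQUIRED_COLUMNS - OPTIONAL_COLUMNS)
    return missing, unrecognized


missing, unrecognized = check_structure_columns(["smiles", "energy"])
print(missing, unrecognized)
```

With pandas installed, the column list could come from pandas.read_parquet(path).columns.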

Predict only rank-0 conformers:

csp5 \
  --structures-path /path/to/structures.parquet \
  --conformer-rank 0 \
  --nucleus 1H \
  --batch-size 64

Predict using all conformers in the dataset:

csp5 \
  --structures-path /path/to/structures.parquet \
  --use-all-conformers \
  --nucleus 13C

Prediction Python API

from csp5 import predict_smiles, predict_structures, predict_sdf

# Standard SMILES mode
res = predict_smiles(["CCO", "c1ccccc1"], nucleus="1H", batch_size=32)
print(res.predictions.head())

# Precomputed-geometry parquet mode
res2 = predict_structures(
    "/path/to/structures.parquet",
    nucleus="1H",
    conformer_rank=0,
    use_all_conformers=False,
)

# Precomputed-geometry SDF mode
res3 = predict_sdf("/path/to/embedded.sdf", nucleus="13C")

Matching CLI

csp5-match expects one shift per line in each file.
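
A minimal sketch of writing and reading such one-shift-per-line files with the standard library follows. The helper names write_shifts and read_shifts are hypothetical and not part of CSP5; skipping blank lines on read is an assumption of this sketch, not documented csp5-match behavior.

```python
from pathlib import Path


def write_shifts(path, shifts):
    """Write one shift (in ppm) per line."""
    Path(path).write_text("".join(f"{s}\n" for s in shifts))


def read_shifts(path):
    """Read one float per line; blank lines are skipped (an assumption of this sketch)."""
    lines = Path(path).read_text().splitlines()
    return [float(line) for line in lines if line.strip()]
```

For example, write_shifts("predicted.txt", [7.35, 7.30, 1.25]) produces a three-line file suitable for --predicted-file.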

Default fast path (dp)

csp5-match \
  --predicted-file predicted.txt \
  --experimental-file experimental.txt \
  --solver dp
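
For intuition about why a dynamic program can handle 1D shift matching: when the cost is the absolute difference between shifts, an optimal one-to-one matching between two sorted lists never needs to cross, so it can be found by a table over prefixes. The sketch below illustrates that general idea only. It is not CSP5's implementation; the function name match_1d is hypothetical, and it assumes every value in the shorter list is matched while extras in the longer list are left unmatched at no cost.

```python
def match_1d(pred, exp):
    """Illustrative 1D matching; returns (total_cost, [(pred_idx, exp_idx), ...])."""
    swap = len(pred) > len(exp)
    short, long_ = (exp, pred) if swap else (pred, exp)
    # Sort index lists so the DP runs over values in increasing order.
    si = sorted(range(len(short)), key=lambda i: short[i])
    li = sorted(range(len(long_)), key=lambda j: long_[j])
    m, n = len(si), len(li)
    inf = float("inf")
    # dp[i][j]: min cost of matching the first i short values within the first j long values.
    dp = [[inf] * (n + 1) for _ in range(m + 1)]
    dp[0] = [0.0] * (n + 1)
    for i in range(1, m + 1):
        for j in range(i, n + 1):
            take = dp[i - 1][j - 1] + abs(short[si[i - 1]] - long_[li[j - 1]])
            dp[i][j] = min(take, dp[i][j - 1])  # match the pair, or skip long value j
    # Backtrack to recover the matched index pairs.
    pairs, i, j = [], m, n
    while i > 0:
        take = dp[i - 1][j - 1] + abs(short[si[i - 1]] - long_[li[j - 1]])
        if dp[i][j] == take:
            s_idx, l_idx = si[i - 1], li[j - 1]
            pairs.append((l_idx, s_idx) if swap else (s_idx, l_idx))
            i, j = i - 1, j - 1
        else:
            j -= 1
    return dp[m][n], sorted(pairs)


cost, pairs = match_1d([7.35, 7.30, 1.25], [7.34, 7.31, 1.20])
print(round(cost, 2), pairs)
```

The table has (m + 1) × (n + 1) entries, so the DP itself takes time proportional to the product of the list lengths after sorting.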

SciPy Hungarian option

csp5-match \
  --predicted-file predicted.txt \
  --experimental-file experimental.txt \
  --solver scipy

Murty k-best option

csp5-match \
  --predicted-file predicted.txt \
  --experimental-file experimental.txt \
  --solver murty \
  --k-best-policy clip \
  --k-best 25 \
  --temperature 0.5 \
  --mae-delta-threshold 0.2

Matching Python API

from csp5 import match_shifts

pred = [7.35, 7.30, 1.25]
exp = [7.34, 7.31, 1.20]

# DP (default)
r1 = match_shifts(pred, exp, solver="dp")

# SciPy Hungarian
r2 = match_shifts(pred, exp, solver="scipy")

# Murty k-best
r3 = match_shifts(pred, exp, solver="murty", k_best=10, k_best_policy="clip")
print(r3.assignment_entropy, r3.num_competing_assignments)

Solver Notes

  • dp is the default and is intended for the standard 1D shift objective.
  • scipy uses Hungarian assignment on the full padded cost matrix.
  • murty is the k-best solver; use this when you need assignment ambiguity analysis.
  • For murty, k_best_policy="clip" (default) returns all feasible unique assignments when k_best is larger than what exists. Use k_best_policy="strict" to fail instead.
  • dp and scipy are top-1 only (k_best must be 1).
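
To make the clip and strict policies concrete, the sketch below ranks assignments by brute force for very small inputs. It is not Murty's algorithm and not CSP5's code: the function name k_best_assignments is hypothetical, it handles equal-length lists only, uses the sum of absolute differences as the cost, and does not deduplicate assignments that coincide because of repeated shift values.

```python
from itertools import permutations


def k_best_assignments(pred, exp, k_best, k_best_policy="clip"):
    """Brute-force illustration: return up to k_best (cost, assignment) tuples, cheapest first.

    assignment[i] is the index in exp matched to pred[i].
    """
    if k_best_policy not in ("clip", "strict"):
        raise ValueError("k_best_policy must be 'clip' or 'strict'")
    if len(pred) != len(exp):
        raise ValueError("this sketch handles equal-length lists only")
    ranked = sorted(
        (sum(abs(p - exp[j]) for p, j in zip(pred, perm)), perm)
        for perm in permutations(range(len(exp)))
    )
    if k_best > len(ranked) and k_best_policy == "strict":
        # strict: asking for more assignments than exist is an error
        raise ValueError(f"requested {k_best} assignments, only {len(ranked)} exist")
    # clip: return as many as exist, up to k_best
    return ranked[:k_best]


results = k_best_assignments([7.35, 7.30, 1.25], [7.34, 7.31, 1.20], k_best=10)
print(len(results), results[0][1])
```

Three shifts admit only 3! = 6 assignments, so requesting 10 under clip returns 6, while strict raises an error. Brute force grows factorially with list length, which is why a dedicated k-best solver is needed in practice.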

Output Notes

  • Prediction failures are returned explicitly (failures) with reason tags.
  • Prediction output always includes nucleus, model_id, and model_name.
  • For structures-mode predictions, conformer metadata columns are propagated when available.

Download files

Source Distribution

csp5-0.2.6.tar.gz (34.0 MB view details)

Uploaded Source

Built Distribution

csp5-0.2.6-cp310-cp310-manylinux_2_24_x86_64.whl (34.0 MB view details)

Uploaded: CPython 3.10, manylinux (glibc 2.24+), x86-64

File details

Details for the file csp5-0.2.6.tar.gz.

File metadata

  • Download URL: csp5-0.2.6.tar.gz
  • Size: 34.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.12

File hashes

Hashes for csp5-0.2.6.tar.gz
Algorithm Hash digest
SHA256 8b7466cd808907b894a64d11f99d3c4041796dddd42537255302a58219f49316
MD5 152a0bdfc4c7a21442840ae32d7972c1
BLAKE2b-256 9dc908d81e1095eb743dc25cae0302d5e1fb5b1f780bc35481d51135170aae7e

File details

Details for the file csp5-0.2.6-cp310-cp310-manylinux_2_24_x86_64.whl.

File hashes

Hashes for csp5-0.2.6-cp310-cp310-manylinux_2_24_x86_64.whl
Algorithm Hash digest
SHA256 8ad88b270f9158aecc4cda927ad11eafb7acf0afa6eab39ae52ca841bbce98f2
MD5 0efc6cbc271ce8077803cf854d74ec08
BLAKE2b-256 8c0d14482a1f1bc593f781d8fee9a37081255505a096b6ce9d68ea84d930d70b
