Fast Rust-based SDF, MOL2, and XYZ molecular structure file parser
sdfrust - Python Bindings
Fast Rust-based SDF, MOL2, and XYZ molecular structure file parser with Python bindings, including transparent gzip decompression.
Installation
From source (requires Rust toolchain)
cd sdfrust-python
pip install maturin
maturin develop --features numpy
Build wheel
maturin build --release --features numpy
pip install target/wheels/sdfrust-*.whl
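After either install route, it can help to confirm that the module is importable before running the examples. The sketch below uses only the Python standard library; the helper name is_installed is hypothetical and not part of sdfrust.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


# Report whether the freshly built bindings are visible to this interpreter.
print("sdfrust importable:", is_installed("sdfrust"))
```

If this prints False inside a virtual environment, the wheel was most likely installed into a different interpreter.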
Quick Start
import sdfrust
# Parse a single SDF file
mol = sdfrust.parse_sdf_file("molecule.sdf")
print(f"Name: {mol.name}")
print(f"Atoms: {mol.num_atoms}")
print(f"Formula: {mol.formula()}")
print(f"MW: {mol.molecular_weight():.2f}")
# Parse multiple molecules
mols = sdfrust.parse_sdf_file_multi("database.sdf")
for mol in mols:
    print(f"{mol.name}: {mol.num_atoms} atoms")
# Memory-efficient iteration over large files
for mol in sdfrust.iter_sdf_file("large_database.sdf"):
    print(f"{mol.name}: MW={mol.molecular_weight():.2f}")
Supported Formats
- SDF V2000: Full support for reading and writing (up to 999 atoms/bonds)
- SDF V3000: Full support for reading and writing (unlimited atoms/bonds)
- MOL2 TRIPOS: Full support for reading and writing
- XYZ: Read support for XYZ coordinate files (single and multi-molecule)
- Gzip: Transparent decompression of .gz files for all formats
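To try the gzip support, a compressed test file can be produced with Python's standard gzip module. This is a minimal sketch: the helper names and the placeholder text are illustrative, and the commented sdfrust call assumes, per the list above, that the file-based parsers accept a .gz path directly.

```python
import gzip
import os
import tempfile


def write_gzipped(text: str, path: str) -> None:
    """Write text to a gzip-compressed file."""
    with gzip.open(path, "wt", encoding="utf-8") as handle:
        handle.write(text)


def read_gzipped(path: str) -> str:
    """Read a gzip-compressed text file back into a string."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return handle.read()


# Placeholder text standing in for real SDF content (not a valid record).
record = "example SDF text\n$$$$\n"

with tempfile.TemporaryDirectory() as tmp:
    gz_path = os.path.join(tmp, "example.sdf.gz")
    write_gzipped(record, gz_path)
    round_trip = read_gzipped(gz_path)
    # With a real record, the compressed file could then be parsed directly
    # (assumption based on the format list above):
    # mol = sdfrust.parse_sdf_file(gz_path)

print(round_trip == record)
```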
API Reference
Parsing Functions
SDF Files
# Single molecule
mol = sdfrust.parse_sdf_file("file.sdf") # V2000
mol = sdfrust.parse_sdf_auto_file("file.sdf") # Auto-detect V2000/V3000
mol = sdfrust.parse_sdf_v3000_file("file.sdf") # V3000 only
# Multiple molecules
mols = sdfrust.parse_sdf_file_multi("file.sdf")
mols = sdfrust.parse_sdf_auto_file_multi("file.sdf")
# From string
mol = sdfrust.parse_sdf_string(content)
mols = sdfrust.parse_sdf_string_multi(content)
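In the string variants, content is the text of one or more SDF records. As a rough illustration of what such a string looks like, the sketch below builds a minimal V2000 record by hand and reads the counts line back. The helpers make_v2000_record and read_counts are hypothetical and not part of sdfrust; the column layout follows the fixed-width V2000 convention, where atom and bond counts each occupy three characters (hence the 999 limit).

```python
def make_v2000_record(name, atoms, bonds):
    """Build a V2000 record.

    atoms: list of (symbol, x, y, z)
    bonds: list of (i, j, order) using 1-based atom numbers, as the format requires
    """
    lines = [name, "  hand-built", ""]
    # Counts line: atom count, bond count, eight unused fields, 999, version.
    lines.append(f"{len(atoms):3d}{len(bonds):3d}" + "  0" * 8 + "999 V2000")
    for symbol, x, y, z in atoms:
        lines.append(f"{x:10.4f}{y:10.4f}{z:10.4f} {symbol:<3} 0" + "  0" * 11)
    for i, j, order in bonds:
        lines.append(f"{i:3d}{j:3d}{order:3d}" + "  0" * 4)
    lines += ["M  END", "$$$$"]
    return "\n".join(lines) + "\n"


def read_counts(record):
    """Return (num_atoms, num_bonds) from the fourth line of a V2000 record."""
    counts_line = record.splitlines()[3]
    return int(counts_line[0:3]), int(counts_line[3:6])


content = make_v2000_record(
    "water",
    [("O", 0.0, 0.0, 0.0), ("H", 0.96, 0.0, 0.0), ("H", -0.24, 0.93, 0.0)],
    [(1, 2, 1), (1, 3, 1)],
)
print(read_counts(content))
# The string can then be handed to the parser shown above:
# mol = sdfrust.parse_sdf_string(content)
```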
MOL2 Files
mol = sdfrust.parse_mol2_file("file.mol2")
mols = sdfrust.parse_mol2_file_multi("file.mol2")
mol = sdfrust.parse_mol2_string(content)
Iterators (Memory-Efficient)
for mol in sdfrust.iter_sdf_file("large.sdf"):
    process(mol)
for mol in sdfrust.iter_mol2_file("large.mol2"):
    process(mol)
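The reason iteration is memory-efficient is that only one record needs to be held at a time. The pure-Python generator below illustrates the idea by splitting SDF text on the $$$$ record delimiter. It is a conceptual sketch only, not how sdfrust is implemented, and the name iter_sdf_records is hypothetical.

```python
import io
from typing import Iterator, TextIO


def iter_sdf_records(handle: TextIO) -> Iterator[str]:
    """Yield one SDF record at a time, keeping only the current record in memory."""
    buffer = []
    for line in handle:
        buffer.append(line)
        if line.rstrip("\r\n") == "$$$$":
            yield "".join(buffer)
            buffer = []
    # Emit a trailing record that lacks the final delimiter, if any.
    if any(line.strip() for line in buffer):
        yield "".join(buffer)


sample = io.StringIO("first\n$$$$\nsecond\n$$$$\n")
records = list(iter_sdf_records(sample))
print(len(records))
```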
Writing Functions
# Single molecule
sdfrust.write_sdf_file(mol, "output.sdf")
sdfrust.write_sdf_auto_file(mol, "output.sdf") # Auto V2000/V3000
sdf_string = sdfrust.write_sdf_string(mol)
# Multiple molecules
sdfrust.write_sdf_file_multi(mols, "output.sdf")
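The "Auto V2000/V3000" writer presumably falls back to V3000 only when a molecule does not fit the V2000 limits listed under Supported Formats. That selection rule is an assumption here, not documented behavior; the hypothetical helper below just encodes the 999 atom/bond limit so a caller can check in advance.

```python
V2000_MAX_COUNT = 999  # three-character count fields in the V2000 counts line


def needs_v3000(num_atoms: int, num_bonds: int) -> bool:
    """Return True if the counts exceed what a V2000 record can express."""
    return num_atoms > V2000_MAX_COUNT or num_bonds > V2000_MAX_COUNT


print(needs_v3000(21, 21))
print(needs_v3000(1200, 1250))
```

With a parsed molecule, the check would be needs_v3000(mol.num_atoms, mol.num_bonds).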
Molecule Properties
mol = sdfrust.parse_sdf_file("aspirin.sdf")
# Basic info
print(mol.name) # Molecule name
print(mol.num_atoms) # Number of atoms
print(mol.num_bonds) # Number of bonds
print(mol.formula()) # Molecular formula
# Descriptors
print(mol.molecular_weight()) # Molecular weight
print(mol.exact_mass()) # Monoisotopic mass
print(mol.heavy_atom_count()) # Non-hydrogen atoms
print(mol.ring_count()) # Number of rings
print(mol.rotatable_bond_count()) # Rotatable bonds
print(mol.total_charge()) # Sum of formal charges
# Geometry
centroid = mol.centroid() # (x, y, z) center
mol.translate(1.0, 0.0, 0.0) # Move molecule
mol.center() # Center at origin
# Properties (from SDF data block)
cid = mol.get_property("PUBCHEM_CID")
mol.set_property("SOURCE", "generated")
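For readers unsure what centroid() and center() compute, the following standalone sketch shows the usual definitions on plain coordinate tuples. It assumes an unweighted geometric mean (not mass-weighted); whether sdfrust uses exactly this definition is not stated above.

```python
def centroid(coords):
    """Return the unweighted mean position of a list of (x, y, z) tuples."""
    if not coords:
        raise ValueError("centroid of an empty coordinate list is undefined")
    n = len(coords)
    return (
        sum(c[0] for c in coords) / n,
        sum(c[1] for c in coords) / n,
        sum(c[2] for c in coords) / n,
    )


def centered(coords):
    """Return a copy of coords translated so that the centroid is at the origin."""
    cx, cy, cz = centroid(coords)
    return [(x - cx, y - cy, z - cz) for x, y, z in coords]


points = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 3.0, 0.0)]
print(centroid(points))
print(centered(points))
```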
Atom Access
# Iterate over atoms
for atom in mol.atoms:
    print(f"{atom.element} at ({atom.x}, {atom.y}, {atom.z})")
# Get specific atom
atom = mol.get_atom(0)
print(atom.element)
print(atom.formal_charge)
print(atom.coords()) # (x, y, z) tuple
# Filter atoms
carbons = mol.atoms_by_element("C")
neighbors = mol.neighbors(0) # Atom indices bonded to atom 0
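Because atoms expose an element attribute, ordinary Python tools work on them. The sketch below tallies elements with collections.Counter; it uses stand-in objects so it runs without a molecule file, and element_counts is a hypothetical helper rather than an sdfrust function.

```python
from collections import Counter
from types import SimpleNamespace


def element_counts(atoms):
    """Count atoms per element symbol for any iterable of objects with .element."""
    return Counter(atom.element for atom in atoms)


# Stand-ins for sdfrust atoms; with a parsed molecule, pass mol.atoms instead.
atoms = [
    SimpleNamespace(element="O"),
    SimpleNamespace(element="H"),
    SimpleNamespace(element="H"),
]
counts = element_counts(atoms)
print(dict(counts))
```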
Bond Access
# Iterate over bonds
for bond in mol.bonds:
    print(f"{bond.atom1}-{bond.atom2}: {bond.order}")
# Filter bonds
double_bonds = mol.bonds_by_order(sdfrust.BondOrder.double())
aromatic = mol.has_aromatic_bonds()
# Bond properties
bond = mol.bonds[0]
print(bond.is_aromatic())
print(bond.contains_atom(0))
print(bond.other_atom(0)) # Other atom in bond
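The bond list is enough to derive connectivity. As an illustration of what a neighbor lookup represents, the sketch below builds an adjacency map from objects with atom1 and atom2 attributes, assuming these are zero-based atom indices as the listings suggest. The helper adjacency and the stand-in bonds are hypothetical.

```python
from collections import defaultdict
from types import SimpleNamespace


def adjacency(bonds):
    """Map each atom index to the list of atom indices it is bonded to."""
    neighbors = defaultdict(list)
    for bond in bonds:
        neighbors[bond.atom1].append(bond.atom2)
        neighbors[bond.atom2].append(bond.atom1)
    return dict(neighbors)


# Stand-ins for the two O-H bonds of water; with a parsed molecule, pass mol.bonds.
bonds = [SimpleNamespace(atom1=0, atom2=1), SimpleNamespace(atom1=0, atom2=2)]
graph = adjacency(bonds)
print(graph)
```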
NumPy Integration
import numpy as np
import sdfrust
mol = sdfrust.parse_sdf_file("molecule.sdf")
# Get coordinates as NumPy array
coords = mol.get_coords_array() # Shape: (N, 3)
print(coords.shape)
# Modify and set back
coords[:, 0] += 10.0 # Translate in x
mol.set_coords_array(coords)
# Get atomic numbers
atomic_nums = mol.get_atomic_numbers() # Shape: (N,)
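Once coordinates are in an (N, 3) array, common geometry calculations reduce to a few NumPy operations. The sketch below computes a pairwise distance matrix with broadcasting. It uses a hand-written array so it runs without a molecule file; distance_matrix is a hypothetical helper, not part of sdfrust.

```python
import numpy as np


def distance_matrix(coords):
    """Return the (N, N) matrix of Euclidean distances for an (N, 3) array."""
    coords = np.asarray(coords, dtype=float)
    # Broadcasting: (N, 1, 3) - (1, N, 3) gives all pairwise difference vectors.
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# With a parsed molecule this array would come from mol.get_coords_array().
coords = np.array([[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]])
dist = distance_matrix(coords)
print(dist.shape)
```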
Creating Molecules
import sdfrust
# Create empty molecule
mol = sdfrust.Molecule("water")
# Add atoms
mol.add_atom(sdfrust.Atom(0, "O", 0.0, 0.0, 0.0))
mol.add_atom(sdfrust.Atom(1, "H", 0.96, 0.0, 0.0))
mol.add_atom(sdfrust.Atom(2, "H", -0.24, 0.93, 0.0))
# Add bonds
mol.add_bond(sdfrust.Bond(0, 1, sdfrust.BondOrder.single()))
mol.add_bond(sdfrust.Bond(0, 2, sdfrust.BondOrder.single()))
# Write to file
sdfrust.write_sdf_file(mol, "water.sdf")
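When building molecules by hand, a quick geometry check helps catch mistyped coordinates. The standalone sketch below computes the H-O-H angle from the three positions used above, which comes out at roughly 104.5 degrees. The helper angle_degrees is hypothetical and works on plain tuples rather than sdfrust objects.

```python
import math


def angle_degrees(a, center, b):
    """Return the angle a-center-b in degrees for three (x, y, z) points."""
    v1 = [p - q for p, q in zip(a, center)]
    v2 = [p - q for p, q in zip(b, center)]
    dot = sum(p * q for p, q in zip(v1, v2))
    norm = math.sqrt(sum(p * p for p in v1)) * math.sqrt(sum(p * p for p in v2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


oxygen = (0.0, 0.0, 0.0)
hydrogen_1 = (0.96, 0.0, 0.0)
hydrogen_2 = (-0.24, 0.93, 0.0)
hoh = angle_degrees(hydrogen_1, oxygen, hydrogen_2)
print(f"H-O-H angle: {hoh:.1f} degrees")
```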
Performance
sdfrust is implemented in Rust for performance. According to the project's benchmarks, it is significantly faster than pure-Python parsers and comparable to C++ implementations.
For large files, use the iterator API (iter_sdf_file) to process molecules
one at a time without loading the entire file into memory.
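Streaming pairs well with aggregations that keep only a small result. As a sketch, heapq.nlargest consumes any iterable and retains at most k items, so the heaviest molecules in a large file can be found without building a full list. The class FakeMol is a stand-in so the example runs without a data file, and heaviest is a hypothetical helper.

```python
import heapq


class FakeMol:
    """Stand-in exposing the two members used here: name and molecular_weight()."""

    def __init__(self, name, weight):
        self.name = name
        self._weight = weight

    def molecular_weight(self):
        return self._weight


def heaviest(molecules, k):
    """Return the k heaviest molecules from any iterable, holding at most k at once."""
    return heapq.nlargest(k, molecules, key=lambda mol: mol.molecular_weight())


# With real data, pass sdfrust.iter_sdf_file("large_database.sdf") instead.
stream = iter([FakeMol("water", 18.02), FakeMol("aspirin", 180.16), FakeMol("ethanol", 46.07)])
top_two = [mol.name for mol in heaviest(stream, 2)]
print(top_two)
```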
License
MIT License