Convert PDB Chemical Component Dictionary (CCD) files to RDKit molecules
ccd2rdmol
A lightweight Python library and CLI tool for converting PDB Chemical Component Dictionary (CCD) files to RDKit molecule objects.
This project is a simplified implementation inspired by pdbeccdutils, focusing solely on CCD to RDKit conversion with 3D conformer support.
Features
- Fast CIF parsing using gemmi
- Conversion to RDKit molecule objects
- Support for both Ideal and Model 3D conformers
- Automatic metal bond to dative bond conversion
- Stereochemistry assignment from 3D coordinates
- CLI tool with rich output
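The metal-bond feature listed above can be illustrated without RDKit. The following is a minimal sketch, not ccd2rdmol's actual implementation: it assumes a bond is converted whenever exactly one of its two atoms is a metal, and that the dative bond is oriented from the donor atom to the metal (the direction RDKit uses for dative bonds). The `Bond` class, the `METALS` subset, and `to_dative` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Illustrative subset of metal symbols (an assumption for this sketch);
# a real implementation would cover every metal in the periodic table.
METALS = frozenset(
    {"LI", "NA", "K", "MG", "CA", "MN", "FE", "CO", "NI", "CU", "ZN", "MO", "PT"}
)


@dataclass(frozen=True)
class Bond:
    """A bond between two atom indices with a symbolic bond order."""

    begin: int
    end: int
    order: str  # e.g. "SINGLE", "DOUBLE", "DATIVE"


def to_dative(elements: Sequence[str], bonds: Sequence[Bond]) -> List[Bond]:
    """Return a copy of `bonds` with metal/non-metal bonds made dative.

    A converted bond starts at the non-metal donor and ends at the metal.
    Bonds between two metals or two non-metals are returned unchanged.
    """
    converted = []
    for bond in bonds:
        begin_is_metal = elements[bond.begin].upper() in METALS
        end_is_metal = elements[bond.end].upper() in METALS
        if begin_is_metal != end_is_metal:
            # Exactly one end is a metal: orient the bond donor -> metal.
            if begin_is_metal:
                donor, metal = bond.end, bond.begin
            else:
                donor, metal = bond.begin, bond.end
            converted.append(Bond(donor, metal, "DATIVE"))
        else:
            converted.append(bond)
    return converted


# Usage: an Fe-N bond becomes a dative bond from N to Fe; the N=C bond is kept.
elements = ["FE", "N", "C"]
bonds = [Bond(0, 1, "SINGLE"), Bond(1, 2, "DOUBLE")]
print(to_dative(elements, bonds))
```

Orienting the bond from donor to metal matters because, in RDKit, a dative bond counts toward the valence of its end atom but not its begin atom, which is what allows coordinated ligands to pass sanitization.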
Installation
git clone https://github.com/N283T/ccd2rdmol.git
cd ccd2rdmol
uv sync
Usage
As a Library
from ccd2rdmol import read_ccd_file, read_ccd_block
import gemmi
# Read from file
result = read_ccd_file("ATP.cif")
mol = result.mol
print(f"Atoms: {mol.GetNumAtoms()}")
print(f"Bonds: {mol.GetNumBonds()}")
print(f"Conformers: {mol.GetNumConformers()}") # 2 (IDEAL + MODEL)
print(f"Sanitized: {result.sanitized}")
# With options
result = read_ccd_file(
    "ATP.cif",
    sanitize_mol=True,      # Sanitize molecule (default: True)
    add_conformers=True,    # Add 3D conformers (default: True)
    remove_hydrogens=True,  # Remove hydrogens (default: True)
)
# From gemmi CIF block
doc = gemmi.cif.read("components.cif")
for block in doc:
    result = read_ccd_block(block)
    print(f"{block.name}: {result.mol.GetNumAtoms()} atoms")
As a CLI
# Output SMILES to stdout
ccd2rdmol convert ATP.cif
# Write to MOL file
ccd2rdmol convert ATP.cif -o ATP.mol
# Write to SDF format
ccd2rdmol convert ATP.cif -o ATP.sdf
# Keep hydrogen atoms
ccd2rdmol convert ATP.cif --keep-hydrogens
# Show verbose information
ccd2rdmol convert ATP.cif -v
# Show molecule information only
ccd2rdmol info ATP.cif
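The commands above can also be driven from a script when many CIF files need converting. This is a rough sketch under stated assumptions: `build_convert_command` and `convert_directory` are hypothetical helpers, not part of ccd2rdmol, and they use only the `convert` subcommand and the `-o`, `--keep-hydrogens`, and `-v` flags shown in this README. It also assumes the `ccd2rdmol` executable is on `PATH`.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_convert_command(
    input_file: str,
    output: Optional[str] = None,
    keep_hydrogens: bool = False,
    verbose: bool = False,
) -> List[str]:
    """Build the argument list for one `ccd2rdmol convert` call."""
    cmd = ["ccd2rdmol", "convert", input_file]
    if output is not None:
        cmd += ["-o", output]
    if keep_hydrogens:
        cmd.append("--keep-hydrogens")
    if verbose:
        cmd.append("-v")
    return cmd


def convert_directory(cif_dir: str, out_dir: str) -> None:
    """Convert every *.cif file in `cif_dir` to an SDF file in `out_dir`."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for cif in sorted(Path(cif_dir).glob("*.cif")):
        target = out / (cif.stem + ".sdf")
        # check=True raises CalledProcessError if a conversion fails.
        subprocess.run(build_convert_command(str(cif), str(target)), check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.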
CLI Options
ccd2rdmol convert [OPTIONS] INPUT_FILE
Arguments:
  INPUT_FILE            Input CCD CIF file path [required]
Options:
  -o, --output PATH     Output file path (.mol, .sdf)
  -f, --format TEXT     Output format (mol, sdf, smiles, inchi)
  --no-sanitize         Skip sanitization step
  --no-conformers       Skip adding 3D conformers
  -H, --keep-hydrogens  Keep hydrogen atoms
  -v, --verbose         Show detailed information
  --help                Show help message
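The options above do not say how `-o` and `-f` interact when both are given, or when neither is. The sketch below shows one plausible resolution order and is an assumption, not the tool's documented behavior: an explicit `-f` wins, otherwise the format is taken from the output file extension, and with no options the result is SMILES (matching the "Output SMILES to stdout" example). `resolve_format` and `SUPPORTED_FORMATS` are hypothetical names.

```python
from pathlib import Path
from typing import Optional

# Format names taken from the -f/--format help text.
SUPPORTED_FORMATS = ("mol", "sdf", "smiles", "inchi")
# File extensions taken from the -o/--output help text.
FILE_EXTENSIONS = ("mol", "sdf")


def resolve_format(output: Optional[str], fmt: Optional[str]) -> str:
    """Pick an output format from the -o path and the -f value.

    Assumed precedence: explicit format, then file extension, then SMILES.
    """
    if fmt is not None:
        name = fmt.lower()
        if name not in SUPPORTED_FORMATS:
            raise ValueError(f"Unsupported format: {fmt!r}")
        return name
    if output is not None:
        suffix = Path(output).suffix.lower().lstrip(".")
        if suffix not in FILE_EXTENSIONS:
            raise ValueError(f"Cannot infer format from {output!r}")
        return suffix
    return "smiles"


print(resolve_format("ATP.sdf", None))
print(resolve_format(None, None))
```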
Development
This project uses poethepoet as a task runner.
# Install dev dependencies
uv sync
# Format code (ruff format)
uv run poe format
# Lint (ruff check)
uv run poe lint
# Lint and auto-fix
uv run poe fix
# Type check (ty)
uv run poe check
# Run tests
uv run poe test
# Run all checks (format, lint, check, test)
uv run poe all
# Clean cache files
uv run poe clean
Acknowledgments
This project is inspired by and built upon concepts from pdbeccdutils by PDBe (Protein Data Bank in Europe). Test data files are derived from the pdbeccdutils test suite.
We thank the PDBe team for their excellent work on chemical component processing tools.
License
MIT License
Test data files in tests/data/ are from pdbeccdutils (Apache-2.0 License).