
Convert PDB Chemical Component Dictionary (CCD) files to RDKit molecules


ccd2rdmol

Python 3.10+ | License: MIT

A lightweight Python library and CLI tool for converting PDB Chemical Component Dictionary (CCD) files to RDKit molecule objects.

This project is a simplified implementation inspired by pdbeccdutils, focusing solely on CCD to RDKit conversion with 3D conformer support.

Features

  • Fast CIF parsing using gemmi
  • Conversion to RDKit molecule objects
  • Support for both Ideal and Model 3D conformers
  • Automatic conversion of metal bonds to dative bonds
  • Stereochemistry assignment from 3D coordinates
  • CLI tool with rich output
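To make the dative-bond feature concrete, here is a minimal sketch of the general idea in plain RDKit. It is not ccd2rdmol's implementation: the helper name metal_bonds_to_dative and the short METALS set are assumptions made for illustration, and the sketch rewrites every single bond between a metal and a non-metal, whereas a production implementation would likely be more selective.

```python
from rdkit import Chem

# Illustrative subset only (assumption); a real tool needs a fuller metal list.
METALS = frozenset({"Mg", "Fe", "Co", "Ni", "Cu", "Zn", "Pt"})


def metal_bonds_to_dative(mol, metals=METALS):
    """Return a copy of `mol` whose single bonds to metals are donor->metal dative bonds."""
    rw = Chem.RWMol(mol)
    pairs = []
    # Collect the bonds first so the bond list is not modified while iterating.
    for bond in rw.GetBonds():
        if bond.GetBondType() != Chem.BondType.SINGLE:
            continue
        begin, end = bond.GetBeginAtom(), bond.GetEndAtom()
        begin_is_metal = begin.GetSymbol() in metals
        end_is_metal = end.GetSymbol() in metals
        if begin_is_metal == end_is_metal:
            continue  # skip metal-metal bonds and bonds without a metal
        donor, metal = (end, begin) if begin_is_metal else (begin, end)
        pairs.append((donor.GetIdx(), metal.GetIdx()))
    for donor_idx, metal_idx in pairs:
        rw.RemoveBond(donor_idx, metal_idx)
        # RDKit dative bonds point from the begin atom (donor) to the end atom.
        rw.AddBond(donor_idx, metal_idx, Chem.BondType.DATIVE)
    return rw.GetMol()


# Ammonia bound to platinum, parsed without sanitization because the
# nitrogen carries four single bonds in this representation.
ammine = Chem.MolFromSmiles("[NH3][Pt]", sanitize=False)
converted = metal_bonds_to_dative(ammine)
bond = converted.GetBondBetweenAtoms(0, 1)
print(bond.GetBondType(), bond.GetBeginAtom().GetSymbol(), "->", bond.GetEndAtom().GetSymbol())
```

A dative bond does not count toward the donor atom's valence in RDKit, which is why this kind of rewrite helps metal-containing components pass sanitization.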

Installation

# Library only
uv add ccd2rdmol

# With CLI support (quote the extra so shells such as zsh do not expand the brackets)
uv add "ccd2rdmol[cli]"

Or for development:

git clone https://github.com/N283T/ccd2rdmol.git
cd ccd2rdmol
uv sync  # CLI is included in dev dependencies

Usage

As a Library

from ccd2rdmol import read_ccd_file, read_ccd_block
import gemmi

# Read from file
result = read_ccd_file("ATP.cif")
mol = result.mol

print(f"Atoms: {mol.GetNumAtoms()}")
print(f"Bonds: {mol.GetNumBonds()}")
print(f"Conformers: {mol.GetNumConformers()}")  # 2 (IDEAL + MODEL)
print(f"Sanitized: {result.sanitized}")

# With options
result = read_ccd_file(
    "ATP.cif",
    sanitize_mol=True,      # Sanitize molecule (default: True)
    add_conformers=True,    # Add 3D conformers (default: True)
    remove_hydrogens=True,  # Remove hydrogens (default: True)
)

# From gemmi CIF block
doc = gemmi.cif.read("components.cif")
for block in doc:
    result = read_ccd_block(block)
    print(f"{block.name}: {result.mol.GetNumAtoms()} atoms")
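When looping over a large file such as components.cif, it can help to keep a small summary per component and list the entries that did not sanitize. The sketch below relies only on what the example above shows: a result object with mol and sanitized attributes. ComponentSummary, summarize, and unsanitized_names are hypothetical names, not part of ccd2rdmol, and the sketch assumes result.mol is always a molecule.

```python
from dataclasses import dataclass


@dataclass
class ComponentSummary:
    """Per-component counts gathered from a conversion result (hypothetical type)."""

    name: str
    atoms: int
    bonds: int
    conformers: int
    sanitized: bool


def summarize(name, result):
    """Build a summary from any object exposing `.mol` and `.sanitized`."""
    mol = result.mol
    return ComponentSummary(
        name=name,
        atoms=mol.GetNumAtoms(),
        bonds=mol.GetNumBonds(),
        conformers=mol.GetNumConformers(),
        sanitized=bool(result.sanitized),
    )


def unsanitized_names(summaries):
    """Return the names of components whose molecule was not sanitized."""
    return [s.name for s in summaries if not s.sanitized]
```

Inside the loop above, summarize(block.name, read_ccd_block(block)) would produce one summary per block; passing the collected list to unsanitized_names then gives the components worth inspecting by hand.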

As a CLI

Note: The CLI requires extra dependencies. Install them with uv add "ccd2rdmol[cli]"

# Output SMILES to stdout
ccd2rdmol convert ATP.cif

# Write to MOL file
ccd2rdmol convert ATP.cif -o ATP.mol

# Write to SDF format
ccd2rdmol convert ATP.cif -o ATP.sdf

# Keep hydrogen atoms
ccd2rdmol convert ATP.cif --keep-hydrogens

# Show verbose information
ccd2rdmol convert ATP.cif -v

# Show molecule information only
ccd2rdmol info ATP.cif

CLI Options

ccd2rdmol convert [OPTIONS] INPUT_FILE

Arguments:
  INPUT_FILE  Input CCD CIF file path [required]

Options:
  -o, --output PATH       Output file path (.mol, .sdf)
  -f, --format TEXT       Output format (mol, sdf, smiles, inchi)
  --no-sanitize           Skip sanitization step
  --no-conformers         Skip adding 3D conformers
  -H, --keep-hydrogens    Keep hydrogen atoms
  -v, --verbose           Show detailed information
  --help                  Show help message
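For scripted use, the options listed above can be assembled from Python and passed to subprocess. This is a sketch rather than part of ccd2rdmol: build_convert_command and run_convert are hypothetical helper names, and the code assumes the ccd2rdmol executable is on PATH and accepts options after INPUT_FILE, as the usage examples suggest.

```python
import subprocess


def build_convert_command(
    input_file,
    output=None,
    fmt=None,
    sanitize=True,
    conformers=True,
    keep_hydrogens=False,
    verbose=False,
):
    """Assemble the argument list for `ccd2rdmol convert` (hypothetical helper)."""
    cmd = ["ccd2rdmol", "convert", str(input_file)]
    if output is not None:
        cmd += ["--output", str(output)]
    if fmt is not None:
        cmd += ["--format", fmt]
    if not sanitize:
        cmd.append("--no-sanitize")
    if not conformers:
        cmd.append("--no-conformers")
    if keep_hydrogens:
        cmd.append("--keep-hydrogens")
    if verbose:
        cmd.append("--verbose")
    return cmd


def run_convert(input_file, **options):
    """Run the CLI and return its standard output; raises if the command fails."""
    completed = subprocess.run(
        build_convert_command(input_file, **options),
        check=True,
        capture_output=True,
        text=True,
    )
    return completed.stdout
```

Passing the arguments as a list avoids shell quoting problems with file names, and check=True turns a non-zero exit status into a CalledProcessError instead of a silently empty result.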

Development

This project uses poethepoet as a task runner.

# Install dev dependencies
uv sync

# Format code (ruff format)
uv run poe format

# Lint (ruff check)
uv run poe lint

# Lint and auto-fix
uv run poe fix

# Type check (ty)
uv run poe check

# Run tests
uv run poe test

# Multi-version testing with nox (3.10, 3.11, 3.12, 3.13, 3.14)
uv run poe nox

# Run all checks (format, lint, check, test)
uv run poe all

# Clean cache files
uv run poe clean

Acknowledgments

This project is inspired by and built upon concepts from pdbeccdutils by PDBe (Protein Data Bank in Europe). Test data files are derived from the pdbeccdutils test suite.

We thank the PDBe team for their excellent work on chemical component processing tools.

License

MIT License

Test data files in tests/data/ are from pdbeccdutils (Apache-2.0 License).
