Create or modify Rosetta params files (topology files) from scratch, RDKit mols or another params file.

These details have not been verified by PyPI

Project links

Project description

RDKit to params

Create or modify Rosetta params files (topology files) from scratch, RDKit mols or another params file.

Installation

To install from pip type:

pip install rdkit-to-params

To install the latest version (probably the same) from GitHub

pip install git+https://github.com/matteoferla/rdkit_to_params

RDKit and PyRosetta are optional module, but most of the useful functionality comes from the former. To install rdkit, conda install -c conda-forge rdkit or apt-get or pip install rdkit-pypi. To install PyRosetta you need to get a licence (free for academic use) and install it. One option for installation without visiting the Rosetta Commons site is:

pip install pyrosetta_help; PYROSETTA_USERNAME='👾👾👾' PYROSETTA_PASSWORD='👾👾👾' install_pyrosetta

Rationale

This is a fresh rewrite of mol_to_params.py. For three reasons:

I cannot share my 2to3 port and modd ed module-version of mol_to_params.py due to licence.
I want to modify params files and more as opposed to use a standalone script.
RDKit does not save mol2 files, yet knows about atom names and Gasteiger-Massilli charges and more...

It sounds mad, but did not actually take too long.

Website

For a web app using this see https://direvo.mutanalyst.com/params. For the code running the website, see:

Roundtrip

Native amino acid params files can be found in the Rosetta folder rosetta/main/database/chemical/residue_type_sets/fa_standard/residue_types/l-caa Let's do a roundtrip changing an atomname:

import pyrosetta
pyrosetta.init(extra_options='-mute all') # required for test
from rdkit_to_params import Params

p = Params.load('PHE.params')
p.IO_STRING[0].name3 = 'PHX'
p.IO_STRING[0].name1 = 'Z'
p.AA = 'UNK'  #If it's not one of the twenty (plus extras), UNK!
del p.ROTAMER_AA[0]
p.rename_atom(' CB ', ' CX ') # this renames
p.dump('fake.params')
p.test().dump_pdb('test.pdb')

p.test() returns a pyrosetta pose. The static method params_to_pose('something.params', name3) accepts a params file

import nglview
pose = Params.params_to_pose('some_topology_I_found.params', name3)
view = nglview.show_rosetta(pose)
view

From mol object

Requirements

For the sake of sanity, EmbedMolecule, Chem.AddHs(mol) or any other operation is assumed to have been done beforehand. And that the user is going to do Chem.MolToPDBFile(params.mol) or Chem.MolToPDBBlock(params.mol) or use the bound methods of Params, dump_pdb and dump_pdb_conf (see below).

The molecule should preferably be not Kekulised. 3letter name of residue is either from the title row (_Name) if a 3letter word or from the PDBInfo or 'LIG'.

Dummy atom (*/R) is assumed to be a CONNECT —ligand only atm.

Here is a conversion to an amino acid from a SMILES (quickest way):

import pyrosetta
pyrosetta.init(extra_options='-mute all')
from rdkit_to_params import Params
p = Params.from_smiles('*C(=O)C(Cc1ccccc1)[NH]*', #recognised as amino acid.
        name='PHX', #optional.
        atomnames={3: 'CZ'} #optional, rando atom name as see in previous edit
        )
print(p.is_aminoacid()) # True
p.dump('fake.params')
p.test().dump_pdb('test.pdb')
Chem.MolToPDBFile(mol, 'ref.pdb')

Here is a conversion to a ligand the circuitous way, just for fun:

import pyrosetta
pyrosetta.init(extra_options='-mute all')
# note that pyrosetta needs to be started before rdkit.
from rdkit_to_params import Params
# make the molecule in RDKit or chemdraw or download it or whatever.
mol = Chem.MolFromSmiles('NC(C(=O)O)Cc1ccccc1')
mol = AllChem.AddHs(mol)
AllChem.EmbedMolecule(mol)
AllChem.MMFFOptimizeMolecule(mol)
# add names to the mol beforehand
Params.add_names(mol, names=['N', 'CA', 'C', 'O', 'OXT', 'CB'], name='PHZ')
# parameterise
p = Params.from_mol(mol, name='PHZ')
p.test().dump_pdb('test.pdb')
Chem.MolToPDBFile('ref.pdb')

The class method add_names is based upon atom index (which is derived from the SMILES or sdf/mol file unless atoms have been replaced). The instance method rename_by_substructure accepts a substructure and a list of atom names in the order they are in the substructure.

Note that conformer generation is not fully automatic and is not done by default.

# make your conformers as you desire
AllChem.EmbedMultipleConfs(mol, numConfs=10) # or whatever you choose. This is a somewhat important decision.
AllChem.AlignMolConformers(mol) # I do not know if the conformers need to be aligned for Rosetta
# params time!
p = Params.from_mol(mol, name='LIG') 
p.dump_pdb_conf('LIG_conf.pdb')
p.PDB_ROTAMERS.append('LIG_conf.pdb')
p.dump('my_params.params')

Note dump_pdb and dump_pdb_conf will save the molecule(s) without the dummy atoms, to stop this add stripped=False.

From SMILES string

The above is actually a bit convoluted for example purposes as Params.from_smiles, accepts a SMILES string.

From SMILES string and PDB for names

In some cases one has a PDB file with a ligand that needs parameterising. Assuming one has also the smiles of the ligand (PubChem has an super easy search), one can do

p = Params.from_smiles_w_pdbfile(pdb_file, smiles, 'XXX') # the name has to match.

The smiles does not need to match full. It can contain more atoms or even one* (CONNECT). The smiles gets parameterised. So be suse to add correct charges properly —hydrogens are added. It could be used for scaffold hopping, but if position matters so much, you may be interested in Fragmenstein.

For more see autogenerated documentation. Sphinx with markdown cannot deal with typehinting, so checking the code might be clearer.

Rename

A key part is the atom names ——this can happen at . The following renaming methods are present:

p.rename(???): "overloaded" method that directs to the others
p.rename_from_str('XX,YY,ZZ') or p.rename_from_str('0:XX,3:YY')
p.rename_from_list(['XX','YY', 'ZZ'])
p.rename_from_dict({0:'XX',3:'YY'})
p.rename_from_template(Chem.Mol)
p.rename_by_substructure(Chem.Mol, ['XX','YY', 'ZZ']) where the list is the atom idx in substructure

Note, retype_by_name does not have all these options (only atomname -> Rosetta atomtype).

The class method add_names simply uses these, but returns a mol

DIY

If you have two mol objects from whatever routes, the basic operation is:

p = Params.load_mol(mol, generic=False, name='LIG')
p.rename_from_template(template) # or whatever middle step

p.convert_mol()

Note that convert_mol should be called once and is already called in the two from_XXX classmethods.

p = Params.from_mol(...)
p.convert_mol() # No!!!
p.mol # is the mol...
p2 = Params.load_mol(p.mol)
p2.convert_mol() # Yes

Partial charges

The partial charges in the rdkit.Chem.Atom property to use can be set via pcharge_prop_name argument in from_mol and load_mol among others. By default it is '_GasteigerCharge', which is the Gasteiger-Marsili charge assigned by RDKit. Custom partial charges need to be assigned to the atoms beforehand, for example:

import psi4
from rdkit import Chem
from rdkit_to_params import Params

mol: Chem.Mol = ...  # your molecule
pcharge_prop_name: str = 'ESPCharge'  # custom property name for partial charges
mol_string = f"{Chem.GetFormalCharge(mol)} 1\n" + "\n".join(Chem.MolToXYZBlock(mol).split('\n')[2:])  # charge, multiplicity, then coords
psi4_mol = psi4.geometry(mol_string)
psi4.set_options({
    'basis': '6-31G*',
    'scf_type': 'df'
})
energy, wfn = psi4.energy('B3LYP', return_wfn=True)
psi4.set_options({'CUBEPROP_TASKS': ['ESP']})
psi4.cubeprop(wfn)
charges: list[float] = wfn.atomic_point_charges().np.tolist()
for i, atom in enumerate(mol.GetAtoms()):
    atom.SetDoubleProp(pcharge_prop_name, charges[i])
    
p = Params.from_mol(mol, pcharge_prop_name=pcharge_prop_name, name='LIG')

Constraints

The self-standing class Constraints is for generating constraint files, which are a must with covalent attachments in order to stop janky topologies. The class is instantiated with a pair of SMILES, each with at least a real atom and with one attachment point, the first is the ligand and the second is its peptide target. The names of the heavy atoms and the Rosetta residue "numbers".

from rdkit_to_params import Constraints
c = Constraints(smiles=('*C(=N)', '*SC'), names= ['*', 'CX', 'NY', '*', 'SG', 'CB'], ligand_res= '1B', target_res='145A')
c.dump('con.con')
# individual strings can be accessed
c.atom_pair_constraint
c.angle_constraint
c.dihedral_constaint
c.custom_constaint # if you want to add your own before `str`, `.dumps`, `.dump`.

Do note that to make covalent links work in Rosetta, NGL and a few other places you need a LINK record, here is a f-string for it:

f'LINK        {target_atom: >4} {target_resn: >3} {p_chain[:1]} {target_resi: >3}                '+\
f'{ligand_atom: >4} {ligand_resn: >3} {ligand_chain[:1]} {ligand_resi: >3}     1555   1555  1.8\n'

This is not to be confused with CCP4 REFMAC's LINKR, which are however easy to covert. Alternatively, you can add it after importing the pose, cf. pose.residue(lig_pos).connect_map.

Bond order

It is worth mentioning that the bond order specified in the topology file in the BOND_ORDER lines is mostly ignored and the bond order is derived from the rosetta types that get assigned. To extract and correct a ligand, consider the following

# pose to string
buffer = pyrosetta.rosetta.std.stringbuf()
pose.dump_pdb(pyrosetta.rosetta.std.ostream(buffer))
pdbblock = buffer.str()
# get the residue
mol = Chem.MolFromPDBBlock(pdbblock, proximityBonding=False, removeHs=False)
ligand = Chem.SplitMolByPDBResidues(mol, whiteList=[params.NAME])[params.NAME]
# fix bond order
template = AllChem.DeleteSubstructs(params.mol, Chem.MolFromSmiles('*'))
AllChem.AssignBondOrdersFromTemplate(template, ligand)

Amino acids

A *C(=O)C([*:3])[NH]* molecule, where R3 is whatever sidechain is automatically converted into an amino acid. Omitting the hydrogen on the amine is fine (implicit), so *C(=O)C([*:3])N* is also automatically accepted. Likewise, a secondary amine like in proline, *C(=O)C1CCCN1*, is automatically determined to be an amino acid. Omitting the double bond of the carboxyl will result in a hydroxyl backbones amino acid, which will behave like C=[OH+] for properties, but without the partial charge. The criterion for an amino acid is if the substracture *NCC(~O)* is matched (see _aminoacid_override).

However, C(=O)C([*:3])[NH] will be parsed as [CH](=O)C([*:3])[NH], i.e. with a radical amine and an aldehyde.

Here is an example of making a sequence with a custom residue (without writing to file):

import nglview as nv
from rdkit_to_params import Params
import pyrosetta

# make params
p = Params.from_smiles('CCCCC(N*)C(*)=O', name='NLE')
p.PROPERTIES.append('ALIPHATIC')
p.PROPERTIES.append('HYDROPHOBIC')
# p.test() would test it in isolation.

# add to pose
pose = pyrosetta.Pose()
rst = p.add_residuetype(pose)
pyrosetta.rosetta.core.pose.make_pose_from_sequence(pose, 'AX[NLE]A', rst)

# relax and show
scorefxn = pyrosetta.get_fa_scorefxn()
relax = pyrosetta.rosetta.protocols.relax.FastRelax(scorefxn, 15)
relax.apply(pose)
nv.show_rosetta(pose)

Greek

In the amino acid case, the class attribute greekification changes the atomnames to CB, CD2 etc. It is by default True. It is called during fix_mol, a step in load_mol/load_smiles, so should be safe for rename methods.

From RCSB PDB

The PDB has ligand definitions (chemical components) that can be downloaded as a cif file. If Gemmi is installed the function chemcomp_to_mol in rdkit_to_params.mol_from_compchem_block can be used to download and convert a chemical component to an RDKit mol. Note that not all PDB chemical components definitions are perfect and some manual fixing may be required.

Optionals

Installing RDKit with conda is easy (conda install rdkit). With apt-get likewise (sudo apt-get install python3-rdkit librdkit1 rdkit-data). With brew idem (brew install rdkit --with-python3). But there is always a system where one needs to compile it from source, which is a pain. Hence why it is optional. For example, I have never tried installing it on a Windows.

Pyrosetta is optional because it has a non-standard installation.

Terminal caps

To make a cap, there is a quick way:

p = Params.from_smiles('*NCC', name='CAP', atomnames={1: ' N  ', 2: ' CA '})
p.make_C_terminal_cap(mainchain_atoms=['N', 'CA'])

p = Params.from_smiles('*C(=O)C', name='CAP', atomnames={1: ' C  ', 2: ' O  ', 3: ' CA '})
p.make_N_terminal_cap(mainchain_atoms=['C', 'CA'])

These methods also accept connection_idx, which is the Fortran-style index of the connection that will become a LOWER/UPPER. i.e. if the cap is further connected but not as a polymer, say *NCC*.

Do note:

.test() does not work on a terminal cap and will segfault
mainchain_atoms will change Rosetta atom-types only if the name matches
The code accepts only cases with 3 or more atoms (so a *=O cap is a no go and requires virtual atoms added manually)

Caveat: I do not know many things!

Chemical

I suspect I am doing stuff weirdly and I am meant to create ligands via pyrosetta.rosetta.core.chemical and not via params files... If this is so let me know. I don't mind knowing I made a mistake!

Edit: it is indeed possible, but it is far from trivial/sane. The class pyrosetta.rosetta.core.chemical.MutableResidueType can be operated upon by various other classes such as pyrosetta.rosetta.core.chemical.SetAtomicCharge. A mutable residue type can be converted to a regular residue type via .make().

Generic

I like this generic atom type business, but I am not sure how to use them in RL. from_mol(mol, generic=True) will make generic atom types. I made several guesses with the classic atom types and I am sure many things are wrong...

Rings and cis-trans

CUT_BOND is implemented, but I am not sure it does anything. CHI entries cannot work across a cut bond, even when undeclared, so is likely redudant.
ADD_RING is not implemented in the from_mol conversion as I think it's an old command.
Does a cis-trans tautomer bond (say C(=O)-C=O) gets a CHI entry? I am assuming no, but not sure.

Notes

There are some other things to pay attention to:

Atom names are 4-letters. It is always safer to add the spaces yourself if assigning them.
CHI struggles with rings, so currently C1CCCCC1CCC has only one CHI (C7, C8, C9, H10), even if (C6, C7, C8, C9) most probably counts.

To Do

The from_mol class method recognises *[NH]CC(~O)* and assigns it as a backbone properly. However, Chem.MolFromSmiles('*[NH]CC(~O)*') cannot be embedded, so is a bit of a horrible one for users to use. Throughout the code, dummy atoms (*/R) are changed to carbons or chlorines and then changed back. Cystathionine and similar twinned amino acids are the problem as I cannot simply make an amino acid backbone be recognised, however if protonated as is the case [NH1]C[CH0](=O). Maybe the CC(=O)NCC(=O)NC option may be a better choice after all.

Footnote

To save a ResidueType in PyRosetta to a params file, the command is:

pyrosetta.rosetta.core.chemical.write_topology_file(residuetype, 'foo.params')  # noqa

Legal Disclaimer

The author, Matteo Ferla, is not affiliated with either Rosetta or RDKit.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

1.3.3

Jan 3, 2026

1.3.2

Dec 31, 2025

1.3.0

Nov 28, 2025

1.2.14

Nov 16, 2025

1.2.13

Nov 16, 2025

1.2.12

Nov 16, 2025

1.2.11

Jul 15, 2025

1.2.10

Jul 15, 2025

1.2.8

Dec 8, 2023

1.2.5

Jul 19, 2023

1.2.4

May 21, 2023

1.2.3

Jan 9, 2023

1.2.2

Jun 9, 2022

1.2.1

May 25, 2022

1.2

Apr 8, 2022

1.1.14

Oct 17, 2021

1.1.13

Oct 1, 2021

1.1.12

Sep 27, 2021

1.1.11

Sep 22, 2021

1.1.10

Sep 21, 2021

1.1.9

Sep 21, 2021

1.1.8

Jun 16, 2021

1.1.7

May 28, 2021

1.1.6

Mar 22, 2021

1.1.5

Feb 18, 2021

1.1.5b0 pre-release

Mar 13, 2021

1.1.4

Jan 2, 2021

1.1.4b0 pre-release

Jan 29, 2021

1.1.3

Nov 5, 2020

1.1.2

Jul 21, 2020

1.1.1

Jul 16, 2020

1.1.0

Jul 16, 2020

1.0.5

Jul 10, 2020

1.0.4

Jul 2, 2020

1.0.2

Jun 3, 2020

1.0.1

Jun 2, 2020

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

rdkit_to_params-1.3.3.tar.gz (65.6 kB view details)

Uploaded Jan 3, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

rdkit_to_params-1.3.3-py3-none-any.whl (56.8 kB view details)

Uploaded Jan 3, 2026 Python 3

File details

Details for the file rdkit_to_params-1.3.3.tar.gz.

File metadata

Download URL: rdkit_to_params-1.3.3.tar.gz
Upload date: Jan 3, 2026
Size: 65.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.13

File hashes

Hashes for rdkit_to_params-1.3.3.tar.gz
Algorithm	Hash digest
SHA256	`a238399ee20a045c25031dd98fdc15ba9db1db684eb2f31743b980bc3cca3123`
MD5	`02731531a9a48d2822b2c9eb16e8d5aa`
BLAKE2b-256	`6eba18c10ef59bff228ca02a5e800e3a6865ad05a3e3a6cdbf09278b24c85608`

See more details on using hashes here.

File details

Details for the file rdkit_to_params-1.3.3-py3-none-any.whl.

File metadata

Download URL: rdkit_to_params-1.3.3-py3-none-any.whl
Upload date: Jan 3, 2026
Size: 56.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.13

File hashes

Hashes for rdkit_to_params-1.3.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a461f50a93636f1d45ed9fec03172ab49e2b93e952e65e0978353f7b2d47317a`
MD5	`bb2ead1aca4c78e2ab8206bfcebb9fa8`
BLAKE2b-256	`f2475bf4704ef4914a4a84802f49aebf6eb3ac5244e8774823fcc32af84c72a7`

See more details on using hashes here.

rdkit-to-params 1.3.3

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

RDKit to params

Installation

Rationale

Website

Roundtrip

From mol object

Requirements

From SMILES string

From SMILES string and PDB for names

Rename

DIY

Partial charges

Constraints

Bond order

Amino acids

Greek

From RCSB PDB

Optionals

Terminal caps

Caveat: I do not know many things!

Chemical

Generic

Rings and cis-trans

Notes

To Do

Footnote

Legal Disclaimer

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes