Schrodinger file format support for OEPandas

oepandas-mae

Maestro file format support for oepandas. Reads .mae, .mae.gz, and .maegz files into pandas DataFrames following oepandas conventions.
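As a rough illustration of what handling the three extensions involves, the sketch below opens a Maestro file as text and decompresses it when the name ends in .mae.gz or .maegz. The helper names `is_compressed` and `open_maestro_text` are hypothetical and not part of oepandas-mae. The sketch assumes the compressed variants are ordinary gzip streams, and it does not parse the file contents.

```python
import gzip
from typing import TextIO

# Extensions treated as gzip-compressed Maestro files (assumption for this sketch).
GZIP_SUFFIXES = (".mae.gz", ".maegz")


def is_compressed(path) -> bool:
    """Return True when the file name indicates a gzip-compressed Maestro file."""
    return str(path).lower().endswith(GZIP_SUFFIXES)


def open_maestro_text(path) -> TextIO:
    """Open a Maestro file for reading as text, decompressing if needed."""
    if is_compressed(path):
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "rt", encoding="utf-8")
```

With read_mae none of this is needed; the reader accepts all three extensions directly.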

Only a valid OpenEye Toolkits license is required. You do not need to have any Schrodinger software installed.

Installation

Install as an oepandas extra:

pip install oepandas[mae]

Or install standalone:

pip install oepandas-mae

Requirements: Python 3.10+, oepandas >= 3.2.0, oemaestro >= 0.1.0, OpenEye Toolkits

Quick Start

import oepandas as oepd

df = oepd.read_mae("tests/assets/5.maegz")
print(df)

Prints:

                                            Molecule          Title ...
0  <oechem.OEMol; proxy of <Swig Object of type '...        Aspirin ...
1  <oechem.OEMol; proxy of <Swig Object of type '...      Ibuprofen ...
2  <oechem.OEMol; proxy of <Swig Object of type '...  Acetaminophen ...
3  <oechem.OEMol; proxy of <Swig Object of type '...       Caffeine ...
4  <oechem.OEMol; proxy of <Swig Object of type '...       Diazepam ...

[5 rows x 16 columns]

Usage

Column Naming

Rename the molecule and title columns, or suppress the title column entirely:

import oepandas as oepd
df = oepd.read_mae("tests/assets/5.maegz", molecule_column="Mol", title_column="Name")
df = oepd.read_mae("tests/assets/5.maegz", no_title=True)

Selecting Columns

Use usecols to include only specific data columns:

df = oepd.read_mae("tests/assets/5.maegz", usecols=["NumAcceptors", "NumDonors"])

Numeric Conversion

CT property values are strings by default. Use numeric to convert columns:

# Convert a single column
df = oepd.read_mae("file.mae", numeric="pdb_tfactor")

# Convert multiple columns
df = oepd.read_mae("file.mae", numeric=["pdb_tfactor", "occupancy"])

# Specify downcast types per column
df = oepd.read_mae("file.mae", numeric={"pdb_tfactor": "float", "atom_count": "integer"})
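The dictionary values above resemble the `downcast` options of `pandas.to_numeric`. Whether read_mae delegates to that function is an assumption; the source does not say. Under that assumption, the sketch below shows the equivalent conversion on a small hand-built DataFrame of string values, using the column names from the example above as stand-ins.

```python
import pandas as pd

# Stand-in for string-valued CT properties; object dtype keeps plain Python strings.
raw = pd.DataFrame(
    {"pdb_tfactor": ["12.5", "30.1"], "atom_count": ["21", "24"]},
    dtype=object,
)

# Hypothetical mapping with the same shape as the numeric= dict above.
downcast = {"pdb_tfactor": "float", "atom_count": "integer"}

converted = raw.copy()
for column, kind in downcast.items():
    # Parse the strings, then shrink to the smallest dtype of the requested kind.
    converted[column] = pd.to_numeric(raw[column], downcast=kind)

print(converted.dtypes)
```

Columns left out of the mapping stay as strings, which matches the documented default for CT property values.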

SMILES Columns

Add SMILES string columns alongside the molecule objects:

df = oepd.read_mae("file.mae", add_smiles=True)
# Creates a "Molecule SMILES" column

Conformer Grouping

Consecutive structures in a Maestro file can be grouped into multi-conformer molecules:

df = oepd.read_mae("file.mae", conformer_test="absolute")

Available conformer tests:

  • "default" -- groups consecutive molecules with matching titles
  • "absolute" -- requires identical atom/bond ordering, properties, and title
  • "absolute_canonical" -- requires matching canonical SMILES
  • "isomeric" -- like absolute, but also requires matching stereochemistry
  • "omega" -- like isomeric, plus invertible nitrogen stereochemistry
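To make the "default" test concrete, here is a minimal sketch of grouping consecutive records by title. Plain strings stand in for molecules, and the function name `group_consecutive_by_title` is hypothetical. It models only the behavior described in the list above, not the toolkit's actual implementation.

```python
from itertools import groupby


def group_consecutive_by_title(titles):
    """Return (title, count) pairs for runs of adjacent, equal titles."""
    return [(title, len(list(run))) for title, run in groupby(titles)]


titles = ["Aspirin", "Aspirin", "Ibuprofen", "Aspirin"]
groups = group_consecutive_by_title(titles)
print(groups)
```

In this model the final "Aspirin" forms its own group because it is not adjacent to the first two, which is what "consecutive" implies.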

Tag Formatting

Maestro property keys follow a type_owner_name convention (e.g., r_pdb_PDB_CRYST1_a). By default, only the name portion is used as the column name. Control this with tags:

import oepandas as oepd
from oepandas_mae import TAG_ALL, TAG_NAME, TAG_NONE

# Default: clean names only (e.g., "PDB_CRYST1_a")
df = oepd.read_mae("file.mae")

# Full Maestro keys (e.g., "r_pdb_PDB_CRYST1_a")
df = oepd.read_mae("file.mae", tags=TAG_ALL)

# No data columns (molecules and titles only)
df = oepd.read_mae("file.mae", tags=TAG_NONE)
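The type_owner_name convention can be illustrated with plain string splitting. The sketch below is not the library's implementation, and `split_maestro_key` and `MaestroKey` are hypothetical names. It splits on the first two underscores only, so underscores inside the name portion are preserved. The type-prefix table reflects the Maestro format's usual prefixes rather than anything stated in this README.

```python
from typing import NamedTuple

# Usual Maestro type prefixes (background knowledge, not taken from this README).
TYPE_PREFIXES = {"r": "real", "i": "integer", "s": "string", "b": "boolean"}


class MaestroKey(NamedTuple):
    type_code: str
    owner: str
    name: str


def split_maestro_key(key: str) -> MaestroKey:
    """Split 'type_owner_name' on the first two underscores only."""
    type_code, owner, name = key.split("_", 2)
    return MaestroKey(type_code, owner, name)


key = split_maestro_key("r_pdb_PDB_CRYST1_a")
print(key.name, TYPE_PREFIXES[key.type_code])
```

A key with fewer than two underscores raises ValueError in this sketch; a real reader would need to decide how to treat such keys.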

Perception Control

Control post-parse chemical perception with the perception parameter:

import oepandas as oepd
from oepandas_mae import PERCEPTION_NONE

df = oepd.read_mae("file.mae", perception=PERCEPTION_NONE)

By default, only limited perception is performed so that the molecules reflect what the Maestro file specifies.

Configuration Object

For repeated use, pass an OEMaestroReaderConfig object. Keyword arguments override config values:

import oepandas as oepd
from oepandas_mae import OEMaestroReaderConfig, TAG_ALL, TAG_NAME

config = OEMaestroReaderConfig()
config.tags = TAG_ALL

df = oepd.read_mae("file.mae", config=config)
df = oepd.read_mae("file.mae", config=config, tags=TAG_NAME)  # tags overrides config
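The "keyword arguments override config values" rule is a common pattern, and one way it could work is sketched below. `ReaderConfig` and `resolve_config` are hypothetical stand-ins, not the real OEMaestroReaderConfig or any oepandas-mae function, and the string values only mimic the TAG_* constants.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ReaderConfig:
    """Hypothetical stand-in; not the real OEMaestroReaderConfig."""
    tags: str = "name"
    no_title: bool = False


def resolve_config(config=None, **overrides):
    """Return a copy of config with explicitly passed keywords applied on top."""
    base = config if config is not None else ReaderConfig()
    # Keywords left as None are treated as "not given" and keep the config value.
    explicit = {key: value for key, value in overrides.items() if value is not None}
    return replace(base, **explicit)


shared = ReaderConfig(tags="all")
effective = resolve_config(shared, tags="name")
print(shared.tags, effective.tags)
```

Because the sketch copies rather than mutates, the shared config can be reused across calls without one call's overrides leaking into the next.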

License

MIT
