A lightweight, efficient parser for Google-style Python docstrings that converts them into structured dictionaries.

Google Docstring Parser

Important: This package requires a PAID LICENSE for all users EXCEPT the Albumentations Team. Contact iglovikov@gmail.com to obtain a license before using this software.

A Python package for parsing Google-style docstrings into structured dictionaries.

License Information

This package is available under a custom license. See the LICENSE file for complete details.

Installation

pip install google-docstring-parser

Usage

from google_docstring_parser import parse_google_docstring

docstring = '''Apply elastic deformation to images, masks, bounding boxes, and keypoints.

This transformation introduces random elastic distortions to the input data. It's particularly
useful for data augmentation in training deep learning models, especially for tasks like
image segmentation or object detection where you want to maintain the relative positions of
features while introducing realistic deformations.

Args:
    alpha (float): Scaling factor for the random displacement fields. Higher values result in
        more pronounced distortions. Default: 1.0
    sigma (float): Standard deviation of the Gaussian filter used to smooth the displacement
        fields. Higher values result in smoother, more global distortions. Default: 50.0

Example:
    >>> import albumentations as A
    >>> transform = A.ElasticTransform(alpha=1, sigma=50, p=0.5)

References:
    - Original paper: Simard, P. Y., et al. "Best practices for convolutional neural networks applied to visual document analysis." ICDAR 2003
    - Implementation details: https://example.com/elastic-transform
Returns:
    dict[str, Any]: Some info here
'''

parsed = parse_google_docstring(docstring)
print(parsed)

Output:

{
    'Description': 'Apply elastic deformation to images, masks, bounding boxes, and keypoints.\n\nThis transformation introduces random elastic distortions to the input data. It\'s particularly\nuseful for data augmentation in training deep learning models, especially for tasks like\nimage segmentation or object detection where you want to maintain the relative positions of\nfeatures while introducing realistic deformations.',
    'Args': [
        {
            'name': 'alpha',
            'type': 'float',
            'description': 'Scaling factor for the random displacement fields. Higher values result in\nmore pronounced distortions. Default: 1.0'
        },
        {
            'name': 'sigma',
            'type': 'float',
            'description': 'Standard deviation of the Gaussian filter used to smooth the displacement\nfields. Higher values result in smoother, more global distortions. Default: 50.0'
        }
    ],
    'Example': '>>> import albumentations as A\n>>> transform = A.ElasticTransform(alpha=1, sigma=50, p=0.5)',
    'References': [
        {
            'description': 'Original paper',
            'source': 'Simard, P. Y., et al. "Best practices for convolutional neural networks applied to visual document analysis." ICDAR 2003'
        },
        {
            'description': 'Implementation details',
            'source': 'https://example.com/elastic-transform'
        }
    ],
    'Returns': {
        'type': 'dict[str, Any]',
        'description': 'Some info here'
    }
}
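
Because the result is a plain dictionary, it can be consumed with ordinary Python. The following is a minimal sketch that renders the 'Args' list as one-line Markdown bullets. `format_args` is a hypothetical helper, not part of the package, and the dictionary literal stands in for the return value of parse_google_docstring so that the snippet runs without the package installed.

```python
from typing import Any


def format_args(parsed: dict[str, Any]) -> list[str]:
    """Render each entry of parsed['Args'] as a one-line Markdown bullet."""
    lines = []
    for arg in parsed.get("Args", []):
        # Descriptions may span several lines; collapse them into one.
        description = " ".join(arg["description"].split())
        lines.append(f"- `{arg['name']}` ({arg['type']}): {description}")
    return lines


# Stand-in for parse_google_docstring(docstring), shaped like the output above.
parsed = {
    "Description": "Apply elastic deformation.",
    "Args": [
        {"name": "alpha", "type": "float", "description": "Scaling factor.\nDefault: 1.0"},
        {"name": "sigma", "type": "float", "description": "Smoothing strength. Default: 50.0"},
    ],
}

for line in format_args(parsed):
    print(line)
```

The helper uses `parsed.get("Args", [])` so that a docstring without an Args section yields an empty list rather than a KeyError.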

Features

  • Parses Google-style docstrings into structured dictionaries
  • Extracts parameter names, types, and descriptions
  • Preserves other sections like Examples, Notes, etc.
  • Handles multi-line descriptions and indentation properly
  • Properly parses and validates References sections with special handling for URLs

References

The parser can handle reference sections in Google-style docstrings. References can be formatted in two ways:

  1. Single reference format:
"""
Reference:
    Paper title: https://example.com/paper
"""
  2. Multiple references format (requires leading dashes):
"""
References:
    - First paper: https://example.com/paper1
    - Second paper: https://example.com/paper2
"""

Each reference is parsed into a dictionary with description and source keys. URLs in the source are properly handled, ensuring colons in URLs are not confused with the separator colon.
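
One way to keep the colon in a URL from being mistaken for the separator is to split only on the first colon that is followed by a space. The sketch below illustrates that idea; `split_reference` is a hypothetical function written for this example, and the package's actual implementation may work differently.

```python
def split_reference(line: str) -> dict[str, str]:
    """Split 'description: source' on the first colon followed by a space."""
    text = line.strip()
    # Multiple-reference format: drop the leading dash.
    if text.startswith("-"):
        text = text[1:].lstrip()
    # In "https://..." the colon is followed by "/", so it is never matched here.
    description, separator, source = text.partition(": ")
    if not separator:
        raise ValueError(f"No 'description: source' separator in {line!r}")
    return {"description": description.strip(), "source": source.strip()}


print(split_reference("- First paper: https://example.com/paper1"))
# {'description': 'First paper', 'source': 'https://example.com/paper1'}
```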

Short Description (Meta Description)

The checker extracts the short description as the first paragraph of the docstring (everything up to the first blank line). If that paragraph spans several lines, the lines are joined with spaces. The result is useful as a meta description on documentation sites, where common SEO guidance is 120-160 characters. Use min_short_description_length and max_short_description_length to enforce bounds; set either option to 0 to disable that check.
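
The extraction and length rules described above can be sketched in a few lines. Both functions below are hypothetical illustrations rather than the checker's own code; only the behavior (first paragraph, lines joined with spaces, 0 disables a bound) is taken from the description.

```python
from typing import Optional


def short_description(docstring: str) -> str:
    """Return the first paragraph with its lines joined by single spaces."""
    lines = []
    for line in docstring.strip().splitlines():
        if not line.strip():
            break  # The first blank line ends the short description.
        lines.append(line.strip())
    return " ".join(lines)


def check_length(text: str, min_length: int = 0, max_length: int = 0) -> Optional[str]:
    """Return a message if the text is out of bounds; a bound of 0 is disabled."""
    if min_length and len(text) < min_length:
        return f"too short: {len(text)} < {min_length}"
    if max_length and len(text) > max_length:
        return f"too long: {len(text)} > {max_length}"
    return None


doc = """Parse a docstring
into a dictionary.

Longer explanation that is not part of the short description.
"""

summary = short_description(doc)
print(summary)
print(check_length(summary, min_length=50, max_length=160))
```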

Pre-commit Hook

This package includes a pre-commit hook that checks whether the Google-style docstrings in your codebase can be parsed correctly.

Usage in Other Projects

To use this hook in another project, add the following to your .pre-commit-config.yaml:

- repo: https://github.com/ternaus/google-docstring-parser
  rev: v0.0.1  # Use the latest version
  hooks:
    - id: check-google-docstrings
      additional_dependencies: ["tomli>=2.0.0"]  # Required for pyproject.toml configuration

Configuration

The hook is configured via pyproject.toml, following the same convention as tools such as mypy and ruff.

Add a [tool.docstring_checker] section to your pyproject.toml:

[tool.docstring_checker]
paths = ["src", "tests"]                     # Directories or files to scan
require_param_types = true                   # Require parameter types in docstrings
check_references = true                      # Check references for proper format
check_type_consistency = true                # Compare docstring types with annotations
exclude_files = ["conftest.py", "__init__.py"] # Files to exclude from checks
min_short_description_length = 50            # Minimum short description length; 0 to disable
max_short_description_length = 160           # Maximum short description length; 0 to disable (SEO: 120-160)
verbose = false                              # Enable verbose output
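
To see how such a section looks once loaded, the sketch below reads the [tool.docstring_checker] table with the standard-library TOML parser, falling back to tomli on older Python versions. `load_checker_config` is a hypothetical helper for illustration; it is not the hook's own loader, and it applies no defaults.

```python
try:
    import tomllib  # Standard library in Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # Backport with the same API for older versions


def load_checker_config(pyproject_text: str) -> dict:
    """Return the [tool.docstring_checker] table, or {} if it is absent."""
    data = tomllib.loads(pyproject_text)
    return data.get("tool", {}).get("docstring_checker", {})


EXAMPLE = """
[tool.docstring_checker]
paths = ["src", "tests"]
require_param_types = true
max_short_description_length = 160
"""

config = load_checker_config(EXAMPLE)
print(config["paths"])
# Options missing from the file are simply not present in the result.
print(config.get("verbose", False))
```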
