Python bindings for Cymbal code indexing and symbol discovery

py-cymbal: Python Bindings for Cymbal

Python bindings for the Cymbal code indexing and symbol discovery tool, created using the gopy framework.

Overview

Cymbal is a Go-based tool that uses tree-sitter for multi-language AST parsing and SQLite for indexed storage, providing fast symbol search, cross-references, impact analysis, and scoped diffs. These bindings expose Cymbal's code analysis to Python programs.

Features

  • Repository Indexing: Index code repositories for fast symbol lookup
  • Symbol Search: Search for functions, classes, variables, and other symbols
  • Symbol Investigation: Get detailed information about specific symbols including definitions and references
  • Reference Finding: Find all references to a particular symbol
  • Multi-language Support: Works with all languages supported by Cymbal (via tree-sitter)
  • Clean Python API: Pythonic interface with context managers and convenience functions

Installation

Prerequisites

  1. Go 1.25+: Required for building the bindings
  2. Python 3.7+: Python development headers are required
  3. gopy: Go package for creating Python bindings
  4. Cymbal: The Go library being wrapped

Building from Source

# Clone the repository
git clone <repository-url>
cd py-cymbal

# Build the Python bindings
./build.sh

# Set up environment for development
export LD_LIBRARY_PATH="$(pwd)/python/cymbal:$LD_LIBRARY_PATH"

# Install in development mode
pip install -e .

Build Script

The build.sh script automates the entire build process:

  1. Sets up the Go environment with CGO flags for SQLite FTS5
  2. Generates Python bindings using gopy
  3. Builds the C extension
  4. Organizes the Python module structure
  5. Creates a setup.py for pip installation
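
Before running the build script, it can save time to confirm that the prerequisites listed earlier are present. The following is a minimal sketch, not part of py-cymbal: the function name `check_build_prerequisites` and its report format are hypothetical, and it only checks that the `go` and `gopy` executables are on `PATH` and that `Python.h` exists in the interpreter's include directory. It does not verify versions.

```python
import shutil
import sysconfig
from pathlib import Path


def check_build_prerequisites(tools=("go", "gopy")):
    """Return a dict mapping each prerequisite name to True (found) or False.

    Hypothetical helper: the tool names come from the Prerequisites list,
    everything else is illustrative.
    """
    # An executable counts as present if it can be resolved on PATH.
    report = {tool: shutil.which(tool) is not None for tool in tools}

    # Python development headers: look for Python.h in the include directory
    # reported by the running interpreter.
    include_dir = sysconfig.get_paths().get("include")
    report["Python.h"] = bool(include_dir) and (Path(include_dir) / "Python.h").is_file()
    return report


if __name__ == "__main__":
    for name, found in check_build_prerequisites().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Any entry reported as missing points at the corresponding item in the Prerequisites list.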

Usage

Basic Example

import cymbal

# Create a Cymbal instance
with cymbal.Cymbal() as c:
    # Index a repository
    stats = c.index("/path/to/your/repository")
    print(f"Indexing result: {stats}")
    
    # Search for symbols
    results = c.search("handleAuth", limit=10)
    for symbol in results:
        print(f"{symbol.name} ({symbol.kind}) at {symbol.file}:{symbol.start_line}")
    
    # Investigate a specific symbol
    investigation = c.investigate("UserModel")
    print(f"Definition: {investigation.definition}")
    print(f"References: {len(investigation.references)}")
    
    # Find references to a symbol
    references = c.find_references("DatabaseConnection", limit=20)
    for ref in references:
        print(f"Reference at {ref.file}:{ref.line}")
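
Search results can be post-processed like any other Python objects. The sketch below groups results by kind and formats them the same way as the example above. It assumes only the attributes used there (`name`, `kind`, `file`, `start_line`). The helper names and the sample data are hypothetical stand-ins so that the snippet runs without a built index; the sample `kind` values are also assumptions.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_kind(symbols):
    """Group symbol results by their `kind` attribute, preserving order."""
    grouped = defaultdict(list)
    for symbol in symbols:
        grouped[symbol.kind].append(symbol)
    return dict(grouped)


def format_symbol(symbol):
    """Render one result in the 'name (kind) at file:line' form."""
    return f"{symbol.name} ({symbol.kind}) at {symbol.file}:{symbol.start_line}"


# Stand-in objects with the same attributes as the results shown above.
sample_results = [
    SimpleNamespace(name="handleAuth", kind="function", file="auth.go", start_line=12),
    SimpleNamespace(name="UserModel", kind="class", file="models.py", start_line=3),
    SimpleNamespace(name="handleLogout", kind="function", file="auth.go", start_line=40),
]

if __name__ == "__main__":
    for kind, items in sorted(group_by_kind(sample_results).items()):
        print(kind)
        for item in items:
            print("  " + format_symbol(item))
```

With a real index, the list returned by `c.search(...)` would take the place of `sample_results`.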

Convenience Functions

import cymbal

# Index a repository (one-liner)
stats = cymbal.index_repository("/path/to/repo")

# Search with existing database path
results = cymbal.search_symbols("config", limit=15, db_path="/path/to/index.db")

# Investigate a symbol
investigation = cymbal.investigate_symbol("ApiClient", db_path="/path/to/index.db")

Advanced Usage

import cymbal

# Reuse an existing index
c = cymbal.Cymbal()  # Create outside the try block so `c` is always bound in `finally`
try:
    c.db_path = "/path/to/existing/index.db"  # Point at an existing database

    # Perform searches
    results = c.search("test", limit=5)

    # Process results
    for symbol in results:
        print(f"Found: {symbol.name} in {symbol.file}")

except Exception as e:
    print(f"Error: {e}")
finally:
    c.close()

API Reference

cymbal.Cymbal Class

__init__(repo_path=None)

Create a new Cymbal instance. If repo_path is given, the repository is indexed immediately.

index(repo_path)

Index a repository. Returns statistics about the indexing operation.

search(query, limit=20)

Search for symbols matching the query. Returns a list of symbol results.

investigate(symbol_name)

Investigate a specific symbol. Returns an investigation result with definition and references.

find_references(symbol_name, limit=50)

Find references to a symbol. Returns a list of reference results.

db_path (property)

Get or set the current database path.

close()

Close the Cymbal instance and release resources.
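
Because the real class needs the compiled shared libraries, code that depends on it can be awkward to unit test. The sketch below is a hypothetical in-memory stand-in (`FakeCymbal` is not part of py-cymbal) that mirrors the documented `search(query, limit=20)`, `close()`, and context-manager behavior. The matching rule it uses, a case-insensitive substring match on the symbol name, is an assumption made for illustration and may differ from how Cymbal matches queries.

```python
from types import SimpleNamespace


class FakeCymbal:
    """In-memory test double with the same method signatures as documented."""

    def __init__(self, symbols=()):
        self._symbols = list(symbols)
        self.closed = False

    def search(self, query, limit=20):
        # Assumption: case-insensitive substring match on the symbol name.
        hits = [s for s in self._symbols if query.lower() in s.name.lower()]
        return hits[:limit]

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions


def count_matches(client, query):
    """Example application code that relies only on search()."""
    return len(client.search(query, limit=100))


if __name__ == "__main__":
    symbols = [
        SimpleNamespace(name="handleAuth"),
        SimpleNamespace(name="AuthToken"),
        SimpleNamespace(name="UserModel"),
    ]
    with FakeCymbal(symbols) as client:
        print(count_matches(client, "auth"))
```

Application code written against `client.search(...)` can then be exercised with either the fake or a real `cymbal.Cymbal` instance.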

Convenience Functions

index_repository(repo_path)

Convenience function to index a repository.

search_symbols(query, limit=20, db_path=None)

Convenience function to search for symbols.

investigate_symbol(symbol_name, db_path=None)

Convenience function to investigate a symbol.

Architecture

The Python bindings are built using a layered architecture:

  1. Go Wrapper Layer (go/pycymbal/wrapper.go): Simplified Go API that exposes core Cymbal functionality with Python-friendly signatures.
  2. gopy Generated Layer: Automatically generated by gopy, providing the C bindings between Go and Python.
  3. Python API Layer (python/cymbal/__init__.py): Pythonic wrapper that provides a clean, intuitive interface for Python developers.

File Structure

py-cymbal/
├── go/                    # Go wrapper package
│   ├── pycymbal/         # Go wrapper source code
│   │   └── wrapper.go    # Simplified Go API for Python
│   ├── go.mod           # Go module definition
│   └── main.go          # Test program
├── python/              # Python bindings
│   └── cymbal/          # Python module
│       ├── __init__.py  # Python API layer
│       ├── pycymbal.py  # Generated Python bindings
│       ├── go.py        # Generated Go runtime support
│       ├── _pycymbal.so # Compiled C extension
│       └── pycymbal_go.so # Go shared library
├── examples/            # Usage examples
│   └── basic_usage.py  # Basic usage demonstration
├── build.sh            # Build automation script
├── setup.py            # pip installation configuration
└── README.md           # This file

Technical Details

CGO Dependencies

The bindings handle CGO dependencies including:

  • SQLite with FTS5: Enabled via -DSQLITE_ENABLE_FTS5 CFLAG
  • tree-sitter: C library for parsing multiple programming languages
  • Python C API: For Python extension module support

Type Mapping

gopy automatically handles type conversions between Go and Python:

  • Go structs → Python objects with attributes
  • Go slices → Python lists
  • Go errors → Python exceptions
  • Go interfaces → Python objects with methods

Resource Management

The bindings manage the following resources:

  • SQLite database connections
  • tree-sitter parser instances
  • Go garbage collection coordination with Python reference counting

Limitations and Known Issues

  1. Shared Library Loading: Both _pycymbal.so and pycymbal_go.so must be in the same directory, and that directory must be listed in LD_LIBRARY_PATH.
  2. Platform Support: Primarily tested on Linux. macOS and Windows may require additional configuration.
  3. Memory Management: Large repositories may require significant memory for indexing.
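  4. Concurrent Access: SQLite allows only one writer at a time, so concurrent writes to the same index database (for example, two processes indexing at once) may block or fail with a "database is locked" error.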
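
The concurrent-access limitation comes from SQLite itself rather than from the bindings. The sketch below uses only Python's standard `sqlite3` module, not py-cymbal, to show what happens when a second connection tries to start a write while another connection holds the write lock. How Cymbal configures its own connections (journal mode, busy timeout) is not covered by this README, so treat this as an illustration of the general SQLite behavior only. The function name is hypothetical.

```python
import os
import sqlite3
import tempfile


def second_writer_error(db_file):
    """Return the error message a second writer gets while the first holds
    the write lock, or None if the second writer was not rejected."""
    # isolation_level=None selects autocommit mode, so transactions are
    # controlled explicitly with BEGIN/ROLLBACK. timeout=0 disables waiting.
    first = sqlite3.connect(db_file, timeout=0, isolation_level=None)
    second = sqlite3.connect(db_file, timeout=0, isolation_level=None)
    try:
        first.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
        first.execute("BEGIN IMMEDIATE")  # first connection takes the write lock
        first.execute("INSERT INTO t VALUES (1)")
        try:
            second.execute("BEGIN IMMEDIATE")  # second writer tries to start
        except sqlite3.OperationalError as exc:
            return str(exc)
        second.execute("ROLLBACK")
        return None
    finally:
        if first.in_transaction:
            first.execute("ROLLBACK")
        first.close()
        second.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(second_writer_error(os.path.join(tmp, "demo.db")))
```

In practice this suggests running one indexing job per database at a time and keeping other processes read-only.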

Troubleshooting

Import Error: "pycymbal_go.so: cannot open shared object file"

export LD_LIBRARY_PATH=/path/to/py-cymbal/python/cymbal:$LD_LIBRARY_PATH
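
If the error persists, it is worth confirming that both shared libraries were actually produced by the build. The following sketch checks a directory for the two file names listed under File Structure. The helper name `missing_shared_libs` is hypothetical, and the demo uses a temporary directory so it runs anywhere; point it at your own python/cymbal directory instead.

```python
import tempfile
from pathlib import Path

# File names taken from the File Structure section.
REQUIRED_LIBS = ("_pycymbal.so", "pycymbal_go.so")


def missing_shared_libs(package_dir):
    """Return the required shared libraries that are absent from package_dir."""
    package_dir = Path(package_dir)
    return [name for name in REQUIRED_LIBS if not (package_dir / name).is_file()]


if __name__ == "__main__":
    # Demo with a temporary directory that contains only one of the two files.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "_pycymbal.so").touch()
        print(missing_shared_libs(tmp))
```

An empty list means both files are present, in which case the remaining suspect is the LD_LIBRARY_PATH value itself.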

Build Error: Missing Python.h

Install Python development headers:

# Fedora/RHEL/CentOS
sudo dnf install python3-devel

# Ubuntu/Debian
sudo apt-get install python3-dev

Build Error: SQLite FTS5 not enabled

The build script sets -DSQLITE_ENABLE_FTS5 automatically. If you're building manually, ensure this flag is set:

export CGO_CFLAGS="-DSQLITE_ENABLE_FTS5"

Development

Building Manually

cd go
# Adjust the include path to your Python version (python3-config --includes prints it)
export CGO_CFLAGS="-DSQLITE_ENABLE_FTS5 -I/usr/include/python3.12"
gopy gen -vm=python3 ./pycymbal
make build
cd ..
mv go/_pycymbal.so go/pycymbal.py go/go.py python/cymbal/
mv go/pycymbal_go.so python/cymbal/

Testing

# Run the example script
export LD_LIBRARY_PATH="$(pwd)/python/cymbal:$LD_LIBRARY_PATH"
python examples/basic_usage.py

Cleaning Build Artifacts

./build.sh clean  # If implemented in build script
# Or manually:
rm -rf python/cymbal/_pycymbal*.so python/cymbal/*.pyc __pycache__ build dist *.egg-info go/pycymbal*.so go/_pycymbal*.so go/*.c go/*.py go/Makefile go/build.py

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Acknowledgments

  • Cymbal for the excellent code indexing tool
  • gopy for making Go-Python bindings possible
  • tree-sitter for robust parsing of multiple languages

Download files

Source Distribution

py_cymbal-0.1.1.tar.gz (6.4 MB)

Uploaded Source

Built Distribution

py_cymbal-0.1.1-py3-none-any.whl (6.5 MB)

Uploaded Python 3

File details

Details for the file py_cymbal-0.1.1.tar.gz.

File metadata

  • Download URL: py_cymbal-0.1.1.tar.gz
  • Upload date:
  • Size: 6.4 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for py_cymbal-0.1.1.tar.gz
Algorithm Hash digest
SHA256 3d4a1ab920f7c56f309995aa76ea5c4a5e3554558fd7fb38023edaf82777c74f
MD5 e7d6c2222bfa4bea2e66af57e46fff93
BLAKE2b-256 994142eaaec3d292240e3e6b4a58ea12e001d034afe752bed6f57cc9342b03fb

File details

Details for the file py_cymbal-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: py_cymbal-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 6.5 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for py_cymbal-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 77ce93e9f94a4b5c403d64c40ee40e76b34c519dc79fd54777de0192c21d2b76
MD5 06c77cb170555117e38844a193a3a38b
BLAKE2b-256 9c1aa3129b59d2eef702453501e93650e47d57c98be30359969c82efb6968ff7
