Python bindings for Cymbal code indexing and symbol discovery

py-cymbal: Python Bindings for Cymbal

Python bindings for the Cymbal code indexing and symbol discovery tool, created using the gopy framework.

Overview

Cymbal is a Go-based tool that uses tree-sitter for multi-language AST parsing and SQLite for indexed storage, providing fast symbol search, cross-references, impact analysis, and scoped diffs. These Python bindings expose Cymbal's indexing, symbol search, symbol investigation, and reference lookup to Python code.

Features

  • Repository Indexing: Index code repositories for fast symbol lookup
  • Symbol Search: Search for functions, classes, variables, and other symbols
  • Symbol Investigation: Get detailed information about specific symbols including definitions and references
  • Reference Finding: Find all references to a particular symbol
  • Multi-language Support: Works with all languages supported by Cymbal (via tree-sitter)
  • Clean Python API: Pythonic interface with context managers and convenience functions

Installation

Prerequisites

  1. Go 1.25+: Required for building the bindings
  2. Python 3.7+: Python development headers are required
  3. gopy: Go package for creating Python bindings
  4. Cymbal: The Go library being wrapped
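
Before building, it can help to confirm that the required tools are on PATH. The following is a minimal sketch, not part of py-cymbal: the helper names are hypothetical, and treating `python3-config` as a proxy for the Python development headers is an assumption that may not hold on every distribution. It does not check the Go version.

```python
import shutil
import sys


def check_prerequisites(tools=("go", "gopy", "python3-config")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


def python_version_ok(minimum=(3, 7)):
    """True if the running interpreter meets the minimum version listed above."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    for tool, path in check_prerequisites().items():
        print(f"{tool}: {path or 'NOT FOUND'}")
    print(f"Python >= 3.7: {python_version_ok()}")
```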

Building from Source

# Clone the repository
git clone <repository-url>
cd py-cymbal

# Build the Python bindings
./build.sh

# Set up environment for development
export LD_LIBRARY_PATH="$(pwd)/python/cymbal:$LD_LIBRARY_PATH"

# Install in development mode
pip install -e .

Build Script

The build.sh script automates the entire build process:

  1. Sets up the Go environment with CGO flags for SQLite FTS5
  2. Generates Python bindings using gopy
  3. Builds the C extension
  4. Organizes the Python module structure
  5. Creates a setup.py for pip installation

Usage

Basic Example

import cymbal

# Create a Cymbal instance
with cymbal.Cymbal() as c:
    # Index a repository
    stats = c.index("/path/to/your/repository")
    print(f"Indexing result: {stats}")
    
    # Search for symbols
    results = c.search("handleAuth", limit=10)
    for symbol in results:
        print(f"{symbol.name} ({symbol.kind}) at {symbol.file}:{symbol.start_line}")
    
    # Investigate a specific symbol
    investigation = c.investigate("UserModel")
    print(f"Definition: {investigation.definition}")
    print(f"References: {len(investigation.references)}")
    
    # Find references to a symbol
    references = c.find_references("DatabaseConnection", limit=20)
    for ref in references:
        print(f"Reference at {ref.file}:{ref.line}")

Convenience Functions

import cymbal

# Index a repository (one-liner)
stats = cymbal.index_repository("/path/to/repo")

# Search with existing database path
results = cymbal.search_symbols("config", limit=15, db_path="/path/to/index.db")

# Investigate a symbol
investigation = cymbal.investigate_symbol("ApiClient", db_path="/path/to/index.db")

Advanced Usage

import cymbal

# Reuse an existing index
c = cymbal.Cymbal()  # Create the instance first so close() is always safe to call
try:
    c.db_path = "/path/to/existing/index.db"  # Point at an existing database

    # Perform searches
    results = c.search("test", limit=5)

    # Process results
    for symbol in results:
        print(f"Found: {symbol.name} in {symbol.file}")

except Exception as e:
    print(f"Error: {e}")
finally:
    c.close()  # Release resources even if a search fails

API Reference

cymbal.Cymbal Class

__init__(repo_path=None)

Create a new Cymbal instance. If repo_path is given, that repository is indexed immediately.

index(repo_path)

Index a repository. Returns statistics about the indexing operation.

search(query, limit=20)

Search for symbols matching the query. Returns a list of symbol results.

investigate(symbol_name)

Investigate a specific symbol. Returns an investigation result with definition and references.

find_references(symbol_name, limit=50)

Find references to a symbol. Returns a list of reference results.

db_path (property)

Get or set the current database path.

close()

Close the Cymbal instance and release resources.
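
The `with cymbal.Cymbal() as c:` form shown under Usage and an explicit close() call should be equivalent ways of releasing the instance. The sketch below illustrates that relationship with a stand-in class; `FakeIndex` is hypothetical and says nothing about how py-cymbal implements it internally.

```python
class FakeIndex:
    """Stand-in for cymbal.Cymbal; NOT the real class, it only mimics close()."""

    def __init__(self):
        self.closed = False

    def close(self):
        # The real close() would release the database and parser resources.
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Leaving the with-block always calls close(), even on an exception.
        self.close()
        return False  # do not swallow exceptions


with FakeIndex() as idx:
    pass  # work with the instance here
```

After the block exits, `idx.closed` is set, which is the same state an explicit `try`/`finally` with `close()` would leave behind.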

Convenience Functions

index_repository(repo_path)

Convenience function to index a repository.

search_symbols(query, limit=20, db_path=None)

Convenience function to search for symbols.

investigate_symbol(symbol_name, db_path=None)

Convenience function to investigate a symbol.

Architecture

The Python bindings are built using a layered architecture:

  1. Go Wrapper Layer (go/pycymbal/wrapper.go): Simplified Go API that exposes core Cymbal functionality with Python-friendly signatures.
  2. gopy Generated Layer: Automatically generated by gopy, providing the C bindings between Go and Python.
  3. Python API Layer (python/cymbal/__init__.py): Pythonic wrapper that provides a clean, intuitive interface for Python developers.

File Structure

py-cymbal/
├── go/                    # Go wrapper package
│   ├── pycymbal/         # Go wrapper source code
│   │   └── wrapper.go    # Simplified Go API for Python
│   ├── go.mod           # Go module definition
│   └── main.go          # Test program
├── python/              # Python bindings
│   └── cymbal/          # Python module
│       ├── __init__.py  # Python API layer
│       ├── pycymbal.py  # Generated Python bindings
│       ├── go.py        # Generated Go runtime support
│       ├── _pycymbal.so # Compiled C extension
│       └── pycymbal_go.so # Go shared library
├── examples/            # Usage examples
│   └── basic_usage.py  # Basic usage demonstration
├── build.sh            # Build automation script
├── setup.py            # pip installation configuration
└── README.md           # This file

Technical Details

CGO Dependencies

The bindings handle CGO dependencies including:

  • SQLite with FTS5: Enabled via -DSQLITE_ENABLE_FTS5 CFLAG
  • tree-sitter: C library for parsing multiple programming languages
  • Python C API: For Python extension module support

Type Mapping

gopy automatically handles type conversions between Go and Python:

  • Go structs → Python objects with attributes
  • Go slices → Python lists
  • Go errors → Python exceptions
  • Go interfaces → Python objects with methods
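
Because results arrive as objects with attributes, it is sometimes handy to copy them into plain dicts, for example before serializing to JSON. This is a minimal sketch: `symbol_to_dict` is a hypothetical helper, the field names are taken from the Basic Example above, and the `SimpleNamespace` object only stands in for a real search result.

```python
from types import SimpleNamespace


def symbol_to_dict(symbol, fields=("name", "kind", "file", "start_line")):
    """Copy selected attributes into a dict; missing attributes become None."""
    return {field: getattr(symbol, field, None) for field in fields}


# Stand-in for one item returned by c.search(...).
fake = SimpleNamespace(name="handleAuth", kind="function", file="auth.go", start_line=42)
row = symbol_to_dict(fake)
print(row)
```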

Resource Management

The bindings manage the following resources, which are released when close() is called or the context manager exits:

  • SQLite database connections
  • tree-sitter parser instances
  • Go garbage collection coordination with Python reference counting

Limitations and Known Issues

  1. Shared Library Loading: Both _pycymbal.so and pycymbal_go.so must sit in the same directory, and that directory must be on LD_LIBRARY_PATH.
  2. Platform Support: Primarily tested on Linux. macOS and Windows may require additional configuration.
  3. Memory Management: Large repositories may require significant memory for indexing.
  4. Concurrent Access: The SQLite database may have limitations with concurrent writes.

Troubleshooting

Import Error: "pycymbal_go.so: cannot open shared object file"

export LD_LIBRARY_PATH=/path/to/py-cymbal/python/cymbal:$LD_LIBRARY_PATH
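
If the error persists, a quick check of the two conditions from the Shared Library Loading limitation can narrow it down. The following is a rough diagnostic sketch; `diagnose` is a hypothetical helper, not something the package ships.

```python
import os
from pathlib import Path

REQUIRED_LIBS = ("_pycymbal.so", "pycymbal_go.so")


def diagnose(lib_dir, env=None):
    """Return a list of problems found; an empty list means nothing obvious is wrong."""
    env = os.environ if env is None else env
    lib_dir = Path(lib_dir).resolve()
    # Both shared libraries are expected in the same directory.
    problems = [
        f"missing {name} in {lib_dir}"
        for name in REQUIRED_LIBS
        if not (lib_dir / name).is_file()
    ]
    # That directory should appear among the LD_LIBRARY_PATH entries.
    raw = env.get("LD_LIBRARY_PATH", "")
    entries = [Path(p).resolve() for p in raw.split(os.pathsep) if p]
    if lib_dir not in entries:
        problems.append(f"{lib_dir} is not on LD_LIBRARY_PATH")
    return problems


if __name__ == "__main__":
    for problem in diagnose("python/cymbal"):
        print(problem)
```

Note that LD_LIBRARY_PATH generally needs to be exported before the Python process starts; changing it from inside a running interpreter is unlikely to affect library loading.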

Build Error: Missing Python.h

Install Python development headers:

# Fedora/RHEL/CentOS
sudo dnf install python3-devel

# Ubuntu/Debian
sudo apt-get install python3-dev

Build Error: SQLite FTS5 not enabled

The build script sets -DSQLITE_ENABLE_FTS5 automatically. If you're building manually, ensure this flag is set:

export CGO_CFLAGS="-DSQLITE_ENABLE_FTS5"

Development

Building Manually

cd go
# Adjust the include path to your Python version (python3-config --includes prints it)
export CGO_CFLAGS="-DSQLITE_ENABLE_FTS5 -I/usr/include/python3.12"
gopy gen -vm=python3 ./pycymbal
make build
cd ..
mv go/_pycymbal.so go/pycymbal.py go/go.py python/cymbal/
mv go/pycymbal_go.so python/cymbal/

Testing

# Run the example script
export LD_LIBRARY_PATH="$(pwd)/python/cymbal:$LD_LIBRARY_PATH"
python examples/basic_usage.py

Cleaning Build Artifacts

./build.sh clean  # If implemented in build script
# Or manually:
rm -rf python/cymbal/_pycymbal*.so python/cymbal/*.pyc __pycache__ build dist *.egg-info go/pycymbal*.so go/_pycymbal*.so go/*.c go/*.py go/Makefile go/build.py

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Acknowledgments

  • Cymbal for the excellent code indexing tool
  • gopy for making Go-Python bindings possible
  • tree-sitter for robust parsing of multiple languages
