Interactive parameter configurator for the Hoodini CLI genomic neighborhood analysis tool

hoodini-colab

Designed for Google Colab, hoodini-colab lets you run genomic analyses in the cloud without any local installation.

📖 Documentation

Full documentation available at hoodini.bio/docs/colab

What is this?

hoodini-colab is an interactive Jupyter widget that makes it easy to configure and run genomic neighborhood analyses with Hoodini. Instead of remembering dozens of command-line parameters and flags, you get a visual interface where you can click, select, and configure everything through an intuitive web-based UI.

Built specifically for Google Colab, this tool allows researchers to run complex genomic analyses directly in their browser without installing any software locally. The widget handles all the complexity of installing Hoodini and its dependencies automatically through pixi, making it perfect for users who want to try Hoodini without setting up a local bioinformatics environment.

Key Features

The interface is organized into collapsible sections covering all aspects of Hoodini's functionality. You can configure remote BLAST searches, adjust neighborhood window sizes, select clustering methods, choose tree construction algorithms, and enable various annotation tools like PADLOC, DefenseFinder, and CCtyper. The launcher includes smart defaults for every parameter, so you can start with a basic analysis and only customize what you need.

Every parameter shows helpful descriptions explaining what it does, and the generated command updates instantly as you make changes. You can copy the command to run it manually later, or click the "Run" button to execute it immediately. The widget displays installation progress and analysis status, so you always know what's happening.

Installation

The easiest way to install hoodini-colab is directly from PyPI using pip:

pip install hoodini-colab

This will automatically install all required dependencies including anywidget, traitlets, and ipython. If you want to contribute to the development or modify the code, you can install it in editable mode:

git clone https://github.com/pentamorfico/hoodini-colab.git
cd hoodini-colab
pip install -e ".[dev]"

The development installation includes additional tools like ruff for linting and mypy for type checking.

Quick Start

The fastest way to try hoodini-colab is through Google Colab (recommended), where you don't need to install anything on your computer. Just click the badge below and the notebook will open in your browser:

Open In Colab

Once the notebook opens, run the cells in order. The first cell installs the package, and the second cell displays the interactive launcher widget where you can start configuring your analysis immediately.

⏱️ Note on execution time: The first time you run Hoodini, it needs to install the tool and download reference databases. This typically takes 5-10 minutes, depending on which annotation tools you enable (PADLOC, DefenseFinder, geNomad, eggNOG): the more tools you select, the more databases must be downloaded and the longer the initial setup. Subsequent runs within the same Google Colab session are much faster because everything is already installed. Every time a new session starts, however, the installation has to run again.

Local Jupyter Notebook

While this tool is optimized for Google Colab, you can also use it in a local Jupyter notebook:

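from IPython.display import display

from hoodini_colab import create_launcher

# Build the launcher widget and render it in the notebook output cell
launcher = create_launcher()
display(launcher)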

The widget will check for required dependencies and install Hoodini automatically through pixi if it's not already present on your system. When you click the run button, it handles the entire installation process in the background, downloads necessary databases, and executes your configured analysis.

Advanced Usage

If you need more control over the widget behavior, you can work with the HoodiniLauncher class directly. This allows you to programmatically access the generated command, monitor the execution status, or integrate the widget into more complex workflows:

from IPython.display import display

from hoodini_colab import HoodiniLauncher

launcher = HoodiniLauncher()

# Access the generated command at any time
print(launcher.command)

# Set up a callback to monitor status changes
def on_status_change(change):
    print(f"Status: {launcher.status_state} - {launcher.status_message}")

launcher.observe(on_status_change, names=['status_state'])

display(launcher)
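
If you want a record of status transitions rather than printed lines, the callback can append to a list instead. The following is a minimal sketch, not part of hoodini-colab: make_status_logger is a hypothetical helper, and it relies only on the fact that traitlets passes observers a dict-like change object with "old" and "new" keys. The status values a launcher actually emits are not documented here, so none are assumed.

```python
from datetime import datetime, timezone


def make_status_logger():
    """Return (callback, history); the callback appends each transition to history.

    Hypothetical helper for illustration, not part of the hoodini-colab API.
    """
    history = []

    def on_change(change):
        # traitlets hands observers a dict-like object with "old" and "new" keys
        history.append(
            {
                "time": datetime.now(timezone.utc).isoformat(),
                "old": change["old"],
                "new": change["new"],
            }
        )

    return on_change, history


# Wiring it up, assuming `launcher` is the HoodiniLauncher instance created above:
# callback, history = make_status_logger()
# launcher.observe(callback, names=["status_state"])
```

After a run, history holds one timestamped entry per status change, which is easier to inspect or save than scrolling back through cell output.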

Use Cases

Single Protein Analysis: When you want to explore the genomic neighborhood of a specific protein, select the "Single Input" mode and enter an NCBI protein ID like WP_000000001.1. Hoodini will use BLAST to automatically find homologous sequences and analyze their genomic neighborhoods. You can configure optional parameters such as remote BLAST e-values or window sizes, then click "Run Hoodini Analysis" to start the process. Note that this mode only works with NCBI protein IDs.

Custom Homolog List: If you want to analyze specific sequences rather than letting BLAST choose the homologs automatically, switch to "Input List" mode and provide your own list of sequence IDs with one per line. This gives you complete control over which sequences are included in the analysis. Unlike Single Input mode, you can use both NCBI protein IDs (like WP_000000001.1) and nucleotide IDs (like NZ_CP000001.1) in this mode. This is particularly useful when you already know which sequences you want to compare or when you want to reproduce specific analyses.
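
Before pasting a long list, it can help to check which entries look like protein IDs and which look like nucleotide IDs. The sketch below is an independent pre-check based on NCBI RefSeq accession prefixes, not Hoodini's own validation (which is not described here); classify_id and parse_id_list are hypothetical names, and accessions from other schemes simply come back as "unknown".

```python
import re

# RefSeq two-letter prefixes; accessions from other schemes are not covered.
PROTEIN_PREFIXES = {"AP", "NP", "YP", "XP", "WP"}
NUCLEOTIDE_PREFIXES = {"AC", "NC", "NG", "NM", "NR", "NT", "NW", "NZ", "XM", "XR"}

# Two uppercase letters, underscore, alphanumeric body, optional ".version"
ACCESSION = re.compile(r"^([A-Z]{2})_[A-Z0-9]+(?:\.\d+)?$")


def classify_id(seq_id):
    """Return 'protein', 'nucleotide', or 'unknown' for a RefSeq-style accession."""
    match = ACCESSION.match(seq_id)
    if match is None:
        return "unknown"
    prefix = match.group(1)
    if prefix in PROTEIN_PREFIXES:
        return "protein"
    if prefix in NUCLEOTIDE_PREFIXES:
        return "nucleotide"
    return "unknown"


def parse_id_list(text):
    """Split pasted text into one ID per line, skip blanks, and classify each ID."""
    ids = [line.strip() for line in text.splitlines() if line.strip()]
    return [(seq_id, classify_id(seq_id)) for seq_id in ids]


pasted = """
WP_000000001.1
NZ_CP000001.1
"""
for seq_id, kind in parse_id_list(pasted):
    print(seq_id, kind)
```

Entries reported as "unknown" are worth a second look for typos or stray whitespace before you run the analysis.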

Custom Coordinates: For more precise control over exactly which genomic regions to analyze, use "Input Sheet" mode. This lets you specify protein IDs along with their exact nucleotide coordinates, strand information, and assembly IDs in a tabular format. You can either fill in the table manually or paste TSV data directly.
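
If your coordinates already live in a script or spreadsheet export, you can generate the TSV text to paste rather than typing it. This is a rough sketch under stated assumptions: the column names and their order below are placeholders chosen for illustration, and the assembly ID and coordinates are made-up values. Check the headers shown in the Input Sheet table and adjust COLUMNS to match before pasting.

```python
import csv
import io

# Placeholder column names; replace with the headers shown in the widget's table.
COLUMNS = ["protein_id", "assembly_id", "start", "end", "strand"]


def rows_to_tsv(rows):
    """Serialize row dicts to tab-separated text, with basic sanity checks."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        if row["start"] > row["end"]:
            raise ValueError(f"start > end for {row['protein_id']}")
        if row["strand"] not in {"+", "-"}:
            raise ValueError(f"strand must be '+' or '-' for {row['protein_id']}")
        writer.writerow(row)
    return buffer.getvalue()


# Illustrative values only; these do not refer to real records.
example_rows = [
    {
        "protein_id": "WP_000000001.1",
        "assembly_id": "GCF_000000001.1",
        "start": 100,
        "end": 900,
        "strand": "+",
    },
]
print(rows_to_tsv(example_rows))
```

The checks catch swapped coordinates and malformed strand values early, before the data reaches the widget.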

Parameter Organization

The launcher organizes Hoodini's extensive set of parameters into logical categories to make configuration easier. Input and output settings let you specify file paths and directories. Remote BLAST options control e-values and the number of targets to retrieve when searching remote databases. Performance settings include thread count and NCBI API keys for faster database access.

Neighborhood window parameters determine how much sequence context to include around your target proteins. Clustering options control how similar sequences are grouped together. Tree construction methods let you choose from taxonomy-based trees, neighbor-joining, maximum likelihood, or various distance-based approaches.

Pairwise comparison settings configure ANI and AAI calculations, while annotation toggles enable tools like PADLOC for antiphage defense systems, DefenseFinder, CCtyper for CRISPR-Cas detection, and many others. Link configuration determines whether to compute protein and nucleotide similarity connections between neighborhoods.

Development

Setting up a development environment is straightforward. Clone the repository and install it in editable mode with the dev dependencies:

git clone https://github.com/pentamorfico/hoodini-colab.git
cd hoodini-colab
pip install -e ".[dev]"

The project uses ruff for fast Python linting and formatting. You can check code style with ruff check src/ and automatically format code with ruff format src/. Type checking is handled by mypy, which you can run with mypy src/.

The project structure follows modern Python packaging conventions with a src/ layout. All package code lives in src/hoodini_colab/, which includes the main widget class, utility functions for installation, and the JavaScript frontend code. Configuration is handled through pyproject.toml using the Hatchling build backend.

hoodini-colab/
├── src/
│   └── hoodini_colab/
│       ├── __init__.py
│       ├── widget.py          # Main widget class
│       ├── widget.js          # Frontend JavaScript
│       └── utils.py           # Installation utilities
├── pyproject.toml             # Modern Python packaging
├── README.md
├── LICENSE
└── .gitignore

🤝 Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

The project follows standard Python development practices with ruff for code style and mypy for type checking.

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

This project is built on top of anywidget, a modern framework for creating interactive Jupyter widgets. The configuration system uses traitlets, which provides a robust way to handle typed attributes and callbacks.
