KARMA: Knowledge Assessment and Reasoning for Medical Applications

---

Documentation: https://karma.eka.care

Source Code: https://github.com/eka-care/KARMA-OpenMedEvalKit


KARMA provides a unified package for evaluating medical AI systems, supporting text, image, and audio-based models. The framework includes support for 12 medical datasets and offers standardized evaluation metrics commonly used in healthcare AI research.

The key features are:

  • Fast: High-throughput evaluation that processes thousands of medical examples efficiently
  • Easy: Designed to be easy to use and learn. Less time reading docs, more time evaluating models
  • Comprehensive: Support for 12+ medical datasets across multiple modalities (text, images, VQA)
  • Model Agnostic: Works with any model - Qwen, MedGemma, API providers (OpenAI, AWS Bedrock) or your custom architecture
  • Smart Caching: Intelligent result caching with DuckDB/DynamoDB backends for faster re-evaluations
  • Standards-based: Extensible architecture with registry-based auto-discovery of models and datasets

Installation

Install KARMA from PyPI:

pip install karma-medeval

Or install from source:

# Clone the repository
git clone https://github.com/eka-care/KARMA-OpenMedEvalKit.git
cd KARMA-OpenMedEvalKit

# Install with uv (recommended)
uv sync

# Or install with pip
pip install -e .

# source the environment
source .venv/bin/activate

Example

Evaluate your first medical AI model, using Qwen3 as an example:

$ karma eval --model "Qwen/Qwen3-0.6B" --datasets openlifescienceai/pubmedqa

Supported Models

KARMA depends on PyTorch and HuggingFace Transformers.

List the supported models with:

$ karma list models

Adding Custom Models

KARMA supports custom model integration through its registry system. See the Contributing section for details on adding new models.

Custom Model and Dataset Registration

KARMA uses a decorator-based registry system that makes it easy to add your own models and datasets for evaluation.

Registering a Model

Create a new model by inheriting from BaseHFModel, then call register_model_meta from registry.py with a ModelMeta instance.

See the sample implementation in qwen.py. Multiple models from the same family can now be registered this way.

Pass any model-specific inputs through loader_kwargs in ModelMeta. They must be declared as __init__ parameters to take effect; the model registry passes them to the constructor as keyword arguments.

import logging
from typing import Optional

from karma.models.base_model_abs import BaseHFModel
from karma.data_models.model_meta import ModelMeta, ModelType, ModalityType
from karma.registries.model_registry import register_model_meta

logger = logging.getLogger(__name__)

class MyCustomModel(BaseHFModel):
    """Custom model implementation."""

    def __init__(
        self,
        model_name_or_path: str,
        device: str = "mps",
        max_tokens: int = 32768,
        temperature: float = 0.7,
        top_p: float = 0.9,
        top_k: Optional[int] = None,
        enable_thinking: bool = True,
        **kwargs,
    ):
        # Forward the generation settings (supplied via loader_kwargs) to the base class.
        super().__init__(
            model_name_or_path=model_name_or_path,
            device=device,
            max_tokens=max_tokens,
            temperature=temperature,
            top_p=top_p,
            top_k=top_k,
            enable_thinking=enable_thinking,
            **kwargs,
        )

    ...
  
my_custom_model = ModelMeta(
    name="Qwen/Qwen3-1.7B",
    description="QWEN model",
    loader_class="karma.models.custom.MyCustomModel",
    loader_kwargs={
        "temperature": 0.7,
        "top_k": 50,
        "top_p": 0.9,
        "enable_thinking": True,
        "max_tokens": 256,
    },
    revision=None,
    reference=None,
    model_type=ModelType.TEXT_GENERATION,
    modalities=[ModalityType.TEXT],
    n_parameters=None,
    memory_usage_mb=None,
    max_tokens=None,
    embed_dim=None,
    framework=["PyTorch", "Transformers"],
)
register_model_meta(my_custom_model)
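
The flow from loader_kwargs to the constructor can be illustrated with a small standalone sketch. It does not use KARMA's actual classes: ToyModelMeta, ToyModel, register and load are hypothetical names, and the loader class is stored directly rather than as a dotted import path. It only shows why a key in loader_kwargs has to match an __init__ parameter, and how call-time overrides could take precedence over registered defaults.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Type


@dataclass
class ToyModelMeta:
    """Hypothetical stand-in for ModelMeta with only the fields needed here."""
    name: str
    loader_class: Type
    loader_kwargs: Dict[str, Any] = field(default_factory=dict)


class ToyModel:
    """Hypothetical model whose __init__ declares the tunable parameters."""

    def __init__(self, model_name_or_path: str, temperature: float = 0.7,
                 max_tokens: int = 1024, **kwargs: Any):
        self.model_name_or_path = model_name_or_path
        self.temperature = temperature
        self.max_tokens = max_tokens


# Hypothetical in-memory registry keyed by model name.
_REGISTRY: Dict[str, ToyModelMeta] = {}


def register(meta: ToyModelMeta) -> None:
    """Store the metadata so the model can be looked up by name later."""
    _REGISTRY[meta.name] = meta


def load(name: str, **overrides: Any) -> ToyModel:
    """Instantiate the loader class; overrides win over registered loader_kwargs."""
    meta = _REGISTRY[name]
    kwargs = {**meta.loader_kwargs, **overrides}
    return meta.loader_class(model_name_or_path=name, **kwargs)


register(ToyModelMeta("toy/model", ToyModel, {"temperature": 0.7, "max_tokens": 256}))
model = load("toy/model", temperature=0.5)
print(model.temperature, model.max_tokens)
```

In this sketch the registered max_tokens is kept while temperature is overridden at load time, which mirrors the role --model-kwargs appears to play on the command line.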

Registering a Custom Dataset

Create a new dataset by inheriting from BaseMultimodalDataset and using the @register_dataset decorator:

from typing import Optional

from karma.eval_datasets.base_dataset import BaseMultimodalDataset
from karma.registries.dataset_registry import register_dataset

@register_dataset(
    "my_custom_dataset",
    metrics=["exact_match", "accuracy"],
    task_type="mcqa",
    required_args=["domain"],
    optional_args=["split", "subset"],
    default_args={"split": "test"}
)
class MyCustomDataset(BaseMultimodalDataset):
    """Custom dataset implementation."""

    def __init__(self, domain: str, split: str = "test", subset: Optional[str] = None, **kwargs):
        # Store dataset-specific arguments before initializing the base class.
        self.domain = domain
        self.split = split
        self.subset = subset
        super().__init__(**kwargs)
    

Using Your Custom Components

After defining your custom model and dataset, use them with the CLI:

# Use your custom model and dataset
karma eval --model my_custom_model --model-path "path/to/model" \
  --datasets "my_custom_dataset" \
  --dataset-args "my_custom_dataset:domain=medical" \
  --model-kwargs '{"temperature":0.5}'

Registration Parameters

Model Registration:

  • name: Unique identifier for your model

Dataset Registration:

  • name: Unique identifier for your dataset
  • metrics: List of applicable metrics (e.g., ["exact_match", "bleu", "accuracy"])
  • task_type: Type of task ("mcqa", "vqa", "translation", "qa")
  • required_args: Arguments that must be provided when creating the dataset
  • optional_args: Arguments that can be provided but have defaults
  • default_args: Default values for arguments
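
How the registration parameters above might interact can be sketched with a toy decorator-based registry. This is an illustration under stated assumptions, not KARMA's implementation: toy_register_dataset, toy_create_dataset and ToyDataset are hypothetical names, and the validation rules (reject missing required arguments and unknown arguments, then merge defaults) are one plausible reading of the parameter descriptions.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical in-memory registry keyed by dataset name.
_DATASETS: Dict[str, Dict[str, Any]] = {}


def toy_register_dataset(name: str, metrics: List[str], task_type: str,
                         required_args: Optional[List[str]] = None,
                         optional_args: Optional[List[str]] = None,
                         default_args: Optional[Dict[str, Any]] = None
                         ) -> Callable[[type], type]:
    """Class decorator that records the dataset class and its metadata."""
    def decorator(cls: type) -> type:
        _DATASETS[name] = {
            "cls": cls,
            "metrics": metrics,
            "task_type": task_type,
            "required_args": required_args or [],
            "optional_args": optional_args or [],
            "default_args": default_args or {},
        }
        return cls
    return decorator


def toy_create_dataset(name: str, **user_args: Any) -> Any:
    """Validate user arguments against the registration, then instantiate."""
    entry = _DATASETS[name]
    missing = [a for a in entry["required_args"] if a not in user_args]
    if missing:
        raise ValueError(f"Missing required arguments for {name}: {missing}")
    allowed = set(entry["required_args"]) | set(entry["optional_args"])
    unknown = sorted(set(user_args) - allowed)
    if unknown:
        raise ValueError(f"Unknown arguments for {name}: {unknown}")
    # User-supplied values take precedence over registered defaults.
    return entry["cls"](**{**entry["default_args"], **user_args})


@toy_register_dataset("toy_dataset", metrics=["exact_match"], task_type="mcqa",
                      required_args=["domain"], optional_args=["split", "subset"],
                      default_args={"split": "test"})
class ToyDataset:
    def __init__(self, domain: str, split: str = "test",
                 subset: Optional[str] = None):
        self.domain, self.split, self.subset = domain, split, subset


ds = toy_create_dataset("toy_dataset", domain="medical")
print(ds.domain, ds.split)
```

Omitting domain in this sketch raises a ValueError before the class is instantiated, which is the kind of early failure that required_args is presumably meant to provide.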

Usage

List available resources:

karma list models
karma list datasets

Basic evaluation:

karma eval --model qwen --model-path "Qwen/Qwen3-0.6B"

Evaluate specific datasets:

karma eval --model qwen --model-path "Qwen/Qwen3-0.6B" --datasets "pubmedqa,medmcqa"

With dataset-specific arguments:

karma eval --model qwen --model-path "Qwen/Qwen3-0.6B" --datasets "in22conv" \
  --dataset-args "in22conv:source_language=en,target_language=hi"
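
The --dataset-args value follows a "dataset:key=value,key=value" shape. A minimal parser for that shape could look like the sketch below. The function name parse_dataset_args is hypothetical, and KARMA's own parsing (including how it handles several datasets or typed values) may differ; all values are kept as strings here.

```python
from typing import Dict


def parse_dataset_args(spec: str) -> Dict[str, Dict[str, str]]:
    """Parse 'dataset:key=value,key=value' into {dataset: {key: value}}."""
    name, sep, pairs = spec.partition(":")
    if not sep or not name.strip():
        raise ValueError(f"Expected 'dataset:key=value,...', got {spec!r}")
    parsed: Dict[str, str] = {}
    for pair in pairs.split(","):
        key, eq, value = pair.partition("=")
        if not eq or not key.strip():
            raise ValueError(f"Expected 'key=value', got {pair!r}")
        parsed[key.strip()] = value.strip()
    return {name.strip(): parsed}


args = parse_dataset_args("in22conv:source_language=en,target_language=hi")
print(args)
```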

Advanced options:

karma eval --model qwen --model-path "Qwen/Qwen3-0.6B" \
  --datasets "pubmedqa" --batch-size 16 --output results.json --no-cache

Configuration

KARMA supports environment-based configuration. Create a .env file:

# Cache configuration  
KARMA_CACHE_TYPE=duckdb
KARMA_CACHE_PATH=./cache.db

# Model configuration
HUGGINGFACE_TOKEN=your_token
LOG_LEVEL=INFO
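
As a rough illustration of environment-based configuration, the sketch below reads the variables listed above from a mapping such as os.environ. ToySettings and load_settings are hypothetical names, and the fallback values (other than DuckDB being the default cache) are assumptions made for the example. Note that os.environ does not read a .env file by itself; a loader would need to populate the environment first.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ToySettings:
    """Hypothetical container for the settings shown in the .env example."""
    cache_type: str
    cache_path: str
    huggingface_token: Optional[str]
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> ToySettings:
    """Read settings from a mapping, falling back to assumed defaults."""
    return ToySettings(
        cache_type=env.get("KARMA_CACHE_TYPE", "duckdb"),
        cache_path=env.get("KARMA_CACHE_PATH", "./cache.db"),  # assumed default
        huggingface_token=env.get("HUGGINGFACE_TOKEN"),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )


# Pass an explicit mapping so the example does not depend on the real environment.
settings = load_settings({"KARMA_CACHE_TYPE": "duckdb", "LOG_LEVEL": "debug"})
print(settings)
```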

Caching options

  • DuckDB (default) - for local development
  • DynamoDB - for production environments

Enable or disable caching:

karma eval --cache      # Enable (default)
karma eval --no-cache   # Disable
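
The idea behind result caching can be sketched independently of the storage backend. The example below is not KARMA's cache: cache_key and ToyResultCache are hypothetical, an in-memory dict stands in for DuckDB or DynamoDB, and the stored result is a placeholder. It shows one common approach, deriving a deterministic key from the model, dataset and parameters so that an identical re-evaluation can reuse the stored result.

```python
import hashlib
import json
from typing import Any, Callable, Dict


def cache_key(model: str, dataset: str, params: Dict[str, Any]) -> str:
    """Build a deterministic key; sort_keys makes it independent of dict order."""
    payload = json.dumps(
        {"model": model, "dataset": dataset, "params": params}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ToyResultCache:
    """Hypothetical in-memory cache standing in for a database backend."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}
        self.hits = 0

    def get_or_compute(self, key: str, compute: Callable[[], Any],
                       use_cache: bool = True) -> Any:
        """Return the stored result if present, otherwise compute and store it."""
        if use_cache and key in self._store:
            self.hits += 1
            return self._store[key]
        result = compute()
        if use_cache:
            self._store[key] = result
        return result


cache = ToyResultCache()
key = cache_key("Qwen/Qwen3-0.6B", "pubmedqa", {"temperature": 0.7})
first = cache.get_or_compute(key, lambda: {"status": "placeholder result"})
second = cache.get_or_compute(key, lambda: {"status": "recomputed"})
print(second, cache.hits)
```

Passing use_cache=False in this sketch always recomputes and stores nothing, which corresponds loosely to the --no-cache flag.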

Contributing

We welcome contributions to KARMA!

Adding New Components

KARMA uses a registry-based architecture that makes it easy to add:

  • New datasets - Extend BaseMultimodalDataset and register with @register_dataset
  • New models - Extend BaseHFModel and register a ModelMeta with register_model_meta
  • New metrics - Implement custom evaluation metrics
  • New processors - Add data preprocessing capabilities

See the existing implementations in karma/eval_datasets/ and karma/models/ for examples.
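
For orientation, a custom metric can be as small as a function that compares predictions with references. The sketch below implements a simple exact-match score as a standalone function. It is not tied to KARMA's metric interface, which is not described here, and the normalization (trimming whitespace and ignoring case) is an assumption chosen for the example.

```python
from typing import Sequence


def exact_match(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions equal to their reference after light normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    matches = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return matches / len(predictions)


# Two of the three predictions match their references after normalization.
score = exact_match(["Yes", "no ", "maybe"], ["yes", "no", "no"])
print(score)
```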

License

This project is licensed under the terms of the MIT license.

