
Streamline evaluation evidence mapping at scale with LLMs


IOMEval


IOMEval streamlines the mapping of IOM evaluation reports against strategic frameworks like the Strategic Results Framework (SRF) and the Global Compact for Migration (GCM). It uses LLMs to process PDF reports, extract key sections, and tag (match) them to framework components, turning dispersed, untagged evaluation documents into structured evidence maps that can be searched by framework component (for example, finding all evaluation findings mapped to a specific GCM objective).

The Challenge Addressed

UN agencies produce many evaluation reports. For IOM, this body of knowledge is extensive and varied, but putting it to practical use becomes more challenging as the volume increases. Critically, the metadata of IOM evaluation reports does not indicate which elements of the IOM Strategic Results Framework (SRF) or of the Global Compact for Migration (GCM) an evaluation addresses. This gap limits the ability to connect evaluation evidence with the organization's two key strategic frameworks.

Manual tagging of evaluation reports against the IOM SRF and the GCM is extremely challenging given the limited resources IOM has for evaluation in general. Time constraints on IOM evaluators and other staff are exacerbated by the shrinking of the organization's budget in the context of the broader “humanitarian reset”. In addition, tagging reports against these frameworks is cognitively taxing because of the sheer number of elements they contain (the GCM has 23 objectives; the SRF has more than one hundred outputs).

What This Enables

Addressing the “tagging” challenge enables the creation of evidence maps (visual tools that systematically display what evaluation and research exists for specific topics, and where evidence may be missing) that would otherwise not have been possible to produce. These maps in turn help answer questions like: Which framework elements are well covered by existing evaluations? Where are the knowledge gaps that future evaluation work should prioritize? Which themes have enough evidence for a dedicated synthesis report?
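
To make the coverage and gap questions concrete, here is a minimal sketch of how tagged reports could be summarized per framework element. It is not part of iomeval: the `coverage` helper, the `GCM-n` labels, and the sample tags are all hypothetical and made up for illustration.

```python
from collections import Counter

def coverage(tagged_reports, framework_elements):
    """Count how many reports map to each element; unmapped elements get 0."""
    counts = Counter(el for tags in tagged_reports.values() for el in set(tags))
    return {el: counts.get(el, 0) for el in framework_elements}

# Made-up tags for illustration only; these are not real evaluation results.
tagged = {
    "report_a": ["GCM-1", "GCM-7"],
    "report_b": ["GCM-7"],
    "report_c": ["GCM-7", "GCM-23"],
}
elements = [f"GCM-{i}" for i in range(1, 24)]  # the GCM has 23 objectives

cov = coverage(tagged, elements)
gaps = [el for el, n in cov.items() if n == 0]  # elements with no evidence yet
print(cov["GCM-7"], len(gaps))
```

Elements with a count of zero are candidate knowledge gaps, while high counts point to themes that may support a synthesis report.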

Key Features

  • Automated PDF Processing: Download and OCR evaluation reports
  • Intelligent Section Extraction: LLM-powered extraction of executive summaries, findings, conclusions, and recommendations
  • Strategic Framework Mapping: Map report content against the IOM Strategic Results Framework (outputs, enablers and cross-cutting priorities) and the Global Compact for Migration (objectives)
  • Checkpoint/Resume: Stop processing at any time and pick up where you left off without losing progress
  • Granular Control: Run the entire processing pipeline for a report in one command, or execute individual steps (PDF extraction, section identification, specific framework theme mapping) separately for more flexibility

Installation

Install from PyPI:

pip install iomeval

Or install the latest development version from GitHub:

pip install git+https://github.com/franckalbinet/iomeval.git

Configuration

Core Dependencies

iomeval relies on two key libraries:

  • mistocr: Powers the PDF-to-markdown conversion with intelligent OCR and heading hierarchy detection
  • lisette: A thin wrapper around litellm that provides access to all major LLM providers. By default, iomeval uses Anthropic models (Haiku for debugging, Sonnet for production)

API Keys

iomeval automatically loads API keys on import. You have two options:

Option 1: Environment variables (recommended for production)

export ANTHROPIC_API_KEY='your-key-here'
export MISTRAL_API_KEY='your-key-here'

Option 2: .env file (convenient for development)

Create a .env file in your project root:

ANTHROPIC_API_KEY=your-key-here
MISTRAL_API_KEY=your-key-here

Since lisette supports all major LLM providers via litellm, you can configure other providers (OpenAI, Google, etc.) by setting their respective API keys using either method.
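
Before a long run, it can help to confirm that the keys are actually visible to the Python process. The following is a minimal sketch using only the standard library. `missing_keys` is a hypothetical helper, not part of iomeval, and the two variable names are the ones shown above.

```python
import os

# Key names taken from the configuration examples above.
REQUIRED_KEYS = ("ANTHROPIC_API_KEY", "MISTRAL_API_KEY")

def missing_keys(required=REQUIRED_KEYS, env=None):
    """Return the names of required keys that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]

missing = missing_keys()
if missing:
    print("Missing API keys:", ", ".join(missing))
else:
    print("All required API keys are set.")
```

If you use another provider, adjust `REQUIRED_KEYS` to the variable names that provider expects.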

Quick Start

First, prepare your evaluation report metadata. Export the CSV file from the IOM Evaluation Repository, which contains the report records and their associated PDF links, then convert it to JSON:

from iomeval.readers import IOMRepoReader

reader = IOMRepoReader('evaluation-search-export.csv')
reader.to_json('evaluations.json')
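
For intuition, the conversion step amounts to turning tabular rows into JSON records. The sketch below shows that idea with the standard library only. It is not IOMRepoReader's implementation, which may rename, group, or validate fields. `csv_to_records` and the sample columns are hypothetical.

```python
import csv
import io
import json

def csv_to_records(csv_text):
    """Parse CSV text into a list of dicts, one per row, keyed by header."""
    return list(csv.DictReader(io.StringIO(csv_text)))

# Tiny made-up sample; the real export has its own columns.
sample = "Title,PDF\nExample report,https://example.org/report.pdf\n"
records = csv_to_records(sample)
print(json.dumps(records, indent=2))
```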

Now process an evaluation report through the complete pipeline:

from iomeval.readers import load_evals
from iomeval.pipeline import run_pipeline

evals = load_evals('evaluations.json')
url = "https://evaluation.iom.int/sites/g/files/tmzbdl151/files/docs/resources/Abridged%20Evaluation%20Report_%20Final_Olta%20NDOJA.pdf"

report = await run_pipeline(url, evals, 
                            pdf_dst='data/pdfs', 
                            md_dst='data/md', 
                            results_path='data/results', 
                            ocr_kwargs=dict(add_img_desc=False), 
                            model='claude-haiku-4-5')
report
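
The snippet above uses top-level `await`, which works in Jupyter and IPython. In a plain Python script, the coroutine has to be driven by an event loop, for example with `asyncio.run`. The sketch below shows the pattern with a stand-in coroutine, `fake_run_pipeline` (hypothetical), so that it runs without iomeval or API keys. In real use you would call `run_pipeline` in its place.

```python
import asyncio

async def fake_run_pipeline(url, evals, **kwargs):
    """Stand-in for run_pipeline (assumption): returns a small dict, not a report."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return {"url": url, "model": kwargs.get("model")}

async def main():
    # In real use, replace fake_run_pipeline with iomeval.pipeline.run_pipeline.
    return await fake_run_pipeline(
        "https://example.org/report.pdf", evals=[], model="claude-haiku-4-5"
    )

report = asyncio.run(main())  # creates an event loop, runs main(), closes the loop
print(report)
```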

The pipeline runs 7 steps: three that process the PDF (download → OCR → extract), then four that map the extracted content against each strategic framework component (SRF Enablers, SRF Cross-cutting Priorities, GCM Objectives, and SRF Outputs).

Progress is displayed as each step completes, and state is automatically saved after each stage for checkpoint/resume capability.
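
The general idea behind checkpoint/resume can be sketched as follows: persist the result of each completed step, and skip steps that are already recorded when the run restarts. This is a generic illustration, not iomeval's actual implementation. `run_steps`, the JSON state file, and the step names are assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path

def run_steps(steps, state_path):
    """Run named steps in order, skipping any already recorded in state_path."""
    state_path = Path(state_path)
    state = json.loads(state_path.read_text()) if state_path.exists() else {}
    for name, func in steps:
        if name in state:
            continue  # finished in an earlier run
        state[name] = func()
        state_path.write_text(json.dumps(state))  # save after every step
    return state

calls = []  # records which steps actually executed

def make_step(name):
    def step():
        calls.append(name)
        return f"{name} done"
    return step

steps = [(n, make_step(n)) for n in ("download", "ocr", "extract")]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "state.json"
    run_steps(steps[:2], path)       # simulate a run interrupted after two steps
    final = run_steps(steps, path)   # resume: only "extract" executes

print(calls)
```

Saving after every step means an interruption costs at most the step that was in progress.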

Note: The prompts used for extraction and framework mapping are available in the prompts directory.

Detailed Workflow

For more control over individual pipeline stages, see the module documentation:

  • Loading evaluation metadata: See readers for working with IOM evaluation data
  • Downloading and OCR: See downloaders and core for PDF processing
  • Section extraction: See extract for extracting executive summaries, findings, conclusions, and recommendations
  • Framework mapping: See mapper for mapping to SRF enablers, cross-cutting priorities, GCM objectives, and SRF outputs
  • Pipeline control: See pipeline for granular control over the full pipeline and checkpoint/resume functionality

Development

iomeval is built with nbdev, which means the entire library is developed in Jupyter notebooks. The notebooks serve as both documentation and source code.

Setup for development

git clone https://github.com/franckalbinet/iomeval.git
cd iomeval
pip install -e '.[dev]'

Key nbdev commands

nbdev_test          # Run tests in notebooks
nbdev_export        # Export notebooks to Python modules
nbdev_preview       # Preview documentation site
nbdev_prepare       # Export, test, and clean notebooks (run before committing)

Workflow

  1. Make changes in the .ipynb notebook files
  2. Run nbdev_prepare to export code and run tests
  3. Commit both notebooks and exported Python files
  4. Documentation is automatically generated from the notebooks

Learn more about nbdev’s literate programming approach in the nbdev documentation.

Contributing

Contributions are welcome! Please:

  • Follow the existing notebook structure
  • Run nbdev_prepare before submitting PRs
