
LLM-assisted PRISMA workflow for systematic literature review


pysyrev


pysyrev (PYthon SYstematic REView) is an automated, LLM-assisted PRISMA workflow for systematic literature reviews. It covers the full pipeline — from raw bibliographic records to screened, deduplicated, and thematically structured corpora — and produces a PDF report at the end.


Features

  • Multi-source ingestion — Web of Science (file or REST API), OpenAlex (file or REST API), Scopus, PubMed
  • Automatic deduplication — fuzzy title matching across sources
  • LLM-based title/abstract screening — multi-reviewer workflows with majority or mean voting, powered by any provider supported by LiteLLM (Anthropic, OpenAI, Ollama, LiteLLM proxy…)
  • Bibliographic network analysis — bibliographic coupling and co-citation graphs exported as GraphML
  • Topic modelling — BERTopic-based clustering with UMAP + HDBSCAN grid search, ranked by coherence scores
  • PDF report generation — declarative, theme-aware PDF engine built on ReportLab
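
As an illustration of the fuzzy title matching mentioned above, here is a minimal sketch built on the standard library's `difflib`. It is not pysyrev's implementation: the `normalize_title` and `deduplicate` helpers, the 0.9 threshold, and the sample records are all assumptions made for this example.

```python
from difflib import SequenceMatcher
import re


def normalize_title(title: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", title.lower())
    return " ".join(cleaned.split())


def deduplicate(records: list[dict], threshold: float = 0.9) -> list[dict]:
    """Keep the first record of each group of near-identical titles.

    `threshold` is an illustrative similarity cutoff, not a pysyrev default.
    """
    kept: list[dict] = []
    kept_titles: list[str] = []
    for record in records:
        title = normalize_title(record["title"])
        is_duplicate = any(
            SequenceMatcher(None, title, seen).ratio() >= threshold
            for seen in kept_titles
        )
        if not is_duplicate:
            kept.append(record)
            kept_titles.append(title)
    return kept


# Made-up records standing in for exports from different sources.
records = [
    {"source": "wos", "title": "Solar mini-grids in rural Africa: a review"},
    {"source": "scopus", "title": "Solar Mini-Grids in Rural Africa - A Review"},
    {"source": "pubmed", "title": "Air pollution and child health"},
]
unique = deduplicate(records)
print([r["source"] for r in unique])
```

The pairwise comparison is quadratic in the number of records, which is acceptable for a sketch but would likely need blocking or indexing on a large corpus.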

Pipeline stages

| Stage | Key | Description |
| --- | --- | --- |
| Bibliography | bib | Fetch, clean, filter, deduplicate, and optionally resolve references |
| LLM review | review | Screen documents against inclusion/exclusion criteria with one or more LLM reviewers |
| Bibliographic network | bib-network | Build coupling and co-citation networks from the included corpus |
| Topic modelling | topic-model | Cluster documents into topics using BERTopic; rank configurations by coherence |
| Report | topic-report | Generate a PDF report from the selected topic model run |
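
The LLM review stage combines the verdicts of several reviewers by majority or mean voting. The sketch below shows one plausible reading of those two rules. How pysyrev encodes reviewer outputs, breaks ties, or sets thresholds is not documented here, so the function names, the strict-majority rule, and the 0.5 threshold are assumptions.

```python
from statistics import mean


def majority_vote(decisions: list[bool]) -> bool:
    """Include when strictly more than half of the reviewers vote to include."""
    return sum(decisions) > len(decisions) / 2


def mean_vote(scores: list[float], threshold: float = 0.5) -> bool:
    """Include when the average reviewer score reaches the threshold."""
    return mean(scores) >= threshold


# Three hypothetical reviewers judging the same document.
print(majority_vote([True, True, False]))
print(mean_vote([0.9, 0.4, 0.3]))
```

Majority voting only needs a yes/no answer from each reviewer, while mean voting assumes each reviewer returns a numeric score, so it can let one confident reviewer outweigh two lukewarm ones.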

All sections are optional — only the stages declared in the config file are executed. Each stage auto-detects the most recent output of the previous one when run standalone.


Installation

Prerequisite: Python ≥ 3.10.

From PyPI

pip install pysyrev

To enable Plotly figure embedding in PDF reports:

pip install "pysyrev[plotly]"

From source

git clone <repo-url>
cd pysyrev
pip install -e .

Documentation

Documentation is available from here

Quick start

CLI

# Run all configured stages (only stages present in the config are executed)
python -m pysyrev config.yaml

# Run a single stage
python -m pysyrev config.yaml --stage bib
python -m pysyrev config.yaml --stage review
python -m pysyrev config.yaml --stage bib-network
python -m pysyrev config.yaml --stage topic-model
python -m pysyrev config.yaml --stage topic-report

If installed via setup.py, the pysyrev command is also available directly:

pysyrev config.yaml --stage topic-report
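
When the stages need to be driven from a script, the per-stage commands above can be assembled programmatically. The following sketch only builds the argument lists. `build_commands` is a hypothetical helper written for this example, not part of pysyrev, and it uses only the `python -m pysyrev <config> --stage <key>` form shown above.

```python
import sys

# Stage keys as listed in the "Pipeline stages" section of this README.
STAGES = ["bib", "review", "bib-network", "topic-model", "topic-report"]


def build_commands(config_path: str, stages: list[str]) -> list[list[str]]:
    """Return one `python -m pysyrev` argument list per requested stage."""
    unknown = [s for s in stages if s not in STAGES]
    if unknown:
        raise ValueError(f"Unknown stage(s): {unknown}")
    # Keep pipeline order regardless of the order the caller passed.
    ordered = [s for s in STAGES if s in stages]
    return [
        [sys.executable, "-m", "pysyrev", config_path, "--stage", stage]
        for stage in ordered
    ]


commands = build_commands("config.yaml", ["review", "bib"])
for argv in commands:
    # Print the command without the interpreter path.
    print(" ".join(argv[1:]))
    # To actually run it: subprocess.run(argv, check=True)
```

Passing each list to `subprocess.run(argv, check=True)` would stop the sequence at the first failing stage.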

Python API

from pysyrev import Pipeline

# Full pipeline in one call — runs only the stages declared in the config
pipeline = Pipeline.from_config("config.yaml")
pipeline.run()

# Or stage by stage — results persist on the instance between calls
pipeline.run(stages=["bib"])
pipeline.run(stages=["review"])        # uses pipeline.bib.dataset automatically
pipeline.run(stages=["topic-report"])  # generates the PDF report

# Access results
df_all    = pipeline.bib.dataset          # pd.DataFrame — all collected documents
df_kept   = pipeline.review.included_docs # pd.DataFrame — LLM-screened inclusions
network   = pipeline.network              # BibNetwork
topic     = pipeline.topic                # TopicModel
report    = pipeline.report               # TopicReport

Report-only run

A config containing only the topic_report (and optionally report and llm) sections is valid. This lets you generate or regenerate a report from a previous topic-model run without re-running the full pipeline:

# report_only.yaml
topic_report:
  run_dir: /path/to/topic_modeling/run_2026-05-01T120000/  # or leave blank to auto-detect
  model_index: 0
  export_to: /path/to/output/report/

Then run:

python -m pysyrev report_only.yaml

Configuration

A single YAML file controls all stages. Copy pysyrev/config_examples/config_template.yaml and fill in the sections you need. Sections not present in the file are simply skipped.

Key auto-detection rules (when fields are left blank):

| Blank field | Auto-detected from |
| --- | --- |
| review.doc_dataset | latest run in bib.export.export_dir |
| bib_network.doc_dataset | latest run in review.export.export_dir |
| topic_model.doc_dataset | latest run in review.export.export_dir |
| topic_report.run_dir | latest run in topic_model.export.export_dir |
| bib-network graphs in report | latest run in bib_network.export.export_dir |
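
To make the idea of a "latest run" concrete, here is a minimal sketch that picks the newest run directory inside an export directory. It assumes runs are stored as `run_<timestamp>` sub-directories, as the path in the report-only example suggests. pysyrev's actual detection logic may differ (for instance, it could rely on modification times), and `latest_run` is a hypothetical helper.

```python
import tempfile
from pathlib import Path
from typing import Optional


def latest_run(export_dir: Path) -> Optional[Path]:
    """Return the newest `run_<timestamp>` sub-directory, or None if there is none."""
    runs = [p for p in export_dir.glob("run_*") if p.is_dir()]
    # ISO-like timestamps sort chronologically when compared as strings.
    return max(runs, key=lambda p: p.name, default=None)


# Demonstrate on a throwaway directory with two fake runs.
with tempfile.TemporaryDirectory() as tmp:
    export_dir = Path(tmp)
    for name in ("run_2026-04-30T090000", "run_2026-05-01T120000"):
        (export_dir / name).mkdir()
    newest = latest_run(export_dir)
    newest_name = newest.name
    print(newest_name)
```

Sorting by name rather than by modification time keeps the result stable if an older run directory is touched or copied later.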

Getting started

See the tutorials/ folder for step-by-step Jupyter notebooks and annotated configuration examples covering each pipeline stage.




Contributing

Development and improvement

  • Benjamin Pillot
  • Théo Chamarande
  • Kevin Chapuis

Conceptualization and Coordination

  • Benjamin Pillot
