Extract visible Overleaf comments from saved browser snapshots.

Extract Overleaf Comments

Extract visible comments from a Chrome-saved Overleaf page and optionally map Overleaf character offsets (data-pos) back to line numbers in a .tex file.

This is useful because Overleaf review-panel comments are not included in the downloaded LaTeX source archive.
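The offset-to-line mapping mentioned above can be sketched as follows. This is a minimal illustration, not the extractor's actual code: it assumes that a data-pos value is a 0-based character offset into the .tex source and that the local .tex file matches the text Overleaf had when the page was saved. The function names are hypothetical.

```python
from bisect import bisect_right


def build_line_starts(text: str) -> list[int]:
    """Return the character offset at which each line of `text` begins."""
    starts = [0]
    for index, char in enumerate(text):
        if char == "\n":
            starts.append(index + 1)
    return starts


def offset_to_line(line_starts: list[int], offset: int) -> int:
    """Map a 0-based character offset to a 1-based line number.

    Assumption: offsets count characters from the start of the file.
    """
    return bisect_right(line_starts, offset)


tex = "\\section{Intro}\nFirst line.\nSecond line.\n"
starts = build_line_starts(tex)
line_of_second = offset_to_line(starts, tex.index("Second"))
print(line_of_second)
```

Precomputing the line starts once keeps each lookup logarithmic, which matters only when a document carries many comments.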

Install

The extractor uses only the Python standard library.

From PyPI:

pip install extract-overleaf-comments

From a local checkout:

python3 -m pip install .

Then run:

extract-overleaf-comments --help

Development

python3 -m pip install -e ".[test]"
python3 -m pytest tests
python3 -m build
python3 -m twine check dist/*

Usage

  1. Open the Overleaf project.
  2. Open the review/comments panel.
  3. Use the browser's "Save page as..." feature and save the complete page.
  4. Zip the saved .html file and its companion _files/ directory, or pass a zip archive created by the browser.
  5. Run:
python3 src/overleaf_comment_extractor.py Archive.zip --tex main.tex --out-prefix comments

The command writes:

  • comments.csv
  • comments.md
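Step 4 of the list above can also be scripted. The sketch below assumes the Chrome convention of saving `Name.html` next to a `Name_files/` directory; the function name `zip_saved_page` is hypothetical and not part of the tool.

```python
import zipfile
from pathlib import Path


def zip_saved_page(html_path: Path, archive_path: Path) -> list[str]:
    """Zip a saved .html file together with its companion `_files` directory.

    Returns the archive member names that were written.
    """
    # Assumption: the companion directory is named "<stem>_files".
    files_dir = html_path.with_name(html_path.stem + "_files")
    written = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.write(html_path, html_path.name)
        written.append(html_path.name)
        if files_dir.is_dir():
            for path in sorted(files_dir.rglob("*")):
                if path.is_file():
                    # Store paths relative to the folder holding the .html file.
                    arcname = path.relative_to(html_path.parent).as_posix()
                    archive.write(path, arcname)
                    written.append(arcname)
    return written


# Example call (paths are placeholders):
# zip_saved_page(Path("Overleaf.html"), Path("Archive.zip"))
```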

To also create a compilable annotated LaTeX copy:

python3 src/overleaf_comment_extractor.py Archive.zip \
  --tex main.tex \
  --out-prefix comments \
  --comment-tex

This writes main_comments.tex by default. The annotated file contains red thread markers such as [T001] near the Overleaf character offsets and ends with an Extracted Overleaf Comments section that holds the full comment text.
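One way to place such markers is to insert them from the largest offset downward, so that earlier insertions never shift positions that are still pending. The sketch below illustrates that idea only; the `\textcolor{red}{...}` markup (which needs the xcolor package) and the function name are assumptions, not the tool's actual output.

```python
def insert_thread_markers(tex: str, offsets: dict[str, int]) -> str:
    """Insert a red marker for each thread ID at its character offset."""
    result = tex
    # Process the largest offset first so that pending offsets stay valid.
    ordered = sorted(offsets.items(), key=lambda item: item[1], reverse=True)
    for thread_id, offset in ordered:
        # Clamp offsets that fall outside the text (e.g. a stale snapshot).
        position = max(0, min(offset, len(result)))
        marker = "\\textcolor{red}{[" + thread_id + "]}"
        result = result[:position] + marker + result[position:]
    return result


annotated = insert_thread_markers("Alpha beta gamma.", {"T001": 6, "T002": 11})
print(annotated)
# Alpha \textcolor{red}{[T001]}beta \textcolor{red}{[T002]}gamma.
```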

To put comments directly in the PDF margin instead:

python3 src/overleaf_comment_extractor.py Archive.zip \
  --tex main.tex \
  --out-prefix comments \
  --comment-tex \
  --comment-placement margin

The margin mode groups all replies from one thread into one \marginpar note in \footnotesize text. The default is --comment-placement appendix.
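As a rough illustration of the margin mode, the sketch below builds one \marginpar note per thread from a list of (author, text) replies. The exact layout the tool emits is not documented here, so the bold author names, the \newline separators, and both function names are assumptions.

```python
def escape_latex(text: str) -> str:
    """Escape characters that are special in LaTeX, one character at a time."""
    replacements = {
        "\\": r"\textbackslash{}",
        "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#",
        "_": r"\_", "{": r"\{", "}": r"\}",
        "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
    }
    return "".join(replacements.get(char, char) for char in text)


def thread_to_marginpar(thread_id: str, replies: list[tuple[str, str]]) -> str:
    """Group all replies of one thread into a single \\marginpar note."""
    parts = [
        f"\\textbf{{{escape_latex(author)}}}: {escape_latex(body)}"
        for author, body in replies
    ]
    body = " \\newline ".join(parts)
    return "\\marginpar{\\footnotesize [" + thread_id + "] " + body + "}"


note = thread_to_marginpar("T001", [("Ann", "Fix 50% claim"), ("Bo", "Done_")])
print(note)
```

Escaping character by character avoids re-escaping the backslashes that earlier replacements introduce.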

By default, this annotated copy is made standalone: the original document class and package list are replaced by a minimal article setup, and missing figures are rendered as placeholders. This makes the review PDF compile even when the full private Overleaf project is not available locally. Use --preserve-comment-tex-preamble if you want to keep the original LaTeX class and package setup.
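The standalone behaviour can be pictured as swapping everything before \begin{document} for a small fixed preamble. The sketch below is an assumption about how that might look, not the tool's implementation: the package list is illustrative, and the `demo` option of graphicx is used here only as one known way to draw placeholder boxes without needing the image files.

```python
MINIMAL_PREAMBLE = "\n".join([
    r"\documentclass{article}",
    r"\usepackage[T1]{fontenc}",
    r"\usepackage{xcolor}",
    # The demo option replaces every \includegraphics with a placeholder box.
    r"\usepackage[demo]{graphicx}",
    "",
])


def make_standalone(tex: str) -> str:
    """Replace the original preamble with a minimal article preamble."""
    marker = r"\begin{document}"
    index = tex.find(marker)
    if index == -1:
        # No document body found: leave the input untouched.
        return tex
    return MINIMAL_PREAMBLE + tex[index:]


source = (
    "\\documentclass{revtex4-2}\n"
    "\\usepackage{siunitx}\n"
    "\\begin{document}\nHi\n\\end{document}\n"
)
print(make_standalone(source))
```

A document that relies on macros from its original class or packages would still fail to compile with such a preamble, which is presumably why --preserve-comment-tex-preamble exists.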

Example

The repository contains only a fake HTML snapshot and a fake .tex file. It does not contain private Overleaf projects or real manuscripts.

cd examples/fake_overleaf_save
zip -r ../fake_overleaf_save.zip .
cd ../..
python3 src/overleaf_comment_extractor.py examples/fake_overleaf_save.zip \
  --tex examples/fake_project.tex \
  --out-prefix examples/fake_comments \
  --comment-tex examples/fake_project_comments.tex

Release

PyPI publishing is configured through GitHub Actions Trusted Publishing. To publish a release:

  1. Create a pending publisher on PyPI for:
    • PyPI project: extract-overleaf-comments
    • Owner: adakite
    • Repository: extract-overleaf-comments
    • Workflow: publish-pypi.yml
    • Environment: pypi
  2. Tag a version and push the tag:
git tag v0.1.0
git push origin v0.1.0

Limitations

This tool parses comments that are present in the saved browser DOM. If Overleaf has not loaded a thread, or if a collapsed/truncated comment is not present in the DOM, the extractor cannot recover it from the HTML snapshot. For a more complete capture, make sure the review panel is open and the relevant comments are loaded before saving the page.
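To make the limitation concrete, the sketch below collects the text of every element that carries a data-pos attribute, using only the standard library. Only the attribute name comes from this README; which elements carry it and how Overleaf nests a thread are assumptions, and the sample markup is invented. Anything absent from the saved HTML simply never reaches the parser.

```python
from html.parser import HTMLParser

VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class DataPosCollector(HTMLParser):
    """Collect (offset, text) pairs for elements with a data-pos attribute."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.entries: list[tuple[int, str]] = []
        self._pos = None      # offset of the element currently being read
        self._depth = 0       # nesting depth inside that element
        self._chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return            # void elements have no closing tag
        if self._pos is not None:
            self._depth += 1
            return
        value = dict(attrs).get("data-pos")
        if value is not None and value.isdigit():
            self._pos, self._depth, self._chunks = int(value), 1, []

    def handle_endtag(self, tag):
        if self._pos is None or tag in VOID_TAGS:
            return
        self._depth -= 1
        if self._depth == 0:
            # Join text pieces and normalise whitespace.
            text = " ".join(" ".join(self._chunks).split())
            self.entries.append((self._pos, text))
            self._pos = None

    def handle_data(self, data):
        if self._pos is not None:
            self._chunks.append(data)


SAMPLE = (
    '<div data-pos="42"><span>Ann</span>'
    "<p>Please cite &amp; fix.<br>Thanks</p></div>"
    '<div>other</div><div data-pos="7">Short</div>'
)
collector = DataPosCollector()
collector.feed(SAMPLE)
collector.close()
print(collector.entries)
```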

Download files

Source Distribution

extract_overleaf_comments-0.1.0.tar.gz (10.3 kB)

Uploaded Source

Built Distribution

extract_overleaf_comments-0.1.0-py3-none-any.whl (10.2 kB)

Uploaded Python 3

File details

Details for the file extract_overleaf_comments-0.1.0.tar.gz.

Hashes for extract_overleaf_comments-0.1.0.tar.gz:

  • SHA256: 63e4fceff74d2f23df5de912ea64cce8b781c3b968ab0a04a399d39dafb74b8a
  • MD5: 93ec5d3c3fb48a9f5e1d93e8b39ebdb6
  • BLAKE2b-256: cfee9a5e0403ca720510a949c62ceb983b347525cfe523a28e6830908058f066

Provenance

The following attestation bundles were made for extract_overleaf_comments-0.1.0.tar.gz:

Publisher: publish-pypi.yml on adakite/extract-overleaf-comments

File details

Details for the file extract_overleaf_comments-0.1.0-py3-none-any.whl.

Hashes for extract_overleaf_comments-0.1.0-py3-none-any.whl:

  • SHA256: a5c0ed6be781f0582b9f7989fe0114e9f175b9104e0fd8a867bd1c6ff8f54643
  • MD5: 755beef9de542cee016e7cc869a3461a
  • BLAKE2b-256: 40725e581f13c389b73017e1cb27c420804648c20e91935199ab079012400038

Provenance

The following attestation bundles were made for extract_overleaf_comments-0.1.0-py3-none-any.whl:

Publisher: publish-pypi.yml on adakite/extract-overleaf-comments
