arxiv-to-prompt

Transform arXiv papers into a single LaTeX prompt for LLMs

These details have been verified by PyPI

Maintainers

tksii

Project description

A command-line tool that transforms an arXiv paper into a single LaTeX source, suitable as a prompt for asking LLMs questions about the paper. It downloads the source files, automatically finds the main .tex file (the one containing \documentclass), and flattens multiple files into one coherent source by resolving \input and \include commands. It can also remove LaTeX comments and appendix sections from the output, which is useful for shortening the prompt.
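The flattening step described above can be pictured with a short Python sketch. This is not the tool's actual implementation: the function names `find_main_tex` and `flatten`, the regular expression, and the handling of missing files are all assumptions made for illustration, and cyclic includes are not guarded against.

```python
import re
from pathlib import Path

# Matches \input{name} and \include{name}; the tool's real pattern may differ.
INPUT_RE = re.compile(r"\\(?:input|include)\{([^}]+)\}")


def find_main_tex(folder: Path) -> Path:
    """Return the first .tex file (in sorted order) that contains \\documentclass."""
    for path in sorted(folder.rglob("*.tex")):
        if "\\documentclass" in path.read_text(encoding="utf-8", errors="ignore"):
            return path
    raise FileNotFoundError("no .tex file with \\documentclass found")


def flatten(path: Path, root: Path) -> str:
    """Recursively replace \\input/\\include commands with the referenced file's text."""
    text = path.read_text(encoding="utf-8", errors="ignore")

    def replace(match: re.Match) -> str:
        name = match.group(1).strip()
        target = root / (name if name.endswith(".tex") else name + ".tex")
        if not target.exists():
            return match.group(0)  # leave unresolved commands untouched
        return flatten(target, root)

    return INPUT_RE.sub(replace, text)
```

Calling `flatten(find_main_tex(folder), folder)` on a folder of TeX files returns one string with the included files spliced in place.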

Installation

pip install arxiv-to-prompt

Usage

# Display LaTeX source
arxiv-to-prompt 2303.08774

# Display LaTeX source without comments
arxiv-to-prompt 2303.08774 --no-comments

# Display LaTeX source without appendix sections
arxiv-to-prompt 2303.08774 --no-appendix

# Combine options (no comments and no appendix)
arxiv-to-prompt 2303.08774 --no-comments --no-appendix

# Copy to clipboard
arxiv-to-prompt 2303.08774 --copy # or -c

You can use either the arXiv ID (e.g., 2303.08774) or the full URL (e.g., https://arxiv.org/abs/2303.08774). It will automatically download the most recent version of the paper, so you don't need to specify the version. Downloaded papers are cached locally, so subsequent runs for the same paper will use the cached version without re-downloading.
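For scripts that collect paper references in mixed forms, it can help to normalize them to a bare ID before passing them on. The sketch below is a hypothetical helper, not part of arxiv-to-prompt; it assumes new-style identifiers such as 2303.08774 and does not handle old-style ones such as hep-th/9901001.

```python
import re

# New-style arXiv identifiers look like 2303.08774, optionally followed by v2, v3, ...
ARXIV_ID_RE = re.compile(r"(\d{4}\.\d{4,5})(v\d+)?")


def normalize_arxiv_id(id_or_url: str) -> str:
    """Extract the bare arXiv ID (without version suffix) from an ID or an arxiv.org URL."""
    match = ARXIV_ID_RE.search(id_or_url)
    if match is None:
        raise ValueError(f"could not find an arXiv ID in {id_or_url!r}")
    return match.group(1)
```

Dropping the version suffix matches the behavior described above, where the most recent version is fetched automatically.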

Advanced Options

# Force re-download even if the paper is already cached
arxiv-to-prompt 2303.08774 --force-download

# Process a local folder containing TeX files (instead of downloading from arXiv)
arxiv-to-prompt --local-folder /path/to/tex/files

# Cache locking is on by default (120s timeout); increase/decrease it if needed
arxiv-to-prompt 2303.08774 --lock-timeout 300

# List all sections (with subsections indented)
arxiv-to-prompt 2307.09288 --list-sections
# Introduction
# Pretraining
#   Pretraining Data
#   Training Details
#     Training Hardware \& Carbon Footprint
#   ...

# Extract specific sections
arxiv-to-prompt 2307.09288 --section "Introduction" --section "Pretraining"

# Ambiguous names produce a warning that lists every matching path
arxiv-to-prompt 2307.09288 --section "Human Evaluation"
# Warning: 'Human Evaluation' is ambiguous. Found at:
#   - Fine-tuning > RLHF Results > Human Evaluation
#   - Appendix > Additional Details for Fine-tuning > Human Evaluation
# Use path notation to disambiguate.

# Use path notation when the same name appears multiple times
arxiv-to-prompt 2307.09288 --section "Fine-tuning > RLHF Results > Human Evaluation"

# Output figure file paths instead of LaTeX text
arxiv-to-prompt 2303.08774 --figure-paths

# Figure paths from main body only (exclude appendix and commented-out figures)
arxiv-to-prompt 2303.08774 --figure-paths --no-appendix --no-comments

# Output only the abstract text
arxiv-to-prompt 2303.08774 --abstract

# Expand \newcommand and related macro definitions inline
arxiv-to-prompt 2303.08774 --expand-macros

# Combine with the `llm` library from https://github.com/simonw/llm to chat about the paper
arxiv-to-prompt 1706.03762 | llm -s "explain this paper"
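The path notation accepted by --section can be thought of as matching the trailing components of each section's full path. The sketch below illustrates that idea only; how the tool matches names internally is not documented here, and `resolve_section`, `SectionPath`, and the sample data are hypothetical.

```python
from typing import List, Sequence, Tuple

SectionPath = Tuple[str, ...]


def resolve_section(paths: Sequence[SectionPath], query: str) -> SectionPath:
    """Return the single section path whose trailing components match the query."""
    parts = tuple(p.strip() for p in query.split(">"))
    matches: List[SectionPath] = [p for p in paths if p[-len(parts):] == parts]
    if not matches:
        raise KeyError(f"no section named {query!r}")
    if len(matches) > 1:
        listing = "; ".join(" > ".join(m) for m in matches)
        raise ValueError(f"{query!r} is ambiguous. Found at: {listing}")
    return matches[0]


# Section paths taken from the --list-sections and ambiguity examples above.
SECTIONS = [
    ("Introduction",),
    ("Pretraining",),
    ("Pretraining", "Pretraining Data"),
    ("Fine-tuning", "RLHF Results", "Human Evaluation"),
    ("Appendix", "Additional Details for Fine-tuning", "Human Evaluation"),
]
```

With this data, a bare "Human Evaluation" raises an error listing both locations, while "Fine-tuning > RLHF Results > Human Evaluation" resolves to a single path.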

Python API

You can also use arxiv-to-prompt in your Python code:

from arxiv_to_prompt import process_latex_source

# Get LaTeX source with comments
latex_source = process_latex_source("2303.08774")

# Get LaTeX source without comments
latex_source = process_latex_source("2303.08774", keep_comments=False)

# Get LaTeX source without appendix sections
latex_source = process_latex_source("2303.08774", remove_appendix_section=True)

# Combine options (no comments and no appendix)
latex_source = process_latex_source("2303.08774", keep_comments=False, remove_appendix_section=True)

# Force re-download even if the paper is already cached
latex_source = process_latex_source("2303.08774", use_cache=False)

# Process LaTeX sources from a local folder (instead of downloading from arXiv)
latex_source = process_latex_source(local_folder="/path/to/tex/files")

# Get resolved figure file paths instead of LaTeX text
figure_paths = process_latex_source("2303.08774", figure_paths_only=True)

# Get only the abstract text
abstract = process_latex_source("2303.08774", abstract_only=True)

# Expand custom macro definitions inline
latex_source = process_latex_source("2303.08774", expand_macros_flag=True)
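Once the LaTeX source is available as a string, it usually needs to be wrapped together with a question before being sent to a model. The helper below is a minimal sketch of that step; `build_prompt`, its wording, and the character-based `max_chars` limit are assumptions for illustration and are not part of the arxiv-to-prompt API.

```python
def build_prompt(latex_source: str, question: str, max_chars: int = 200_000) -> str:
    """Wrap LaTeX source and a question into a single prompt string."""
    if len(latex_source) > max_chars:
        # Crude character-based truncation; a real limit depends on the model's tokenizer.
        latex_source = latex_source[:max_chars]
    return (
        "Below is the LaTeX source of a paper.\n\n"
        f"{latex_source}\n\n"
        f"Question: {question}"
    )


# Usage (requires network access and the package installed):
#   from arxiv_to_prompt import process_latex_source
#   source = process_latex_source("1706.03762", keep_comments=False)
#   print(build_prompt(source, "Explain this paper"))
```

Removing comments and the appendix before building the prompt is often a better way to stay under a size limit than truncating, since truncation cuts the source at an arbitrary point.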

Projects Using arxiv-to-prompt

Here are some projects and use cases that leverage arxiv-to-prompt:

arxiv-latex-mcp: MCP server that fetches and processes arXiv LaTeX sources for precise interpretation of mathematical expressions in papers.
arxiv-tex-ui: chat with an LLM about an arXiv paper using its LaTeX source.
paper2slides: transform an arXiv paper into slides.
notations-cli: extract notation tables from arXiv papers using LLMs, generating searchable HTML documentation of mathematical symbols and their definitions.
ArXivToPrompt: iOS app that allows users to easily extract LaTeX source from arXiv papers on their iPhone and copy it to the clipboard for use with LLM apps.

Release history

Version	Release date
0.13.3	Apr 29, 2026
0.13.2	Apr 10, 2026
0.13.1	Mar 26, 2026
0.13.0	Mar 14, 2026
0.12.1	Mar 14, 2026
0.12.0 (this version)	Mar 14, 2026
0.11.1	Mar 8, 2026
0.11.0	Feb 22, 2026
0.10.0	Feb 15, 2026
0.9.0	Feb 12, 2026
0.8.0	Feb 11, 2026
0.7.0	Feb 10, 2026
0.6.0	Feb 4, 2026
0.5.1	Jan 30, 2026
0.5.0	Jan 30, 2026
0.4.1	Jan 30, 2026
0.4.0	Jan 30, 2026
0.3.0	Dec 24, 2025
0.2.2	Jun 29, 2025
0.2.1	Jun 29, 2025
0.2.0	Jun 29, 2025
0.1.1	Mar 7, 2025
0.1.0	Feb 2, 2025

Download files

Download the file for your platform.

Source Distribution

arxiv_to_prompt-0.12.0.tar.gz (29.8 kB)

Uploaded Mar 14, 2026 Source

Built Distribution

arxiv_to_prompt-0.12.0-py3-none-any.whl (18.0 kB)

Uploaded Mar 14, 2026 Python 3

File details

Details for the file arxiv_to_prompt-0.12.0.tar.gz.

File metadata

Download URL: arxiv_to_prompt-0.12.0.tar.gz
Upload date: Mar 14, 2026
Size: 29.8 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for arxiv_to_prompt-0.12.0.tar.gz
Algorithm	Hash digest
SHA256	`70583c8b98abe6ed9eccf852ceacf44f0ca1dacc5676f4c95a805de0349b6b08`
MD5	`45255a0ebd3b9612814f36e89e9d54a6`
BLAKE2b-256	`693715b695a62c1186d388dd9844eac6e68ef225933da020d5c64c84c55de6be`
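A downloaded file can be checked against the published SHA256 digest with Python's standard `hashlib` module. The sketch below is one way to do it; the helper names `sha256_of` and `matches_published_hash` are illustrative, and the local file path is whatever location the archive was saved to.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 16) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with the published value."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage, with the SHA256 value listed above for the source distribution:
#   ok = matches_published_hash(
#       Path("arxiv_to_prompt-0.12.0.tar.gz"),
#       "70583c8b98abe6ed9eccf852ceacf44f0ca1dacc5676f4c95a805de0349b6b08",
#   )
```

SHA256 is the digest to rely on here; MD5 is listed for completeness but is not suitable for security checks.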

Provenance

The following attestation bundles were made for arxiv_to_prompt-0.12.0.tar.gz:

Publisher: publish.yml on takashiishida/arxiv-to-prompt

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: arxiv_to_prompt-0.12.0.tar.gz
- Subject digest: 70583c8b98abe6ed9eccf852ceacf44f0ca1dacc5676f4c95a805de0349b6b08
- Sigstore transparency entry: 1102207985
- Sigstore integration time: Mar 14, 2026
Source repository:
- Permalink: takashiishida/arxiv-to-prompt@75db0a6933b78c8db9dcca841c420039bba7d89e
- Branch / Tag: refs/tags/v0.12.0
- Owner: https://github.com/takashiishida
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@75db0a6933b78c8db9dcca841c420039bba7d89e
- Trigger Event: release

File details

Details for the file arxiv_to_prompt-0.12.0-py3-none-any.whl.

File metadata

Download URL: arxiv_to_prompt-0.12.0-py3-none-any.whl
Upload date: Mar 14, 2026
Size: 18.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for arxiv_to_prompt-0.12.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`fc8b29f46e8c9f7d8d0794db2ef2d1727ff0a7c74be8188b6d8eda1fef80776c`
MD5	`58be301bcad3cb01db3aae407e52fa8d`
BLAKE2b-256	`87910cee01d43ce06e944377fabcc31d8c330589ef1854bee351028270f9d408`

Provenance

The following attestation bundles were made for arxiv_to_prompt-0.12.0-py3-none-any.whl:

Publisher: publish.yml on takashiishida/arxiv-to-prompt

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: arxiv_to_prompt-0.12.0-py3-none-any.whl
- Subject digest: fc8b29f46e8c9f7d8d0794db2ef2d1727ff0a7c74be8188b6d8eda1fef80776c
- Sigstore transparency entry: 1102207988
- Sigstore integration time: Mar 14, 2026
Source repository:
- Permalink: takashiishida/arxiv-to-prompt@75db0a6933b78c8db9dcca841c420039bba7d89e
- Branch / Tag: refs/tags/v0.12.0
- Owner: https://github.com/takashiishida
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@75db0a6933b78c8db9dcca841c420039bba7d89e
- Trigger Event: release
