Transform arXiv papers into a single LaTeX prompt for LLMs

Project description

A command-line tool that transforms an arXiv paper into a single LaTeX source, suitable as a prompt for asking LLMs questions about the paper. It downloads the source files, automatically finds the main .tex file (the one containing \documentclass), and flattens multiple files into a single coherent source by resolving \input and \include commands. The tool can also remove LaTeX comments and appendix sections from the output, which helps shorten the prompt.
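To make the flattening step concrete, here is a minimal sketch of the idea, not the tool's actual implementation. The names `find_main_tex`, `flatten`, and `INCLUDE_RE` are hypothetical, and the sketch assumes all files live under one folder, that paths are relative to that folder, and that only the simple `\input{...}` and `\include{...}` forms need resolving. It does not skip commands that sit inside LaTeX comments.

```python
import re
from pathlib import Path

# Matches \input{name} and \include{name}. \includegraphics{...} is not matched
# because the opening brace must follow the command name directly.
INCLUDE_RE = re.compile(r"\\(?:input|include)\{([^}]+)\}")


def find_main_tex(folder: Path) -> Path:
    """Return the first .tex file (sorted by path) that contains \\documentclass."""
    for path in sorted(folder.rglob("*.tex")):
        if "\\documentclass" in path.read_text(encoding="utf-8", errors="ignore"):
            return path
    raise FileNotFoundError(f"no .tex file with \\documentclass in {folder}")


def flatten(path: Path, root: Path, seen: frozenset = frozenset()) -> str:
    """Inline \\input/\\include targets recursively, guarding against cycles."""
    if path in seen:
        return ""  # already being expanded higher up: break the cycle
    text = path.read_text(encoding="utf-8", errors="ignore")

    def replace(match: re.Match) -> str:
        name = match.group(1).strip()
        # LaTeX lets authors omit the .tex extension.
        target = root / (name if name.endswith(".tex") else name + ".tex")
        if not target.is_file():
            return match.group(0)  # leave unresolved commands untouched
        return flatten(target, root, seen | {path})

    return INCLUDE_RE.sub(replace, text)
```

Calling `flatten(find_main_tex(folder), folder)` on a folder of TeX files would return one string with the included files spliced in place of the commands that referenced them.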

Installation

pip install arxiv-to-prompt

Usage

Basic usage:

# Display LaTeX source with comments
arxiv-to-prompt 2303.08774

# Display LaTeX source without comments
arxiv-to-prompt 2303.08774 --no-comments

# Display LaTeX source without appendix sections
arxiv-to-prompt 2303.08774 --no-appendix

# Combine options (no comments and no appendix)
arxiv-to-prompt 2303.08774 --no-comments --no-appendix

# Wait up to 5 minutes for a lock if another process is already downloading this paper
arxiv-to-prompt 2303.08774 --lock-timeout 300

# Process a local folder containing TeX files (instead of downloading from arXiv)
arxiv-to-prompt --local-folder /path/to/tex/files

# List all sections (with subsections indented)
arxiv-to-prompt 2307.09288 --list-sections
# Introduction
# Pretraining
#   Pretraining Data
#   Training Details
#     Training Hardware \& Carbon Footprint
#   ...

# Extract specific sections
arxiv-to-prompt 2307.09288 --section "Introduction" --section "Pretraining"

# Ambiguous names show a helpful error
arxiv-to-prompt 2307.09288 --section "Human Evaluation"
# Warning: 'Human Evaluation' is ambiguous. Found at:
#   - Fine-tuning > RLHF Results > Human Evaluation
#   - Appendix > Additional Details for Fine-tuning > Human Evaluation
# Use path notation to disambiguate.

# Use path notation when the same name appears multiple times
arxiv-to-prompt 2307.09288 --section "Fine-tuning > RLHF Results > Human Evaluation"

# Copy to clipboard
arxiv-to-prompt 2303.08774 | pbcopy

# Combine with the `llm` library from https://github.com/simonw/llm to chat about the paper
arxiv-to-prompt 1706.03762 | llm -s "explain this paper"

You can use either the arXiv ID (e.g., 2303.08774) or the full URL (e.g., https://arxiv.org/abs/2303.08774). It will automatically download the latest version of the paper, so you don't need to specify the version.
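As an illustration of accepting either form, the sketch below pulls a bare identifier out of an ID or an `abs` URL. It is an assumption about how such normalization could work, not the tool's own code: `normalize_arxiv_id` and `ARXIV_ID_RE` are hypothetical names, only new-style identifiers such as 2303.08774 are recognized, and an explicit version suffix is simply dropped.

```python
import re

# New-style arXiv identifiers: four digits, a dot, four or five digits,
# optionally followed by a version suffix such as v2.
ARXIV_ID_RE = re.compile(r"(\d{4}\.\d{4,5})(v\d+)?")


def normalize_arxiv_id(value: str) -> str:
    """Extract a bare arXiv ID (without version suffix) from an ID or URL."""
    match = ARXIV_ID_RE.search(value)
    if match is None:
        raise ValueError(f"not a recognizable arXiv identifier: {value!r}")
    return match.group(1)


print(normalize_arxiv_id("https://arxiv.org/abs/2303.08774"))  # 2303.08774
print(normalize_arxiv_id("2303.08774v3"))                      # 2303.08774
```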

Python API

You can also use arxiv-to-prompt in your Python code:

from arxiv_to_prompt import process_latex_source

# Get LaTeX source with comments
latex_source = process_latex_source("2303.08774")

# Get LaTeX source without comments
latex_source = process_latex_source("2303.08774", keep_comments=False)

# Get LaTeX source without appendix sections
latex_source = process_latex_source("2303.08774", remove_appendix_section=True)

# Combine options (no comments and no appendix)
latex_source = process_latex_source("2303.08774", keep_comments=False, remove_appendix_section=True)

# Process LaTeX sources from a local folder (instead of downloading from arXiv)
latex_source = process_latex_source(local_folder="/path/to/tex/files")
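The calls above can be combined into a small helper that turns a paper into a ready-to-send prompt. This is a sketch under stated assumptions: `build_prompt` and `prompt_for_paper` are hypothetical helpers, the prompt wording is arbitrary, and only the `process_latex_source` arguments shown above are used. The `None` check is purely defensive, since the text above does not say what the function returns on failure.

```python
def build_prompt(latex_source: str, question: str) -> str:
    """Wrap flattened LaTeX and a question into one prompt string."""
    return (
        "Below is the LaTeX source of a paper.\n\n"
        f"{latex_source.strip()}\n\n"
        f"Question: {question.strip()}\n"
    )


def prompt_for_paper(arxiv_id: str, question: str) -> str:
    """Fetch a paper's LaTeX (no comments, no appendix) and build a prompt."""
    # Imported lazily so build_prompt stays usable without the package installed.
    from arxiv_to_prompt import process_latex_source

    source = process_latex_source(
        arxiv_id, keep_comments=False, remove_appendix_section=True
    )
    if source is None:  # defensive: failure behavior is an assumption here
        raise RuntimeError(f"could not obtain LaTeX source for {arxiv_id}")
    return build_prompt(source, question)
```

For example, `prompt_for_paper("1706.03762", "What problem does this paper address?")` would need network access and an installed arxiv-to-prompt; the resulting string can then be passed to whichever LLM client you use.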

Projects Using arxiv-to-prompt

Here are some projects and use cases that leverage arxiv-to-prompt:

  • arxiv-latex-mcp: MCP server that fetches and processes arXiv LaTeX sources for precise interpretation of mathematical expressions in papers.
  • arxiv-tex-ui: chat with an LLM about an arXiv paper by using the LaTeX source.
  • paper2slides: transform an arXiv paper into slides.
  • ArXivToPrompt: iOS app that allows users to easily extract LaTeX source from arXiv papers on their iPhone and copy it to the clipboard for use with LLM apps.

If you're using arxiv-to-prompt in your project, please submit a pull request to add it to this list!
