CLI for the arXiv API, for humans and agents alike

arxivy

A command-line interface for the arXiv API, designed for both human researchers and AI agents.

Part of a family of academic CLI tools (s2cli, dblpcli, openalexcli) that share a consistent interface and output conventions.

Installation

pip install arxivy

Or with uv:

uv pip install arxivy

Quick Start

Run directly without installing using uvx:

uvx arxivy search "attention is all you need"

Or after installing:

# Search for papers (shows table in terminal)
arxivy search "attention mechanism transformers"

# Get paper details
arxivy paper 1706.03762

# Export BibTeX
arxivy bibtex 1706.03762 >> references.bib

# Browse latest papers in a category
arxivy new cs.AI --limit 5

Output Formats

arxivy picks its output format from where the output is going, so the same command works for both humans and AI agents:

| Context | Default Output | Behavior |
|---|---|---|
| Terminal (interactive) | Human-readable table | Easy to scan and read |
| Piped to another command | Compact JSON | Machine-parseable for scripts |
| --json flag | Pretty JSON | Explicit JSON when you need it |
| --bibtex / -b flag | BibTeX | Ready for LaTeX |

# Terminal: shows a nice Rich table
arxivy search "transformers"

# Piped: automatically outputs JSON for jq, scripts, AI agents
arxivy search "transformers" | jq '.results[0].title'

# Explicit JSON (pretty-printed in terminal)
arxivy search "transformers" --json

# BibTeX output
arxivy search "transformers" --bibtex
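
The selection rules in the table above can be sketched in a few lines of Python. This is only an illustration of the idea, not arxivy's actual implementation: the function name `choose_format` and the precedence between `--bibtex` and `--json` when both are given are assumptions.

```python
import sys


def choose_format(is_tty: bool, json_flag: bool = False, bibtex_flag: bool = False) -> str:
    """Pick an output format following the table above (hypothetical helper)."""
    if bibtex_flag:
        return "bibtex"        # --bibtex / -b (assumed to win over --json)
    if json_flag:
        return "pretty-json"   # explicit --json
    if is_tty:
        return "table"         # interactive terminal
    return "compact-json"      # piped to another command


def detect_format(json_flag: bool = False, bibtex_flag: bool = False) -> str:
    """Same decision, using the real stdout to detect an interactive terminal."""
    return choose_format(sys.stdout.isatty(), json_flag, bibtex_flag)
```

Checking `sys.stdout.isatty()` is the usual way for a Python CLI to tell a terminal from a pipe, which is what makes the "auto-JSON when piped" behavior possible without extra flags.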

Commands

| Command | Description |
|---|---|
| arxivy search <query> | Search papers by keyword |
| arxivy paper <id>... | Get paper details (single = detail view, multiple = table) |
| arxivy bibtex <id>... | Export BibTeX citations |
| arxivy new <category> | Browse recent papers in a category |

arxivy search

Search papers across all of arXiv.

# Basic search
arxivy search "attention is all you need"

# Filter by arXiv category
arxivy search "transformers" --category cs.CL
arxivy search "transformers" -c cs.CL

# Limit results
arxivy search "deep learning" --limit 20
arxivy search "deep learning" -n 20

# Sort by submission date or last updated
arxivy search "diffusion models" --sort submittedDate
arxivy search "diffusion models" --sort lastUpdatedDate --order ascending

# Pagination
arxivy search "neural networks" --limit 10 --offset 20

# Show abstracts in the table
arxivy search "reinforcement learning" --abstract
arxivy search "reinforcement learning" -a

# Combine options
arxivy search "vision transformer" -c cs.CV -n 5 --sort submittedDate --json

Options:

| Flag | Short | Description |
|---|---|---|
| --limit | -n | Maximum number of results (default: 10) |
| --offset | | Pagination offset |
| --category | -c | Filter by arXiv category (e.g. cs.AI, math.CO, hep-ph) |
| --sort | | Sort by: relevance, lastUpdatedDate, submittedDate |
| --order | | Sort order: ascending, descending |
| --json | | Output as JSON |
| --bibtex | -b | Output as BibTeX |
| --abstract | -a | Include abstracts in table output |

arxivy paper

Fetch one or more papers by arXiv ID.

# Single paper: shows detailed panel with abstract, authors, links
arxivy paper 1706.03762

# Multiple papers: shows comparison table
arxivy paper 1706.03762 2010.11929 1810.04805

# Accepts full URLs
arxivy paper https://arxiv.org/abs/1706.03762

# Old-style arXiv IDs
arxivy paper hep-ph/0601001

# Versioned IDs (version is stripped automatically)
arxivy paper 1706.03762v7

# Export as JSON or BibTeX
arxivy paper 1706.03762 --json
arxivy paper 1706.03762 --bibtex

arxivy bibtex

Export BibTeX entries for one or more papers. This is a shortcut for arxivy paper <ids> --bibtex.

# Single paper
arxivy bibtex 1706.03762

# Multiple papers
arxivy bibtex 1706.03762 2010.11929 1810.04805

# Save to file
arxivy bibtex 1706.03762 2010.11929 > references.bib

# Append to existing file
arxivy bibtex 1810.04805 >> references.bib

BibTeX entries use @article when a journal reference is present, @misc otherwise. All entries include eprint, archiveprefix, and primaryclass fields for proper arXiv citation.
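
As a rough illustration of the rule above, the following Python sketch renders a paper record (in the shape shown under JSON Output Structure) as a BibTeX entry. It is not arxivy's own code: the function name `to_bibtex`, the citation key format, and the choice of `title`, `author`, `year`, and `journal` fields are assumptions made for the example.

```python
def to_bibtex(paper: dict) -> str:
    """Render one paper record as BibTeX (illustrative sketch only)."""
    # @article when a journal reference is present, @misc otherwise.
    entry_type = "article" if paper.get("journal_ref") else "misc"
    fields = {
        "title": paper["title"],
        "author": " and ".join(a["name"] for a in paper.get("authors", [])),
        "year": paper["published"][:4],  # ISO timestamp starts with the year
        "eprint": paper["arxiv_id"],
        "archiveprefix": "arXiv",
        "primaryclass": paper["primary_category"],
    }
    if entry_type == "article":
        fields["journal"] = paper["journal_ref"]
    # Hypothetical citation key: the arXiv ID with "/" replaced by "_".
    key = paper["arxiv_id"].replace("/", "_")
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@{entry_type}{{{key},\n{body}\n}}"
```

A real exporter would also need to escape special LaTeX characters in titles and author names, which this sketch leaves out.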

arxivy new

Browse the most recently submitted papers in an arXiv category.

# Latest cs.AI papers
arxivy new cs.AI

# Limit results
arxivy new cs.CL --limit 20
arxivy new cs.CL -n 20

# With abstracts
arxivy new cs.LG -n 5 --abstract

# Export as JSON or BibTeX
arxivy new math.CO --json
arxivy new hep-ph -n 10 --bibtex

arXiv ID Formats

arxivy automatically normalizes various ID formats:

| Input | Normalized to |
|---|---|
| 1706.03762 | 1706.03762 |
| 1706.03762v7 | 1706.03762 |
| https://arxiv.org/abs/1706.03762v7 | 1706.03762 |
| http://arxiv.org/abs/1706.03762 | 1706.03762 |
| hep-ph/0601001 | hep-ph/0601001 |
| hep-ph/0601001v2 | hep-ph/0601001 |
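
If you need the same normalization in your own scripts, a minimal Python sketch could look like the following. It covers only the input forms listed in the table; the function name `normalize_arxiv_id` and the regular expressions are assumptions for illustration, not arxivy's internals.

```python
import re

# Matches the http:// or https:// abstract-page prefix shown in the table.
_ABS_PREFIX = re.compile(r"^https?://arxiv\.org/abs/")
# Matches a trailing version suffix such as "v7".
_VERSION_SUFFIX = re.compile(r"v\d+$")


def normalize_arxiv_id(raw: str) -> str:
    """Strip the URL prefix and version suffix from an arXiv identifier."""
    cleaned = _ABS_PREFIX.sub("", raw.strip())
    return _VERSION_SUFFIX.sub("", cleaned)
```

Both new-style (`1706.03762`) and old-style (`hep-ph/0601001`) IDs pass through unchanged apart from the stripped prefix and version.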

arXiv Categories

Some commonly used arXiv categories:

| Category | Field |
|---|---|
| cs.AI | Artificial Intelligence |
| cs.CL | Computation and Language (NLP) |
| cs.CV | Computer Vision |
| cs.LG | Machine Learning |
| cs.RO | Robotics |
| cs.SE | Software Engineering |
| math.CO | Combinatorics |
| math.ST | Statistics Theory |
| stat.ML | Machine Learning (Statistics) |
| hep-ph | High Energy Physics - Phenomenology |
| quant-ph | Quantum Physics |
| cond-mat | Condensed Matter |

Full list: https://arxiv.org/category_taxonomy

JSON Output Structure

Search / New results

{
  "results": [
    {
      "arxiv_id": "1706.03762",
      "title": "Attention Is All You Need",
      "summary": "The dominant sequence transduction models...",
      "authors": [
        {"name": "Ashish Vaswani", "affiliation": "Google Brain"}
      ],
      "published": "2017-06-12T17:57:34Z",
      "updated": "2023-08-02T01:31:28Z",
      "categories": ["cs.CL", "cs.LG"],
      "primary_category": "cs.CL",
      "comment": "15 pages, 5 figures",
      "journal_ref": "Advances in Neural Information Processing Systems 30 (NIPS 2017)",
      "doi": "10.48550/arXiv.1706.03762",
      "pdf_url": "https://arxiv.org/pdf/1706.03762v7",
      "abstract_url": "https://arxiv.org/abs/1706.03762v7"
    }
  ],
  "meta": {
    "total_results": 100,
    "start_index": 0,
    "items_per_page": 10,
    "query": "attention is all you need"
  }
}
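
The `meta` block makes it possible to page through results from a script. The Python sketch below computes the next `--offset` value from a response, assuming that `start_index` mirrors `--offset` and `items_per_page` mirrors `--limit`; the source does not state this mapping explicitly, and the helper name `next_offset` is hypothetical.

```python
import json
from typing import Optional


def next_offset(meta: dict) -> Optional[int]:
    """Return the offset of the next page, or None when no results remain."""
    candidate = meta["start_index"] + meta["items_per_page"]
    return candidate if candidate < meta["total_results"] else None


# Example payload trimmed down to the fields used here.
payload = json.loads(
    '{"results": [], "meta": {"total_results": 100, "start_index": 0, '
    '"items_per_page": 10, "query": "attention is all you need"}}'
)
following = next_offset(payload["meta"])
```

With the sample `meta` values shown above, the next page would start at offset 10.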

Single paper

{
  "result": {
    "arxiv_id": "1706.03762",
    "title": "Attention Is All You Need",
    "...": "..."
  }
}

Errors

{
  "error": {
    "code": "NOT_FOUND",
    "message": "Paper not found: 9999.99999",
    "suggestion": "Check the arXiv ID format (e.g. 1706.03762 or hep-ph/0601001)",
    "documentation": "https://info.arxiv.org/help/api/index.html"
  }
}
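
Because all three shapes are plain JSON objects with a distinct top-level key, a consumer can handle them with one small function. The Python sketch below is one possible way to do that; `ArxivyError` and `unwrap` are hypothetical names introduced for this example and are not part of arxivy.

```python
import json
from typing import List, Optional


class ArxivyError(Exception):
    """Raised when the payload contains an "error" object (hypothetical class)."""

    def __init__(self, code: str, message: str, suggestion: Optional[str] = None) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.suggestion = suggestion


def unwrap(payload_text: str) -> List[dict]:
    """Return the papers in a payload as a list, raising on an error payload."""
    data = json.loads(payload_text)
    if "error" in data:
        err = data["error"]
        raise ArxivyError(err["code"], err["message"], err.get("suggestion"))
    if "result" in data:        # single-paper shape
        return [data["result"]]
    return data["results"]      # search / new shape
```

Returning a list in every successful case lets calling code treat single-paper and multi-paper responses the same way.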

Examples

Human Workflows

# Find the seminal transformer paper
arxivy search "attention is all you need" -n 5

# Look up a specific paper with full details
arxivy paper 1706.03762

# Export a bibliography for a literature review
arxivy bibtex 1706.03762 2010.11929 1810.04805 > transformers.bib

# Browse the latest ML papers
arxivy new cs.LG -n 20

# Search within a category, sorted by date
arxivy search "RLHF" -c cs.AI --sort submittedDate

# Read abstracts at a glance
arxivy search "chain of thought" -n 5 --abstract

AI Agent / Scripting Workflows

# Quick context gathering (auto-JSON when piped)
arxivy search "retrieval augmented generation" | jq '.results[:3]'

# Extract just titles
arxivy search "vision transformers" | jq -r '.results[].title'

# Get arXiv IDs for further processing
arxivy search "BERT" | jq -r '.results[].arxiv_id'

# Batch BibTeX export from a list of IDs
arxivy bibtex 1706.03762 1810.04805 2010.11929

# Get the latest paper in a category
arxivy new cs.AI -n 1 | jq '.results[0]'

# Check if a paper exists
arxivy paper 1706.03762 --json | jq '.result.title'
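
The same workflows can be driven from Python instead of a shell pipeline. The sketch below builds a command line from the flags documented above and parses the JSON output. It assumes `arxivy` is on the PATH and writes its JSON to standard output; the helper names `build_search_argv` and `search` are hypothetical, and how arxivy reports failures through its exit code is not covered by the source, so the sketch does not rely on it.

```python
import json
import subprocess
from typing import List, Optional


def build_search_argv(query: str, limit: int = 10,
                      category: Optional[str] = None,
                      sort: Optional[str] = None) -> List[str]:
    """Assemble an `arxivy search` command using only documented flags."""
    argv = ["arxivy", "search", query, "--limit", str(limit), "--json"]
    if category:
        argv += ["--category", category]
    if sort:
        argv += ["--sort", sort]
    return argv


def search(query: str, **options) -> dict:
    """Run the search and return the parsed JSON payload (requires arxivy installed)."""
    proc = subprocess.run(build_search_argv(query, **options),
                          capture_output=True, text=True)
    return json.loads(proc.stdout)
```

Passing the arguments as a list avoids shell quoting problems with queries that contain spaces or quotes.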

Rate Limiting

arXiv asks clients to wait at least 3 seconds between requests. arxivy enforces this automatically via proactive throttling — no action needed on your part. If the arXiv API returns a server error (5xx), arxivy retries with exponential backoff (up to 3 retries).
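
To make the two mechanisms concrete, here is a minimal Python sketch of proactive throttling combined with retries on 5xx responses. It is an illustration under stated assumptions rather than arxivy's code: the names `Throttle` and `call_with_retries`, the 1-second base delay, and the doubling schedule are all hypothetical; only the 3-second interval and the limit of 3 retries come from the description above.

```python
import time
from typing import Callable, Optional, Tuple


class Throttle:
    """Sleep as needed so that calls are at least `min_interval` seconds apart."""

    def __init__(self, min_interval: float = 3.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous call, if any

    def wait(self) -> None:
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def call_with_retries(request: Callable[[], Tuple[int, str]],
                      max_retries: int = 3, base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call `request()`, retrying 5xx statuses with exponential backoff."""
    for attempt in range(max_retries + 1):
        status, body = request()
        if status < 500 or attempt == max_retries:
            return status, body
        sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s with the assumed defaults
    raise AssertionError("unreachable")
```

The clock and sleep functions are injectable so the timing logic can be tested without actually waiting.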

Development

# Clone and install
git clone https://github.com/mrshu/arxivy.git
cd arxivy
uv sync --all-extras

# Run tests
uv run pytest -v

# Lint
uv run ruff check src/ tests/

Publishing

This project follows the same publishing approach as the sibling tools in this family.

Trusted Publishing (recommended)

Publishing to PyPI is automated via GitHub Actions using PyPI Trusted Publishing. The trusted publisher is configured once in PyPI (step 1); each new release then repeats steps 2 to 4:

  1. In PyPI, add a Trusted Publisher for:

    • Owner: mrshu
    • Repository: arxivy
    • Workflow: .github/workflows/publish.yml

  2. Bump version in pyproject.toml

  3. Update CHANGELOG.md

  4. Create and push a release tag:

git tag v0.1.0
git push origin v0.1.0

The publish workflow builds the package and runs uv publish.

Manual publish (fallback)

uv build
uv publish

Design Philosophy

Based on CLI best practices and consistent with the sibling tools:

  1. Human-first by default: Rich tables in terminals, detail panels for single papers
  2. Machine-friendly when piped: automatic compact JSON for scripts and AI agents
  3. Explicit overrides: --json and --bibtex flags when you need control
  4. No API key needed: arXiv's API is free and open
  5. Respectful throttling: proactive 3-second delays between requests, as arXiv requests

License

MIT
