Python wrapper for the arXiv API: http://arxiv.org/help/api/

These details have not been verified by PyPI

Project links

Homepage

Project description

arxiv.py

Python wrapper for the arXiv API.

Quick links

Full package documentation
Example: fetching results: the most common usage.
Example: downloading papers
Example: fetching results with a custom client

About arXiv

arXiv is a project by the Cornell University Library that provides open access to 1,000,000+ articles in Physics, Mathematics, Computer Science, Quantitative Biology, Quantitative Finance, and Statistics.

Usage

Installation

$ pip install arxiv

In your Python script, include the line

import arxiv

Search

A Search specifies a search of arXiv's database.

arxiv.Search(
  query: str = "",
  id_list: List[str] = [],
  max_results: float = float('inf'),
  sort_by: SortCriterion = SortCriterion.Relevance,
  sort_order: SortOrder = SortOrder.Descending
)

query: An arXiv query string. Advanced query formats are documented in the arXiv API User's Manual.
id_list: A list of arXiv record IDs (typically of the format "0710.5765v1"). See the arXiv API User's Manual for documentation of the interaction between query and id_list.
max_results: The maximum number of results to be returned in an execution of this search. To fetch every result available, set max_results=float('inf') (default); to fetch up to 10 results, set max_results=10. The API's limit is 300,000 results.
sort_by: The sort criterion for results: relevance, lastUpdatedDate, or submittedDate.
sort_order: The sort order for results: 'descending' or 'ascending'.
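
For orientation, the arguments above correspond closely to the query parameters of the arXiv API itself (search_query, id_list, start, max_results, sortBy, sortOrder). The sketch below builds such a URL with the standard library only. It is not this package's implementation: build_query_url is a hypothetical helper, and how the package encodes requests internally is not described here.

```python
from urllib.parse import urlencode

# Public arXiv API query endpoint.
API_URL = "http://export.arxiv.org/api/query"


def build_query_url(query="", id_list=(), start=0, max_results=10,
                    sort_by="relevance", sort_order="descending"):
    """Hypothetical helper: encode search parameters as an arXiv API URL."""
    params = {
        "search_query": query,
        "id_list": ",".join(id_list),  # the API expects a comma-separated list
        "sortBy": sort_by,
        "sortOrder": sort_order,
        "start": start,
        "max_results": max_results,
    }
    return API_URL + "?" + urlencode(params)


# No request is sent; this only constructs and prints the URL.
url = build_query_url(query="quantum", max_results=10, sort_by="submittedDate")
print(url)
```

Building the URL by hand like this can be useful for checking a query in a browser before running it through the package.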

To fetch arXiv records matching a Search, use (Search).get() or (Client).get(search) to get a generator yielding Results.

Example: fetching results

Print the titles of the 10 most recent articles related to the keyword "quantum":

import arxiv

search = arxiv.Search(
  query = "quantum",
  max_results = 10,
  sort_by = arxiv.SortCriterion.SubmittedDate
)

for result in search.get():
  print(result.title)

Fetch and print the title of the paper with ID "1605.08386v1":

import arxiv

search = arxiv.Search(id_list=["1605.08386v1"])
paper = next(search.get())
print(paper.title)
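
Calling next() on a generator that yields nothing raises StopIteration, so a lookup that matches no record would fail at that line. A minimal sketch of a safer pattern follows. It uses a stand-in generator rather than a real search so that it runs without network access; first_or_none and fake_results are hypothetical names, not part of this package.

```python
def first_or_none(results):
    """Hypothetical helper: return the first item of an iterable, or None."""
    return next(iter(results), None)


def fake_results(titles):
    """Stand-in for a result generator; yields plain strings."""
    for title in titles:
        yield title


found = first_or_none(fake_results(["Example title"]))
missing = first_or_none(fake_results([]))
print(found, missing)
```

Assuming the generator returned by (Search).get() behaves like any other Python generator, it could be passed to first_or_none in the same way.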

Result

The Result objects yielded by (Search).get() include metadata about each paper and some helper functions for downloading their content.

The meaning of the underlying raw data is documented in the arXiv API User Manual: Details of Atom Results Returned.

result.entry_id: A URL of the form http://arxiv.org/abs/{id}.
result.updated: When the result was last updated.
result.published: When the result was originally published.
result.title: The title of the result.
result.authors: The result's authors, as arxiv.Authors.
result.summary: The result abstract.
result.comment: The authors' comment if present.
result.journal_ref: A journal reference if present.
result.doi: A URL for the resolved DOI to an external resource if present.
result.primary_category: The result's primary arXiv category. See arXiv: Category Taxonomy.
result.categories: All of the result's categories. See arXiv: Category Taxonomy.
result.links: Up to three URLs associated with this result, as arxiv.Links.

They also expose helper methods for downloading papers: (Result).download_pdf() and (Result).download_source().
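
Because entry_id is described above as a URL of the form http://arxiv.org/abs/{id}, the bare record ID can be recovered with plain string handling. The helper below is a hypothetical sketch and operates on a string, not on a Result object.

```python
def short_id(entry_id):
    """Hypothetical helper: extract the arXiv ID from an abstract URL."""
    marker = "/abs/"
    if marker not in entry_id:
        raise ValueError("not an arXiv abstract URL: " + entry_id)
    # Everything after the first "/abs/" is the record ID, version included.
    return entry_id.split(marker, 1)[1]


paper_id = short_id("http://arxiv.org/abs/1605.08386v1")
print(paper_id)
```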

Example: downloading papers

To download a PDF of the paper with ID "1605.08386v1", run a Search and then use (Result).download_pdf():

import arxiv

paper = next(arxiv.Search(id_list=["1605.08386v1"]).get())
# Download the PDF to the PWD with a default filename.
paper.download_pdf()
# Download the PDF to the PWD with a custom filename.
paper.download_pdf(filename="downloaded-paper.pdf")
# Download the PDF to a specified directory with a custom filename.
paper.download_pdf(dirpath="./mydir", filename="downloaded-paper.pdf")

The same interface is available for downloading .tar.gz files of the paper source:

import arxiv

paper = next(arxiv.Search(id_list=["1605.08386v1"]).get())
# Download the archive to the PWD with a default filename.
paper.download_source()
# Download the archive to the PWD with a custom filename.
paper.download_source(filename="downloaded-paper.tar.gz")
# Download the archive to a specified directory with a custom filename.
paper.download_source(dirpath="./mydir", filename="downloaded-paper.tar.gz")
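
The examples above pass a dirpath such as "./mydir". Whether the download helpers create a missing directory is not stated here, so one cautious approach is to create it beforehand. The sketch below uses only the standard library; prepare_target is a hypothetical helper, and a temporary directory stands in for the real destination.

```python
import tempfile
from pathlib import Path


def prepare_target(dirpath, filename):
    """Hypothetical helper: create dirpath if needed and return the full path."""
    directory = Path(dirpath)
    directory.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    return directory / filename


# Demonstrate inside a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    target = prepare_target(Path(tmp) / "mydir", "downloaded-paper.pdf")
    created = target.parent.is_dir()

print(target.name, created)
```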

Client

A Client specifies a strategy for fetching results from arXiv's API; it encapsulates pagination and retry logic.

For most use cases the default client should suffice. You can construct it explicitly with arxiv.Client(), or use it via the (Search).get() method.

arxiv.Client(
  page_size: int = 100,
  delay_seconds: int = 3,
  num_retries: int = 3
)

page_size: the number of papers to fetch from arXiv per page of results. Smaller pages can be retrieved faster, but may require more round-trips. The API's limit is 2000 results per page.
delay_seconds: the number of seconds to wait between requests for pages. arXiv's Terms of Use ask that you "make no more than one request every three seconds."
num_retries: The number of times the client will retry a request that fails, either with a non-200 HTTP status code or with an unexpected number of results given the search parameters.
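
To make the roles of num_retries and delay_seconds concrete, here is a generic retry loop. It is an illustration under stated assumptions, not the client's actual code: fetch_with_retries is a hypothetical helper, a RuntimeError stands in for a failed request, and num_retries is assumed to count attempts made after the first one.

```python
import time


def fetch_with_retries(fetch, num_retries=3, delay_seconds=3, sleep=time.sleep):
    """Hypothetical helper: call fetch() until it succeeds or retries run out."""
    last_error = None
    for attempt in range(num_retries + 1):  # first try plus num_retries retries
        if attempt > 0:
            sleep(delay_seconds)  # wait before every retry
        try:
            return fetch()
        except RuntimeError as error:  # stand-in for a failed request
            last_error = error
    raise last_error


calls = []


def flaky_fetch():
    """Stand-in request that fails twice, then succeeds."""
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("simulated failure")
    return "page"


# Pass a no-op sleep so the demonstration finishes immediately.
page = fetch_with_retries(flaky_fetch, sleep=lambda seconds: None)
print(page, len(calls))
```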

Example: fetching results with a custom client

(Search).get() uses the default client settings. If you want to use a client you've defined instead of the defaults, use (Client).get(...):

import arxiv

big_slow_client = arxiv.Client(
  page_size = 1000,
  delay_seconds = 10,
  num_retries = 5
)

# Prints 1000 titles before needing to make another request.
for result in big_slow_client.get(arxiv.Search(query="quantum")):
  print(result.title)
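
The trade-off between page_size and delay_seconds can be estimated with simple arithmetic. The sketch below lists the (offset, count) pairs needed to cover a given number of results and the minimum time spent waiting between pages. page_requests is a hypothetical helper, and the total of 2500 results is an arbitrary figure chosen for illustration.

```python
def page_requests(total_results, page_size):
    """Hypothetical helper: list (offset, count) pairs covering total_results."""
    return [
        (offset, min(page_size, total_results - offset))
        for offset in range(0, total_results, page_size)
    ]


pages = page_requests(total_results=2500, page_size=1000)

# Delays occur between requests, so there is one fewer delay than pages.
delay_seconds = 10
min_wait_seconds = (len(pages) - 1) * delay_seconds

print(pages, min_wait_seconds)
```

With these settings, three requests cover the 2500 results and at least 20 seconds are spent waiting, not counting the time each request itself takes.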

Example: logging

To inspect this package's network behavior and API logic, configure an INFO-level logger.

>>> import logging, arxiv
>>> logging.basicConfig(level=logging.INFO)
>>> paper = next(arxiv.Search(id_list=["1605.08386v1"]).get()) # Logs:
INFO:arxiv.arxiv:Requesting 100 results at offset 0
INFO:arxiv.arxiv:Requesting page of results
INFO:arxiv.arxiv:Got first page; 1 of inf results available
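
logging.basicConfig(level=logging.INFO) configures the root logger, which also turns on INFO output from every other library in the process. A narrower alternative, sketched below with the standard library only, is to set the level on the package's own logger. The logger name "arxiv" is inferred from the "arxiv.arxiv" prefix in the log lines above and should be treated as an assumption.

```python
import logging
import sys

# Logger name inferred from the "arxiv.arxiv" prefix in the package's log lines.
package_logger = logging.getLogger("arxiv")
package_logger.setLevel(logging.INFO)

# Attach a handler so the records are written to stderr.
handler = logging.StreamHandler(sys.stderr)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
package_logger.addHandler(handler)

# Child loggers such as "arxiv.arxiv" inherit the level set on "arxiv".
child_enabled = logging.getLogger("arxiv.arxiv").isEnabledFor(logging.INFO)
print(child_enabled)
```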

Project details

Release history

2.4.1 (Mar 4, 2026)
2.4.0 (Jan 5, 2026)
2.3.2 (Jan 5, 2026)
2.3.2.dev15, pre-release (Jan 5, 2026)
2.3.1 (Nov 13, 2025)
2.3.0 (Nov 1, 2025)
2.2.0 (Apr 8, 2025)
2.1.3 (Jun 25, 2024)
2.1.2 (Jun 23, 2024)
2.1.1 (Jun 22, 2024)
2.1.0 (Dec 18, 2023)
2.0.0 (Oct 17, 2023)
1.4.8 (Jul 11, 2023)
1.4.7 (Apr 18, 2023)
1.4.6 (Apr 18, 2023)
1.4.5 (Apr 17, 2023)
1.4.4 (Apr 11, 2023)
1.4.3 (Feb 1, 2023)
1.4.2 (Aug 18, 2021)
1.4.1 (Jul 31, 2021)
1.4.0 (Jul 13, 2021)
1.3.0 (Jul 2, 2021)
1.2.0 (Apr 25, 2021)
1.1.0 (Apr 20, 2021)
1.0.2, this version (Apr 17, 2021)
1.0.1 (Apr 5, 2021)
1.0.0 (Apr 4, 2021)
0.5.4 (Apr 2, 2021)
0.5.3 (Feb 23, 2020)
0.5.2 (Feb 15, 2020)
0.5.1 (Jun 15, 2019)
0.5.0 (Jun 15, 2019)
0.4.0 (May 19, 2019)
0.3.1 (Dec 21, 2018)
0.3.0 (Dec 17, 2018)
0.2.3 (Jun 20, 2018)
0.2.2 (Jul 28, 2017)
0.2.1 (Jul 27, 2017)
0.1.1 (Sep 18, 2016)
0.1.0 (Jul 24, 2016)
0.0.3 (Nov 26, 2015)
0.0.2 (Nov 26, 2015)
0.0.1 (Nov 25, 2015)

Download files

Source Distribution

arxiv-1.0.2.tar.gz (10.2 kB)

Uploaded Apr 17, 2021 Source

Built Distribution

arxiv-1.0.2-py3-none-any.whl (9.9 kB)

Uploaded Apr 17, 2021 Python 3

File details

Details for the file arxiv-1.0.2.tar.gz.

File metadata

Download URL: arxiv-1.0.2.tar.gz
Upload date: Apr 17, 2021
Size: 10.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.0 setuptools/41.4.0 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/2.7.17

File hashes

Hashes for arxiv-1.0.2.tar.gz
Algorithm	Hash digest
SHA256	`71d4c92f634a6c6635d3589fd87a763338887c5b5bc2f45002e04eff17001d10`
MD5	`000a7259c988efab952ee0c9f2cae637`
BLAKE2b-256	`67f872535d089e374455101963ec199eefa5c2ca2ec573b87c963392653fbe25`

File details

Details for the file arxiv-1.0.2-py3-none-any.whl.

File metadata

Download URL: arxiv-1.0.2-py3-none-any.whl
Upload date: Apr 17, 2021
Size: 9.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.0 setuptools/41.4.0 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/2.7.17

File hashes

Hashes for arxiv-1.0.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8fbcc984e62ec804f7c6af2a1c13ceb4f9ce153d9d9e1f43027b31c6edb2cdf6`
MD5	`f4fcd5c63451f6ab56a115ceec764f1a`
BLAKE2b-256	`550f179248d74bf87dafdecd20efd5c39494edce26ebe38a25ea0173a7567a73`

arxiv 1.0.2
