Export arXiv papers to PDF format

Project description

Arxiv Export

Arxiv Export is a Python library that allows you to search, download, and manage scientific articles from arXiv.org. It is useful for automating paper downloads and obtaining structured information about articles.

Installation

pip install arxiv-export-documents

Usage Example

import asyncio
from arxiv_export_documents import export_papers


async def main():
    search_query = "quantum computing"
    download_path = "./arxiv_papers"
    max_results = 5

    # export_papers is consumed with "async for": each paper is yielded
    # as it becomes available.
    async for paper in export_papers(
        search=search_query,
        path_download=download_path,
        max_results=max_results
    ):
        # Metadata exposed on each yielded paper.
        print(f"Downloaded paper: {paper.title}")
        print(f"Authors: {', '.join(paper.authors)}")
        print(f"Summary: {paper.summary}")
        print(f"Link: {paper.link}")
        print(f"Path: {paper.path}")
        print(f"Documents: {len(paper.documents)}")
        print(f"Exists: {paper.is_exist}")
        print("-" * 80)


if __name__ == "__main__":
    # Start the event loop and run the coroutine to completion.
    asyncio.run(main())

Features

  • Search for articles on arXiv using keywords.
  • Automatically download article PDFs.
  • Access metadata such as title, authors, abstract, link, and local path.
  • Manage multiple results with a single command.

Main Parameters

  • search: search string (e.g., "quantum computing").
  • path_download: path to save the PDFs.
  • max_results: maximum number of articles to download.
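
Because the results arrive through "async for", it can be handy to drain them into a plain list before further processing. The following is a minimal sketch of that pattern, not part of the library: FakePaper and fake_export_papers are hypothetical stand-ins that only mirror the three parameters listed above so the snippet runs offline, and collect is an illustrative helper.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FakePaper:
    """Hypothetical stand-in for the objects yielded by export_papers."""
    title: str
    authors: list = field(default_factory=list)


async def fake_export_papers(search: str, path_download: str, max_results: int):
    """Stand-in async generator taking the three documented parameters."""
    for i in range(max_results):
        await asyncio.sleep(0)  # hand control back to the event loop
        yield FakePaper(title=f"{search} paper {i + 1}", authors=["A. Author"])


async def collect(async_iterable) -> list:
    """Drain any async iterable into a list."""
    return [item async for item in async_iterable]


def run_demo() -> list:
    """Collect three fake papers and return their titles."""
    papers = asyncio.run(
        collect(
            fake_export_papers(
                search="quantum computing",
                path_download="./arxiv_papers",
                max_results=3,
            )
        )
    )
    return [paper.title for paper in papers]


if __name__ == "__main__":
    for title in run_demo():
        print(title)
```

With the real library installed, replacing fake_export_papers with export_papers in run_demo should be the only change needed, assuming it accepts the same keyword arguments as in the usage example.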

Vector Database for LLMs

The documents property of each paper provides a list of Document files intended for ingestion into a vector database. Such documents are commonly used to supply structured data to large language models (LLMs), supporting semantic search and advanced analysis.
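
To illustrate what ingestion and semantic search look like, here is a toy sketch using only the standard library. Everything in it is an assumption made for illustration: Doc is a hypothetical stand-in for the items in paper.documents (the page_content and metadata attribute names are guesses, not confirmed by this README), the "embedding" is a plain word count rather than a learned model, and TinyVectorStore is not a real vector database.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for one item of paper.documents."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real setup would use a model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyVectorStore:
    """In-memory store: keeps (vector, document) pairs and ranks by cosine."""

    def __init__(self):
        self._items = []

    def add(self, docs):
        for doc in docs:
            self._items.append((embed(doc.page_content), doc))

    def search(self, query: str, k: int = 1):
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [doc for _, doc in ranked[:k]]


def demo() -> Doc:
    """Index two sample documents and return the best match for a query."""
    store = TinyVectorStore()
    store.add([
        Doc("Qubits and quantum error correction codes", {"page": 1}),
        Doc("Gradient descent for neural network training", {"page": 2}),
    ])
    return store.search("quantum error correction", k=1)[0]


if __name__ == "__main__":
    print(demo().metadata)
```

In practice the word-count embedding would be replaced by an embedding model and the list by a dedicated vector store; the add/search shape stays the same.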

License

This library is distributed under the MIT license.

Download files

Source Distribution

arxiv_export_documents-0.1.7.tar.gz (6.1 kB view details)

Uploaded Source

Built Distribution

arxiv_export_documents-0.1.7-py3-none-any.whl (6.6 kB view details)

Uploaded Python 3

File details

Details for the file arxiv_export_documents-0.1.7.tar.gz.

File metadata

  • Download URL: arxiv_export_documents-0.1.7.tar.gz
  • Size: 6.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.2

File hashes

Hashes for arxiv_export_documents-0.1.7.tar.gz
Algorithm    Hash digest
SHA256       ed089c5823a0d6dcecfbecfa35e1343978db7529d371ce83ce20310ebb11055e
MD5          86b5e1c4324c8088fd4ed5a242e21fa7
BLAKE2b-256  d8a5c4ff7af9d2e238b4395af3c745607967366e6de5edce8789e79104bfeb21
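
A downloaded archive can be checked against the SHA256 digest listed above with Python's hashlib. The sketch below is illustrative: sha256_of, verify, and self_check are hypothetical helper names, and the commented-out call at the end assumes the archive sits in the current directory.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with the expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


def self_check() -> bool:
    """Write a small temporary file and verify it against a known digest."""
    payload = b"hello"
    expected = hashlib.sha256(payload).hexdigest()
    fd, tmp_path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(payload)
        return verify(tmp_path, expected)
    finally:
        os.remove(tmp_path)


if __name__ == "__main__":
    print("self-check passed:", self_check())
    # Digest copied from the listing above; the local file path is an assumption.
    # verify("arxiv_export_documents-0.1.7.tar.gz",
    #        "ed089c5823a0d6dcecfbecfa35e1343978db7529d371ce83ce20310ebb11055e")
```

Reading in chunks keeps memory use flat regardless of file size, which matters little for a 6 kB archive but makes the helper reusable for larger downloads.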

File details

Details for the file arxiv_export_documents-0.1.7-py3-none-any.whl.

File hashes

Hashes for arxiv_export_documents-0.1.7-py3-none-any.whl
Algorithm    Hash digest
SHA256       fc90b3dda8b2cc795cf0bf670e1210b287f66f27ab23cd84ecb55f7cb9faa597
MD5          257cb4b3aa07667d5eda9ceda04dacae
BLAKE2b-256  6d191d8fc09a7eb37c5baeba108e6176faf9d565776328c4e51d0acdfe86e12d
