ArXiv Search SDK

A powerful Python SDK for searching and retrieving arXiv papers.

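Features

🔍 Multi-field search: search by title, author, abstract, category, and other fields
🎯 Smart queries: automatically optimizes search queries to improve result relevance
📚 Category browsing: browse papers by subject category
🌐 Async support: high-performance asynchronous search built on asyncio
📊 Data models: type-safe data models built with Pydantic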

Installation

pip install arxiv-search

Quick start

Basic search

import asyncio
from arxiv_search import ArxivClient

async def main():
    async with ArxivClient() as client:
        # Search for machine learning papers
        papers = await client.search_ml_papers(max_results=10)
        
        for paper in papers:
            print(f"Title: {paper.title}")
            print(f"Authors: {', '.join(paper.authors)}")
            print(f"Abstract: {paper.abstract[:200]}...")
            print(f"PDF: {paper.pdf_url}")
            print("-" * 50)

# Run the async function
asyncio.run(main())

Advanced search

import asyncio
from arxiv_search import ArxivClient

async def main():
    async with ArxivClient() as client:
        # Combined search: title contains "transformer", abstract contains "attention"
        papers = await client.advanced_search(
            title_terms=["transformer"],
            abstract_terms=["attention"],
            categories=["cs.AI", "cs.LG"],
            max_results=20,
            sort_by="submittedDate"
        )
        
        for paper in papers:
            print(f"Title: {paper.title}")
            print(f"Categories: {', '.join(paper.categories)}")
            print(f"Published: {paper.published}")
            print()

asyncio.run(main())

Category search

import asyncio
from arxiv_search import ArxivClient

async def main():
    async with ArxivClient() as client:
        # Search for artificial intelligence papers
        ai_papers = await client.search_ai_papers(max_results=10)

        # Search for computer vision papers
        cv_papers = await client.search_cv_papers(max_results=10)

        # Search for natural language processing papers
        nlp_papers = await client.search_nlp_papers(max_results=10)
        
        print(f"Found {len(ai_papers)} AI papers")
        print(f"Found {len(cv_papers)} CV papers")
        print(f"Found {len(nlp_papers)} NLP papers")

asyncio.run(main())
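
The three category searches above are awaited one after another. Since they are independent, they could probably be issued concurrently with asyncio.gather. The sketch below shows only the pattern: fake_search is a hypothetical stand-in that simulates latency, not part of the SDK. With the real client you would pass client.search_ai_papers(max_results=10) and its siblings instead, assuming the client tolerates concurrent calls (this README does not say). arXiv also asks API clients to keep request rates low, so use concurrency sparingly.

```python
import asyncio
from typing import Dict, List


async def fake_search(label: str, max_results: int = 10) -> List[str]:
    """Hypothetical stand-in for a client.search_*_papers call."""
    await asyncio.sleep(0.01)  # simulate network latency
    return [f"{label} paper {i}" for i in range(max_results)]


async def search_all() -> Dict[str, List[str]]:
    # gather() schedules all three coroutines together and returns
    # their results in the same order as the arguments.
    ai, cv, nlp = await asyncio.gather(
        fake_search("AI", 3),
        fake_search("CV", 2),
        fake_search("NLP", 1),
    )
    return {"AI": ai, "CV": cv, "NLP": nlp}


if __name__ == "__main__":
    for name, papers in asyncio.run(search_all()).items():
        print(f"Found {len(papers)} {name} papers")
```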

Using query objects

import asyncio
from arxiv_search import ArxivClient, SearchQuery, SearchFieldQuery

async def main():
    async with ArxivClient() as client:
        # Build a complex query
        query = SearchQuery(
            original_query="deep learning transformers",
            field_queries=[
                SearchFieldQuery(field="ti", terms=["deep learning"]),
                SearchFieldQuery(field="abs", terms=["transformer", "attention"]),
                SearchFieldQuery(field="cat", terms=["cs.AI", "cs.LG"])
            ],
            max_results=15,
            sort_by="relevance"
        )
        
        papers = await client.search_papers(query)
        
        for paper in papers:
            print(f"Title: {paper.title}")
            print(f"ArXiv ID: {paper.arxiv_id}")
            print()

asyncio.run(main())
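
The field codes used above (ti, abs, cat) match the field prefixes of the public arXiv API, where a search expression looks like ti:term AND abs:term. How the SDK actually combines field queries is not documented here, so the following is only a sketch under an assumed policy: terms within one field are ORed, and fields are ANDed together. FieldQuery and build_search_query are hypothetical names, not SDK classes or functions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FieldQuery:
    """Hypothetical stand-in for the SDK's SearchFieldQuery."""
    field: str        # arXiv field prefix, e.g. "ti", "abs", "cat"
    terms: List[str]


def quote(term: str) -> str:
    # Wrap multi-word terms in double quotes so they are matched as a phrase.
    return f'"{term}"' if " " in term else term


def build_search_query(field_queries: List[FieldQuery]) -> str:
    """OR the terms inside each field, AND the fields together (assumed policy)."""
    groups = []
    for fq in field_queries:
        parts = [f"{fq.field}:{quote(term)}" for term in fq.terms]
        if not parts:
            continue  # skip fields with no terms
        groups.append(parts[0] if len(parts) == 1 else "(" + " OR ".join(parts) + ")")
    return " AND ".join(groups)


if __name__ == "__main__":
    print(build_search_query([
        FieldQuery("ti", ["deep learning"]),
        FieldQuery("abs", ["transformer", "attention"]),
        FieldQuery("cat", ["cs.AI", "cs.LG"]),
    ]))
```

Keeping the expression builder separate from the HTTP layer makes the query policy easy to test without network access.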

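Documentation

📚 Full API documentation

See the full API documentation for detailed method descriptions, parameter explanations, and more examples:

📖 API documentation

Quick reference

API documentation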

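ArxivClient class

The main client class, which provides all search functionality.

Methods

search_papers(query: SearchQuery) -> List[Paper]: run a search query
search_by_title(title_terms: List[str]) -> List[Paper]: search by title
search_by_author(author_names: List[str]) -> List[Paper]: search by author
search_by_abstract(abstract_terms: List[str]) -> List[Paper]: search by abstract
search_by_category(categories: List[str]) -> List[Paper]: search by category
advanced_search(...): advanced combined search
search_ai_papers(): search for artificial intelligence papers
search_ml_papers(): search for machine learning papers
search_cv_papers(): search for computer vision papers
search_nlp_papers(): search for natural language processing papers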
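
Because each search method returns its own list, results from several calls can overlap (a paper may match both a title search and an abstract search). A minimal sketch of merging such lists while dropping duplicates by arXiv ID follows. PaperStub and merge_unique are hypothetical helpers for illustration; the sketch only assumes that results expose an arxiv_id attribute, as the Paper model does.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class PaperStub:
    """Hypothetical stand-in exposing only the fields used here."""
    arxiv_id: str
    title: str


def merge_unique(*result_lists: Iterable[PaperStub]) -> List[PaperStub]:
    """Concatenate result lists, keeping the first occurrence of each arxiv_id."""
    seen = set()
    merged = []
    for results in result_lists:
        for paper in results:
            if paper.arxiv_id not in seen:
                seen.add(paper.arxiv_id)
                merged.append(paper)
    return merged


if __name__ == "__main__":
    by_title = [PaperStub("0000.00001", "A"), PaperStub("0000.00002", "B")]
    by_abstract = [PaperStub("0000.00002", "B"), PaperStub("0000.00003", "C")]
    for paper in merge_unique(by_title, by_abstract):
        print(paper.arxiv_id, paper.title)
```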

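Data models

Paper class

The data model representing an arXiv paper.

from datetime import datetime
from typing import Dict, List, Optional

from pydantic import BaseModel

class Paper(BaseModel):
    title: str                              # Paper title
    authors: List[str]                      # List of authors
    abstract: str                           # Paper abstract
    arxiv_id: str                           # arXiv ID
    published: datetime                     # Publication date
    updated: Optional[datetime]             # Last-updated date
    categories: List[str]                   # Subject categories
    pdf_url: str                            # PDF link
    entry_id: str                           # Full entry ID
    summary: Optional[str]                  # Paper summary
    links: List[Dict[str, str]]             # Related links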
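
The public arXiv API returns results as an Atom XML feed, so a model like this is presumably filled from Atom entry elements. The SDK's own parsing code is not shown in this README; the sketch below is one plausible mapping using only the standard library. PaperSketch, parse_entry, and the sample entry (with a deliberately fake ID) are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom namespace in ElementTree notation

# Made-up entry shaped like an arXiv Atom entry; the ID is not a real paper.
SAMPLE_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <id>http://arxiv.org/abs/0000.00000v1</id>
  <published>2025-01-02T03:04:05Z</published>
  <title>An  Example
   Title</title>
  <summary>Example abstract.</summary>
  <author><name>A. Author</name></author>
  <author><name>B. Author</name></author>
  <category term="cs.AI"/>
  <category term="cs.LG"/>
  <link title="pdf" href="http://arxiv.org/pdf/0000.00000v1" type="application/pdf"/>
</entry>"""


@dataclass
class PaperSketch:
    """Hypothetical subset of the Paper model."""
    title: str
    authors: List[str]
    abstract: str
    arxiv_id: str
    published: datetime
    categories: List[str]
    pdf_url: str
    entry_id: str


def parse_entry(xml_text: str) -> PaperSketch:
    entry = ET.fromstring(xml_text)
    entry_id = entry.findtext(f"{ATOM}id", default="")
    # The PDF link is the <link> element whose title attribute is "pdf".
    pdf_url = next(
        (link.get("href", "") for link in entry.findall(f"{ATOM}link")
         if link.get("title") == "pdf"),
        "",
    )
    published = datetime.strptime(
        entry.findtext(f"{ATOM}published", default=""), "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return PaperSketch(
        # Titles may be wrapped across lines, so collapse runs of whitespace.
        title=" ".join(entry.findtext(f"{ATOM}title", default="").split()),
        authors=[a.findtext(f"{ATOM}name", default="") for a in entry.findall(f"{ATOM}author")],
        abstract=entry.findtext(f"{ATOM}summary", default="").strip(),
        arxiv_id=entry_id.rsplit("/abs/", 1)[-1],
        published=published,
        categories=[c.get("term", "") for c in entry.findall(f"{ATOM}category")],
        pdf_url=pdf_url,
        entry_id=entry_id,
    )


if __name__ == "__main__":
    print(parse_entry(SAMPLE_ENTRY))
```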

SearchQuery class

Search query configuration.

class SearchQuery(BaseModel):
    original_query: str                     # Original query
    field_queries: List[SearchFieldQuery]   # Per-field queries
    max_results: int                        # Maximum number of results
    sort_by: str                            # Sort criterion
    sort_order: str                         # Sort order
    # ... other fields
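
The max_results, sort_by, and sort_order fields line up with the max_results, sortBy, and sortOrder parameters of the public arXiv API, which accepts relevance, lastUpdatedDate, or submittedDate as the sort key and ascending or descending as the order. Whether the SDK forwards these values unchanged is an assumption. The sketch below shows how such a request URL could be assembled and validated; build_request_url is a hypothetical helper, not an SDK function.

```python
from urllib.parse import urlencode

ARXIV_API_URL = "http://export.arxiv.org/api/query"  # public arXiv API endpoint
VALID_SORT_BY = {"relevance", "lastUpdatedDate", "submittedDate"}
VALID_SORT_ORDER = {"ascending", "descending"}


def build_request_url(
    search_query: str,
    max_results: int = 10,
    sort_by: str = "relevance",
    sort_order: str = "descending",
    start: int = 0,
) -> str:
    """Validate the sort options and return a fully encoded query URL."""
    if sort_by not in VALID_SORT_BY:
        raise ValueError(f"sort_by must be one of {sorted(VALID_SORT_BY)}")
    if sort_order not in VALID_SORT_ORDER:
        raise ValueError(f"sort_order must be one of {sorted(VALID_SORT_ORDER)}")
    if max_results < 1:
        raise ValueError("max_results must be positive")
    params = {
        "search_query": search_query,
        "start": start,
        "max_results": max_results,
        "sortBy": sort_by,
        "sortOrder": sort_order,
    }
    # urlencode percent-encodes reserved characters such as ":" and quotes.
    return f"{ARXIV_API_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_request_url("cat:cs.AI", max_results=15, sort_by="submittedDate"))
```

Rejecting unknown sort values before the request is sent gives a clearer error than whatever the server would return.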

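Supported arXiv categories

Computer science (cs.*)

cs.AI - Artificial Intelligence
cs.LG - Machine Learning
cs.CV - Computer Vision
cs.CL - Computational Linguistics
cs.RO - Robotics
And more...

Mathematics (math.*)

math.ST - Statistics Theory
math.OC - Optimization and Control
math.PR - Probability
And more...

Physics (physics.*)

physics.comp-ph - Computational Physics
physics.data-an - Data Analysis
And more...

Statistics (stat.*)

stat.ML - Machine Learning
stat.ME - Methodology
And more...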
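
Category codes follow an archive.subject pattern (cs.AI is the AI subject within the cs archive), which makes it easy to group a paper's categories by top-level field. The helpers below are a small sketch of that idea; split_category and group_by_archive are hypothetical names and not part of the SDK.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def split_category(category: str) -> Tuple[str, str]:
    """Split 'cs.AI' into ('cs', 'AI'); a code without a dot yields an empty subject."""
    archive, _, subject = category.partition(".")
    return archive, subject


def group_by_archive(categories: List[str]) -> Dict[str, List[str]]:
    """Group category codes by archive, preserving input order."""
    groups = defaultdict(list)
    for category in categories:
        archive, _ = split_category(category)
        groups[archive].append(category)
    return dict(groups)


if __name__ == "__main__":
    print(group_by_archive(["cs.AI", "stat.ML", "cs.LG", "physics.comp-ph"]))
```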

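Contributing

Contributions are welcome! See CONTRIBUTING.md for details.

License

This project is released under the MIT License. See the LICENSE file for details.

Changelog

1.0.0 (2025-07-06)

Initial release
Basic search functionality
Multi-field search support
Async API
Command-line tool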

Project details

Release history

This version

0.1.0

Jul 5, 2025

Download files

Source Distribution

arxiv_search-0.1.0.tar.gz (19.2 kB view details)

Uploaded Jul 5, 2025 Source

Built Distribution

arxiv_search-0.1.0-py3-none-any.whl (15.1 kB view details)

Uploaded Jul 5, 2025 Python 3

File details

Details for the file arxiv_search-0.1.0.tar.gz.

File metadata

Download URL: arxiv_search-0.1.0.tar.gz
Upload date: Jul 5, 2025
Size: 19.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.18

File hashes

Hashes for arxiv_search-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`2585bf5dd7e9a602a975062d6ba88d7bc8da8d4238393ee01616eeec808ad19f`
MD5	`167cb28aa7dd4c3e81105f4a8809b71f`
BLAKE2b-256	`23f2f2e7025804bc20fcc440eaf3308364bc1bdd09a5a9415b0862bebd05bc56`

File details

Details for the file arxiv_search-0.1.0-py3-none-any.whl.

File metadata

Download URL: arxiv_search-0.1.0-py3-none-any.whl
Upload date: Jul 5, 2025
Size: 15.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.18

File hashes

Hashes for arxiv_search-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`22e9b83f68c8f4014577c6fbb200e2485cafac614f80847009f0f08adc5a9ac0`
MD5	`9e8f850bbf10452826b3ded0c0f6d944`
BLAKE2b-256	`0896c534d1d1c22e25cb63acc5e58e72ef8db694216af9881c38ac5b302084d0`

arxiv-search 0.1.0
