INSPIRE HEP tools and API wrapper for LangChain.

INSPIRE HEP Tools for LangChain

Integration with INSPIRE HEP, the trusted community hub for high energy physics research literature.

Overview

This package provides three tools for searching and retrieving physics papers from the INSPIRE HEP database:

  • Search Literature: Find physics papers by topic with flexible sorting options
  • Get Author Papers: Retrieve an author's publications (requires INSPIRE identifiers)
  • Get Paper Details: Fetch complete information about a specific paper by record ID

Features

  • Search by topic, author, or citation count
  • Sort results by most recent or most cited
  • Configurable result limits
  • Comprehensive error handling
  • Test suite with 15 unit tests and 5 integration tests

Installation

pip install langchain-community-inspire-hep

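The package installs into the langchain_community namespace, so imports still use langchain_community.tools.inspire_hep and langchain_community.utilities.inspire_hep.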

Quick Start

Basic Usage

from langchain_community.tools.inspire_hep import INSPIRESearchLiteratureTool

tool = INSPIRESearchLiteratureTool()

# Search for papers
result = tool.invoke({"query": "quantum field theory"})
print(result)

# Search with sorting
result = tool.invoke({
    "query": "string theory",
    "sort": "mostcited"  # or "mostrecent"
})

All Three Tools

from langchain_community.tools.inspire_hep import (
    INSPIRESearchLiteratureTool,
    INSPIREGetAuthorPapersTool,
    INSPIREGetPaperDetailsTool,
)

# Search for papers on a topic
search_tool = INSPIRESearchLiteratureTool()
result = search_tool.invoke({
    "query": "quantum gravity",
    "sort": "mostrecent"
})

# Get an author's papers (requires INSPIRE identifier)
author_tool = INSPIREGetAuthorPapersTool()
result = author_tool.invoke({
    "author_name": "Witten.Edward.1",
    "sort": "mostcited"
})

# Get details of a specific paper
details_tool = INSPIREGetPaperDetailsTool()
result = details_tool.invoke({
    "record_id": "451647"  # Maldacena's AdS/CFT paper
})

Using with AI Agents

from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate
from langchain_community.tools.inspire_hep import (
    INSPIRESearchLiteratureTool,
    INSPIREGetAuthorPapersTool,
)

# Create tools
tools = [
    INSPIRESearchLiteratureTool(),
    INSPIREGetAuthorPapersTool(),
]

# Create LLM (use models with good tool calling support)
llm = ChatOpenAI(model="gpt-4", temperature=0)

# Create prompt
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a physics research assistant with access to INSPIRE HEP."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

# Create agent
agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# Use the agent
result = agent_executor.invoke({
    "input": "What are the most cited papers on string theory?"
})
print(result['output'])

Direct API Access (Without Agents)

To call the API wrapper directly, without the tool or agent layer:

from langchain_community.utilities.inspire_hep import INSPIREHEPAPIWrapper

# Create wrapper with custom settings
wrapper = INSPIREHEPAPIWrapper(top_k_results=5)

# Search papers
papers = wrapper.search_literature("quantum gravity", sort="mostcited")
print(papers)

# Get author papers
author_papers = wrapper.get_author_papers("Witten.Edward.1", sort="mostrecent")
print(author_papers)

# Get paper details
details = wrapper.get_paper_details("451647")
print(details)

Sorting Options

The search and author tools each accept a sort parameter with two values:

  • mostrecent (default for search): Newest papers first
  • mostcited (default for author): Most cited papers first

# Find recent breakthroughs
tool.invoke({"query": "quantum computing", "sort": "mostrecent"})

# Find influential papers
tool.invoke({"query": "supersymmetry", "sort": "mostcited"})
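
The two sort values match the sort query parameter of the public INSPIRE REST API. As a rough sketch of what a literature search URL might look like, the snippet below builds one by hand. The helper name build_search_url is hypothetical and not part of this package, and how the wrapper builds its requests internally is an assumption, not something documented here.

```python
from urllib.parse import urlencode

# Public INSPIRE literature endpoint.
INSPIRE_API = "https://inspirehep.net/api/literature"
VALID_SORTS = ("mostrecent", "mostcited")


def build_search_url(query: str, sort: str = "mostrecent", size: int = 5) -> str:
    """Hypothetical helper: build a search URL for the INSPIRE literature API."""
    if sort not in VALID_SORTS:
        raise ValueError(f"sort must be one of {VALID_SORTS}, got {sort!r}")
    # q = search query, sort = ordering, size = number of results per page
    params = {"q": query, "sort": sort, "size": size}
    return f"{INSPIRE_API}?{urlencode(params)}"


print(build_search_url("supersymmetry", sort="mostcited"))
```

Rejecting unknown sort values up front gives a clearer error than a failed HTTP request would.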

Finding Author Identifiers

The author papers tool requires INSPIRE identifiers (format: Lastname.Firstname.N), not plain names:

  1. Go to https://inspirehep.net/authors
  2. Search for the author by name
  3. Click on their profile
  4. Use the identifier shown (e.g., Witten.Edward.1)

Why? Plain names are ambiguous (many physicists share the same name), while INSPIRE identifiers are unique.
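
Because the tool expects an identifier rather than a plain name, it can help to check the shape of the input before calling it. The following is a minimal sketch, assuming an identifier is at least two dot-separated name parts followed by a numeric suffix, as in the format described above. The function looks_like_author_id is hypothetical and not part of this package, and it checks only the shape of the string, not whether the author exists in INSPIRE.

```python
import re

# Assumed shape: two or more dot-terminated name parts, then a number,
# e.g. "Witten.Edward.1". This pattern is an illustration, not an official spec.
_AUTHOR_ID = re.compile(r"(?:[^\s.]+\.){2,}\d+")


def looks_like_author_id(value: str) -> bool:
    """Return True if value has the dot-separated identifier shape."""
    return _AUTHOR_ID.fullmatch(value) is not None


for candidate in ("Witten.Edward.1", "Edward Witten"):
    print(candidate, looks_like_author_id(candidate))
```

A check like this can catch a plain name early, so the caller can be pointed to the lookup steps above instead of receiving an empty or misleading result.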

Advanced Search Syntax

INSPIRE HEP supports advanced search queries:

# Highly cited papers (1000+ citations)
tool.invoke({"query": "topcite 1000+"})

# Papers by specific author
tool.invoke({"query": "author:Witten"})

# Papers in date range
tool.invoke({"query": "date 2020->2024"})

# Combine criteria
tool.invoke({"query": "quantum gravity topcite 500+"})

See the INSPIRE HEP search guide for more syntax.
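
Query strings like the ones above can also be assembled programmatically. The sketch below uses a hypothetical helper, build_query, which is not part of this package. It joins the pieces with spaces, mirroring the combined example above; whether INSPIRE parses every possible combination as intended is not verified here.

```python
from typing import List, Optional, Tuple


def build_query(
    topic: str = "",
    author: Optional[str] = None,
    min_citations: Optional[int] = None,
    years: Optional[Tuple[int, int]] = None,
) -> str:
    """Hypothetical helper: assemble an INSPIRE query from optional criteria."""
    parts: List[str] = []
    if topic:
        parts.append(topic)
    if author:
        parts.append(f"author:{author}")
    if min_citations is not None:
        parts.append(f"topcite {min_citations}+")
    if years is not None:
        start, end = years
        parts.append(f"date {start}->{end}")
    # Space-joined, following the "Combine criteria" example
    return " ".join(parts)


query = build_query("quantum gravity", min_citations=500)
print(query)
```

The resulting string can be passed as the "query" value to tool.invoke in the same way as the hand-written queries.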

API Rate Limiting

INSPIRE HEP limits each IP address to 15 requests per 5-second window. The wrapper handles basic rate limiting, but avoid sending requests in rapid succession.
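
For scripts that issue many queries in a loop, a client-side throttle can keep the caller under the limit stated above. The following is a minimal sketch of a sliding-window limiter. The class SlidingWindowLimiter is hypothetical, is not part of this package, and is not a description of how the wrapper's own rate limiting works.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Hypothetical helper: allow at most max_calls per period seconds."""

    def __init__(
        self,
        max_calls: int = 15,
        period: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._calls: Deque[float] = deque()  # timestamps of recent calls

    def wait(self) -> float:
        """Block until another call is allowed; return the seconds slept."""
        now = self._clock()
        # Forget calls that have already left the window
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        slept = 0.0
        if len(self._calls) >= self.max_calls:
            # Sleep until the oldest call leaves the window
            slept = self.period - (now - self._calls[0])
            self._sleep(slept)
            self._calls.popleft()
            now = self._clock()
        self._calls.append(now)
        return slept


limiter = SlidingWindowLimiter()
# Usage sketch: call limiter.wait() before each request, e.g.
# for q in queries:
#     limiter.wait()
#     tool.invoke({"query": q})
```

The clock and sleep functions are injectable so the limiter can be exercised in tests without real delays.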

Testing

This package includes comprehensive test coverage:

  • 15 unit tests: Test wrapper and tools with mocked API responses
  • 5 integration tests: Test with real API calls

Run tests:

# Unit tests (fast, no internet required)
pytest tests/unit_tests/test_inspire_hep.py -v

# Integration tests (requires internet)
pytest tests/integration_tests/test_inspire_hep_integrations.py -v

# All tests
pytest tests/ -v

Known Limitations

  1. Author identifiers required: The author papers tool works reliably only with INSPIRE identifiers, not plain names. Users must look up identifiers at https://inspirehep.net/authors.

  2. LLM compatibility: Agent performance depends on the LLM's tool-calling capabilities. The tools work best with OpenAI GPT-4, Anthropic Claude, and other models with strong structured output support.

Example Use Cases

Research Assistant

"What are the most influential papers on the AdS/CFT correspondence?"
 Uses search_literature with sort="mostcited"

Literature Review

"Find recent papers on quantum entanglement from the last year"
 Uses search_literature with sort="mostrecent"

Author Research

"What are Edward Witten's most cited contributions?"
 Uses get_author_papers with author identifier

Paper Deep Dive

"Tell me about INSPIRE record 451647"
 Uses get_paper_details for full information

Citation

If you use INSPIRE HEP in your research, please cite:

@article{Moskovic:2021zjs,
    author = "Moskovic, Micha",
    title = "{The INSPIRE REST API}",
    url = "https://github.com/inspirehep/rest-api-doc",
    doi = "10.5281/zenodo.5788550",
    month = "12",
    year = "2021"
}

Contributing

Contributions and issue reports are welcome. Future enhancements could include:

  • Job search functionality
  • Conference search
  • Advanced filtering options
  • Citation graph traversal
  • Batch operations

License

Released under the MIT License.
