An integration package connecting Perigon and LangChain

Perigon LangChain Integration

A LangChain integration for the Perigon API, enabling seamless access to news articles and vector search capabilities within the LangChain ecosystem.

Features

News Articles Search: Semantic search through news articles using Perigon's vector search API
Wikipedia Search: Semantic search through Wikipedia articles with rich metadata
LangChain Compatible: ArticlesRetriever and WikipediaRetriever both implement LangChain's BaseRetriever interface
Async Support: Synchronous (invoke) and asynchronous (ainvoke) operations
Type Safety: Built with the official Perigon Python SDK for robust type checking
Flexible Filtering: Support for country, source, category, topic, and location-based filtering
Rich Metadata: Wikipedia results include pageviews, Wikidata IDs, revision information

Installation

pip install langchain-perigon

Or with Poetry:

poetry add langchain-perigon

Quick Start

News Articles Search

from langchain_perigon import ArticlesRetriever, ArticlesFilter

# Initialize with API key
retriever = ArticlesRetriever(API_KEY="your_perigon_api_key")

# Or use environment variable PERIGON_API_KEY
retriever = ArticlesRetriever()

# Simple search
documents = retriever.invoke("artificial intelligence developments")

# With options
options: ArticlesFilter = {
    "size": 10,
    "showReprints": False,
    "filter": {
        "country": "us",
        "category": "tech"
    }
}
documents = retriever.invoke("machine learning breakthroughs", options=options)

Wikipedia Search

from langchain_perigon import WikipediaRetriever, WikipediaOptions

# Initialize Wikipedia retriever
wiki_retriever = WikipediaRetriever(API_KEY="your_perigon_api_key")

# Simple Wikipedia search
documents = wiki_retriever.invoke("quantum computing")

# With advanced options
options: WikipediaOptions = {
    "size": 5,
    "pageviewsFrom": 100,  # Only popular pages
    "filter": {
        "wikidataInstanceOfLabel": ["academic discipline"],
        "category": ["Physics", "Computer science"]
    }
}
documents = wiki_retriever.invoke("machine learning", options=options)

# Access rich metadata
for doc in documents:
    print(f"Title: {doc.metadata['title']}")
    print(f"Pageviews: {doc.metadata['pageviews']}")
    print(f"Wikidata ID: {doc.metadata['wikidataId']}")
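The loop above indexes the metadata keys directly, which raises `KeyError` if a result lacks one of them. The following is a minimal sketch of a more tolerant lookup, assuming each document exposes a plain `metadata` dict as LangChain `Document` objects do. The helper name `summarize_metadata`, the `"n/a"` default, and the stand-in documents are illustrative and not part of langchain-perigon.

```python
from types import SimpleNamespace
from typing import Any, Iterable

# Keys taken from the example above; whether every result carries all of
# them is an assumption, so each lookup falls back to a default.
METADATA_KEYS = ("title", "pageviews", "wikidataId")


def summarize_metadata(docs: Iterable[Any], default: str = "n/a") -> list[dict]:
    """Return one small dict per document, tolerating missing keys."""
    rows = []
    for doc in docs:
        metadata = getattr(doc, "metadata", None) or {}
        rows.append({key: metadata.get(key, default) for key in METADATA_KEYS})
    return rows


# Stand-in objects (made-up values) so the sketch runs without an API key.
sample_docs = [
    SimpleNamespace(metadata={"title": "Example page", "pageviews": 1200, "wikidataId": "Q-example"}),
    SimpleNamespace(metadata={"title": "Untitled stub"}),
]
summary = summarize_metadata(sample_docs)
```

With real results, `summarize_metadata(documents)` could be passed the list returned by `wiki_retriever.invoke(...)`.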

Async Usage

import asyncio
from langchain_perigon import ArticlesRetriever, WikipediaRetriever, ArticlesFilter, WikipediaOptions

async def search_both():
    # News articles
    articles_retriever = ArticlesRetriever(API_KEY="your_perigon_api_key")
    articles_options: ArticlesFilter = {
        "size": 5,
        "filter": {"country": "us"}
    }
    articles = await articles_retriever.ainvoke("climate change", options=articles_options)
    
    # Wikipedia articles
    wiki_retriever = WikipediaRetriever(API_KEY="your_perigon_api_key")
    wiki_options: WikipediaOptions = {
        "size": 3,
        "pageviewsFrom": 50
    }
    wiki_docs = await wiki_retriever.ainvoke("climate change", options=wiki_options)
    
    return articles, wiki_docs

# Run async search
articles, wiki_docs = asyncio.run(search_both())
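
The example above awaits the two searches one after the other. Since they are independent, they could also run concurrently; the sketch below uses `asyncio.gather` for that. It assumes only that each retriever has an `ainvoke` coroutine, and `_FakeRetriever` is a hypothetical offline stand-in, not a class from langchain-perigon.

```python
import asyncio
from typing import Any


async def search_concurrently(articles_retriever: Any, wiki_retriever: Any, query: str):
    """Run both searches at the same time instead of one after the other."""
    articles, wiki_docs = await asyncio.gather(
        articles_retriever.ainvoke(query),
        wiki_retriever.ainvoke(query),
    )
    return articles, wiki_docs


class _FakeRetriever:
    """Stand-in with an `ainvoke` coroutine so the sketch runs offline."""

    def __init__(self, label: str) -> None:
        self.label = label

    async def ainvoke(self, query: str) -> list[str]:
        await asyncio.sleep(0)  # yield control, as a real network call would
        return [f"{self.label}: {query}"]


concurrent_articles, concurrent_wiki = asyncio.run(
    search_concurrently(_FakeRetriever("news"), _FakeRetriever("wiki"), "climate change")
)
```

In real use the two stand-ins would be replaced by an `ArticlesRetriever` and a `WikipediaRetriever` instance.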

Configuration

API Key

Set your Perigon API key in one of these ways:

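Parameter: Pass API_KEY when constructing a retriever, e.g. ArticlesRetriever(API_KEY="your_key")
Environment variable: Set PERIGON_API_KEY and construct the retriever without arguments, e.g. ArticlesRetriever()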
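
The two options above can be sketched as a small lookup. This is an illustration of the idea, not the package's actual resolution logic: the helper `resolve_api_key` is hypothetical, and giving the explicit parameter precedence over the environment variable is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(explicit: Optional[str] = None,
                    env: Mapping[str, str] = os.environ) -> str:
    """Pick the explicit key if given, else PERIGON_API_KEY from `env`.

    The precedence order here is an assumption made for this sketch.
    """
    key = explicit or env.get("PERIGON_API_KEY")
    if not key:
        raise ValueError(
            "No Perigon API key found: pass API_KEY=... or set PERIGON_API_KEY"
        )
    return key
```

Failing early with a clear message, before any request is made, tends to be easier to debug than an authentication error from the API.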

Filter Options

News Articles (`ArticlesFilter`)

options: ArticlesFilter = {
    "size": 10,                    # Number of results (default: 10)
    "showReprints": False,         # Include reprints (default: False)
    "filter": {
        "country": "us",           # Country filter (string or list)
        "source": "nytimes.com",   # Source filter (string or list)
        "category": "tech",        # Category filter (string or list)
        "topic": "ai",             # Topic filter (string or list)
        "state": "CA",             # State filter (string or list)
        "city": "San Francisco"    # City filter (string or list)
    }
}
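
When filter values come from user input, some may be unset. The sketch below builds an options dict of the shape shown above while dropping unset filters and rejecting unknown filter names. `build_articles_options` is a hypothetical helper, not part of langchain-perigon; it returns a plain dict rather than the package's `ArticlesFilter` type so that it runs on its own.

```python
from typing import Any

# Filter names taken from the ArticlesFilter example above.
ALLOWED_FILTERS = {"country", "source", "category", "topic", "state", "city"}


def build_articles_options(size: int = 10, show_reprints: bool = False,
                           **filters: Any) -> dict:
    """Build an options dict, omitting filters whose value is None."""
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"Unknown filter(s): {sorted(unknown)}")
    active = {name: value for name, value in filters.items() if value is not None}
    options: dict = {"size": size, "showReprints": show_reprints}
    if active:
        options["filter"] = active
    return options


example_options = build_articles_options(size=5, country="us", topic=None)
```

The resulting dict could then be passed as `options=` to `retriever.invoke(...)`, as in the Quick Start.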

Wikipedia Articles (`WikipediaOptions`)

options: WikipediaOptions = {
    "size": 10,                           # Number of results (default: 10)
    "page": 0,                           # Page number (default: 0)
    "pageviewsFrom": 100,                # Minimum daily pageviews
    "pageviewsTo": 10000,                # Maximum daily pageviews
    "wikiRevisionFrom": "2024-01-01",    # Modified after date
    "wikiRevisionTo": "2024-12-31",      # Modified before date
    "filter": {
        "wikidataId": "Q2539",           # Specific Wikidata ID
        "wikidataInstanceOfLabel": ["academic discipline"],  # Instance type
        "category": ["Computer science"], # Wikipedia categories
        "title": "machine learning",     # Title search
        "withPageviews": True            # Only pages with view data
    }
}
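
The `page` option suggests that larger result sets are fetched page by page. A minimal sketch of such a loop follows. It assumes that a page past the end comes back as an empty list, which the source does not confirm; `iter_pages` and `fake_search` are hypothetical names used only for illustration.

```python
from typing import Any, Callable, Iterator


def iter_pages(search: Callable[[dict], list], base_options: dict,
               max_pages: int = 5) -> Iterator[Any]:
    """Yield results page by page until a page is empty or max_pages is hit."""
    for page in range(max_pages):
        options = {**base_options, "page": page}  # copy, do not mutate the input
        results = search(options)
        if not results:
            break
        yield from results


# Offline stand-in: three "pages", the last one empty.
_fake_pages = [["a", "b"], ["c"], []]


def fake_search(options: dict) -> list:
    return _fake_pages[options["page"]]


collected = list(iter_pages(fake_search, {"size": 2}))
```

With a real retriever, `search` could be something like `lambda opts: wiki_retriever.invoke("machine learning", options=opts)`. The `max_pages` cap keeps a misbehaving loop from issuing unbounded requests.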

Integration with LangChain

Both retrievers implement LangChain's BaseRetriever interface and work seamlessly with other LangChain components:

QA Chain with News Articles

from langchain.chains import RetrievalQA
from langchain_openai import OpenAI  # provided by the langchain-openai package
from langchain_perigon import ArticlesRetriever

# Create news retriever
retriever = ArticlesRetriever(API_KEY="your_perigon_api_key")

# Use in a QA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    chain_type="stuff",
    retriever=retriever
)

# Ask questions about recent news (invoke replaces the deprecated run method)
result = qa_chain.invoke({"query": "What are the latest developments in AI?"})
print(result["result"])
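
For readers unfamiliar with `chain_type="stuff"`: the idea is that all retrieved documents are concatenated ("stuffed") into a single prompt. The sketch below illustrates that idea only. The prompt wording is made up and is not LangChain's actual template, and the stand-in documents replace real retriever output.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def stuff_documents(docs: Iterable[Any], separator: str = "\n\n") -> str:
    """Join the text of every document into one context string."""
    return separator.join(doc.page_content for doc in docs)


def build_prompt(question: str, docs: Iterable[Any]) -> str:
    """Combine the stuffed context and the question (illustrative wording)."""
    context = stuff_documents(docs)
    return f"Answer using this context:\n\n{context}\n\nQuestion: {question}"


# Stand-ins with a `page_content` attribute, like LangChain documents.
stub_docs = [SimpleNamespace(page_content="First."), SimpleNamespace(page_content="Second.")]
stuffed = stuff_documents(stub_docs)
prompt = build_prompt("What happened?", stub_docs)
```

Because everything lands in one prompt, the `size` option on the retriever indirectly bounds how much context the model receives.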

QA Chain with Wikipedia Knowledge

from langchain.chains import RetrievalQA
from langchain_openai import OpenAI  # provided by the langchain-openai package
from langchain_perigon import WikipediaRetriever

# Create Wikipedia retriever
wiki_retriever = WikipediaRetriever(API_KEY="your_perigon_api_key")

# Use in a QA chain for encyclopedic knowledge
qa_chain = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    chain_type="stuff",
    retriever=wiki_retriever
)

# Ask questions about established knowledge (invoke replaces the deprecated run method)
result = qa_chain.invoke({"query": "Explain the fundamentals of machine learning"})
print(result["result"])

Combining Both Retrievers

from langchain.retrievers import EnsembleRetriever
from langchain_perigon import ArticlesRetriever, WikipediaRetriever

# Create both retrievers
news_retriever = ArticlesRetriever(API_KEY="your_perigon_api_key")
wiki_retriever = WikipediaRetriever(API_KEY="your_perigon_api_key")

# Combine them for comprehensive search
ensemble_retriever = EnsembleRetriever(
    retrievers=[news_retriever, wiki_retriever],
    weights=[0.6, 0.4]  # Favor news articles slightly
)

# Use combined retriever (invoke replaces the deprecated get_relevant_documents)
documents = ensemble_retriever.invoke("artificial intelligence")
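
To give a feel for what the `weights` do, here is a standalone sketch of weighted reciprocal rank fusion, the general technique that `EnsembleRetriever` is documented to use for merging ranked lists. This is a simplified illustration, not LangChain's implementation; the function name `weighted_rrf`, the constant `c=60`, and the toy identifiers are assumptions for this example.

```python
from collections import defaultdict
from typing import Hashable, Sequence


def weighted_rrf(rankings: Sequence[Sequence[Hashable]],
                 weights: Sequence[float], c: int = 60) -> list:
    """Fuse ranked lists: each item scores weight / (rank + c) per list."""
    if len(rankings) != len(weights):
        raise ValueError("Need exactly one weight per ranking")
    scores: dict = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, item in enumerate(ranking, start=1):
            scores[item] += weight / (rank + c)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)


news_ranking = ["n1", "shared", "n2"]   # toy ids standing in for news results
wiki_ranking = ["shared", "w1"]         # toy ids standing in for Wikipedia results
fused = weighted_rrf([news_ranking, wiki_ranking], weights=[0.6, 0.4])
```

An item returned by both sources accumulates score from each list, so it tends to rise to the top even when neither source ranked it first.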

Migration from v0.x

This version uses the official Perigon Python SDK instead of raw HTTP requests. The public API is unchanged, and you also get:

Better type safety and error handling
Improved performance and reliability
Automatic retries and connection management
Future-proof compatibility with API changes

Development

Running Tests

This project uses Poetry for dependency management. To run tests:

# Install dependencies
poetry install

# Run all tests
poetry run pytest

# Run specific test files
poetry run pytest tests/unit_tests/imports_test.py
poetry run pytest tests/integration_tests/

# Run tests with verbose output
poetry run pytest -v

Running Examples

Examples require a valid Perigon API key:

# Set your API key
export PERIGON_API_KEY=your_actual_api_key

# Run examples with poetry
poetry run python examples/simple_test.py
poetry run python examples/wikipedia_example.py

Performance Optimizations

This version includes several performance improvements:

Optimized metadata transformation: Reduced reflection-based attribute access
Configurable timeouts: Set custom timeout values for API calls
Error handling: Graceful fallbacks for transformation errors
Efficient processing: Streamlined data extraction pipelines

You can configure timeout settings:

# Set custom timeout (default: 30 seconds)
retriever = ArticlesRetriever(API_KEY="your_key", timeout=60)
wiki_retriever = WikipediaRetriever(API_KEY="your_key", timeout=45)
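
Independently of the `timeout` parameter above, an async caller may want its own upper bound on how long it waits for a search. The sketch below wraps `ainvoke` in `asyncio.wait_for`. It assumes only that the retriever has an `ainvoke` coroutine; `ainvoke_with_deadline` and `_SlowRetriever` are hypothetical names, and returning an empty list on timeout is a design choice made for this example.

```python
import asyncio
from typing import Any


async def ainvoke_with_deadline(retriever: Any, query: str, seconds: float) -> list:
    """Await `retriever.ainvoke(query)`, giving up after `seconds`."""
    try:
        return await asyncio.wait_for(retriever.ainvoke(query), timeout=seconds)
    except asyncio.TimeoutError:
        # Design choice for this sketch: an empty result instead of an error.
        return []


class _SlowRetriever:
    """Offline stand-in whose `ainvoke` takes `delay` seconds."""

    def __init__(self, delay: float) -> None:
        self.delay = delay

    async def ainvoke(self, query: str) -> list[str]:
        await asyncio.sleep(self.delay)
        return [query]


fast = asyncio.run(ainvoke_with_deadline(_SlowRetriever(0.0), "ai", seconds=1.0))
slow = asyncio.run(ainvoke_with_deadline(_SlowRetriever(0.5), "ai", seconds=0.01))
```

Callers that need to distinguish "no results" from "timed out" would re-raise or return a sentinel instead of an empty list.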

Requirements

Python 3.11+
LangChain Core
Perigon Python SDK

License

MIT

Project details

Maintainers

islem.maboud

Release history

0.1.1 (Sep 18, 2025)

Download files

Source Distribution

langchain_perigon-0.1.1.tar.gz (9.6 kB)

Uploaded Sep 18, 2025 Source

Built Distribution

langchain_perigon-0.1.1-py3-none-any.whl (9.8 kB)

Uploaded Sep 18, 2025 Python 3

File details

Details for the file langchain_perigon-0.1.1.tar.gz.

File metadata

Download URL: langchain_perigon-0.1.1.tar.gz
Upload date: Sep 18, 2025
Size: 9.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for langchain_perigon-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`88c7392b8e0e39283f3d7830de8ac319d269d0b59a77108ce6c27d519aa462f7`
MD5	`2c9439c6af178073749f8b428602e90d`
BLAKE2b-256	`1025a80590500fffab150abc022dac8af6efde15f13fab38e05b5a0675fc1b48`
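
A downloaded file can be checked against the published SHA256 digest with the standard library. The following is a minimal sketch; the helper names `sha256_of` and `matches_published_digest` are illustrative, and the file path is whatever location the archive was saved to.

```python
import hashlib
from pathlib import Path
from typing import Union


def sha256_of(path: Union[str, Path], chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: Union[str, Path], expected_hex: str) -> bool:
    """Compare a file's SHA256 with a published hex digest (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, the call would be `matches_published_digest("langchain_perigon-0.1.1.tar.gz", "88c7392b8e0e39283f3d7830de8ac319d269d0b59a77108ce6c27d519aa462f7")`, using the SHA256 value listed above.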

Provenance

The following attestation bundles were made for langchain_perigon-0.1.1.tar.gz:

Publisher: publish.yml on goperigon/langchain-perigon

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: langchain_perigon-0.1.1.tar.gz
- Subject digest: 88c7392b8e0e39283f3d7830de8ac319d269d0b59a77108ce6c27d519aa462f7
- Sigstore transparency entry: 534318230
- Sigstore integration time: Sep 18, 2025
Source repository:
- Permalink: goperigon/langchain-perigon@8b4e01eab5b45cf75e09517e490fbb326a34c941
- Branch / Tag: refs/heads/main
- Owner: https://github.com/goperigon
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@8b4e01eab5b45cf75e09517e490fbb326a34c941
- Trigger Event: push

File details

Details for the file langchain_perigon-0.1.1-py3-none-any.whl.

File metadata

Download URL: langchain_perigon-0.1.1-py3-none-any.whl
Upload date: Sep 18, 2025
Size: 9.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for langchain_perigon-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`fa84470f0b833b54e3d486972037a6217cae8ffbe6ed11a1781f0689038d65e8`
MD5	`7bbac303f69ab8cf9ff2c807b63206a9`
BLAKE2b-256	`958e8d9fbf81ccaaf43bc6e84e8b9d877a469692619af141b5f1c5c3136c3a8b`

Provenance

The following attestation bundles were made for langchain_perigon-0.1.1-py3-none-any.whl:

Publisher: publish.yml on goperigon/langchain-perigon

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: langchain_perigon-0.1.1-py3-none-any.whl
- Subject digest: fa84470f0b833b54e3d486972037a6217cae8ffbe6ed11a1781f0689038d65e8
- Sigstore transparency entry: 534318304
- Sigstore integration time: Sep 18, 2025
Source repository:
- Permalink: goperigon/langchain-perigon@8b4e01eab5b45cf75e09517e490fbb326a34c941
- Branch / Tag: refs/heads/main
- Owner: https://github.com/goperigon
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@8b4e01eab5b45cf75e09517e490fbb326a34c941
- Trigger Event: push
