
langchain-pinecone

This package contains the LangChain integration with Pinecone.

Installation

pip install -qU langchain langchain-pinecone langchain-openai

You should configure credentials by setting the following environment variables:

  • PINECONE_API_KEY
  • OPENAI_API_KEY (optional, only needed for OpenAI embeddings)
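Credentials can also be set from Python before anything reads them; a minimal sketch (the placeholder values are yours to replace, and in production a secrets manager or shell-level `export` is preferable):

```python
import os

# Set credentials in-process only if they are not already exported
# in the shell; setdefault never overwrites an existing value.
os.environ.setdefault("PINECONE_API_KEY", "your-api-key")
os.environ.setdefault("OPENAI_API_KEY", "your-openai-key")  # optional
```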

Development

Running Tests

The test suite includes both unit tests and integration tests. To run the tests:

# Run unit tests only
make test

# Run integration tests (requires environment variables)
make integration_test

Required Environment Variables for Tests

Integration tests require the following environment variables:

  • PINECONE_API_KEY: Required for all integration tests
  • OPENAI_API_KEY: Optional, required only for OpenAI embedding tests

You can set these environment variables before running the tests:

export PINECONE_API_KEY="your-api-key"
export OPENAI_API_KEY="your-openai-key"  # Optional

If these environment variables are not set, the integration tests that require them will be skipped.
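The skip behaviour can be sketched with stdlib `unittest` (illustrative only, not the package's actual test suite):

```python
import os
import unittest

class PineconeIntegrationTests(unittest.TestCase):
    """Tests needing credentials skip themselves when the
    environment variable is absent, as described above."""

    @unittest.skipUnless(
        os.environ.get("PINECONE_API_KEY"), "PINECONE_API_KEY not set"
    )
    def test_requires_pinecone(self):
        # Real integration logic would go here.
        self.assertTrue(True)
```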

Usage

Initialization

Before initializing our vector store, let's connect to a Pinecone index. If an index named index_name doesn't exist, the snippet below creates it.

import os

from pinecone import Pinecone, ServerlessSpec

# Create the Pinecone client; the API key is read from the environment.
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

index_name = "langchain-test-index"  # change if desired

if not pc.has_index(index_name):
    pc.create_index(
        name=index_name,
        dimension=1536,
        metric="cosine",
        spec=ServerlessSpec(
            cloud="aws",
            region="us-east-1"
        )
    )

index = pc.Index(index_name)

Initialize embedding model:

from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

The PineconeVectorStore class exposes the connection to the Pinecone vector store.

from langchain_pinecone import PineconeVectorStore

vector_store = PineconeVectorStore(index=index, embedding=embeddings)
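One thing worth checking at this point is that the index dimension matches the embedding model's output dimension (1536 for text-embedding-3-small); a mismatch surfaces later as upsert or query errors. A small illustrative helper, not part of langchain-pinecone:

```python
def check_dimensions(index_dimension: int, embedding_dimension: int) -> None:
    """Fail fast when the index and embedding dimensions disagree."""
    if index_dimension != embedding_dimension:
        raise ValueError(
            f"index dimension {index_dimension} does not match "
            f"embedding dimension {embedding_dimension}"
        )
```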

Manage vector store

Once you have created your vector store, you can interact with it by adding and deleting items.

Add items to vector store

We can add items to our vector store using the add_documents method.

from uuid import uuid4

from langchain_core.documents import Document

document_1 = Document(
    page_content="I had chocolate chip pancakes and scrambled eggs for breakfast this morning.",
    metadata={"source": "tweet"},
)

document_2 = Document(
    page_content="The weather forecast for tomorrow is cloudy and overcast, with a high of 62 degrees.",
    metadata={"source": "news"},
)

document_3 = Document(
    page_content="Building an exciting new project with LangChain - come check it out!",
    metadata={"source": "tweet"},
)

document_4 = Document(
    page_content="Robbers broke into the city bank and stole $1 million in cash.",
    metadata={"source": "news"},
)

document_5 = Document(
    page_content="Wow! That was an amazing movie. I can't wait to see it again.",
    metadata={"source": "tweet"},
)

document_6 = Document(
    page_content="Is the new iPhone worth the price? Read this review to find out.",
    metadata={"source": "website"},
)

document_7 = Document(
    page_content="The top 10 soccer players in the world right now.",
    metadata={"source": "website"},
)

document_8 = Document(
    page_content="LangGraph is the best framework for building stateful, agentic applications!",
    metadata={"source": "tweet"},
)

document_9 = Document(
    page_content="The stock market is down 500 points today due to fears of a recession.",
    metadata={"source": "news"},
)

document_10 = Document(
    page_content="I have a bad feeling I am going to get deleted :(",
    metadata={"source": "tweet"},
)

documents = [
    document_1,
    document_2,
    document_3,
    document_4,
    document_5,
    document_6,
    document_7,
    document_8,
    document_9,
    document_10,
]
uuids = [str(uuid4()) for _ in range(len(documents))]
vector_store.add_documents(documents=documents, ids=uuids)

Delete items from vector store

vector_store.delete(ids=[uuids[-1]])

Query vector store

Once your vector store has been created and the relevant documents have been added, you will most likely want to query it while running your chain or agent.

Query directly

Performing a simple similarity search can be done as follows:

results = vector_store.similarity_search(
    "LangChain provides abstractions to make working with LLMs easy",
    k=2,
    filter={"source": "tweet"},
)
for res in results:
    print(f"* {res.page_content} [{res.metadata}]")
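The `filter` argument is a plain dict in Pinecone's metadata-filter syntax, which also supports operators such as `$eq` and `$in`. A small helper sketch for composing one (check Pinecone's filter documentation for your index type; this helper is illustrative, not part of the package):

```python
def sources_filter(*sources: str) -> dict:
    """Build a metadata filter matching any of the given sources,
    using Pinecone's $eq/$in filter operators."""
    if len(sources) == 1:
        return {"source": {"$eq": sources[0]}}
    return {"source": {"$in": list(sources)}}
```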

Similarity search with score

You can also search with score:

results = vector_store.similarity_search_with_score(
    "Will it be hot tomorrow?", k=1, filter={"source": "news"}
)
for res, score in results:
    print(f"* [SIM={score:.3f}] {res.page_content} [{res.metadata}]")

Query by turning into retriever

You can also transform the vector store into a retriever for easier usage in your chains.

retriever = vector_store.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={"k": 1, "score_threshold": 0.4},
)
retriever.invoke("Stealing from the bank is a crime", filter={"source": "news"})
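The `similarity_score_threshold` search type returns only hits whose score meets the threshold, then caps the result at `k`. Conceptually (a simplified sketch, not LangChain's actual implementation):

```python
def threshold_top_k(
    scored_docs: list[tuple[str, float]], k: int, score_threshold: float
) -> list[str]:
    """Keep documents scoring at or above the threshold, best-first, at most k."""
    kept = sorted(
        (pair for pair in scored_docs if pair[1] >= score_threshold),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return [doc for doc, _ in kept[:k]]
```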

List Supported Pinecone Models (Dynamic)

You can dynamically fetch the list of supported embedding and reranker models from Pinecone using the following methods:

from langchain_pinecone import PineconeEmbeddings, PineconeRerank

# List all supported embedding models
embedding_models = PineconeEmbeddings.list_supported_models()
print("Embedding models:", [m["model"] for m in embedding_models])

# List all supported reranker models
reranker_models = PineconeRerank.list_supported_models()
print("Reranker models:", [m["model"] for m in reranker_models])

# You can also filter by vector type (e.g., 'dense' or 'sparse')
sparse_embedding_models = PineconeEmbeddings.list_supported_models(vector_type="sparse")
print("Sparse embedding models:", [m["model"] for m in sparse_embedding_models])

Async Model Listing

For async applications, you can use the async versions of the model listing functions:

import asyncio
from langchain_pinecone import PineconeEmbeddings, PineconeRerank

async def list_models_async():
    # List all supported embedding models asynchronously
    embedding_models = await PineconeEmbeddings().alist_supported_models()
    print("Embedding models:", [m["model"] for m in embedding_models])
    
    # List all supported reranker models asynchronously
    reranker_models = await PineconeRerank().alist_supported_models()
    print("Reranker models:", [m["model"] for m in reranker_models])
    
    # Filter by vector type asynchronously
    dense_embedding_models = await PineconeEmbeddings().alist_supported_models(vector_type="dense")
    print("Dense embedding models:", [m["model"] for m in dense_embedding_models])

# Run the async function
asyncio.run(list_models_async())

You can also use the low-level async function directly:

import asyncio
from langchain_pinecone._utilities import aget_pinecone_supported_models

async def get_models_directly():
    api_key = "your-pinecone-api-key"
    
    # Get all models
    all_models = await aget_pinecone_supported_models(api_key)
    
    # Get only embedding models
    embed_models = await aget_pinecone_supported_models(api_key, model_type="embed")
    
    # Get only dense embedding models
    dense_models = await aget_pinecone_supported_models(api_key, model_type="embed", vector_type="dense")
    
    return all_models, embed_models, dense_models

This ensures your application always uses valid, up-to-date model names from Pinecone.
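For example, you might validate a configured model name against the fetched list before constructing an embeddings object (the "model" key follows the listing examples above; treat it as an assumption if the metadata format changes):

```python
def is_supported_model(name: str, models: list[dict]) -> bool:
    """Return True if `name` appears in the supported-model metadata
    returned by list_supported_models(). Illustrative helper only."""
    return any(m.get("model") == name for m in models)
```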
