
An integration package connecting Glean and LangChain


langchain-glean

This package contains the LangChain integration with Glean, an enterprise search platform. It allows you to search and retrieve information from your organization's content using LangChain.

Installation

pip install -U langchain-glean

Configuration

You need to configure your Glean credentials by setting the following environment variables:

export GLEAN_SUBDOMAIN="your-glean-subdomain"
export GLEAN_API_TOKEN="your-api-token"
export GLEAN_ACT_AS="user@example.com"  # Optional: Email to act as when making requests
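Before constructing any of the classes below, it can help to confirm that the required variables are actually set. The following is a minimal sketch and not part of langchain-glean: the helper name `missing_glean_env` is hypothetical, and it assumes that `GLEAN_SUBDOMAIN` and `GLEAN_API_TOKEN` are required while `GLEAN_ACT_AS` is optional, as the comments above indicate.

```python
import os

# Names taken from the configuration section above; GLEAN_ACT_AS is optional.
REQUIRED_VARS = ("GLEAN_SUBDOMAIN", "GLEAN_API_TOKEN")


def missing_glean_env(env=None):
    """Return the required Glean variables that are unset or empty.

    Hypothetical helper for illustration; not provided by langchain-glean.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_glean_env()
    if missing:
        print("Missing Glean settings: " + ", ".join(missing))
    else:
        print("Glean environment looks complete.")
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise in tests without touching the real environment.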

Usage

Using the Chat Model

The ChatGlean class lets you interact with Glean's AI chat functionality:

from langchain_core.messages import HumanMessage, SystemMessage
from langchain_glean import ChatGlean

# Initialize the chat model (will use environment variables)
chat = ChatGlean()

# Create messages
messages = [
    SystemMessage(content="You are a helpful AI assistant."),
    HumanMessage(content="What are the company holidays this year?")
]

# Generate a response
response = chat.invoke(messages)
print(response.content)

Streaming Responses

You can stream responses from the chat model:

from langchain_core.messages import HumanMessage, SystemMessage
from langchain_glean import ChatGlean

# Initialize the chat model
chat = ChatGlean()

# Create messages
messages = [
    SystemMessage(content="You are a helpful AI assistant."),
    HumanMessage(content="Explain retrieval augmented generation.")
]

# Stream the response
for chunk in chat.stream(messages):
    # Process each chunk as it arrives
    print(chunk.content, end="", flush=True)
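If you also need the complete answer once streaming finishes, the chunks can be accumulated as they arrive. This is a rough sketch rather than a langchain-glean feature: `collect_stream` is a hypothetical helper, and it assumes each streamed chunk exposes its text as a string in a `.content` attribute. The stand-in chunks only exist so the example runs without a Glean connection.

```python
from types import SimpleNamespace


def collect_stream(chunks, on_text=None):
    """Join the text of streamed chunks, optionally reporting each piece.

    Hypothetical helper; assumes every chunk has a string `.content`.
    """
    parts = []
    for chunk in chunks:
        text = chunk.content
        if not text:
            continue  # skip empty chunks
        if on_text is not None:
            on_text(text)  # e.g. print the piece as it arrives
        parts.append(text)
    return "".join(parts)


# Stand-in chunks so the sketch runs offline.
fake_chunks = [SimpleNamespace(content=t) for t in ("Retrieval ", "", "augmented")]
full_text = collect_stream(fake_chunks)
```

With a configured model, the same helper could presumably be called as `collect_stream(chat.stream(messages), on_text=lambda t: print(t, end="", flush=True))`.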

Multi-turn Conversations

You can have multi-turn conversations with chat history:

from langchain_core.messages import HumanMessage, SystemMessage
from langchain_glean import ChatGlean

# Initialize the chat model with chat saving enabled
chat = ChatGlean(save_chat=True)

# Start a conversation
conversation = [
    SystemMessage(content="You are a helpful AI assistant for our company.")
]

# First turn
conversation.append(HumanMessage(content="What are our main projects?"))
response = chat.invoke(conversation)
print(f"AI: {response.content}")
conversation.append(response)

# Second turn
conversation.append(HumanMessage(content="Which one has the highest priority?"))
response = chat.invoke(conversation)
print(f"AI: {response.content}")

# The chat_id is saved in the chat model
print(f"Chat ID: {chat.chat_id}")
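The append-invoke-append pattern above can be wrapped in a small loop. The sketch below is illustrative only: `run_turns` and `EchoChat` are hypothetical names, and `EchoChat` is a stand-in so the loop can run without Glean. It uses `("human", text)` tuples, a shorthand that LangChain chat models generally accept in place of `HumanMessage`; if in doubt, build `HumanMessage` objects as in the example above.

```python
from types import SimpleNamespace


def run_turns(chat, conversation, questions):
    """Ask each question in order, appending both sides to `conversation`."""
    answers = []
    for question in questions:
        conversation.append(("human", question))
        response = chat.invoke(conversation)
        conversation.append(response)  # keep the reply as history for the next turn
        answers.append(response.content)
    return answers


class EchoChat:
    """Hypothetical stand-in for ChatGlean, used only to exercise the loop."""

    def invoke(self, messages):
        _role, text = messages[-1]  # the last entry is always the new question
        return SimpleNamespace(content=f"echo: {text}")


history = [("system", "You are a helpful AI assistant for our company.")]
answers = run_turns(
    EchoChat(),
    history,
    ["What are our main projects?", "Which one has the highest priority?"],
)
```

Because the helper mutates the list it is given, the caller keeps the full history and can continue the conversation later.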

Chat with RAG

You can combine the chat model with a retriever for RAG:

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_glean import ChatGlean, GleanSearchRetriever

# Initialize components
retriever = GleanSearchRetriever()
chat = ChatGlean()

# Create a prompt template
prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer based on the retrieved information: {context}"),
    ("human", "{question}")
])

# Format documents function
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

# Create a RAG chain
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | chat
    | StrOutputParser()
)

# Run the chain
response = rag_chain.invoke("What are our company policies?")
print(response)

Using the Retriever

The GleanSearchRetriever allows you to search and retrieve documents from Glean:

from langchain_glean.retrievers import GleanSearchRetriever

# Initialize the retriever (will use environment variables)
retriever = GleanSearchRetriever()

# Search for documents
documents = retriever.invoke("quarterly sales report")

# Process the results
for doc in documents:
    print(f"Title: {doc.metadata.get('title')}")
    print(f"URL: {doc.metadata.get('url')}")
    print(f"Content: {doc.page_content}")
    print("---")
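When the results feed a prompt rather than a console, it can be useful to keep each document's title and URL next to its text. The following sketch assumes, as in the loop above, that documents carry `title` and `url` in `metadata` and their text in `page_content`. The function name `format_docs_with_sources` and the fallback strings are hypothetical, and the sample objects are stand-ins for real search results.

```python
from types import SimpleNamespace


def format_docs_with_sources(docs):
    """Render each document as a 'title (url)' line followed by its content."""
    blocks = []
    for doc in docs:
        title = doc.metadata.get("title") or "Untitled"
        url = doc.metadata.get("url") or "no URL"
        blocks.append(f"{title} ({url})\n{doc.page_content}")
    return "\n\n".join(blocks)


# Stand-in documents so the sketch runs without a Glean connection.
sample = [
    SimpleNamespace(
        page_content="Revenue grew.",
        metadata={"title": "Q2 report", "url": "https://example.com/q2"},
    ),
    SimpleNamespace(page_content="No metadata here.", metadata={}),
]
context = format_docs_with_sources(sample)
```

A function like this could stand in for a plain `format_docs` helper in a chain when the model should be able to cite where each passage came from.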

Using the Tool

The GleanSearchTool can be used in LangChain agents to search Glean:

from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_glean.retrievers import GleanSearchRetriever
from langchain_glean.tools import GleanSearchTool

# Initialize the retriever (will use environment variables)
retriever = GleanSearchRetriever()

# Create the tool
glean_tool = GleanSearchTool(
    retriever=retriever,
    name="glean_search",
    description="Search for information in your organization's content using Glean."
)

# Create an agent with the tool
llm = ChatOpenAI(model="gpt-3.5-turbo")
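prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant with access to Glean search."),
    ("user", "{input}"),
    # Required by create_openai_tools_agent: holds intermediate tool calls and results
    ("placeholder", "{agent_scratchpad}"),
])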

agent = create_openai_tools_agent(llm, [glean_tool], prompt)
agent_executor = AgentExecutor(agent=agent, tools=[glean_tool])

# Run the agent
response = agent_executor.invoke({"input": "Find the latest quarterly report"})
print(response["output"])

Integration with LangChain Chains

You can integrate the retriever with LangChain chains for more complex workflows:

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI
from langchain_glean.retrievers import GleanSearchRetriever

# Initialize the retriever (will use environment variables)
retriever = GleanSearchRetriever()

# Create a prompt template
prompt = ChatPromptTemplate.from_template(
    """Answer the question based only on the context provided.

Context: {context}

Question: {question}"""
)

# Initialize the language model
llm = ChatOpenAI(model="gpt-3.5-turbo")

# Format documents function
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

# Create the chain
chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# Run the chain
result = chain.invoke("What were our Q2 sales results?")
print(result)

Advanced Usage

Chat Model Parameters

You can customize the chat model behavior with additional parameters:

from langchain_glean import ChatGlean

# Initialize with custom parameters
chat = ChatGlean(
    save_chat=True,  # Save the chat session in Glean
    chat_id="existing-chat-id",  # Continue an existing chat
    agent="GPT",  # Specify the agent type (DEFAULT, GPT, etc.)
    mode="SEARCH",  # Specify the mode (DEFAULT, SEARCH, etc.)
    timeout=30  # Timeout in seconds for API requests
)

Search Parameters

You can customize your search by passing additional parameters:

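from langchain_glean.retrievers import GleanSearchRetriever

# Initialize the retriever (will use environment variables)
retriever = GleanSearchRetriever()

# Search with additional parameters
documents = retriever.invoke(
    "quarterly sales report",
    page_size=5,  # Number of results to return
    disable_spellcheck=True,  # Disable spellcheck
    max_snippet_size=200  # Maximum snippet size
)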
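If the same options are reused across several queries, they can be collected once and unpacked into each call. This is only a sketch: `build_search_kwargs` is a hypothetical helper, the three parameter names are the ones shown above, and the `page_size` check is an assumption about sensible input, not documented Glean behavior.

```python
def build_search_kwargs(page_size=None, disable_spellcheck=None, max_snippet_size=None):
    """Collect only the search options that were explicitly set."""
    if page_size is not None and page_size < 1:
        raise ValueError("page_size must be at least 1")
    options = {
        "page_size": page_size,
        "disable_spellcheck": disable_spellcheck,
        "max_snippet_size": max_snippet_size,
    }
    # Drop unset options so the retriever's own defaults still apply.
    return {name: value for name, value in options.items() if value is not None}


search_kwargs = build_search_kwargs(page_size=5, max_snippet_size=200)
# With a configured retriever:
#   documents = retriever.invoke("quarterly sales report", **search_kwargs)
```

Leaving unset options out of the dictionary, instead of passing `None`, avoids overriding whatever defaults the retriever applies on its own.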

Contributing

For information on setting up a development environment and contributing to the project, please see CONTRIBUTING.md.

License

MIT License
