An integration package connecting Glean and LangChain
langchain-glean
This package contains the LangChain integration with Glean, an enterprise search platform. It allows you to search and retrieve information from your organization's content using LangChain.
Installation
pip install -U langchain-glean
Configuration
You need to configure your Glean credentials by setting the following environment variables:
export GLEAN_SUBDOMAIN="your-glean-subdomain"
export GLEAN_API_TOKEN="your-api-token"
export GLEAN_ACT_AS="user@example.com" # Optional: Email to act as when making requests
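Before initializing any of the classes below, it can help to confirm that the variables are actually set. The following is a minimal sketch, not part of the package: the helper name `missing_glean_vars` is hypothetical, and it assumes only what the list above states, namely that `GLEAN_SUBDOMAIN` and `GLEAN_API_TOKEN` are needed and `GLEAN_ACT_AS` is optional.

```python
import os

# Variable names taken from the configuration section above.
REQUIRED_VARS = ("GLEAN_SUBDOMAIN", "GLEAN_API_TOKEN")


def missing_glean_vars(env=None):
    """Return the required Glean variables that are unset or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_glean_vars()
    if missing:
        print("Missing Glean settings: " + ", ".join(missing))
    else:
        print("Glean credentials found.")
```

Running this check at startup should surface a missing variable with a clear message rather than a later authentication error.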
Usage
Using the Chat Model
The ChatGlean chat model lets you interact with Glean's AI chat functionality:
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_glean import ChatGlean
# Initialize the chat model (will use environment variables)
chat = ChatGlean()
# Create messages
messages = [
    SystemMessage(content="You are a helpful AI assistant."),
    HumanMessage(content="What are the company holidays this year?"),
]
# Generate a response
response = chat.invoke(messages)
print(response.content)
Streaming Responses
You can stream responses from the chat model:
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_glean import ChatGlean
# Initialize the chat model
chat = ChatGlean()
# Create messages
messages = [
    SystemMessage(content="You are a helpful AI assistant."),
    HumanMessage(content="Explain retrieval augmented generation."),
]
# Stream the response
for chunk in chat.stream(messages):
    # stream() yields message chunks; print each chunk's text as it arrives
    print(chunk.content, end="", flush=True)
Multi-turn Conversations
You can have multi-turn conversations with chat history:
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_glean import ChatGlean
# Initialize the chat model with chat saving enabled
chat = ChatGlean(save_chat=True)
# Start a conversation
conversation = [
    SystemMessage(content="You are a helpful AI assistant for our company.")
]
# First turn
conversation.append(HumanMessage(content="What are our main projects?"))
response = chat.invoke(conversation)
print(f"AI: {response.content}")
conversation.append(response)
# Second turn
conversation.append(HumanMessage(content="Which one has the highest priority?"))
response = chat.invoke(conversation)
print(f"AI: {response.content}")
# The chat_id is saved in the chat model
print(f"Chat ID: {chat.chat_id}")
Chat with RAG
You can combine the chat model with a retriever for RAG:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_glean import ChatGlean, GleanSearchRetriever
# Initialize components
retriever = GleanSearchRetriever()
chat = ChatGlean()
# Create a prompt template
prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer based on the retrieved information: {context}"),
    ("human", "{question}"),
])
# Format documents function
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)
# Create a RAG chain
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | chat
    | StrOutputParser()
)
# Run the chain
response = rag_chain.invoke("What are our company policies?")
print(response)
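The dictionary at the head of the chain sends the same input string down two branches: one retrieves and formats documents into `context`, the other passes the question through unchanged. The plain-Python sketch below mirrors that data flow without calling Glean. `fake_retriever` and `build_prompt_inputs` are hypothetical stand-ins introduced only for illustration, and the canned documents are invented sample text.

```python
from types import SimpleNamespace


def fake_retriever(query):
    """Hypothetical stand-in for the retriever; returns canned documents."""
    return [
        SimpleNamespace(page_content="Policy A: remote work is allowed."),
        SimpleNamespace(page_content="Policy B: expenses need approval."),
    ]


def format_docs(docs):
    # Same formatting function as in the chain above
    return "\n\n".join(doc.page_content for doc in docs)


def build_prompt_inputs(question):
    # Mirrors {"context": retriever | format_docs, "question": RunnablePassthrough()}
    return {
        "context": format_docs(fake_retriever(question)),
        "question": question,
    }


inputs = build_prompt_inputs("What are our company policies?")
print(inputs["context"])
```

The resulting dictionary is what fills the `{context}` and `{question}` slots of the prompt template before the chat model is called.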
Using the Retriever
The GleanSearchRetriever allows you to search and retrieve documents from Glean:
from langchain_glean.retrievers import GleanSearchRetriever
# Initialize the retriever (will use environment variables)
retriever = GleanSearchRetriever()
# Search for documents
documents = retriever.invoke("quarterly sales report")
# Process the results
for doc in documents:
    print(f"Title: {doc.metadata.get('title')}")
    print(f"URL: {doc.metadata.get('url')}")
    print(f"Content: {doc.page_content}")
    print("---")
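When results are shown to users or written to logs, a compact one-line summary per document is often easier to scan than the full content. The sketch below is illustrative only: `describe_result` is a hypothetical helper, the sample document is invented, and it assumes only the `title` and `url` metadata keys used in the loop above, with fallbacks in case either is absent.

```python
from types import SimpleNamespace


def describe_result(doc, max_chars=80):
    """Return a one-line summary: title, URL, and a shortened snippet."""
    title = doc.metadata.get("title") or "(untitled)"
    url = doc.metadata.get("url") or "(no URL)"
    # Collapse newlines and repeated spaces so the summary stays on one line
    snippet = " ".join(doc.page_content.split())
    if len(snippet) > max_chars:
        snippet = snippet[: max_chars - 3].rstrip() + "..."
    return f"{title} <{url}>: {snippet}"


# Hypothetical stand-in for a retrieved document (not real Glean output)
sample = SimpleNamespace(
    page_content="Q2 revenue grew\n compared with Q1.",
    metadata={"title": "Quarterly Sales Report"},
)
print(describe_result(sample))
```

With a real retriever, the same function could be applied to each item returned by `retriever.invoke(...)`.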
Using the Tool
The GleanSearchTool can be used in LangChain agents to search Glean:
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_glean.retrievers import GleanSearchRetriever
from langchain_glean.tools import GleanSearchTool
# Initialize the retriever (will use environment variables)
retriever = GleanSearchRetriever()
# Create the tool
glean_tool = GleanSearchTool(
    retriever=retriever,
    name="glean_search",
    description="Search for information in your organization's content using Glean.",
)
# Create an agent with the tool
llm = ChatOpenAI(model="gpt-3.5-turbo")
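prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant with access to Glean search."),
    ("user", "{input}"),
    # create_openai_tools_agent requires an agent_scratchpad variable in the prompt
    ("placeholder", "{agent_scratchpad}"),
])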
agent = create_openai_tools_agent(llm, [glean_tool], prompt)
agent_executor = AgentExecutor(agent=agent, tools=[glean_tool])
# Run the agent
response = agent_executor.invoke({"input": "Find the latest quarterly report"})
print(response["output"])
Integration with LangChain Chains
You can integrate the retriever with LangChain chains for more complex workflows:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI
from langchain_glean.retrievers import GleanSearchRetriever
# Initialize the retriever (will use environment variables)
retriever = GleanSearchRetriever()
# Create a prompt template
prompt = ChatPromptTemplate.from_template(
    """Answer the question based only on the context provided.
Context: {context}
Question: {question}"""
)
# Initialize the language model
llm = ChatOpenAI(model="gpt-3.5-turbo")
# Format documents function
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)
# Create the chain
chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
# Run the chain
result = chain.invoke("What were our Q2 sales results?")
print(result)
Advanced Usage
Chat Model Parameters
You can customize the chat model behavior with additional parameters:
from langchain_glean import ChatGlean
# Initialize with custom parameters
chat = ChatGlean(
    save_chat=True,              # Save the chat session in Glean
    chat_id="existing-chat-id",  # Continue an existing chat
    agent="GPT",                 # Specify the agent type (DEFAULT, GPT, etc.)
    mode="SEARCH",               # Specify the mode (DEFAULT, SEARCH, etc.)
    timeout=30,                  # Timeout in seconds for API requests
)
Search Parameters
You can customize your search by passing additional parameters:
# Search with additional parameters
documents = retriever.invoke(
    "quarterly sales report",
    page_size=5,              # Number of results to return
    disable_spellcheck=True,  # Disable spellcheck
    max_snippet_size=200,     # Maximum snippet size
)
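If the same search options are reused across several calls, they can be collected in one dictionary and unpacked into `invoke`. The sketch below is one possible way to do that. `build_search_kwargs` is a hypothetical helper, and its defaults and validation are assumptions chosen for illustration, not Glean's documented defaults; only the three parameter names come from the example above.

```python
def build_search_kwargs(page_size=5, disable_spellcheck=False, max_snippet_size=None):
    """Collect search options into a dict suitable for **kwargs unpacking.

    Hypothetical helper; the defaults here are illustrative assumptions.
    """
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    kwargs = {"page_size": page_size, "disable_spellcheck": disable_spellcheck}
    # Only include the snippet limit when the caller sets one
    if max_snippet_size is not None:
        kwargs["max_snippet_size"] = max_snippet_size
    return kwargs


options = build_search_kwargs(page_size=3, max_snippet_size=200)
print(options)
# With a configured retriever, the options would be passed as:
# documents = retriever.invoke("quarterly sales report", **options)
```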
Contributing
For information on setting up a development environment and contributing to the project, please see CONTRIBUTING.md.