LangChain integrations for Stack Overflow for Teams

Stack Overflow for Teams for LangChain

This library provides a basic LangChain document loader for Stack Overflow for Teams.

Document Loader Usage

Get articles from Teams Basic or Business

import os

from langchain_stack_overflow_for_teams import StackOverflowTeamsApiV3Loader

loader = StackOverflowTeamsApiV3Loader(
   access_token=os.environ.get("SO_PAT"),
   team="my team",
   content_type="articles",
)
docs = loader.load()

Get questions with answers from Teams Basic or Business

import os

from langchain_stack_overflow_for_teams import StackOverflowTeamsApiV3Loader

loader = StackOverflowTeamsApiV3Loader(
   access_token=os.environ.get("SO_PAT"),
   team="my team",
   content_type="questions",
)
docs = loader.load()

Get articles from Teams Enterprise

import os

from langchain_stack_overflow_for_teams import StackOverflowTeamsApiV3Loader

loader = StackOverflowTeamsApiV3Loader(
   endpoint="[your_site].stackenterprise.co/api",
   access_token=os.environ.get("SO_API_TOKEN"),
   content_type="articles",
)
docs = loader.load()

Get articles from a private team in Teams Enterprise

import os

from langchain_stack_overflow_for_teams import StackOverflowTeamsApiV3Loader

loader = StackOverflowTeamsApiV3Loader(
   endpoint="[your_site].stackenterprise.co/api",
   access_token=os.environ.get("SO_API_TOKEN"),
   team="my team",
   content_type="articles",
)
docs = loader.load()
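
The snippets above read the token with os.environ.get, which returns None when the variable is unset, so a missing token would only surface later as an API error. The following is a minimal sketch of a stricter lookup; require_env is a hypothetical helper name and is not part of this library.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`.

    Raises RuntimeError if the variable is unset or empty, so a missing
    token is reported before any request is made.
    (Hypothetical helper, not provided by the loader package.)
    """
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

It could then be passed to the loader as access_token=require_env("SO_PAT") or access_token=require_env("SO_API_TOKEN").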

Full Example - Questions from Teams Enterprise

This example retrieves content from a Stack Overflow for Teams Enterprise site and loads it into a LanceDB vector store for access by an LLM-based system.

""" This script demonstrates using the LangChain add_documents method to naively load all documents every time (easy, but not efficient) """
import os
from dotenv import load_dotenv
from langchain_openai import AzureOpenAIEmbeddings
from langchain_community.vectorstores import LanceDB
from langchain_text_splitters import HTMLSemanticPreservingSplitter
from langchain_stack_overflow_for_teams import StackOverflowTeamsApiV3Loader




def main():
   load_dotenv()


   embeddings = AzureOpenAIEmbeddings()
   db = LanceDB(
       table_name="docs",
       uri="./db/lancedb",
       embedding=embeddings,
   )


   # load documents
   loader = StackOverflowTeamsApiV3Loader(
       endpoint="[your_site].stackenterprise.co/api",
       access_token=os.environ.get("SO_API_TOKEN"),
       team="[your_team]",
       date_from="2021-05-01T00:00:00Z",
       sort="activity",
       order="desc",
       content_type="questions",
       is_answered="true",
       has_accepted_answer="true",
   )
   docs = loader.load()
   print(f"Loaded {len(docs)} documents")


   # chunk documents
   print("Chunking documents...")
   documents = HTMLSemanticPreservingSplitter(
       headers_to_split_on=[("h1", "Header 1"), ("h2", "Header 2")],
       max_chunk_size=1000,
       chunk_overlap=200,
       preserve_parent_metadata=True
   ).transform_documents(docs)
   print(f"Produced {len(documents)} chunks.")
   if len(documents) > 0:
       print(documents[0])


       # load embeddings into lancedb
       db.add_documents(documents)




if __name__ == "__main__":
   main()
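
The script above re-adds every document on each run, which its docstring describes as easy but not efficient. One possible way to reduce repeated work is to skip chunks whose content has already been seen. The sketch below assumes only that each document exposes a page_content string, as LangChain Document objects do; content_key and filter_unseen are hypothetical helper names, and how the set of seen keys is persisted between runs is left open.

```python
import hashlib
from typing import Iterable, List, Set


def content_key(doc) -> str:
    """Return a SHA-256 hex digest of the document's page_content."""
    return hashlib.sha256(doc.page_content.encode("utf-8")).hexdigest()


def filter_unseen(docs: Iterable, seen: Set[str]) -> List:
    """Return the documents whose content hash is not yet in `seen`.

    `seen` is updated in place, so duplicates within `docs` are also dropped.
    (Hypothetical helper, not part of LangChain or the loader package.)
    """
    fresh = []
    for doc in docs:
        key = content_key(doc)
        if key not in seen:
            seen.add(key)
            fresh.append(doc)
    return fresh
```

In the full example, db.add_documents(filter_unseen(documents, seen)) would add only new or changed chunks. This hashes content alone, so chunks removed or edited at the source are not deleted from the vector store.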

