LangChain retriever for Sourcey-generated documentation sites.
langchain-sourcey
langchain-sourcey is a LangChain retriever for Sourcey-generated
documentation sites.
It works against Sourcey's public build artefacts instead of a private hosted API:
- search-index.json for candidate discovery
- llms-full.txt for full-page content hydration
- canonical page URLs for citations
Install
pip install langchain-sourcey
Usage
from langchain_sourcey import SourceyRetriever

retriever = SourceyRetriever(
    site_url="https://docs.example.com/reference",
    top_k=4,
)

docs = retriever.invoke("How does search work?")
for doc in docs:
    print(doc.metadata["source"])
    print(doc.page_content[:160])
The site_url should point at the root of a published Sourcey docs build.
The retriever fetches search-index.json and llms-full.txt from that root.
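One detail worth checking is the trailing slash on site_url: under standard URL resolution, a relative file name replaces the last path segment of a base that does not end in "/". The sketch below shows how the two artefact URLs could be derived from the root. artifact_urls is a hypothetical helper written for illustration, not part of langchain-sourcey, and the package may resolve these URLs differently.

```python
from urllib.parse import urljoin

# Hypothetical helper for illustration; not part of langchain-sourcey.
ARTIFACTS = ("search-index.json", "llms-full.txt")


def artifact_urls(site_url: str) -> dict[str, str]:
    """Resolve the build artefact URLs relative to the docs root."""
    # Without a trailing slash, urljoin would replace the last path
    # segment ("reference") instead of appending to it.
    base = site_url if site_url.endswith("/") else site_url + "/"
    return {name: urljoin(base, name) for name in ARTIFACTS}


urls = artifact_urls("https://docs.example.com/reference")
print(urls["search-index.json"])
print(urls["llms-full.txt"])
```

If either URL returns a 404 in a browser, site_url is probably not pointing at the root of the published build.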
Output requirements
For best results, the Sourcey site should:
- publish search-index.json
- publish llms-full.txt
- set siteUrl in sourcey.config.ts so citations are canonical
If llms-full.txt is not available, the retriever falls back to extracting
plain text from the matched HTML page.
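To give a rough idea of what such a fallback involves, here is a minimal sketch of plain-text extraction from HTML using only the standard library. It is an assumption about the general technique, not the package's actual implementation; html_to_text and _TextExtractor are hypothetical names, and the real retriever may select or clean content differently.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping script and style contents."""

    SKIP = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self._chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def html_to_text(html: str) -> str:
    """Hypothetical helper: reduce an HTML page to whitespace-joined text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    "<html><head><style>p{color:red}</style></head><body>"
    "<h1>Search</h1><p>Queries run against a <b>static index</b></p>"
    "<script>var x = 1;</script></body></html>"
)
print(html_to_text(sample))
```

Text recovered this way typically includes navigation and footer content as well, which is one reason publishing llms-full.txt is the preferred path.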
Scope
This package currently ships SourceyRetriever only. A document loader is
intentionally out of scope until the retriever has proven useful in practice.