Comprehensive ServiceNow data loader for AI/LLM pipelines - Incidents, CMDB, KB, Changes, Catalog & more. Works with LangChain & LlamaIndex.
snowloader
Created by Roni Das · thetotaltechnology@gmail.com
Comprehensive ServiceNow data loader for AI/LLM pipelines - Incidents, CMDB, KB, Changes, Problems, Catalog & more.
Works with LangChain & LlamaIndex out of the box. Python 3.10-3.13.
Documentation | PyPI | GitHub
Why snowloader?
Building RAG or agentic AI on top of ServiceNow data? You need a reliable way to pull structured ITSM records into your vector store. Existing tools either cover a single table, ignore relationships, or lock you into one framework.
snowloader gives you:
- 7 loaders covering core ServiceNow tables (Incidents, Knowledge Base, CMDB, Changes, Problems, Service Catalog, Attachments)
- Async support via aiohttp for concurrent paginated fetches
- CMDB relationship traversal - concurrent graph walking with dependency mapping
- Delta sync - only fetch records updated since your last sync
- 4 auth modes - Basic, OAuth Password, OAuth Client Credentials, Bearer Token
- Production-grade - retry with backoff, rate limiting, thread safety, proxy support
- Framework-agnostic core with sync + async adapters for LangChain and LlamaIndex
- Memory-efficient streaming - generator-based pagination, never holds the full table in memory
- Built-in HTML cleaning - strips KB article HTML without extra dependencies
- Fully typed - PEP 561 compliant, mypy --strict clean
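The "memory-efficient streaming" bullet refers to generator-based pagination. The sketch below illustrates that general pattern and is not snowloader's internal code: `fetch_page`, `iter_records`, and the in-memory `FAKE_TABLE` are hypothetical stand-ins for a real paginated REST call.

```python
from collections.abc import Callable, Iterator

Record = dict[str, str]

# Stand-in for a remote table; a real loader would call the REST API instead.
FAKE_TABLE: list[Record] = [{"number": f"INC{i:07d}"} for i in range(1, 8)]


def fetch_page(offset: int, limit: int) -> list[Record]:
    """Hypothetical page fetcher: return at most `limit` records from `offset`."""
    return FAKE_TABLE[offset : offset + limit]


def iter_records(
    fetch: Callable[[int, int], list[Record]], page_size: int = 3
) -> Iterator[Record]:
    """Yield records one at a time, holding only one page in memory."""
    offset = 0
    while True:
        page = fetch(offset, page_size)
        if not page:
            return
        yield from page
        if len(page) < page_size:
            return  # a short page means the table is exhausted
        offset += page_size


if __name__ == "__main__":
    for record in iter_records(fetch_page):
        print(record["number"])
```

Because the caller pulls records through a generator, peak memory is bounded by one page rather than by the size of the table.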
Installation
# pip
pip install snowloader                 # Core only
pip install "snowloader[async]"        # + AsyncSnowConnection (aiohttp)
pip install "snowloader[langchain]"    # + LangChain adapter
pip install "snowloader[llamaindex]"   # + LlamaIndex adapter
pip install "snowloader[all]"          # Everything
# uv
uv add snowloader
uv add "snowloader[all]"
Requirements: Python 3.10+ and a ServiceNow instance with REST API access.
Quick Start
from snowloader import SnowConnection, IncidentLoader
conn = SnowConnection(
    instance_url="https://mycompany.service-now.com",
    username="admin",
    password="password",
)
loader = IncidentLoader(connection=conn, query="active=true^priority<=2")
for doc in loader.lazy_load():
    print(doc.page_content[:200])
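The query argument in the Quick Start is a ServiceNow encoded query, in which `^` joins conditions with a logical AND. If you assemble queries from several conditions, a small helper such as the one below may be convenient. `build_query` is a hypothetical name and is not part of snowloader.

```python
def build_query(*conditions: str) -> str:
    """Join conditions with '^', which ServiceNow encoded queries treat as AND."""
    # Drop empty or whitespace-only conditions so optional filters can be passed freely.
    cleaned = [c.strip() for c in conditions if c and c.strip()]
    return "^".join(cleaned)


if __name__ == "__main__":
    print(build_query("active=true", "priority<=2"))  # active=true^priority<=2
```

The resulting string can be passed as the loader's query argument.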
All 7 Loaders
Every loader shares the same interface: load() returns a list, lazy_load() yields one document at a time, and load_since(datetime) fetches only records updated since the given time.
from snowloader import (
IncidentLoader, # IT incidents
KnowledgeBaseLoader, # KB articles (HTML auto-cleaned)
CMDBLoader, # Configuration items + relationships
ChangeLoader, # Change requests
ProblemLoader, # Problem records
CatalogLoader, # Service catalog items
AttachmentLoader, # File attachments (sys_attachment)
)
Async API
Pull large tables faster with AsyncSnowConnection. Pages are fetched concurrently against a shared aiohttp session, which delivers a 10-50x speedup on production-sized extractions.
import asyncio
from snowloader import AsyncSnowConnection, AsyncIncidentLoader
async def main() -> None:
    async with AsyncSnowConnection(
        instance_url="https://mycompany.service-now.com",
        username="admin",
        password="password",
        page_size=500,
        concurrency=16,
    ) as conn:
        loader = AsyncIncidentLoader(connection=conn, query="active=true")
        async for doc in loader.alazy_load():
            print(doc.page_content[:200])

asyncio.run(main())
Every sync loader has a matching Async* variant: AsyncIncidentLoader, AsyncKnowledgeBaseLoader, AsyncCMDBLoader, AsyncChangeLoader, AsyncProblemLoader, AsyncCatalogLoader, and AsyncAttachmentLoader. The framework adapters expose async variants too (AsyncServiceNow*Loader for LangChain, AsyncServiceNow*Reader for LlamaIndex).
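To show what "pages are fetched concurrently" with a concurrency limit can look like, here is a minimal asyncio sketch that uses a semaphore to cap in-flight requests. It illustrates the general technique and is not snowloader's implementation: `fetch_page` and `fetch_all` are hypothetical, the sleep stands in for network latency, and the total record count is assumed to be known up front.

```python
import asyncio

PAGE_SIZE = 2
FAKE_TABLE = [f"INC{i:07d}" for i in range(1, 8)]


async def fetch_page(offset: int) -> list[str]:
    """Hypothetical page fetch; the sleep stands in for network latency."""
    await asyncio.sleep(0.01)
    return FAKE_TABLE[offset : offset + PAGE_SIZE]


async def fetch_all(total: int, concurrency: int = 4) -> list[str]:
    """Fetch every page concurrently, with at most `concurrency` in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def bounded(offset: int) -> list[str]:
        async with semaphore:
            return await fetch_page(offset)

    offsets = range(0, total, PAGE_SIZE)
    # gather() returns results in argument order, so page order is preserved.
    pages = await asyncio.gather(*(bounded(o) for o in offsets))
    return [record for page in pages for record in page]


if __name__ == "__main__":
    print(asyncio.run(fetch_all(len(FAKE_TABLE))))
```

The semaphore keeps the request rate predictable, which matters when the target instance enforces rate limits.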
Attachments
The AttachmentLoader pulls records from the sys_attachment table. By default it returns metadata only (file name, content type, size, parent record). Pass download=True to fetch each file's bytes during iteration.
from snowloader import SnowConnection, AttachmentLoader
conn = SnowConnection(instance_url="...", username="...", password="...")
# Metadata only
loader = AttachmentLoader(connection=conn, query="table_name=kb_knowledge")
for doc in loader.lazy_load():
    print(doc.metadata["file_name"], doc.metadata["size_bytes"])
# Download a specific file
loader.download_to("att_sys_id", "./out/diagram.png")
# Eager download with size cap
loader = AttachmentLoader(
    connection=conn,
    download=True,
    max_size_bytes=10 * 1024 * 1024,
)
for doc in loader.lazy_load():
    blob = doc.metadata.get("content_bytes")
Journal Entries (Work Notes & Comments)
Include the full investigation history from sys_journal_field:
loader = IncidentLoader(connection=conn, query="active=true", include_journals=True)
for doc in loader.lazy_load():
    print(doc.page_content)
# Incident: INC0000007
# Summary: Need access to sales DB
# ...
# [work_notes] 2024-06-01 09:15:00 by alice
# Restarted Exchange service, monitoring.
#
# [comments] 2024-06-01 09:20:00 by alice
# We are working on the issue.
Also works with ChangeLoader and ProblemLoader.
LangChain Adapter
from snowloader import SnowConnection
from snowloader.adapters.langchain import ServiceNowIncidentLoader
conn = SnowConnection(instance_url="...", username="...", password="...")
loader = ServiceNowIncidentLoader(connection=conn, query="active=true")
docs = loader.load() # list[langchain_core.documents.Document]
# Use with any vector store
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
vectorstore = FAISS.from_documents(docs, OpenAIEmbeddings())
LlamaIndex Adapter
from snowloader.adapters.llamaindex import ServiceNowIncidentReader
reader = ServiceNowIncidentReader(connection=conn, query="active=true")
docs = reader.load_data() # list[llama_index.core.schema.Document]
from llama_index.core import VectorStoreIndex
index = VectorStoreIndex.from_documents(docs)
Delta Sync
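from datetime import datetime, timezone
loader = IncidentLoader(connection=conn)
# First run: note the start time before loading, so records updated
# while the load is in progress are picked up by the next sync
last_sync = datetime.now(timezone.utc)
docs = loader.load()  # First run: everything
updated = loader.load_since(last_sync)  # Next runs: only changes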
CMDB Relationship Traversal
loader = CMDBLoader(
    connection=conn,
    ci_class="cmdb_ci_server",
    include_relationships=True,
)
for doc in loader.lazy_load():
    # -> db-prod-01 (Depends on::Used by)
    # <- load-balancer-01 (Depends on::Used by)
    print(doc.page_content)
Authentication
# Basic Auth (development)
conn = SnowConnection(instance_url="...", username="admin", password="pass")
# OAuth Client Credentials (recommended for production)
conn = SnowConnection(instance_url="...", client_id="...", client_secret="...")
# OAuth Password Grant
conn = SnowConnection(instance_url="...", client_id="...", client_secret="...",
                      username="...", password="...")
# Bearer Token (pre-obtained)
conn = SnowConnection(instance_url="...", token="eyJhbG...")
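Rather than hardcoding credentials, you may prefer to read them from the environment. The sketch below builds the keyword arguments shown above from whichever variables are set. The SNOW_* variable names and the `connection_kwargs` helper are assumptions made for this example and are not defined by snowloader.

```python
import os
from collections.abc import Mapping


def connection_kwargs(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Pick an auth mode from whichever variables are set.

    The SNOW_* variable names are assumptions made for this sketch.
    """
    kwargs = {"instance_url": env["SNOW_INSTANCE_URL"]}
    if "SNOW_TOKEN" in env:  # Bearer token (pre-obtained)
        kwargs["token"] = env["SNOW_TOKEN"]
        return kwargs
    if "SNOW_CLIENT_ID" in env:  # OAuth: client credentials or password grant
        kwargs["client_id"] = env["SNOW_CLIENT_ID"]
        kwargs["client_secret"] = env["SNOW_CLIENT_SECRET"]
    if "SNOW_USERNAME" in env:  # Basic auth, or the OAuth password grant
        kwargs["username"] = env["SNOW_USERNAME"]
        kwargs["password"] = env["SNOW_PASSWORD"]
    if len(kwargs) == 1:
        raise ValueError("no ServiceNow credentials found in environment")
    return kwargs


# Intended use (not executed here):
#   conn = SnowConnection(**connection_kwargs())
```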
Configuration
| Parameter | Default | Description |
|---|---|---|
| page_size | 100 | Records per API call (1-10,000) |
| timeout | 60 | HTTP timeout in seconds |
| max_retries | 3 | Retry attempts for 429/502/503/504 |
| retry_backoff | 1.0 | Base delay between retries (doubles each attempt) |
| request_delay | 0.0 | Min seconds between requests (rate limiting) |
| display_value | "true" | sysparm_display_value setting |
| proxy | None | HTTP/HTTPS proxy URL |
| verify | True | SSL verification (path for custom CA bundle) |
See the full documentation for all parameters.
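To make the interaction between max_retries and retry_backoff concrete, the sketch below computes a doubling delay schedule from the defaults in the table. It assumes the first retry waits exactly the base delay, which may differ from snowloader's actual schedule. `backoff_delays` and `should_retry` are hypothetical helpers.

```python
# Status codes listed as retryable in the configuration table.
RETRYABLE_STATUS = {429, 502, 503, 504}


def backoff_delays(max_retries: int = 3, retry_backoff: float = 1.0) -> list[float]:
    """Delay before each retry: the base delay, doubled on every attempt."""
    return [retry_backoff * (2**attempt) for attempt in range(max_retries)]


def should_retry(status_code: int, attempt: int, max_retries: int = 3) -> bool:
    """Retry only the listed status codes, and only while attempts remain."""
    return status_code in RETRYABLE_STATUS and attempt < max_retries


if __name__ == "__main__":
    print(backoff_delays())  # [1.0, 2.0, 4.0]
```

Under this assumption the defaults add at most 7 seconds of waiting before a request is reported as failed.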
Roadmap
| Version | Feature | Status |
|---|---|---|
| v0.2 | Async support (aiohttp + async for) - 10-50x faster | Shipped |
| v0.2 | Attachment loader (sys_attachment downloads) | Shipped |
| v0.3 | Direct vector store streaming (Pinecone, Weaviate, Chroma) | Planned |
| v0.3 | Checkpoint and resume for large loads | Planned |
| v1.0 | Custom field mapping for customized instances | Planned |
Contributing
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Write tests first (we use pytest + responses for HTTP mocking)
- Ensure the quality gate passes:
  ruff check src/ tests/ && ruff format --check src/ tests/ && mypy src/snowloader/ && pytest tests/ -x
- Open a pull request
Author
Created and maintained by Roni Das - thetotaltechnology@gmail.com
License
MIT - see LICENSE for details.