Sync anything to Open WebUI Knowledge Bases

📚 oikb

Keep your Open WebUI Knowledge Bases in sync. Point oikb at a local directory, a GitHub repo, a Confluence space, an S3 bucket, or any of the 44 supported sources. Incremental SHA-256 diffing means only new and modified files are uploaded.

Important: Requires Open WebUI 0.9.6 or later.

Quick Start

pip install oikb

export OPEN_WEBUI_URL=http://localhost:3000
export OPEN_WEBUI_API_KEY=sk-your-api-key

# Sync a directory to a Knowledge Base
oikb sync ./docs --kb-id your-kb-id

# Or watch for changes and auto-sync continuously
oikb watch ./docs --kb-id your-kb-id

Commands

Command               Description
oikb sync <source>    Incremental sync to a Knowledge Base
oikb watch <dir>      Watch for changes and auto-sync
oikb daemon           Long-lived scheduler with HTTP API
oikb diff <source>    Preview what a sync would do
oikb history          View sync history
oikb ls               List files in a Knowledge Base
oikb status           Show KB info and file count
oikb reset            Delete all files in a Knowledge Base
oikb config           Manage saved URL and API key

Daemon

Run oikb daemon for production deployments. It reads .oikb.yaml and syncs each source on a schedule.

oikb daemon --port 8080

Features:

  • Scheduled sync: configurable per-source intervals (30m, 1h, 6h)
  • Webhooks: instant sync on push via /webhooks/github, /webhooks/gitlab, /webhooks/slack, /webhooks/confluence
  • Health checks: GET /health for Docker/K8s readiness probes
  • Sync history: GET /history, a queryable log of all syncs
  • On-demand sync: POST /sync/{source} to trigger immediately
  • OpenAPI tool server: add http://oikb:8080 as a Tool Server in Open WebUI (Settings → Connections) and let the LLM trigger syncs, check status, and query history

# .oikb.yaml
sources:
  - source: github:owner/repo
    kb-id: team-wiki
    interval: 1h
    webhook: true

  - source: confluence:ENG
    kb-id: handbook
    interval: 6h
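
The daemon's HTTP endpoints listed under Features can also be called from a script. The sketch below only builds the requests and is not part of oikb. It assumes the daemon listens on http://localhost:8080 and that the {source} path segment is the source string from .oikb.yaml, percent-encoded. Both are assumptions, and the helper names are hypothetical, so check the daemon's OpenAPI description before relying on them.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen


def health_request(base_url: str) -> Request:
    """Build a GET /health request."""
    return Request(f"{base_url.rstrip('/')}/health", method="GET")


def history_request(base_url: str) -> Request:
    """Build a GET /history request."""
    return Request(f"{base_url.rstrip('/')}/history", method="GET")


def sync_request(base_url: str, source: str) -> Request:
    """Build a POST /sync/{source} request.

    Assumption: the source identifier is percent-encoded into the path.
    """
    return Request(f"{base_url.rstrip('/')}/sync/{quote(source, safe='')}", method="POST")


def send(request: Request, timeout: float = 10.0) -> tuple[int, str]:
    """Send a request and return (HTTP status, raw response body)."""
    with urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")
```

With a daemon running, `send(sync_request("http://localhost:8080", "github:owner/repo"))` would trigger a sync for that source. The body is returned as raw text because the response format is not documented here.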

Docker Compose

services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"

  oikb:
    image: ghcr.io/open-webui/oikb:latest
    environment:
      - OPEN_WEBUI_URL=http://open-webui:8080
      - OPEN_WEBUI_API_KEY=${OPEN_WEBUI_API_KEY}
    volumes:
      - ./.oikb.yaml:/app/.oikb.yaml:ro
    command: daemon
    ports:
      - "8080:8080"
    depends_on:
      - open-webui
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health/ready"]
      interval: 30s
      timeout: 5s

44 Connectors

Category         Sources
Code Repos       GitHub, GitLab, Bitbucket
Cloud Storage    S3, GCS, Azure Blob, Dropbox, R2, Google Drive, SharePoint, Egnyte, Oracle Cloud
Wikis & KBs      Confluence, Notion, BookStack, Discourse, GitBook, Guru, Outline, Slab, Document360, DokuWiki, Google Sites
Ticketing        Jira, Linear, Zendesk, Freshdesk, Asana, ClickUp, Airtable, ServiceNow, ProductBoard
Messaging        Slack, Discord, Microsoft Teams, Gmail, Zulip
Meetings         Gong, Fireflies
Forums           XenForo
Sales & CRM      Salesforce, HubSpot
Web              Website / Sitemap crawler

oikb sync github:owner/repo --kb-id your-kb-id
oikb sync confluence:ENG --kb-id your-kb-id
oikb sync s3://bucket/prefix --kb-id your-kb-id
oikb sync servicenow:incident --kb-id your-kb-id

Some connectors need an optional extra: pip install oikb[gdrive], pip install oikb[s3], or pip install oikb[all] for everything.

Multi-KB Routing

Route files from a single source to different Knowledge Bases by glob pattern:

sources:
  - source: github:owner/repo
    routes:
      "docs/**/*.md": docs-kb
      "src/**": code-kb

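To make the routing rule concrete, here is a minimal sketch of glob-based routing in Python. It is not oikb's implementation. It assumes that `**/` matches zero or more directories, that `*` stays within one path segment, and that the first matching pattern wins. oikb's actual glob semantics and tie-breaking may differ, and `glob_to_regex` and `route` are hypothetical helper names.

```python
import re
from typing import Optional


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a path glob into a regex (assumed semantics, see above)."""
    i, out = 0, []
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:[^/]+/)*")  # zero or more whole directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")  # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")  # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")  # one character within a segment
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def route(path: str, routes: dict[str, str]) -> Optional[str]:
    """Return the KB id of the first pattern that matches, else None."""
    for pattern, kb_id in routes.items():
        if glob_to_regex(pattern).fullmatch(path):
            return kb_id
    return None


ROUTES = {"docs/**/*.md": "docs-kb", "src/**": "code-kb"}
print(route("docs/guide/intro.md", ROUTES))  # docs-kb
print(route("src/app/main.py", ROUTES))      # code-kb
print(route("README.md", ROUTES))            # None
```

Under these assumptions, a file that matches no pattern is not routed anywhere. Whether oikb skips such files or reports them is not specified here.
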
Selective Sync Filters

Narrow what gets synced with include/exclude globs:

sources:
  - source: github:owner/repo
    kb-id: docs-only
    filter:
      include: ["docs/**/*.md", "*.txt"]
      exclude: ["drafts/**"]
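
The interaction between include and exclude can be sketched in a few lines of Python. This is an illustration under assumptions, not oikb's code. It assumes that an exclude match always wins over an include match and that an empty include list means "include everything". It uses the standard library's `fnmatch`, whose `*` also matches `/`, so a pattern like `*.txt` matches files at any depth here, which may not be how oikb treats it.

```python
from fnmatch import fnmatchcase


def is_selected(path: str, include: list[str], exclude: list[str]) -> bool:
    """Keep a path if it matches any include glob and no exclude glob."""
    included = not include or any(fnmatchcase(path, g) for g in include)
    excluded = any(fnmatchcase(path, g) for g in exclude)
    return included and not excluded


INCLUDE = ["docs/**/*.md", "*.txt"]
EXCLUDE = ["drafts/**"]

for candidate in ["docs/guide/intro.md", "notes.txt", "drafts/wip.txt", "src/main.py"]:
    print(candidate, is_selected(candidate, INCLUDE, EXCLUDE))
```

In this sketch `drafts/wip.txt` matches the `*.txt` include but is dropped because it also matches `drafts/**`.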

Configuration

Settings are resolved in this order, highest priority first:

  1. CLI flags (--url, --token)
  2. Environment variables (OPEN_WEBUI_URL, OPEN_WEBUI_API_KEY)
  3. Config file (~/.config/oikb/config.yaml)
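
The precedence above amounts to a "first source that has a value wins" lookup. The sketch below shows that idea in Python. It is not oikb's code: `resolve_setting` is a hypothetical helper, and the config-file key `url` is an assumed name, since the file's schema is not shown here.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    cli_value: Optional[str],
    env_name: str,
    file_config: Mapping[str, str],
    file_key: str,
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the CLI value, else the environment variable, else the file value."""
    if cli_value is not None:
        return cli_value
    if environ.get(env_name):
        return environ[env_name]
    return file_config.get(file_key)


# Hypothetical values: a config file with a "url" key and an environment override.
file_config = {"url": "http://from-config:3000"}
env = {"OPEN_WEBUI_URL": "http://from-env:3000"}

print(resolve_setting("http://from-flag:3000", "OPEN_WEBUI_URL", file_config, "url", env))
print(resolve_setting(None, "OPEN_WEBUI_URL", file_config, "url", env))
print(resolve_setting(None, "OPEN_WEBUI_URL", file_config, "url", {}))
```

The three calls print the flag value, the environment value, and the config-file value in turn.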

History

oikb history                    # Table view
oikb history --json             # JSON output
oikb history --errors           # Failed syncs only
oikb history --clear --days 7   # Prune old entries

GitHub Actions

- name: Sync docs to Open WebUI
  uses: docker://ghcr.io/open-webui/oikb:latest
  with:
    args: sync /github/workspace/docs --kb-id ${{ secrets.KB_ID }}
  env:
    OPEN_WEBUI_URL: ${{ secrets.OPEN_WEBUI_URL }}
    OPEN_WEBUI_API_KEY: ${{ secrets.OPEN_WEBUI_API_KEY }}

How It Works

  1. Scan source, compute checksums
  2. Send manifest to Open WebUI /sync/diff
  3. Delete stale files, create missing directories
  4. Upload only new and modified files
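
The checksum comparison behind these steps can be sketched as follows. This is a local illustration only. In oikb the comparison is done by Open WebUI's /sync/diff endpoint, whose request and response formats are not shown here, so the manifest shape (relative path mapped to SHA-256 hex digest) and the function names are assumptions.

```python
import hashlib
from pathlib import Path


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's POSIX-style relative path to its SHA-256 hex digest."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def diff_manifests(local: dict[str, str], remote: dict[str, str]) -> dict[str, list[str]]:
    """Classify paths as new, modified, or stale by comparing checksums."""
    return {
        "new": sorted(p for p in local if p not in remote),
        "modified": sorted(p for p in local if p in remote and local[p] != remote[p]),
        "stale": sorted(p for p in remote if p not in local),
    }
```

Files listed under "new" and "modified" are the ones that would be uploaded, and those under "stale" are the ones that would be deleted. Unchanged files appear in none of the lists, which is what keeps the sync incremental.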

License

MIT. See LICENSE for details.
