OmniSkill

Python 3.12+ | License: Apache-2.0

omni-skill

A super skill generator that turns CSV and Markdown datasets into ready-to-use Agentic-RAG skills with a single command.

Overview

OmniSkill analyzes your dataset directory and generates:

  • SKILL.md — Skill specification document that tells LLMs how to use the knowledge base
  • search.py — Standalone Python script for BM25-based retrieval against the dataset
  • datasets/ — Symlinked source data for the generated skill

It also provides a CLI for manually creating skill scaffolding and searching knowledge bases.
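
For readers unfamiliar with BM25, the ranking idea behind the generated search.py can be sketched roughly as below. This is a minimal, self-contained illustration of Okapi BM25 and not OmniSkill's actual code; the function names, the tokenizer, and the `k1`/`b` defaults are assumptions chosen for the example.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score every document against the query with Okapi BM25 (illustrative)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    # Average document length; fall back to 1.0 if every document is empty.
    avg_len = sum(len(t) for t in tokenized) / n_docs or 1.0
    # Document frequency: how many documents contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    query_terms = set(tokenize(query))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF; the "+ 1.0" keeps the value non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Documents that contain more of the query terms, and rarer ones, score higher; documents sharing no term with the query score zero.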

Installation

Using uv (Recommended)

uv add omniskill

Using pip

pip install omniskill

From Source

git clone https://github.com/longcipher/omni-skill.git
cd omni-skill
uv sync --all-groups

Quick Start

Generate a Skill from a Dataset

Point OmniSkill at any directory containing CSV and/or Markdown files:

omniskill generate examples/backend-api-master

This analyzes the dataset files and produces a complete skill under skills/<dataset-name>/:

skills/backend-api-master/
  SKILL.md      # Skill specification for LLMs
  search.py     # Standalone search script
  datasets/     # Symlink to source data
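
To make the layout above concrete, here is a rough sketch that builds the same three entries by hand, including the datasets/ symlink. It is only an illustration: `scaffold_skill` is a hypothetical helper, the file contents are placeholders, and the real generator may do this differently. On Windows, creating symlinks can require extra privileges.

```python
from pathlib import Path


def scaffold_skill(dataset_dir: Path, output_dir: Path) -> Path:
    """Create output_dir with placeholder files and a datasets/ symlink (hypothetical helper)."""
    dataset_dir = dataset_dir.resolve()
    output_dir.mkdir(parents=True, exist_ok=True)
    # Placeholder contents; the real files are produced by the generator.
    (output_dir / "SKILL.md").write_text(f"# {output_dir.name}\n", encoding="utf-8")
    (output_dir / "search.py").write_text("# placeholder\n", encoding="utf-8")
    link = output_dir / "datasets"
    if link.is_symlink():
        link.unlink()  # replace a stale link from an earlier run
    # Link to the source data instead of copying it.
    link.symlink_to(dataset_dir, target_is_directory=True)
    return output_dir
```

Because datasets/ is a link rather than a copy, edits to the source files are visible through the skill directory without regenerating it.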

Use the Generated Skill

The generated search.py can be run directly:

python skills/backend-api-master/search.py "API design best practices"

Or use the CLI:

omniskill search "API design best practices" --skill-dir skills/backend-api-master
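
If the generated script needs to be called from other Python code, one option is to run it as a subprocess using the same command line shown above. The sketch below assumes only that search.py takes the query as its first positional argument and writes results to standard output; `build_search_command` and `run_search` are hypothetical helper names.

```python
import subprocess
import sys
from pathlib import Path


def build_search_command(skill_dir: str, query: str) -> list[str]:
    """Build the argv for `python <skill_dir>/search.py "<query>"`."""
    return [sys.executable, str(Path(skill_dir) / "search.py"), query]


def run_search(skill_dir: str, query: str) -> str:
    """Run the generated script and return whatever it prints."""
    result = subprocess.run(
        build_search_command(skill_dir, query),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing the query as a single list element avoids shell quoting problems with multi-word queries.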

Custom Options

# Custom skill name and output directory
omniskill generate my-datasets/ --name my-skill --output out/my-skill/

# Verbose mode shows dataset analysis details
omniskill generate my-datasets/ --verbose

Architecture

graph TD
    subgraph "1. Dataset Input"
        CSV[CSV Files] --> GEN[omniskill generate]
        MD[Markdown Files] --> GEN
    end

    subgraph "2. Generator"
        GEN --> ANALYZE[Analyze Dataset]
        ANALYZE --> GEN_SCRIPT[Generate search.py]
        ANALYZE --> GEN_MD[Generate SKILL.md]
        ANALYZE --> LINK[Symlink datasets/]
    end

    subgraph "3. Generated Skill"
        GEN_SCRIPT --> SEARCH_PY[search.py]
        GEN_MD --> SKILL_MD[SKILL.md]
        LINK --> DATASETS[datasets/]
    end

    subgraph "4. Runtime"
        SEARCH_PY --> ENGINE[SearchEngine]
        ENGINE --> INDEXER[CSV / MD Indexer]
        ENGINE --> BM25[BM25 Searcher]
        BM25 --> ASM[PromptAssembler]
        ASM --> |XML / Markdown / llms.txt| OUTPUT[Formatted Context]
    end

    SKILL_MD -. "instructs LLM" .-> SEARCH_PY
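
The "CSV / MD Indexer" box in the runtime stage turns files into searchable documents. The README does not say how the files are split, so the following is only a plausible sketch under stated assumptions: one document per CSV row and one per heading-delimited Markdown section. The `Document` class and both function names are hypothetical.

```python
import csv
import io
import re
from dataclasses import dataclass


@dataclass
class Document:
    source: str    # file the text came from
    doc_type: str  # "csv" or "markdown"
    text: str


def index_csv(name: str, content: str) -> list[Document]:
    """One document per CSV row, rendered as 'column: value' pairs."""
    reader = csv.DictReader(io.StringIO(content))
    docs = []
    for row in reader:
        # Skip empty cells so they do not add noise to the index.
        text = "; ".join(f"{k}: {v}" for k, v in row.items() if v)
        docs.append(Document(name, "csv", text))
    return docs


def index_markdown(name: str, content: str) -> list[Document]:
    """One document per section, splitting before each ATX heading line."""
    sections = re.split(r"(?m)^(?=#{1,6} )", content)
    return [Document(name, "markdown", s.strip()) for s in sections if s.strip()]
```

Splitting at row and section granularity keeps each retrieved unit small enough to inject into a prompt, which is the usual motivation for chunking before BM25 ranking.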

CLI Reference

generate — Generate a Skill from Datasets

omniskill generate <dataset-dir> [options]

Arguments:

  • dataset-dir — Path to a directory containing CSV and/or Markdown files

Options:

  • --name, -n — Skill name (defaults to the dataset directory name)
  • --output, -o — Output directory (defaults to skills/<skill-name>/)
  • --verbose, -v — Show dataset analysis details

Example:

omniskill generate data/api-specs/ --name api-assistant --output skills/api-assistant/

create — Create Skill Scaffolding

Create an empty skill directory with template files:

omniskill create <skill-name> [--force]

Options:

  • --force, -f — Overwrite existing skill directory

search — Search a Knowledge Base

omniskill search <query> --skill-dir <path> [options]

Options:

  • --skill-dir, -d — Path to the skill directory (required)
  • --format, -f — Output format: xml, markdown (default: xml)
  • --limit, -l — Maximum number of results (default: 10)
  • --type, -t — Filter by document type: csv, markdown
  • --tag — Filter by tag; repeatable, and a result must carry every tag given (AND logic)
  • --metadata — Include BM25 scores in output
  • --verbose, -v — Enable verbose output
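
The filtering options above combine as follows: a type filter, an AND over all requested tags, then a cap on the number of results. The sketch below illustrates those semantics on an in-memory list; the `Result` class and `filter_results` function are hypothetical and do not mirror OmniSkill's internal types.

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    text: str
    score: float
    doc_type: str  # "csv" or "markdown"
    tags: set[str] = field(default_factory=set)


def filter_results(results, doc_type=None, tags=(), limit=10):
    """Apply --type, repeated --tag (AND logic), and --limit semantics."""
    required = set(tags)
    kept = [
        r
        for r in results
        # AND logic: every requested tag must be present on the result.
        if (doc_type is None or r.doc_type == doc_type) and required <= r.tags
    ]
    # Highest score first, then truncate to the requested limit.
    kept.sort(key=lambda r: r.score, reverse=True)
    return kept[:limit]
```

With AND logic, each additional tag can only narrow the result set, never widen it.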

Python API

Generate a Skill Programmatically

from omniskill.core.generator import generate_skill

analysis = generate_skill(
    dataset_dir="data/my-datasets",
    skill_name="my-skill",
    output_dir="skills/my-skill",
)

print(f"Generated {analysis.skill_name} with {analysis.total_documents} documents")

SearchEngine

Index and search directories of CSV/Markdown files:

from omniskill.core.engine import SearchEngine
from omniskill.core.assembler import OutputFormat, PromptAssembler

engine = SearchEngine()
engine.index_directory("skills/my-skill/datasets")

results = engine.search("API design", limit=10)

assembler = PromptAssembler()
print(assembler.assemble(results, output_format=OutputFormat.XML))

Dataset Analysis

Analyze a dataset directory without generating files:

from omniskill.core.generator import analyze_dataset

analysis = analyze_dataset("data/my-datasets")
print(f"CSV files: {len(analysis.csv_files)}")
print(f"Markdown files: {len(analysis.markdown_files)}")
print(f"Total documents: ~{analysis.total_documents}")
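
As a rough picture of what such an analysis might compute, the standalone sketch below walks a directory, collects CSV and Markdown files, and estimates a document count. The counting rule (one document per CSV data row, one per Markdown file) is an assumption made for illustration, as are the `DatasetSummary` and `summarize_dataset` names; only the three attribute names come from the example above.

```python
import csv
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class DatasetSummary:
    csv_files: list[Path] = field(default_factory=list)
    markdown_files: list[Path] = field(default_factory=list)
    total_documents: int = 0


def summarize_dataset(dataset_dir: str) -> DatasetSummary:
    """Collect CSV/Markdown files and estimate the number of documents."""
    summary = DatasetSummary()
    for path in sorted(Path(dataset_dir).rglob("*")):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix == ".csv":
            summary.csv_files.append(path)
            with path.open(newline="", encoding="utf-8") as fh:
                rows = sum(1 for _ in csv.reader(fh))
            # Assumed rule: the first row is a header, every other row is a document.
            summary.total_documents += max(rows - 1, 0)
        elif suffix in {".md", ".markdown"}:
            summary.markdown_files.append(path)
            # Assumed rule: one document per Markdown file.
            summary.total_documents += 1
    return summary
```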

Output Formats

OmniSkill supports three output formats for assembled search results:

Format   | CLI Flag               | Description
XML      | --format xml           | Structured <context_injection> with <rules> and <reference> sections
Markdown | --format markdown      | Human-readable sections with source attribution
llms.txt | (in generated scripts) | Follows the llms.txt spec for LLM consumption
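
To give a feel for the XML format, the sketch below builds a <context_injection> element with <rules> and <reference> children using the standard library. Only those three element names come from the table above; the `rule` and `document` child elements, the `source` attribute, and the `assemble_xml` function are assumptions for illustration, so the real output may be shaped differently.

```python
import xml.etree.ElementTree as ET


def assemble_xml(rules: list[str], references: list[tuple[str, str]]) -> str:
    """Build an illustrative <context_injection> document."""
    root = ET.Element("context_injection")
    rules_el = ET.SubElement(root, "rules")
    for rule in rules:
        # Hypothetical child element: one <rule> per instruction.
        ET.SubElement(rules_el, "rule").text = rule
    ref_el = ET.SubElement(root, "reference")
    for source, text in references:
        # Hypothetical child element: one <document> per search hit.
        doc = ET.SubElement(ref_el, "document", source=source)
        doc.text = text
    ET.indent(root)  # pretty-print; requires Python 3.9+
    return ET.tostring(root, encoding="unicode")
```

Building the tree with ElementTree rather than string concatenation means characters such as `&` and `<` in retrieved text are escaped automatically.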

Contributing

Development Setup

git clone https://github.com/longcipher/omni-skill.git
cd omni-skill
uv sync --all-groups

Common Commands

just format      # Format code
just lint        # Run linter
just test        # Run unit tests
just bdd         # Run BDD tests
just test-all    # Run all tests
just build       # Build package
just typecheck   # Run type checker

Adding New Features

  1. Write a failing Gherkin scenario in features/*.feature
  2. Write a failing pytest test for the inner domain logic
  3. Implement the feature
  4. Re-run just test and just bdd to verify

License

Apache-2.0
