Owlsight is a command-line tool combining open-source AI models with Python functionality to create a powerful AI assistant.

These details have not been verified by PyPI

Project description

Owlsight

Owlsight is a command-line tool that combines Python programming with open-source language models. It offers an interactive interface that allows you to execute Python code, shell commands, and use an AI assistant in one unified environment. Next to that, Owlsight offers an extensive set of tools in its backend-API, which enables you to use most of the existing CLI-functionaliy in your own Python scripts.

Why owlsight?

Picture this: you are someone who dabbles in Python occasionally. Or you frequently use generative AI to accelerate your workflow, whether for generating code or working with data. Often, this involves a tedious process—copying and pasting code between ChatGPT and your IDE, repeatedly switching contexts.

What if you could eliminate this friction?

Owlsight brings Python and generative AI together in an intuitive Command Line Interface, streamlining your workflow by integrating them into a single, unified platform. No more toggling between windows, no more manual code transfers. With Owlsight, you get the full power of Python and AI, all in one place—simplifying your process and boosting productivity. Owlsight has been designed to be a swiss-army knife for Python and AI with a core focus on open-source models, allowing you to execute code directly from model prompts and access this code directly from the Python interpreter.

Installation of the CLI:

You can install Owlsight using pip:

pip install owlsight

Installation Options and Dependencies

A basic installation includes only the core dependencies needed for the transformers library and basic functionality. For access to specific features, you will need to install optional dependency groups:

Optional Feature Modules

To add GGUF model support (using llama-cpp-python):

pip install owlsight[gguf]

To add ONNX model support (optimized model inference):

pip install owlsight[onnx]

To add multimodal functionality (image processing, OCR):

pip install owlsight[multimodal]

To add web search and scraping capabilities:

pip install owlsight[search]

To add voice control functionality:

pip install owlsight[voice]

For operating in an offline environment with tika-server.jar file, enabling you to use the DocumentReader class (which includes Apache Tika functionality):

pip install owlsight[offline]

Comprehensive Installation

To install all packages and features:

pip install owlsight[all]

Available Features Based on Installation

Feature	Basic Install	Required Extra
Transformers models	✓	-
GGUF models	✗	[gguf]
ONNX models	✗	[onnx]
Image processing	✗	[multimodal]
Web search/scraping	✗	[search]
Voice control	✗	[voice]
Offline document reading (using Apache Tika)	✗	[offline]
Development	✗	[dev]

Security and Performance Notes

Using multiple flags is a conscious design choice to give users more control over the behavior of the application and prevent "dependency hell".
The application is designed to gracefully handle missing dependencies - you will receive helpful warning/error messages if you attempt to use a feature without the required dependencies.
Some libraries like llama-cpp-python and pytorch may require specific configurations depending on your hardware.
If you want most useful features out of the box, it is recommended to pip install Owlsight with the [all] option. This will install owlsight with the following flags: gguf, onnx, multimodal, search
Recommended python versions: 3.10, 3.11, 3.12. Lower or higher python versions may not support all features, especially due to package incompatibilities.

Usage

After installation, launch Owlsight in the terminal by running the following command:

owlsight

This will present you with some giant ASCII-art of an owl and information which tells you whether you have access to an active GPU (assuming you use CUDA).

Then, you are presented with the mainmenu:

Current choice:
> how can I assist you?
shell
python
config: main
save
load
clear history
quit

A choice can be made in the mainmenu by pressing the UP and DOWN arrow keys. In the config menu, the LEFT and RIGHT arrow keys can be used to navigate between the different sections. From the config sections, press "back" to go back to the mainmenu. Press ENTER to select an option. If you want to change an option, press ENTER to confirm the change.

Keyboard Shortcuts

When working with the editable option, the following keyboard shortcuts are available:

Ctrl+A: Select all text in the current editable field
Ctrl+C: Copy selected text
Ctrl+Y: Paste selected text

Getting Started

Now, lets start out by loading a model. Go to config and toggle a few times to the right to reach the huggingface section. Choose a task like text-generation and press ENTER.

Then, use the search option to search for a model. You can first type in keywords before searching, like "llama gguf". This will give you results from the Huggingface modelhub which are related to models in the llama-family in GGUf format.

Press ENTER to see the top_k results. Use the LEFT and RIGHT arrow keys in the select_model option to select a model and press ENTER to load it.

Available Commands

The following available commands are available from the mainmenu:

How can I assist you: Ask a question or give an instruction. By default, model responses are streamed to the console.
shell : Execute shell commands. This can be useful for pip installing python libraries inside the application.
python : Enter a Python interpreter. Press exit() to return to the mainmenu.
config: main : Modify the main, model , generate or rag configuration settings.
save/load : Save or load a configuration file.
clear history : Clear the chat history and cache folder.
quit : Exit the application.

Voice Control

Owlsight supports voice control functionality when installed with pip install owlsight[voice]. This allows you to control the application using voice commands.

To enable voice control, use the --voice flag when starting Owlsight:

owlsight --voice

You can customize the voice control behavior using JSON-based configuration:

# Custom key mappings (spoken words to keyboard actions)
owlsight --voice --word-to-key '{
    "backward": "left",
    "forward": "right",
    "save": ["ctrl", "s"],
    "select all": ["ctrl", "a"]
}'

# Custom word substitutions
owlsight --voice --word-to-word '{
    "print": "print()",
    "function": "def my_function():",
    "exit": "exit()"
}'

# Advanced voice control settings
owlsight --voice --voicecontrol-kwargs '{
    "cmd_cooldown": 0.5,
    "debug": true,
    "language": "en",
    "model": "base.en",
    "key_press_interval": 0.1,
    "typing_interval": 0.05
}'

These options can be combined to create a fully customized voice control experience, which you can also utilize outside of the application.

Example Workflow

You can combine Python variables defined in the Python Interpreter together with language models in Owlsight through special double curly-brackets syntax. For example:

python > a = 42
How can I assist you? > How much is {{a}} * 5?

answer -> 210

Additionally, you can also ask a model to write pythoncode and access that in the python interpreter.

From a model response, all generated python code will be extracted and can be edited or executed afterwards. This choice is always optional. After execution, the defined objects will be saved in the global namespace of the python interpreter for the remainder of the current active session. This is a powerful feature, which allows build-as-you-go for a wide range of complex tasks.

Example:

How can I assist you? > Can you write a function which reads an Excel file?

-> model writes a function called read_excel

python > excel_data = read_excel("path/to/excel")

MultiModal Support

In Owlsight 2, special multimodal support is available for certain models that require additional input, like images or audio. In the backend, this is made possible with the MultiModalProcessorTransformers class. In the CLI, this can be done by setting the config.model.model_id to a multimodal model from the Huggingface modelhub. Keep in mind that this model should be a Pytorch model (so not GGUF or ONNX). For convenience, it is recommended to select a model through the new Huggingface API in the configuration-settings (read below for more information).

The following tasks are supported for multimodal models:

image-to-text
automatic-speech-recognition
visual-question-answering
document-question-answering

These models require additional input, which can be passed in the prompt. The syntax for passing mediatypes can be done through special double-square brackets syntax, like so:

How can I assist you?

[[image:path/to/file.jpg]]

The supported mediatypes are: image, audio. For example, to pass an image to a document-question-answering model, you can use the following syntax:

How can I assist you?

What is the first sentence in this image? [[image:path/to/image.jpg]]

Python interpreter

Next to the fact that objects generated by model-generated code can be accessed, the Python interpreter also has some useful default functions, starting with the "owl_" suffix. These serve as utilityfunctions.

These are:

owl_import: Import Python file to current namespace
owl_read: Read file content from any supported format
owl_edit: Edit file content
owl_terminal: Execute shell commands. Useful for tool usage by an agent
owl_scrape: Scrape urls
owl_show: Display active objects in the Python namespace
owl_write: Write content to text file
owl_history: Display model chat history
owl_models: Display loaded HuggingFace models in cache directory
owl_press: Press keys for automation tasks
owl_save_namespace: Save namespace to .dill file
owl_load_namespace: Load namespace from .dill file
owl_tools: Show available functions for tool calling
owl_search: Search and get results from the web using DuckDuckGo's API
owl_search_and_scrape: Search and scrape the web using DuckDuckGo's API. Uses both the owl_search and owl_scrape functions combined.
owl_create_document_searcher: Create a DocumentSearcher instance with a given set of documents and a text splitter. This class is great for usage in a RAG scenario.

Configurations

Owlsight uses a configuration file in JSON-format to adjust various parameters. The configuration is divided into five main sections: main, model, generate, rag and huggingface. Here's an overview of the application architecture:

Main Menu:

assistant: Chat with the loaded model. Use {{expression}} to pass python code directly. Or e.g. [[image: path/to/image.jpg]] to pass an image to the model
shell: Execute shell commands
python: Enter Python interpreter
config: Configuration settings
- main settings:
  - back: Return to previous menu
  - max_retries_on_error: Maximum number of retries for Python code error recovery. This parameter is only used when prompt_retry_on_error is set to True., Options: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, Type: OptionType.TOGGLE
  - prompt_retry_on_error: Whether to prompt before retrying on error. Set this to True to avoid direct Python code execution on error!, Options: False, True, Type: OptionType.TOGGLE
  - prompt_code_execution: Whether to prompt before executing code. Set this to True to avoid direct Python code execution!, Options: False, True, Type: OptionType.TOGGLE
  - track_model_usage: Show metrics after a model response. Tracks GPU/CPU usage, amount of generated words and responsetime of model. NOTE: GPU tracking only works for PyTorch models., Options: False, True, Type: OptionType.TOGGLE
  - extra_index_url: Additional URL for Python package installation. Useful for example when installing python packages (through pip) from private repositories, Type: OptionType.EDITABLE
  - python_compile_mode: Compile mode in the Python Interpreter (main menu): 'exec' is suited for defining code blocks, 'single' for direct execution, Options: exec, single, Type: OptionType.TOGGLE
  - dynamic_system_prompt: Experimental feature: The model will first act as Prompt Engineer to create a new system prompt based on user input., Options: False, True, Type: OptionType.TOGGLE
  - default_config_on_startup: Link to a configuration file that will be loaded on startup., Type: OptionType.EDITABLE
  - sequence_on_loading: A list of key sequences to execute when loading the configuration. Uses owl_press functionality., Type: OptionType.EDITABLE
- model settings:
  - back: Return to previous menu
  - model_id: Model identifier or path. The most important parameter in the configuration, as this will load the model to be used, Type: OptionType.EDITABLE
  - apply_chat_history: Toggle the inclusion of saved chat history in the prompt. Enable for chat models, disable for instruct models., Options: False, True, Type: OptionType.TOGGLE
  - system_prompt: System prompt defining model behavior, Type: OptionType.EDITABLE
  - model_kwargs: Additional parameters passed during model initialization. For llama-cpp, these get passed to llama_cpp.Llama. For transformers, these get passed to transformers.pipeline, Type: OptionType.EDITABLE
  - transformers__device: Device for transformers model, Options: None, cpu, cuda, mps, Type: OptionType.TOGGLE
  - transformers__quantization_bits: Quantization bits for transformers model, Options: None, 4, 8, 16, Type: OptionType.TOGGLE
  - transformers__stream: Whether to stream input to transformers model, Options: False, True, Type: OptionType.TOGGLE
  - gguf__filename: GGUF model filename, Type: OptionType.EDITABLE
  - gguf__verbose: Verbose output for GGUF model, Options: False, True, Type: OptionType.TOGGLE
  - gguf__n_ctx: Context length for GGUF model, Options: 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536, 131072, 262144, 524288, Type: OptionType.TOGGLE
  - gguf__n_gpu_layers: Number of layers from the model which are offloaded to the GPU, Options: -1, 0, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, Type: OptionType.TOGGLE
  - gguf__n_batch: Batch size to be used by GGUF model, Options: 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, Type: OptionType.TOGGLE
  - gguf__n_cpu_threads: Number of CPU threads to be used by GGUF model., Options: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, Type: OptionType.TOGGLE
  - onnx__model_dir: Directory containing local ONNX model, Type: OptionType.EDITABLE
  - onnx__verbose: Verbose output for ONNX model, Options: False, True, Type: OptionType.TOGGLE
  - onnx__n_cpu_threads: Number of CPU threads to be used by ONNX model, Options: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, Type: OptionType.TOGGLE
- generate settings:
  - back: Return to previous menu
  - stopwords: Stopwords that stop text generation. This can be useful for getting more control over when modelgeneration should stop. Pass these like ['stop', 'word'], Type: OptionType.EDITABLE
  - max_new_tokens: Maximum amount of tokens to generate, Options: 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536, 131072, 262144, 524288, Type: OptionType.TOGGLE
  - temperature: Temperature for model generation, Options: 0.0, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95, 1.0, Type: OptionType.TOGGLE
  - generation_kwargs: Additional generation parameters, like top_k, top_p, etc. Pass these like {'top_k': 4, 'top_p': 0.9}, Type: OptionType.EDITABLE
- rag settings:
  - back: Return to previous menu
  - active: Whether RAG for python libraries is active. If True, the search-results will be implicitly added as context to the modelprompt and when pressing ENTER, search-results will be shown, Options: False, True, Type: OptionType.TOGGLE
  - target_library: Target python library for to use for RAG. If the library is not installed in the active environment, a warning will be showed with available options, Type: OptionType.EDITABLE
  - top_k: Number of most matching RAG results to return, based on search query, Options: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, Type: OptionType.TOGGLE
  - sentence_transformer_weight: Weight for the embedding model. TFIDF-weight is 1 - sentence_transformer_weight, Options: 0.0, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95, 1.0, Type: OptionType.TOGGLE
  - sentence_transformer_name_or_path: Name or path to a sentence-transformer model, which is used for embedding, Type: OptionType.EDITABLE
  - search: RAG search query. Press ENTER to show the top_k results. Only used when active is True, Type: OptionType.EDITABLE
- agentic settings:
  - back: Return to previous menu
  - active: Toggle whether the agentic system is active. Available tools concerns an existing subset of functions (and every new defined one) in the Python Interpreter namespace., Options: False, True, Type: OptionType.TOGGLE
  - additional_information: Additional information specifically for the Tool agent. E.g. 'Do NOT use owl_scrape and owl_search, because there is no internet connection', Type: OptionType.EDITABLE
  - exclude_tools: Comma-separated list of tools (as string) to exclude from the available tools. These tools can be used by the Tool agent. E.g. ['owl_scrape,owl_search'], Type: OptionType.EDITABLE
  - config_per_agent: Set configurations per agent, allowing unique models for each agent type. For Example: {'PlanAgent': 'path/to/config.json', 'PlanValidationAgent': 'path/to/config.json', 'ToolCreationAgent': 'path/to/config.json', 'ToolSelectionAgent': 'path/to/config.json', 'ObservationAgent': 'path/to/config.json', 'FinalAgent': 'path/to/config.json'}, Type: OptionType.EDITABLE
- huggingface settings:
  - back: Return to previous menu
  - search: Search for a model on the Hugging Face Hub by pressing ENTER. Keywords can be used optionally to finetune searchresults, e.g. 'llama 3b gguf', Type: OptionType.EDITABLE
  - top_k: Top number of Hugging Face results to return. The results will be sorted by highest score first, Options: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, Type: OptionType.TOGGLE
  - select_model: Select and load a model from the Hugging Face Hub by toggling through the options found by search, Type: OptionType.TOGGLE
  - task: Filter Hugging Face models by task. When using search, the results will be filtered directly by chosen task, Options: None, text-generation, text2text-generation, translation, summarization, image-to-text, automatic-speech-recognition, visual-question-answering, document-question-answering, Type: OptionType.TOGGLE
save: Save current configuration as JSON-file
load: Load a configuration from a JSON-file
clear history: Clear owlsight cache (directory called '.owlsight') and chat history
quit: Exit application

Here is an example of what the default configuration looks like:

{
    "main": {
        "max_retries_on_error": 3,
        "prompt_retry_on_error": true,
        "prompt_code_execution": true,
        "track_model_usage": false,
        "extra_index_url": "",
        "python_compile_mode": "single",
        "dynamic_system_prompt": false,
        "default_config_on_startup": "",
        "sequence_on_loading": []
    },
    "model": {
        "model_id": "",
        "apply_chat_history": true,
        "system_prompt": "",
        "model_kwargs": {},
        "transformers__device": null,
        "transformers__quantization_bits": null,
        "transformers__stream": true,
        "gguf__filename": "",
        "gguf__verbose": false,
        "gguf__n_ctx": 2048,
        "gguf__n_gpu_layers": 0,
        "gguf__n_batch": 8,
        "gguf__n_cpu_threads": 8,
        "onnx__model_dir": "",
        "onnx__verbose": false,
        "onnx__n_cpu_threads": 8
    },
    "generate": {
        "stopwords": [],
        "max_new_tokens": 2048,
        "temperature": 0.7,
        "generation_kwargs": {}
    },
    "rag": {
        "active": false,
        "target_library": "",
        "top_k": 10,
        "sentence_transformer_weight": 0.0,
        "sentence_transformer_name_or_path": "Alibaba-NLP/gte-base-en-v1.5",
        "search": ""
    },
    "agentic": {
        "active": false,
        "additional_information": "",
        "exclude_tools": [],
        "config_per_agent": {}
    },
    "huggingface": {
        "search": "",
        "top_k": 10,
        "select_model": "",
        "task": null
    }
}

Configuration files can be saved (save) and loaded (load) through the main menu.

Changing configurations

To update a configuration, simply modify the desired value and press ENTER to confirm the change. Please note that only one configuration setting can be updated at a time, and the change will only go into effect once ENTER has been pressed.

Temporary environment

During an Owlsight session, a temporary environment is created within the homedirectory, called ".owlsight_packages". Newly installed python packages will be installed here. This folder will be removed if the session ends. If you want to persist installed packages, simply install them outside of Owlsight.

Error Handling and Auto-Fix

Owlsight automatically tries to fix and retry any code that encounters a ModuleNotFoundError by installing the required package and re-executing the code. It can also attempt to fix errors in its own generated code. This feature can be controlled by the max_retries_on_error parameter in the configuration file.

Agentic system

Owlsight implements a multistep agentic system, which allows for more complex tasks to be executed than would normally be possible with one language model. This agentic system is accessible through the CLI by setting the config.agentic.active parameter to true.

The agents consist of: ['PlanAgent', 'PlanValidationAgent', 'ToolCreationAgent', 'ToolSelectionAgent', 'ObservationAgent', 'FinalAgent']

First, an Executionplan is created by the PlanAgent. This plan contain several steps, where each step is assigned to a downstream agent.

To make sure the plan is valid, the plan is validated by the PlanValidationAgent.

The mainagents for executing steps from the executionplan are ToolSelectionAgent and ToolCreationAgent.

ToolSelectionAgent is the main agent that is used to select and run tools. The following tools from the Python interpreter are available out of the box for ToolSelectionAgent to use: owl_read, owl_write, owl_edit, owl_search, owl_scrape, owl_terminal

After every ToolSelectionAgent step, the ObservationAgent is used to summarize the result of the tool execution. This so that the information provided by ToolSelectionAgent is shorter, richer and free of noise. This makes the information better suited for downstream agents.

ToolCreationAgent is the main agent that is used to create new tools. Using this agent, a new tool can be created dynamicly in Python and added to the AVAILABLE TOOLS registry. This tool can then later be used by ToolSelectionAgent.

The final agent is the FinalAgent, which is used to provide the final response to the user based on all previous steps.

Here is a diagram illustrating the agentic flow within Owlsight-CLI (config:agentic):

API Examples

Owlsight can also be used as a library in Python scripts. The main classes are the TextGenerationProcessor family, which can be imported from the owlsight package.

Here is a simple example of how to use it:

from owlsight import TextGenerationProcessorGGUF
# If you want to use another type of text-generation model, you can import the other classes: TextGenerationProcessorONNX, TextGenerationProcessorTransformers

processor = TextGenerationProcessorGGUF(
    model_id=r"path	o\Phi-3-mini-128k-instruct.Q5_K_S.gguf",
)

question = "What is the meaning of life?"

for token in processor.generate_stream(question):
    print(token, end="", flush=True)

Alternatively, there is a lot more to explore in the owlsight package. Here is an example on how to use the DocumentSearcher class for simple document retrieval:

from owlsight import DocumentSearcher, SentenceTextSplitter, SemanticTextSplitter

docs = {
    "doc1": "Quantum mechanics describes nature at atomic scales, introducing wave-particle duality and entanglement.",
    "doc2": "General relativity redefines gravity as spacetime curvature, predicting black holes and gravitational waves.",
    "doc3": "Quantum gravity aims to unify quantum mechanics and relativity, with theories like string theory and LQG.",
    "doc4": "String theory is a framework for understanding the universe, with models like the Minkowski space-time and the Einstein-Hilbert action.",
    "doc5": "LQG is a framework for quantum gravity, with models like the Einstein action and the black hole metric."
}

# Experiment with different text splitters
# splitter = SemanticTextSplitter()
splitter = SentenceTextSplitter(n_sentences=2)

searcher = DocumentSearcher(
    documents=docs,
    text_splitter=splitter,
    cache_dir="quantum_gravity",
    cache_dir_suffix="test",
)

query = "black holes in quantum gravity"
results = searcher.search(query, top_k=2)

Or a more advanced example of similarity search, where some websites are scraped and being splitted in chunks based on their semantic similarity.

from owlsight import OwlDefaultFunctions, SemanticTextSplitter, DocumentSearcher

if __name__ == "__main__":
    owl_funcs = OwlDefaultFunctions({})

    # List of AI/ML related URLs to scrape
    urls = [
        "https://plato.stanford.edu/entries/artificial-intelligence/",  # Stanford's AI Philosophy
        "https://www.nature.com/articles/s42256-019-0088-2",  # Nature's Deep Learning overview
    ]

    scraped_text = owl_funcs.owl_scrape(urls)
    model_name = "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
    doc_splitter = SemanticTextSplitter(percentile=0.99, target_chunk_length=400, model_name=model_name)
    doc_searcher = DocumentSearcher(scraped_text, sentence_transformer_model=model_name, text_splitter=doc_splitter)
    df = doc_searcher.search("reinforcement learning")
    seperator = "-" * 100
    for idx, row in enumerate(df.iterrows(), start=1):
        print(seperator)
        score = row[1]["aggregated_score"]
        print(f"Rank: {idx} (Score: {score:.2f})")
        print(f"Document name: {row[1]['document_name']}")
        print(row[1]["document"])

API Documentation

The following section details all the objects and functions available in the Owlsight API:

Classes

TextGenerationProcessorOnnx

class TextGenerationProcessorOnnx(model_id: str, onnx__verbose: bool = False, onnx__n_cpu_threads: int = 8, onnx__model_dir: Optional[str] = None, token: Optional[str] = None, apply_chat_history: bool = False, system_prompt: Optional[str] = None, model_kwargs: Optional[dict] = None, apply_tools: Optional[List[dict]] = None, **kwargs: Any) -> None

Text generation processor using ONNX Runtime optimized models.

This processor enables text generation using ONNX-optimized models, which can run on both CPU and GPU. Supports both local models and models from Hugging Face Hub.

Parameters

model_id : str Path to local ONNX model or Hugging Face model ID onnx__verbose : bool, default=False Enable verbose ONNX Runtime logging onnx__n_cpu_threads : int, default=8 Number of CPU threads for computation onnx__model_dir : str, optional Specific model directory when multiple valid ones exist token : str, optional Hugging Face token for private models apply_chat_history : bool, default=False Whether to maintain conversation history system_prompt : str, optional System prompt prepended to all inputs model_kwargs : dict, optional Additional keyword arguments to pass to the model. Default is None.

Notes

ONNX models typically offer better CPU performance than PyTorch
Thread count affects CPU performance significantly
Models must be ONNX-optimized versions of transformers models

Examples:

--------
>>> # Load local ONNX model
>>> processor = TextGenerationProcessorOnnx("path/to/model")
>>>
>>> # Load from Hugging Face
>>> processor = TextGenerationProcessorOnnx(
...     "onnx-community/Llama-2-7B-Instruct-ONNX",
...     onnx__n_cpu_threads=12
... )

Methods:

apply_chat_template(self, input_data: str, tokenizer: transformers.tokenization_utils.PreTrainedTokenizer) -> str
- Apply chat template to the input text.
clear_history(self) -> None
- Clear the chat history.
generate(self, input_data: str, max_new_tokens: int = 512, temperature: float = 0.0, stopwords: Optional[List[str]] = None, buffer_wordsize: int = 10, generation_kwargs: Optional[Dict[str, Any]] = None) -> str
- Generate text response for the given input.
generate_stream(self, input_data: str, max_new_tokens: int = 512, temperature: float = 0.0, generation_kwargs: Optional[Dict[str, Any]] = None)
- Stream generated text tokens one by one.
get_history(self) -> List[Dict[str, str]]
- Get complete chat history of inputs and outputs and system prompt.
get_max_context_length(self) -> Optional[int]
- Get maximum context length for the model.
list_valid_repo_files(repo_id: str) -> List[str]
pre_validate_model_id(model_id: str, onnx__model_dir: str)
- Validate the model_id and model_directory before using snapshot_download.
update_history(self, input_data: str, generated_text: str) -> None
- Update the history with the input and generated text.

TextGenerationProcessorTransformers

class TextGenerationProcessorTransformers(model_id: str, transformers__device: Optional[str] = None, transformers__quantization_bits: Optional[int] = None, transformers__stream: bool = True, bnb_kwargs: Optional[dict] = None, tokenizer_kwargs: Optional[dict] = None, task: Optional[str] = None, apply_chat_history: bool = False, system_prompt: str = '', apply_tools: Optional[List[dict]] = None, model_kwargs: Optional[dict] = None, **kwargs)

Text generation processor using transformers library.

Methods:

apply_chat_template(self, input_data: str, tokenizer: transformers.tokenization_utils.PreTrainedTokenizer) -> str
- Apply chat template to the input text.
clear_history(self) -> None
- Clear the chat history.
generate(self, input_data: str, max_new_tokens: int = 512, temperature: float = 0.0, stopwords: Optional[List[str]] = None, generation_kwargs: Optional[Dict[str, Any]] = None) -> str
- Generate text response.
generate_stream(self, input_data: str, max_new_tokens: int = 512, temperature: float = 0.0, stopwords: Optional[List[str]] = None, generation_kwargs: Optional[Dict[str, Any]] = None) -> Generator[str, NoneType, NoneType]
- Generate streaming text response.
get_history(self) -> List[Dict[str, str]]
- Get complete chat history of inputs and outputs and system prompt.
get_max_context_length(self) -> Optional[int]
- Retrieve the maximum context length of the model.
pipe_call(self, input_data: Union[str, List[str]], **gen_kwargs) -> Any
- Call the pipeline with input data and kwargs, supporting batch processing.
prepare_generation(self, input_data: str, max_new_tokens: int, temperature: float, stopwords: Optional[List[str]], generation_kwargs: Optional[Dict[str, Any]], streaming: bool = False, apply_chat_template: bool = True) -> Tuple[str, Dict[str, Any]]
- Prepare generation parameters.
update_history(self, input_data: str, generated_text: str) -> None
- Update the history with the input and generated text.

TextGenerationProcessorGGUF

class TextGenerationProcessorGGUF(model_id: str, gguf__filename: str = '', gguf__verbose: bool = False, gguf__n_ctx: Optional[int] = None, gguf__n_gpu_layers: int = 0, gguf__n_batch: Optional[int] = None, gguf__n_cpu_threads: Optional[int] = None, apply_chat_history: bool = False, system_prompt: str = '', model_kwargs: Dict[str, Any] = None, apply_tools: Optional[List[dict]] = None, **kwargs)

Text generation processor for GGUF models using llama-cpp.

This processor enables efficient text generation using GGUF-quantized models, which are optimized for CPU and GPU inference. Supports both local models and models from Hugging Face Hub.

Parameters

model_id : str Path to local GGUF model or Hugging Face model ID gguf__filename : str, optional Specific GGUF file to load when using Hugging Face model ID gguf__verbose : bool, default=False Enable verbose logging from llama-cpp gguf__n_ctx : int, optional Context window size. Larger values allow longer conversations but use more memory gguf__n_gpu_layers : int, default=0 Number of layers to offload to GPU. Set >0 for GPU acceleration gguf__n_batch : int, optional Batch size for generation. Increase for faster generation, at the cost of memory. gguf__n_cpu_threads : int, optional The number of CPU threads to use for generation. Increase for much faster generation if multiple cores are available. apply_chat_history : bool, default=False Whether to maintain conversation history system_prompt : str, default="" System prompt prepended to all inputs model_kwargs : Optional[Dict[str, Any]] Additional arguments passed for the model. These get passed to transformers.pipeline function as model_kwargs argument. Default is None.

Notes

GPU acceleration requires llama-cpp-python build specifically with CUDA support
Context size (n_ctx) affects memory usage significantly
For optimal performance, adjust n_batch and n_cpu_threads based on hardware

Examples:

--------
>>> # Load local GGUF model
>>> processor = TextGenerationProcessorGGUF("path/to/model.gguf", gguf__n_gpu_layers=20)
>>>
>>> # Load from Hugging Face with GPU
>>> processor = TextGenerationProcessorGGUF(
...     "TheBloke/Llama-2-7B-GGUF",
...     gguf__filename="llama-2-7b.Q4_K_M.gguf",
...     gguf__n_gpu_layers=32
... )

Methods:

apply_chat_template(self, input_data: str) -> List[Dict[str, str]]
- Apply chat template to the input text.
clear_history(self) -> None
- Clear the chat history.
generate(self, input_data: str, max_new_tokens: int = 512, temperature: float = 0.1, stopwords: Optional[List[str]] = None, generation_kwargs: Optional[Dict[str, Any]] = None) -> str
- Generate text response for the given input.
generate_stream(self, input_data: str, max_new_tokens: int = 512, temperature: float = 0.1, generation_kwargs: Optional[Dict[str, Any]] = None) -> Generator[str, NoneType, NoneType]
- Stream generated text tokens one by one.
get_history(self) -> List[Dict[str, str]]
- Get complete chat history of inputs and outputs and system prompt.
get_max_context_length(self) -> Optional[int]
- Retrieve the maximum context length of the model.
update_history(self, input_data: str, generated_text: str) -> None
- Update the history with the input and generated text.

MultiModalProcessorTransformers

class MultiModalProcessorTransformers(model_id: str, task: str, apply_chat_history: bool = False, system_prompt: str = '', **kwargs: Any) -> None

Multimodal processor using Hugging Face transformers.

This processor handles text generation tasks that involve multiple modalities (text, images, audio) using Hugging Face transformer models. It combines the MediaPreprocessor for handling media inputs with text generation capabilities.

Parameters

model_id : str Identifier for the Hugging Face model to use task : str Task type, must be one of HUGGINGFACE_MEDIA_TASKS apply_chat_history : bool, default=False Whether to maintain chat history system_prompt : str, default="" System prompt to use for generation **kwargs : dict Additional arguments passed to TextGenerationProcessorTransformers

Notes

Supports various multimodal tasks (VQA, image captioning, etc.)
Handles media preprocessing automatically
Integrates with Hugging Face's transformers library
Manages memory efficiently for large media files

Examples:

--------
>>> processor = MultiModalProcessorTransformers(
...     model_id="dandelin/vilt-b32-finetuned-vqa", task="visual-question-answering"
... )
>>> media_obj = MediaObject(path="image-of-car.jpg", tag="image")
>>> result = processor.generate("What color is the car in this image:", media_objects={"image1": media_obj})

Methods:

apply_chat_template(self, input_data: str, tokenizer: transformers.tokenization_utils.PreTrainedTokenizer) -> str
- Apply chat template to the input text.
clear_history(self) -> None
- Clear the chat history.
generate(self, input_data: str, media_objects: Dict[str, owlsight.utils.custom_classes.MediaObject], stopwords: Optional[List[str]] = None, max_new_tokens: int = 512, temperature: float = 0.0, generation_kwargs: Optional[Dict[str, Any]] = None) -> str
- Generate text based on input text and media objects.
get_history(self) -> List[Dict[str, str]]
- Get complete chat history of inputs and outputs and system prompt.
get_max_context_length(self)
- Retrieve the maximum context length of the model.
preprocess_input(self, input_data: Union[str, bytes, pathlib.Path], question: Optional[str] = None) -> Any
- Preprocess media input data for the model.
update_history(self, input_data: str, generated_text: str) -> None
- Update the history with the input and generated text.

PythonLibSearcher

class PythonLibSearcher(*args, **kwargs)

A singleton class for searching Python library documentation with caching capabilities. Maintains document and engine caches throughout the owlsight session.

Methods:

clear_cache(self, library: Optional[str] = None)
- Clear the document and engine caches.
search(self, library: str, query: str, top_k: int = 5, cache_dir: Optional[str] = None, as_context: bool = True, tfidf_weight: float = 1.0, sentence_transformer_weight: float = 0.0, sentence_transformer_model: str = 'Alibaba-NLP/gte-base-en-v1.5') -> Union[pandas.core.frame.DataFrame, str]
- Search Python library documentation with caching for documents and search engines.

SentenceTextSplitter

class SentenceTextSplitter(n_sentences: int = 3, n_overlap: int = 0)

Split text into chunks based on sentences.

Methods:

split_and_clean_text(text: str) -> List[str]
- Split a longer text into sentences and clean them.
split_documents(self, documents: Dict[str, str], **kwargs) -> Dict[str, str]
- Split documents into chunks of n sentences with overlap.
split_text_in_sentences(text: str) -> List[str]
- Split a longer text into sentences, while keeping account edgecases.

SemanticTextSplitter

class SemanticTextSplitter(model_name: str = 'Alibaba-NLP/gte-base-en-v1.5', window_size: int = 0, percentile: float = 0.9, device: Optional[str] = None, target_chunk_length: Optional[int] = None, sentence_transformer_kwargs: Optional[Dict[str, Any]] = None)

Split text into chunks based on semantic similarity breakpoints.

Methods:

set_model(self, model: Union[str, ~SentenceTransformer]) -> None
- Set or update the model used for generating embeddings.
split_documents(self, documents: Dict[str, str], show_progress_bar: bool = True, **kwargs) -> Dict[str, str]
- Split documents using semantic breakpoint detection.

DocumentSearcher

class DocumentSearcher(documents: Dict[str, str], sentence_transformer_model: str = 'Alibaba-NLP/gte-base-en-v1.5', sentence_transformer_batch_size: int = 64, text_splitter: Optional[owlsight.rag.text_splitters.TextSplitter] = None, cache_dir: Optional[str] = None, cache_dir_suffix: Optional[str] = None, device: Optional[str] = None, sentence_transformer_kwargs: Optional[Dict[str, Any]] = None) -> None

Document search engine using an ensemble of TFIDF and Sentence Transformer methods.

This class provides document search capability by combining traditional TF-IDF with embeddings from Sentence Transformer-based models. The idea behind this is two-fold:

TFIDF can capture relevant words an embedding model was not trained on.
Embeddings can capture context better than TFIDF.

Order in __init__is like so: [splitting in chunks (optional)] [TF-IDF] [Sentence Transformer: create embeddings and cache as .pkl files]

And then use the search method to combine the results: [Combine TF-IDF and Sentence Transformer results]

Methods:

search(self, query: str, top_k: int = 20, sentence_transformer_weight: float = 0.7, tfidf_weight: float = 0.3, as_context: bool = False) -> Union[pandas.core.frame.DataFrame, str]
- Search documents using the configured ensemble methods.

DocumentReader

class DocumentReader(supported_extensions: Optional[List[str]] = None, ignore_patterns: Optional[List[str]] = None, ocr_enabled: bool = True, timeout: int = 5, text_only: bool = True, tika_server_jar_path: Optional[str] = None)

A class for reading text content from files using Apache Tika.

Supports a wide variety of file formats and provides streaming capabilities for processing large directories.

Examples:

--------
>>> reader = DocumentReader()
>>> for filename, content in reader.read_directory("path/to/docs"):
...     print(f"Processing {filename}...")
...     process_content(content)

Methods:

is_supported_file(self, filepath: str) -> bool
- Check if a file is supported based on its extension and ignore patterns.
read_directory(self, directory: str, recursive: bool = True) -> Generator[Tuple[str, str], NoneType, NoneType]
- Read all supported files in a directory and yield their content.
read_file(self, file_source: Union[str, bytes]) -> str
- Read and extract text content from either a file path or file content buffer.
should_ignore_file(self, filepath: str) -> bool
- Check if a file should be ignored based on gitignore-style patterns.

HashingVectorizerSearchEngine

class HashingVectorizerSearchEngine(documents: Dict[str, str], cache_dir: Optional[str] = None, cache_dir_suffix: Optional[str] = None, **hashing_kwargs: Any)

Search engine using Hashing Vectorizer for memory-efficient search.

This search engine uses feature hashing for vectorization, making it memory-efficient and suitable for large document collections.

Parameters

documents : Dict[str, str] Dictionary mapping document IDs to their content cache_dir : str, optional Directory to cache hash matrices cache_dir_suffix : str, optional Suffix for cache directory name **hashing_kwargs Additional arguments passed to sklearn.feature_extraction.text.HashingVectorizer

Notes

Memory-efficient, suitable for large datasets
No inverse transform capability
Constant memory usage regardless of vocabulary size
Small chance of hash collisions

Examples:

--------
>>> docs = {
...     "doc1": "Large text document...",
...     "doc2": "Another large document..."
... }
>>> engine = HashingVectorizerSearchEngine(
...     docs,
...     n_features=(2**16)
... )
>>> results = engine.search("specific terms", top_k=1)

Methods:

create_index(self) -> None
- Create search index from documents.
get_full_cache_path(self) -> pathlib.Path
- Generate a deterministic and safe cache path, preserving metadata in filename.
get_suffix_filename(self) -> str
- Get the suffix filename.
load_data(self) -> Optional[Any]
- Load data from cache.
save_data(self, data: Any)
- Save data to cache.
search(self, query: str, top_k: int = 3) -> List[owlsight.rag.custom_classes.SearchResult]
- Search documents using the query.

TFIDFSearchEngine

class TFIDFSearchEngine(documents: Dict[str, str], cache_dir: Optional[str] = None, cache_dir_suffix: Optional[str] = None, **tfidf_kwargs: Any) -> None

Search engine using TF-IDF (Term Frequency-Inverse Document Frequency).

This search engine uses traditional TF-IDF vectorization for keyword-based search, making it effective for finding documents with specific terms.

Parameters

documents : Dict[str, str] Dictionary mapping document IDs to their content cache_dir : str, optional Directory to cache TF-IDF matrices cache_dir_suffix : str, optional Suffix for cache directory name **tfidf_kwargs Additional arguments passed to sklearn.feature_extraction.text.TfidfVectorizer

Notes

Fast and memory-efficient
Good for exact keyword matching
Supports n-grams and custom tokenization
Caches TF-IDF matrices for better performance

Examples:

--------
>>> docs = {
...     "doc1": "Python programming basics",
...     "doc2": "Advanced Python concepts"
... }
>>> engine = TFIDFSearchEngine(docs, ngram_range=(1, 2))
>>> results = engine.search("python basics", top_k=1)

Methods:

create_index(self) -> None
- Create search index from documents.
get_full_cache_path(self) -> pathlib.Path
- Generate a deterministic and safe cache path, preserving metadata in filename.
get_suffix_filename(self) -> str
- Get the suffix filename.
load_data(self) -> Optional[Any]
- Load data from cache.
save_data(self, data: Any)
- Save data to cache.
search(self, query: str, top_k: int = 3) -> List[owlsight.rag.custom_classes.SearchResult]
- Search documents using the query.

SentenceTransformerSearchEngine

class SentenceTransformerSearchEngine(documents: Dict[str, str], model_name: str = 'Alibaba-NLP/gte-base-en-v1.5', pooling_strategy: Literal['mean', 'max', None] = 'mean', device: Optional[str] = None, cache_dir: Optional[str] = None, cache_dir_suffix: Optional[str] = None, batch_size: int = 64, sentence_transformer_kwargs: Optional[Dict[str, Any]] = None)

Search engine using Sentence Transformer embeddings.

This search engine uses neural embeddings to find semantically similar documents, making it effective for concept-based search rather than just keyword matching.

Methods:

create_index(self) -> None
- Create search index by computing embeddings for all documents.
get_full_cache_path(self) -> pathlib.Path
- Generate a deterministic and safe cache path, preserving metadata in filename.
get_suffix_filename(self) -> str
- Get the suffix filename.
load_data(self) -> Optional[Any]
- Load data from cache.
save_data(self, data: Any)
- Save data to cache.
search(self, query: str, top_k: int = 3) -> List[owlsight.rag.custom_classes.SearchResult]
- Search documents using the query.

OwlDefaultFunctions

class OwlDefaultFunctions(globals_dict: Dict)

Define default functions that can be used in the Python interpreter. This provides the user with some utility functions to interact with the interpreter. Convention is that the functions start with 'owl_' to avoid conflicts with built-in functions.

This class is open for extension, as possibly more useful functions can be added in the future.

Methods:

owl_create_document_searcher(self, documents: Dict[str, str], sentence_transformer_model_name: str, sentence_transformer_kwargs: Optional[Dict[str, Any]] = None, percentile: float = 0.99, target_chunk_length: int = 400, device: Optional[str] = None, **document_searcher_kwargs) -> ~DocumentSearcher
- Utility function to create a DocumentSearcher instance from a dictionary of documents.
owl_edit(self, file_path: Union[str, pathlib.Path], edits: List[Dict[str, str]], *, regex: bool = True, create_backup: bool = True, backup_suffix: str = '.bak', encoding: str = 'utf-8') -> str
- Apply multiple substitutions to one local file.
owl_import(self, file_path: str)
- Import Python module into the current execution environment.
owl_load_namespace(self, file_path: str)
- Load namespace using dill.
owl_models(self, cache_dir: Optional[str] = None, show_task: bool = False) -> List[str]
- Audit Hugging Face model cache.
owl_press(self, sequence: List[str], exit_python_before_sequence: bool = True, time_before_sequence: float = 0.5, time_between_keys: float = 0.12) -> bool
- Simulate keyboard input for application control.
owl_read(self, file_source: Union[str, pathlib.Path, bytes, Iterable[Union[str, pathlib.Path]]], recursive: bool = False, ignore_patterns: Optional[List[str]] = None, ocr_enabled: bool = True, timeout: int = 5) -> Union[str, Dict[str, str]]
- Read ONLY local files or directories.
owl_save_namespace(self, file_path: str)
- Serialize current namespace state to disk.
owl_scrape(self, urls: List[str], max_concurrent: int = 5, timeout: int = 10) -> Dict[str, str]
- Download and parse the main text from web pages.
owl_search(self, query: str, max_results: int = 10, max_retries: int = 3) -> Dict[str, str]
- DuckDuckGo text search with simple back-off.
owl_search_and_scrape(self, query: str, max_results: int = 10, max_concurrent: int = 5, timeout: int = 10, max_retries: int = 3) -> Dict[str, str]
- Search the web then scrape the resulting URLs.
owl_show(self, docs: bool = True, return_str: bool = False) -> List[str]
- Display active namespace objects with documentation.
owl_terminal(self, command: Union[str, List[str]], shell: bool, cwd: Union[str, pathlib.Path] = '.', capture_output: bool = True, timeout: Optional[int] = None, raise_on_error: bool = True, encoding: str = 'utf-8') -> Dict[str, Union[str, int]]
- Cross-platform shell command runner.
owl_tools(self, as_json: bool = True) -> List[Union[Callable, Dict]]
- Retrieve available tool-callable functions in OpenAI-compatible format.
owl_write(self, file_path: str, content: str) -> None
- Write content to file_path (UTF-8, overwrite).

ExpertPrompts

class ExpertPrompts()

System prompts for different expert roles

Methods:

as_dict(self) -> Dict[str, str]
- Return a dictionary of role keys and their descriptions.

AgentPrompts

class AgentPrompts(essential_information: str = '')

A collection of system prompts to be used in Agentic frameworks.

Methods:

as_dict(self) -> Dict[str, str]
- Return a dictionary of role keys and their descriptions.
get_essential_information(self) -> str
get_single_agent(self) -> str

PromptWriter

class PromptWriter(prompt: str)

Writes a system prompt to an Owlsight configuration JSON file.

Parameters

prompt : str The system prompt to be written to the Owlsight configuration JSON file.

Methods:

to(self, target_json: str) -> None
- Updates the 'system_prompt' field under the 'model' key in the given Owlsight configuration JSON file.

VoiceControl

class VoiceControl(*args, **kwargs)

Proxy class that inherits from DummyVoiceControl when dependencies are missing

Methods:

is_running(self)
- Check if the voice control system is running.
start(self)
- Start the voice control system.
stop(self)
- Stop the voice control system.

Functions

setup_tesseract

def setup_tesseract() -> str

Initialize Tesseract. Return the path to the Tesseract executable.

get_best_device

def get_best_device() -> str

Check for best device and return the device name.

check_onnx_device

def check_onnx_device(current_device: str = 'cuda') -> str

Check the current device being used for ONNXRuntime.

Parameters: current_device (str): The current device to use. Default is 'cuda'.

check_gpu_and_cuda

def check_gpu_and_cuda()

Checks if a CUDA-capable GPU is available on pytorch and if CUDA is installed.

llama_supports_gpu_offload

def llama_supports_gpu_offload(base_path: str) -> bool

Checks if Llama.cpp supports GPU offload. This is useful for checking if a GPU is available for GGUF models.

Parameters

base_path : str Path to the Llama.cpp shared library. Usually something like 'dist-packages/llama_cpp/lib' or 'site-packages/llama_cpp/lib' in the current virtual environment.

Returns: bool: True if Llama.cpp is available on the GPU, False otherwise.

SOURCE: https://stackoverflow.com/questions/78415856/detecting-gpu-availability-in-llama-cpp-python

calculate_max_parameters_per_dtype

def calculate_max_parameters_per_dtype()

Calculate the maximum number of parameters that can be run on the GPU for different data types (32-bit, 16-bit, 8-bit, 4-bit).

calculate_memory_for_model

def calculate_memory_for_model(n_bilion_parameters: int, n_bit: int = 32) -> float

Calculate the memory required for a model in GB.

Parameters: n_bilion_parameters (int): The number of parameters in the model in billions. n_bit (int): The number of bits used to represent the model parameters. Default is 32. Quantized models use 16/8/4 bits.

calculate_available_vram

def calculate_available_vram() -> float

Calculate the available VRAM on the GPU in GB.

select_processor_type

def select_processor_type(model_id: str, task: Optional[str] = None) -> Type[ForwardRef('TextGenerationProcessor')]

Utilityfunction which selects the appropriate TextGenerationProcessor class based on the model ID or directory.

If the model_id is a directory, the function will inspect the contents of the directory to decide the processor type. Otherwise, it will use the model_id string to make the decision.

is_url

def is_url(url: str) -> bool

Check if a string is a valid URL.

Parameters

url : str The string to check.

Returns

bool True if the string is a valid URL, False otherwise.

get_model_data

def get_model_data(model_search: str, top_n_models: int = 10, **kwargs) -> Dict[str, Dict[str, str]]

Get and display the model data from the HuggingFace Hub in a visually appealing format.

Parameters: model_search: Search term for filtering models top_n_models: Number of top models to display **kwargs: Additional keyword arguments to pass to get_model_list. E.g., task, framework, etc. See HfApi().list_models() from huggingface_hub package for more details.

Returns: Dictionary containing model information

get_mteb_leaderboard_data

def get_mteb_leaderboard_data(max_params: Optional[int] = None) -> pandas.core.frame.DataFrame

Fetch and parse data from the MTEB leaderboard, focussed on text embedding models.

Parameters:

max_params : Optional[int], default None Maximum number of parameters for filtering

Returns:

pd.DataFrame DataFrame containing the MTEB leaderboard data with appropriate columns

function_to_json_for_tool_calling

def function_to_json_for_tool_calling(func: Callable) -> dict

Converts a Python function into a JSON structure suitable for function-calling with an LLM. This function inspects the target function's signature and docstring (assumed to be in NumPy style) and returns a JSON schema-like definition.

Parameters

func : Callable The Python function to be converted.

Returns

str A JSON string describing the function's name, short description, and parameter schema.

RELEASE NOTES

1.0.2

Enhanced cross-platform compatibility.
Introduced the generate_stream method to all TextGenerationProcessor classes.
Various minor bug fixes.

1.1.0

Added Retrieval Augmented Generation (RAG) for enriching prompts with documentation from python libraries. This option is also added to the configuration.
History with autocompletion is now also available when writing prompts. Prompts can be autocompleted with TAB.

1.2.1

Access backend functionality through the Owlsight API using "from owlsight import ..."
Added default functions to the Python interpreter, starting with the "owl_" suffix.
More configurations available when using GGUF models from the command line.

1.3.0

Add owl_history function to python interpreter for directly accessing model chat history.
Improved validation when loading a configuration file.
Added validation for retrying a codeblock from an error. This configuration is called prompt_retry_on_error

1.4.1

improve RAG capabilities in the Owlsight API, added SentenceTransformerSearchEngine, TFIDFSearchEngine and HashingVectorizerSearchEngine as classes.
Added DocumentSearcher to offer a general RAG solution for documents. At its core, uses a combination of TFIDF and Sentence Transformer.
Added caching possibility to all RAG solutions in the Owlsight API (cache_dir & cache_dir_suffix), where documents, embeddings etc. get pickled. This can save a big amount of time if amount of documents is large.

2.0.1beta

BREAKING CHANGES

Added Huggingface API in the configuration-settings of the CLI. This allows the user to search and load models directly from the Huggingface modelhub and can be found through config:huggingface.
added transformers__use_fp16 and transformers__stream to config:model for using fp16 and streaming the model output in the transformers-based models.
Added MultiModalProcessorTransformers for non text-input based models. This class can be used for models which require additional input like images, audio or video and works with models from the Huggingface Hub based on the Pytorch framework.
Introduced new double-square brackets syntax for passing mediatypes in the prompt.
Improved logging with clearer color coding and more detailed information.
System Prompt in config:modelis now an empty string as default.
Several small bugfixes and improvements.

2.0.2 (stable)

Upgraded UI with new color scheme and improved readability. Description of the current choice is now displayed above the menu.
Removed onnx__tokenizer from TextGenerationProcessorOnnx constructor, so that only model_id is needed as constructor argument.
Added get_max_context_length method to all TextGenerationProcessor classes, which returns the maximum context length of the loaded model.
Moved transformers__use_fp16 in config:model to transformers__quantization_bits as value 16, as it is more clear.
Added track_model_usage to config:main, which can be used to track usage of the model, like the amount of words generated, total time spent etc.
Added possibility to pass complete directories as argument to mediatypes to a model in the CLI, like so:

How can I assist you?

[[image:directory/containing/images]]

Add owl_models() function to python interpreter for displaying all Huggingface models in the cache directory.

2.2.0

Improved userexperience in the CLI by preventing shrinking of the terminal window if menu is too large.
In the EDITABLE optiontype fields, multiple lines are now possible.
Add owl_save_namespace owl_load_namespace functions to save and load all variables inside the Python interpreter. This is useful if you want to save any code created by a model. Or load a namespace from a previous session.
ProcessorMemoryContext can be used as a context_manager to clean up resources from TextGenerationProcessor, like the model, from memory after usage.
Improved config:rag functionality with the new sentence_transformer_weight option. This allows to weigh the sentence-transformer part in the RAG model next to the already present TFIDF, improving semantic search capabilities.
Improved config:rag functionality with the new sentence_transformer_name_or_path option. This allows to specify the name or path to a sentence-transformer model, which is used for embedding.
Add DocumentSearcher class to offer a general RAG solution for documents. At its core, uses a combination of TFIDF and Sentence Transformer.
Add DocumentReader class to read text from a broad range of file formats. This class is build on top of Apache Tika.
Improved owl_read with the new DocumentReader class. As input, you can now pass a directory or a list of files.
Added main:sequence_on_loading to the configuration json. This allows execution of a sequence of keys on loading a config through the load option in the Owlsight main-menu. TIP: above option can be used to load a sequence of different models as "agents", where every config can be threaded as a different agent with their own role. In theory, every action in Owlsight can be automated through this option.

2.3.0

Added compile mode for the Python interpreter (config:main:python_compile_mode), so that the user can both execute single lines ("single") or define multiple lines of code ("exec").
added split_documents_n_sentences and split_documents_n_overlap parameters to DocumentSearcher class, which can be used to split a long document into smaller chunks before embedding.
Added a from_cache method in DocumentSearcher class. This method can be used to load a DocumentSearcher instance from earlier cached documents and embeddings.
Removed transformers__model_kwargs from config:model, and instead added a model_kwargs parameter to all TextGenerationProcessor classes. The advantage is that model_kwargs can now also be passed to other TextGenerationProcessor classes. For example, when passed to TextGenerationProcessorGGUF, these parameters are now used to initialize the Llama class from llama-cpp-python.
ESC + V can be used inside the Python Interpreter to show the currently defined objects in a dropdown-menu.
ESC + V can be used inside the "How can I assist you?"-option after typing the following: "[[", "{{". This will autocomplete the following: "[[" will autocomplete to: "image:", "audio:" "{{" will autocomplete any available defined objects from the python-namespace.
Added owl_tools function to the Python interpreter. This function can be used to convert all defined functions in the namespace to a dictionary, which can be used for tool/function-calling.
Bracket-syntax "{{}}" for augmenting Python expressions can now also be used inside the config section of the CLI. For example, in the Python interpreter, we can store a long string inside a variable and pass it to config:model:system_prompt directly.
Added new option dynamic_system_prompt to config:main section. This option can be used to dynamically generate a fitting system prompt first for a given user input, before passing it to the model. The idea is that this might help the model to give a more focused response to the question.
Add basic functionality, like select all, copy and paste. Use CTRL+A, CTRL+C and CTRL+Y respectively. This option applies to all editable fields and the Python Interpreter.

2.4.0.beta

Several changes for the "How can I assist you?"-option:

Added [[load:...]] tag support for dynamic configuration loading during conversations. This can be used in "How can I assist you?" in mainmenu to chain multiple configurations (agents) together, like so:

How can I assist you?

[[load:config-to-model1.json]] Generates a rough draft for the following text: {{owl_read("mockup-idea.txt")}} [[load:config-to-model2.json]] Validate that the generated draft based on the previous text is relevant and contains all necessary information

TIP 1: Combing a sequence of different agents together with above method can lead to complex conversation flows. TIP 2: Using above tag in combination with sequence_on_loading in the configuration json opens lots of new possibilities to control the application.

Added [[chain:...]] tag support for changing config parameters in between conversations. For example:

How can I assist you?

[[chain:model.system_prompt=act as a helpful assistant||generate.temperature=0.5]]

Above tags can also be used INSIDE a python-expression inside the "How can I assist you?"-option, like so:

How can I assist you?

{{"".join(f"[[load:config-to-model{i}.json]]how much is {i} + 1?" for i in range(1, 10))}}

Added SentenceTextSplitter to the Owlsight API. This can be used to split text into chunks based on sentences.
Added SemanticTextSplitter to the Owlsight API. This can be used to split text into chunks based on semantic similarity breakpoints and might be more accurate for chunking than SentenceTextSplitter. Note that both TextSplitter classes can be used as input for the DocumentSearcher class.
Added main.default_config_on_startup to the config:main section. This option can be used to specify a default configuration file to load when starting Owlsight. This will load the configuration file specified in main.default_config_on_startup when every time when starting Owlsight.
Added an experimental new section in config, called config:agentic. This section can be enabled through the "active" option. The section consists of a multi-step agentic system, where the the agents are in fixed order: ToolAgent (can search the internet, scrape, etc) -> Pythonagent (specialized in generating Python code) -> JudgeAgent. In the end, the final response is computed by a last agent. All agents are the currently loaded model with different roles.
Added --log and --level flags to the CLI. This can be used to specify a log file and log level, like so:

owlsight --log log.txt --level DEBUG

Added additional_information option to the config:agentic section. This option can be used to add additional information to every agent call, for example: "Do NOT use owl_scrape and owl_search, because there is no internet connection."
Added voice control support with customizable mappings through owlsight[voice] package This can be used for (close to realtime) transcription of user input to the screen, using faster-whisper. Voice control features include:
- Customizable word-to-key mappings for keyboard control
- Word-to-word substitutions for text input
- Configurable settings like command cooldown and typing intervals
- Support for multiple languages and speech recognition models
Added JSON-based configuration for all voice control settings
Added owl_search_and_scrape function to the Python interpreter. This function can be used to search and scrape the web using DuckDuckGo's API.
Added owl_create_document_searcher function to the Python interpreter. This utilityfunction can be used to create a DocumentSearcher instance with a given set of documents and a text splitter.

2.4.0(stable)

Added get_mteb_leaderboard_data function to the backend API. This function can be used to fetch the MTEB leaderboard data.
Added support for uv as an alternative package manager. Also improved current support for pip environments.
Several minor bugfixes and improvements.

2.5.0(stable)

Added owl_context_length function to the Python interpreter. This function can be used to get the maximum context length of the currently loaded model.
Improved flow of agentic system, which is now: RouterPlanningAgent -> ToolAgent | PythonAgent -> ValidationAgent -> [Until max_steps is reached or all data is collected for final answer] -> ResponseSynthesisAgent
Added new options to config:agentic:
- show_available_tools: Show all available tools (available from the Python interpreter) to the ToolAgent.
- exclude_tools: Exclude certain tools from the ToolAgent.
Implement lazy loading in all classes where SentenceTransformer models are used, so that they only get loaded if sentence_transformer_weight is more than 0. First, SentenceTransformer models were loaded without being sure that they would be used.
Several minor bugfixes and improvements.

2.6.0

Significantly enhanced agentic workflow through a major refactoring of the core agentic system, replacing the old agentic system with a new one. Current flow is now: PlanAgent -> PlanValidationAgent -> ToolCreationAgent | ToolSelectionAgent -> ObservationAgent -> [Until all steps have been executed] -> FinalAgent
Added owl_edit and owl_terminate functions to the Python interpreter.
Added config_per_agent option to the config:agentic section. This option can be used to specify a different configuration file for each agent.
Various minor bugfixes, features and stability improvements.

2.6.1

Some critical (regression-related) bugfixes, like:
- fixed error where GGUF models could not be loaded through config:huggingface.
- fixed error where generated pythoncode was not correctly parsed from modelresponse.

If you encounter any issues, feel free to shoot me an email at v.ouwendijk@gmail.com

Project details

These details have not been verified by PyPI

Release history Release notifications | RSS feed

2.7.0b1 pre-release

Jun 11, 2025

This version

2.6.1

May 16, 2025

2.6.0

May 14, 2025

2.5.0

Mar 26, 2025

2.4.0

Mar 1, 2025

2.4.0b0 pre-release

Feb 23, 2025

2.3.0

Feb 1, 2025

2.2.0

Jan 25, 2025

2.1.0

Jan 3, 2025

2.0.1b1 pre-release

Dec 7, 2024

1.4.1

Nov 3, 2024

1.4.0

Nov 1, 2024

1.3.0

Oct 25, 2024

1.2.1

Oct 18, 2024

1.2.0

Oct 18, 2024

1.1.0

Oct 15, 2024

1.0.2

Oct 9, 2024

1.0.0

Oct 9, 2024

1.0.0b0 pre-release

Oct 7, 2024

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

owlsight-2.6.1.tar.gz (63.3 MB view details)

Uploaded May 16, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

owlsight-2.6.1-py3-none-any.whl (58.5 MB view details)

Uploaded May 16, 2025 Python 3

File details

Details for the file owlsight-2.6.1.tar.gz.

File metadata

Download URL: owlsight-2.6.1.tar.gz
Upload date: May 16, 2025
Size: 63.3 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.3

File hashes

Hashes for owlsight-2.6.1.tar.gz
Algorithm	Hash digest
SHA256	`417a2e1d0430e8deef99a849f4947620cfbd421f815f109b92989be0193d2df5`
MD5	`418086672267c215a3580212aa4c8217`
BLAKE2b-256	`e85f290185c65dbd47b81533263c8b1d73e512f756784e9b7786ac892e991f21`

See more details on using hashes here.

File details

Details for the file owlsight-2.6.1-py3-none-any.whl.

File metadata

Download URL: owlsight-2.6.1-py3-none-any.whl
Upload date: May 16, 2025
Size: 58.5 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.3

File hashes

Hashes for owlsight-2.6.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c8ce200af95b37341216a2619844b9788f70b88a425be097e5dbde845322d39c`
MD5	`5b6eaec37841e81f9949a193f7e1db69`
BLAKE2b-256	`f98c01412f7fc927cb5c3f571b3d3cefc94900533773797c68cb960bcea07872`

See more details on using hashes here.

owlsight 2.6.1

Navigation

Verified details

Maintainers

Unverified details

Meta

Project description

Owlsight

Why owlsight?

Installation of the CLI:

Installation Options and Dependencies

Optional Feature Modules

Comprehensive Installation

Available Features Based on Installation

Security and Performance Notes

Usage

Keyboard Shortcuts

Getting Started

Available Commands

Voice Control

Example Workflow

MultiModal Support

Python interpreter

Configurations

Changing configurations

Temporary environment

Error Handling and Auto-Fix

Agentic system

API Examples

API Documentation

Classes

TextGenerationProcessorOnnx

Parameters

Notes

TextGenerationProcessorTransformers

TextGenerationProcessorGGUF

Parameters

Notes

MultiModalProcessorTransformers

Parameters

Notes

PythonLibSearcher

SentenceTextSplitter

SemanticTextSplitter

DocumentSearcher

DocumentReader

HashingVectorizerSearchEngine

Parameters

Notes

TFIDFSearchEngine

Parameters

Notes

SentenceTransformerSearchEngine

OwlDefaultFunctions

ExpertPrompts

AgentPrompts

PromptWriter

Parameters

VoiceControl

Functions

setup_tesseract

get_best_device

check_onnx_device

check_gpu_and_cuda

llama_supports_gpu_offload

Parameters

calculate_max_parameters_per_dtype

calculate_memory_for_model

calculate_available_vram

select_processor_type

is_url

Parameters

Returns

get_model_data

get_mteb_leaderboard_data

Parameters:

Returns:

function_to_json_for_tool_calling

Parameters

Returns