Tools for working with Ollama model data

Ollama Data Tools

Requirements

  • Python 3.x

Installation

Clone the repository and install the necessary dependencies:

git clone https://github.com/queelius/ollama_data_tools.git
cd ollama_data_tools
pip install -r requirements.txt
pip install -e .

Ollama Data Toolkit

The OllamaData class is the core of the Ollama Data Toolkit. It lets users work with Ollama model data programmatically and provides methods to access, search, and filter model information.

Features

  • Retrieve the schema of the OllamaData object.
  • Access models by name or index.
  • List all available models.
  • Perform JMESPath queries and apply regex filters on the model data.
  • Cache model data for efficient repeated access.

Class Methods

OllamaData.get_schema() -> Dict[str, Any]

Returns the schema of the OllamaData object.

OllamaData.__init__(cache_path: str = '~/.ollama_data/cache', cache_time: str = '1 day')

Initializes the OllamaData object.

  • cache_path: The path to the cache file.
  • cache_time: The duration the cache is valid.
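
The cache_time argument is a human-readable duration such as '1 day'. As a rough illustration of how a cache file's age could be compared against such a value, here is a minimal sketch. The helper names parse_duration and cache_is_fresh, and the unit table, are assumptions made for this example; the toolkit's own parsing may differ.

```python
import os
import re
import time

# Hypothetical unit table; the toolkit's real duration parser may accept other forms.
_UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400, "week": 604800}


def parse_duration(text):
    """Convert a string such as '1 day' or '2 hours' into seconds."""
    match = re.fullmatch(r"\s*(\d+)\s*([a-z]+?)s?\s*", text.lower())
    if not match or match.group(2) not in _UNIT_SECONDS:
        raise ValueError(f"unrecognized duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def cache_is_fresh(cache_path, cache_time, now=None):
    """Return True if the cache file exists and is younger than cache_time."""
    path = os.path.expanduser(cache_path)
    if not os.path.exists(path):
        return False
    current = time.time() if now is None else now
    age = current - os.path.getmtime(path)
    return age < parse_duration(cache_time)


print(parse_duration("1 day"))  # 86400
```

A stale or missing cache would then trigger regeneration of the model data, while a fresh one would be loaded directly.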

OllamaData.__len__() -> int

Returns the number of models.

OllamaData.__getitem__(index: int) -> Dict[str, Any]

Gets a model by index.

  • index: The index of the model.

OllamaData.get_model(name: str) -> Dict[str, Any]

Gets the model by name. Returns the most specific model that starts with the given name.

  • name: The name of the model.
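
The description above does not spell out how "most specific" is decided. The sketch below shows one plausible reading, not the toolkit's confirmed behavior: an exact match wins, otherwise the shortest name that starts with the given prefix is returned. The function name find_by_prefix and the sample names are hypothetical.

```python
from typing import List, Optional


def find_by_prefix(names: List[str], prefix: str) -> Optional[str]:
    """Return an exact match if present, else the shortest name starting with prefix."""
    if prefix in names:
        return prefix
    candidates = [n for n in names if n.startswith(prefix)]
    # Assumption: prefer the shortest candidate; ties are broken alphabetically.
    return min(candidates, key=lambda n: (len(n), n)) if candidates else None


# Hypothetical model names for illustration only.
names = ["mistral:latest", "mistral:7b-instruct", "llama3:latest"]
print(find_by_prefix(names, "mistral"))  # mistral:latest
```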

OllamaData.get_models() -> Dict[str, Any]

Gets the models. Caches the model data to avoid repeated regeneration.

OllamaData.search(query: str = '[*]', regex: Optional[str] = None, regex_path: str = '@') -> Dict[str, Any]

Queries, searches, and views the models using a JMESPath query and an optional regex filter.

  • query: The JMESPath query to filter and provide a view of the models.
  • regex: The regex pattern to match against the output.
  • regex_path: The JMESPath query for the regex pattern.
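
To make the order of operations concrete, the following sketch mimics the two stages in plain Python: first a projection (standing in for the JMESPath query), then a regex filter applied to the projected output. It does not call the toolkit or the jmespath library. The sample records, the project and regex_filter helpers, and the use of a plain dictionary key in place of a JMESPath regex_path are all assumptions for illustration.

```python
import re

# Hypothetical records; real entries would come from OllamaData.get_models().
MODELS = [
    {"name": "mistral:latest", "total_weights_size": 4109865159},
    {"name": "llama3:latest", "total_weights_size": 4661224676},
    {"name": "phi3:mini", "total_weights_size": 2176178913},
]


def project(models):
    """Plain-Python stand-in for the query "[*].{name: name, size: total_weights_size}"."""
    return [{"name": m["name"], "size": m["total_weights_size"]} for m in models]


def regex_filter(rows, pattern, key):
    """Keep rows whose value under `key` matches `pattern` (stand-in for regex_path)."""
    compiled = re.compile(pattern)
    return [row for row in rows if compiled.search(str(row.get(key, "")))]


rows = regex_filter(project(MODELS), "mistral|llama3", "name")
print([row["name"] for row in rows])  # ['mistral:latest', 'llama3:latest']
```

Because the regex runs on the query output, the regex_path must refer to keys that exist after the projection (here name, not total_weights_size).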

Usage Example

Here is an example of how to use the OllamaData class programmatically:

import ollama_data as od

# Initialize the OllamaData object
models = od.OllamaData(cache_path='~/.ollama_data/cache', cache_time='1 day')

# Get the schema of the OllamaData object
print("Schema:", models.get_schema())

# List all models
print("Models:", models.get_models())

# Get a specific model by name
model = models.get_model('mistral')
print("Specific Model:", model['name'])

# Search models using a JMESPath query
query_result = models.search(query="[*].{name: name, size: total_weights_size}")
print("Query Result:", query_result)

# Search models using a JMESPath query and regex filter
query_regex_result = models.search(
    query="[*].{name: name, size: total_weights_size}",
    regex="mistral", regex_path="name")
print("Query Regex Result:", query_regex_result)

Ollama Data Query

The ollama_data_query script allows users to search and filter Ollama models using JMESPath queries and regular expressions. This tool is designed to help users explore and retrieve specific information about the models in their Ollama registry.

Features

  • Perform JMESPath queries to filter model data.
  • Use regular expressions to match specific patterns within the model data.
  • Print the JSON schema of the models.
  • Support for piped input queries.

Arguments

  • query: The JMESPath query to filter results.
  • --regex: Regular expression to match.
  • --regex-path: The JMESPath query for the regex pattern to apply against (default: @).
  • --schema: Print the JSON schema.
  • --debug: Set logging level to DEBUG.
  • --cache-time: Time to keep the cache file (default: 1 hour).
  • --cache-path: The path to the cache file (default: ~/.ollama_data/cache).

Usage

To perform a JMESPath query:

ollama_data_query "max_by(@, &total_weights_size).{name: name, size: total_weights_size}"
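
For readers less familiar with JMESPath, max_by selects the element with the largest value of the given expression, and the trailing multiselect reshapes it. A rough plain-Python equivalent, using hypothetical sample records rather than real registry data, might look like this:

```python
# Hypothetical records shaped like the fields used in the query above.
models = [
    {"name": "mistral:latest", "total_weights_size": 4109865159},
    {"name": "llama3:latest", "total_weights_size": 4661224676},
    {"name": "phi3:mini", "total_weights_size": 2176178913},
]

# max_by(@, &total_weights_size): pick the record with the largest size.
largest = max(models, key=lambda m: m["total_weights_size"])

# .{name: name, size: total_weights_size}: keep and rename the fields of interest.
result = {"name": largest["name"], "size": largest["total_weights_size"]}
print(result)  # {'name': 'llama3:latest', 'size': 4661224676}
```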

To use a regular expression to filter results:

ollama_data_query --regex "mistral:latest" --regex-path name "[*].{name: name, size: total_weights_size}"

To pipe a query from a file or another command:

cat query.txt | ollama_data_query

Using regex and regex-path with a piped query:

echo "[*].{info: { name: name, other: weights}}" | ollama_data_query --regex 14f2 --regex-path "info.other[*].file_name"

Examples

Query for the Largest Model

ollama_data_query "max_by(@, &total_weights_size).{name: name, sz: total_weights_size}"

Filter Models Using Regex

ollama_data_query --regex "mistral|llama3" --regex-path name "[*].{name: name, size: total_weights_size}"

Ollama Data Export

The ollama_data_export script allows users to export Ollama models to a specified directory. This tool creates soft links for the model weights and saves the model metadata in the output directory.

Features

  • Export specified models to a self-contained directory.
  • Create soft links for model weights.
  • Save model metadata in JSON format.
  • Enable debug logging for detailed output.

Arguments

  • outdir: The output directory where the models will be exported.
  • --models: Comma-separated list of models to export. If not specified, all models will be exported.
  • --cache-path: The path to the cache file (default: ~/.ollama_data/cache).
  • --cache-time: The time to keep the cache file (default: 1 day).
  • --debug: Enable debug logging.
  • --hash-length: The length of the hash to use for the weight soft-links (default: 8).
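
The export step described above (soft links for weights plus a JSON metadata file) can be sketched as follows. This is an illustration under stated assumptions, not the script's actual implementation: the export_model function, the digest field, the metadata.json file name, and the per-model directory layout are all hypothetical. Only the weights and file_name keys appear elsewhere in this document.

```python
import json
import os
import tempfile


def export_model(model, outdir, hash_length=8):
    """Write metadata JSON and symlink each weight file into a per-model directory."""
    model_dir = os.path.join(outdir, model["name"].replace(":", "_"))
    os.makedirs(model_dir, exist_ok=True)
    links = []
    for weight in model["weights"]:
        # Assumption: the link is named after the first `hash_length` digest characters.
        link = os.path.join(model_dir, weight["digest"][:hash_length])
        if os.path.islink(link):
            os.remove(link)  # replace a link left over from an earlier export
        os.symlink(weight["file_name"], link)
        links.append(link)
    with open(os.path.join(model_dir, "metadata.json"), "w") as fh:
        json.dump(model, fh, indent=2)
    return links


with tempfile.TemporaryDirectory() as tmp:
    blob = os.path.join(tmp, "blob")
    open(blob, "w").close()  # empty stand-in for a weights file
    model = {
        "name": "mistral:latest",
        "weights": [{"file_name": blob, "digest": "14f2a9c3e0b1"}],
    }
    links = export_model(model, os.path.join(tmp, "export"), hash_length=4)
    link_names = [os.path.basename(link) for link in links]
    resolved_ok = os.path.realpath(links[0]) == os.path.realpath(blob)
    metadata_written = os.path.exists(
        os.path.join(tmp, "export", "mistral_latest", "metadata.json")
    )
    print(link_names)  # ['14f2']
```

A shorter hash length gives more readable link names but raises the chance that two weight files collide on the same prefix.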

Usage

To export specified models to a directory:

ollama_data_export --models model1,model2 --outdir /path/to/export

To export all models to a directory:

ollama_data_export /path/to/export

Examples

Export Specified Models

ollama_data_export --models mistral,llama3 --outdir /path/to/export

Export All Models

ollama_data_export --outdir /path/to/export

Enable Debug Logging

ollama_data_export --models mistral --outdir /path/to/export --debug

Specify Hash Length for Soft Links

ollama_data_export --models mistral --outdir /path/to/export --hash-length 2

Ollama Data Adapter

The ollama_data_adapter script adapts Ollama models for use with other inference engines, such as llamacpp. This tool is designed to reduce friction when experimenting with local LLM models and integrates with other tools for viewing, searching, and exporting Ollama models.

Features

  • List available engines and models.
  • Run models with specified engines.
  • Show the template for a given model.
  • Pass additional arguments to the inference engine.
  • Debugging information for advanced users.

Arguments

  • model: The model to run.
  • engine: The engine to use.
  • --engine-path: The path to the engine (required).
  • --list-engines: List available engines.
  • --list-models: List available models.
  • --cache-path: The path to the cache file (default: ~/.ollama_data/cache).
  • --cache-time: The time to keep the cache file (default: 1 day).
  • --engine-args: Arguments to pass through to the engine.
  • --debug: Print debug information.
  • --show-template: Show the template for the model.

Usage

To list all available engines:

ollama_data_adapter --list-engines

To list all available models:

ollama_data_adapter --list-models

To show the template for a specific model:

ollama_data_adapter mistral --show-template

## The template for the model has the following forms:
## - [INST] {{ .System }} {{ .Prompt }} [/INST]

To run a specific model with an engine:

ollama_data_adapter model engine --engine-path /path/to/engine --engine-args 'arg1' ... 'argn'
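
Each quoted string after --engine-args may contain a flag together with its value. One way such strings could be turned into an argument list for the engine process is sketched below. The build_engine_command helper is hypothetical, and passing the weights file through --model is an assumption based on the llama.cpp command line, not a description of the adapter's internals.

```python
import os
import shlex


def build_engine_command(engine_path, weights_path, engine_args):
    """Assemble an argv list: engine binary, model file, then pass-through arguments."""
    # Assumption: the engine takes the weights file via --model, as llama.cpp does.
    command = [os.path.expanduser(engine_path), "--model", weights_path]
    for arg in engine_args:
        # Split each pass-through string shell-style so quoted values stay intact.
        command.extend(shlex.split(arg))
    return command


cmd = build_engine_command(
    "/path/to/llamacpp",
    "/path/to/weights",
    ["--n-gpu-layers 40", '--prompt "[INST] You are a helpful AI assistant. [/INST]"'],
)
print(cmd)
```

The resulting list could be handed to subprocess.run(cmd) without involving a shell, which avoids a second round of quoting.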

Example

To use the llamacpp inference engine with the mistral model (assuming it is available in your Ollama registry), you might use the following arguments:

# mistral also matches `mistral:latest`; llamacpp selects the llamacpp engine.
# --engine-path is the path to the engine, e.g. ~/llamacpp/main.
# The strings after --engine-args are passed through to the engine.
ollama_data_adapter \
    mistral \
    llamacpp \
    --engine-path /path/to/llamacpp \
    --engine-args \
        '--n-gpu-layers 40' \
        '--prompt "[INST] You are a helpful AI assistant. [/INST]"'

The --prompt engine pass-through argument follows the template shown by ollama_data_adapter mistral --show-template.

We place a lot of burden on the end-user to get the formatting right. These models are very sensitive to how you prompt them, so some experimentation may be necessary.
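
To reduce formatting mistakes, the prompt string can be generated from the template rather than typed by hand. The sketch below handles only the simple form shown by --show-template for mistral, using plain string replacement. Templates with conditionals or other constructs would need a real template engine. The render_prompt helper and the sample messages are hypothetical.

```python
TEMPLATE = "[INST] {{ .System }} {{ .Prompt }} [/INST]"


def render_prompt(template, system, prompt):
    """Substitute the two placeholders that appear in the simple template form."""
    return (
        template
        .replace("{{ .System }}", system)
        .replace("{{ .Prompt }}", prompt)
    )


rendered = render_prompt(
    TEMPLATE,
    "You are a helpful AI assistant.",
    "Summarize JMESPath in one sentence.",
)
print(rendered)
# [INST] You are a helpful AI assistant. Summarize JMESPath in one sentence. [/INST]
```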

You may also want to use ollama_data_query to show the system message or other properties of a model, so that you can further customize the pass-through arguments to better fit its training data.

Contributing

Contributions are welcome! Please submit a pull request or open an issue to discuss changes.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Author

Alex Towell
