A Python library for storing and loading versioned collections of items searchable by key or value


(Remote) Versioned Collection

Why

While working with LLMs I wanted to version my prompts and load them consistently whether from local files or from an online resource. The abstraction goes beyond LLMs and prompts, and is basically a versioned collection of objects, e.g. texts, images, feature flags, etc.

(An LLM-specific library, prompt-base, which builds on this library, will be released later.)

NOTE: This library does not automatically version each mutation of an item. It only manages the version of the collection as a whole.

How

The library abstracts over storage and versioning. The main object is the Collection class, which is a glorified dictionary with pluggable storage and versioning implementations. Items in the collection can be accessed by key or by a value hash (content based lookup).
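To make the "glorified dictionary" idea concrete, here is a minimal sketch of a dictionary-like object that supports lookup by key and by content hash. This is not the library's implementation: `SketchCollection`, `content_hash`, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


def content_hash(value: str) -> str:
    """Return a hex digest of the value (SHA-256 is an assumed choice)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


class SketchCollection:
    """Hypothetical dictionary-like collection with key and hash lookup."""

    def __init__(self) -> None:
        self._values: dict[str, str] = {}  # key -> value
        self._keys_by_hash: dict[str, str] = {}  # content hash -> key

    def add(self, key: str, value: str) -> None:
        # Drop the stale hash entry when an existing key is overwritten.
        if key in self._values:
            self._keys_by_hash.pop(content_hash(self._values[key]), None)
        self._values[key] = value
        # If two keys share the same value, the last one added wins.
        self._keys_by_hash[content_hash(value)] = key

    def __getitem__(self, key: str) -> str:
        return self._values[key]

    def get_by_hash(self, digest: str) -> str:
        return self._values[self._keys_by_hash[digest]]


prompts = SketchCollection()
prompts.add("greeting", "Hello, {name}!")
digest = content_hash("Hello, {name}!")
```

In this sketch, `prompts["greeting"]` and `prompts.get_by_hash(digest)` return the same value, which is the essence of content-based lookup.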

The collection itself can be loaded from storage, so the code that uses the collection doesn't need to know anything about the actual storage format or protocol. For example, we can load a collection from a local file, a remote URL, a git repository, or a database, and this can be configured via env vars, logic, or any other runtime or build-time mechanism.
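As a rough illustration of configuring the storage location at runtime, the sketch below picks a collection URL from an environment variable and falls back to a local file. The variable name `COLLECTION_URL`, the default file name, and the helper `resolve_collection_url` are hypothetical and not part of this library.

```python
import os
from pathlib import Path


def resolve_collection_url(
    env_var: str = "COLLECTION_URL",  # hypothetical variable name
    default_path: str = "collection.json",  # hypothetical default file
) -> str:
    """Return the URL from the environment, or a file:// URL for the default path."""
    url = os.environ.get(env_var)
    if url:
        return url
    # as_uri() needs an absolute path, so resolve it first.
    return Path(default_path).resolve().as_uri()
```

The resulting string could then be passed as the `url` argument when loading a collection, so the calling code stays the same across environments.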

Usage

Loading a collection

Here's a simple way to load a collection:

from versioned_collection import Collection, CollectionStore

collection = CollectionStore.load(url="path/to/collection")

The url can be a local file path (using the file:// scheme) or a remote URL. I'll add more storage options in the future if the need arises (GitHub, SQLite, API, etc.).

Loading from the internet requires installing the optional aiohttp dependency:
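To show how a single url argument can cover both cases, here is a small sketch that classifies a URL by its scheme using only the standard library. The function `storage_kind` and its return values are assumptions for illustration; how the library actually dispatches is not documented here.

```python
from urllib.parse import urlparse


def storage_kind(url: str) -> str:
    """Classify a collection URL as 'local' or 'remote' based on its scheme."""
    scheme = urlparse(url).scheme
    if scheme in ("http", "https"):
        return "remote"
    # Treating a bare path (no scheme) as local is an assumption of this sketch.
    if scheme in ("", "file"):
        return "local"
    raise ValueError(f"unsupported scheme: {scheme!r}")
```

A loader built this way only needs a new branch per scheme when another storage option is added.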

poetry install --extras http

Accessing items

Once we have a collection we can access items by key or by hash:

item = collection["my.item"]
item = collection.get("my.item")
item = collection.get_by_hash("my.item.hash")

See the tests/ folder for examples of usage.

Creating a collection

A collection can be created from a list of items:

collection = Collection(items=[Item(key="my.item", value="my item value")])

Adding items to a collection

Items can be added to a collection:

collection.add(Item(key="my.item", value="my item value"))

Removing items from a collection

Items can be removed from a collection:

collection.remove("my.item")

Saving a collection

A collection can be saved to storage:

collection.save(url="path/to/collection")

Versioning

The currently implemented scheme assumes that the version is part of the collection file.
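As an illustration of a version living inside the collection file, the sketch below parses a JSON document and reads its version. The field names `version` and `items` and the helper `read_version` are assumptions; the library's actual file layout may differ.

```python
import json

# Hypothetical file content: the version sits next to the items it describes.
SAMPLE_FILE = """\
{
  "version": "1.0.0",
  "items": {
    "my.item": "my item value"
  }
}
"""


def read_version(text: str) -> str:
    """Return the collection-level version stored in the file content."""
    document = json.loads(text)
    return document["version"]
```

Because the version is a single field for the whole file, changing any item does not change the version unless the field is updated explicitly.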

If you have a use case where the versioning should happen outside of the file (for example, as happens with git), please let me know in the issues.

Storage

The currently implemented storage is a single file per collection, either JSON or YAML.

Both formats are human-readable and, more importantly, so are their diffs.
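To see why a readable diff matters, here is a small sketch that serializes two revisions of a hypothetical collection document and prints a unified diff with the standard library. The document layout (`version` and `items` fields) is an assumption made for this example.

```python
import difflib
import json


def render(document: dict) -> list[str]:
    # Stable key order and indentation keep diffs small and line-oriented.
    return json.dumps(document, indent=2, sort_keys=True).splitlines()


old = {"version": "1", "items": {"my.item": "hello"}}
new = {"version": "2", "items": {"my.item": "hello world"}}

diff_lines = list(
    difflib.unified_diff(render(old), render(new), "old", "new", lineterm="")
)
print("\n".join(diff_lines))
```

Each changed item shows up as one removed line and one added line, which is easy to review in a pull request.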

If there's a need to add file-per-item storage please let me know in the issues.

Roadmap and Contributing

This library will stay small. Further work on it is pending feedback from the community in the issues.

I currently plan to add the following capabilities:

  • Support watching a collection (i.e. file watch) with a change notification callback.
  • A link content type to support nested loading.
  • Lazy loading for link values.

Please let me know if you want me to add additional storage options or features.

Contributing

Please feel free to open issues, especially before opening pull requests. I'll add a mandatory CLA in the future.
