A tool for migrating to chroma versions >= 0.4.0
Chroma Migrate
Schema and data format changes are a necessary evil of evolving software. We take changes seriously and make them infrequently and only when necessary.
Chroma's commitment is that whenever the schema or data format changes, we will provide a seamless and easy-to-use migration tool to move to the new schema/format.
Specifically we will announce schema changes on:
- Discord (#migrations channel)
- GitHub (here)
- Email listserv (sign up)
We will aim to provide:
- a description of the change and the rationale for the change.
- a CLI migration tool you can run
- a video walkthrough of using the tool
Migration Log
Migration from <0.4.0 to 0.4.0 - July 17, 2023
We are migrating:
- metadata store: where metadata is stored
- index on disk: how indexes are stored on disk
Metadata Store: Previously Chroma used DuckDB as the underlying storage engine for the in-memory version of Chroma, and ClickHouse for the single-node server version. These decisions were made when Chroma was addressing more batch analytical workloads, and they are no longer the best choice for users. The new metadata store for the in-memory and single-node server versions of Chroma will be SQLite. (The forthcoming distributed version of Chroma will use a different, distributed metadata store.)
Index store: Previously Chroma saved the entire index on every write. This became painfully slow once a collection grew to a reasonable number of embeddings. The new index store saves only the change and should scale seamlessly!
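To make the index store change concrete, here is a toy sketch comparing the two write strategies. It is not Chroma's actual on-disk format or code: the JSON files, the function names (`save_full`, `append_change`, `compare`) and the two-element vectors are all assumptions chosen for illustration. It only shows why rewriting everything on each write costs far more I/O than appending the change.

```python
import json
import os
import tempfile


def save_full(path, index):
    """Rewrite the whole index file (the 'save everything on every write' strategy)."""
    with open(path, "w") as f:
        json.dump(index, f)
    return os.path.getsize(path)


def append_change(path, key, vector):
    """Append only the new entry to a log file; earlier entries are not rewritten."""
    line = json.dumps({"id": key, "embedding": vector}) + "\n"
    with open(path, "a") as f:
        f.write(line)
    return len(line.encode("utf-8"))


def compare(n):
    """Return total bytes written by each strategy after n single-item writes."""
    full_bytes = 0
    log_bytes = 0
    index = {}
    with tempfile.TemporaryDirectory() as tmp:
        full_path = os.path.join(tmp, "full.json")
        log_path = os.path.join(tmp, "changes.jsonl")
        for i in range(n):
            vector = [float(i), float(i + 1)]  # hypothetical tiny embedding
            index[f"id{i}"] = vector
            full_bytes += save_full(full_path, index)
            log_bytes += append_change(log_path, f"id{i}", vector)
    return full_bytes, log_bytes


if __name__ == "__main__":
    full, log = compare(1000)
    print(f"full rewrite: {full} bytes written, change log: {log} bytes written")
```

In this sketch the full-rewrite total grows roughly quadratically with the number of writes, while the change log grows linearly.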
Here are the nine possible migration paths, with notes where applicable.
| From 👇 ➡️ To 👉 | Persistent Chroma | Local Chroma Server | Remote Chroma Server |
|---|---|---|---|
| Persistent Chroma | ✅ | ✅ | 1️⃣ |
| Local Chroma Server | ✅ | 2️⃣ | 1️⃣ |
| Remote Chroma Server | ✅ | ✅ | 1️⃣ 2️⃣ |
1️⃣ - Make sure to configure any auth headers correctly
2️⃣ - Run both the existing version of Chroma and the new 0.4.0 version of Chroma at the same time. Run the new version on a new port if local.
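The table and its footnotes can also be expressed as a small lookup, which may be handy for a pre-flight check in a script. This is only a sketch that transcribes the table above; the names `PATHS`, `NOTES` and `notes_for` are hypothetical and are not part of the `chroma_migrate` tool.

```python
PERSISTENT = "Persistent Chroma"
LOCAL = "Local Chroma Server"
REMOTE = "Remote Chroma Server"

# Footnotes, keyed by the numbers used in the table above.
NOTES = {
    1: "Make sure to configure any auth headers correctly.",
    2: "Run both the existing version of Chroma and the new 0.4.0 version "
       "at the same time. Run the new version on a new port if local.",
}

# (source, target) -> footnote numbers; an empty tuple means no extra notes.
PATHS = {
    (PERSISTENT, PERSISTENT): (),
    (PERSISTENT, LOCAL): (),
    (PERSISTENT, REMOTE): (1,),
    (LOCAL, PERSISTENT): (),
    (LOCAL, LOCAL): (2,),
    (LOCAL, REMOTE): (1,),
    (REMOTE, PERSISTENT): (),
    (REMOTE, LOCAL): (),
    (REMOTE, REMOTE): (1, 2),
}


def notes_for(source, target):
    """Return the footnote texts that apply to one migration path."""
    return [NOTES[n] for n in PATHS[(source, target)]]


if __name__ == "__main__":
    for note in notes_for(REMOTE, REMOTE):
        print("-", note)
```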
How to use the migration tool
- pip install this utility:
  pip install chroma_migrate
- Run the CLI. In your terminal run:
  chroma_migrate
- Choose whether the data you want to migrate is stored locally on disk (DuckDB), on a ClickHouse instance used by Chroma, or on another Chroma server.
- Choose where you want to write the new data to.
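Before following the steps above, it can help to confirm that the installed Chroma really predates 0.4.0. The sketch below is not part of `chroma_migrate`; the helper names are hypothetical, and it assumes Chroma is installed under the distribution name `chromadb`. Pre-release suffixes such as `rc1` are ignored by the simple parser.

```python
import re
from importlib import metadata


def is_pre_0_4_0(version):
    """Return True if a version string such as '0.3.29' is older than 0.4.0."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    # A missing patch component (e.g. '0.3') is treated as 0.
    major, minor, patch = (int(part) if part else 0 for part in match.groups())
    return (major, minor, patch) < (0, 4, 0)


def installed_chromadb_needs_migration():
    """Check the installed chromadb; return None if it is not installed."""
    try:
        return is_pre_0_4_0(metadata.version("chromadb"))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_chromadb_needs_migration())
```

A result of `True` suggests the existing data was written by a pre-0.4.0 version and is a candidate for migration; `None` means no `chromadb` installation was found in the current environment.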
Developing Locally
Run python main.py to test locally
File details
Details for the file chroma_migrate-0.0.7.tar.gz.
File metadata
- Download URL: chroma_migrate-0.0.7.tar.gz
- Size: 340.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ae402998b2ca0aa76e5ce6ec8a6c3f13053c7ae07a72e17df3a2b320c9aa3177 |
| MD5 | 7290527dd8f9fa42dbd1a66f50a95acf |
| BLAKE2b-256 | 848a652378d58855c985ba932d3333631958beadf1cbd890542c047512225b3c |
File details
Details for the file chroma_migrate-0.0.7-py3-none-any.whl.
File metadata
- Download URL: chroma_migrate-0.0.7-py3-none-any.whl
- Size: 13.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 59646a4c3b23242f8c8d1bd907f5ec6d122512bcf36a8da0b8e2fa1ed74a76b8 |
| MD5 | ad3ff4912f68fa9a129216768e4ff5f8 |
| BLAKE2b-256 | 9aaf6348764d965691a78a4695e0fae310e8b469e35875033d53ab77582bba70 |