Chroma - the open-source embedding database.
The fastest way to build Python or JavaScript LLM apps with memory!

Discord | License | Docs | Homepage

pip install chromadb # python client
# for javascript, npm install chromadb!
# for client-server mode, chroma run --path /chroma_db_path

The core API is only 4 functions (run our 💡 Google Colab or Replit template):

import chromadb
# setup Chroma in-memory, for easy prototyping. Can add persistence easily!
client = chromadb.Client()

# Create collection. get_collection, get_or_create_collection, delete_collection also available!
collection = client.create_collection("all-my-documents")

# Add docs to the collection. Can also update and delete. Row-based API coming soon!
collection.add(
    # Chroma handles tokenization, embedding, and indexing automatically.
    # You can skip that and add your own embeddings as well.
    documents=["This is document1", "This is document2"],
    metadatas=[{"source": "notion"}, {"source": "google-docs"}],  # filter on these!
    ids=["doc1", "doc2"],  # unique for each doc
)

# Query/search 2 most similar results. You can also .get by id
results = collection.query(
    query_texts=["This is a query document"],
    n_results=2,
    # where={"metadata_field": "is_equal_to_this"}, # optional filter
    # where_document={"$contains":"search_string"}  # optional filter
)
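
The value returned by `collection.query` is a dict of lists with one inner list per query text. The sketch below walks that shape using a hand-written stand-in dict, so it runs without a database. The distance values are made-up placeholders rather than real output, `iter_hits` is a hypothetical helper name, and the set of keys assumes the default `include` settings.

```python
# Stand-in for what collection.query(...) returns: each field holds one
# inner list per query text. Distances are placeholders, not real output.
results = {
    "ids": [["doc1", "doc2"]],
    "documents": [["This is document1", "This is document2"]],
    "metadatas": [[{"source": "notion"}, {"source": "google-docs"}]],
    "distances": [[0.9, 1.0]],
}


def iter_hits(results, query_index=0):
    """Return a list of (id, document, metadata, distance) tuples for one query."""
    fields = ("ids", "documents", "metadatas", "distances")
    columns = [results[name][query_index] for name in fields]
    return list(zip(*columns))


for doc_id, document, metadata, distance in iter_hits(results):
    print(f"{doc_id} ({metadata['source']}): {document!r} distance={distance}")
```

Passing several strings in `query_texts` would add one inner list per string, which is why the helper takes a `query_index`.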

Features

  • Simple: Fully-typed, fully-tested, fully-documented == happiness
  • Integrations: 🦜️🔗 LangChain (python and js), 🦙 LlamaIndex and more soon
  • Dev, Test, Prod: the same API that runs in your Python notebook scales to your cluster
  • Feature-rich: Queries, filtering, density estimation and more
  • Free & Open Source: Apache 2.0 Licensed

Use case: ChatGPT for ______

For example, the "Chat your data" use case:

  1. Add documents to your database. You can pass in your own embeddings, embedding function, or let Chroma embed them for you.
  2. Query relevant documents with natural language.
  3. Compose documents into the context window of an LLM like GPT-3 for additional summarization or analysis.
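
Step 3 above can be sketched in plain Python. This is a minimal illustration, not part of Chroma: `build_prompt`, the prompt wording, and the `max_chars` budget are all assumptions, and a real application would usually budget by tokens rather than characters.

```python
def build_prompt(question, documents, max_chars=2000):
    """Pack retrieved documents into a prompt, stopping at a character budget."""
    context_parts = []
    used = 0
    for index, doc in enumerate(documents, start=1):
        entry = f"[{index}] {doc}"
        if used + len(entry) > max_chars:
            break  # stay inside the (assumed) context budget
        context_parts.append(entry)
        used += len(entry)
    context = "\n".join(context_parts)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Documents as they might come back from a query, most similar first.
retrieved = ["This is document1", "This is document2"]
prompt = build_prompt("What do the documents say?", retrieved)
print(prompt)
```

The resulting string is what you would send to the LLM of your choice. Because the most similar documents come first, they are the ones kept when the budget runs out.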

Embeddings?

What are embeddings?

  • Read the guide from OpenAI
  • Literal: Embedding something turns it from image/text/audio into a list of numbers. 🖼️ or 📄 => [1.2, 2.1, ....]. This process makes documents "understandable" to a machine learning model.
  • By analogy: An embedding represents the essence of a document. This enables documents and queries with the same essence to be "near" each other and therefore easy to find.
  • Technical: An embedding is the latent-space position of a document at a layer of a deep neural network. For models trained specifically to embed data, this is the last layer.
  • A small example: Suppose you search your photos for "famous bridge in San Francisco". By embedding this query and comparing it to the embeddings of your photos and their metadata, the search should return photos of the Golden Gate Bridge.
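
To make "near" concrete, here is a small sketch that compares a query vector to two photo vectors using cosine similarity. The three-dimensional vectors are hand-made, hypothetical values standing in for real model output, and cosine similarity is only one of several common ways to measure closeness.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made 3-dimensional "embeddings" (hypothetical values, not model output).
photos = {
    "golden_gate.jpg": [0.9, 0.8, 0.1],
    "birthday_cake.jpg": [0.1, 0.0, 0.9],
}
query = [0.8, 0.9, 0.0]  # stand-in for "famous bridge in San Francisco"

# Pick the photo whose vector points in the most similar direction.
best = max(photos, key=lambda name: cosine_similarity(query, photos[name]))
print(best)
```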

Embedding databases (also known as vector databases) store embeddings and let you search by nearest neighbors rather than by substrings, as a traditional database would. By default, Chroma uses Sentence Transformers to embed for you, but you can also use OpenAI embeddings, Cohere (multilingual) embeddings, or your own.
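
The difference between substring search and nearest-neighbor search can be sketched with a toy in-memory store. This is an illustration only, not how Chroma is implemented: the two-dimensional embeddings are hypothetical values, the function names are made up, and the brute-force scan stands in for the index a real vector database would use.

```python
import heapq
import math

# Tiny in-memory "collection": id -> (text, hypothetical 2-d embedding).
store = {
    "doc1": ("How to reset your password", [0.9, 0.1]),
    "doc2": ("Recovering a forgotten login", [0.8, 0.2]),
    "doc3": ("Quarterly sales report", [0.1, 0.9]),
}


def substring_search(term):
    """Traditional lookup: match documents containing the literal term."""
    term = term.lower()
    return [doc_id for doc_id, (text, _) in store.items() if term in text.lower()]


def nearest_neighbors(query_embedding, k=2):
    """Brute-force k-nearest-neighbor search by Euclidean distance."""
    return heapq.nsmallest(
        k, store, key=lambda doc_id: math.dist(query_embedding, store[doc_id][1])
    )


print(substring_search("password"))     # only documents with the literal word
print(nearest_neighbors([0.88, 0.12]))  # the two closest documents, nearest first
```

The substring search finds only the document that contains the literal word, while the nearest-neighbor search also surfaces the related login document because its vector sits close to the query.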

Get involved

Chroma is a rapidly developing project. We welcome PR contributors and ideas for how to improve the project.

Release Cadence: We currently release new tagged versions of the PyPI and npm packages on Mondays. Hotfixes go out at any time during the week.

License

Apache 2.0

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • chromadb-1.0.15.tar.gz (1.2 MB): Source

Built Distributions

  • chromadb-1.0.15-cp39-abi3-win_amd64.whl (19.5 MB): CPython 3.9+, Windows x86-64
  • chromadb-1.0.15-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (19.5 MB): CPython 3.9+, manylinux: glibc 2.17+ x86-64
  • chromadb-1.0.15-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (18.6 MB): CPython 3.9+, manylinux: glibc 2.17+ ARM64
  • chromadb-1.0.15-cp39-abi3-macosx_11_0_arm64.whl (18.1 MB): CPython 3.9+, macOS 11.0+ ARM64
  • chromadb-1.0.15-cp39-abi3-macosx_10_12_x86_64.whl (18.8 MB): CPython 3.9+, macOS 10.12+ x86-64