Chroma.

Chroma - the open-source search engine for AI.
The fastest way to build Python or JavaScript LLM apps that search over your data!

Discord | License | Docs | Homepage

pip install chromadb # python client
# for javascript, npm install chromadb!
# for client-server mode, chroma run --path /chroma_db_path

Chroma Cloud

Our hosted service, Chroma Cloud, powers serverless vector, hybrid, and full-text search. It's extremely fast, cost-effective, scalable and painless. Create a DB and try it out in under 30 seconds with $5 of free credits.

Get started with Chroma Cloud

API

The core API consists of only four functions (try them in our 💡 Google Colab):

import chromadb
# setup Chroma in-memory, for easy prototyping. Can add persistence easily!
client = chromadb.Client()

# Create collection. get_collection, get_or_create_collection, delete_collection also available!
collection = client.create_collection("all-my-documents")

# Add docs to the collection. Can also update and delete. Row-based API coming soon!
# Tokenization, embedding, and indexing are handled automatically.
# You can skip that and add your own embeddings as well.
collection.add(
    documents=["This is document1", "This is document2"],
    metadatas=[{"source": "notion"}, {"source": "google-docs"}], # filter on these!
    ids=["doc1", "doc2"], # unique for each doc
)

# Query/search 2 most similar results. You can also .get by id
results = collection.query(
    query_texts=["This is a query document"],
    n_results=2,
    # where={"metadata_field": "is_equal_to_this"}, # optional filter
    # where_document={"$contains":"search_string"}  # optional filter
)

Learn about all features on our Docs

Features

  • Simple: Fully-typed, fully-tested, fully-documented == happiness
  • Integrations: 🦜️🔗 LangChain (Python and JS), 🦙 LlamaIndex, and more soon
  • Dev, Test, Prod: the same API that runs in your Python notebook scales to your cluster
  • Feature-rich: Queries, filtering, regex and more
  • Free & Open Source: Apache 2.0 Licensed

Use case: ChatGPT for ______

For example, the "Chat your data" use case:

  1. Add documents to your database. You can pass in your own embeddings, embedding function, or let Chroma embed them for you.
  2. Query relevant documents with natural language.
  3. Compose documents into the context window of an LLM like GPT-4 for additional summarization or analysis.
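
Step 3 above can be sketched in a few lines. This is only an illustration: build_prompt and the prompt wording are hypothetical, not part of Chroma, and the retrieved list stands in for results["documents"][0] as returned by a collection.query call for a single query.

```python
def build_prompt(question: str, documents: list[str]) -> str:
    """Combine retrieved documents and a question into one prompt string."""
    # Number each document so the model can refer to its sources.
    context = "\n".join(
        f"[{i}] {doc}" for i, doc in enumerate(documents, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Stand-in for results["documents"][0] from a collection.query call.
retrieved = ["This is document1", "This is document2"]
prompt = build_prompt("What do the documents say?", retrieved)
print(prompt)
```

The resulting string would then be sent to whichever LLM you use. That call is left out here because it depends on the provider.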

Embeddings?

What are embeddings?

  • Read the guide from OpenAI
  • Literal: Embedding something turns it from image/text/audio into a list of numbers. 🖼️ or 📄 => [1.2, 2.1, ...]. This process makes documents "understandable" to a machine learning model.
  • By analogy: An embedding represents the essence of a document. This enables documents and queries with the same essence to be "near" each other and therefore easy to find.
  • Technical: An embedding is the latent-space position of a document at a layer of a deep neural network. For models trained specifically to embed data, this is the last layer.
  • A small example: Suppose you search your photos for "famous bridge in San Francisco". By embedding this query and comparing it to the embeddings of your photos and their metadata, the search should return photos of the Golden Gate Bridge.

Chroma allows you to store these vectors or embeddings and search by nearest neighbors rather than by substring matching, as a traditional database would. By default, Chroma uses Sentence Transformers to embed for you, but you can also use OpenAI embeddings, Cohere (multilingual) embeddings, or your own.
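
To make "search by nearest neighbors" concrete, here is a small brute-force sketch using cosine similarity. The file names and three-dimensional vectors are invented for illustration. Real embeddings come from a model and have hundreds of dimensions, and this is not a description of how Chroma indexes vectors internally.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up embeddings for three photos (assumption: toy values only).
photo_embeddings = {
    "golden_gate.jpg": [0.9, 0.8, 0.1],
    "cat_on_sofa.jpg": [0.1, 0.2, 0.9],
    "brooklyn_bridge.jpg": [0.7, 0.2, 0.3],
}

# Made-up embedding for the query "famous bridge in San Francisco".
query_embedding = [0.8, 0.9, 0.1]

# Rank photos from most to least similar to the query.
ranked = sorted(
    photo_embeddings,
    key=lambda name: cosine_similarity(query_embedding, photo_embeddings[name]),
    reverse=True,
)
print(ranked[0])  # the nearest neighbor of the query
```

A full scan like this is fine for a handful of vectors. Larger collections are where a dedicated vector index becomes useful.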

Get involved

Chroma is a rapidly developing project. We welcome PR contributors and ideas for how to improve the project.

Release Cadence

We currently release new tagged versions of the PyPI and npm packages on Mondays. Hotfixes go out at any time during the week.

License

Apache 2.0
