Chroma.

Chroma - the open-source search engine for AI.
The fastest way to build Python or JavaScript LLM apps that search over your data!

Discord | License | Docs | Homepage

pip install chromadb # python client
# for javascript, npm install chromadb!
# for client-server mode, chroma run --path /chroma_db_path

Chroma Cloud

Our hosted service, Chroma Cloud, powers serverless vector, hybrid, and full-text search. It's extremely fast, cost-effective, scalable and painless. Create a DB and try it out in under 30 seconds with $5 of free credits.

Get started with Chroma Cloud

API

The core API is only four functions (try them in our 💡 Google Colab):

import chromadb
# setup Chroma in-memory, for easy prototyping. Can add persistence easily!
client = chromadb.Client()

# Create collection. get_collection, get_or_create_collection, delete_collection also available!
collection = client.create_collection("all-my-documents")

# Add docs to the collection. Can also update and delete. Row-based API coming soon!
collection.add(
    # Chroma handles tokenization, embedding, and indexing automatically.
    # You can skip that and add your own embeddings as well.
    documents=["This is document1", "This is document2"],
    metadatas=[{"source": "notion"}, {"source": "google-docs"}],  # filter on these!
    ids=["doc1", "doc2"],  # unique for each doc
)

# Query/search 2 most similar results. You can also .get by id
results = collection.query(
    query_texts=["This is a query document"],
    n_results=2,
    # where={"metadata_field": "is_equal_to_this"}, # optional filter
    # where_document={"$contains":"search_string"}  # optional filter
)
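
The comments in the example mention two options it does not exercise: supplying your own embeddings and filtering on metadata. Here is a minimal sketch of both, assuming toy 3-dimensional vectors invented purely for illustration (real embeddings come from a model and are much longer). The collection name and the helper function are hypothetical, and the import is guarded only so that the snippet does nothing where chromadb is not installed.

```python
try:
    import chromadb
except ImportError:  # chromadb not installed: the sketch becomes a no-op
    chromadb = None


def query_with_own_embeddings():
    """Add two documents with hand-made vectors, then run a filtered query."""
    if chromadb is None:
        return None
    client = chromadb.Client()  # in-memory client, as in the example above
    collection = client.get_or_create_collection("own-embeddings-demo")
    collection.add(
        ids=["doc1", "doc2"],
        documents=["This is document1", "This is document2"],
        metadatas=[{"source": "notion"}, {"source": "google-docs"}],
        # Toy vectors (an assumption for this sketch), one per document.
        embeddings=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    )
    # Query with a vector instead of text, restricted to one metadata value.
    return collection.query(
        query_embeddings=[[0.9, 0.1, 0.0]],
        n_results=1,
        where={"source": "google-docs"},
    )


results = query_with_own_embeddings()
if results is not None:
    # One list of ids per query vector.
    print(results["ids"])
```

The query vector is closer to doc1, but the `where` filter limits the search to documents whose `source` is `google-docs`, so only doc2 is eligible.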

Learn about all features on our Docs

Features

  • Simple: Fully-typed, fully-tested, fully-documented == happiness
  • Integrations: 🦜️🔗 LangChain (python and js), 🦙 LlamaIndex and more soon
  • Dev, Test, Prod: the same API that runs in your python notebook, scales to your cluster
  • Feature-rich: Queries, filtering, regex and more
  • Free & Open Source: Apache 2.0 Licensed

Use case: ChatGPT for ______

For example, the "Chat your data" use case:

  1. Add documents to your database. You can pass in your own embeddings, embedding function, or let Chroma embed them for you.
  2. Query relevant documents with natural language.
  3. Compose documents into the context window of an LLM like GPT-4 for additional summarization or analysis.
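
Step 3 can be sketched in plain Python. This is one possible approach, not a Chroma API: `build_prompt`, its prompt wording, and the character-based budget `max_chars` are all assumptions made for illustration. It relies only on the fact that `collection.query` returns a dict whose `"documents"` entry holds one list of documents per query text.

```python
def build_prompt(question, query_results, max_chars=2000):
    """Compose retrieved documents and a question into a single LLM prompt."""
    # query results hold one list of documents per query text;
    # this sketch uses the hits for the first (only) query.
    documents = query_results["documents"][0]
    context_parts = []
    used = 0
    for i, doc in enumerate(documents, start=1):
        entry = f"[{i}] {doc}"
        if used + len(entry) > max_chars:
            break  # stay within the (character-based) context budget
        context_parts.append(entry)
        used += len(entry)
    context = "\n".join(context_parts)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Stand-in for the dict returned by collection.query(...).
example_results = {"documents": [["This is document1", "This is document2"]]}
prompt = build_prompt("What do the documents say?", example_results)
print(prompt)
```

The resulting string would then be sent to the LLM of your choice. A real application would likely budget in tokens rather than characters.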

Embeddings?

What are embeddings?

  • Read the guide from OpenAI
  • Literal: Embedding something turns it from image/text/audio into a list of numbers. 🖼️ or 📄 => [1.2, 2.1, ....]. This process makes documents "understandable" to a machine learning model.
  • By analogy: An embedding represents the essence of a document. This enables documents and queries with the same essence to be "near" each other and therefore easy to find.
  • Technical: An embedding is the latent-space position of a document at a layer of a deep neural network. For models trained specifically to embed data, this is the last layer.
  • A small example: If you search your photos for "famous bridge in San Francisco", embedding this query and comparing it to the embeddings of your photos and their metadata should return photos of the Golden Gate Bridge.

Chroma allows you to store these vectors or embeddings and search by nearest neighbors rather than by substrings like a traditional database. By default, Chroma uses Sentence Transformers to embed for you, but you can also use OpenAI embeddings, Cohere (multilingual) embeddings, or your own.
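
To make "search by nearest neighbors" concrete, here is a toy brute-force sketch in plain Python. It is not how Chroma is implemented. The 3-dimensional vectors, the file names, and the choice of cosine similarity as the closeness measure are all assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query_vec, stored, n_results=2):
    """Return the ids of the n_results stored vectors most similar to the query."""
    ranked = sorted(
        stored,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:n_results]]


# Made-up embeddings: (id, vector) pairs standing in for embedded photos.
photos = [
    ("golden_gate.jpg", [0.9, 0.1, 0.0]),
    ("brooklyn_bridge.jpg", [0.6, 0.1, 0.6]),
    ("cat.jpg", [0.0, 1.0, 0.1]),
]
# Made-up embedding of the query "famous bridge in San Francisco".
query = [1.0, 0.0, 0.1]

top = nearest(query, photos)
print(top)  # → ['golden_gate.jpg', 'brooklyn_bridge.jpg']
```

No substring of the query has to appear in a photo's name or metadata. The match comes entirely from the vectors pointing in similar directions.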

Get involved

Chroma is a rapidly developing project. We welcome PR contributors and ideas for how to improve the project.

Release Cadence

We currently release new tagged versions of the PyPI and npm packages on Mondays. Hotfixes go out at any time during the week.

License

Apache 2.0

Download files

Source Distribution

  • chromadb-1.5.5.tar.gz (2.4 MB): Source

Built Distributions

  • chromadb-1.5.5-cp39-abi3-win_amd64.whl (21.9 MB): CPython 3.9+, Windows x86-64
  • chromadb-1.5.5-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (21.6 MB): CPython 3.9+, manylinux: glibc 2.17+ x86-64
  • chromadb-1.5.5-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (20.7 MB): CPython 3.9+, manylinux: glibc 2.17+ ARM64
  • chromadb-1.5.5-cp39-abi3-macosx_11_0_arm64.whl (20.1 MB): CPython 3.9+, macOS 11.0+ ARM64
  • chromadb-1.5.5-cp39-abi3-macosx_10_12_x86_64.whl (20.8 MB): CPython 3.9+, macOS 10.12+ x86-64

File details

Details for the file chromadb-1.5.5.tar.gz.

File metadata

  • Download URL: chromadb-1.5.5.tar.gz
  • Size: 2.4 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: maturin/1.12.6

File hashes

Hashes for chromadb-1.5.5.tar.gz
Algorithm Hash digest
SHA256 8d669285b77cc288db27583a57b2f85ba451a9b8e3bef85a260cd78e6b57be35
MD5 451587100b760350452bb4084ed6e8bb
BLAKE2b-256 3a6dab03e16be3ec663e353166f38be082efb51c0988687f8c8eee1416a7e732

See more details on using hashes here.

File details

Details for the file chromadb-1.5.5-cp39-abi3-win_amd64.whl.

File metadata

  • Download URL: chromadb-1.5.5-cp39-abi3-win_amd64.whl
  • Size: 21.9 MB
  • Tags: CPython 3.9+, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: maturin/1.12.6

File hashes

Hashes for chromadb-1.5.5-cp39-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 3953403b63bb1c05405d10db36d183c4d19a027938c15898510d11943499046f
MD5 3ac0b399e075dac461496cd7aafc987b
BLAKE2b-256 a2dfce1ffcc0ad3eef8bd35b920809b990e6925ba94b2580dc5bd7ccde0fc06a

File details

Details for the file chromadb-1.5.5-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for chromadb-1.5.5-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 bb238ae508a6ce68fdd7875e040d7e5aa29d6e40fb651b51f5537b7cda789762
MD5 edf2d37084407c46d01f9b91a421a880
BLAKE2b-256 d366e0b35c41be7c02d6fa37f6c8f61a16b7b20607ddc847574e9a5503fe853b

File details

Details for the file chromadb-1.5.5-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for chromadb-1.5.5-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 2f54e7736ae0eeec436a1c1fb04b77b2c6c4108996790ef16f88327e38ad13cd
MD5 032d32f327144be908193c35cbefe6a1
BLAKE2b-256 a85a11543a76ab25c55bec6133bb98ce0dc0f4850acb36600344d8286734a051

File details

Details for the file chromadb-1.5.5-cp39-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for chromadb-1.5.5-cp39-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 5ff2912d20a82fdbf4e27ff3e1c91dab25e2ba2c629f9739bc12c11a3151aac7
MD5 f99fea2639164adbeda74019b399fd62
BLAKE2b-256 f8ce430a87d906f79cdc7e23efcd89dd237e3dbedaf6704b40ce1da127993bf8

File details

Details for the file chromadb-1.5.5-cp39-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for chromadb-1.5.5-cp39-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 d590998ed81164afbfb1734bb534b25ec2c9810fc1c5ce53bf8f7ac644a79887
MD5 8d316944d8d0f13b712a134b8a772736
BLAKE2b-256 f062ee578f8ccd62928257558b13a3e7c236e402cfb319c9b201b6a75897d644
