
Join our Slack channel!

Relevance AI - The ML Platform for Unstructured Data Analysis


🌎 80% of the world's data is unstructured: text, images, audio, video, and more.

🔥 Use Relevance to unlock the value of your unstructured data:

  • ⚡ Quickly analyze unstructured data with pre-trained machine learning models in a few lines of code.
  • ✨ Visualize your unstructured data: text highlights from named entity recognition, word clouds from keywords, bounding boxes from images.
  • 📊 Create charts for both structured and unstructured data.
  • 🔎 Drill down with filters and similarity search to explore and find insights.
  • 🚀 Share data apps with your team.

Sign up for a free account ->

Relevance AI also acts as a platform for:

  • 🔑 Vectors: store and query vectors with flexible vector similarity search that can be combined across multiple vectors, aggregations and filters.
  • 🔮 ML Dataset Evaluation: debug dataset labels and model outputs, and surface edge cases.

🧠 Documentation

Type | Link
Python API | Documentation
Python Reference | Documentation
Cloud Dashboard | Documentation

🛠️ Installation

Using pip:

pip install -U relevanceai

Using conda:

conda install -c relevance relevanceai

⏩ Quickstart


Log in to Relevance AI:

from relevanceai import Client

client = Client()

Prepare your documents for insertion using the following format:

  • Each document should be a dictionary
  • Include an _id field as the primary key; if it is omitted, one is generated automatically
  • Suffix vector fields with _vector_

docs = [
    {"_id": "1", "example_vector_": [0.1, 0.1, 0.1], "data": "Documentation"},
    {"_id": "2", "example_vector_": [0.2, 0.2, 0.2], "data": "Best document!"},
    {"_id": "3", "example_vector_": [0.3, 0.3, 0.3], "data": "document example"},
    {"_id": "4", "example_vector_": [0.4, 0.4, 0.4], "data": "this is another doc"},
    {"_id": "5", "example_vector_": [0.5, 0.5, 0.5], "data": "this is a doc"},
]
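
The formatting rules above can be checked locally before insertion. The following is a minimal sketch and not part of the relevanceai SDK: check_document is a hypothetical helper, and the rule that vector fields must hold lists of numbers is an assumption drawn from the example documents.

```python
from numbers import Real


def check_document(doc):
    """Return a list of problems found in one document (hypothetical helper)."""
    if not isinstance(doc, dict):
        return ["document is not a dictionary"]
    problems = []
    if "_id" not in doc:
        # Not fatal: a missing _id is generated automatically on insertion.
        problems.append("no _id field; one will be generated automatically")
    for field, value in doc.items():
        if field.endswith("_vector_"):
            # Assumption: a vector is a list of real numbers (booleans excluded).
            is_numeric_list = isinstance(value, list) and all(
                isinstance(x, Real) and not isinstance(x, bool) for x in value
            )
            if not is_numeric_list:
                problems.append(f"{field} is not a list of numbers")
    return problems


sample_docs = [
    {"_id": "1", "example_vector_": [0.1, 0.1, 0.1], "data": "Documentation"},
    {"example_vector_": "not a vector", "data": "Broken document"},
]

for doc in sample_docs:
    print(doc.get("_id"), check_document(doc))
```

The first sample document passes with no problems, while the second is reported for its missing _id and its non-numeric vector field.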

Insert data into a dataset

Create a dataset object with the name of the dataset you'd like to use. If it doesn't exist, it'll be created for you.

ds = client.Dataset("quickstart")
ds.insert_documents(docs)

Quick tip! The Dataset object supports common DataFrame-style methods such as .head(), .shape() and .info().

Perform vector search

query = [
    {"vector": [0.2, 0.2, 0.2], "field": "example_vector_"}
]
results = ds.search(
    vector_search_query=query,
    page_size=3,
)
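
To build intuition for what a vector search returns, the sketch below ranks the quickstart documents locally by their distance to the query vector. It is an illustration only: local_vector_search is a hypothetical helper, and because the similarity metric used by the Relevance AI service is not stated here, Euclidean distance is an assumption.

```python
import math


def local_vector_search(docs, vector, field, page_size=3):
    """Rank docs by Euclidean distance to `vector` (illustrative helper only)."""
    scored = [
        (math.dist(doc[field], vector), doc)
        for doc in docs
        if field in doc  # skip documents that lack the vector field
    ]
    # Sort on the distance only, so dictionaries are never compared.
    scored.sort(key=lambda pair: pair[0])
    return [doc for _, doc in scored[:page_size]]


docs = [
    {"_id": "1", "example_vector_": [0.1, 0.1, 0.1], "data": "Documentation"},
    {"_id": "2", "example_vector_": [0.2, 0.2, 0.2], "data": "Best document!"},
    {"_id": "3", "example_vector_": [0.3, 0.3, 0.3], "data": "document example"},
    {"_id": "4", "example_vector_": [0.4, 0.4, 0.4], "data": "this is another doc"},
    {"_id": "5", "example_vector_": [0.5, 0.5, 0.5], "data": "this is a doc"},
]

nearest = local_vector_search(docs, [0.2, 0.2, 0.2], "example_vector_")
print([doc["_id"] for doc in nearest])
```

Document "2" matches the query exactly, so it ranks first; documents "1" and "3" are roughly equidistant from the query and fill the remaining two slots.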

Learn more about how to flexibly configure your vector search ->

Perform clustering

Generate clusters

clusterop = ds.cluster(vector_fields=["example_vector_"])
clusterop.list_closest()

Generate clusters with sklearn

from sklearn.cluster import AgglomerativeClustering

cluster_model = AgglomerativeClustering()
clusterop = ds.cluster(vector_fields=["example_vector_"], model=cluster_model, alias="agglomerative")
clusterop.list_closest()

Learn more about how to flexibly configure your clustering ->

🧰 Config

The config object contains the adjustable global settings for the SDK. For a description of all the settings, see here.

To view setting options, run the following:

client.config.options

The syntax for selecting an option is section.key. For example, to disable logging, run the following to modify logging.enable_logging:

client.config.set_option('logging.enable_logging', False)
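
The section.key syntax addresses one setting inside a named section. The sketch below mimics that lookup on a plain nested dictionary. It is not the SDK's implementation: set_option here is a hypothetical stand-alone function, and the settings layout is assumed for illustration.

```python
def set_option(settings, option, value):
    """Set settings[section][key] from a 'section.key' string (illustrative)."""
    section, sep, key = option.partition(".")
    if not sep or not key:
        raise ValueError(f"expected 'section.key', got {option!r}")
    if section not in settings or key not in settings[section]:
        raise KeyError(f"unknown option: {option}")
    settings[section][key] = value


# Assumed layout; the real options are listed by client.config.options.
settings = {"logging": {"enable_logging": True}}
set_option(settings, "logging.enable_logging", False)
print(settings)
```

Rejecting unknown sections and keys, rather than creating them silently, is a design choice in this sketch that makes typos in option names visible.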

To restore all options to their default, run the following:

Changing the base URL

You can change the base URL as follows:

client.base_url = "https://.../latest"

🚧 Development

Getting Started

To get started with development, make sure pytest and mypy are installed. They are used for testing and type checking, respectively.

python -m pip install pytest mypy

Then run the tests and the type checker. Don't forget to set your test credentials first!

export TEST_PROJECT=xxx
export TEST_API_KEY=xxx

python -m pytest
mypy relevanceai
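
Because the tests depend on the TEST_PROJECT and TEST_API_KEY environment variables, it can help to fail early when one of them is missing. The sketch below is a hypothetical helper (something a conftest.py could call, for example) and not code from the repository; only the two variable names come from the instructions above.

```python
import os


def load_test_credentials(env=None):
    """Return (project, api_key), or raise if either variable is unset or empty."""
    env = os.environ if env is None else env
    missing = [name for name in ("TEST_PROJECT", "TEST_API_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return env["TEST_PROJECT"], env["TEST_API_KEY"]


# Passing a dict keeps the example independent of the real environment.
project, api_key = load_test_credentials({"TEST_PROJECT": "xxx", "TEST_API_KEY": "xxx"})
print(project)
```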

Set up pre-commit

pip install pre-commit
pre-commit install



