Unofficial demo datasets for Weaviate
UNOFFICIAL Weaviate demo data uploader
This is an educational project that aims to make it easy to upload demo data to your instance of Weaviate. It is intended for users who are learning how to use Weaviate.
Usage
All datasets are based on the Dataset superclass, which includes a number of built-in methods that make the datasets easier to work with.
Once you have instantiated a dataset, upload it to Weaviate as follows:
import wv_datasets
dataset = wv_datasets.JeopardyQuestions1k() # Instantiate dataset (any class listed under "Available classes")
dataset.upload_dataset(client) # Add class to schema & Upload objects (uses batch uploads by default)
Here, client is the instantiated weaviate.Client object:
import weaviate
import os
import json

wv_url = "https://some-endpoint.weaviate.network"
api_key = os.environ.get("OPENAI_API_KEY")

auth = weaviate.AuthClientPassword(
    username=os.environ.get("WCS_USER"),
    password=os.environ.get("WCS_PASS"),
)

client = weaviate.Client(
    url=wv_url,
    auth_client_secret=auth,
    additional_headers={"X-OpenAI-Api-Key": api_key},
)
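Because os.environ.get returns None for an unset variable, a missing credential in the snippet above would only surface later, most likely as an authentication error. A minimal sketch of failing early instead is shown below. The helper name require_env is hypothetical; it is not part of weaviate or of this package.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        # Fail here, with the variable name, rather than deep inside the client.
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

For example, username=require_env("WCS_USER") could stand in for username=os.environ.get("WCS_USER") when building the auth object.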
Built-in methods
- .add_to_schema(client): Add defined classes to schema; returns status & any classes already present
- .upload_objects(client, batch_size): Adds objects; must specify batch size
- .upload_dataset(client): Runs .add_to_schema and .upload_objects; default batch size 100
- .get_class_definitions(): See the schema definition to be added
- .get_class_names(): See class names in the dataset
- .classes_in_schema(client): Check whether each class is already in the Weaviate schema
- .delete_existing_dataset_classes(client): If dataset classes are already in the Weaviate instance, delete them from the Weaviate instance.
- .set_vectorizer(vectorizer_name, module_config): Set the vectorizer and corresponding module configuration for the dataset. Datasets come pre-configured with a vectorizer & module configuration.
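To make the relationship between these methods concrete, here is a toy stand-in that mimics the behaviour described in the list: upload_dataset calls add_to_schema and then upload_objects with a default batch size of 100. This is not the package's implementation. ToyDataset, FakeClient, the batch_size keyword on upload_dataset, and all return shapes are assumptions made so that the sketch runs without a Weaviate instance.

```python
from typing import Dict, Iterator, List


class FakeClient:
    """Hypothetical in-memory stand-in for weaviate.Client (not the real API)."""

    def __init__(self) -> None:
        self.schema: Dict[str, dict] = {}    # class name -> class definition
        self.batches: List[List[dict]] = []  # one entry per uploaded batch


class ToyDataset:
    """Hypothetical dataset that mirrors the method names listed above."""

    def __init__(self, class_definitions: List[dict], objects: List[dict]) -> None:
        self._class_definitions = class_definitions
        self._objects = objects

    def get_class_names(self) -> List[str]:
        return [definition["class"] for definition in self._class_definitions]

    def classes_in_schema(self, client: FakeClient) -> Dict[str, bool]:
        # Assumed return shape: class name -> whether it already exists.
        return {name: name in client.schema for name in self.get_class_names()}

    def add_to_schema(self, client: FakeClient) -> List[str]:
        # Add missing classes and report the ones that were already present.
        already_present = []
        for definition in self._class_definitions:
            name = definition["class"]
            if name in client.schema:
                already_present.append(name)
            else:
                client.schema[name] = definition
        return already_present

    def _batched(self, batch_size: int) -> Iterator[List[dict]]:
        # Yield consecutive slices of at most batch_size objects.
        for start in range(0, len(self._objects), batch_size):
            yield self._objects[start:start + batch_size]

    def upload_objects(self, client: FakeClient, batch_size: int) -> int:
        for batch in self._batched(batch_size):
            client.batches.append(batch)
        return len(self._objects)

    def upload_dataset(self, client: FakeClient, batch_size: int = 100) -> int:
        # Schema first, then objects, as the method list describes.
        self.add_to_schema(client)
        return self.upload_objects(client, batch_size)

    def delete_existing_dataset_classes(self, client: FakeClient) -> None:
        for name in self.get_class_names():
            client.schema.pop(name, None)


client = FakeClient()
dataset = ToyDataset(
    class_definitions=[{"class": "Question"}],
    objects=[{"question": f"Q{i}"} for i in range(250)],
)
uploaded = dataset.upload_dataset(client)
print(uploaded, [len(batch) for batch in client.batches])  # 250 [100, 100, 50]
```

When re-running a tutorial, calling delete_existing_dataset_classes(client) before upload_dataset(client) may be a convenient way to start from a clean schema.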
Available classes
- WikiArticles
- WineReviews
- JeopardyQuestions1k
- JeopardyQuestions10k
Source code:
File details
Details for the file weaviate-demo-datasets-0.0.9.tar.gz.
File metadata
- Download URL: weaviate-demo-datasets-0.0.9.tar.gz
- Upload date:
- Size: 67.9 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | 5d51c22efdd22cae6d14577e0c796da8ba2faa275e096eff3cf72292d4c3b942
MD5 | eeac4a8e897da4a9b16a52254ae92f20
BLAKE2b-256 | 6e1a79694eb3569227a723dd84846e8cb2887c44dfbae5f08be5f4adab740190
File details
Details for the file weaviate_demo_datasets-0.0.9-py3-none-any.whl.
File metadata
- Download URL: weaviate_demo_datasets-0.0.9-py3-none-any.whl
- Upload date:
- Size: 72.2 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | f32e52bf773de9b8262d14604b90a9e124869d76207959295209e4b4be47d7bf
MD5 | dd84ca378327bf1722ce9003f7e787aa
BLAKE2b-256 | 97d35aa885e988918b4804996d82e2ed75b1b40b885cf5e4c7dc73c158eb5dd3