Dataverse SDK For Python

Dataverse is an MLOps platform that assists with data selection, data visualization, and model training in computer vision. The Dataverse SDK for Python lets you interact with the Dataverse platform from Python. Currently, the library supports:

  • Create Project with your input ontology and sensors
  • Get Project by project-id
  • Create Dataset from your AWS/Azure storage or a local directory
  • Get Dataset by dataset-id

Package (PyPi) | Source code

Getting started

Install the package

pip install dataverse-sdk

Prerequisites: You must have a Dataverse Platform account and Python 3.9+ to use this package.

Create the client

Interaction with the Dataverse site starts with an instance of the DataverseClient class. You need the email address and password of your account to instantiate the client object.

from dataverse_sdk import *
from dataverse_sdk.connections import get_connection
client = DataverseClient(
    host=DataverseHost.STAGING, email="XXX", password="***"
)
assert client is get_connection()
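
Hard-coding a password in a script makes it easy to leak. A minimal sketch, assuming you keep the credentials in environment variables; the variable names DATAVERSE_EMAIL and DATAVERSE_PASSWORD and the helper load_credentials are illustrative choices for this example, not part of the SDK:

```python
import os


def load_credentials(env=None):
    """Return (email, password) read from environment variables.

    DATAVERSE_EMAIL and DATAVERSE_PASSWORD are hypothetical names chosen
    for this sketch; the SDK itself does not define them.
    """
    env = os.environ if env is None else env
    required = ("DATAVERSE_EMAIL", "DATAVERSE_PASSWORD")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return env["DATAVERSE_EMAIL"], env["DATAVERSE_PASSWORD"]
```

The returned pair can then be passed as the email and password arguments of DataverseClient shown above.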

Key concepts

Once you've initialized a DataverseClient, you can interact with Dataverse from the initialized object.

Examples

The following sections provide examples for the most common Dataverse tasks, including:

List Projects

The list_projects method lists the projects on the connected site, optionally filtered by the arguments shown below.

projects = client.list_projects(
    current_user=True,
    exclude_sensor_type=SensorType.LIDAR,
    image_type=OntologyImageType._2D_BOUNDING_BOX,
)

Create Project

The create_project method creates a project on the connected site with the defined ontology and sensors.

ontology = Ontology(
    name="test ot",
    image_type=OntologyImageType._2D_BOUNDING_BOX,
    classes=[
        OntologyClass(name="Pedestrian", rank=1, color="#234567"),
        OntologyClass(name="Truck", rank=2, color="#345678"),
        OntologyClass(name="Car", rank=3, color="#456789"),
        OntologyClass(name="Cyclist", rank=4, color="#567890"),
        OntologyClass(name="DontCare", rank=5, color="#6789AB"),
        OntologyClass(name="Misc", rank=6, color="#789AB1"),
        OntologyClass(name="Van", rank=7, color="#89AB12"),
        OntologyClass(name="Tram", rank=8, color="#9AB123"),
        OntologyClass(name="Person_sitting", rank=9, color="#AB1234"),
    ],
)
sensors = [
    Sensor(name="camera 1", type=SensorType.CAMERA),
    Sensor(name="lidar 1", type=SensorType.LIDAR),
]
project_tag = ProjectTag(
    attributes=[
        {"name": "year", "type": "number"},
        {
            "name": "unknown_object",
            "type": "option",
            "options": [{"value": "fire"}, {"value": "leaves"}, {"value": "water"}],
        },
    ]
)

project = client.create_project(name="test project", ontology=ontology, sensors=sensors, project_tag=project_tag)
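
Before sending an ontology to the platform, it can help to sanity-check the class list locally. The sketch below works on plain (name, rank, color) tuples rather than OntologyClass objects. It assumes that names and ranks should be unique and that colors use the "#RRGGBB" form seen in the example above; these rules are inferred from the example, not confirmed by the SDK, and check_class_specs is a hypothetical helper:

```python
import re

# "#RRGGBB" with six hexadecimal digits, as in the example above.
HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}$")


def check_class_specs(specs):
    """Return a list of problems found in (name, rank, color) tuples.

    The uniqueness and color rules are assumptions made for this sketch.
    """
    problems = []
    seen_names, seen_ranks = set(), set()
    for name, rank, color in specs:
        if name in seen_names:
            problems.append(f"duplicate name: {name}")
        if rank in seen_ranks:
            problems.append(f"duplicate rank: {rank}")
        if not HEX_COLOR.match(color):
            problems.append(f"bad color for {name}: {color}")
        seen_names.add(name)
        seen_ranks.add(rank)
    return problems


specs = [("Pedestrian", 1, "#234567"), ("Truck", 2, "#345678"), ("Car", 3, "#456789")]
problems = check_class_specs(specs)  # empty list when nothing looks wrong
```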

Get Project

The get_project method retrieves a project from the connected site. The id parameter is the unique integer ID of the project, not its "name" property.

project = client.get_project(id)
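
Because the lookup takes an integer ID rather than a name, a small guard can catch a name passed by mistake before any request is made. This is a sketch; require_int_id is a hypothetical helper, not an SDK function:

```python
def require_int_id(value):
    """Return value unchanged if it is an int, otherwise raise TypeError.

    bool is rejected explicitly because it is a subclass of int in Python.
    """
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"expected an integer ID, got {value!r}")
    return value


project_id = require_int_id(10)
```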

Edit Project

To edit a project's contents, use the four functions below, which add or edit project tags and ontology classes.

Add New Project Tags

  • Note: You cannot add a project tag that already exists.

tag = {
    "attributes": [
        {"name": "month", "type": "number"},
        {
            "name": "weather",
            "type": "option",
            "options": [{"value": "sunny"}, {"value": "rainy"}, {"value": "cloudy"}],
        },
    ]
}
project_tag = ProjectTag(**tag)
client.add_project_tag(project_id=10, project_tag=project_tag)
# OR
project.add_project_tag(project_tag=project_tag)
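
The attribute dictionaries above follow two shapes: a "number" attribute, and an "option" attribute with a list of {"value": ...} entries. A minimal sketch of two builder functions that produce those shapes; number_attribute and option_attribute are hypothetical helpers written for this example, not SDK functions:

```python
def number_attribute(name):
    """Build a numeric tag attribute in the dictionary shape shown above."""
    return {"name": name, "type": "number"}


def option_attribute(name, values):
    """Build an option tag attribute; each value becomes {"value": value}."""
    return {
        "name": name,
        "type": "option",
        "options": [{"value": value} for value in values],
    }


tag = {
    "attributes": [
        number_attribute("month"),
        option_attribute("weather", ["sunny", "rainy", "cloudy"]),
    ]
}
```

The resulting tag dictionary has the same content as the hand-written one and can be unpacked into ProjectTag in the same way.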

Edit Project Tags

Note:

  1. You cannot edit a project tag that does not exist.
  2. You cannot change the data type of an existing project tag.
  3. You cannot supply options that already exist on an attribute.

tag = {
    "attributes": [
        {
            "name": "weather",
            "type": "option",
            "options": [{"value": "unknown"}, {"value": "snowy"}],
        }
    ]
}
project_tag = ProjectTag(**tag)
client.edit_project_tag(project_id=10, project_tag=project_tag)
# OR
project.edit_project_tag(project_tag=project_tag)
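
The third rule means an edit request should carry only option values that the attribute does not already have. A sketch of filtering a requested list against the values you know are present; options_to_add is a hypothetical helper, and the existing list must be supplied by the caller because this sketch does not query the platform:

```python
def options_to_add(existing, requested):
    """Return requested values not in existing, in order, without repeats."""
    seen = set(existing)
    fresh = []
    for value in requested:
        if value not in seen:
            fresh.append(value)
            seen.add(value)
    return fresh


existing = ["sunny", "rainy", "cloudy"]
fresh = options_to_add(existing, ["rainy", "unknown", "snowy", "unknown"])
# fresh == ["unknown", "snowy"]
options = [{"value": value} for value in fresh]
```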

Add New Ontology Classes

  • Note: You cannot add an ontology class that already exists.

new_classes = [
    OntologyClass(
        name="obstruction",
        rank=9,
        color="#AB4321",
        attributes=[
            {
                "name": "status",
                "type": "option",
                "options": [{"value": "static"}, {"value": "moving"}],
            }
        ],
    )
]
client.add_ontology_classes(project_id=24, ontology_classes=new_classes)
# OR
project.add_ontology_classes(ontology_classes=new_classes)

Edit Ontology Classes

Note:

  1. You cannot edit an ontology class that does not exist.
  2. You cannot change the data type of an existing ontology class attribute.
  3. You cannot supply options that already exist on an attribute.

edit_classes = [
    OntologyClass(
        name="obstruction",
        color="#AB4321",
        attributes=[
            {
                "name": "status",
                "type": "option",
                "options": [{"value": "unknown"}],
            }
        ],
    )
]
client.edit_ontology_classes(project_id=24, ontology_classes=edit_classes)
# OR
project.edit_ontology_classes(ontology_classes=edit_classes)

Create Dataset

  • Use create_dataset to create a dataset from cloud storage:

dataset_data = {
    "data_source": DataSource.Azure,  # or DataSource.AWS
    "storage_url": "storage/url",
    "container_name": "azure container name",
    "data_folder": "datafolder/to/vai_anno",
    "sas_token": "azure sas token",
    "name": "Dataset 1",
    "type": DatasetType.ANNOTATED_DATA,
    "annotations": ["groundtruth"],
    "generate_metadata": False,
    "auto_tagging": ["timeofday"],
    "render_pcd": False,
    "annotation_format": AnnotationFormat.VISION_AI,
    "sequential": False,
    "sensors": project.sensors,
    # The two keys below are only needed for a private S3 bucket;
    # omit them for a public S3 bucket or an Azure data source.
    "access_key_id": "aws s3 access key id",
    "secret_access_key": "aws s3 secret access key",
}
dataset = project.create_dataset(**dataset_data)
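
Since the two AWS keys apply only to private S3 buckets, one way to avoid sending them by accident is to add them to the dictionary conditionally. A sketch under that reading of the comments above; aws_key_entries is a hypothetical helper, and the shortened dataset_data here is a stand-in for the full dictionary:

```python
def aws_key_entries(access_key_id=None, secret_access_key=None):
    """Return the AWS key entries for a private bucket, or {} if none given.

    Supplying only one of the two keys is treated as a mistake.
    """
    if access_key_id is None and secret_access_key is None:
        return {}  # public S3 bucket or Azure: no AWS keys needed
    if not access_key_id or not secret_access_key:
        raise ValueError("a private S3 bucket needs both keys")
    return {"access_key_id": access_key_id, "secret_access_key": secret_access_key}


# Shortened stand-in for the full dataset_data dictionary shown above.
dataset_data = {"name": "Dataset 1", "storage_url": "storage/url"}
dataset_data.update(aws_key_entries("aws s3 access key id", "aws s3 secret access key"))
```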

  • Use create_dataset to create a dataset from a local directory:

dataset_data = {
    "data_source": DataSource.SDK,
    "storage_url": "",
    "container_name": "",
    "sas_token": "",
    "data_folder": "/path/to/your_localdir",
    "name": "Dataset Local Upload",
    "type": DatasetType.ANNOTATED_DATA,
    "generate_metadata": False,
    "auto_tagging": ["weather"],
    "render_pcd": False,
    "annotation_format": AnnotationFormat.VISION_AI,
    "sequential": False,
    "sensors": project.sensors,
    "annotations": ["model_name"],
}
dataset = project.create_dataset(**dataset_data)
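
For a local upload, it may be worth confirming that data_folder points at a real, non-empty directory before calling create_dataset. A sketch using only the standard library; summarize_local_folder is a hypothetical helper, and it does not check the folder layout the platform expects:

```python
from pathlib import Path


def summarize_local_folder(data_folder):
    """Return (resolved_path, file_count) for an existing, non-empty directory."""
    folder = Path(data_folder).expanduser()
    if not folder.is_dir():
        raise NotADirectoryError(f"{folder} is not a directory")
    # Count regular files anywhere below the folder.
    file_count = sum(1 for path in folder.rglob("*") if path.is_file())
    if file_count == 0:
        raise ValueError(f"{folder} contains no files")
    return folder.resolve(), file_count
```

The resolved path can then be converted with str() and used as the "data_folder" value.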

Get Dataset

The get_dataset method retrieves the dataset info from the connected site. The id parameter is the unique integer ID of the dataset, not its "name" property.

dataset = client.get_dataset(id)

List Models

The list_models method lists all the models in the given project.

# Option 1: list the models through the client
models = client.list_models(project_id=1)
# Option 2: list the models through a project object
project = client.get_project(project_id=1)
models = project.list_models()

Get Model

The get_model method retrieves the model details for the given model ID.

model = client.get_model(model_id=30)
# OR
model = project.get_model(model_id=30)

From the given model, you can download the label file, the Triton model file, or the ONNX model file with the commands below.

status, label_file_path = model.get_label_file(save_path="./labels.txt", timeout=6000)
status, triton_model_path = model.get_triton_model_file(save_path="./model.zip", timeout=6000)
status, onnx_model_path = model.get_onnx_model_file(save_path="./model.zip", timeout=6000)
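
The Triton and ONNX calls above both use "./model.zip" as save_path, so running them one after the other would target the same file. A sketch of deriving a distinct path per artifact; model_save_paths and its file-naming scheme are hypothetical, not something the SDK prescribes:

```python
from pathlib import Path


def model_save_paths(out_dir, model_id):
    """Return one distinct save path per downloadable artifact of a model."""
    base = Path(out_dir)
    return {
        "label": base / f"model_{model_id}_labels.txt",
        "triton": base / f"model_{model_id}_triton.zip",
        "onnx": base / f"model_{model_id}_onnx.zip",
    }


paths = model_save_paths("./downloads", 30)
```

Each value can be converted with str() and passed as the save_path argument of the corresponding call; the output directory has to exist or be created first.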

Troubleshooting

Next steps

Contributing

Links to language repos

Python Readme
