Dataverse SDK For Python
Dataverse is an MLOps platform that assists with data selection, data visualization, and model training in computer vision. Use the Dataverse SDK for Python to interact with the Dataverse platform from Python. Currently, the library supports:
- Create Project with your input ontology and sensors
- Get Project by project-id
- Create Dataset from your AWS/Azure storage or a local directory
- Get Dataset by dataset-id
Getting started
Install the package
pip install dataverse-sdk
Prerequisites: You must have a Dataverse Platform account and Python 3.9+ to use this package.
Create the client
Interaction with the Dataverse site starts with an instance of the DataverseClient
class. You need an email account and its password to instantiate the client object.
from dataverse_sdk import *
from dataverse_sdk.connections import get_connection
client = DataverseClient(
    host=DataverseHost.STAGING, email="XXX", password="***"
)
assert client is get_connection()
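To avoid hard-coding the email and password in a script, the credentials could be read from environment variables first. The following is only a minimal sketch: the helper name and the variable names DATAVERSE_EMAIL and DATAVERSE_PASSWORD are assumptions made for illustration and are not defined by the SDK.

```python
import os


def load_dataverse_credentials(env=None):
    """Return the email and password read from environment variables.

    DATAVERSE_EMAIL and DATAVERSE_PASSWORD are hypothetical names chosen
    for this sketch; the SDK itself does not define them.
    """
    env = os.environ if env is None else env
    required = ("DATAVERSE_EMAIL", "DATAVERSE_PASSWORD")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {"email": env["DATAVERSE_EMAIL"], "password": env["DATAVERSE_PASSWORD"]}


# Usage with the client shown above (requires dataverse-sdk to be installed):
# client = DataverseClient(host=DataverseHost.STAGING, **load_dataverse_credentials())
```

The returned dictionary uses the same keyword names (email, password) as the DataverseClient call above, so it can be unpacked directly into it.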
Key concepts
Once you've initialized a DataverseClient, you can interact with Dataverse from the initialized object.
Examples
The following sections provide examples for the most common Dataverse tasks, including:
List Projects
The list_projects
method lists all projects on the connected site.
projects = client.list_projects(
    current_user=True,
    exclude_sensor_type=SensorType.LIDAR,
    image_type=OntologyImageType._2D_BOUNDING_BOX,
)
Create Project
The create_project
method creates a project on the connected site with the defined ontology and sensors.
ontology = Ontology(
    name="test ot",
    image_type=OntologyImageType._2D_BOUNDING_BOX,
    classes=[
        OntologyClass(name="Pedestrian", rank=1, color="#234567"),
        OntologyClass(name="Truck", rank=2, color="#345678"),
        OntologyClass(name="Car", rank=3, color="#456789"),
        OntologyClass(name="Cyclist", rank=4, color="#567890"),
        OntologyClass(name="DontCare", rank=5, color="#6789AB"),
        OntologyClass(name="Misc", rank=6, color="#789AB1"),
        OntologyClass(name="Van", rank=7, color="#89AB12"),
        OntologyClass(name="Tram", rank=8, color="#9AB123"),
        OntologyClass(name="Person_sitting", rank=9, color="#AB1234"),
    ],
)
sensors = [
    Sensor(name="camera 1", type=SensorType.CAMERA),
    Sensor(name="lidar 1", type=SensorType.LIDAR),
]
project_tag = ProjectTag(
    attributes=[
        {"name": "year", "type": "number"},
        {
            "name": "unknown_object",
            "type": "option",
            "options": [{"value": "fire"}, {"value": "leaves"}, {"value": "water"}],
        },
    ]
)
project = client.create_project(
    name="test project", ontology=ontology, sensors=sensors, project_tag=project_tag
)
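Before building a long class list like the one above, it may help to sanity-check the raw values locally. The sketch below works on plain dictionaries and does not call the SDK. The rules it checks (unique names, unique ranks, six-digit hex colors) are assumptions inferred from the example, not documented platform constraints, and check_class_specs is a hypothetical helper.

```python
import re

# Six hexadecimal digits after a leading "#", as in the colors used above.
HEX_COLOR = re.compile(r"#[0-9A-Fa-f]{6}")


def check_class_specs(specs):
    """Return a list of human-readable problems found in the class specs."""
    problems = []
    seen_names, seen_ranks = set(), set()
    for spec in specs:
        name, rank, color = spec["name"], spec["rank"], spec["color"]
        if name in seen_names:
            problems.append(f"duplicate name: {name}")
        if rank in seen_ranks:
            problems.append(f"duplicate rank: {rank}")
        if not HEX_COLOR.fullmatch(color):
            problems.append(f"bad color for {name}: {color}")
        seen_names.add(name)
        seen_ranks.add(rank)
    return problems


specs = [
    {"name": "Pedestrian", "rank": 1, "color": "#234567"},
    {"name": "Truck", "rank": 2, "color": "#345678"},
]
print(check_class_specs(specs))

# With dataverse-sdk installed, the checked specs map onto the keyword
# arguments used above: classes = [OntologyClass(**spec) for spec in specs]
```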
Get Project
The get_project
method retrieves the project from the connected site. The id
parameter is the unique integer ID of the project, not its "name" property.
project = client.get_project(id)
Edit Project
To edit a project's contents, use the four functions below to add or edit project tags and ontology classes.
Add New Project Tags
- Note: You cannot add a project tag that already exists.
tag = {
    "attributes": [
        {"name": "month", "type": "number"},
        {
            "name": "weather",
            "type": "option",
            "options": [{"value": "sunny"}, {"value": "rainy"}, {"value": "cloudy"}],
        },
    ]
}
project_tag = ProjectTag(**tag)
client.add_project_tag(project_id=10, project_tag=project_tag)
# OR
project.add_project_tag(project_tag=project_tag)
Edit Project Tags
Note:
- You cannot edit a project tag that does not exist.
- You cannot modify the data type of an existing project tag.
- You cannot provide options that already exist on an attribute.
tag = {
    "attributes": [
        {
            "name": "weather",
            "type": "option",
            "options": [{"value": "unknown"}, {"value": "snowy"}],
        }
    ]
}
project_tag = ProjectTag(**tag)
client.edit_project_tag(project_id=10, project_tag=project_tag)
# OR
project.edit_project_tag(project_tag=project_tag)
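The three restrictions listed in the note for editing project tags can be checked locally before sending an edit. The sketch below is a hypothetical helper that compares plain attribute dictionaries; it assumes you already hold the project's current attributes in the same shape as the "attributes" entries above. How to fetch them from the platform is not covered here.

```python
def check_tag_edit(existing, edits):
    """Return the rule violations found in a proposed tag edit.

    Both arguments are lists of attribute dicts with "name", "type",
    and (for option attributes) "options" keys.
    """
    by_name = {attr["name"]: attr for attr in existing}
    problems = []
    for attr in edits:
        current = by_name.get(attr["name"])
        if current is None:
            # Rule 1: the tag must already exist.
            problems.append(f"{attr['name']}: tag does not exist")
            continue
        if attr["type"] != current["type"]:
            # Rule 2: the data type cannot change.
            problems.append(f"{attr['name']}: type cannot change")
        known = {opt["value"] for opt in current.get("options", [])}
        repeated = [o["value"] for o in attr.get("options", []) if o["value"] in known]
        if repeated:
            # Rule 3: only new options may be provided.
            problems.append(f"{attr['name']}: options already exist: {repeated}")
    return problems


existing = [
    {"name": "month", "type": "number"},
    {
        "name": "weather",
        "type": "option",
        "options": [{"value": "sunny"}, {"value": "rainy"}, {"value": "cloudy"}],
    },
]
edits = [
    {"name": "weather", "type": "option", "options": [{"value": "unknown"}, {"value": "snowy"}]}
]
print(check_tag_edit(existing, edits))
```

An empty list means the edit passes all three local checks; the platform may still apply rules of its own.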
Add New Ontology Classes
- Note: You cannot add an ontology class that already exists.
new_classes = [
    OntologyClass(
        name="obstruction",
        rank=9,
        color="#AB4321",
        attributes=[
            {
                "name": "status",
                "type": "option",
                "options": [{"value": "static"}, {"value": "moving"}],
            }
        ],
    )
]
client.add_ontology_classes(project_id=24, ontology_classes=new_classes)
# OR
project.add_ontology_classes(ontology_classes=new_classes)
Edit Ontology Classes
Note:
- You cannot edit an ontology class that does not exist.
- You cannot modify the data type of an existing ontology class attribute.
- You cannot provide options that already exist on an attribute.
edit_classes = [
    OntologyClass(
        name="obstruction",
        color="#AB4321",
        attributes=[
            {
                "name": "status",
                "type": "option",
                "options": [{"value": "unknown"}],
            }
        ],
    )
]
client.edit_ontology_classes(project_id=24, ontology_classes=edit_classes)
# OR
project.edit_ontology_classes(ontology_classes=edit_classes)
Create Dataset
- Use
create_dataset
to create a dataset from cloud storage
dataset_data = {
    "data_source": DataSource.Azure,  # or DataSource.AWS
    "storage_url": "storage/url",
    "container_name": "azure container name",
    "data_folder": "datafolder/to/vai_anno",
    "sas_token": "azure sas token",
    "name": "Dataset 1",
    "type": DatasetType.ANNOTATED_DATA,
    "annotations": ["groundtruth"],
    "generate_metadata": False,
    "render_pcd": False,
    "annotation_format": AnnotationFormat.VISION_AI,
    "sequential": False,
    "sensors": project.sensors,
}
dataset = project.create_dataset(**dataset_data)
- Use
create_dataset
to create a dataset from your local directory
dataset_data = {
    "data_source": DataSource.SDK,
    "storage_url": "",
    "container_name": "",
    "sas_token": "",
    "data_folder": "/path/to/your_localdir",
    "name": "Dataset Local Upload",
    "type": DatasetType.ANNOTATED_DATA,
    "generate_metadata": False,
    "auto_tagging": ["weather"],
    "render_pcd": False,
    "annotation_format": AnnotationFormat.VISION_AI,
    "sequential": False,
    "sensors": project.sensors,
    "annotations": ["model_name"],
}
dataset = project.create_dataset(**dataset_data)
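For local uploads, the cloud-only fields are left as empty strings and data_folder must point at a real directory. The sketch below wraps that pattern in a hypothetical helper, local_dataset_kwargs, which fails early if the folder is missing. It only builds the dictionary; SDK-specific values such as data_source, type, and sensors are passed through by the caller and are not validated here.

```python
import os


def local_dataset_kwargs(data_folder, name, **extra):
    """Build keyword arguments for a local-directory dataset upload.

    Raises FileNotFoundError when data_folder is not an existing directory.
    Any extra keyword arguments are passed through unchanged.
    """
    if not os.path.isdir(data_folder):
        raise FileNotFoundError(f"local data folder not found: {data_folder}")
    return {
        **extra,
        # Cloud storage fields stay blank for a local upload, as shown above.
        "storage_url": "",
        "container_name": "",
        "sas_token": "",
        "data_folder": data_folder,
        "name": name,
    }


# Usage with dataverse-sdk installed (the remaining fields follow the example above):
# kwargs = local_dataset_kwargs(
#     "/path/to/your_localdir",
#     "Dataset Local Upload",
#     data_source=DataSource.SDK,
#     type=DatasetType.ANNOTATED_DATA,
#     sensors=project.sensors,
# )
# dataset = project.create_dataset(**kwargs)
```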
Get Dataset
The get_dataset
method retrieves the dataset info from the connected site. The id
parameter is the unique integer ID of the dataset, not its "name" property.
dataset = client.get_dataset(id)
List Models
The list_models
method lists all the models in the given project.
# Option 1: list through the client
models = client.list_models(project_id=1)
# Option 2: list through a project object
project = client.get_project(project_id=1)
models = project.list_models()
Get Model
The get_model
method retrieves the detailed information of the model with the given model ID.
model = client.get_model(model_id=30)
# OR
model = project.get_model(model_id=30)
From the given model, you can download the label file and the Triton model file with the commands below.
status, label_file_path = model.get_label_file(save_path="./labels.txt", timeout=6000)
status, triton_model_path = model.get_triton_model_file(save_path="./model.zip", timeout=6000)
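After the download finishes, it can be worth confirming that the saved model file is a readable archive before deploying it. The sketch below assumes the Triton model file is a standard zip archive, which the ./model.zip save path suggests but the SDK documentation here does not state. list_archive_members is a hypothetical helper built only on the Python standard library.

```python
import io
import zipfile


def list_archive_members(path_or_file):
    """Return the member names of a zip archive after a basic integrity check."""
    if not zipfile.is_zipfile(path_or_file):
        raise ValueError("not a zip archive")
    with zipfile.ZipFile(path_or_file) as archive:
        # testzip() returns the first corrupt member name, or None if all are fine.
        corrupt = archive.testzip()
        if corrupt is not None:
            raise ValueError(f"corrupt member in archive: {corrupt}")
        return archive.namelist()


# Self-contained demo with an in-memory archive standing in for ./model.zip.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("example.txt", "placeholder")
demo.seek(0)
print(list_archive_members(demo))

# With a real download: list_archive_members(triton_model_path)
```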
Troubleshooting
Next steps
Contributing
Links to language repos