Dataverse SDK For Python
Dataverse is an MLOps platform that assists with data selection, data visualization, and model training in computer vision. Use the Dataverse SDK for Python to interact with the Dataverse platform from Python. Currently, the library supports:
- Create Project with your input ontology and sensors
- Get Project by project-id
- Create Dataset from your AWS/Azure storage or a local directory
- Get Dataset by dataset-id
Getting started
Install the package
pip install dataverse-sdk
Prerequisites: You must have a Dataverse Platform account and Python 3.9+ to use this package.
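Because the package requires Python 3.9 or newer, a script can check the interpreter version before importing the SDK. The following is a minimal sketch; the helper name `python_is_supported` is hypothetical and is not part of dataverse-sdk.

```python
import sys

MIN_PYTHON = (3, 9)  # minimum version stated in the prerequisites


def python_is_supported(version_info=None, minimum=MIN_PYTHON):
    """Return True when the given (or running) interpreter meets the minimum."""
    info = sys.version_info if version_info is None else version_info
    # Compare only (major, minor); the patch level does not matter here.
    return tuple(info[:2]) >= minimum


if __name__ == "__main__":
    if not python_is_supported():
        raise SystemExit("dataverse-sdk requires Python 3.9 or newer")
```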
Create the client
Interaction with the Dataverse site starts with an instance of the DataverseClient
class. You need the email address of your account and its password to instantiate the client object.
from dataverse_sdk import *
from dataverse_sdk.connections import get_connection
client = DataverseClient(
host=DataverseHost.STAGING, email="XXX", password="***"
)
assert client is get_connection()
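To keep the email and password out of source code, the credentials can be read from the environment before the client is created. This is a sketch under assumptions: the helper `load_credentials` and the variable names `DATAVERSE_EMAIL` and `DATAVERSE_PASSWORD` are hypothetical and are not defined by dataverse-sdk.

```python
import os


def load_credentials(env=None):
    """Read login details from environment variables (hypothetical names)."""
    env = os.environ if env is None else env
    email = env.get("DATAVERSE_EMAIL")
    password = env.get("DATAVERSE_PASSWORD")

    # Report every missing variable at once instead of failing on the first.
    missing = [
        name
        for name, value in (
            ("DATAVERSE_EMAIL", email),
            ("DATAVERSE_PASSWORD", password),
        )
        if not value
    ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {"email": email, "password": password}


# Usage with the client shown above (requires dataverse-sdk to be installed):
# client = DataverseClient(host=DataverseHost.STAGING, **load_credentials())
```

The returned dictionary uses the same keyword names (`email`, `password`) as the DataverseClient call in the example above, so it can be unpacked directly.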
Key concepts
Once you've initialized a DataverseClient, you can interact with Dataverse from the initialized object.
Examples
The following sections provide examples for the most common Dataverse tasks, including:
Create Project
The create_project
method creates a project on the connected site with the defined ontology and sensors.
ontology = Ontology(
name="test ot",
image_type=OntologyImageType._2D_BOUNDING_BOX,
classes=[
OntologyClass(name="Pedestrian", rank=1, color="#234567"),
OntologyClass(name="Truck", rank=2, color="#345678"),
OntologyClass(name="Car", rank=3, color="#456789"),
OntologyClass(name="Cyclist", rank=4, color="#567890"),
OntologyClass(name="DontCare", rank=5, color="#6789AB"),
OntologyClass(name="Misc", rank=6, color="#789AB1"),
OntologyClass(name="Van", rank=7, color="#89AB12"),
OntologyClass(name="Tram", rank=8, color="#9AB123"),
OntologyClass(name="Person_sitting", rank=9, color="#AB1234"),
],
)
sensors = [
Sensor(name="camera 1", type=SensorType.CAMERA),
Sensor(name="lidar 1", type=SensorType.LIDAR),
]
project_tag = ProjectTag(
attributes=[
{"name": "year", "type": "number"},
{
"name": "unknown_object",
"type": "option",
"options": [{"value": "fire"}, {"value": "leaves"}, {"value": "water"}],
},
]
)
project = client.create_project(name="test project", ontology=ontology, sensors=sensors, project_tag=project_tag)
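A long class list like the one above is easy to get wrong by hand, so it can help to check the entries before building the Ontology. The sketch below works on plain dictionaries rather than SDK objects. It assumes that class names and ranks should be unique and that colors are six-digit hex strings, as in the example; the source does not state that the platform enforces these rules, and `find_class_problems` is a hypothetical helper.

```python
import re

# Six-digit hex color such as "#234567" (assumed format, based on the example).
HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}$")


def find_class_problems(class_specs):
    """Return a list of human-readable problems found in the class specs."""
    problems = []
    seen_names, seen_ranks = set(), set()
    for spec in class_specs:
        name, rank, color = spec["name"], spec["rank"], spec["color"]
        if name in seen_names:
            problems.append(f"duplicate name: {name}")
        if rank in seen_ranks:
            problems.append(f"duplicate rank: {rank}")
        if not HEX_COLOR.match(color):
            problems.append(f"invalid color for {name}: {color}")
        seen_names.add(name)
        seen_ranks.add(rank)
    return problems


class_specs = [
    {"name": "Pedestrian", "rank": 1, "color": "#234567"},
    {"name": "Truck", "rank": 2, "color": "#345678"},
]
assert find_class_problems(class_specs) == []
# With dataverse-sdk installed, the checked specs can then be unpacked:
# classes = [OntologyClass(**spec) for spec in class_specs]
```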
Get Project
The get_project
method retrieves the project from the connected site. The id
parameter is the unique integer ID of the project, not its "name" property.
project = client.get_project(id)
Create Dataset
- Use
create_dataset
to create a dataset from cloud storage
dataset_data = {
"data_source": DataSource.Azure,  # or DataSource.AWS
"storage_url": "storage/url",
"container_name": "azure container name",
"data_folder": "datafolder/to/vai_anno",
"sas_token": "azure sas token",
"name": "Dataset 1",
"type": DatasetType.ANNOTATED_DATA,
"generate_metadata": False,
"render_pcd": False,
"annotation_format": AnnotationFormat.VISION_AI,
"sequential": False,
"sensors": project.sensors,
}
dataset = project.create_dataset(**dataset_data)
- Use
create_dataset
to create a dataset from your local directory
dataset_data = {
"data_source": DataSource.SDK,
"storage_url": "",
"container_name": "",
"sas_token": "",
"data_folder": "/path/to/your_localdir",
"name": "Dataset Local Upload",
"type": DatasetType.ANNOTATED_DATA,
"generate_metadata": False,
"render_pcd": False,
"annotation_format": AnnotationFormat.VISION_AI,
"sequential": False,
"sensors": project.sensors,
"extra_annotations": ["model_name"],  # optional
}
dataset = project.create_dataset(**dataset_data)
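The cloud and local examples above share most of their keys, so the common part can be factored into a small helper. This is only a sketch: `build_dataset_data` and `COMMON_DATASET_OPTIONS` are hypothetical names, not part of dataverse-sdk, and the defaults simply mirror the values used in the two examples.

```python
# Options that have the same value in both examples above.
COMMON_DATASET_OPTIONS = {
    "generate_metadata": False,
    "render_pcd": False,
    "sequential": False,
}


def build_dataset_data(name, data_source, data_folder, *, storage_url="",
                       container_name="", sas_token="", **extra):
    """Assemble keyword arguments for create_dataset from shared defaults."""
    data = dict(COMMON_DATASET_OPTIONS)  # copy so the defaults stay untouched
    data.update(
        name=name,
        data_source=data_source,
        data_folder=data_folder,
        storage_url=storage_url,
        container_name=container_name,
        sas_token=sas_token,
    )
    data.update(extra)  # caller-supplied keys override the defaults
    return data


# Usage for a local upload (requires dataverse-sdk and an existing project):
# dataset = project.create_dataset(**build_dataset_data(
#     "Dataset Local Upload", DataSource.SDK, "/path/to/your_localdir",
#     type=DatasetType.ANNOTATED_DATA,
#     annotation_format=AnnotationFormat.VISION_AI,
#     sensors=project.sensors,
# ))
```

The cloud-storage fields default to empty strings, matching the local-upload example, so only a cloud dataset needs to pass `storage_url`, `container_name`, and `sas_token`.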
Get Dataset
The get_dataset
method retrieves the dataset info from the connected site. The id
parameter is the unique integer ID of the dataset, not its "name" property.
dataset = client.get_dataset(id)
Troubleshooting
Next steps
Contributing
Links to language repos