Python bindings for the Datature API

Datature Python SDK

Join Datature Slack MIT license


Empower your MLOps pipelines and applications with seamless integrations

Automate tasks to manage your datasets, run training experiments, and export and deploy your models from Datature Nexus with ease. You can work from Python scripts or from the command-line interface (CLI).


Getting Started

Prerequisites

  • 3.8 <= Python <= 3.12

We recommend creating a virtual environment before installing any dependencies. For more information on virtual environments, please refer to the documentation for your environment manager (for example, Python's built-in venv module).

Installation

pip install --upgrade datature
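To confirm that the interpreter meets the prerequisite above and that the package is visible to it, a short check along these lines may help. This is a minimal sketch using only the standard library; the helper names `python_version_supported` and `installed_version` are illustrative and not part of the SDK.

```python
import sys
from importlib import metadata

# Supported range taken from the prerequisites above.
MIN_PYTHON = (3, 8)
MAX_PYTHON = (3, 12)


def python_version_supported(version=None):
    """Return True if (major, minor) falls within the supported range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_PYTHON <= major_minor <= MAX_PYTHON


def installed_version(package="datature"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("Python supported:", python_version_supported())
print("datature version:", installed_version())
```

Run it with the same interpreter (and virtual environment) that you used for pip install, since each environment keeps its own set of packages.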

Python Usage

For a full list of documentation and examples, please refer to the API docs.

Authentication

To get started, you will first need to create a project on Datature Nexus (you can sign up for a free account here). You will then need to locate the project secret key. This key can only be accessed if you are the Project Owner or have been granted elevated permissions by the Project Owner, and it will be used for all subsequent authentication when invoking the various SDK functions.
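Because the secret key grants access to the project, you may prefer not to hardcode it in scripts. The sketch below reads it from an environment variable instead. The variable name `DATATURE_SECRET_KEY` and the helpers `load_secret_key` and `make_client` are assumptions made for illustration, not something the SDK defines; `nexus.Client(...)` is the constructor used in the examples in this README.

```python
import os

# Hypothetical variable name chosen for this sketch; the SDK does not mandate it.
SECRET_ENV_VAR = "DATATURE_SECRET_KEY"


def load_secret_key(env=None):
    """Read the project secret key from the environment, failing loudly if unset."""
    env = os.environ if env is None else env
    key = env.get(SECRET_ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"Set {SECRET_ENV_VAR} to your project secret key")
    return key


def make_client(env=None):
    """Build a Nexus client from the key held in the environment."""
    # Imported lazily so load_secret_key() can be used without the SDK installed.
    from datature import nexus

    return nexus.Client(load_secret_key(env))
```

With the variable exported in your shell, `client = make_client()` would take the place of passing the key as a string literal.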

Examples

To list projects:

from datature import nexus

client = nexus.Client("31a9f0dd997cb632765fc0d222369f6106327c3d20719d31f6ffafe51708f117")
projects = client.list_projects()

To upload assets:

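import logging
from datature import nexus

# Emit SDK log messages to the console while the upload runs.
logging.basicConfig()
client = nexus.Client("31a9f0dd997cb632765fc0d222369f6106327c3d20719d31f6ffafe51708f117")
# Retrieve a single project by its project key.
project = client.get_project("proj_fca32b1bb15405d1c2bde19fd90b516d")

# Add every asset under the folder to an upload session tagged with the "dataset" group.
upload_session = project.assets.create_upload_session(groups=["dataset"])
with upload_session as session:
    session.add_path("/Users/dataset")
print(len(upload_session))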

Logging

You can vary the logging level depending on your task or use case (such as DEBUG to provide more insights), but the default INFO level is typically best suited for production use.

import logging

logging.basicConfig()
logging.getLogger("datature-nexus").setLevel(logging.DEBUG)
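If you switch levels often, the snippet above can be wrapped in a small helper that accepts the level by name. This is only a sketch: `set_sdk_log_level` is a hypothetical helper, and the logger name is the one shown above.

```python
import logging

SDK_LOGGER_NAME = "datature-nexus"  # logger name used in the snippet above


def set_sdk_log_level(level_name="INFO"):
    """Set the SDK logger level from a name such as "DEBUG" or "WARNING"."""
    level = logging.getLevelName(level_name.upper())
    if not isinstance(level, int):
        # getLevelName returns a string for names it does not recognize.
        raise ValueError(f"Unknown logging level: {level_name!r}")
    logger = logging.getLogger(SDK_LOGGER_NAME)
    logger.setLevel(level)
    return logger


logging.basicConfig()
set_sdk_log_level("debug")
```

Rejecting unknown names early avoids silently running at an unintended level after a typo.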

CLI Usage

For a full list of documentation and examples, please refer to the CLI docs.

Authentication

To get started, you will first need to create a project on Datature Nexus (you can sign up for a free account here). You will then need to locate the project secret key. This key can only be accessed if you are the Project Owner or have been granted elevated permissions by the Project Owner, and it will be used for all subsequent authentication when invoking CLI commands.

Once you have the project secret, you will now be able to make API requests using the CLI by entering the command datature projects auth:

datature projects auth
[?] Enter the project secret: ************************************************
[?] Make [Your Project Name] the default project? (Y/n): y

Authentication succeeded.

You will now be able to run the CLI commands outlined below. To see all available commands along with their required inputs and expected outputs, check out the CLI docs.

Project Management

datature projects

Show a help page of various functions to add projects, select the default project, and retrieve project information.

Authenticate Project

datature projects auth

Authenticate new projects using the project secret key. Multiple projects can be authenticated and stored using different secret keys.

Select Project

datature projects select

Select an active project to work on from a list of saved projects. All subsequent CLI commands will be in the context of the selected project until a different project is selected, or the shell session is terminated.

$ datature projects select

> Brain Tumor DICOM
  Hand Gesture Keypoint Detection

Your active project is now: [Brain Tumor DICOM]

List Projects

datature projects list

View a table of saved projects with columns containing the names of the projects, the total number of assets, the number of annotated assets, the number of annotations, and the number of tags for each project. The name of the active project is displayed at the bottom of the list.

$ datature projects list

NAME                               TOTAL_ASSETS        ANNOTATED_ASSETS    ANNOTATIONS         TAGS
Brain Tumor DICOM                  4071                433                 1874                3
Hand Gesture Keypoint Detection    718                 53                  959                 4

Your active project is now: [Brain Tumor DICOM]
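If you want to use these figures in a script, the table can be parsed from captured output. The sketch below assumes the layout shown above (columns separated by two or more spaces, with every column except NAME holding an integer); the actual CLI output may differ between versions, and `parse_projects_table` is an illustrative helper rather than part of the SDK.

```python
import re

# Sample copied from the output shown above.
SAMPLE_OUTPUT = """\
NAME                               TOTAL_ASSETS        ANNOTATED_ASSETS    ANNOTATIONS         TAGS
Brain Tumor DICOM                  4071                433                 1874                3
Hand Gesture Keypoint Detection    718                 53                  959                 4
"""


def parse_projects_table(text):
    """Split each row on runs of two or more spaces and map cells to header names."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = re.split(r"\s{2,}", lines[0].strip())
    rows = []
    for line in lines[1:]:
        cells = re.split(r"\s{2,}", line.strip())
        if len(cells) != len(header):
            continue  # skip non-table lines such as the active-project footer
        row = dict(zip(header, cells))
        # Every column except NAME holds a count.
        rows.append({k: (v if k == "NAME" else int(v)) for k, v in row.items()})
    return rows


for project in parse_projects_table(SAMPLE_OUTPUT):
    fraction = project["ANNOTATED_ASSETS"] / project["TOTAL_ASSETS"]
    print(f'{project["NAME"]}: {fraction:.1%} of assets annotated')
```

Splitting on two or more spaces keeps project names that contain single spaces intact.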

Asset Management

Upload Assets

datature assets upload

Upload assets to Datature Nexus. You will be prompted to enter the path to the folder containing the assets that you wish to upload, as well as optional group name(s) to categorize the set of assets. This function is designed specially for bulk uploading of large datasets, which accelerates the process of onboarding data for subsequent annotation and training.

This function also supports DICOM and NIfTI file upload, which caters to important medical use cases.

$ datature assets upload
[?] Enter the assets folder path to be uploaded: /Downloads/Training2
[?] Enter the assets group name(s), split by ',': main
[?] 281 assets will be uploaded to group(s) (main)? (Y/n):
Preparing    |████████████████████████████████████████| 281/281 [100%] in 0.1s (2775.28/s)
Processing   |████████████████████████████████████████| 100% [281/281] in 1:17.5 (3.56/s)
Server processing completed.

Group Assets

datature assets groups

List asset group information within your project. You will be prompted to select an existing group or create a new group. If you select an existing group, information about the selected group will be displayed, including the total number of assets in the group, the number of assets that have been annotated, reviewed, or marked for fixes, and the number of assets that have been completed.

$ datature assets groups

> main
  validation

NAME            TOTAL           ANNOTATED       REVIEW          TOFIX           COMPLETED
main            8               1               0               0               0

Annotation Management

Upload Annotations

datature annotations upload

Upload annotation files to Datature Nexus. You will be prompted to enter the path of the annotation file you wish to upload and select a supported annotation format.

$ datature annotations upload
[?] Enter the annotation files path to be uploaded: /Users/Downloads/Training.csv
Processing   |████████████████████████████████████████| 100% [1/1] in 7.0s (0.14/s)
Server processing completed.

Download Annotations

datature annotations download

Download annotation files from Datature Nexus. You will be prompted to enter a path to save the downloaded annotation file to, and select the desired annotation format.

$ datature annotations download
[?] Enter the annotation files path to be download: /Users/Downloads/
[?] Select the annotation file format: csv_widthheight
   csv_fourcorner
 > csv_widthheight
   coco
   pascal_voc
   yolo_keras_pytorch
   yolo_darknet
   polygon_single
   polygon_coco

Processing   |████████████████████████████████████████| 100% [1/1] in 7.0s (0.14/s)
Server processing completed.

Artifact Management

Artifact Download

datature artifacts download

Download a model artifact from Datature Nexus. You will be prompted to enter a folder path to save the model to, and select the name and export format of the artifact to download.

$ datature artifacts download
[?] Enter the folder path to save model: /Volumes/
[?] Which artifact do you want to download?: BEAF45-Workflow
 > BEAF45-Workflow

[?] Which model format do you want to download?: tensorflow
 > TensorFlow
   TFLite
   ONNX

Downloading  |████████████████████████████████████████| 100% [443421394/443421394] in 7.1s (62639992.12/s)

FAQ

How do I find my Secret Key and Project Key?

We provide a step-by-step guide to finding these two crucial keys in our Developer's Documentation. You can also explore the other sections under Python SDK to learn more about the full functionality and feature set.

I'm facing some issues, what now?

We're sorry to hear that. Please head over to our Issues page and post a detailed bug report following our guidelines, and we will address your concerns as soon as we can. Alternatively, ping us in our Community Slack, where our engineers will attend to your needs.

I've noticed that some features are missing, how do I contribute?

Datature Python SDK is open-source and we welcome everyone to help improve it. Please check out our Contributing Guide to learn how you can be a part of the team.

How do I resolve the command not found: datature error for the CLI?

The command not found: datature error indicates that either the Datature SDK/CLI has not been installed at all, or the datature executable is not in a directory listed in your system's PATH. To resolve this error, please follow these steps:

Ensure Datature CLI is Installed

Before anything else, verify that you've installed the Datature CLI. You can install it using pip with the following command:

pip install datature

If you're using a virtual environment (which is recommended), ensure that it's activated before running the installation command.

Check Your PATH

After installation, the datature command should be available on your system's PATH automatically. If it is not found, you may need to add the directory containing the datature executable to your PATH manually. First check whether your shell can locate the command, or find where pip installed the package:

which datature

Or

pip show datature

pip show datature reports the location of the installed package, for example:

Location: /Users/.pyenv/versions/3.8.18/lib/python3.8/site-packages

Add the Path to Your Profile

Open your shell profile file with a text editor. This file could be one of ~/.bash_profile, ~/.bashrc, ~/.zshrc, etc., depending on which shell you use and the specific configuration of your operating system.

For example, you can add the path using the following command:

echo 'export PATH="$PATH:/path/to/datature"' >> ~/.bash_profile

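Replace /path/to/datature with the directory that contains the datature executable. Note that the Location reported by pip show datature is the site-packages directory; the executable itself is typically in the bin directory of the same Python installation (Scripts on Windows).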

Restart Your Terminal

Sometimes, changes to the PATH environment variable do not take effect until you open a new terminal session. After installation or modification of the PATH, close your current terminal and open a new one, then try the command again.
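The checks above can also be run from Python, which helps confirm that you are inspecting the same interpreter that pip installed into. This is a minimal sketch using only the standard library; `diagnose_cli` is an illustrative helper, and it assumes a regular or virtual-environment install (a pip install --user places scripts in a different directory).

```python
import os
import shutil
import sysconfig


def diagnose_cli(command="datature"):
    """Report where the command resolves and whether the scripts directory is on PATH."""
    # Directory where pip places console scripts for this interpreter.
    scripts_dir = sysconfig.get_path("scripts")
    path_entries = [
        os.path.normpath(entry)
        for entry in os.environ.get("PATH", "").split(os.pathsep)
        if entry
    ]
    return {
        "resolved": shutil.which(command),  # None when the command is not on PATH
        "scripts_dir": scripts_dir,
        "scripts_dir_on_path": os.path.normpath(scripts_dir) in path_entries,
    }


report = diagnose_cli()
for name, value in report.items():
    print(f"{name}: {value}")
```

If `resolved` is None and `scripts_dir_on_path` is False, the `scripts_dir` value is a likely candidate for the directory to add to your PATH.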

What does the warning As the c extension couldn't be imported, google-crc32c is using a pure python implementation that is significantly slower. If possible, please configure a c build environment and compile the extension mean, and do I need to take any action?

This warning indicates that the google-crc32c library is falling back to a pure Python implementation for calculating CRC32C checksums because it cannot find its C extension, which is usually faster. The Python implementation is fully functional but slower than its C counterpart. This does not affect the correctness of the library, but you may notice slower performance if your application relies on high-speed checksum computation.

To address this warning for improved performance, you can take the following steps to compile the C extension:

Install the Build Essentials

Make sure that you have the necessary build tools installed to compile the C extension.

On Ubuntu/Debian systems, you can install build-essential by running:

sudo apt-get install build-essential

On Red Hat/CentOS systems, you can use:

sudo yum install gcc gcc-c++ make

On macOS, ensure you have Xcode Command Line Tools installed:

xcode-select --install

Reinstall google-crc32c with C Extension

With the build environment configured, you can attempt to reinstall the google-crc32c library which should now include the C extension:

pip install --no-cache-dir --force-reinstall google-crc32c

By taking these steps, you should be able to eliminate the warning by having the faster C extension compiled and used by google-crc32c. If speed is not critical for your use case, you can choose to ignore this warning, and the library will still function correctly, albeit with potentially decreased performance.
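To verify whether the reinstall took effect, you can ask the library which implementation it loaded. The sketch below assumes that google-crc32c exposes a module-level `implementation` attribute (for example "c" or "python"), as recent releases appear to do; `crc32c_implementation` is an illustrative helper that returns None when the package or the attribute is missing.

```python
import importlib
import warnings


def crc32c_implementation():
    """Return google-crc32c's active implementation name, or None if unavailable."""
    try:
        with warnings.catch_warnings():
            # Silence the slow-implementation warning while probing.
            warnings.simplefilter("ignore")
            module = importlib.import_module("google_crc32c")
    except ImportError:
        return None
    # Assumption: the attribute exists; fall back to None if it does not.
    return getattr(module, "implementation", None)


print("google-crc32c implementation:", crc32c_implementation())
```

A value of "python" would suggest that the C extension is still not being used in the current environment.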
