Deep Learning, Annotations, Training Data.

What is Diffgram?

Diffgram is all about training data: data that is ready to be used by AI systems.

Training data is created by combining raw data with human-centered meaning, for example an image combined with a box identifying an object. The encoded meaning can be relatively simple, such as a single bounding box, or complex, such as a time-series video with a graph of attributes.

Tell me moar

Motivation

  • Subject matter experts are the annotators, and they need an easy way to annotate.
  • Annotations are becoming more complex, and data changes more frequently.
  • Larger projects need organization across data, people, and teams.

I need motivation

What can I do with Diffgram SDK?

  • Create batches of work (Jobs), including sending files
  • Export annotations programmatically
  • Create training data and work with files in a deep learning native format.

And a lot more (features list)

Install

Full Documentation

Quickstart

pip install diffgram

On Linux: pip3 install diffgram

  1. Get credentials from Diffgram.com
  2. Download sample files from GitHub
  3. Configure credentials

Example

from diffgram import Project

project = Project(project_string_id = "replace_with_project_string",
		  client_id = "replace_with_client_id",
		  client_secret = "replace_with_client_secret"	)
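To keep the client secret out of source control, the three arguments can be read from the environment instead of being typed into the script. The sketch below is one way to do that. The environment variable names and the helper load_diffgram_credentials are assumptions made for illustration; they are not part of the SDK.

```python
import os

# Hypothetical environment variable names; pick whatever suits your setup.
ENV_NAMES = {
    "project_string_id": "DIFFGRAM_PROJECT_STRING_ID",
    "client_id": "DIFFGRAM_CLIENT_ID",
    "client_secret": "DIFFGRAM_CLIENT_SECRET",
}


def load_diffgram_credentials(environ=None):
    """Return the keyword arguments for Project(), read from the environment.

    Raises KeyError naming every variable that is missing or empty.
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in ENV_NAMES.values() if not environ.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {key: environ[name] for key, name in ENV_NAMES.items()}
```

With the variables set, the project can be created with Project(**load_diffgram_credentials()).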

Import data

Create a batch of work and send data to it

job = project.job.new()

# signed_url_list is a list of signed URL strings that you generate yourself
for signed_url in signed_url_list:

	result = project.file.from_url(
		signed_url,
		job = job
	)
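For larger batches it can help to record which URLs failed rather than stopping at the first error. The following is a minimal sketch that wraps the loop above. It uses only project.job.new() and project.file.from_url() as shown; the helper name send_urls_to_job is hypothetical, and because the SDK's exception types are not described here, it catches Exception broadly.

```python
def send_urls_to_job(project, signed_url_list):
    """Create one job and attach every URL to it.

    Returns (job, results, failures). `failures` is a list of
    (url, exception) pairs, so one bad URL does not stop the batch.
    """
    job = project.job.new()
    results, failures = [], []
    for signed_url in signed_url_list:
        try:
            results.append(project.file.from_url(signed_url, job=job))
        except Exception as error:  # assumption: exact exception types unknown
            failures.append((signed_url, error))
    return job, results, failures
```

The failures list can then be retried or logged once the loop has finished.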

Signed URL Guide

Importing from a URL (e.g. a cloud provider)

result = project.file.from_url(url)

See our help article on signed URLs

Importing existing instances


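instance_bravo = {
		'type': 'box',
		'name': 'cat',
		'x_max': 128,
		'x_min': 1,
		'y_min': 1,
		'y_max': 128
		}

# instance_alpha is used in the packet below but is not defined in this example;
# as a placeholder, reuse the same box
instance_alpha = dict(instance_bravo)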

# Combine into image packet

image_packet = {'instance_list' : [instance_alpha, instance_bravo],
		'media' : {
			'url' : "https://www.readersdigest.ca/wp-content/uploads/sites/14/2011/01/4-ways-cheer-up-depressed-cat.jpg",
			'type' : 'image'
			}
		}


result = project.file.from_packet(image_packet)
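When instances come from another tool, it may be worth building the dictionaries through small helpers so that malformed boxes are caught before upload. This is a sketch only: make_box and make_image_packet are hypothetical helpers, and the check that min is below max is a basic geometric sanity check, not a documented Diffgram rule. The dictionary keys are the ones used in the example above.

```python
def make_box(name, x_min, y_min, x_max, y_max):
    """Build a box instance dict, rejecting empty or inverted boxes."""
    if x_min >= x_max or y_min >= y_max:
        raise ValueError("box must satisfy x_min < x_max and y_min < y_max")
    return {
        "type": "box",
        "name": name,
        "x_min": x_min,
        "x_max": x_max,
        "y_min": y_min,
        "y_max": y_max,
    }


def make_image_packet(url, instance_list):
    """Combine an image URL and its instances into one packet dict."""
    return {
        "instance_list": list(instance_list),
        "media": {"url": url, "type": "image"},
    }
```

The resulting packet has the same shape as image_packet above and can be passed to project.file.from_packet() in the same way.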

Importing a single local file:

file = project.file.from_local(path)

Multiple file example

Beta

Note: the API/SDK is in beta and is undergoing rapid improvement. There may be breaking changes. Please see the API docs for the latest canonical reference, and be sure to upgrade to the latest version, i.e. pip install diffgram --upgrade. We will attempt to keep the SDK up to date with the API.

Help articles for Diffgram.com. See below for some examples.

Requires Python >=3.5

The default install through pip will install the dependencies for local prediction (tensorflow, opencv) as listed in requirements.txt. The only requirement needed for the majority of functions is requests. If you are looking for a minimal install and already have requests, use the --no-dependencies flag, i.e. pip install diffgram --no-dependencies
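After a minimal install, a quick check like the one below shows which of the optional packages are present in the current environment. It is a sketch using only the standard library; the helper name installed is hypothetical, and cv2 is the import name of the opencv package.

```python
import importlib.util


def installed(module_names):
    """Map each top-level module name to True if it can be imported."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in module_names
    }


# requests covers most functions; tensorflow and cv2 are for local prediction.
status = installed(["requests", "tensorflow", "cv2"])
for name, present in status.items():
    print(name, "installed" if present else "missing")
```

find_spec only locates the module and does not import it, so the check stays fast even when tensorflow is installed.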

Overall flow

The primary flow of using Diffgram is a cycle of importing data, training models, and updating those models, primarily by changing the data. Using the trained models and collecting feedback to channel back into Diffgram is handled in your own system. More on this here.

System diagram

Tutorials and walk throughs

Red Pepper Chef - from new training data to deployed system in a few lines of code

How to validate your model

Fast Annotation Net


Code samples

See samples folder

The project object

from diffgram import Project

project = Project(project_string_id = "replace_with_project_string",
		  client_id = "replace_with_client_id",
		  client_secret = "replace_with_client_secret"	)

The project represents the primary starting point. The following examples assume you have a project defined like this.


Actions and Brains (Beta)

Brain

Benefits of using prediction through Diffgram brain

  • Clean abstraction for different deep learning methods, local vs online prediction, and file types
  • Designed for changing models and data. The same object you call .train() on can also call .predict()
  • Ground up support for many models. See local_cam for one example.

And of course local prediction - your model is your model.

Note: We plan to support many deep learning methods in the future, so while this is fairly heavily focused on object detection, the vast majority of concepts carry over to semantic segmentation and other methods.

Train

brain = project.train.start(method="object_detection",
			    name="my_model")

brain.check_status()
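Training takes time, so a script will usually need to check the status more than once. This document does not say what check_status() returns, so the sketch below leaves the completion test to a predicate that you supply. The helper wait_until and all of its parameters are hypothetical and not part of the SDK.

```python
import time


def wait_until(check, is_done, interval_seconds=30.0, timeout_seconds=3600.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() repeatedly until is_done(result) is true.

    Returns the last result. Raises TimeoutError once timeout_seconds
    have elapsed without is_done() returning true.
    """
    deadline = clock() + timeout_seconds
    while True:
        result = check()
        if is_done(result):
            return result
        if clock() >= deadline:
            raise TimeoutError(
                "gave up waiting after %s seconds" % timeout_seconds
            )
        sleep(interval_seconds)
```

A call would look like wait_until(brain.check_status, is_done=my_predicate), where my_predicate is a function you write once you know what check_status() reports in your SDK version.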

Predict Online

Predicting online requires no advanced setup and uses fewer local compute resources.

For predicting online there are 3 ways to send files:

Local file path

inference = brain.predict_from_local(path)

URL, e.g. a remote cloud server

inference = brain.predict_from_url(url)

From a Diffgram file

inference = brain.predict_from_file(file_id = 111546)
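If the inputs arrive in mixed form, the three calls above can sit behind one function. The sketch below chooses a method from the type of the input. The helper predict_online and its dispatch rules (integers are file ids, http or https strings are URLs, other strings must be existing local files) are assumptions for illustration; only the three predict_from_* calls come from the SDK.

```python
import os


def predict_online(brain, source):
    """Send `source` to the matching online prediction method.

    int            -> treated as a Diffgram file id
    http(s) string -> treated as a URL
    existing path  -> treated as a local file
    """
    if isinstance(source, int) and not isinstance(source, bool):
        return brain.predict_from_file(file_id=source)
    if isinstance(source, str) and source.startswith(("http://", "https://")):
        return brain.predict_from_url(source)
    if isinstance(source, str) and os.path.isfile(source):
        return brain.predict_from_local(source)
    raise ValueError(
        "not a file id, URL, or existing local file: %r" % (source,)
    )
```

Raising on anything unrecognized keeps a mistyped path from being silently sent as something else.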


Predict Local

Predicting locally downloads the model weights, graph definition, and relevant labels, then sets up the model. Warning: this may use a significant amount of your local compute resources. By default the model downloads to a temp directory, although you are welcome to download and save the model to disk.

Local prediction, with local file

Same as before, except we set the local flag to True

brain = project.get_model(
			name = None,
			local = True)

Then we can call

inference = brain.predict_from_local(path)

Local prediction, two models with visual

Get two models:

page_brain = project.get_model(
			name = "page_example_name",
			local = True)

graphs_brain = project.get_model(
			name = "graph_example_name",
			local = True)

This opens an image from a local path and runs both brains on the same image. We are only reading the image once, so you can stack as many networks as you need here.


# Read the image bytes once; the with block closes the file afterwards
with open(path, "rb") as image_file:
	image = image_file.read()

page_inference = page_brain.run(image)
graphs_inference = graphs_brain.run(image)
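The same pattern extends to any number of brains. Below is a minimal sketch: run_brains is a hypothetical helper, and it relies only on each brain having the run(image) method used above.

```python
def run_brains(path, brains):
    """Read the image at `path` once and run every brain on the same bytes.

    `brains` maps a label of your choosing to a brain object.
    Returns a dict mapping each label to that brain's inference.
    """
    with open(path, "rb") as image_file:
        image = image_file.read()
    return {label: brain.run(image) for label, brain in brains.items()}
```

For the two models above, the call would be run_brains(path, {"page": page_brain, "graphs": graphs_brain}).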

Optional, render a visual

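# image_backup is not defined in the snippet above; it is assumed to be an
# unmodified copy of the image that was passed to run()
output_image = page_brain.visual(image_backup)
output_image = graphs_brain.visual(output_image)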

Imagine the "page" brain: most pages look the same, so it will need less data and less retraining to reach an acceptable level of performance. You can then have a separate network that gets retrained often to detect items of interest on the page (e.g. graphs).

Download files

Source Distribution

diffgram-0.1.7.6.tar.gz (25.2 kB)

Built Distribution

diffgram-0.1.7.6-py3-none-any.whl (27.9 kB, Python 3)
