Command Line Interface for Zegami
zegami-cli
A Command Line Interface for Zegami.
Zegami is a visual data exploration tool that makes the analysis of large collections of image-rich information quick and simple.
The Zegami CLI relies on a combination of YAML files and arguments.
The first step is to create a collection.
Installation
pip3 install zegami-cli[sql]
Commands
Login
The login command prompts for a username and password, which are then used to retrieve a long-lived API token for subsequent requests. The token is stored in a file in the current user's data directory.
Once retrieved, all subsequent commands will use the stored token unless it is explicitly overridden with the --token option.
zeg login
Get a collection
Get the details of a collection.
If the collection id is omitted, all collections will be listed.
zeg get collections [collection id] --project [Project Id] --token [API token]
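For scripted use, the command shapes shown in this README can be assembled programmatically. The following is a minimal sketch and not part of zegami-cli: `build_zeg_args` and `run_zeg` are hypothetical helpers, and only the `zeg` sub-commands and the `--project` and `--token` options are taken from this document.

```python
import subprocess


def build_zeg_args(action, resource, resource_id=None, project=None, token=None):
    """Assemble an argument list such as: zeg get collections <id> --project <p>.

    Hypothetical helper for illustration; it only mirrors the command
    shapes documented in this README.
    """
    args = ["zeg", action, resource]
    if resource_id:
        args.append(resource_id)
    if project:
        args += ["--project", project]
    if token:
        # Leave the token out to fall back on the one stored by `zeg login`.
        args += ["--token", token]
    return args


def run_zeg(args):
    """Run the command and return its standard output (needs zeg on PATH)."""
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout


# List every collection in a project (placeholder project id).
print(build_zeg_args("get", "collections", project="my-project-id"))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when ids or paths contain spaces.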
Create a collection
Create a collection using a combined dataset and imageset config.
zeg create collections --project [Project Id] --token [API token] --url [Server url] --config [path to configuration yaml]
The following config properties are supported for file-based image sets and data sets.
# The name of the collection
name: file based
description: an example collection with a file based imageset and dataset
# The type of data set. For now this needs to be set to 'file'
dataset_type: file
# The type of image set. For now this needs to be set to 'file'
imageset_type: file
# Config for the file data set and image set types
file_config:
    # Path to the dataset file
    path:
    # A collection of paths to image files. Paths can be to both images and directories
    paths:
        - an_image.jpg
        - a/directory/path
# Name of the column in the dataset that contains the image name
dataset_column: image_name
If you are creating a URL-based image set with a data file, use these properties.
The dataset_column property names the column where the URL is stored. You will need to include the full image URL, e.g. https://zegami.com/wp-content/uploads/2018/01/weatherall.svg
# The name of the collection
name: url based
# The description of the collection
description: an example collection with a file based dataset where images are to be downloaded from urls
# The type of image set. for now this needs to be set to 'url'
imageset_type: url
# Name of the column in the dataset that contains the image url
dataset_column: image_name
# Url pattern - python format string where {} is replaced by the image name (from data file)
url_template: https://example.com/images/{}?accesscode=abc3e20423423497
# The type of data set. For now this needs to be set to 'file'
dataset_type: file
# Config for the file data set type
file_config:
    # Path to the dataset file
    path:
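Because url_template is described as a Python format string, its expansion can be previewed before creating the collection. This is a minimal sketch using plain `str.format`; `expand_url_template` is a hypothetical helper, and whether the CLI applies any extra URL encoding to the image name is not stated here, so treat the result as an approximation.

```python
def expand_url_template(url_template, image_name):
    """Substitute the image name into the `{}` placeholder of the template."""
    return url_template.format(image_name)


url_template = "https://example.com/images/{}?accesscode=abc3e20423423497"
print(expand_url_template(url_template, "weatherall.svg"))
# https://example.com/images/weatherall.svg?accesscode=abc3e20423423497
```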
If you are creating an image set from a private Azure storage container together with a local data file, use these properties:
# The name of the collection
name: azure bucket based
# The description of the collection
description: an example collection with a file based dataset where images are to be downloaded from an azure bucket
# The type of data set. For now this needs to be set to 'file'
dataset_type: file
# Config for the file data set type
file_config:
    # Path to the dataset file
    path:
# The type of image set. For now this needs to be set to 'azure_storage_container'
imageset_type: azure_storage_container
# Name of the container
container_name: my_azure_blobs
# Name of the column in the dataset that contains the image url
dataset_column: image_name
# Note that the storage account connection string should also be made available via environment variable AZURE_STORAGE_CONNECTION_STRING
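A wrapper script can check that the environment variable is present before invoking the CLI. The sketch below is an illustration only: `azure_connection_string` is a hypothetical helper, and how zegami-cli itself reports a missing variable is not covered in this document.

```python
import os

ENV_VAR = "AZURE_STORAGE_CONNECTION_STRING"


def azure_connection_string(environ=None):
    """Return the connection string, or raise if it is missing or empty."""
    # Accept a mapping for testing; default to the real process environment.
    environ = os.environ if environ is None else environ
    value = environ.get(ENV_VAR, "").strip()
    if not value:
        raise RuntimeError(f"{ENV_VAR} is not set; export it before running zeg")
    return value
```

Calling `azure_connection_string()` at the top of a script fails early, before any upload is attempted.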
If you are using SQL data, see the SQL section under "Update a data set" below for the config.
Update a collection
Update a collection - coming soon.
Delete a collection
Delete a collection.
zeg delete collections [collection id] --project [Project Id] --token [API token]
Publish a collection
zeg publish collection [collection id] --project [Project Id] --config [path to configuration yaml] --token [API token]
The config YAML file is used to specify additional configuration for the collection publish.
# The type of update. For now this needs to be set to 'publish'
update_type: publish
# Config for the publish update type
publish_config:
    # Flag to indicate if the collection should be published or unpublished
    publish: true
    # The id of the project to publish to
    destination_project: public
Get a data set
Get the details of a data set.
zeg get dataset [dataset id] --project [Project Id] --token [API token]
Update a data set
Update an existing data set with new data.
Note that when updating the data set of a collection, the dataset id to use is the collection's upload_dataset_id. This differs from the image set update below, which requires the collection's dataset_id.
zeg update dataset [dataset id] --project [Project Id] --config [path to configuration yaml] --token [API token]
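The two identifiers are easy to mix up, so it can help to pull them out of the collection details explicitly. This sketch assumes the collection details are available as a dict containing the upload_dataset_id and dataset_id fields named above; the dict shape and the `ids_for_updates` helper are hypothetical.

```python
def ids_for_updates(collection):
    """Map each update step to the collection field it expects.

    `collection` is assumed to be a dict of collection details containing
    the `upload_dataset_id` and `dataset_id` fields named in this README.
    """
    return {
        # Positional [dataset id] argument of `zeg update dataset`.
        "update_dataset_argument": collection["upload_dataset_id"],
        # `dataset_id` property in the image set update config.
        "imageset_config_dataset_id": collection["dataset_id"],
    }


# Placeholder ids, for illustration only.
example = {"upload_dataset_id": "upload-123", "dataset_id": "data-456"}
print(ids_for_updates(example))
```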
The config YAML file is used to specify additional configuration for the data set update. Two dataset_type values are supported.
File
The file type is used to update a data set from a file. Either set the path property to the fully qualified path of a .csv, .tsv or .xlsx file to upload, or set the directory property to upload the latest file in a directory.
# The type of data set. For now this needs to be set to 'file'
dataset_type: file
# Config for the file data set type
file_config:
    # Path to the dataset file
    path:
    # Or path to a directory that contains data files.
    # Only the latest file that matches the accepted extensions (.csv, .tsv, .xlsx)
    # will be uploaded. This is useful for creating collections based on
    # automated exports from a system, like log files.
    directory:
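To preview which file the directory option is likely to pick, the selection can be approximated as follows. This is a sketch under stated assumptions, not the CLI's actual implementation: it takes "latest" to mean the most recent modification time, matches extensions case-insensitively, and does not descend into sub-directories. `latest_data_file` is a hypothetical helper.

```python
from pathlib import Path

# Extensions listed as accepted in the config comments above.
ACCEPTED_EXTENSIONS = {".csv", ".tsv", ".xlsx"}


def latest_data_file(directory):
    """Return the most recently modified accepted file in directory, or None."""
    candidates = [
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in ACCEPTED_EXTENSIONS
    ]
    # Assumption: "latest" is judged by modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)
```

For example, `latest_data_file("exports")` returns a `Path` to the newest matching export, or `None` if the directory holds no accepted files.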
SQL
The sql type is used to update a data set based on an SQL query.
It uses SQLAlchemy to connect to the database. See http://docs.sqlalchemy.org/en/latest/core/engines.html and https://www.connectionstrings.com/ for the correct connection string format.
# The type of data set. For the SQL type this needs to be set to 'sql'
dataset_type: sql
# Config for the sql data set type
sql_config:
    # The connection string.
    connection:
    # SQL query
    query:
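Connection strings and queries often contain characters such as `:`, `@` or `#` that have special meaning in YAML, so quoting both values is a safe habit. The sketch below renders a config with double-quoted values; it relies on JSON string syntax being valid YAML double-quoted syntax. `render_sql_config` is a hypothetical helper, the nesting under sql_config is assumed from the layout above, and the connection URL and query are placeholders.

```python
import json


def render_sql_config(connection, query):
    """Render a sql data set config, double-quoting both string values."""
    return "\n".join([
        "dataset_type: sql",
        "sql_config:",
        # json.dumps produces a double-quoted, escaped string.
        f"    connection: {json.dumps(connection)}",
        f"    query: {json.dumps(query)}",
    ]) + "\n"


print(render_sql_config(
    "sqlite:///exports/items.db",  # placeholder SQLAlchemy database URL
    "SELECT image_name, label FROM items",  # placeholder query
))
```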
Delete a data set
Delete a data set - coming soon.
zeg delete dataset [dataset id] --project [Project Id] --token [API token]
Get an image set
Get an image set - coming soon.
zeg get imageset [imageset id] --project [Project Id] --token [API token]
Update an image set
Update an image set with new images.
zeg update imageset [imageset id] --project [Project Id] --config [path to configuration yaml] --token [API token]
The config YAML file is used to specify additional configuration for the image set update.
File imageset
The paths property is used to specify the location of images to upload and can include both images and directories.
# The type of image set. for now this needs to be set to 'file'
imageset_type: file
# Config for the file image set type
file_config:
    # A collection of paths. Paths can be to both images and directories
    paths:
        - an_image.jpg
        - a/directory/path
# Unique identifier of the collection
collection_id: 5ad3a99b75f3b30001732f36
# Unique identifier of the collection data set (get this from dataset_id)
dataset_id: 5ad3a99b75f3b30001732f36
# Name of the column in the dataset that contains the image name
dataset_column: image_name
URL imageset
The dataset_column property names the column where the URL is stored. You will need to include the full image URL, e.g. https://zegami.com/wp-content/uploads/2018/01/weatherall.svg
# The type of image set. for now this needs to be set to 'url'
imageset_type: url
# Unique identifier of the collection
collection_id: 5ad3a99b75f3b30001732f36
# Unique identifier of the collection data set
dataset_id: 5ad3a99b75f3b30001732f36
# Name of the column in the dataset that contains the image url
dataset_column: image_name
# Url pattern - python format string where {} is replaced by the image name (from data file)
url_template: https://example.com/images/{}?accesscode=abc3e20423423497
Azure storage imageset
# The type of image set. For now this needs to be set to 'azure_storage_container'
imageset_type: azure_storage_container
# Name of the container
container_name: my_azure_blobs
# Unique identifier of the collection
collection_id: 5ad3a99b75f3b30001732f36
# Unique identifier of the collection data set
dataset_id: 5ad3a99b75f3b30001732f36
# Name of the column in the dataset that contains the image url
dataset_column: image_name
# Note that the storage account connection string should also be made available via environment variable AZURE_STORAGE_CONNECTION_STRING
Delete an image set
Delete an image set - coming soon.
zeg delete imageset [imageset id] --project [Project Id] --token [API token]
Developer
Tests
Setup tests:
pip install -r requirements/test.txt
Run tests:
python3 -m unittest discover .