Python SDK for Tesseract Models
Tesseract Python SDK
This is an SDK for developing Tesseract models in Python.
Test a Tesseract Image
To check that your model container will run correctly in Tesseract, use the validation CLI. For a basic setup, run:
tesseract-sdk validate <image-name>:<tag>
This reads the model info from your model code and generates a random input array. It then starts the container and attempts to send the data to the model. If the model returns data, the validator checks that the shapes and dtypes are correct. That's all you need for simple models.
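The shape and dtype check described above can be pictured with a small NumPy sketch. This is not the validator's own code: the function name check_output, the example shapes and dtypes, and the stand-in model are all assumptions made for illustration.

```python
import numpy as np


def check_output(output, expected_shape, expected_dtype):
    """Return a list of problems found; an empty list means the output passed."""
    problems = []
    if output.shape != tuple(expected_shape):
        problems.append(f"shape {output.shape} != expected {tuple(expected_shape)}")
    if output.dtype != np.dtype(expected_dtype):
        problems.append(f"dtype {output.dtype} != expected {np.dtype(expected_dtype)}")
    return problems


# Hypothetical model info: 4 bands of 64x64 float32 pixels in, 1 band of uint8 out.
rng = np.random.default_rng(seed=0)
random_input = rng.random((4, 64, 64), dtype=np.float32)

# Stand-in for the containerized model: threshold the per-pixel mean across bands.
fake_output = (random_input.mean(axis=0, keepdims=True) > 0.5).astype(np.uint8)

problems = check_output(fake_output, expected_shape=(1, 64, 64), expected_dtype="uint8")
print(problems or "output shape and dtype match")
```

Returning a list of problems rather than raising on the first mismatch lets a single run report both a wrong shape and a wrong dtype.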
For more complicated models, or to test with real data, create a configuration file for testing. The configuration file tells the validator things such as where to load local data from and which bands to include. Resulting arrays are written out as PNG and resulting features as GeoJSON. An example config is shown below:
{
    "image": "my-tesseract-model:v0.0.1",
    "test_data": {
        "job_id": "my-job-id",
        "project": "my-project"
    },
    "asset_bands": [
        {
            "asset_name": "modis",
            "bands": [0, 1, 2, 5]
        },
        {
            "asset_name": "sentinel",
            "bands": [2, 4]
        }
    ],
    "args": {
        "model-arg-1": "value1",
        "model-arg-2": "value2"
    },
    "output_asset_bands": [
        {
            "asset_name": "model_output_1",
            "bands": [0, 1, 2]
        },
        {
            "asset_name": "model_output_2",
            "bands": [0]
        }
    ],
    "save_output": false
}
image: The docker image to validate.
test_data: Either a dictionary with a job ID and project, to get data directly from a Tesseract job, or the path to a zarr file. When reading directly from a Tesseract job, the dict should have only the keys job_id and project. To read from a zarr file directly, pass in the path or URL as a string. This can be a local file or a remote zarr file (for example in Google Cloud Storage), so long as the credentials are available. Optional: if not provided, random data will be created.
asset_bands: A list of asset bands, like the inputs to a Tesseract job. Each asset_band in the list should contain the keys "asset_name" and "bands". The "asset_name" must exist in the input zarr file or Tesseract job, and "bands" should be a list of integers corresponding to bands in the asset. Optional: if not provided, all bands from all input datasets are used.
args: Any arguments that need to be passed to the model inference function. Optional: if not provided, no args are supplied to the model.
output_asset_bands: For each model output, the bands that should be used to output an image. Each item should have either 1 or 3 bands. For each item in the list, a PNG image is created so that the model outputs can be quickly inspected to confirm that the model appears to be working correctly. Unlike asset_bands, you can have multiple outputs here with the same name. This is useful if you want to output several images for one asset, e.g. 3 images with one band each instead of one 3-band image. Optional: if not provided, no output images are generated.
save_output: If true, writes the model output as bytes that can be read in with numpy. Files are named after the output, with a '.dat' extension. Optional: defaults to false.
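As a rough illustration of reading a saved output back in, the sketch below uses numpy.fromfile. It assumes the '.dat' file holds raw, headerless bytes in C order, which the description above suggests but does not spell out. The dtype, shape, and file name here are placeholders; use the values from your own model's output definition.

```python
import tempfile
from pathlib import Path

import numpy as np

# Assumed values: raw bytes carry no header, so dtype and shape must be known
# from the model's output definition.
output_dtype = np.float32
output_shape = (1, 64, 64)

with tempfile.TemporaryDirectory() as tmp:
    dat_path = Path(tmp) / "model_output_1.dat"

    # Write a stand-in file so the sketch runs without the validator.
    expected = np.arange(np.prod(output_shape), dtype=output_dtype).reshape(output_shape)
    expected.tofile(dat_path)

    # Reinterpret the flat byte stream with the known dtype, then restore the shape.
    loaded = np.fromfile(dat_path, dtype=output_dtype).reshape(output_shape)

print(loaded.shape, loaded.dtype)
```

If the reshape fails with a size mismatch, the dtype or shape assumed here most likely differs from what the model actually wrote.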
To run the validator with a configuration file, simply pass it to the utility:
tesseract-sdk validate -f valid_config.json
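The configuration file can also be generated from Python, which avoids hand-written JSON mistakes such as trailing commas. The sketch below builds the example config, runs a few sanity checks derived from the field descriptions above, and writes valid_config.json. The check_config helper is hypothetical and is not part of the SDK; treating "image" as required is also an assumption.

```python
import json
from pathlib import Path


def check_config(config):
    """Collect problems based on the documented field rules (not the validator's own logic)."""
    problems = []
    if "image" not in config:  # assumption: the image must be named in the config
        problems.append("missing 'image'")
    test_data = config.get("test_data")
    if isinstance(test_data, dict) and set(test_data) != {"job_id", "project"}:
        problems.append("test_data dict should have only the keys job_id and project")
    for key in ("asset_bands", "output_asset_bands"):
        for entry in config.get(key, []):
            if not {"asset_name", "bands"} <= set(entry):
                problems.append(f"{key} entry needs 'asset_name' and 'bands': {entry}")
    for entry in config.get("output_asset_bands", []):
        if len(entry.get("bands", [])) not in (1, 3):
            problems.append(f"output_asset_bands entry needs 1 or 3 bands: {entry}")
    return problems


config = {
    "image": "my-tesseract-model:v0.0.1",
    "test_data": {"job_id": "my-job-id", "project": "my-project"},
    "asset_bands": [
        {"asset_name": "modis", "bands": [0, 1, 2, 5]},
        {"asset_name": "sentinel", "bands": [2, 4]},
    ],
    "args": {"model-arg-1": "value1", "model-arg-2": "value2"},
    "output_asset_bands": [
        {"asset_name": "model_output_1", "bands": [0, 1, 2]},
        {"asset_name": "model_output_2", "bands": [0]},
    ],
    "save_output": False,
}

problems = check_config(config)
if problems:
    raise ValueError("; ".join(problems))

# json.dumps always emits strict JSON, so the file can be passed to the -f option.
Path("valid_config.json").write_text(json.dumps(config, indent=4))
```

The written file can then be passed to the validator with the -f option shown above.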
Contributing
To contribute to the project, you must first install the package using the dev option.
pip install .[dev]
IMPORTANT: Before creating a PR, make sure to update the protobuf files. The PR checks will fail if you do not. To update the protobuf files, run the following commands:
make protoc-python
make copy-protos
make check-protos