Data preparation for geospatial artificial intelligence
Project description
geolabel-maker
This tool is provided to help you in your data preparation for geospatial artificial intelligence.
With geolabel-maker
, you will be able to combine satellite or aerial imagery with
vector spatial data to create your own ground-truth dataset. This Python package can
generate your final dataset in the COCO format for deep-learning models.
It is designed to link up these 4 needed steps :
- create labels from geometries and raster files;
- make virtual raster files to combine images and labels;
- split image and label virtual files creating tiles;
- create an annotation JSON file in the COCO format for a specific zoom level.
Installation
Requirements
- Python 3.6
- GDAL
GDAL
As a particular case, GDAL is not included into the setup.py
file.
For Ubuntu
distributions, the following operations are needed to install this program:
sudo apt-get install libgdal-dev
sudo apt-get install python3-gdal
The GDAL version can be verified by:
gdal-config --version
After that, a simple pip install gdal
(or conda install gdal
) may be sufficient, however considering our own experience it is not the case on Ubuntu. One has to retrieve a GDAL for Python that corresponds to the GDAL of system:
pip install --global-option=build_ext --global-option="-I/usr/include/gdal" GDAL==`gdal-config --version`
python3 -c "import osgeo;print(osgeo.__version__)"
For other OS, please visit the GDAL
installation documentation.
Installation with pip
pip install geolabel-maker
Usage
Inputs
An sample dataset is available in the
data/
folder. See the Examples paragraph to more information.
To create your labels and annotations in the COCO format, you need:
IMG
: path to the folder containing the images to be labeledCATEGORIES
: a JSON file with an unique id for each, the description of expected categories with path to vector label file, and color as RGB triplet.TILES
: a folder where the tiles (256x256 pixels images) will be recorded
Example of the categories file:
{
"category_1": {
"id": 1,
"file": "path to your category_1 vector file",
"color": [150, 0, 0]
},
"category_2": {
"id": 2,
"file": "path to your category_2 vector file",
"color": [255, 255, 255]
}
}
Supported formats:
We use packages based on GDAL drivers.
- for images, see raster formats :
- GeoTIFF,
- JPEG2000,
- ASCII Grid,
- etc
- for geometries, see supported drivers of the
fiona
package :- ESRI Shapefile,
- GeoJSON,
- GPKG,
- etc
Using the command-line interface
A command-line interface is proposed with 5 available
actions (make_labels
, make_rasters
, make_tiles
, make_annotations
and make_all
).
Step-by-step commands
1. Create labels from geometries and raster files
geolabels make_labels IMG CATEGORIES
2. Make virtual raster files to combine images and labels
geolabels make_rasters IMG
3. Split image and label virtual files creating tiles
geolabels make_tiles IMG_VRT LABEL_VRT TILES
4. Create an annotation JSON file in the COCO format for a specific zoom level
geolabels make_annotations TILES CATEGORIES
Option:
- --zoom, the zoom level
Global command
To run the full process to get a ground truth in the COCO format, use the command:
geolabels make_all IMG TILES CATEGORIES
Option:
- --zoom, the zoom level
Importing the package in Python code
from geolabel_maker import geolabels
# Create labels from geometries and raster files
geolabels.make_labels(img, categories)
# Make virtual raster files to combine images and labels
images_vrt, labels_vrt = geolabels.make_rasters(img)
# Split image and label virtual files creating tiles
geolabels.make_tiles(images_vrt, labels_vrt, tiles)
# Create an annotation JSON file in the COCO format
geolabels.make_annotations(tiles, categories)
# OR
# Run the full process
geolabels.make_all(img, tiles, categories)
Examples
The data/
folder contains geometries (vectors/
) from Lyon, published as open data in the website https://data.grandlyon.com.
It contains also an example of a JSON file describing categories used to create labels.
This folder doesn't contain images because this type of file is too big to be supported in Github.
To follow our example, just download these two files and put them in the folder data/rasters/
:
Notebooks
Some Jupyter notebooks (in French) are available :
- Use_geolabel_maker.ipynb explains the process to build your ground truth.
- Check_coco_annotations.ipynb allows to explore your final annotations file.
For developers
Install from source
git clone URL
cd geolabel-maker
pip install -e .
Pre-commit and linting
-
Install pre-commit and run
pre-commit install
to check linting before committing. -
When you want, you can force a pre-commit on all the files :
pre-commit run --all-files
Acknowledgements
We gratefully acknowledge the contributions of the people who helped get this project off of the ground, including people who beta tested the software, gave feedback, improved dependencies of code in service of this release, or otherwise supported the project.
Particularly thank you Lucie Camanez to have initiate this project in its internship.
We also acknowledge Adam Kelly whose work has helped us in the development of this tool.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for geolabel_maker-0.0.1-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 727d27aa82f2b60681961d78305637729cc2bdd63c2deefa14ce71bc926d1936 |
|
MD5 | acb9d5f01094d1d2279e5275db01f219 |
|
BLAKE2b-256 | bf71cad38ae9159bfc06bb1f256b74cfab184b4a2d27dca761b529dca437fee9 |