Creating maps with machine learning models and earth observation data.

License
- OSI Approved :: Apache Software License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

OpenMapFlow 🌍

Rapid map creation with machine learning and earth observation data.

Example projects: Cropland, Buildings, Maize

Example maps: Earth Engine script

Tutorial
Creating a map from scratch
Accessing existing datasets

Tutorial

A Colab notebook tutorial demonstrates data exploration, model training, and inference over a small region. (video)

Prerequisites:

Github access token (obtained here)
Forked OpenMapFlow repository
Basic Python knowledge

Creating a map from scratch

To create your own maps with OpenMapFlow, you first need to generate your own OpenMapFlow project. This will allow you to:

Add your own labeled data
Train a model using that labeled data, and
Create a map using the trained model.

Generating a project

A project can be generated either by following the documentation below or by running the Colab notebook above.

Prerequisites:

Github repository - where your project will be stored
Google/Gmail based account - for accessing Google Drive and Google Cloud
Google Cloud Project (create) - for accessing Cloud resources for creating a map (additional info)
Google Cloud Service Account Key (generate) - for deploying Cloud resources from Github Actions

Once all prerequisites are satisfied, inside your Github repository run:

pip install openmapflow
openmapflow generate

The command will prompt for project configuration such as project name and Google Cloud Project ID. Several prompts will have defaults shown in square brackets. These will be used if nothing is entered.
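
The default-in-brackets behavior described above can be pictured with a small sketch. This is only an illustration of the convention, not the code used by openmapflow generate; the function names and the example values are hypothetical.

```python
def format_prompt(question: str, default: str) -> str:
    """Render a prompt with its default shown in square brackets."""
    return f"{question} [{default}]: "


def resolve_answer(entered: str, default: str) -> str:
    """Return the typed value, or the bracketed default when nothing was entered."""
    answer = entered.strip()
    return answer if answer else default


# Hypothetical prompt and values, for illustration only.
prompt = format_prompt("Project name", "my-project")
chosen = resolve_answer("", "my-project")           # nothing entered: default is used
custom = resolve_answer("crop-map", "my-project")   # a typed value replaces the default
```

Pressing Enter without typing anything corresponds to the first call, so the value in brackets is kept.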

After all configuration is set, the following project structure will be generated:

<YOUR PROJECT NAME>
│   README.md
│   datasets.py             # Dataset definitions (how labels should be processed)
│   evaluate.py             # Template script for evaluating a model
│   openmapflow.yaml        # Project configuration file
│   train.py                # Template script for training a model
│
├─── .dvc/                  # https://dvc.org/doc/user-guide/what-is-dvc
│
├─── .github
│   │
│   └─── workflows          # Github actions
│       │   deploy.yaml     # Automated Google Cloud deployment of trained models
│       │   test.yaml       # Automated integration tests of labeled data
│
└─── data
    │   raw_labels/                     # User added labels
    │   datasets/                       # ML ready datasets (labels + earth observation data)
    │   models/                         # Models trained using datasets
    │   raw_labels.dvc                  # Reference to a version of raw_labels/
    │   datasets.dvc                    # Reference to a version of datasets/
    │   models.dvc                      # Reference to a version of models/
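
As a quick sanity check after generation, the listed files can be looked up with a few lines of Python. This is a minimal sketch that assumes the layout shown above; the helper name missing_project_paths is hypothetical and is not part of openmapflow.

```python
from pathlib import Path
from typing import List

# Paths taken from the generated project structure listed above.
EXPECTED_PATHS = [
    "README.md",
    "datasets.py",
    "evaluate.py",
    "openmapflow.yaml",
    "train.py",
    ".github/workflows/deploy.yaml",
    ".github/workflows/test.yaml",
]


def missing_project_paths(project_root: Path) -> List[str]:
    """Return the expected paths that do not exist under project_root."""
    return [p for p in EXPECTED_PATHS if not (project_root / p).exists()]


if __name__ == "__main__":
    # Run from the root of the generated project.
    missing = missing_project_paths(Path.cwd())
    print("Missing:", missing if missing else "nothing")
```

The data/ folders are left out of the check because their contents are managed through DVC and may not be present until dvc pull has been run.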

Github Actions Secrets: Being able to pull and deploy data inside Github Actions requires access to Google Cloud. To allow the Github action to access Google Cloud, add a new repository secret (instructions).

In step 5 of the instructions, name the secret: GCP_SA_KEY
In step 6, enter your Google Cloud Service Account Key

After this, the Github actions should run successfully.
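
Before pasting the key into the GCP_SA_KEY secret, it can help to confirm that the text is a complete service account key. The sketch below checks for a few fields that Google Cloud service account key files contain; the function name key_problems is hypothetical, and the check is a convenience rather than a required step.

```python
import json
from typing import List

# Fields found in a Google Cloud service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def key_problems(key_json: str) -> List[str]:
    """Return a list of problems found in the key text (empty if it looks valid)."""
    try:
        key = json.loads(key_json)
    except json.JSONDecodeError:
        return ["not valid JSON"]
    if not isinstance(key, dict):
        return ["not a JSON object"]
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in key]
    if key.get("type") != "service_account":
        problems.append("type is not 'service_account'")
    return problems
```

An empty list means the text at least has the expected shape; it does not confirm that the key has the permissions the deployment needs.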

GCloud Bucket: A Google Cloud bucket must be created for the labeled earth observation files. Assuming gcloud is installed, run:

gcloud auth login
gsutil mb -l <YOUR_OPENMAPFLOW_YAML_GCLOUD_LOCATION> gs://<YOUR_OPENMAPFLOW_YAML_BUCKET_LABELED_EO>
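
If the bucket is created from a script, the same command can be assembled in Python. This is a sketch under the assumption that the location and bucket name are copied by hand from openmapflow.yaml; the helper name and the example values are hypothetical.

```python
import shlex
from typing import List


def make_bucket_command(location: str, bucket: str) -> List[str]:
    """Build the `gsutil mb` argument list for the given location and bucket name."""
    if not location or not bucket:
        raise ValueError("location and bucket must both be set")
    # Accept the name with or without the gs:// prefix.
    if bucket.startswith("gs://"):
        bucket = bucket[len("gs://"):]
    return ["gsutil", "mb", "-l", location, f"gs://{bucket}"]


# Hypothetical values; use the ones from your own openmapflow.yaml.
cmd = make_bucket_command("us-central1", "my-project-labeled-eo")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues if the command is later executed with subprocess.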

Adding data

Adding already existing data

Prerequisites:

Generated OpenMapFlow project

Add a reference to an already existing dataset in your datasets.py:

from openmapflow.datasets import GeowikiLandcover2017, TogoCrop2019

datasets = [GeowikiLandcover2017(), TogoCrop2019()]

Download and push datasets

openmapflow create-datasets # Download datasets
dvc commit && dvc push      # Push data to version control

git add .
git commit -m'Created new dataset'
git push

Adding custom data

Data can be added either by following the documentation below or by running the Colab notebook above.

Prerequisites:

Generated OpenMapFlow project
EarthEngine account - for accessing Earth Engine and pulling satellite data
Raw labels - a file (csv/shp/zip/txt) containing a list of labels and their coordinates (latitude, longitude)

Pull the latest data

dvc pull

Move the raw label files into the project's data/raw_labels folder.
Write a LabeledDataset class in datasets.py with a load_labels function that converts the raw labels to a standard format, for example:

from datetime import date
from typing import List

import pandas as pd

# LabeledDataset, PROJECT_ROOT, DataPaths, LAT, LON, START, END, SUBSET, COUNTRY,
# and train_val_test_split are provided by the openmapflow package.

label_col = "is_crop"

class TogoCrop2019(LabeledDataset):
    def load_labels(self) -> pd.DataFrame:
        # Read in raw label file
        df = pd.read_csv(PROJECT_ROOT / DataPaths.RAW_LABELS / "Togo_2019.csv")

        # Rename coordinate columns to be used for getting Earth observation data
        df = df.rename(columns={"latitude": LAT, "longitude": LON})

        # Set start and end date for Earth observation data
        df[START], df[END] = date(2019, 1, 1), date(2020, 12, 31)

        # Set consistent label column
        df[label_col] = df["crop"].astype(float)

        # Split labels into train, validation, and test sets
        df[SUBSET] = train_val_test_split(index=df.index, val=0.2, test=0.2)

        # Set country column for later analysis
        df[COUNTRY] = "Togo"

        return df

datasets: List[LabeledDataset] = [TogoCrop2019(), ...]
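
To make the train, validation, and test split step more concrete, here is a standalone sketch of one way such a split can work. It is not openmapflow's train_val_test_split; the function name assign_subsets, the subset names, and the seed parameter are assumptions made for illustration.

```python
import random
from typing import List, Sequence


def assign_subsets(index: Sequence, val: float, test: float, seed: int = 0) -> List[str]:
    """Assign each row to "training", "validation", or "testing" at random.

    Each row independently lands in validation with probability `val`,
    in testing with probability `test`, and in training otherwise.
    """
    if val < 0 or test < 0 or val + test > 1:
        raise ValueError("val and test must be non-negative and sum to at most 1")
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    subsets = []
    for _ in index:
        r = rng.random()
        if r < val:
            subsets.append("validation")
        elif r < val + test:
            subsets.append("testing")
        else:
            subsets.append("training")
    return subsets


subsets = assign_subsets(range(1000), val=0.2, test=0.2)
print({name: subsets.count(name) for name in ("training", "validation", "testing")})
```

With val=0.2 and test=0.2, roughly 60% of rows end up in training. Because each row is assigned independently, the exact counts vary with the seed.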

Check your new dataset's load_labels function:

openmapflow verify TogoCrop2019

Run dataset creation (can be skipped if automated in CI e.g. in https://github.com/nasaharvest/crop-mask):

earthengine authenticate    # For getting new earth observation data
gcloud auth login           # For getting cached earth observation data
openmapflow create-datasets # Initiates or checks progress of dataset creation

Push new data to remote storage and new code to Github

dvc commit && dvc push
git add .
git commit -m'Created new dataset'
git push

Training a model

A model can be trained either by following the documentation below or by running the Colab notebook above.

Prerequisites:

Generated OpenMapFlow project
Added labeled data

# Pull in latest data
dvc pull

# Set model name, train model, record test metrics
export MODEL_NAME=<YOUR MODEL NAME>              
python train.py --model_name $MODEL_NAME    
python evaluate.py --model_name $MODEL_NAME 

# Push new models to data version control
dvc commit 
dvc push  

# Make a Pull Request to the repository
git checkout -b"$MODEL_NAME"
git add .
git commit -m "$MODEL_NAME"
git push --set-upstream origin "$MODEL_NAME"

After the pull request is merged, the model will be deployed to Google Cloud.

Creating a map

Prerequisites:

Generated OpenMapFlow project
Added labeled data
Trained model

Map creation is only available through the Colab notebook above. The cloud architecture must be deployed using the deploy.yaml Github Action.

Accessing existing datasets

from openmapflow.datasets import TogoCrop2019
df = TogoCrop2019().load_df(to_np=True)
x = df.iloc[0]["eo_data"]
y = df.iloc[0]["class_prob"]
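
When a hard label is needed instead of a probability, the class_prob values can be thresholded. The sketch below assumes class_prob holds a value between 0 and 1 (an assumption based on the column name) and uses plain lists in place of the DataFrame column; the function name and the 0.5 threshold are hypothetical choices.

```python
from typing import Iterable, List


def to_binary_labels(class_probs: Iterable[float], threshold: float = 0.5) -> List[int]:
    """Map each probability to 1 if it is above the threshold, else 0."""
    return [1 if p > threshold else 0 for p in class_probs]


# Hypothetical class_prob values standing in for df["class_prob"].
example_probs = [0.0, 0.25, 0.5, 0.75, 1.0]
labels = to_binary_labels(example_probs)
```

A value exactly at the threshold is mapped to 0 here; whether that is the right convention depends on how the labels were produced.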

Citation

@inproceedings{OpenMapFlow2023,
  title={OpenMapFlow: A Library for Rapid Map Creation with Machine Learning and Remote Sensing Data},
  author={Zvonkov, Ivan and Tseng, Gabriel and Nakalembe, Catherine and Kerner, Hannah},
  booktitle={AAAI},
  year={2023}
}

Release history

0.2.5rc1 (pre-release): Nov 21, 2024
0.2.4 (this version): Dec 22, 2023
0.2.4rc4 (pre-release): Nov 15, 2023
0.2.3: Apr 29, 2023
0.2.3rc3 (pre-release): Mar 23, 2023
0.2.3rc2 (pre-release): Mar 7, 2023
0.2.3rc1 (pre-release): Dec 22, 2022
0.2.2: Nov 29, 2022
0.2.2rc2 (pre-release): Nov 29, 2022
0.2.2rc1 (pre-release): Nov 11, 2022
0.2.1: Oct 26, 2022
0.2.1rc2 (pre-release): Oct 6, 2022
0.2.1rc1 (pre-release): Oct 3, 2022
0.2.0rc1 (pre-release): Sep 21, 2022
0.1.6rc4 (pre-release): Sep 6, 2022
0.1.6rc3 (pre-release): Aug 29, 2022
0.1.6rc2 (pre-release): Aug 29, 2022
0.1.6rc1 (pre-release): Aug 29, 2022
0.1.5: Aug 24, 2022
0.1.4: Aug 22, 2022
0.1.3: Aug 21, 2022
0.1.3rc1 (pre-release): Aug 20, 2022
0.1.2: Jul 25, 2022
0.1.1: Jul 22, 2022
0.1.0: Jul 19, 2022
0.0.2: Jul 15, 2022
0.0.1rc1 (pre-release): Jun 15, 2022

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

openmapflow-0.2.4.tar.gz (83.6 kB)

Uploaded Dec 22, 2023 Source

Built Distribution

openmapflow-0.2.4-py3-none-any.whl (88.7 kB)

Uploaded Dec 22, 2023 Python 3

File details

Details for the file openmapflow-0.2.4.tar.gz.

File metadata

Download URL: openmapflow-0.2.4.tar.gz
Upload date: Dec 22, 2023
Size: 83.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.9.18

File hashes

Hashes for openmapflow-0.2.4.tar.gz
Algorithm	Hash digest
SHA256	`0bdd075aefdfc7e6532d0e01229ca20d81ebdfa7bbc7f4894043e388d3666b4e`
MD5	`7558de7a769912b0205d36dd4e787900`
BLAKE2b-256	`11b520739254a30f35ec7d29f49029a9e8eb083e36c7d00b6d69757841e029e6`

See more details on using hashes here.

File details

Details for the file openmapflow-0.2.4-py3-none-any.whl.

File metadata

Download URL: openmapflow-0.2.4-py3-none-any.whl
Upload date: Dec 22, 2023
Size: 88.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.9.18

File hashes

Hashes for openmapflow-0.2.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`58de2ab17570708a79d505761048af625c5cb0ce88c06c2ce346adb8895f700b`
MD5	`994d8f1cc0a05e352b0b9ccea788da50`
BLAKE2b-256	`4dee797c1c73a0f93d6df8095ccbfd1b18d40d301924613f4a5938946d130ada`

See more details on using hashes here.
