Python SpatiaFi API
Python library for interacting with the SpatiaFi API.
Also included is gdal_auth, a CLI tool that helps with GCP authentication for GDAL.
Quickstart
Install the Package
pip install spatiafi
Get an Authenticated Session
from spatiafi import get_session
session = get_session()
# The `session` object works just like `requests` but will automatically
# refresh the authentication token when it expires
params = {"item_id": "wildfire-risk-current-global-v1.0"}
url = "https://api.spatiafi.com/api/info"
response = session.get(url, params=params)
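Since the session behaves like requests, the response can be handled in the usual way. For example (a minimal sketch; the exact fields returned depend on the item queried):
response.raise_for_status()
data = response.json()
print(data)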
Get help with gdal_auth
gdal_auth --help
GDAL Authentication
gdal_auth is a CLI tool to help with GCP authentication for GDAL.
This command can be used in three ways:
- Write the environment variables to a file that can then be sourced:
gdal_auth --file
source /tmp/gdal_auth.env
- Print a command that can be run to set the environment variables once:
gdal_auth --line
Running this command will output a line that you need to copy and run. This sets the GDAL environment variables for anything run afterward in the same terminal session (e.g. run the command, then start a program such as qgis from the same terminal).
- Print instructions for setting up aliases. These aliases allow gdal_* commands to be run as normal, with authentication handled automatically:
gdal_auth --alias
Note: for all options except --alias, the authentication will eventually expire. This is because the tokens generated by the Google Cloud SDK expire after 1 hour. The aliases will automatically refresh the authentication tokens when they expire.
Mini-Batch (Async Queue)
The AsyncQueue class is a helper for running many (up to ~1 million) API queries in parallel.
To use it, you must:
- Create a task function
- Create an AsyncQueue object
- Enqueue tasks
- Fetch results
tl;dr: See tests/test_async_queue.py for an example.
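Putting the four steps together, a minimal sketch (the echo task here is illustrative, not part of the library; the sections below cover each step in detail):
from spatiafi.async_queue import AsyncQueue

async def echo(arg, session=None):
    # A valid task: async, a single argument, an optional session,
    # and a serializable return value.
    return {"arg": arg}

with AsyncQueue(echo) as async_queue:
    for i in range(10):
        async_queue.enqueue(i)

results = async_queue.results  # in enqueue order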
Create an AsyncQueue Task
A valid AsyncQueue task must:
- Be an async function
- Take a single argument
- Take an optional async session argument (if not provided, an async session will be created)
- Return a single, serializable object (a dict is recommended)
If your task function requires multiple arguments, you can:
- Use a wrapper function or closure (may not work on Windows or with 'spawn' multiprocessing)
- Create a new function using functools.partial (as shown in tests/test_async_queue.py and sketched after this list)
- Pass a tuple as the argument and unpack it in the task function, e.g.:
async def task(args, session=None):
    arg1, arg2, arg3 = args
    ...

with AsyncQueue(task) as async_queue:
    async_queue.enqueue((arg1, arg2, arg3))
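A sketch of the functools.partial approach (the get_point_for_item function and its fixed item_id argument are illustrative, not part of the library):
import functools

async def get_point_for_item(point, item_id, session=None):
    ...

# Fixing item_id yields a task that takes the single `point` argument.
task = functools.partial(get_point_for_item, item_id="wildfire-risk-current-global-v1.0")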
Example Task Function
from spatiafi.async_queue import AsyncQueue
from spatiafi.session import get_async_session


async def get_point(point, session=None):
    """Get a point from the SpatiaFi API."""
    # Unpack the `point` tuple because we can only pass
    # a single argument to the task function.
    lon, lat = point

    # Create an async session if one is not provided.
    if session is None:
        session = await get_async_session()

    url = f"https://api.spatiafi.com/api/point/{lon},{lat}"
    params = {"item_id": "wildfire-risk-current-global-v1.0"}

    r = await session.get(url, params=params)

    # We want to raise for all errors except 400 (bad request).
    if r.status_code not in (200, 400):
        r.raise_for_status()

    return r.json()
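The task can also be sanity-checked on its own before handing it to the queue. A minimal sketch (the coordinates are illustrative):
import asyncio

result = asyncio.run(get_point((-122.4, 37.8)))
print(result)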
Create an AsyncQueue and Enqueue Tasks
AsyncQueue takes a task function as an argument and launches multiple instances of that task in parallel.
The AsyncQueue.enqueue method takes a single argument and is used to add tasks to the queue.
The AsyncQueue.results property returns a list of results in the order they were enqueued.
When starting the AsyncQueue, it is highly recommended that you specify the number of workers/CPUs via the n_cores argument. The default is the minimum of 4 and the number of CPUs on the machine.
This queue is designed to be used with the with statement. Entering the with statement starts the subprocess and event loop; exiting it waits for all tasks to finish and then stops the event loop and subprocess.
For example:
from spatiafi.async_queue import AsyncQueue

with AsyncQueue(get_point) as async_queue:
    # df is assumed to be a pandas DataFrame with "lon" and "lat" columns.
    for _, row in df.iterrows():
        async_queue.enqueue((row["lon"], row["lat"]))

results = async_queue.results
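To control parallelism explicitly (assuming, as the text above suggests, that n_cores is passed to the constructor; the value 4 is illustrative):
with AsyncQueue(get_point, n_cores=4) as async_queue:
    ...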
Alternatively, you can use the start and stop methods:
from spatiafi.async_queue import AsyncQueue

async_queue = AsyncQueue(get_point)
async_queue.start()

for _, row in df.iterrows():
    async_queue.enqueue((row["lon"], row["lat"]))

async_queue.stop()
results = async_queue.results
Development
Use a Virtual Environment
Development should be done in a virtual environment. It is recommended to use the virtual environment manager built into PyCharm. To create a new virtual environment:
- Open the project in PyCharm and select File > Settings > Project: spfi-api > Python Interpreter.
- In the top right corner of the window, click the gear icon and select Add Interpreter > Add Local Interpreter...
Mark src as a Source Root
In PyCharm, mark the src folder as a source root. This will allow you to import modules from the src folder without using relative imports.
Right-click on the src folder and select Mark Directory as > Sources Root.
Bootstrap the Development Environment
Run ./scripts/bootstrap_dev.sh to install the package and development dependencies.
This will also set up access to our private PyPI server, generate the first requirements.txt (if required), and install pre-commit hooks.
Protip: This script can be run at any time if you're afraid you've messed up your environment.
Running the tests
Tests can be run locally via the scripts/test.sh script:
./scripts/test.sh
All additional arguments to that script are passed to pytest, which allows you to do things such as run a single test:
./scripts/test.sh -k test_async_queue
Manage Dependencies in setup.cfg
Dependencies are managed in setup.cfg using the install_requires and extras_require sections.
To add a new dependency:
- Install the package in the virtual environment with pip install <package_name> (Hint: use the terminal built in to PyCharm)
- Run pip show <package_name> to get the package name and version
- Add the package name and version to setup.cfg in the install_requires section. Use the compatible release syntax package_name ~= version (e.g. requests ~= 2.31).
DO NOT add the package to the requirements.txt file. This file is automatically generated by scripts/gen_requirements.sh.
If the dependency is only needed for development, add it to the dev section of extras_require in setup.cfg.
Building Docker Images Locally
tl;dr: run ./scripts/build_docker.sh.
We need to inject a GCP access token into the Docker build to access private PyPI packages. This requires using BuildKit (enabled by default in recent versions of Docker), and passing the token as a build argument.
pre-commit Hooks
This project uses pre-commit to run a series of checks before each git commit.
To install the pre-commit hooks, run pre-commit install in the virtual environment.
(This is done automatically by ./scripts/bootstrap_dev.sh)
To format all your code manually, run pre-commit run --all-files.
Note: If your code does not pass the pre-commit checks, automatic builds may fail.
Use pip-sync to Update Dependencies
To update local dependencies, run pip-sync in the virtual environment.
This will make sure your virtual environment is in sync with the requirements.txt file, including uninstalling any packages that are not in the requirements.txt file.
Versions
The project uses semantic versioning.
Package versions are automatically generated from git tags.
Create your first tag with git tag 0.1.0 and push it with git push --tags.
Installation
tl;dr: ./scripts/install_package.sh
For development, it is recommended to install the package in editable mode with pip install -e ".[dev]" (quote the extra so shells like zsh do not expand the brackets).