Python utilities for working with inaturalist-open-data
pyinaturalist-open-data
This is a work in progress and not yet complete!
pyinaturalist-open-data is a Python library and CLI tool for working with inaturalist-open-data. Its goal is to make it easy to import and use this dataset in a Python application backed by any SQLAlchemy-compatible database engine (SQLite by default), or simply for local data exploration.
A recording of the CLI in action is available on asciinema.
Installation
Install with pip:
pip install pyinaturalist-open-data
Or for local development:
git clone https://github.com/JWCook/pyinaturalist-open-data.git
cd pyinaturalist-open-data
pip install poetry && poetry install
Usage
This package provides the command pynat. See --help for commands and options:
Usage: pynat [OPTIONS] COMMAND [ARGS]...
Commands for working with inaturalist open data
Options:
-v, --verbose Show more detailed output
--help Show this message and exit.
Commands:
db Load contents of CSV files into a database
dl Download and extract inaturalist open data archive
init Just create tables (if they don't already exist) without populating...
load Download and load all data into a database.
Run everything
The simplest command is load, which runs all steps:
- Download and extract the dataset
- Create database tables and indices
- Load the data into the database
Options:
Usage: pynat load [OPTIONS]
Options:
-d, --download-dir TEXT Alternate path for downloads
-u, --uri TEXT Alternate database URI to connect to
--help Show this message and exit.
By default, this will create a new SQLite database. Alternatively, you can provide a URI for any supported database.
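Once a SQLite database has been created, the standard-library sqlite3 module is enough for a quick look at what was loaded. The following is a minimal sketch, not part of this package: the function name summarize_tables is hypothetical, and the path in the usage comment is a placeholder rather than the tool's documented default location.

```python
import sqlite3


def summarize_tables(db_path):
    """Return {table_name: row_count} for every table in a SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master lists every object in the database; keep only tables
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # Names are read from the database itself and quoted as identifiers
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()


# Usage (placeholder path; point it at the file that pynat created):
#   print(summarize_tables("path/to/database.db"))
```

Note that sqlite3.connect creates an empty file if the path does not exist, so an empty result may simply mean the path is wrong.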
Run individual steps
Other commands are available if you only want to run one of those steps at a time.
dl command:
Usage: pynat dl [OPTIONS]
Download and extract all files in the inaturalist open data archive
Options:
-d, --download-dir TEXT Alternate path for downloads
--help Show this message and exit
Note: Both dl and load will reuse local data if it already exists and is up to date.
db command:
Usage: pynat db [OPTIONS]
Load contents of CSV files into a database. Also creates tables and
indexes, if they don't already exist.
Options:
-d, --download-dir TEXT Alternate path for downloads
-i, --init Just initialize the database with tables
+ indexes without loading data
-t, --tables [observation|photo|taxon|user]
Load only these specific tables
-u, --uri TEXT Alternate database URI to connect to
--help Show this message and exit.
Note: This can take a long time to run. Depending on the database type, you will likely get better performance with database-specific bulk loading tools (for example, psql with COPY for PostgreSQL).
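Since load is described as the combination of these steps, running dl and then db should have roughly the same effect. The following is a minimal sketch of scripting the two commands from Python. It assumes pynat is on the PATH and uses only the -d and -u options listed in the help output; the helper names build_commands and run_steps are hypothetical and not part of this package.

```python
import subprocess


def build_commands(download_dir=None, uri=None):
    """Build the argument lists for `pynat dl` followed by `pynat db`."""
    dl_cmd = ["pynat", "dl"]
    db_cmd = ["pynat", "db"]
    if download_dir:
        # Both commands accept -d, so they read and write the same directory
        dl_cmd += ["-d", download_dir]
        db_cmd += ["-d", download_dir]
    if uri:
        # Only the db step talks to the database
        db_cmd += ["-u", uri]
    return [dl_cmd, db_cmd]


def run_steps(download_dir=None, uri=None):
    """Run each step in order; check=True stops at the first failing command."""
    for cmd in build_commands(download_dir, uri):
        subprocess.run(cmd, check=True)


# Usage (example values; sqlite:///inat.db is a standard SQLAlchemy SQLite URI):
#   run_steps(download_dir="data", uri="sqlite:///inat.db")
```

Keeping the command construction separate from execution makes it easy to print or log the exact commands before running them.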
Python package
To use it as a Python package instead of a CLI tool:
from pyinaturalist_open_data import download_metadata, load_all
download_metadata()
load_all()
Full package documentation on readthedocs will be coming soon.
Planned features
Some features I would ideally like to add to this:
- Performance optimizations
- Basic querying features
- Image downloads based on query results
- Integration with iNaturalist API data via pyinaturalist
- Integration with CSV data from the iNaturalist export tool