A package to translate ecological names in any format, from taxonomic rank (such as genus or family) or common English name (e.g. "blackbird"), to specific, scientific species names.
The Ecological Name Translator
What is it?
A lightweight Python package for translating ecological names in any format, whether at the species level (e.g. "Panthera tigris"), at a higher level of taxonomy (such as genus or family), or as a common name (e.g. "Tiger"), to the standardised species name equivalent.
Input & Output
A list of names describing species is accepted as input. This undergoes a data-cleaning procedure, after which the following actions are taken:
- Names that are already in a standard species format (that is, genus + species extension) have any spelling errors corrected
- Names at higher levels of taxonomy (currently family, sub-family and genus are supported) again have any spelling mistakes corrected, and are then mapped to a list of specific species belonging to that taxonomic rank
- Common names (currently, English only) are mapped to their scientific name (or all scientific names that could be described by the common English name)
The output is a Python dictionary which maps each item in the input list to the equivalent standard scientific name(s).
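Because the output is a plain dictionary of lists, it can be post-processed with ordinary Python. The following is a minimal sketch, not part of the package: the helper `unique_species` is a hypothetical name, and `sample_index` is a hand-written dictionary in the shape described above rather than real package output.

```python
from typing import Dict, List, Set


def unique_species(index: Dict[str, List[str]]) -> Set[str]:
    """Collect every scientific name that appears in any value list."""
    return {name for names in index.values() for name in names}


# Hand-written sample in the documented shape (assumption: not real output).
sample_index = {
    "Panthera": ["panthera leo", "panthera tigris"],
    "Bengal Tiger": ["panthera tigris"],
}

# Duplicates across input items collapse into a single entry.
species = sorted(unique_species(sample_index))
print(species)
```

This can be handy when several input items resolve to overlapping species and only the combined species list is needed.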
Examples
Already a scientific name at the species level
If correct, the name will be returned as is, and if there are any spelling mistakes, they will be corrected.
from EcoNameTranslator import EcoNameTranslator
unstandardised_names = ['Panhera tigris'] #Should be "Panthera tigris"
translator = EcoNameTranslator()
index = translator.translate(unstandardised_names)
print(index)
# {'Panhera tigris':['panthera tigris']}
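To give a feel for what correcting a misspelt name involves, here is a rough sketch using the standard library's `difflib`. It is an illustration only: the package relies on external APIs for its conversions, so this is not how it works internally, and the helper `closest_name` and the `known` list are assumptions made up for the example.

```python
import difflib
from typing import List, Optional


def closest_name(query: str, known_names: List[str], cutoff: float = 0.8) -> Optional[str]:
    """Return the known name most similar to the query, or None if nothing is close."""
    matches = difflib.get_close_matches(query.lower(), known_names, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# A tiny hand-written reference list (assumption, for illustration only).
known = ["panthera tigris", "panthera leo", "panthera onca"]

corrected = closest_name("Panhera tigris", known)
print(corrected)
```

A higher `cutoff` makes the match stricter, at the cost of leaving more misspellings uncorrected.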
Higher levels of taxonomy
If an entry in your list is at a higher taxonomic rank, all species under that rank will be returned.
from EcoNameTranslator import EcoNameTranslator
unstandardised_names = ['Panthera'] # A genus of various big cats
translator = EcoNameTranslator()
index = translator.translate(unstandardised_names)
print(index)
# {'Panthera': ['panthera leo', 'panthera uncia', 'panthera tigris', 'panthera onca', 'panthera pardus','panthera spec']}
Common names
A common name can also be provided, in which case all possible scientific names described by the common name are returned. Be careful: generic common names, e.g. "monkey", can return hundreds of species!
from EcoNameTranslator import EcoNameTranslator
unstandardised_names = ['Bengal Tiger']
translator = EcoNameTranslator()
index = translator.translate(unstandardised_names)
print(index)
# {'Bengal Tiger': ['panthera tigris']}
Note: this feature should be used with caution, as partial matching is included. That is, a list item of just "Tiger" would also bring back unwanted results, such as a tiger beetle.
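One way to guard against such unwanted matches is to filter the returned dictionary by genus afterwards. The sketch below assumes you know which genera you expect. `keep_genera` is a hypothetical helper, not a package function, and `sample_index` is a hand-written stand-in for what a partial match on "Tiger" might return.

```python
from typing import Dict, Iterable, List


def keep_genera(index: Dict[str, List[str]], allowed_genera: Iterable[str]) -> Dict[str, List[str]]:
    """Keep only scientific names whose first word (the genus) is in the allow-list."""
    allowed = {genus.lower() for genus in allowed_genera}
    return {
        key: [name for name in names if name.partition(" ")[0].lower() in allowed]
        for key, names in index.items()
    }


# Hand-written stand-in for a partial-match result (assumption, not real output).
sample_index = {"Tiger": ["panthera tigris", "cicindela campestris"]}

filtered = keep_genera(sample_index, ["Panthera"])
print(filtered)
```

Input items whose names are all filtered out keep their key with an empty list, so it stays visible which inputs produced nothing useful.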
Saving
Two convenience functions are provided to save the returned Python data structure to file. Simply call one of:
from EcoNameTranslator import EcoNameTranslator
items = [...]
translator = EcoNameTranslator() # Create the translator before calling it
index = translator.translate(items)
translator.toPickleFile('C:/Users/...','filename') # Python serialization
translator.toCSV('C:/Users/...','filename') # CSV
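If you need control over the file layout, the returned dictionary can also be written out with the standard library. This is a sketch under assumptions: the column names and the one-row-per-pair layout are choices made for this example and do not describe the format produced by `toCSV`, and `write_long_csv` is a hypothetical helper.

```python
import csv
import io
from typing import Dict, List, TextIO


def write_long_csv(index: Dict[str, List[str]], handle: TextIO) -> None:
    """Write one row per (input name, scientific name) pair."""
    writer = csv.writer(handle)
    writer.writerow(["input_name", "scientific_name"])
    for input_name, names in index.items():
        for name in names:
            writer.writerow([input_name, name])


# An in-memory buffer keeps the example self-contained; for a real file,
# pass a handle opened with open(path, "w", newline="").
buffer = io.StringIO()
write_long_csv({"Bengal Tiger": ["panthera tigris"]}, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```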
Why did we build it?
This package was refactored as part of a larger species interaction networks project. Our project required large datasets of standardised, scientific species names. While there is plenty of species data, the existing datasets are often incredibly messy and constructed ad hoc, meaning that assembling a standard collection of species is difficult. To help, we made this package, which takes in ecological names in any format and makes a best-effort attempt to translate the input to a standardised set of scientific names.
Detailed Docs + Contributing
See the GitHub page for both.
Credit
The package uses various APIs for conversions of names. These are: