A package to translate ecological names in any format, from taxonomic rank (such as genus or family) or common English name (e.g. "blackbird"), to specific, scientific species names.
The Ecological Name Translator
What is it?
A lightweight Python package for translating ecological names in any format, whether at the species level (e.g. "Panthera tigris"), at higher levels of taxonomy (such as genus or family), or as a common name (e.g. "Tiger"), to the equivalent standardised species name.
Input & Output
The input is a list of names describing species. The list first goes through a data-cleaning procedure, after which the following actions are taken:
- Names that are already in a standard species format (that is, genus + species extension) have any spelling errors corrected.
- Names at higher levels of taxonomy (currently family, sub-family and genus are supported) also have any spelling mistakes corrected, and are then mapped to a list of the species belonging to that taxonomic rank.
- Common names (currently English only) are mapped to their scientific name (or to all scientific names that the common English name could describe).
The output is a Python dictionary which maps each item in the input list to the equivalent standard scientific name(s).
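Because the output is a plain dictionary of lists, it can be post-processed with ordinary Python. The sketch below does not call the package; it builds a dictionary of the same shape by hand, using the values shown in the examples in this README, and the helper names `unique_species` and `inputs_by_species` are hypothetical, introduced only for illustration.

```python
from typing import Dict, List

# A hand-written dictionary in the shape described above:
# input name -> list of standard scientific names.
# Values are copied from the example outputs in this README.
index: Dict[str, List[str]] = {
    "Panhera tigris": ["panthera tigris"],
    "Panthera": [
        "panthera leo", "panthera uncia", "panthera tigris",
        "panthera onca", "panthera pardus", "panthera spec",
    ],
    "Bengal Tiger": ["panthera tigris"],
}


def unique_species(mapping: Dict[str, List[str]]) -> List[str]:
    """Collect every scientific name once, sorted alphabetically."""
    species = set()
    for names in mapping.values():
        species.update(names)
    return sorted(species)


def inputs_by_species(mapping: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert the mapping: scientific name -> input names that produced it."""
    inverted: Dict[str, List[str]] = {}
    for input_name, names in mapping.items():
        for name in names:
            inverted.setdefault(name, []).append(input_name)
    return inverted


print(unique_species(index))
print(inputs_by_species(index)["panthera tigris"])
```

The inverted view can be handy for spotting which messy input names collapse onto the same species.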
Examples
Already a scientific name at the species level
If correct, the name will be returned as is, and if there are any spelling mistakes, they will be corrected.
from EcoNameTranslator import EcoNameTranslator  # import path assumed from the package name
unstandardised_names = ['Panhera tigris'] # Should be "Panthera tigris"
translator = EcoNameTranslator()
index = translator.translate(unstandardised_names)
print(index)
# {'Panhera tigris': ['panthera tigris']}
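To give a feel for what spelling correction involves, here is a minimal local sketch using fuzzy matching from the standard library's `difflib`. It is not the package's method (the package relies on external APIs, see Credit); the reference list `KNOWN_SPECIES`, the function `correct_spelling` and the `cutoff` value are all assumptions made for illustration.

```python
import difflib
from typing import List, Optional

# Hypothetical reference list; a real one would hold many more names.
KNOWN_SPECIES: List[str] = ["panthera tigris", "panthera leo", "panthera onca"]


def correct_spelling(name: str,
                     known: List[str] = KNOWN_SPECIES,
                     cutoff: float = 0.8) -> Optional[str]:
    """Return the closest known name, or None if nothing is similar enough."""
    cleaned = name.strip().lower()
    matches = difflib.get_close_matches(cleaned, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(correct_spelling("Panhera tigris"))  # panthera tigris
print(correct_spelling("zzzz"))            # None
```

A higher `cutoff` makes the match stricter, which reduces the risk of "correcting" a valid but unlisted name into a different species.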
Higher levels of taxonomy
If an entry in your list is at a higher taxonomic rank, all species under that rank will be returned.
unstandardised_names = ['Panthera'] # A genus of various big cats
translator = EcoNameTranslator()
index = translator.translate(unstandardised_names)
print(index)
# {'Panthera': ['panthera leo', 'panthera uncia', 'panthera tigris', 'panthera onca', 'panthera pardus','panthera spec']}
Common names
A common name can also be provided, in which case all possible scientific names described by the common name are returned. Be careful; quite generic common names, e.g. "monkey", can return hundreds of species!
unstandardised_names = ['Bengal Tiger']
translator = EcoNameTranslator()
index = translator.translate(unstandardised_names)
print(index)
# {'Bengal Tiger': ['panthera tigris']}
Note: this feature should be used with caution, as partial matching is included. That is, a list item of just "Tiger" would also bring back unwanted results, such as a Tiger Beetle.
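One way to guard against unwanted partial matches is to filter the returned dictionary afterwards. The sketch below is an assumption-laden illustration, not part of the package: `example_result` is a hypothetical output for "Tiger" (the tiger beetle entry is invented to mirror the caution above), and `keep_genus` and `too_broad` are hypothetical helpers.

```python
from typing import Dict, List

# Hypothetical result for a generic common name; not real package output.
example_result: Dict[str, List[str]] = {
    "Tiger": ["panthera tigris", "cicindela campestris"],
}


def keep_genus(index: Dict[str, List[str]], genus: str) -> Dict[str, List[str]]:
    """Keep only species whose first word (the genus) matches `genus`."""
    wanted = genus.strip().lower()
    return {
        name: [s for s in species if s.lower().split()[:1] == [wanted]]
        for name, species in index.items()
    }


def too_broad(index: Dict[str, List[str]], limit: int) -> List[str]:
    """List input names that mapped to more than `limit` species."""
    return [name for name, species in index.items() if len(species) > limit]


print(keep_genus(example_result, "Panthera"))  # {'Tiger': ['panthera tigris']}
print(too_broad(example_result, limit=1))      # ['Tiger']
```

Flagging overly broad inputs with something like `too_broad` makes it easier to review generic names such as "monkey" by hand before using the results.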
Saving
Two convenience functions are provided to save the returned Python data structure to file. Simply call one of:
items = [...]
index = translator.translate(items)
translator.toPickleFile('C:/Users/...','filename') # Python serialization
translator.toCSV('C:/Users/...','filename') # CSV
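If a specific file layout is needed, the returned dictionary can also be written out by hand. The sketch below uses only the standard library and writes one row per (input name, scientific name) pair; the column names, the function `write_long_csv` and the file name are assumptions, and the layout produced by the package's own `toCSV` may differ.

```python
import csv
import os
import tempfile
from typing import Dict, List


def write_long_csv(index: Dict[str, List[str]], path: str) -> None:
    """Write one CSV row per (input name, scientific name) pair."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["input_name", "scientific_name"])  # assumed header
        for input_name, species in index.items():
            for scientific_name in species:
                writer.writerow([input_name, scientific_name])


# Hand-written dictionary in the same shape as the translator's output.
index = {
    "Bengal Tiger": ["panthera tigris"],
    "Panthera": ["panthera leo", "panthera onca"],
}

# Write to a temporary directory so the example leaves no stray files behind.
csv_path = os.path.join(tempfile.mkdtemp(), "translated_names.csv")
write_long_csv(index, csv_path)

with open(csv_path, newline="", encoding="utf-8") as handle:
    rows = list(csv.reader(handle))
print(rows)
```

The long format (one pair per row) tends to load cleanly into spreadsheets and data-frame libraries, since every row has the same two columns regardless of how many species an input maps to.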
Why did we build it?
This package was refactored as part of a larger species interaction networks project. Our project required large datasets of standardised, scientific species names, and while there is plenty of species data, the existing datasets are often incredibly messy and constructed ad hoc, which makes assembling a standard collection of species difficult. To help, we made this package, which takes in ecological names in any format and makes a best-effort attempt to translate the input to a standardised set of scientific names.
Detailed Docs + Contributing
See the GitHub page for both.
Credit
The package uses various APIs to convert names.
Hashes for EcoNameTranslator-0.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | baf1b2675df1712670ce08bdacdb6f316fc6f0fb74f5f4ac8fe9e57d3bc3939d
MD5 | 8148545d36153bf0e445c964b0ee888c
BLAKE2b-256 | 49a0af40035e0769b5ddc56e4b0a88a822b3178776d52ea2a22d344d8d177918