Pytaxon is an open source research aid application for identifying errors and correcting the taxonomic nomenclature of biodiversity species
Project description
Pytaxon: A Python software package for the identification and correction of errors in the taxonomic data of biodiversity species
We present pytaxon, a Python package that resolves and corrects taxonomic names in biodiversity data. It uses the Global Names Verifier (GNV) API and fuzzy matching to suggest corrections for discrepancies and nomenclatural inconsistencies. Pytaxon offers both a Command Line Interface (CLI) and a Graphical User Interface (GUI), making it accessible to users with different levels of computing expertise.
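To make the idea of fuzzy matching concrete, here is a minimal sketch that suggests a correction for a misspelled genus name. It is not pytaxon's implementation: it uses Python's standard `difflib` module, and the `REFERENCE_NAMES` list, the `suggest_correction` function, and the `cutoff` value are all assumptions made for illustration. In pytaxon, the reference names come from the GNV API rather than a hard-coded list.

```python
import difflib

# Hypothetical reference list for illustration only; a real run would
# compare against names returned by a taxonomic data source.
REFERENCE_NAMES = ["Thelyphonus", "Mastigoproctus", "Typopeltis", "Hypoctonus"]


def suggest_correction(name, reference=REFERENCE_NAMES, cutoff=0.8):
    """Return the closest reference name, or None if nothing is similar enough."""
    if name in reference:
        return name  # already a known name, nothing to correct
    # get_close_matches ranks candidates by similarity ratio and drops
    # anything scoring below the cutoff.
    matches = difflib.get_close_matches(name, reference, n=1, cutoff=cutoff)
    return matches[0] if matches else None


for raw in ["Thelyphonus", "Mastigoproctis", "Xyz"]:
    print(raw, "->", suggest_correction(raw))
```

A higher `cutoff` makes suggestions more conservative; a lower one catches more typos at the cost of more false matches.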
Installation Guide
Dependencies
- Listed in requirements.txt
Install the package from PyPI:
$ pip install pytaxon
To download the Pytaxon GUI .exe:
| | win | lin |
|---|---|---|
| .zip | Link | Link |
| .rar | Link | Link |
Workflow
First, check your spreadsheet for errors. The program returns an Excel file (.xlsx) listing the incorrect entries found against the selected data source.
Then use the "Change" column to select which entries should be corrected. After that, run the second command to apply the checked spreadsheet's corrections to the original spreadsheet automatically.
$ pytaxon -r <column names> -os <path to original spreadsheet> -ss <name of suggestion spreadsheet> -si <source id>
$ pytaxon -os <path to original spreadsheet> -cs <path of checked spreadsheet> -o <name of corrected spreadsheet>
Explore the options for these commands with the --help flag.
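The correction step can be pictured with a small sketch in plain Python. It is only an illustration of the idea, not pytaxon's code: apart from the "Change" flag, every name here (`apply_checked_corrections`, the `row`, `column`, and `suggestion` keys, and the sample data) is hypothetical, and real spreadsheets would be read from and written to .xlsx files.

```python
def apply_checked_corrections(original_rows, checked_rows):
    """Return a corrected copy of original_rows.

    original_rows: list of dicts, one per spreadsheet row.
    checked_rows: list of dicts with hypothetical keys 'row' (0-based
        index), 'column', 'suggestion', and 'change'.
    Only entries whose 'change' flag is truthy are applied.
    """
    # Copy each row so the original data stays untouched.
    corrected = [dict(row) for row in original_rows]
    for entry in checked_rows:
        if entry["change"]:
            corrected[entry["row"]][entry["column"]] = entry["suggestion"]
    return corrected


# Illustrative data: two rows, each with one misspelled name.
original = [
    {"genus": "Mastigoproctis", "family": "Thelyphonidae"},
    {"genus": "Thelyphonus", "family": "Thelyphonidea"},
]
checked = [
    {"row": 0, "column": "genus", "suggestion": "Mastigoproctus", "change": True},
    {"row": 1, "column": "family", "suggestion": "Thelyphonidae", "change": False},
]

corrected = apply_checked_corrections(original, checked)
print(corrected)
```

Because the second entry is not marked for change, its misspelling is left as-is, which mirrors how the user keeps control over which suggestions are accepted.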
Illustrative Examples
Pytaxon CLI running in the Visual Studio Code terminal (PowerShell) with a modified version of the Uropygi dataset
The "to correct" spreadsheet of the modified Uropygi dataset
Pytaxon GUI application running with a modified version of the Uropygi dataset
Pytaxon's CLI and GUI workflow
Citing
If you use the source code of Pytaxon in any form, please cite the following manuscript (we encourage citing Global Names Resolver as well):
future manuscript
Acknowledgements
We thank the following institutions, which contributed to ensuring the success of our work:
Museu Paraense Emílio Goeldi (MPEG)
Centro Universitário do Estado do Pará (CESUPA)
Funding
This research was supported by Centro Universitário do Pará - CESUPA with the PIBICT scientific initiation scholarship project.
Authors
Marco Aurélio Proença Neto
Marcos Paulo Alves de Sousa
Contact
Dr. Marcos Paulo Alves de Sousa (Project leader)
Email: msousa@museu-goeldi.br
Grupo de Estudos Temático em Computação Aplicada (GET-COM)
Centro Universitário do Pará - CESUPA
Av. Perimetral 1901. CEP 66077-530. Belém, Pará, Brazil.
Download files
File details
Details for the file pytaxon-0.2.6.tar.gz.
File metadata
- Download URL: pytaxon-0.2.6.tar.gz
- Upload date:
- Size: 17.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.9.21
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 55232854a9995e879ff59bd558a4fb6aa1442b228710d82df1092ecc5abdc46a |
| MD5 | c94d9a73139be3e6cd064b765806a221 |
| BLAKE2b-256 | fd02c0c748b559941cc96f1a86063a4440770683677ce1cd7ad09a4b136acaa5 |
File details
Details for the file pytaxon-0.2.6-py3-none-any.whl.
File metadata
- Download URL: pytaxon-0.2.6-py3-none-any.whl
- Upload date:
- Size: 15.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.9.21
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0d9685d801c7f56d8fe11e0b403f3deb7f1bbfee4a0e68164caacaea1d1260f0 |
| MD5 | 16a984ba99a69aed680e00acecb2fd09 |
| BLAKE2b-256 | 20f01bf4ac2b2922a19055f360cfd6019125081dad39a990bfabed71d527d9c8 |
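If you download a file manually, one way to check it against the published SHA256 digest is sketched below. It uses only Python's standard `hashlib` module; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical and not part of pytaxon.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 digest equals the expected value."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, assuming the wheel was saved in the current directory, `matches_published_hash("pytaxon-0.2.6-py3-none-any.whl", "0d9685d801c7f56d8fe11e0b403f3deb7f1bbfee4a0e68164caacaea1d1260f0")` should return `True` for an intact download.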