xeno-canto.org API Wrapper
xeno-canto-py is an API wrapper designed to help users download xeno-canto.org recordings and associated information efficiently. Download requests are processed concurrently using asyncio and the aiofiles library to reduce retrieval time. The wrapper also offers delete and metadata generation functions for recording library management.
Created to aid in data collection and filtering for the training of machine learning models.
xeno-canto-py is available on PyPI and can be installed with the package manager pip:
pip install xeno-canto
The package can then be used straight from the command line:
xeno-canto -dl Bearded Bellbird
Or imported into an existing Python project:
For users who want more control over the wrapper, navigate to your desired file location in a terminal window and then clone the repository with the following command:
git clone https://github.com/ntivirikin/xeno-canto-py
The only file required for operation is xenocanto.py, so feel free to remove the others or move xenocanto.py to another working directory.
WARNING: Please exercise caution when using test.py. Executing the tests via unittest or another test harness will delete any dataset folder in the working directory once the tests complete.
The xeno-canto-py wrapper supports retrieving metadata and audio from the xeno-canto database. It also provides library management functions: deleting recordings that match input tags, removing folders that contain too few audio recordings, and generating a single JSON metadata file for a given path containing xeno-canto audio recordings. Examples of command usage are given below.
xeno-canto -m [parameters]
Downloads metadata as a series of JSON files and returns the path to the metadata folder.
Example: Metadata retrieval for Bearded Bellbird recordings of quality A
xeno-canto -m Bearded Bellbird q:A
Audio Recording Download
xeno-canto -dl [parameters]
Retrieves the metadata for the request and uses it to download audio recordings as MP3s from the database.
Example: Download Bearded Bellbird recordings from the country of Brazil
xeno-canto -dl Bearded Bellbird cnt:Brazil
xeno-canto -del [parameters]
Delete recordings with ANY of the parameters given as input.
Example: Delete ALL quality D recordings and ALL recordings from Brazil
xeno-canto -del q:D cnt:Brazil
Removes any folders within the dataset/audio/ directory that have fewer recordings than the input value.
xeno-canto -p [num]
Example: Remove recording folders with fewer than 10 recordings (not inclusive)
xeno-canto -p 10
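The pruning rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the wrapper's own code: the function name folders_below_threshold is hypothetical, recordings are assumed to be .mp3 files stored one folder per species, and the sketch only reports candidate folders rather than deleting them.

```python
from pathlib import Path


def folders_below_threshold(audio_dir, num):
    """Return subfolders of audio_dir holding fewer than num .mp3 files.

    Hypothetical helper for illustration only; it lists candidates and
    deletes nothing.
    """
    root = Path(audio_dir)
    if not root.is_dir():
        return []
    candidates = []
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        # Count only regular files with an .mp3 extension.
        count = sum(
            1 for f in folder.iterdir()
            if f.is_file() and f.suffix.lower() == ".mp3"
        )
        # Strictly fewer than num: a folder with exactly num files is kept.
        if count < num:
            candidates.append(folder)
    return candidates
```

Listing the candidates first makes it possible to review what a purge with a given threshold would affect before running the real command.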
Generates metadata for the xeno-canto database recordings at the input path, defaulting to dataset/audio/ within the working directory if none is given.
xeno-canto -g [path]
Example: Generate metadata for the recordings located in bird_rec/audio/ within the working directory
xeno-canto -g bird_rec/audio/
Parameters are given in tag:value form in accordance with the API search guidelines. For help in building search terms, consult the xeno-canto API guide and this article. The only exception is when providing English bird names as an argument to the delete function: these must be preceded by en: and have all spaces replaced with underscores.
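The naming rule for the delete function can be expressed as a small helper. The function name english_name_tag is hypothetical and not part of the wrapper; the sketch simply applies the rule described above. Collapsing runs of whitespace into a single underscore is a design choice of this sketch.

```python
def english_name_tag(name):
    """Format an English bird name as an en: tag for the delete function.

    Hypothetical helper: prefixes the name with en: and replaces spaces
    with underscores, collapsing repeated whitespace.
    """
    return "en:" + "_".join(name.split())


# Build a delete argument for Bearded Bellbird recordings.
tag = english_name_tag("Bearded Bellbird")
print(tag)  # en:Bearded_Bellbird
```

The resulting tag would then be passed on the command line, for example xeno-canto -del en:Bearded_Bellbird.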
Files are saved in the working directory under the folder dataset/. Metadata and audio recordings are separated into metadata/ and audio/ folders, organized by request information and bird species respectively. For example:
dataset/
  - audio/
    - Indigo Bunting/
      - 14325.mp3
    - Northern Cardinal/
      - 8273.mp3
  - metadata/
    - library.json
    - IndigoBuntingcnt_Canada/
      - page1.json
    - NorthernCardinalq_A/
      - page1.json
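As a rough illustration of how this layout can be consumed from Python, the sketch below builds an inventory of recording IDs per species. It assumes the dataset/audio/<species>/<id>.mp3 structure from the example; the function name recordings_by_species is hypothetical and not part of the wrapper.

```python
from pathlib import Path


def recordings_by_species(dataset_dir="dataset"):
    """Map each species folder name to a sorted list of recording IDs.

    Assumes the layout dataset/audio/<species>/<id>.mp3. IDs are returned
    as strings taken from the file names.
    """
    audio_dir = Path(dataset_dir) / "audio"
    library = {}
    if not audio_dir.is_dir():
        return library
    for species_dir in sorted(p for p in audio_dir.iterdir() if p.is_dir()):
        # The file stem is the recording ID, e.g. 14325 for 14325.mp3.
        library[species_dir.name] = sorted(
            f.stem for f in species_dir.glob("*.mp3")
        )
    return library
```

For the example tree, this would map Indigo Bunting to the ID 14325 and Northern Cardinal to the ID 8273.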
Metadata is retrieved as a JSON file and contains information on each of the audio recordings matching the request parameters provided as input. The metadata also contains the download links used to retrieve the audio recordings. The library.json file is generated by running the metadata generation command, xeno-canto -g.
If an Error 503 is returned when attempting a recording download, try passing a value lower than 4 as the num_chunks value in download(filt, num_chunks). This can be done either by changing the default value in the function definition for download(filt, num_chunks), or by passing a second argument to the download(params) call in the body of main(), as shown below.
# Running with default 4 locks on semaphore
asyncio.run(download(params))
# Running with 3 locks rather than default
asyncio.run(download(params, 3))
Alternatively, you can experiment with higher values for num_chunks to see whether performance improves.
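The effect of num_chunks can be illustrated with a small stand-alone sketch. It does not reproduce the wrapper's download() function: fetch_all and fetch are hypothetical names, and asyncio.sleep stands in for the network request. It assumes, based on the comments in the snippet above, that num_chunks sets the size of an asyncio semaphore that caps how many requests run at once.

```python
import asyncio


async def fetch_all(items, num_chunks=4):
    """Run one simulated download per item, at most num_chunks at a time.

    Returns the results in input order and the peak number of downloads
    that were in flight simultaneously.
    """
    semaphore = asyncio.Semaphore(num_chunks)
    active = 0
    peak = 0

    async def fetch(item):
        nonlocal active, peak
        # Only num_chunks coroutines can hold the semaphore at once.
        async with semaphore:
            active += 1
            peak = max(peak, active)
            # Stand-in for a real HTTP request and file write.
            await asyncio.sleep(0.01)
            active -= 1
            return item

    results = await asyncio.gather(*(fetch(i) for i in items))
    return results, peak


# Ten simulated downloads, no more than three in flight at a time.
results, peak = asyncio.run(fetch_all(list(range(10)), 3))
```

A lower limit means fewer simultaneous requests hit the server, which is the likely reason it helps with 503 responses. A higher limit allows more overlap at the cost of more load on the server.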
All pull requests are welcome! If any issues are found, please do not hesitate to bring them to my attention.
Thank you to the team at xeno-canto.org and all its contributors for putting together such an amazing database.