Kaggle CLI-based Python wrapper

Project description

Introduction

The repository provides a simple class that wraps the Kaggle CLI tools, making it easier to list competitions and datasets and to download and unzip datasets, including competition datasets.

It aims to make it easy to use Kaggle datasets directly from Python scripts. Its main use case is as a submodule in other repos, though it can also be used as a standalone package. When used as a standalone package, the data is downloaded and stored in a data folder. When used as a submodule, the data is downloaded into a data folder created in the base folder of the calling script.
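The storage rule described above can be sketched as follows. This is only an illustration and not the package's actual code: the helper name resolve_data_dir is hypothetical, and it assumes the data folder is simply a folder named data next to the calling script.

```python
from pathlib import Path


def resolve_data_dir(call_path: str) -> Path:
    """Hypothetical helper: return a `data` folder next to the calling script."""
    # The base folder is the directory that contains the calling script.
    base_dir = Path(call_path).resolve().parent
    data_dir = base_dir / "data"
    # Create the folder on first use; do nothing if it already exists.
    data_dir.mkdir(parents=True, exist_ok=True)
    return data_dir
```

A calling script would pass its own `__file__`, mirroring the call_path argument used in the examples in this README.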

Installation

The repo relies on the setuptools and kaggle packages. To install the dependencies and then the repo from a local checkout, run:

pip install -U kaggle setuptools
python setup.py install

It also assumes that the Kaggle user ID and public API credentials are present at:

On Windows: C:\Users\<Windows-username>\.kaggle\kaggle.json

Others: ~/.kaggle/kaggle.json
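Before calling the wrapper, it can help to confirm that the credentials file is where the Kaggle tools expect it. The check below is a minimal sketch under stated assumptions: the function name find_kaggle_credentials is hypothetical, and it assumes kaggle.json holds the "username" and "key" fields found in the file downloaded from a Kaggle account page.

```python
import json
from pathlib import Path
from typing import Optional


def find_kaggle_credentials(config_dir: Optional[Path] = None) -> Path:
    """Return the path to kaggle.json, or raise if it is missing or incomplete."""
    # Path.home() is the user's home folder on Windows and on other platforms.
    config_dir = config_dir or Path.home() / ".kaggle"
    credentials_file = config_dir / "kaggle.json"
    if not credentials_file.is_file():
        raise FileNotFoundError(f"No Kaggle credentials found at {credentials_file}")
    credentials = json.loads(credentials_file.read_text())
    # Assumption: the file contains at least these two fields.
    missing = {"username", "key"} - credentials.keys()
    if missing:
        raise ValueError(f"kaggle.json is missing fields: {sorted(missing)}")
    return credentials_file
```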

After the above steps, the repo can be installed from git using:

pip install git+https://github.com/sunny1401/kaggle_utils.git#egg=kaggle-cli-wrapper

The project can also be installed using pip:

pip install kaggle-cli-wrapper

Code details

DataAPI class

To list all available datasets

from kaggle_cli_wrapper import KaggleDataApi

kda = KaggleDataApi(call_path=__file__)
kda.list_all_kaggle_datasets(search_term="cityscapes")

The call_path argument is required; it determines the folder where downloaded files are stored.

The function returns all available datasets matching "cityscapes". The result is also saved as a .txt file.

By default the list is sorted by "votes". To change the sort order, use the sort_by argument. Allowed values of sort_by are:

'hottest', 'votes', 'updated', 'active', 'published'
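To illustrate how a search term and a sort_by value might translate into a Kaggle CLI call, here is a minimal sketch. It is not the package's implementation: the helper build_dataset_list_command is hypothetical, and it assumes the wrapper ends up running `kaggle datasets list` with that command's `-s` and `--sort-by` options.

```python
from typing import List

# Allowed values, as listed above.
DATASET_SORT_OPTIONS = ("hottest", "votes", "updated", "active", "published")


def build_dataset_list_command(search_term: str, sort_by: str = "votes") -> List[str]:
    """Hypothetical helper: build a `kaggle datasets list` command."""
    # Reject unsupported values early, before anything is run.
    if sort_by not in DATASET_SORT_OPTIONS:
        raise ValueError(
            f"sort_by must be one of {DATASET_SORT_OPTIONS}, got {sort_by!r}"
        )
    return ["kaggle", "datasets", "list", "-s", search_term, "--sort-by", sort_by]
```

Validating sort_by up front gives a clearer error than letting the CLI reject the value.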

To list all available competitions

from kaggle_cli_wrapper import KaggleDataApi

kda = KaggleDataApi(call_path=__file__)
kda.list_all_kaggle_competitions(search_term="cityscapes")

By default the list is sorted by "earliestDeadline". To change the sort order, use the sort_by argument. Allowed values of sort_by are:

'grouped', 'prize', 'earliestDeadline', 'latestDeadline', 'numberOfTeams', 'recentlyCreated'

To download datasets

from kaggle_cli_wrapper import KaggleDataApi

kda = KaggleDataApi(call_path=__file__)
kda.download_kaggle_dataset(dataset_name="cityscapes_train_val_test", is_competition_dataset=False)

To download a competition dataset, set is_competition_dataset to True.
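Downloads typically arrive as zip archives, which the wrapper unzips. The unzipping step can be sketched with the standard library as below. This is a sketch of the general technique, not the package's code, and the function name unzip_archive is hypothetical.

```python
import zipfile
from pathlib import Path
from typing import List


def unzip_archive(archive_path: Path, target_dir: Path) -> List[str]:
    """Extract a downloaded zip archive and return the names of its members."""
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()
```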

ScoringAPI

The code currently supports minimal functionality: submitting files to a competition and retrieving scores for that competition.

To submit to a competition

from kaggle_utils.kaggle_cli_wrapper import KaggleScoringsApi

kaggle_scoring_api = KaggleScoringsApi(competition_name="facial-keypoints-detection")

submission_path = "submission.csv"  # placeholder: path to your own submission file
kaggle_scoring_api.submit_solution(submissions_file=submission_path, description="facial_keypoint_vanilla_cnn")
kaggle_scoring_api.get_top_scores()
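For orientation, the Kaggle CLI commands that correspond to these two steps can be sketched as argument lists. Both helper names are hypothetical. Whether the wrapper builds its commands this way is an assumption, as is the mapping of get_top_scores to the leaderboard command.

```python
from typing import List


def build_submit_command(competition: str, submission_file: str, description: str) -> List[str]:
    """Hypothetical helper: build a `kaggle competitions submit` command."""
    return [
        "kaggle", "competitions", "submit",
        "-c", competition,      # competition name
        "-f", submission_file,  # file to upload
        "-m", description,      # submission message
    ]


def build_leaderboard_command(competition: str) -> List[str]:
    """Hypothetical helper: build a command that shows the top of the leaderboard."""
    return ["kaggle", "competitions", "leaderboard", "-c", competition, "--show"]
```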

Remaining Issues

  • improve error handling
  • allow for saving of stdout
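One possible direction for both items is sketched below. It is not part of the package: the function name run_command and its log_file parameter are hypothetical. It runs a command with subprocess, raises a readable error on a non-zero exit code, and optionally writes stdout to a file.

```python
import subprocess
import sys
from pathlib import Path
from typing import List, Optional


def run_command(command: List[str], log_file: Optional[Path] = None) -> str:
    """Run a command, raise a readable error on failure, optionally save stdout."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"Command {command} failed with exit code {result.returncode}: "
            f"{result.stderr.strip()}"
        )
    if log_file is not None:
        # Persist stdout so it can be inspected after the run.
        log_file.write_text(result.stdout)
    return result.stdout


# Example with a harmless command; a real call would pass a kaggle CLI command.
output = run_command([sys.executable, "-c", "print('ok')"])
```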

Download files

Source Distribution

kaggle_cli_wrapper-0.2.tar.gz (6.7 kB)

Uploaded Source

Built Distribution

kaggle_cli_wrapper-0.2-py3-none-any.whl (7.2 kB)

Uploaded Python 3

File details

Details for the file kaggle_cli_wrapper-0.2.tar.gz.

File metadata

  • Download URL: kaggle_cli_wrapper-0.2.tar.gz
  • Size: 6.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.16

File hashes

Hashes for kaggle_cli_wrapper-0.2.tar.gz

  • SHA256: 7164e4a502430a6063f0cad1aa00d612c45cb7a7c1cf7afa48c2b81357305fa1
  • MD5: ccc20fbac00d3219178c207f61104618
  • BLAKE2b-256: 8a9ac57ff02d0c852b3fe117599e72ffd349acbcb0217e99479b895842d4a32c

File details

Details for the file kaggle_cli_wrapper-0.2-py3-none-any.whl.

File hashes

Hashes for kaggle_cli_wrapper-0.2-py3-none-any.whl

  • SHA256: 4e7b1b74db28801b87400c7ccf8f02157543d23c27b2850c4d95102c9a73feef
  • MD5: e1678c97d8b9007ffd4c046c86df6e3a
  • BLAKE2b-256: b2e840065cad1e71a3c53c5b3ce0e456d61497fdec7936fde44f3adc52396457
