A package to download datasets from Kaggle.
KaggleDownloader
KaggleDownloader is a Python package that simplifies downloading and extracting datasets from Kaggle. It lets you authenticate with your Kaggle account, search for datasets by theme, download a dataset by its slug, and extract the contents of downloaded zip files through a small command-line and Python interface.
Features
- Authenticate with Kaggle: Easily authenticate using your Kaggle API credentials.
- Search for Datasets: Search for datasets on Kaggle based on specific themes.
- Download Datasets: Download any dataset from Kaggle using its slug.
- Extract Zip Files: Extract the contents of zip files downloaded from Kaggle.
- Directory Management: Automatically create directories to store datasets if they don't exist.
Installation
Install the package and its dependencies with Poetry (Poetry must already be installed). From the project directory, run:
poetry install
Usage
You can run KaggleDownloader from the command line. The general form is:
python -m kaggle_downloader <dataset_slug> --path <download_path>
Replace <dataset_slug> with the identifier of the Kaggle dataset and <download_path> with the location where you want the dataset to be saved.
Example:
python -m kaggle_downloader zynicide/wine-reviews --path ./datasets
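To make the shape of this command line concrete, here is a minimal sketch of an argument parser that accepts the same arguments. It is not the package's actual implementation: the helper name build_parser and the default value for --path are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented command line (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="kaggle_downloader")
    # Positional argument: the dataset slug, in "owner/dataset-name" form.
    parser.add_argument("dataset_slug", help="Kaggle dataset identifier")
    # Assumption: when --path is omitted, fall back to the current directory.
    parser.add_argument("--path", default=".", help="directory to save the dataset in")
    return parser


# Parse the arguments from the example above instead of reading sys.argv.
args = build_parser().parse_args(["zynicide/wine-reviews", "--path", "./datasets"])
print(args.dataset_slug, args.path)
```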
Methods
Authentication
The authenticate_kaggle() method authenticates with your Kaggle account using the path to your kaggle.json file, which contains your API credentials.
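As a rough illustration of what working with kaggle.json can look like, the sketch below reads the file and exports its values through the KAGGLE_USERNAME and KAGGLE_KEY environment variables, which the official Kaggle API client also accepts as credentials. The function name load_kaggle_credentials is hypothetical, and this is not necessarily how authenticate_kaggle() works internally.

```python
import json
import os
from pathlib import Path


def load_kaggle_credentials(kaggle_json_path: str) -> str:
    """Read kaggle.json and expose its credentials via environment variables.

    Hypothetical helper for illustration; returns the username that was loaded.
    """
    path = Path(kaggle_json_path).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"Credentials file not found: {path}")

    with path.open(encoding="utf-8") as fh:
        creds = json.load(fh)

    # kaggle.json is expected to hold a "username" and a "key" entry.
    missing = {"username", "key"} - set(creds)
    if missing:
        raise KeyError(f"kaggle.json is missing: {sorted(missing)}")

    os.environ["KAGGLE_USERNAME"] = creds["username"]
    os.environ["KAGGLE_KEY"] = creds["key"]
    return creds["username"]


# Example call (the path is an assumption, adjust it to your setup):
# load_kaggle_credentials("~/.kaggle/kaggle.json")
```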
Searching for Datasets
The search_datasets(dataset_theme) method searches Kaggle for datasets related to a specific theme.
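A theme search of this kind could be sketched on top of the official Kaggle API client as follows. The sketch assumes api is an already authenticated KaggleApi object from the kaggle package and that each result exposes its slug as a ref attribute. The function name search_dataset_refs and the limit parameter are hypothetical and are not part of this package.

```python
def search_dataset_refs(api, dataset_theme: str, limit: int = 10) -> list[str]:
    """Return up to `limit` dataset slugs whose metadata matches the theme.

    Assumption: `api` is an authenticated kaggle.api.kaggle_api_extended.KaggleApi
    (or any object with a compatible dataset_list(search=...) method).
    """
    results = api.dataset_list(search=dataset_theme) or []
    # Each result is assumed to carry its "owner/dataset-name" slug in `.ref`.
    return [str(dataset.ref) for dataset in results[:limit]]


# Example call, given an authenticated client named `api`:
# for slug in search_dataset_refs(api, "wine"):
#     print(slug)
```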
Downloading Datasets
The download_dataset(dataset_slug, path) method downloads a dataset identified by its Kaggle slug and saves it to the specified location.
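The download step, together with the automatic directory creation listed under Features, might look roughly like the sketch below. It assumes api is an authenticated KaggleApi object from the official kaggle package and uses that client's dataset_download_files call. The function name download_dataset_sketch is hypothetical and the body is an illustration, not the package's own code.

```python
import os


def download_dataset_sketch(api, dataset_slug: str, path: str) -> str:
    """Download a dataset archive into `path`, creating the directory if needed.

    Assumption: `api` is an authenticated kaggle.api.kaggle_api_extended.KaggleApi
    (or any object with a compatible dataset_download_files method).
    """
    # Create the target directory (and any parents) when it does not exist yet.
    os.makedirs(path, exist_ok=True)
    # Leave the archive zipped here; extraction is treated as a separate step.
    api.dataset_download_files(dataset_slug, path=path, unzip=False)
    return os.path.abspath(path)


# Example call, given an authenticated client named `api`:
# download_dataset_sketch(api, "zynicide/wine-reviews", "./datasets")
```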
Extracting Zip Files
The extract_zip(zip_path, extract_to) method extracts the contents of a zip file. If the file doesn’t exist or is not a valid zip archive, an error is raised.
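The behavior described here can be reproduced with the standard library alone. The following is a minimal sketch under the assumption that a missing file raises FileNotFoundError and an invalid archive raises zipfile.BadZipFile. The exception types used by the package itself are not stated, and the function name extract_zip_sketch is hypothetical.

```python
import os
import zipfile


def extract_zip_sketch(zip_path: str, extract_to: str) -> list[str]:
    """Extract `zip_path` into `extract_to` and return the archive's member names."""
    if not os.path.isfile(zip_path):
        raise FileNotFoundError(f"Zip file not found: {zip_path}")
    if not zipfile.is_zipfile(zip_path):
        raise zipfile.BadZipFile(f"Not a valid zip file: {zip_path}")

    # Create the destination directory when it does not exist yet.
    os.makedirs(extract_to, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(extract_to)
        return archive.namelist()


# Example call (both paths are assumptions):
# extract_zip_sketch("./datasets/wine-reviews.zip", "./datasets/wine-reviews")
```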
Contributing
Feel free to contribute to this project by submitting issues, feature requests, or pull requests on GitHub.
License
This project is licensed under the MIT License. See the LICENSE file for details.
Author
Mariano Gobea Alcoba
Email: gobeamariano@gmail.com
File details
Details for the file kaggle_downloader_package-0.1.0.tar.gz.
File metadata
- Download URL: kaggle_downloader_package-0.1.0.tar.gz
- Upload date:
- Size: 4.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.3 CPython/3.12.3 Darwin/23.5.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | de894da6c1254236e5e7e915980f2700f2d31c16c9c7a7aea8994bc4c3ea74fb
MD5 | 9280d8131c60297fe8eea8647f0938ed
BLAKE2b-256 | 5af7601f3380a839792d21fc2bee2803dce9221cb443d92508cb364201ffc0fc
File details
Details for the file kaggle_downloader_package-0.1.0-py3-none-any.whl.
File metadata
- Download URL: kaggle_downloader_package-0.1.0-py3-none-any.whl
- Upload date:
- Size: 5.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.3 CPython/3.12.3 Darwin/23.5.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 4a20f57b07603624a0919b053719fc91db9781334bae774077a46a538f8aea87
MD5 | 46d3f31beecf606c5d8aecabedd2283e
BLAKE2b-256 | b33b4ceed6567f2f780daeafbda290a7e39584cb82bc2ed7b2c83bca92258472