Datazets is a Python package for importing well-known example data sets.
datazets
datazets is a Python package.
Star this repo if you like it! ⭐️
pip install datazets
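After installing, one way to confirm that the package is visible to the current interpreter is to query its installed version. The sketch below uses only the standard library; the helper name `installed_version` is illustrative and is not part of datazets.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if datazets is installed, otherwise None.
print(installed_version("datazets"))
```

A `None` result usually means that `pip` installed into a different environment than the one running the script.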
Import datazets
# Import library
import datazets as dz
# Import data set
df = dz.get('titanic')
Data sets:
| Dataset Name | Shape | Type | Description |
|---|---|---|---|
| meta | (1472, 20) | Continuous, time | Stock price of Meta |
| bitcoin | (2522, 2) | Continuous, time | Bitcoin price history data for time series and price prediction |
| iris | (150, 3) | Continuous | Classic flower classification dataset of iris species measurements, with coordinates |
| gas_prices | (6556, 2) | Mixed, time | Historical gas prices by region for trend analysis |
| ads | (10000, 10) | Discrete | Data on online ads, covering click-through rates and targeting information |
| sprinkler | (1000, 4) | Discrete | Synthetic dataset with binary variables for rain and sprinkler probability illustration |
| random_discrete | (1000, 5) | Discrete | Synthetic dataset with random discrete variables, useful for probability modeling |
| malicious_urls | (387588, 2) | Text | URLs labeled as malicious or benign, useful in cybersecurity |
| malicious_phish | (651191, 4) | Text | URLs labeled as malicious or benign, defacement, phishing, malware (cybersecurity) |
| stormofswords | (352, 3) | Network | Character data from A Storm of Swords, with relationships, traits, and alliance info |
| bigbang | (9, 3) | Network | Data on The Big Bang Theory episodes and characters |
| energy | (68, 3) | Network | Data on building energy consumption |
| auto_mpg | (392, 8) | Mixed | Data on cars with features for predicting miles per gallon |
| breast_cancer | (569, 30) | Mixed | Dataset for breast cancer diagnosis prediction using tumor cell features |
| cancer | (4674, 9) | Mixed | Cancer patient data for classification and prediction of diagnosis outcome, with coordinates |
| census_income | (32561, 15) | Mixed | US Census data with various demographic and economic factors for income prediction |
| elections_rus | (94487, 23) | Mixed | Russian election data with demographic and political attributes |
| elections_usa | (24611, 8) | Mixed | US election data with demographic and political attributes |
| fifa | (128, 27) | Mixed | FIFA player stats including attributes like skill, position, country, and performance |
| marketing_retail | (999, 8) | Mixed | Retail customer data for behavior and segmentation analysis |
| predictive_maintenance | (10000, 14) | Mixed | Industrial equipment data for predictive maintenance |
| student | (649, 33) | Mixed | Data on student performance with socio-demographic and academic factors |
| surfspots | (9413, 4) | Mixed, latlon | Information on global surf spots, with details on location and wave characteristics |
| tips | (244, 7) | Mixed | Restaurant tipping data with variables on meal size, day, and tip amount |
| titanic | (891, 12) | Mixed | Titanic passenger data with demographic, class, and survival information |
| waterpump | (59400, 41) | Mixed | Water pump data with features for predicting functionality and maintenance needs |
| cat_and_dog | None | Image | Images of cats and dogs for classification and object recognition |
| digits | (1083, 65) | Image | Handwritten digit images (8x8 pixels) for recognition and classification |
| faces | (400, 4097) | Image | Images of faces used in facial recognition and feature analysis |
| flowers | None | Image | Various flower images for classification and image recognition |
| img_peaks1 | (930, 930, 3) | Image | Synthetic peak images for image processing and analysis |
| img_peaks2 | (125, 496, 3) | Image | Additional synthetic peak images for image processing |
| mnist | (1797, 65) | Image | MNIST handwritten digit images (28x28 pixels) for classification tasks |
| scenes | None | Image | Scene images for scene classification tasks |
| southern_nebula | None | Image | Images of the Southern Nebula, suitable for astronomical analysis |
| blobs | Custom | Continuous | Synthetic data of datapoints in blob shape |
| moons | Custom | Continuous | Synthetic data of datapoints in moon shape |
| circles | Custom | Continuous | Synthetic data of datapoints in circle shape |
| anisotropic | Custom | Continuous | Synthetic data of datapoints with anisotropic shape |
| globular | Custom | Continuous | Synthetic data of datapoints with globular shape |
| uniform | Custom | Continuous | Synthetic data with uniform shape |
| densities | Custom | Continuous | Synthetic data with different densities |
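The shapes in the table can double as a quick sanity check after loading. The following is a minimal sketch, assuming the loaded object exposes a `.shape` attribute (as a pandas DataFrame or NumPy array does) and that the shapes match the installed release. `DOCUMENTED_SHAPES` and `shape_matches` are hypothetical names introduced here, not part of datazets.

```python
from typing import Any, Callable, Dict, Tuple

# A small subset of shapes copied from the table above, for illustration only.
DOCUMENTED_SHAPES: Dict[str, Tuple[int, ...]] = {
    "titanic": (891, 12),
    "tips": (244, 7),
    "auto_mpg": (392, 8),
}


def shape_matches(name: str, loader: Callable[[str], Any]) -> bool:
    """Load `name` with `loader` and compare its shape to the documented one."""
    expected = DOCUMENTED_SHAPES[name]
    data = loader(name)
    return tuple(data.shape) == expected


# Usage (requires datazets to be installed and may need network access):
# import datazets as dz
# print(shape_matches("titanic", dz.get))
```

Passing the loader in as an argument keeps the check independent of datazets itself, so it can be exercised with a stub. A mismatch may simply mean that the data set was updated in a newer release.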
Example:
import datazets as dz
df = dz.get(data='titanic')
import datazets as dz
# Import from URL
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data'
df = dz.get(url=url, sep=',')
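The two call forms above (`data=` for a named data set, `url=` plus `sep=` for a remote file) can be wrapped in one small dispatcher. This is only a sketch: the helper name `load` and its URL test are assumptions made for illustration, and the loader is passed in so that the function does not depend on datazets being importable.

```python
from typing import Any, Callable


def load(source: str, loader: Callable[..., Any], sep: str = ",") -> Any:
    """Call `loader` with url=/sep= for web addresses, otherwise with data=."""
    if source.startswith(("http://", "https://")):
        # Remote file: forward the column separator as in the URL example.
        return loader(url=source, sep=sep)
    # Anything else is treated as a data set name from the table.
    return loader(data=source)


# Usage (requires datazets to be installed and network access):
# import datazets as dz
# df = load('titanic', dz.get)
# df = load('https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data', dz.get)
```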
Maintainer
- Erdogan Taskesen, GitHub: erdogant
Contribute
- All kinds of contributions are welcome!
- If you wish to buy me a coffee for this work, it is much appreciated :)
Licence
See LICENSE for details.