
datazets is a Python package for importing well-known example data sets.

Project description

datazets


  • datazets is a Python package for importing well-known example data sets.

Star this repo if you like it! ⭐️

pip install datazets
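
After installing, it can help to confirm which version is active in the current environment. The following is a minimal sketch using the standard-library `importlib.metadata`; the helper name `installed_version` is our own and not part of datazets.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed datazets version, or None when the package is missing.
print(installed_version("datazets"))
```

If this prints `None`, the package is not visible to the interpreter that ran the check, which usually means `pip` installed it into a different environment.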

Import datazets

# Import library
import datazets as dz
# Import data set
df = dz.get('titanic')

Data sets:

Dataset Name            Shape           Type        Description
----------------------  --------------  ----------  -----------------------------------------------------------------------------------------------
meta                    (1472, 20)      Continuous  time
bitcoin                 (2522, 2)       Continuous  time
iris                    (150, 3)        Continuous  Classic flower classification dataset with iris species measurements with coordinates
----------------------  --------------  ----------  -----------------------------------------------------------------------------------------------
gas_prices              (6556, 2)       Mixed       time
ads                     (10000, 10)     Discrete    Data on online ads, covering click-through rates and targeting information
sprinkler               (1000, 4)       Discrete    Synthetic dataset with binary variables for rain and sprinkler probability illustration
random_discrete         (1000, 5)       Discrete    Synthetic dataset with random discrete variables, useful for probability modeling
----------------------  --------------  ----------  -----------------------------------------------------------------------------------------------
malicious_urls          (387588, 2)     Text        URLs labeled as malicious or benign, useful in cybersecurity
malicious_phish         (651191, 4)     Text        URLs labeled as malicious or benign, defacement, phishing, malware (cybersecurity)
----------------------  --------------  ----------  -----------------------------------------------------------------------------------------------
stormofswords           (352, 3)        Network     Character data from A Storm of Swords, with relationships, traits, and alliance info
bigbang                 (9, 3)          Network     Data on The Big Bang Theory episodes and characters
energy                  (68, 3)         Network     Data on building energy consumption
----------------------  --------------  ----------  -----------------------------------------------------------------------------------------------
auto_mpg                (392, 8)        Mixed       Data on cars with features for predicting miles per gallon
breast_cancer           (569, 30)       Mixed       Dataset for breast cancer diagnosis prediction using tumor cell features
cancer                  (4674, 9)       Mixed       Cancer patient data for classification and prediction of diagnosis outcome with Coordinates
census_income           (32561, 15)     Mixed       US Census data with various demographic and economic factors for income prediction
elections_rus           (94487, 23)     Mixed       Russian election data with demographic and political attributes
elections_usa           (24611, 8)      Mixed       US election data with demographic and political attributes
fifa                    (128, 27)       Mixed       FIFA player stats including attributes like skill, position, country, and performance
marketing_retail        (999, 8)        Mixed       Retail customer data for behavior and segmentation analysis
predictive_maintenance  (10000, 14)     Mixed       Industrial equipment data for predictive maintenance
student                 (649, 33)       Mixed       Data on student performance with socio-demographic and academic factors
surfspots               (9413, 4)       Mixed       latlon
tips                    (244, 7)        Mixed       Restaurant tipping data with variables on meal size, day, and tip amount
titanic                 (891, 12)       Mixed       Titanic passenger data with demographic, class, and survival information
waterpump               (59400, 41)     Mixed       Water pump data with features for predicting functionality and maintenance needs
----------------------  --------------  ----------  -----------------------------------------------------------------------------------------------
cat_and_dog             None            Image       Images of cats and dogs for classification and object recognition
digits                  (1083, 65)      Image       Handwritten digit images (8x8 pixels) for recognition and classification
faces                   (400, 4097)     Image       Images of faces used in facial recognition and feature analysis
flowers                 None            Image       Various flower images for classification and image recognition
img_peaks1              (930, 930, 3)   Image       Synthetic peak images for image processing and analysis
img_peaks2              (125, 496, 3)   Image       Additional synthetic peak images for image processing
mnist                   (1797, 65)      Image       MNIST handwritten digit images (28x28 pixels) for classification tasks
scenes                  None            Image       Scene images for scene classification tasks
southern_nebula         None            Image       Images of the Southern Nebula, suitable for astronomical analysis
----------------------  --------------  ----------  -----------------------------------------------------------------------------------------------
blobs                   Custom          Continuous  Synthetic data of datapoints in blob shape
moons                   Custom          Continuous  Synthetic data of datapoints in moon shape
circles                 Custom          Continuous  Synthetic data of datapoints in circle shape
anisotropic             Custom          Continuous  Synthetic data of datapoints with anisotropic shape
globular                Custom          Continuous  Synthetic data of datapoints with globular shape
uniform                 Custom          Continuous  Synthetic data with uniform shape
densities               Custom          Continuous  Synthetic data with different densities
----------------------  --------------  ----------  -----------------------------------------------------------------------------------------------
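
A quick sanity check after loading a data set is to compare its shape against the table above. The sketch below is one way to do that, assuming `dz.get` returns an object with a pandas-style `.shape` attribute for tabular data sets (as the `df` naming in the examples suggests). The helper names `shape_matches` and `load_and_check` are hypothetical and not part of datazets; the expected shapes are copied from the table for three entries only.

```python
# Shapes copied from the data set table (a small subset).
EXPECTED_SHAPES = {
    "titanic": (891, 12),
    "tips": (244, 7),
    "iris": (150, 3),
}


def shape_matches(name, df):
    """Return True if df.shape equals the shape listed in the table for `name`."""
    return tuple(df.shape) == EXPECTED_SHAPES[name]


def load_and_check(name):
    """Load `name` with datazets and report whether its shape matches the table."""
    # Imported lazily so shape_matches can be used without datazets installed.
    import datazets as dz

    df = dz.get(data=name)
    return shape_matches(name, df)
```

Calling `load_and_check('titanic')` requires datazets to be installed and may need network access. A mismatch does not necessarily indicate an error, since the hosted data may have changed since the table was written.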

Example:

# Import a data set by name
import datazets as dz
df = dz.get(data='titanic')

# Import from url
url='https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data'
df = dz.get(url=url, sep=',')

Maintainer

Contribute

  • All kinds of contributions are welcome!
  • If you would like to buy me a coffee for this work, it is much appreciated :)

Licence

See LICENSE for details.

Download files


Source Distribution

datazets-1.1.0.tar.gz (14.9 kB)

Uploaded Source

Built Distribution

datazets-1.1.0-py3-none-any.whl (14.4 kB)

Uploaded Python 3

File details

Details for the file datazets-1.1.0.tar.gz.

File metadata

  • Download URL: datazets-1.1.0.tar.gz
  • Upload date:
  • Size: 14.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.12.8

File hashes

Hashes for datazets-1.1.0.tar.gz
Algorithm    Hash digest
SHA256       27962c727f0c02f370153f183a81fc7e0b33277b95047324c60cae06bad15f99
MD5          88b149fd27fd4da6e02563e5edd5a7aa
BLAKE2b-256  1658d629173d37b4b704656b82299568d182ea12fcb8ecf0eff1468d0703dbe0


File details

Details for the file datazets-1.1.0-py3-none-any.whl.

File metadata

  • Download URL: datazets-1.1.0-py3-none-any.whl
  • Upload date:
  • Size: 14.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.12.8

File hashes

Hashes for datazets-1.1.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       edf21e39c480edcd80c0b1fc4b36f9546c5097a1aa7f9276e97f7e990f1424de
MD5          1f7e487a7702a029283ea85bc51c9e73
BLAKE2b-256  4471b7012dee713198a598c836da8c446792ab39235714a389ac4db2417a6bca
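
The SHA256 digests listed above can be used to check that a downloaded file is intact. The following is a minimal sketch using the standard-library `hashlib`; the helper names `sha256_of` and `verify` are our own, and the local path passed to them is whatever location you downloaded the file to.

```python
import hashlib

# SHA256 digests copied from the hash tables for the two 1.1.0 files.
PUBLISHED_SHA256 = {
    "datazets-1.1.0.tar.gz":
        "27962c727f0c02f370153f183a81fc7e0b33277b95047324c60cae06bad15f99",
    "datazets-1.1.0-py3-none-any.whl":
        "edf21e39c480edcd80c0b1fc4b36f9546c5097a1aa7f9276e97f7e990f1424de",
}


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, filename):
    """Return True if the file at `path` matches the published digest for `filename`."""
    return sha256_of(path) == PUBLISHED_SHA256[filename]
```

For example, `verify('datazets-1.1.0.tar.gz', 'datazets-1.1.0.tar.gz')` should return `True` for an unmodified copy of the source distribution saved in the current directory.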

