Datazets is a Python package for importing well-known example data sets.

Project description

datazets

  • datazets is a Python package for importing well-known example data sets.

Star this repo if you like it! ⭐️

pip install datazets

Import datazets

# Import library
import datazets as dz
# Import data set
df = dz.get('titanic')

Data sets:

| Dataset Name | Shape / Size | Type | Description |
|---|---|---|---|
| meta | (1472, 20) | Continuous, time | Stock price of Meta |
| bitcoin | (2522, 2) | Continuous, time | Bitcoin price history data for time series and price prediction |
| iris | (150, 3) | Continuous | Classic flower classification dataset with iris species measurements with coordinates |
| gas_prices | (6556, 2) | Mixed, time | Historical gas prices by region for trend analysis |
| ads | (10000, 10) | Discrete | Data on online ads, covering click-through rates and targeting information |
| sprinkler | (1000, 4) | Discrete | Synthetic dataset with binary variables for rain and sprinkler probability illustration |
| random_discrete | (1000, 5) | Discrete | Synthetic dataset with random discrete variables, useful for probability modeling |
| malicious_urls | (387588, 2) | Text | URLs labeled as malicious or benign, useful in cybersecurity |
| malicious_phish | (651191, 4) | Text | URLs labeled as malicious or benign, defacement, phishing, malware (cybersecurity) |
| stormofswords | (352, 3) | Network | Character data from A Storm of Swords, with relationships, traits, and alliance info |
| bigbang | (9, 3) | Network | Data on The Big Bang Theory episodes and characters |
| energy | (68, 3) | Network | Data on building energy consumption |
| auto_mpg | (392, 8) | Mixed | Data on cars with features for predicting miles per gallon |
| breast_cancer | (569, 30) | Mixed | Dataset for breast cancer diagnosis prediction using tumor cell features |
| cancer | (4674, 9) | Mixed | Cancer patient data for classification and prediction of diagnosis outcome with coordinates |
| census_income | (32561, 15) | Mixed | US Census data with various demographic and economic factors for income prediction |
| elections_rus | (94487, 23) | Mixed | Russian election data with demographic and political attributes |
| elections_usa | (24611, 8) | Mixed | US election data with demographic and political attributes |
| fifa | (128, 27) | Mixed | FIFA player stats including attributes like skill, position, country, and performance |
| marketing_retail | (999, 8) | Mixed | Retail customer data for behavior and segmentation analysis |
| predictive_maintenance | (10000, 14) | Mixed | Industrial equipment data for predictive maintenance |
| student | (649, 33) | Mixed | Data on student performance with socio-demographic and academic factors |
| surfspots | (9413, 4) | Mixed, latlon | Information on global surf spots, with details on location and wave characteristics |
| tips | (244, 7) | Mixed | Restaurant tipping data with variables on meal size, day, and tip amount |
| titanic | (891, 12) | Mixed | Titanic passenger data with demographic, class, and survival information |
| waterpump | (59400, 41) | Mixed | Water pump data with features for predicting functionality and maintenance needs |
| cat_and_dog | None | Image | Images of cats and dogs for classification and object recognition |
| digits | (1083, 65) | Image | Handwritten digit images (8x8 pixels) for recognition and classification |
| faces | (400, 4097) | Image | Images of faces used in facial recognition and feature analysis |
| flowers | None | Image | Various flower images for classification and image recognition |
| img_peaks1 | (930, 930, 3) | Image | Synthetic peak images for image processing and analysis |
| img_peaks2 | (125, 496, 3) | Image | Additional synthetic peak images for image processing |
| mnist | (1797, 65) | Image | MNIST-style handwritten digit images (8x8 pixels) for classification tasks |
| scenes | None | Image | Scene images for scene classification tasks |
| southern_nebula | None | Image | Images of the Southern Nebula, suitable for astronomical analysis |
| blobs | Custom | Continuous | Synthetic data of datapoints in blob shape |
| moons | Custom | Continuous | Synthetic data of datapoints in moon shape |
| circles | Custom | Continuous | Synthetic data of datapoints in circle shape |
| anisotropic | Custom | Continuous | Synthetic data of datapoints with anisotropic shape |
| globular | Custom | Continuous | Synthetic data of datapoints with globular shape |
| uniform | Custom | Continuous | Synthetic data with uniform shape |
| densities | Custom | Continuous | Synthetic data with different densities |
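
The shapes listed in the table can be checked against what a loader actually returns. The following is a minimal sketch, not part of datazets: the helper names `collect_shapes` and `shape_mismatches` are hypothetical, and it assumes the loaded object exposes a `.shape` attribute (as a DataFrame would). The loader is passed in as an argument so the helpers can be tried without installing anything.

```python
def collect_shapes(names, loader):
    """Load each named data set and record its shape (None if it has none)."""
    shapes = {}
    for name in names:
        data = loader(name)
        shapes[name] = getattr(data, "shape", None)
    return shapes


def shape_mismatches(expected, loader):
    """Return {name: (expected, actual)} for data sets whose shape differs."""
    actual = collect_shapes(expected, loader)
    return {
        name: (expected[name], actual[name])
        for name in expected
        if actual[name] != expected[name]
    }


# Shapes copied from the table above.
EXPECTED = {"titanic": (891, 12), "tips": (244, 7), "auto_mpg": (392, 8)}

# Usage (requires datazets to be installed and may download data):
#   import datazets as dz
#   print(shape_mismatches(EXPECTED, dz.get))
```

An empty result would mean the loaded shapes agree with the table; any entry shows the listed shape next to the one actually returned.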

Example:

# Import library
import datazets as dz

# Import a data set by name
df = dz.get(data='titanic')

# Import a data set from a url
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data'
df = dz.get(url=url, sep=',')
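
To illustrate what a `sep=','` argument implies about the file layout, here is a small standard-library sketch that splits comma-separated, header-less rows. It does not call datazets and makes no claim about how `dz.get` parses the file internally; the `parse_rows` helper and the sample rows are made up for illustration.

```python
import csv
import io

# Made-up rows in a comma-separated, header-less layout (not real records).
SAMPLE = "25, Private, 40\n38, Self-emp, 50\n"


def parse_rows(text, sep=","):
    """Split delimited text into rows of fields, ignoring spaces after the separator."""
    reader = csv.reader(io.StringIO(text), delimiter=sep, skipinitialspace=True)
    # Drop empty rows produced by blank lines.
    return [row for row in reader if row]


rows = parse_rows(SAMPLE, sep=",")
print(rows)
```

Passing a different `sep` (for example `';'`) to the same helper handles files that use another delimiter.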

Maintainer

Contribute

  • All kinds of contributions are welcome!
  • If you wish to buy me a coffee for this work, it is much appreciated :)

Licence

See LICENSE for details.

Download files

Source Distribution

datazets-1.1.3.tar.gz (192.6 kB)

Uploaded Source

Built Distribution

datazets-1.1.3-py3-none-any.whl (190.4 kB)

Uploaded Python 3

File details

Details for the file datazets-1.1.3.tar.gz.

File metadata

  • Download URL: datazets-1.1.3.tar.gz
  • Upload date:
  • Size: 192.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.11

File hashes

Hashes for datazets-1.1.3.tar.gz
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0efb6946c2f1c2ab15f6db628648f7a31bad974ec6299aa7ff4f2ae51b220435 |
| MD5 | f0fc23dc5a9fc56a4f83d310702aef4d |
| BLAKE2b-256 | 25938b56a7153b8f9935b9a5d9e3a662a7e5ccfbd88a5ac4c63d36995e75c97d |

File details

Details for the file datazets-1.1.3-py3-none-any.whl.

File metadata

  • Download URL: datazets-1.1.3-py3-none-any.whl
  • Upload date:
  • Size: 190.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.11

File hashes

Hashes for datazets-1.1.3-py3-none-any.whl
| Algorithm | Hash digest |
|---|---|
| SHA256 | 906018dfd4b3d863607279e3dcd46d16ba47fc6a55a838257b2fd92b674801ee |
| MD5 | 0e8dc9eca5c00e267c4c2342f267c1ca |
| BLAKE2b-256 | 16bd5420a5daf4bdf02271776fcaea5ca47935868a75c0cfc9b016cf189c6564 |
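
A downloaded file can be compared against the SHA256 digests listed above. This is a minimal sketch using only the standard library; the helper names `sha256_of` and `matches_published_hash` are hypothetical, and it assumes the file keeps the name under which it is listed here.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the "File hashes" listings above.
EXPECTED_SHA256 = {
    "datazets-1.1.3.tar.gz":
        "0efb6946c2f1c2ab15f6db628648f7a31bad974ec6299aa7ff4f2ae51b220435",
    "datazets-1.1.3-py3-none-any.whl":
        "906018dfd4b3d863607279e3dcd46d16ba47fc6a55a838257b2fd92b674801ee",
}


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path):
    """True only if the file name is listed and its digest equals the listed one."""
    expected = EXPECTED_SHA256.get(Path(path).name)
    return expected is not None and sha256_of(path) == expected


# Usage, assuming the archive was downloaded to the current directory:
#   print(matches_published_hash("datazets-1.1.3.tar.gz"))
```

pip can also enforce hashes at install time through its hash-checking mode (`--require-hashes` with a requirements file), which avoids a manual comparison.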
