Easy access to commonly used datasets in pandas DataFrame format
DataSetLib
Introduction
A simple library that provides access to some frequently used datasets as pandas DataFrames.
Getting Started
There are two main functions in the module:
get_datasets()
returns a list with all the available datasets
get_dataset()
returns a specific dataset identified by a name
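The two functions combine naturally: one supplies the names, the other loads each name. The sketch below collects the shape of every dataset. The helper name dataset_shapes is hypothetical and not part of datasetlib; it takes the two functions as arguments so that it does not depend on how the library is imported, and it assumes only that each loaded object has a .shape attribute, as pandas DataFrames do.

```python
from typing import Callable, Dict, Iterable, Tuple


def dataset_shapes(
    list_names: Callable[[], Iterable[str]],
    load: Callable[[str], object],
) -> Dict[str, Tuple[int, ...]]:
    """Map each dataset name to the shape of the object that `load` returns.

    Hypothetical helper, not part of datasetlib.
    """
    shapes = {}
    for name in list_names():
        frame = load(name)
        # pandas DataFrames expose .shape as (rows, columns).
        shapes[name] = tuple(frame.shape)
    return shapes


# Intended use with the library (assumes datasetlib is installed):
#   import datasetlib as dsl
#   for name, shape in dataset_shapes(dsl.get_datasets, dsl.get_dataset).items():
#       print(name, shape)
```

Because every dataset is loaded in turn, this may take a moment for the larger ones.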
Usage of datasetlib
>>> import datasetlib as dsl
>>> dsl.get_datasets()
['aapl', 'amazon_reviews', 'avocado', 'babynames', 'bank_clients', 'bmw', 'canada_population',
'cancer', 'crypto_prices', 'crypto_returns', 'human_resources', 'project1_sales_data',
'project1_stores_data', 'sp500_prices', 'stock_prices', 'stocks', 'summergames',
'temperatures', 'titanic']
>>> dsl.get_dataset("titanic")
     survived  pclass     sex   age  sibsp  parch     fare  embarked  deck
0           0       3    male  22.0      1      0   7.2500         S   NaN
1           1       1  female  38.0      1      0  71.2833         C     C
2           1       3  female  26.0      0      0   7.9250         S   NaN
3           1       1  female  35.0      1      0  53.1000         S     C
4           0       3    male  35.0      0      0   8.0500         S   NaN
..        ...     ...     ...   ...    ...    ...      ...       ...   ...
886         0       2    male  27.0      0      0  13.0000         S   NaN
887         1       1  female  19.0      0      0  30.0000         S     B
888         0       3  female   NaN      1      2  23.4500         S   NaN
889         1       1    male  26.0      0      0  30.0000         C     C
890         0       3    male  32.0      0      0   7.7500         Q   NaN
[891 rows x 9 columns]
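The description above does not say what get_dataset() does when the name is not in the list. As a defensive sketch, the name can be checked against get_datasets() first so that a typo produces a helpful message. The helper load_checked is hypothetical and not part of datasetlib; the two library functions are passed in as arguments, and the suggestion logic uses difflib from the Python standard library.

```python
import difflib
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def load_checked(
    name: str,
    list_names: Callable[[], Iterable[str]],
    load: Callable[[str], T],
) -> T:
    """Load `name` only if it is listed; otherwise raise KeyError with hints.

    Hypothetical helper, not part of datasetlib.
    """
    available = list(list_names())
    if name not in available:
        # Suggest up to three similarly spelled dataset names.
        hints = difflib.get_close_matches(name, available, n=3)
        hint_text = f" Did you mean: {', '.join(hints)}?" if hints else ""
        raise KeyError(f"Unknown dataset {name!r}.{hint_text}")
    return load(name)


# Intended use with the library (assumes datasetlib is installed):
#   import datasetlib as dsl
#   titanic = load_checked("titanic", dsl.get_datasets, dsl.get_dataset)
```

Passing the functions in, rather than importing datasetlib inside the helper, keeps the check easy to try out with stand-in functions.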
Project Homepage
https://dev.azure.com/neuraldevelopment/datasetlib
Contribute
If you find a defect or want to suggest a new function, please send an email to neutro2@outlook.de
Download files
Source Distribution
datasetlib-0.2.5.1.tar.gz (17.4 MB)
Built Distribution
Hashes for datasetlib-0.2.5.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 437d10cbf507c25a03622c9a1972f4eb06ad977701456d84ce780d3faf712a75
MD5 | e6edea0ed1549dc59c788a1b9b466849
BLAKE2b-256 | 45df7b112efd01c08408a06b5b99bfd939dec466b5746520410610c3cb52d201