Giza Datasets

Welcome to the Giza Datasets repository. Here you can find a collection of datasets ready to be used for blockchain ML use cases. Familiarize yourself with the ease of using dataframes through our DatasetsLoader class.

For detailed information about each dataset provided by Giza, see our documentation. It includes usage examples for each dataset, each dataset's schema with a description of every field, the relationships between datasets, potential use cases, and more.

Enhanced Features

Explore the robust capabilities of the Giza Datasets repository:

  • Streamlined Dataset Access: Instantly connect to a curated collection of blockchain datasets, ready for machine learning applications, with no configuration needed.
  • Effortless Data Loading: Utilize the DatasetsLoader class to easily load Parquet files, streamlining your data workflow.
  • Optimized Data Handling: Leverage the integration with the polars library, designed for efficient manipulation of large datasets. For detailed guidance on using polars for dataset operations, refer to the polars documentation.

Quick Start

To get started with Giza Datasets, follow the steps below:

  1. Install the giza-datasets package if you haven't already:

    pip install giza-datasets
    
  2. Import the DatasetsLoader class and initialize it:

    from giza_datasets import DatasetsLoader
    loader = DatasetsLoader()
    
  3. Optional: Depending on your device's configuration, you may need to provide SSL certificates so that HTTPS connections can be verified. The following lines point the SSL_CERT_FILE environment variable at the certificate bundle shipped with the certifi package:

    import certifi
    import os
    os.environ['SSL_CERT_FILE'] = certifi.where()
    
  4. Load a dataset using the load method. For example, to load tvl-fee-per-protocol:

    df = loader.load('tvl-fee-per-protocol')
    
  5. To view the loaded dataset, simply print the dataframe:

    print(df)
    

Start exploring the datasets and building your machine learning models with ease!
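
The Quick Start steps can be wrapped in two small helpers when several datasets are needed at once. The following is a minimal sketch, not part of the giza-datasets API: the helper names configure_ssl_certificates and load_datasets are hypothetical, and the code assumes only that the loader exposes the load method shown above. Because the exceptions raised by the library are not documented here, the sketch catches Exception broadly and reports failures instead of stopping.

```python
import os


def configure_ssl_certificates() -> bool:
    """Point SSL_CERT_FILE at certifi's CA bundle, if certifi is installed.

    Returns True when the variable was set, False when certifi is missing.
    """
    try:
        import certifi
    except ImportError:
        return False
    os.environ["SSL_CERT_FILE"] = certifi.where()
    return True


def load_datasets(loader, names):
    """Load each named dataset by calling loader.load(name).

    Hypothetical helper: assumes only that `loader` has a `load` method.
    Returns (loaded, failed), where `loaded` maps name -> dataframe and
    `failed` maps name -> the exception raised while loading it.
    """
    loaded, failed = {}, {}
    for name in names:
        try:
            loaded[name] = loader.load(name)
        except Exception as exc:  # the library's exception types are an unknown here
            failed[name] = exc
    return loaded, failed


# Usage with the real library (requires giza-datasets and network access):
#   from giza_datasets import DatasetsLoader
#   configure_ssl_certificates()
#   loaded, failed = load_datasets(DatasetsLoader(), ["tvl-fee-per-protocol"])
#   print(loaded["tvl-fee-per-protocol"])
```

Passing the loader in as an argument keeps the helper independent of how DatasetsLoader is constructed and makes it easy to try out with a stand-in object.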

Datasets Hub

The DatasetsHub class provides methods to manage and access datasets. Here are some of its methods:

  • show(): Prints a table of all datasets in the hub.
  • list(): Returns a list of all datasets in the hub.
  • get(dataset_name): Returns a Dataset object with the given name.
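  • describe(dataset_name): Prints a table of details for the given dataset.
  • list_tags(): Prints a list of all tags in the hub.
  • get_by_tag(tag): Returns a list of Dataset objects with the given tag.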

To get started with the DatasetsHub class, follow the steps below:

  1. Import the DatasetsHub class and initialize it:
    from giza_datasets import DatasetsHub
    hub = DatasetsHub()
    
  2. Use the show method to print a table of all datasets in the hub:
    hub.show()
    
  3. Use the list method to get a list of all datasets in the hub:
    datasets = hub.list()
    print(datasets)
    
  4. Use the get method to get a Dataset object with a given name:
    dataset = hub.get('tvl-fee-per-protocol')
    print(dataset)
    
  5. Use the describe method to print a table of details for a given dataset:
    hub.describe('tvl-fee-per-protocol')
    
  6. Use the list_tags method to print a list of all tags in the hub:
    hub.list_tags()
    
  7. Use the get_by_tag method to get a list of Dataset objects with a given tag:
    hub.get_by_tag('Liquidity')
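
When browsing the hub by topic, it can help to collect the results for several tags in one pass. The sketch below is illustrative only: datasets_by_tags is a hypothetical helper, not a DatasetsHub method, and it assumes that get_by_tag returns an iterable of Dataset objects as described in step 7.

```python
def datasets_by_tags(hub, tags):
    """Return a dict mapping each tag to the datasets found for it.

    Hypothetical helper: assumes `hub.get_by_tag(tag)` returns an iterable
    of Dataset objects. Duplicate tags are queried only once, and the
    original tag order is preserved.
    """
    unique_tags = list(dict.fromkeys(tags))  # de-duplicate, keep order
    return {tag: list(hub.get_by_tag(tag)) for tag in unique_tags}


# Usage with the real library (requires giza-datasets and network access):
#   from giza_datasets import DatasetsHub
#   by_tag = datasets_by_tags(DatasetsHub(), ["Liquidity"])
#   for tag, datasets in by_tag.items():
#       print(tag, len(datasets))
```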
    

Contributing

We welcome contributions to the Giza Datasets repository. If you have suggestions for improvements or new features, feel free to open an issue or submit a pull request.

License

This project is licensed under the MIT License - see the LICENSE file for details.
