A dataset module for your projects

Project description

DataNexus

DataNexus is a simple-to-use Python module for fetching transcripts, datasets, and similar resources in your projects. It can also extract a single character's lines from a transcript, which makes it easier to prepare data for tasks such as fine-tuning a GPT-2 model.

Key features

  • Download datasets and transcripts
  • Extract character lines from transcripts

Installation

To get started:

pip install datanexus

Usage

⚠️ | A link to the full documentation will be added in the future. The code is still in testing and may be unstable!

Downloading of Datasets/Transcripts

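# Assumption: the package exposes a `datanexus` class, as the constructor call below implies.
# The original import line named `download_dataset_raw` and `download_dataset` instead.
from datanexus import datanexus

nexus = datanexus('Models/')  # Insert the directory that you would like to use

model = nexus.download_dataset(model='ironman')  # Choose a model
print(model)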

Extracting Characters from Transcripts

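# Assumption: the package exposes a `datanexus` class, as the constructor call below implies.
# The original import line named `save_character` instead.
from datanexus import datanexus

nexus = datanexus('Models/')  # Insert the directory that you would like to use

# The character name includes the trailing colon used as the speaker prefix in the transcript
character = nexus.save_character(output_dir='Models', character='JARVIS:')
print(character)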
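
To illustrate what extracting a character's lines involves, here is a minimal standalone sketch. It does not use DataNexus and is not its implementation: the function name `extract_character_lines` and the sample transcript are made up for this example, and it assumes each transcript line starts with a speaker prefix such as `JARVIS:`.

```python
def extract_character_lines(transcript: str, character: str) -> list[str]:
    """Return the lines spoken by `character`, with the speaker prefix removed.

    Hypothetical helper for illustration only; not part of DataNexus.
    """
    lines = []
    for raw_line in transcript.splitlines():
        stripped = raw_line.strip()
        # Keep only lines that begin with the speaker prefix, e.g. "JARVIS:"
        if stripped.startswith(character):
            spoken = stripped[len(character):].strip()
            if spoken:  # skip lines that have a prefix but no dialogue
                lines.append(spoken)
    return lines


# Invented sample transcript, one "SPEAKER: dialogue" entry per line
sample_transcript = """\
TONY: Are you there?
JARVIS: At your service, sir.
TONY: Run a diagnostic.
JARVIS: Diagnostic started.
"""

jarvis_lines = extract_character_lines(sample_transcript, "JARVIS:")
print(jarvis_lines)
```

The resulting list can be written to a text file, one line per entry, and used as training text for fine-tuning.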

Support

If you have any questions or run into any issues, feel free to create an issue on GitHub.

You can also join The Workshop Discord server and send me a ping (_Ethan_).

Download files

Source Distribution

datanexus-0.0.3.tar.gz (2.7 kB)

Uploaded Source

File details

Details for the file datanexus-0.0.3.tar.gz.

File metadata

  • Download URL: datanexus-0.0.3.tar.gz
  • Upload date:
  • Size: 2.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.5

File hashes

Hashes for datanexus-0.0.3.tar.gz

Algorithm     Hash digest
SHA256        52d8f4275834c08617c8d52c1a2a64ef7c0cea47ee5b421fb54384ca6a8143af
MD5           8d40ac3f0c10f1b31477b6f8919501a9
BLAKE2b-256   d1ffb67273b5c5b4e5c60a2a1084bb2d055bb69c172225ced1129f3153c577c2
