A repository of dataset utilities. For TensorFlow 2.x only!

Project description

datasets

A repository of dataset utilities. Requires tensorflow>=2.0.0b.

Requirements

  • Python 3.6
  • tensorflow>=2.0.0b

Installation

pip install nlp-datasets

Usage

For NMT task

# Features and labels stored in the same files
from nlp_datasets import NMTSameFileDataset

o = NMTSameFileDataset(config=None, logger_name=None)
train_files = []  # paths to your training files
# train_dataset is an instance of tf.data.Dataset
train_dataset = o.build_train_dataset(train_files)

# Features and labels stored in separate files
from nlp_datasets import NMTSeparateFileDataset

o = NMTSeparateFileDataset(config=None, logger_name=None)
feature_files = []  # paths to your feature files
label_files = []  # paths to your label files
train_dataset = o.build_train_dataset(feature_files, label_files)
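
Since the separate-file variant takes two lists, the feature and label files presumably need to line up position by position. Below is a minimal sketch, using only the standard library, of one way to collect such lists from a directory. The directory layout, the `.src`/`.tgt` suffixes, and the helper name `collect_parallel_files` are assumptions made for illustration; none of them come from nlp_datasets.

```python
from pathlib import Path


def collect_parallel_files(data_dir, feature_suffix=".src", label_suffix=".tgt"):
    """Return (feature_files, label_files) as sorted, position-aligned lists.

    Hypothetical helper: pairs files that share a name up to the suffix,
    e.g. part-0.src with part-0.tgt. The suffixes are assumed, not required
    by nlp_datasets.
    """
    data_dir = Path(data_dir)
    feature_files, label_files = [], []
    # Sorting makes the order deterministic across runs and platforms.
    for feature_path in sorted(data_dir.glob("*" + feature_suffix)):
        stem = feature_path.name[: -len(feature_suffix)]
        label_path = data_dir / (stem + label_suffix)
        if not label_path.exists():
            raise FileNotFoundError("missing label file for " + str(feature_path))
        feature_files.append(str(feature_path))
        label_files.append(str(label_path))
    return feature_files, label_files
```

The two returned lists could then be passed to `o.build_train_dataset(feature_files, label_files)` as in the snippet above.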

For DSSM task

# Queries, docs and labels stored in the same files
from nlp_datasets import DSSMSameFileDataset

o = DSSMSameFileDataset(config=None, logger_name=None)
train_dataset = o.build_train_dataset(train_files=[])

# Queries, docs and labels stored in separate files
from nlp_datasets import DSSMSeparateFileDataset

o = DSSMSeparateFileDataset(config=None, logger_name=None)
query_files = []  # paths to your query files
doc_files = []  # paths to your doc files
label_files = []  # paths to your label files
train_dataset = o.build_train_dataset(query_files, doc_files, label_files)
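
With three separate file lists, a mismatch in line counts is an easy mistake to make. The sketch below checks that the files at each position have the same number of lines before the dataset is built. It assumes one example per line and line-aligned files, which this page does not confirm; `count_lines` and `check_aligned` are hypothetical helpers, not part of nlp_datasets.

```python
def count_lines(path):
    """Count lines in a text file without loading it all into memory."""
    with open(path, "r", encoding="utf-8") as f:
        return sum(1 for _ in f)


def check_aligned(*file_lists):
    """Raise ValueError unless all lists have equal length and the files
    at the same position have equal line counts (assumed format)."""
    if len({len(files) for files in file_lists}) > 1:
        raise ValueError("file lists differ in length")
    for group in zip(*file_lists):
        counts = [count_lines(path) for path in group]
        if len(set(counts)) != 1:
            raise ValueError(
                "line counts differ: {}".format(dict(zip(group, counts)))
            )
    return True
```

Calling `check_aligned(query_files, doc_files, label_files)` before `build_train_dataset` would surface a misaligned file early, under the stated assumptions.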

For MatchPyramid task

# Queries, docs and labels stored in the same files
from nlp_datasets import MatchPyramidSameFileDataset

o = MatchPyramidSameFileDataset(config=None, logger_name=None)
train_dataset = o.build_train_dataset(train_files=[])

# Queries, docs and labels stored in separate files
from nlp_datasets import MatchPyramidSeparateFilesDataset

o = MatchPyramidSeparateFilesDataset(config=None, logger_name=None)
query_files = []  # paths to your query files
doc_files = []  # paths to your doc files
label_files = []  # paths to your label files
train_dataset = o.build_train_dataset(query_files, doc_files, label_files)
