A dataset utils repository. For tensorflow 2.x only!
datasets
Utilities for building tf.data.Dataset input pipelines for several model types. Requires tensorflow>=2.0.0b.
Requirements
- Python 3.6
- tensorflow>=2.0.0b
Installation
pip install naivenmt-datasets
Contents
- Build datasets for seq2seq models: seq2seq_dataset.py
- Build datasets for NMT: nmt_dataset.py
- Build datasets for DSSM: dssm_dataset.py
- Build datasets for MatchPyramid: matchpyramid_dataset.py
Usage
For NMT tasks
# Variant 1: features and labels come from the same files.
from datasets import NMTSameFileDataset
o = NMTSameFileDataset(config=None, logger_name=None)
train_files = []  # your files
# train_dataset is an instance of tf.data.Dataset
train_dataset = o.build_train_dataset(train_files)

# Variant 2: features and labels come from separate files.
from datasets import NMTSeparateFileDataset
o = NMTSeparateFileDataset(config=None, logger_name=None)
feature_files = []  # your files
label_files = []  # your files
train_dataset = o.build_train_dataset(feature_files, label_files)
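Judging by their names, the two classes above differ in where each training pair is stored: in one file, or split across two parallel files. This README does not document the file format, so the plain-Python sketch below only illustrates the two layouts and is not the package's implementation. The `@` separator and the helpers `pairs_from_same_file`, `pairs_from_separate_files` and `_write` are assumptions made for illustration.

```python
import os
import tempfile


def pairs_from_same_file(path, sep="@"):
    """Yield (feature, label) pairs from a file that holds both on each line.

    The separator is a hypothetical choice, not taken from the package.
    """
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if sep not in line:
                continue  # skip blank or malformed lines
            feature, label = line.split(sep, 1)
            yield feature.strip(), label.strip()


def pairs_from_separate_files(feature_path, label_path):
    """Yield (feature, label) pairs from two line-aligned files."""
    with open(feature_path, encoding="utf-8") as ff, \
            open(label_path, encoding="utf-8") as lf:
        for feature, label in zip(ff, lf):
            yield feature.strip(), label.strip()


def _write(lines):
    """Write lines to a temporary text file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt", text=True)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
    return path


# The same two sentence pairs, stored in both layouts.
same = _write(["hello world@hallo welt", "good morning@guten morgen"])
src = _write(["hello world", "good morning"])
tgt = _write(["hallo welt", "guten morgen"])

same_pairs = list(pairs_from_same_file(same))
separate_pairs = list(pairs_from_separate_files(src, tgt))

# Remove the temporary files again.
for p in (same, src, tgt):
    os.remove(p)
```

Both layouts carry the same information. The separate-file layout only works if line i of the feature file corresponds to line i of the label file.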
For DSSM tasks
# Variant 1: queries, docs and labels come from the same files.
from datasets import DSSMSameFileDataset
o = DSSMSameFileDataset(config=None, logger_name=None)
train_dataset = o.build_train_dataset(train_files=[])

# Variant 2: queries, docs and labels come from separate files.
from datasets import DSSMSeparateFileDataset
o = DSSMSeparateFileDataset(config=None, logger_name=None)
query_files = []  # your files
doc_files = []  # your files
label_files = []  # your files
train_dataset = o.build_train_dataset(query_files, doc_files, label_files)
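With three separate inputs, a missing line in any one file would presumably shift every later example. A quick alignment check before building the dataset can catch this. The sketch below is plain Python and independent of the package. The helper `read_triples` and the integer labels are assumptions for illustration.

```python
import io
from itertools import zip_longest


def read_triples(query_lines, doc_lines, label_lines):
    """Zip three line-aligned iterables into (query, doc, label) triples.

    Raises ValueError if the inputs do not have the same number of lines.
    Labels are parsed as integers, which is an assumption of this sketch.
    """
    missing = object()  # sentinel marking an exhausted input
    triples = []
    rows = zip_longest(query_lines, doc_lines, label_lines, fillvalue=missing)
    for i, (query, doc, label) in enumerate(rows, start=1):
        if query is missing or doc is missing or label is missing:
            raise ValueError("inputs are not line-aligned at line %d" % i)
        triples.append((query.strip(), doc.strip(), int(label)))
    return triples


# In-memory stand-ins for a query file, a doc file and a label file.
queries = io.StringIO("cheap flights\nweather today\n")
docs = io.StringIO("book low-cost airline tickets\nrecipe for apple pie\n")
labels = io.StringIO("1\n0\n")

triples = read_triples(queries, docs, labels)
```

Running the same check on real files only requires passing open file objects instead of the `io.StringIO` stand-ins.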
For MatchPyramid tasks
# Variant 1: queries, docs and labels come from the same files.
from datasets import MatchPyramidSameFileDataset
o = MatchPyramidSameFileDataset(config=None, logger_name=None)
train_dataset = o.build_train_dataset(train_files=[])

# Variant 2: queries, docs and labels come from separate files.
from datasets import MatchPyramidSeparateFilesDataset
o = MatchPyramidSeparateFilesDataset(config=None, logger_name=None)
query_files = []  # your files
doc_files = []  # your files
label_files = []  # your files
train_dataset = o.build_train_dataset(query_files, doc_files, label_files)
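The snippets above leave the file lists empty. One possible way to fill them is to glob a data directory and sort the results, so that shard i in each list refers to the same slice of the data. The naming scheme `<prefix>.<shard>.txt` and the helper `collect_shards` are hypothetical. Whether the package pairs files by list position is also an assumption of this sketch.

```python
import glob
import os
import tempfile


def collect_shards(data_dir, prefix):
    """Return sorted paths matching <data_dir>/<prefix>.*.txt (hypothetical naming)."""
    return sorted(glob.glob(os.path.join(data_dir, prefix + ".*.txt")))


# Build a throwaway directory with empty shard files so the sketch runs anywhere.
demo_dir = tempfile.mkdtemp()
for prefix in ("query", "doc", "label"):
    for shard in ("001", "000"):  # created out of order on purpose
        open(os.path.join(demo_dir, "%s.%s.txt" % (prefix, shard)), "w").close()

query_files = collect_shards(demo_dir, "query")
doc_files = collect_shards(demo_dir, "doc")
label_files = collect_shards(demo_dir, "label")

# Extract the shard id of every path to confirm the three lists line up.
shard_ids = [
    [os.path.basename(path).split(".")[1] for path in files]
    for files in (query_files, doc_files, label_files)
]
```

The resulting lists have the same shape as the `query_files`, `doc_files` and `label_files` arguments passed to `build_train_dataset` in the usage snippet.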