datasets for easy machine learning use
datasets
- Free software: Apache Software License 2.0
- Documentation: https://ml-dataset.readthedocs.io.
Datasets API
- DataSchema
- generate_dataset()
raw_dataset
RawDataset parses the content of each individual line of data and converts the result into a Dataset.
Example:
import tensorflow as tf
# DataSchema and RawDataset are provided by this package (import path not shown here).

token_dicts = None
# One variable-length int32 field named 'query'; its length is returned as well.
data_field_list = []
data_field_list.append(DataSchema(name='query', processor='to_np', type=tf.int32,
                                  dtype='int32', shape=(None,), is_with_len=True))
# A single float32 label per example.
label_field = DataSchema(name='label', processor='to_np',
                         type=tf.float32, dtype='float32', shape=(1,), is_with_len=False)
generator = RawDataset(file_path="tests/data/raw_datasets", token_dicts=token_dicts,
                       data_field_list=data_field_list, label_field=label_field,
                       file_suffix='varnum.input')
dataset = generator.generate_dataset(batch_size=4, is_training=True)
for batch_num, (x, label) in enumerate(dataset):
    pass
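The `query` field above is declared with `shape=(None,)` and `is_with_len=True`, which suggests variable-length rows batched together with their lengths. The following is a minimal pure-Python sketch of that idea, assuming rows are padded to the longest row in the batch and the original lengths are returned alongside. The helper `pad_batch` and the pad value `0` are hypothetical and are not part of this package.

```python
from typing import List, Tuple


def pad_batch(rows: List[List[int]], pad_value: int = 0) -> Tuple[List[List[int]], List[int]]:
    """Pad every row to the longest row's length and return the original lengths.

    Hypothetical helper for illustration only; not this package's API.
    """
    lengths = [len(row) for row in rows]
    max_len = max(lengths, default=0)
    # Append pad_value to each row until it reaches max_len.
    padded = [row + [pad_value] * (max_len - len(row)) for row in rows]
    return padded, lengths


batch, lengths = pad_batch([[3, 1, 4], [1, 5], [9]])
print(batch)
print(lengths)
```

Keeping the lengths next to the padded values lets a model mask out the padding later; whether RawDataset does exactly this is not stated here.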
seq_dataset
SeqDataset parses data in which sequences are laid out line by line and converts the result into a Dataset.
Example: TODO
pairwise_dataset
Example: TODO
listwise_dataset
Example: TODO
data_processor_dicts
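data_processor_dicts is a dictionary of data processing functions. It contains many functions that perform feature extraction, data conversion, and similar processing for different data types (e.g. text, speech, images, numeric values). Their output is ultimately converted into a Dataset, which is fed to the model for training, prediction, and other operations.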
Example: TODO
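As a rough illustration of the "dictionary of processing functions" idea, the sketch below maps processor names to plain functions and looks them up by name. Everything in it is an assumption for illustration: the registry `PROCESSORS`, the names `to_int_list` and `to_float_list`, and the helper `process_field` are hypothetical and do not reflect the actual contents of data_processor_dicts.

```python
from typing import Callable, Dict, List


def to_int_list(raw: str) -> List[int]:
    """Convert whitespace-separated integers into a list of ints."""
    return [int(tok) for tok in raw.split()]


def to_float_list(raw: str) -> List[float]:
    """Convert whitespace-separated numbers into a list of floats."""
    return [float(tok) for tok in raw.split()]


# Hypothetical registry: processor name -> function applied to the raw field text.
PROCESSORS: Dict[str, Callable[[str], list]] = {
    "to_int_list": to_int_list,
    "to_float_list": to_float_list,
}


def process_field(processor_name: str, raw: str) -> list:
    """Look up a processor by name and apply it to one raw field."""
    try:
        processor = PROCESSORS[processor_name]
    except KeyError:
        raise ValueError(f"unknown processor: {processor_name!r}") from None
    return processor(raw)


print(process_field("to_int_list", "1 2 3"))
print(process_field("to_float_list", "0.5 2"))
```

Selecting a processor by a string name is what lets a schema entry such as `processor='to_np'` stay declarative; new data types would then only need a new dictionary entry.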
example data
- tests/data/raw_datasets/query_float.input (tab-separated format: id\tlabel\tquery\tfloats)
- tests/data/raw_datasets/varnum.input (tab-separated format: id\tlabel\tnums)
- tests/data/pairwise_datasets/simple_pair.input
- tests/data/seq_datasets/simple_seq.input
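To make the `id\tlabel\tquery\tfloats` layout concrete, here is a minimal sketch of splitting one such line into named fields. It assumes that the values inside `query` and `floats` are separated by spaces, that `query` holds integer token IDs, and that `label` is a number; none of this is confirmed by the files listed above. The function `parse_line` and the sample line are hypothetical.

```python
from typing import Dict, List, Union

# Column order taken from the format description: id\tlabel\tquery\tfloats
FIELDS = ["id", "label", "query", "floats"]

Record = Dict[str, Union[str, float, List[int], List[float]]]


def parse_line(line: str) -> Record:
    """Split one tab-separated line into typed fields (hypothetical helper)."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(parts)}")
    record_id, label, query, floats = parts
    return {
        "id": record_id,
        "label": float(label),
        # Assumption: inner values are separated by spaces.
        "query": [int(tok) for tok in query.split()],
        "floats": [float(tok) for tok in floats.split()],
    }


record = parse_line("42\t1\t7 8 9\t0.5 1.5\n")
print(record)
```

Checking the column count up front surfaces malformed lines early instead of producing misaligned fields.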
TODO
- 1
- 2
- 3