Terry toolkit tkitDatasetEx: utilities that make building datasets more convenient.

Project description

BulidDataset

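Dataset preprocessing

Covers language modeling (lm), Siamese (paired-text), seq2seq, and text classification datasets. For examples, see the dataDome directory.

By default it uses the BERT tokenization scheme with a 21128-token vocabulary. To use your own tokenization, change the vocabulary scheme under config.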
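
As a rough illustration of how the four dataset types differ, the sketch below builds one record per task and serializes them as JSON lines. The function names, field names, and JSON-lines layout are all hypothetical and chosen for illustration only; they are not taken from tkitDatasetEx, whose actual formats are shown in the dataDome directory.

```python
import json


# All names and record layouts below are hypothetical, not the tkitDatasetEx API.
def lm_example(text):
    """Language modeling: a single run of raw text."""
    return {"task": "lm", "text": text}


def siamese_example(text_a, text_b, label):
    """Siamese (paired-text): two texts plus a similarity label."""
    return {"task": "siamese", "text_a": text_a, "text_b": text_b, "label": label}


def seq2seq_example(source, target):
    """Seq2seq: an input sequence and the output sequence to generate."""
    return {"task": "seq2seq", "source": source, "target": target}


def classification_example(text, label):
    """Text classification: one text and its class label."""
    return {"task": "classification", "text": text, "label": label}


def to_jsonl(records):
    """Serialize records one per line, keeping non-ASCII text readable."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


records = [
    lm_example("今天天气很好"),
    siamese_example("今天天气很好", "今天是晴天", 1),
    seq2seq_example("今天天气很好", "The weather is nice today"),
    classification_example("今天天气很好", "weather"),
]
print(to_jsonl(records))
```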
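
To make the role of the vocabulary concrete, here is a minimal sketch of BERT-style WordPiece tokenization (greedy longest-match-first) against a vocabulary list. It assumes the vocabulary is a plain list with one token per line, as in a typical BERT vocab.txt; the toy vocabulary and the function names are illustrative assumptions, not the tkitDatasetEx API. It also splits on whitespace only, whereas real BERT tokenizers additionally split punctuation and individual CJK characters.

```python
# Toy stand-in for a vocabulary file (one token per line); a real BERT
# Chinese vocabulary would have 21128 entries. Hypothetical contents.
TOY_VOCAB_LINES = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "data", "##set", "build"]


def load_vocab(lines):
    """Map each token to its index in order of first appearance."""
    vocab = {}
    for line in lines:
        token = line.strip()
        if token and token not in vocab:
            vocab[token] = len(vocab)
    return vocab


def wordpiece(word, vocab, unk="[UNK]"):
    """Split one word into the longest matching vocabulary pieces."""
    pieces, start = [], 0
    while start < len(word):
        end, match = len(word), None
        while end > start:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # continuation pieces carry a ## prefix
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            return [unk]  # no piece fits: the whole word becomes unknown
        pieces.append(match)
        start = end
    return pieces


def encode(text, vocab):
    """Return (tokens, ids) with [CLS] and [SEP] added around the text."""
    tokens = ["[CLS]"]
    for word in text.split():
        tokens.extend(wordpiece(word, vocab))
    tokens.append("[SEP]")
    return tokens, [vocab[t] for t in tokens]


vocab = load_vocab(TOY_VOCAB_LINES)
tokens, ids = encode("build dataset xyz", vocab)
print(tokens)
print(ids)
```

Because token ids are positions in the vocabulary, swapping in a different vocabulary changes both how words are split and which ids they map to, so a dataset built with one vocabulary should be used with a model trained on the same one.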

Getting Started

Download links:

SSH clone URL: ssh://git@git.jetbrains.space/terrychanorg/yuxunlianlm-bert/BulidDataset.git

HTTPS clone URL: https://git.jetbrains.space/terrychanorg/yuxunlianlm-bert/BulidDataset.git

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

What things you need to install the software and how to install them.

Examples

Deployment

Add additional notes about how to deploy this on a production system.

Resources

Add links to external resources for this project, such as CI server, bug tracker, etc.

About tkitDatasetEx

A collection of assorted functions provided by tkitDatasetEx.

Download files

Source Distribution

tkitDatasetEx-0.0.0.116399031.tar.gz (4.1 kB)

Uploaded Source

Built Distribution

tkitDatasetEx-0.0.0.116399031-py3-none-any.whl (4.1 kB)

Uploaded Python 3
