baselibs
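# General-purpose base library
Version: v0.1.5
- Modified the splitset method; it can now be used to split datasets
- Added the split_dataframe method, which splits a DataFrame into datasets
- Added stratified sampling methods: data_split, save_data_split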
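
As a rough illustration of what stratified splitting means, here is a minimal standard-library sketch. It is not the baselibs implementation: the function name `stratified_split` and its parameters (`label_of`, `test_ratio`, `seed`) are hypothetical, and the actual signatures of data_split and save_data_split may differ.

```python
import random
from collections import defaultdict


def stratified_split(records, label_of, test_ratio=0.2, seed=0):
    """Split records into (train, test), sampling each label group separately.

    Hypothetical helper for illustration only; not part of baselibs.
    """
    rng = random.Random(seed)

    # Group the records by label so every class is sampled on its own.
    groups = defaultdict(list)
    for rec in records:
        groups[label_of(rec)].append(rec)

    train, test = [], []
    for items in groups.values():
        items = items[:]          # copy so the caller's data is not shuffled
        rng.shuffle(items)
        n_test = round(len(items) * test_ratio)
        test.extend(items[:n_test])
        train.extend(items[n_test:])
    return train, test


if __name__ == "__main__":
    # 25 "neg" and 75 "pos" samples; both classes keep the same ratio after the split.
    data = [("text %d" % i, "pos" if i % 4 else "neg") for i in range(100)]
    train, test = stratified_split(data, label_of=lambda r: r[1])
    print(len(train), len(test))
```

Splitting per label keeps the class proportions of the test set close to those of the full dataset, which a plain random split does not guarantee for small or imbalanced classes.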
Version: v0.1.4
- Modified the TimeCount class
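Version: v0.1.1
The following batch operations can be performed on the files in a directory:
- Remove spaces and blank lines, and break text into one sentence per line
- Delete empty files: once found, they are either renamed (to "original filename.del") or deleted outright
- Delete duplicate files: files are compared by their MD5 hash; duplicates are either renamed (to "original filename.same") or deleted outright
- Batch rename: files can be renamed by sequence number, starting from 1 by default; names are zero-padded automatically, e.g. "0001.txt"
- Count the number of lines in text files [added 2019/1/18]
- Check data
- Check for duplicate records and delete them
- Randomly sample data
- The processing options can be applied in a custom order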
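
The duplicate-file and batch-rename operations above can be sketched with the standard library alone. This is only an assumption about how such a tool might work, not the baselibs code: the names `file_md5`, `mark_duplicates` and `numbered_names` are hypothetical, and only the ".same" suffix, the start value of 1 and the four-digit zero padding come from the description above.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mark_duplicates(folder, suffix=".same", delete=False):
    """Keep the first file seen for each MD5; rename or delete later copies.

    Returns the original paths of the files that were renamed or deleted.
    Hypothetical helper for illustration only; not part of baselibs.
    """
    seen = set()
    affected = []
    # sorted() builds the full listing first, so renaming while looping is safe.
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        key = file_md5(path)
        if key not in seen:
            seen.add(key)
            continue
        if delete:
            path.unlink()
        else:
            path.rename(path.with_name(path.name + suffix))
        affected.append(path)
    return affected


def numbered_names(folder, start=1, width=4):
    """Plan a sequential rename: return (old_path, new_path) pairs, e.g. 0001.txt."""
    files = sorted(p for p in Path(folder).iterdir() if p.is_file())
    return [
        (p, p.with_name(f"{i:0{width}d}{p.suffix}"))
        for i, p in enumerate(files, start)
    ]
```

`numbered_names` only returns the rename plan instead of renaming immediately, so the caller can check for name collisions before applying it with `old.rename(new)`.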