offline_train_framework_in_diting_group
Project description
| Documentation |
|---|
| Offline training framework for the Diting group's recommender system |
RSLib main features
Faster Deployment (sql-as-backbone)
State-of-the-art Recurrent Model (transformer-xl etc.)
Distributed DL (horovod etc.)
Deep Learning Accelerator (tvm etc.)
Utility Classes (file2hdfs etc.)
Design rationale: ten questions and answers
Install
To install the current release:
$ pip install rslib
Demo
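dataframe2hive demo
Description: uploads a local dataframe to a Hive table (tab-separated, '\t') through the HDFS upload interface provided by the Luoge group. Because Hive does no type checking when data is imported (it does not support schema on write), we do not offer direct insertion into a partition of an existing table; a new table is created instead. Users need to manage the dataframe's column names carefully. Because of issues in the Luoge interface, uploads print error messages; this interface retries and reconnects on error, so uploads generally succeed. Uploading large files is not recommended, although it has been fairly stable in testing: a 1.3 GB file uploads within 10 minutes.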
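The description above mentions two mechanics: tab-separated serialization and retrying failed uploads. The sketch below illustrates both in isolation. It is not rslib's implementation: `to_hive_tsv`, `upload_with_retry`, and the `upload_fn` callback are hypothetical names introduced here, and the retry count and backoff are assumptions.

```python
import time

import pandas as pd


def to_hive_tsv(df: pd.DataFrame) -> str:
    """Serialize a DataFrame as tab-separated text with no header and no index."""
    return df.to_csv(sep="\t", header=False, index=False)


def upload_with_retry(upload_fn, payload, max_attempts=5, delay_seconds=0.0):
    """Call upload_fn(payload), retrying on any exception.

    upload_fn is a hypothetical stand-in for whatever performs the HDFS upload.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return upload_fn(payload)
        except Exception as exc:  # a real client would catch narrower errors
            last_error = exc
            time.sleep(delay_seconds * attempt)  # simple linear backoff
    raise RuntimeError(
        "upload failed after {} attempts".format(max_attempts)
    ) from last_error
```

Dropping the header row matters because a plain-text Hive table would otherwise read the column names as a data row; the column names only survive through the table definition, which is why they need to be managed on the dataframe side.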
Environment requirements (on top of the user_profile/basic image)
$ apt-get update && apt-get install -y krb5-user krb5-config libkrb5-dev
$ pip install requests-kerberos==0.12.0 hdfs==2.5.8 kerberos==1.3.0
$ pip install rslib
$ pip install -r requirements.txt  # custom path
$ kinit -kt code/data/up_recommend.keytab up_recommend #custom path
Example Python code
import pandas as pd
from rslib.utils import dataupload
df = pd.DataFrame({'bb': [1, 2, 3], 'c': [2, 2, 3], 'aa': ['4', '5', '6']})
table = 'up_nsh_tmp.diting_rslib_test_20191021'
dataupload.pandas2hive(df, table) #no partition
dataupload.pandas2hive(df, table, partition='2019-10-21') #add partition
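Since the upload creates a new table, the dataframe's column names and dtypes effectively define the Hive schema. As a rough illustration of what that implies, the sketch below derives a `CREATE TABLE` statement from a dataframe. This is an assumption about the general technique, not a description of what `pandas2hive` does internally; `hive_create_table_sql`, the dtype mapping, and the column-name rule are all hypothetical.

```python
import re

import pandas as pd

# Assumed mapping from NumPy dtype kind to Hive type; anything else becomes STRING.
_HIVE_TYPES = {"i": "BIGINT", "u": "BIGINT", "f": "DOUBLE", "b": "BOOLEAN"}
# Conservative rule for column names: letters, digits, underscore, no leading digit.
_VALID_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def hive_create_table_sql(df: pd.DataFrame, table: str) -> str:
    """Build a CREATE TABLE statement for a tab-delimited text table."""
    columns = []
    for name, dtype in df.dtypes.items():
        if not _VALID_NAME.match(str(name)):
            raise ValueError("column name not usable in Hive: {!r}".format(name))
        hive_type = _HIVE_TYPES.get(dtype.kind, "STRING")
        columns.append("`{}` {}".format(name, hive_type))
    return (
        "CREATE TABLE {} ({}) ".format(table, ", ".join(columns))
        + "ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t' STORED AS TEXTFILE"
    )


df = pd.DataFrame({'bb': [1, 2, 3], 'c': [2, 2, 3], 'aa': ['4', '5', '6']})
print(hive_create_table_sql(df, 'up_nsh_tmp.diting_rslib_test_20191021'))
```

Validating names before upload surfaces problems (spaces, leading digits) locally instead of as a failed Hive statement after the file has already been transferred.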
Download files
Source Distribution
rslib-1.6.2.tar.gz (90.9 kB)
File details
Details for the file rslib-1.6.2.tar.gz.
File metadata
- Download URL: rslib-1.6.2.tar.gz
- Upload date:
- Size: 90.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.15.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.5.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d6fc7aad313bc3ba8db81c28c1c65b159acdff26b2dd0ad7310d3225769e1701 |
| MD5 | b535f05099e51cf0d826980eafa1131a |
| BLAKE2b-256 | a3281a3af1f0c38a15bc001916323386843a91ad4470918ec14f93866912fd88 |