A tool for transforming conversational data to a unified format
Convector: Conversational Data Transformation Simplified
Convector standardizes conversational datasets, making data preparation for NLP tasks efficient and straightforward.
Installation
Install with pip:
pip install convector
Or install from source:
git clone https://github.com/teilomillet/convector && cd convector && pip install .
First Use
Running Convector for the first time creates config.yaml in your directory. This file stores your transformation profiles.
Using Convector
Transform data files with:
convector process <file_path> [OPTIONS]
Use --help for more information and a closer look at the available options.
Profiles
Use -p with a profile name to apply specific transformations. New or modified profiles auto-save to config.yaml.
Default Schema
Out-of-the-box schema structure:
{"instruction":"", "input":"", "output":""}
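To make the target shape concrete, here is a minimal Python sketch that maps one question/answer pair onto the three fields above. It is not Convector's implementation: the function name `to_default_schema` and the source keys `question`, `context`, and `answer` are hypothetical, chosen only for illustration.

```python
import json


def to_default_schema(row):
    """Map a raw record onto the instruction/input/output schema.

    The source keys ("question", "context", "answer") are hypothetical;
    real datasets will use their own column names.
    """
    return {
        "instruction": row.get("question", ""),
        "input": row.get("context", ""),    # empty when no context exists
        "output": row.get("answer", ""),
    }


raw = {"question": "What is the capital of France?", "answer": "Paris"}
record = to_default_schema(raw)
print(json.dumps(record))
```

Missing fields fall back to empty strings, so every record carries the same three keys.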
Examples
Process nested data, assuming conversational format, to JSON:
convector process data.csv -c -f output.json
Automatically create/update a profile with additional columns you want to keep:
convector process data.csv -p new_profile --add-cols 'col1,col2'
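Conceptually, keeping additional columns means the named fields travel with each record next to the schema fields. The sketch below illustrates that idea in plain Python. It does not call Convector, and the helper `keep_columns` and the sample column names are assumptions made for the example; how Convector actually lays out extra columns in its output may differ.

```python
def keep_columns(record, row, extra_cols):
    """Return a copy of `record` extended with selected columns from `row`.

    Hypothetical helper: columns missing from `row` are skipped.
    """
    merged = dict(record)  # copy so the original record is not mutated
    for col in extra_cols:
        if col in row:
            merged[col] = row[col]
    return merged


# Sample row with hypothetical column names.
row = {"question": "Hi", "answer": "Hello!", "col1": "en", "col2": 0.9, "col3": "x"}
record = {"instruction": row["question"], "input": "", "output": row["answer"]}

# Split a comma-separated list such as 'col1,col2' into column names.
result = keep_columns(record, row, "col1,col2".split(","))
print(result)
```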
Capabilities
- Standardize datasets for model training.
- Convert varied data formats to a unified structure.
- Automate data preparation processes.
Convector serves as a practical tool for data format standardization, offering a suite of options for custom data transformation needs.
File details
Details for the file convector-0.1.0.tar.gz.
File metadata
- Download URL: convector-0.1.0.tar.gz
- Upload date:
- Size: 32.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | 6c368480e51ba0108ec4b69142db8029344667ff34f4fb9a84f9fc87873a63a9
MD5 | d671adf74435aaa55fef63e98a4547d4
BLAKE2b-256 | 675157990b00c2ac688b897f92b226db92510241b2d50de616cae5f07e8c15a4
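A downloaded file can be checked against a published digest with Python's standard `hashlib` module. The following is a minimal sketch; the helper names `sha256_of_file` and `matches_published_hash` are illustrative and are not part of Convector.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published value."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

To use it, pass the path of the downloaded archive and the SHA256 value from the table; a result of False suggests the file is corrupted or is not the one that was published.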
File details
Details for the file convector-0.1.0-py3-none-any.whl.
File metadata
- Download URL: convector-0.1.0-py3-none-any.whl
- Upload date:
- Size: 40.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | cd08168f213cea9065c042226d11a816f37a225e4a23553287736eaf68e17c21
MD5 | 9ca24e66c034a2e02a3fd0e3eb62ebd2
BLAKE2b-256 | c6c3b33a7261a70cfb3b7dec533e62a024513660c1ffff25145d28aa293e1cba