A small tool to merge CSV/TSV files
Project description
combine_csv
Based on an idea from https://github.com/ekapope/Combine-CSV-files-in-the-folder/blob/master/Combine_CSVs.py, this small script focuses solely on merging CSV/TSV files, by combining either lines or columns.
Item | Project site |
---|---|
Source | https://github.com/gmtsciencedev/combine_csv |
Documentation | https://combine_csv.readthedocs.io/ |
Download | https://pypi.org/project/combine-csv/ |
Keywords | python, csv, merge, combine |
Basic usage
The tool can be used in either of two modes:
- line mode (default), which uses the lines of all input CSV files to create the lines of a merged CSV,
- column mode (flag -c), which uses all input CSV files to add new columns, with the first column of every file serving as the index.
Line mode
combine_csv -i '*.csv' -o my_merged_csv.csv
If the folder contains:
1.csv
name,age
Jean,23
Paul,12
2.csv
name,age,sex
Jane,19,female
John,74,male
It will create this file:
my_merged_csv.csv
name,age,sex
Jean,23,
Paul,12,
Jane,19,female
John,74,male
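The line-mode behaviour shown above can be approximated in a few lines of Python. This is only a sketch built on the standard csv module, not the tool's actual code: the function name merge_rows is hypothetical, and it assumes the output header is the union of all input headers in first-seen order, with missing fields left empty.

```python
import csv
import io


def merge_rows(sources):
    """Merge CSV texts line-wise (hypothetical helper, not part of combine_csv).

    Assumption: the output header is the union of all input headers,
    in the order in which the column names are first seen.
    """
    fieldnames = []
    rows = []
    for text in sources:
        reader = csv.DictReader(io.StringIO(text))
        # Collect any column name not seen in earlier files.
        for name in reader.fieldnames or []:
            if name not in fieldnames:
                fieldnames.append(name)
        rows.extend(reader)
    out = io.StringIO()
    # restval="" leaves an empty field where a file lacks a column.
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


first = "name,age\nJean,23\nPaul,12\n"
second = "name,age,sex\nJane,19,female\nJohn,74,male\n"
print(merge_rows([first, second]))
```

With the two example files above as input, the sketch reproduces the merged file shown in the example.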
Column mode
combine_csv -c -i '*.csv' -o my_merged_csv.csv
If the folder contains:
1.csv
task_id,name,desc
1,create,create a new object
2,delete,delete an object
2.csv
task_id,program
1,create.py
2,delete.py
3,random.py
It will create this file:
my_merged_csv.csv
task_id,name,desc,program
1,create,create a new object,create.py
2,delete,delete an object,delete.py
3,,,random.py
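The column-mode merge above amounts to an outer join on the first column. The following is a minimal sketch of that idea with the standard csv module, not the tool's implementation: merge_columns is a hypothetical name, and the handling of a column name that appears in several files (the later file overwrites the earlier value) is an assumption.

```python
import csv
import io


def merge_columns(sources):
    """Merge CSV texts column-wise, keyed on the first column of each file.

    Hypothetical helper, not part of combine_csv. Assumption: if two files
    share a non-index column name, the later file overwrites the value.
    """
    index_name = None
    columns = []   # non-index column names, in first-seen order
    records = {}   # index value -> {column name: value}
    for text in sources:
        reader = csv.reader(io.StringIO(text))
        header = next(reader, None)
        if header is None:
            continue  # empty input, nothing to merge
        if index_name is None:
            index_name = header[0]
        for name in header[1:]:
            if name not in columns:
                columns.append(name)
        for row in reader:
            if not row:
                continue  # skip blank lines
            records.setdefault(row[0], {}).update(zip(header[1:], row[1:]))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([index_name] + columns)
    for key, values in records.items():
        # Columns missing for this index value are written as empty fields.
        writer.writerow([key] + [values.get(c, "") for c in columns])
    return out.getvalue()


first = "task_id,name,desc\n1,create,create a new object\n2,delete,delete an object\n"
second = "task_id,program\n1,create.py\n2,delete.py\n3,random.py\n"
print(merge_columns([first, second]))
```

Index values that appear in only some files (task 3 here) still get a line, with the missing columns left empty.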
Main options
See combine_csv -h for all options. Here we point out the most convenient ones.
As you have seen, -i is the input selector, which takes a Python glob.glob pattern (protect it with single quotes, as in the examples above, to prevent shell interpretation), and -o gives the name of the output file (which defaults to combine.csv).
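To see which files a given pattern would select before running the tool, the same glob.glob call can be tried directly in Python. The helper below is a hypothetical illustration; sorting the result is an assumption made here for a predictable order, and the order in which the tool itself processes files is not stated in this document.

```python
import glob


def select_inputs(pattern):
    """Return the paths matched by a glob pattern (hypothetical helper).

    Assumption: results are sorted for a reproducible order; glob.glob
    itself returns matches in arbitrary order.
    """
    return sorted(glob.glob(pattern))


# Lists the CSV files of the current directory, like -i '*.csv' would match.
print(select_inputs("*.csv"))
```

Quoting the pattern on the command line matters because an unquoted *.csv would be expanded by the shell before the tool ever sees it.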
-s --separator: Change the default field separator from , to whatever you need. For TSV files, use \t (add single quotes around it to prevent backslash interpretation by the shell, e.g. -s '\t', or escape the backslash, e.g. -s \\t). This separator is used both to read the input files and to write the output file. You can choose a different output separator with the -t option, which behaves likewise.
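The relationship between the input and output separators can be pictured with the standard csv module. This is a sketch, not the tool's code: convert_separator and its parameters in_sep and out_sep are hypothetical names that mirror the roles of -s and -t, with the output separator falling back to the input one when it is not given.

```python
import csv
import io


def convert_separator(text, in_sep=",", out_sep=None):
    """Re-write delimited text with another separator (hypothetical helper).

    in_sep plays the role of -s, out_sep the role of -t; when out_sep is
    omitted, the input separator is reused for the output.
    """
    if out_sep is None:
        out_sep = in_sep
    reader = csv.reader(io.StringIO(text), delimiter=in_sep)
    out = io.StringIO()
    writer = csv.writer(out, delimiter=out_sep, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


# Read tab-separated text and write it back comma-separated.
print(convert_separator("name\tage\nJean\t23\n", in_sep="\t", out_sep=","))
```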
-a --addname: Add the name of the input files (without extension). In line mode, this adds a new column named source (the name can be changed with the --source-column option) containing the name of the source file. In column mode, this adds the names to the non-index columns, preceded by an underscore.
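For line mode, the effect of -a can be sketched as tagging every line with the stem of the file it came from. The helper below is hypothetical and only illustrates the idea: placing the source column last is an assumption, and the exact column naming used in column mode is not reproduced here.

```python
import csv
import io
from pathlib import Path


def add_source_column(path, text, source_column="source"):
    """Tag each line of a CSV text with its file name, extension removed.

    Hypothetical helper, not part of combine_csv. Assumption: the source
    column is appended after the existing columns.
    """
    stem = Path(path).stem  # "data/1.csv" -> "1"
    reader = csv.DictReader(io.StringIO(text))
    rows = [{**row, source_column: stem} for row in reader]
    fieldnames = list(reader.fieldnames or []) + [source_column]
    return fieldnames, rows


fields, rows = add_source_column("data/1.csv", "name,age\nJean,23\n")
print(fields)
print(rows)
```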