A small tool to merge CSV/TSV files

combine_csv

Based on an idea from https://github.com/ekapope/Combine-CSV-files-in-the-folder/blob/master/Combine_CSVs.py, this small script focuses solely on merging CSV/TSV files, by combining either lines or columns.

Item            Project site
Source          https://github.com/gmtsciencedev/combine_csv
Documentation   https://combine_csv.readthedocs.io/
Download        https://pypi.org/project/combine-csv/
Keywords        python, csv, merge, combine

Basic usage

The tool can be used either:

  • in line mode (default), which uses all the input CSV files to create new lines in a merged CSV,
  • or in column mode (using the -c flag), which uses all the input CSV files to add new columns, using the first column as an index in all files.

Line mode

combine_csv -i '*.csv' -o my_merged_csv.csv

Thus, if the folder contains:

1.csv

name,age
Jean,23
Paul,12

2.csv

name,age,sex
Jane,19,female
John,74,male

It will create this file: my_merged_csv.csv

name,age,sex
Jean,23,
Paul,12,
Jane,19,female
John,74,male
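
The line-mode behaviour shown above can be sketched in Python with the standard csv module. This is only an illustration of the idea, not the actual combine_csv implementation: the function name merge_lines is hypothetical, and it works on in-memory strings rather than on files so that it stays self-contained.

```python
import csv
import io


def merge_lines(tables):
    """Stack the rows of several CSV texts, using the union of their headers.

    Hypothetical helper for illustration; not part of combine_csv.
    """
    fieldnames = []  # union of all headers, in order of first appearance
    rows = []
    for text in tables:
        reader = csv.DictReader(io.StringIO(text))
        for name in reader.fieldnames or []:
            if name not in fieldnames:
                fieldnames.append(name)
        rows.extend(reader)

    out = io.StringIO()
    # restval="" leaves a cell empty when a row lacks that column.
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


one = "name,age\nJean,23\nPaul,12\n"
two = "name,age,sex\nJane,19,female\nJohn,74,male\n"
print(merge_lines([one, two]))
```

With the two example files as input, the sketch prints the same merged content as my_merged_csv.csv above.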

Column mode

combine_csv -c -i '*.csv' -o my_merged_csv.csv

Thus, if the folder contains:

1.csv

task_id,name,desc
1,create,create a new object
2,delete,delete an object

2.csv

task_id,program
1,create.py
2,delete.py
3,random.py

It will create this file: my_merged_csv.csv

task_id,name,desc,program
1,create,create a new object,create.py
2,delete,delete an object,delete.py
3,,,random.py
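
The column-mode behaviour shown above amounts to an outer join on the first column. A minimal Python sketch of that idea follows; the function name merge_columns is hypothetical, the inputs are in-memory strings, and the real tool may handle ordering or duplicate column names differently.

```python
import csv
import io


def merge_columns(tables):
    """Join several CSV texts on their first column, keeping every index value.

    Hypothetical helper for illustration; not part of combine_csv.
    """
    header = []   # merged header, index column first
    merged = {}   # index value -> {column name: cell value}
    for text in tables:
        reader = csv.reader(io.StringIO(text))
        columns = next(reader)
        if not header:
            header.append(columns[0])
        for name in columns[1:]:
            if name not in header:
                header.append(name)
        for row in reader:
            record = merged.setdefault(row[0], {})
            record.update(zip(columns[1:], row[1:]))

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for key, record in merged.items():
        # Columns missing for this index value are written as empty cells.
        writer.writerow([key] + [record.get(name, "") for name in header[1:]])
    return out.getvalue()


tasks = "task_id,name,desc\n1,create,create a new object\n2,delete,delete an object\n"
programs = "task_id,program\n1,create.py\n2,delete.py\n3,random.py\n"
print(merge_columns([tasks, programs]))
```

With the two example files as input, the sketch prints the same merged content as my_merged_csv.csv above, including the mostly empty row for task_id 3.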

Main options

Run combine_csv -h to see all options. Here we would like to point out the most convenient ones.

As you have seen, -i is the input selector, which takes a Python glob.glob pattern (protect it with single quotes, as in the examples above, to prevent shell interpretation), and -o gives the name of the output file (which defaults to combine.csv).

-s --separator : Change the default field separator from , to whatever you need. For TSV files, use \t (wrap it in single quotes or escape the backslash so that the shell does not interpret it, e.g. -s '\t' or -s \\t). This separator is used both to read the input files and to write the output file. You can choose a different output separator with the -t option, which behaves likewise.
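
To make the separator handling concrete, here is a small Python sketch. It rests on two assumptions that the text above does not confirm: that an argument such as '\t' reaches the program as the two characters backslash and t and must be decoded into a real tab, and that input and output separators map to the delimiter argument of the csv module. The names decode_separator and convert_separator are hypothetical.

```python
import csv
import io


def decode_separator(raw):
    # Assumption: the shell passes backslash + "t" literally, so the escape
    # has to be decoded here. Works for ASCII separators only.
    return raw.encode("ascii").decode("unicode_escape")


def convert_separator(text, in_sep=",", out_sep=None):
    """Re-write CSV text with another separator (defaults to the input one)."""
    out_sep = in_sep if out_sep is None else out_sep
    reader = csv.reader(io.StringIO(text), delimiter=in_sep)
    out = io.StringIO()
    csv.writer(out, delimiter=out_sep, lineterminator="\n").writerows(reader)
    return out.getvalue()


tab = decode_separator("\\t")  # what -s '\t' would hand to the program
print(convert_separator("name\tage\nJean\t23\n", in_sep=tab, out_sep=","))
```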

-a --addname : Add the name of the input files (without extension). In line mode, this adds a new column named source (whose name can be changed with the --source-column option) containing the name of the file each line comes from. In column mode, this adds the file name, preceded by an underscore, to the names of the non-index columns.
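
For line mode, the effect of -a can be sketched as tagging each row with the stem of its file name. The helper below is hypothetical and only illustrates the idea; where the real tool places the source column in the output is not stated above, so no ordering is implied.

```python
from pathlib import Path


def add_source(rows, filename, column="source"):
    """Return copies of the row dicts with the file name (no extension) added.

    Hypothetical helper for illustration; not part of combine_csv.
    """
    stem = Path(filename).stem  # "data/1.csv" -> "1"
    return [{**row, column: stem} for row in rows]


rows = [{"name": "Jean", "age": "23"}, {"name": "Paul", "age": "12"}]
print(add_source(rows, "data/1.csv"))
```

The column argument plays the role that --source-column plays on the command line.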
