
Aggregate CSV files


Agg


A Python library to aggregate files and data. This release supports merging two or more CSV files.

Documentation

merge_csv(files_to_merge: tuple,
          output_file: Union[str, pathlib.Path],
          first_line_is_header: Optional[bool] = None) -> dict:

The function merge_csv merges multiple CSV files in the order in which they are specified and writes the result to output_file. An existing file with the same name is overwritten.

Parameters:

  • files_to_merge: A tuple containing the paths of the files, in the order in which they are to be merged.
  • output_file: The path to the result file. The folder must already exist. An existing file with the same name will be overwritten.
  • first_line_is_header: If True, agg removes the first line of every CSV file except the first. If not set, agg guesses whether the first line is a header.
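
To make the header handling concrete, here is a minimal standard-library sketch of the behaviour described above. It is not agg's implementation: the name merge_csv_sketch is hypothetical, it assumes UTF-8 text files, it treats every physical line as one record, and it does not attempt the header guessing that agg performs when first_line_is_header is not set.

```python
from pathlib import Path
from typing import Sequence, Union

PathLike = Union[str, Path]


def merge_csv_sketch(files_to_merge: Sequence[PathLike],
                     output_file: PathLike,
                     first_line_is_header: bool) -> int:
    """Concatenate text files line by line; return the number of lines written.

    Hypothetical illustration only, not agg's code. If first_line_is_header
    is True, the first line of every file except the first is skipped.
    """
    lines_written = 0
    # Mode "w" replaces an existing output file, as the documentation describes.
    with open(output_file, "w", encoding="utf-8") as out:
        for index, name in enumerate(files_to_merge):
            with open(name, "r", encoding="utf-8") as src:
                for line_number, line in enumerate(src):
                    if first_line_is_header and index > 0 and line_number == 0:
                        continue  # drop the repeated header
                    # A file without a trailing newline must not run into
                    # the first line of the next file.
                    if not line.endswith("\n"):
                        line += "\n"
                    out.write(line)
                    lines_written += 1
    return lines_written
```

With two input files that each start with the same header line, passing True keeps that header once at the top of the result, while passing False keeps every line of every file.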

Its return value is a dictionary containing:

  • a SHA256 hash of the result file,
  • the name of the result file,
  • its absolute path,
  • a boolean indicating whether the first line is a header or not,
  • its size in bytes,
  • its number of lines (including the header),
  • a list of the files merged (absolute path).
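
The recorded size and hash make it possible to check the result file later, for example after copying it to another machine. The sketch below uses only hashlib and os together with the key names that appear in the example output on this page (file_path, file_size_bytes, sha256hash). The helper names sha256_of_file and matches_metadata are hypothetical and not part of agg, and the comparison assumes the hash is stored as a lowercase hex string, as in that example.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_metadata(result: dict) -> bool:
    """Compare a file with the size and hash stored in a result dictionary.

    The dictionary layout is assumed from the example output, not from
    agg's source code.
    """
    path = result["file_path"]
    # The cheap size check runs first; the hash is only computed if it passes.
    if os.path.getsize(path) != result["file_size_bytes"]:
        return False
    return sha256_of_file(path) == result["sha256hash"]
```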

Example

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import agg

# tuples are ordered:
my_files = ('file_01.csv', 'file_02.csv')

# Merge the CSV files, in the order specified by the tuple, into a new file
# called "merged_file". The header (first line) is copied only once, from
# the first file.
merged_file = agg.merge_csv(my_files, 'merged_file', True)
# The return value is a dictionary.


print(merged_file)

# {'sha256hash': 'fff30942d3d042c5128062d1a29b2c50494c3d1d033749a58268d2e687fc98c6',
#  'file_name': 'merged_file',
#  'file_path': '/home/exampleuser/merged_file',
#  'first_line_is_header': True,
#  'file_size_bytes': 76,
#  'line_count': 8,
#  'merged_files': ['/home/exampleuser/file_01.csv',
#                  '/home/exampleuser/file_02.csv']
# }

print(merged_file['file_path'])
# '/home/exampleuser/merged_file'
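
Because files_to_merge is order-sensitive and the output folder must already exist, it can help to prepare both before calling merge_csv. The following is a minimal sketch using pathlib. The helper names collect_csv_files and prepare_output and the paths in the usage comment are assumptions for illustration. File paths are converted to strings because the documentation does not state whether the tuple may contain pathlib.Path objects. The call to agg.merge_csv is left as a comment so that the snippet runs without the package installed.

```python
from pathlib import Path


def collect_csv_files(folder, pattern="*.csv"):
    """Return the matching files in a folder as a tuple of strings, sorted by name."""
    matches = sorted(Path(folder).glob(pattern), key=lambda p: p.name)
    return tuple(str(p) for p in matches)


def prepare_output(output_file):
    """Create the parent folder of output_file if it is missing; return the path."""
    output_path = Path(output_file)
    output_path.parent.mkdir(parents=True, exist_ok=True)
    return output_path


# Usage sketch (hypothetical paths):
# my_files = collect_csv_files("data")
# target = prepare_output("results/merged_file")
# merged_file = agg.merge_csv(my_files, target, True)
```

Sorting by file name gives a predictable merge order for names such as file_01.csv and file_02.csv. Note that this is a plain string sort, so file_10.csv would come before file_2.csv unless the numbers are zero-padded.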

Download files


Source Distribution

agg-0.3.1.tar.gz (4.5 kB)

Uploaded Source

Built Distribution

agg-0.3.1-py3-none-any.whl (9.2 kB)

Uploaded Python 3

File details

Details for the file agg-0.3.1.tar.gz.

File metadata

  • Download URL: agg-0.3.1.tar.gz
  • Upload date:
  • Size: 4.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.25.1 setuptools/45.2.0 requests-toolbelt/0.9.1 tqdm/4.54.1 CPython/3.8.5

File hashes

Hashes for agg-0.3.1.tar.gz
Algorithm Hash digest
SHA256 26968c11c67595da007ec717f7fb10d49462bf251ad7943d7be606d5f4f3370a
MD5 47258108bb22a684bd0204bcf5900bdf
BLAKE2b-256 55d36bea70f4de343683cfa523762816234a5e39a389bf063f0b7c1fee55db63


File details

Details for the file agg-0.3.1-py3-none-any.whl.

File metadata

  • Download URL: agg-0.3.1-py3-none-any.whl
  • Upload date:
  • Size: 9.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.25.1 setuptools/45.2.0 requests-toolbelt/0.9.1 tqdm/4.54.1 CPython/3.8.5

File hashes

Hashes for agg-0.3.1-py3-none-any.whl
Algorithm Hash digest
SHA256 6b71bdde34616940553463f9b39d70318fca412e41d720db621947305b0acef9
MD5 85077fb7d98032ff393549d3dfbd27b8
BLAKE2b-256 4cc5b78dc4ca7878b8542aa533e52a096e77dfa1b2eb22c4fe36377ec6a3f306

