Skip to main content

Aggregate files and data

Project description

Agg

Supported Python Versions Last commit pypi version

A Python library to aggregate files and data. This release supports merging two or more csv files.

Documentation

merge_csv(files_to_merge: tuple,
          output_file: Union[str, pathlib.Path],
          first_line_is_header: Optional[bool] = None) -> dict:

The method merge_csv merges multiple CSV files in the order they are specified. It will overwrite any existing file with the same name.

Parameters:

  • files_to_merge: A tuple containing paths to a files in the order they are to be merged.
  • output_file: The path to the result file. The folder must already exist. An existing file with the same name will be overwritten.
  • first_line_is_header: if True agg will remove the first line of all csv files except for the first. If not set agg will guess if the first line is a header or not.

Its return value is a dictionary containing:

  • a SHA256 hash of the result file,
  • its absolute path,
  • a boolean indicating whether the first line is a header or not,
  • its size in bytes,
  • its number of lines (including the header),
  • a list of the files merged (absolute path).

Example

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import agg

# tuples are ordered:
my_files = ('file_01.csv', 'file_02.csv')

# Merge the CSV-files - in the order specified by the tuple - into a new file
# called "merged_file". Meanwhile copy the header / first line only once from
# first file.
merged_file = agg.merge_csv(my_files, 'merged_file', True)
# The return value is a dictionary!


print(merged_file)

# {'sha256hash': 'fff30942d3d042c5128062d1a29b2c50494c3d1d033749a58268d2e687fc98c6',
#  'file_path': '/home/exampleuser/merged_file',
#  'first_line_is_header': True,
#  'file_size_bytes': 76,
#  'line_count': 8,
#  'merged_files': ['/home/exampleuser/file_01.csv',
#                  '/home/exampleuser/file_02.csv']
# }

print(merged_file['file_path'])
# '/home/exampleuser/merged_file'

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

agg-0.2.0.tar.gz (4.1 kB view hashes)

Uploaded Source

Built Distribution

agg-0.2.0-py3-none-any.whl (9.0 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page