Skip to main content

Group and split a csv or Excel into separate files based on data in a column.

Project description

Spreadsheet Column Splitter

Takes in a file with column data (csv or Excel), groups and filters the data based on a given splitter column, and splits the data in the input file into separate output files based on groupings via the splitter column.

For example, if you have sales data for fruit, and one of the columns is 'fruit name' and the options are "apple, orange, peach", then Spreadsheet Column Splitter will output three files, one with the apple data, one with the orange data, and one with the peach data.

If there are multiple worksheets, you can optionally provide a sheets_list to indicate which worksheets contain data that you would like to include in the split files. Spreadsheet Column Splitter will split the data in each worksheet into a corresponding worksheet in each output file. If you do not provide sheets_list, it will try to read every worksheet found and look for the splitter_column.

Installation

pip install spreadsheet-column-splitter

How to use:

from spreadsheet_column_splitter import splitter

splitter(
    input_file=STRING,
    output_folder=STRING,
    splitter_column=STRING,
    naming_suffix=STRING,
    sheets_list=LIST(optional),
    template_name=STRING(optional)
)

The first 4 parameters are required. sheets_list and template_name are optional.

Parameters:

  • input_file: the column data file that holds the data that will be split. Takes csv and excel formats.
  • output_folder: subfolder director name that will be created in the current working directory to hold the output files.
  • splitter_column: the column that will be used to group and split the data into filtered separate files.
  • naming_suffix: a suffix that will be added to the output files.
  • sheets_list: (optional) If you are using as an input_file an Excel with multiple sheets, but you do not wish to split all of them, add a sheets list that holds the names of the sheets that you wish to split from the Workbook.
  • template_name: (optional) If you want the data to be added to a formated Excel template file, include that filename/path here.

Excel files with multiple sheets

By default, Spreadsheet Column Splitter will look for table data and the splitter column in every sheet within a workbook. The splitter column must be named identical in each worksheet.

If you have worksheets that you do not want to be looked at for data to be split, use sheets_list to provide the names of the worksheets that you wish to split, omitting names of worksheets that you wish to skip.

Using an Excel template for output

An output template file should contain a header row in row 1 and be blank below that. You can format however you'd like.

Be careful: Cells that look empty in Excel are not always recognized as such by openpyxl. Thus when using a template, your data may be added to a row way at the bottom, past many rows that look empty but are not recognized as such.

To remedy this, delete all rows in an Excel file template below your header row. Then they will usually be recognized as empty.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

spreadsheet_column_splitter-0.1.2.tar.gz (4.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

spreadsheet_column_splitter-0.1.2-py3-none-any.whl (6.3 kB view details)

Uploaded Python 3

File details

Details for the file spreadsheet_column_splitter-0.1.2.tar.gz.

File metadata

File hashes

Hashes for spreadsheet_column_splitter-0.1.2.tar.gz
Algorithm Hash digest
SHA256 eb6bf87c8bf701c0e4bff8a0efa790133712fc2a5ffff4c102d8e8a94ec6931c
MD5 10ae3ee7821c7419f00019cc072a9640
BLAKE2b-256 b140e0ff555711c0a8a56257fb15babc1c4dcf06c82c69e48126dc6682df716a

See more details on using hashes here.

File details

Details for the file spreadsheet_column_splitter-0.1.2-py3-none-any.whl.

File metadata

File hashes

Hashes for spreadsheet_column_splitter-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 64df96561d9605949a77a8737a0b3aef2feb2debba7f0f4b044fcd4ba8420212
MD5 1e8d7fff9306d7a380947454940df14f
BLAKE2b-256 1795571957bba1763aa5447cc91608e7ed1373ff7dd46e592fea767088fe0154

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page