A package for reading and writing mixed width tables

Mixed Width

About

This project makes it easy to read and write fixed-width files whose columns have different widths. For example:

FOO     BAR     BAZ
Hello   World   SomeExtraLongString

In this case the columns have different widths, so we can't specify a single width for all of them. When the widths are known in advance this is a minor issue: hard-code the differing widths and parse with those. With things like console output, however, we can't rely on cells having a consistent width, which is where Mixed Width comes in. It detects the start of each cell by looking for locations where a space precedes a letter and which line up across all lines of the file. Using this method we can parse files with unknown widths and convert them into either a list of lists or a list of dictionaries whose keys are the header values.
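The detection rule described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the library's actual implementation: the helper names `find_column_starts` and `split_row` are hypothetical, and the sketch assumes a single-character padding and that no cell is empty.

```python
from typing import List


def find_column_starts(lines: List[str], padding: str = " ") -> List[int]:
    """Return the indices at which a column starts on every non-empty line.

    Hypothetical helper; assumes `padding` is a single character.
    """
    rows = [line for line in lines if line.strip(padding)]
    if not rows:
        return []
    width = max(len(row) for row in rows)
    # Pad short rows so that every row can be indexed at every position.
    rows = [row.ljust(width, padding) for row in rows]

    starts = []
    for i in range(width):
        # A column starts at i when every row has a non-padding character at i
        # and (unless i is the first position) padding immediately before it.
        if all(row[i] != padding for row in rows) and (
            i == 0 or all(row[i - 1] == padding for row in rows)
        ):
            starts.append(i)
    return starts


def split_row(line: str, starts: List[int], padding: str = " ") -> List[str]:
    """Cut one line at the detected column starts and trim the padding."""
    bounds = starts + [None]  # None lets the last slice run to the end of the line
    return [line[a:b].strip(padding) for a, b in zip(bounds, bounds[1:])]


lines = [
    "FOO     BAR     BAZ",
    "Hello   World   SomeExtraLongString",
]
starts = find_column_starts(lines)
print(starts)
print([split_row(line, starts) for line in lines])
```

Because the rule requires the boundary to line up on every line, a row with an empty cell would hide that column in this sketch; the real package may handle such cases differently.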

Examples

For the most part, multiwidth is used the same way as the json module from the standard library. Parsing some output takes the form of:

import multiwidth

string_to_parse = """FOO     BAR     BAZ
Hello   World   SomeExtraLongString"""

data = multiwidth.loads(string_to_parse)

print(data)
# output:
# [['Hello', 'World', 'SomeExtraLongString']]

If preserving the headers is important, pass output_json=True to loads:

import multiwidth

string_to_parse = """FOO     BAR     BAZ
Hello   World   SomeExtraLongString"""

data = multiwidth.loads(string_to_parse, output_json=True)

print(data)
# output:
# [{'FOO': 'Hello', 'BAR': 'World', 'BAZ': 'SomeExtraLongString'}]

Each line is then returned as a dictionary that maps the header keys to their corresponding values.
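Conceptually, the dictionary form pairs each header with the value in the same column. A minimal sketch of that pairing, using a hypothetical helper `rows_to_dicts` that is not part of multiwidth:

```python
from typing import Dict, List


def rows_to_dicts(headers: List[str], rows: List[List[str]]) -> List[Dict[str, str]]:
    """Pair every row with the headers, column by column (hypothetical helper)."""
    return [dict(zip(headers, row)) for row in rows]


headers = ["FOO", "BAR", "BAZ"]
rows = [["Hello", "World", "SomeExtraLongString"]]
print(rows_to_dicts(headers, rows))
```

This also shows why the dictionary output needs a header row: without one there are no keys to pair the values with.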

In addition, if the content is stored in a file, multiwidth.load(<file_object>) can be used.

Data can also be written back out as a table with multiwidth:

import multiwidth

headers = ['FOO', 'BAR', 'BAZ']
data = [['Hello', 'World', 'SomeExtraLongString']]

print(multiwidth.dumps(data, headers=headers))

# Output:
# FOO   BAR   BAZ
# Hello World SomeExtraLongString 

You can also control the spacing between columns with cell_suffix='<your desired padding between columns>'. For example:

import multiwidth

headers = ['FOO', 'BAR', 'BAZ']
data = [['Hello', 'World', 'SomeExtraLongString']]

print(multiwidth.dumps(data, headers=headers, cell_suffix='   '))

# Output:
# FOO     BAR     BAZ
# Hello   World   SomeExtraLongString 
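To make the roles of padding and cell_suffix concrete, here is a rough sketch of how such a table could be assembled by hand. The function `format_table` is hypothetical and only approximates the behaviour shown above: it assumes rectangular data and a single-character padding, and it trims trailing whitespace, which the library itself may not do.

```python
from typing import List


def format_table(
    headers: List[str],
    rows: List[List[str]],
    padding: str = " ",
    cell_suffix: str = " ",
) -> str:
    """Pad every cell to its column's widest value, then append cell_suffix."""
    table = [headers] + rows
    # Width of each column is the length of its longest cell, header included.
    widths = [max(len(str(row[i])) for row in table) for i in range(len(headers))]
    lines = []
    for row in table:
        cells = [
            str(cell).ljust(width, padding) + cell_suffix
            for cell, width in zip(row, widths)
        ]
        # Trailing whitespace is trimmed here for readability (an assumption).
        lines.append("".join(cells).rstrip())
    return "\n".join(lines)


headers = ["FOO", "BAR", "BAZ"]
data = [["Hello", "World", "SomeExtraLongString"]]
print(format_table(headers, data))
print(format_table(headers, data, cell_suffix="   "))
```

In this sketch, padding fills each cell up to the column width, while cell_suffix is the fixed gap added after every cell, so a wider suffix spreads the columns apart without changing the cell contents.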

You can also dump JSON-style data (a list of dictionaries) by omitting the headers argument; the dictionary keys are used as the headers:

import multiwidth

data = [{'FOO': 'Hello', 'BAR': 'World', 'BAZ': 'SomeExtraLongString'}]

print(multiwidth.dumps(data))

# Output:
# FOO   BAR   BAZ
# Hello World SomeExtraLongString

Finally, you can dump to a file with multiwidth.dump(data, <your file object>).

Usage

load

"""Parse data from a file object

Args:
    file_object (io.TextIOWrapper): File object to read from
    padding (str, optional): Which character takes up the space to create the fixed
        width. Defaults to " ".
    header (bool, optional): Does the file contain a header. Defaults to True.
    output_json (bool, optional): Should a list of dictionaries be returned instead
        of a list of lists. Defaults to False. Requires that 'header' be set to
        True.

Returns:
    Union[List[List],List[Dict]]: Either a list of lists or a list of dictionaries that
        represent the extracted data
"""

loads

"""Takes a string of a fixed-width file and breaks it apart into the data contained.

Args:
    contents (str): String fixed-width contents.
    padding (str, optional): Which character takes up the space to create the fixed
        width. Defaults to " ".
    header (bool, optional): Does the file contain a header. Defaults to True.
    output_json (bool, optional): Should a list of dictionaries be returned instead
        of a list of lists. Defaults to False. Requires that 'header' be set to
        True.

Raises:
    Exception: 'output_json' is True but 'header' is False.

Returns:
    List[List] | List[Dict]: Either a list of lists or a list of dictionaries that
        represent the extracted data
"""

dump

"""Dumps a formatted table to a file

Args:
    data (Union[List[List],List[Dict]]): Data to dump to a file. If using JSON data
        then omit the `headers` argument
    file_object (io.TextIOWrapper): File object to write to
    headers (List[str], optional): Headers to use with list data. Defaults to None.
    padding (str, optional): Character to use as padding between values. Defaults to
        ' '.
    cell_suffix (str, optional): String to use as the padding between columns.
        Defaults to ' '.
"""

dumps

"""Dumps a formatted table to a string

Args:
    data (Union[List[List],List[Dict]]): List or dictionary data to format
    headers (List[str], optional): Headers to use with list data. Defaults to None.
    padding (str, optional): Character to use as padding between values. Defaults to
        ' '.
    cell_suffix (str, optional): String to use as the padding between columns.
        Defaults to ' '.

Returns:
    str: Formatted table of input data
"""

License

Multiwidth is under the MIT license.

Contact

If you have any questions or concerns please reach out to me (John Carter) at jfcarter2358@gmail.com
