
Python commented file reader

Project description

A module to access data in text-formatted files.

In data analysis, data is often stored in text formatted files, where values are written in columns on a text line.

A file may contain comments (usually starting with #) and unused or uninteresting columns, and the data of interest may be spread over multiple files.

Consequently, it may be useful to have a Python shortcut to:

  • read one or several files one after the other
  • or put some files side by side (i.e. append the columns together)
  • filter out commented or empty lines.

This module provides one function that wraps around a file iterator, allowing the file(s) to be read as follows:

for one_line in data_file('myfile.txt', comment_prefix='#'):
    print(one_line)
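
To make the idea of wrapping a file iterator concrete, here is a minimal sketch. It is not the module's implementation: the helper name iter_data_lines is hypothetical, and the whitespace splitting and float conversion are assumptions chosen to mirror the defaults documented in the Usage section.

```python
import os
import tempfile
from typing import Iterator, List


def iter_data_lines(path: str, comment_prefix: str = "#") -> Iterator[List[float]]:
    """Hypothetical stand-in for data_file: yield one list of floats per data line."""
    with open(path) as handle:
        for raw_line in handle:
            line = raw_line.strip()
            # Skip empty lines and lines starting with the comment prefix.
            if not line or line.startswith(comment_prefix):
                continue
            # Assumption: columns are whitespace-separated and all numeric.
            yield [float(token) for token in line.split()]


# Write a small sample file, then read it back through the generator.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("# x y\n1 2\n\n3 4\n")
    sample_path = tmp.name

rows = list(iter_data_lines(sample_path))
os.remove(sample_path)
print(rows)  # [[1.0, 2.0], [3.0, 4.0]]
```

Because the helper is a generator, lines are read and converted only as the caller iterates, which keeps memory use low for large files.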

Getting Started

The following instructions will get you a copy of the project up and running on your local machine.

Installing

The module comes with no external dependency, and can easily be installed with the distutils tools of Python.

Get the ascii_data_file.tar.gz file. Then cd to the directory where the file was downloaded and execute the following commands:

tar xvvzf ascii_data_file-001.tar.gz
cd ascii_data_file-001
python3 setup.py install

This will unpack, build, install and test the module.

Testing

You can test the library online with pytest

Dependencies

The module is built with no dependencies.

Usage

The data_file function is defined as follows:

data_file(file_path: Union[str, Sequence[str]],
          returned_columns: Union[str, slice, Sequence[int]] = '*',
          comment_prefix: str = "#",
          separator: Union[None, str] = None,
          returned_type: type = float,
          multi_files_behavior: str = 'append',
          skip_empty_lines: bool = True,
          skip_error_lines: bool = True,
          error_line_warning: bool = True,
          error_line_error: bool = False) -> Generator

It returns a generator that filters out commented lines.

The parameters are:

  • file_path (str or list of str), required: the path to the file or files to open
  • returned_columns ('*' or slice or list of int), default = '*': select the columns to return, either '*' for all, a list of indices, or a slice.
  • comment_prefix (str), default = "#": the characters to look for at the start of a commented line.
  • separator (str or None), default = None: the string separating the columns on a line.
  • returned_type (type), default = float: the type of data to return.
  • multi_files_behavior (str), default = 'append': what to do when multiple files are given as input, either 'append' or 'side_by_side'.
  • skip_empty_lines (bool), default = True: whether to skip empty lines.
  • skip_error_lines (bool), default = True: whether to skip lines that cause an error during processing.
  • error_line_warning (bool), default = True: if error lines are not skipped, whether to issue a warning.
  • error_line_error (bool), default = False: if error lines are not skipped, whether to raise a RuntimeError when there is a problem reading the line.
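
How the column selection and the three error flags could interact is sketched below. This is an illustration only: parse_line is a hypothetical helper, not part of the module, and the choice to return None for a skipped or unreadable line is an assumption.

```python
import warnings
from typing import Optional, Sequence, Union


def parse_line(line: str,
               returned_columns: Union[str, slice, Sequence[int]] = "*",
               returned_type: type = float,
               skip_error_lines: bool = True,
               error_line_warning: bool = True,
               error_line_error: bool = False) -> Optional[list]:
    """Hypothetical helper: convert one text line, or return None to drop it."""
    try:
        values = [returned_type(token) for token in line.split()]
        if returned_columns == "*":
            return values                      # all columns
        if isinstance(returned_columns, slice):
            return values[returned_columns]    # a slice of columns
        return [values[index] for index in returned_columns]  # explicit indices
    except (ValueError, IndexError) as error:
        if skip_error_lines:
            return None                        # silently drop the line
        if error_line_error:
            raise RuntimeError(f"cannot read line: {line!r}") from error
        if error_line_warning:
            warnings.warn(f"cannot read line: {line!r}")
        return None                            # assumption: the line is still dropped


print(parse_line("1 2 3"))                                    # [1.0, 2.0, 3.0]
print(parse_line("1 2 3", returned_columns=[0, 2]))           # [1.0, 3.0]
print(parse_line("1 2 3", returned_columns=slice(1, None)))   # [2.0, 3.0]
print(parse_line("1 oops 3"))                                 # None
```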

For usage examples, see the test_ascii_data_file.py file in the repository.
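
The difference between the two multi_files_behavior modes can be pictured with already-parsed rows. The sketch below is an assumption about the intended semantics, based on the description "one after the other" versus "append the columns together". The helper combine_rows is hypothetical, and stopping at the shortest file in the side-by-side case is a choice made here, not documented behavior.

```python
from itertools import chain
from typing import Iterable, Iterator, List

Row = List[float]


def combine_rows(sources: List[Iterable[Row]], behavior: str = "append") -> Iterator[Row]:
    """Hypothetical helper: combine rows coming from several files."""
    if behavior == "append":
        # Read the sources one after the other.
        return chain.from_iterable(sources)
    if behavior == "side_by_side":
        # Join the columns of line i from every source into a single row.
        # zip stops at the shortest source (an assumption of this sketch).
        return (list(chain.from_iterable(group)) for group in zip(*sources))
    raise ValueError(f"unknown behavior: {behavior!r}")


file_a = [[1.0, 2.0], [3.0, 4.0]]
file_b = [[10.0], [20.0]]

appended = list(combine_rows([file_a, file_b], "append"))
side_by_side = list(combine_rows([file_a, file_b], "side_by_side"))
print(appended)      # [[1.0, 2.0], [3.0, 4.0], [10.0], [20.0]]
print(side_by_side)  # [[1.0, 2.0, 10.0], [3.0, 4.0, 20.0]]
```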

Authors

  • Greg Henning - ghenning​.at.​iphc․cnrs․fr

License

This project is licensed under the CeCILL FREE SOFTWARE LICENSE AGREEMENT.

See LICENSE for more.
