Python commented file reader
Project description
A module to access data in text-formatted files.
In data analysis, data is often stored in text-formatted files, where values are written in columns on each line.
The file may contain comments (usually starting with #) and unused or uninteresting columns, or the data of interest may be spread over multiple files.
Consequently, it may be useful to have a Python shortcut to:
- read one or several files one after the other
- or put some files side by side (i.e. append the columns together)
- filter out commented or empty lines.
This module provides one function that wraps around a file iterator, allowing the file(s) to be read as follows:
for one_line in data_file('myfile.txt', comment_prefix='#'):
    print(one_line)
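To make the idea of "wrapping a file iterator" concrete, here is a minimal sketch of the kind of filtering such a generator performs. It is not the module's implementation: the name sketch_filtered_rows is hypothetical, and yielding one list of converted values per line is an assumption about the output shape.

```python
from typing import Iterable, Iterator, List, Optional


def sketch_filtered_rows(lines: Iterable[str],
                         comment_prefix: str = "#",
                         separator: Optional[str] = None,
                         returned_type: type = float) -> Iterator[List]:
    """Hypothetical stand-in (not the module's code): yield one list of values per data line."""
    for raw_line in lines:
        stripped = raw_line.strip()
        # Skip empty lines and lines that start with the comment prefix.
        if not stripped or stripped.startswith(comment_prefix):
            continue
        # With separator=None, str.split splits on any run of whitespace.
        yield [returned_type(field) for field in stripped.split(separator)]


# Any iterable of strings works here, including an open file object.
sample = [
    "# x   y",
    "1.0  2.0",
    "",
    "3.5  4.5",
]
rows = list(sketch_filtered_rows(sample))
print(rows)  # [[1.0, 2.0], [3.5, 4.5]]
```

Because the sketch is a generator, lines are read and converted one at a time, so a large file never has to be held in memory.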
Getting Started
The following instructions will get you a copy of the project up and running on your local machine.
Installing
The module comes with no external dependency, and can easily be installed with the distutils tools of Python.
Get the ascii_data_file.tar.gz file. Then cd to the directory where the file was downloaded and execute the following commands:
tar xvvzf ascii_data_file-001.tar.gz
cd ascii_data_file-001
python3 setup.py install
This will unpack, build, install and test the module.
Testing
You can test the library online with pytest
Dependencies
The module is built with no dependencies.
Usage
The data_file function is defined as follows:
data_file(file_path: Union[str, Sequence[str]],
          returned_columns: Union[str, slice, Sequence[int]] = '*',
          comment_prefix: str = "#",
          separator: Union[None, str] = None,
          returned_type: type = float,
          multi_files_behavior: str = 'append',
          skip_empty_lines: bool = True,
          skip_error_lines: bool = True,
          error_line_warning: bool = True,
          error_line_error: bool = False) -> Generator
It returns a generator that filters out commented lines.
The parameters are:
- file_path (str or list of str), required: the path to the file or files to open.
- returned_columns ('*' or slice or list of int), default = '*': selects the columns to return. Either '*' for all, a list of indices, or a slice.
- comment_prefix (str), default = "#": the characters to look for at the start of a commented line.
- returned_type (type), default = float: the type of data to return.
- multi_files_behavior (str), default = 'append': what to do when multiple files are given as input. Either append or side_by_side.
- skip_empty_lines (bool), default = True: whether to skip empty lines.
- skip_error_lines (bool), default = True: whether to skip lines that cause an error during processing.
- error_line_warning (bool), default = True: if error lines are not skipped, whether to issue a warning.
- error_line_error (bool), default = False: if error lines are not skipped, whether to raise a RuntimeError when there is a problem reading the line.
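The returned_columns and multi_files_behavior options can be pictured with plain Python lists. The sketch below is only an illustration under stated assumptions: sketch_select is a hypothetical helper, the rows stand in for already-parsed files, and stopping at the shorter file in the side-by-side case is a property of zip, not a documented behavior of the module.

```python
from itertools import chain
from typing import List, Sequence, Union

Columns = Union[str, slice, Sequence[int]]


def sketch_select(row: List[float], returned_columns: Columns = "*") -> List[float]:
    """Hypothetical helper (not the module's code): pick columns from one parsed row."""
    if isinstance(returned_columns, str):
        # '*' means every column; other strings are not handled in this sketch.
        return list(row)
    if isinstance(returned_columns, slice):
        return row[returned_columns]
    return [row[index] for index in returned_columns]


# Two already-parsed "files", each a list of rows.
file_a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
file_b = [[10.0, 20.0, 30.0], [40.0, 50.0, 60.0]]

# 'append': all rows of the first file, then all rows of the second.
appended = list(chain(file_a, file_b))

# 'side_by_side': matching rows are joined into one wider row.
# Assumption: zip stops at the shorter file.
side_by_side = [left + right for left, right in zip(file_a, file_b)]

print(sketch_select(file_a[0], [0, 2]))          # [1.0, 3.0]
print(sketch_select(file_a[0], slice(1, None)))  # [2.0, 3.0]
print(len(appended), side_by_side[0])            # 4 [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]
```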
For usage examples, see the test_ascii_data_file.py file in the repository.
Authors
- Greg Henning - ghenning.at.iphc․cnrs․fr
License
This project is licensed under the CeCILL FREE SOFTWARE LICENSE AGREEMENT.
See LICENSE for more.
Download files
File details
Details for the file ascii_data_file-1.tar.gz.
File metadata
- Download URL: ascii_data_file-1.tar.gz
- Upload date:
- Size: 4.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/50.0.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.6.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | 0f664b435a72c2e2ee5f109d31cbe4cfcb55541dcb1b089bbcd2603ee90efd68
MD5 | 3742f42e1566562812aee62ecad6356b
BLAKE2b-256 | 2ad06b1d23184fa63b59ff3152aeee76cae5b2fb9c17baca81cd322352a22903
File details
Details for the file ascii_data_file-1-py3-none-any.whl.
File metadata
- Download URL: ascii_data_file-1-py3-none-any.whl
- Upload date:
- Size: 12.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/50.0.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.6.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | 5e1ee3049e9b07ca199260754b6accd0ddfc3768acf11795f96c9f8156096acf
MD5 | 5091385d66efa7aa49a472358e716f6b
BLAKE2b-256 | d5cc77fe5e20bbbc8df2a71b6c9c18b3c06f545d51111638999fda164c1ab526