A Python library for handling multi-CSV format.
MultiCSV
The Python library multicsv is designed for handling multi-CSV format
files. It provides an interface for reading, writing, and manipulating
sections of a CSV file as individual text file objects.
Key Features
- Efficient Section Management: Read and write multiple independent sections within a single CSV file.
- TextIO Interface: Sections are treated as TextIO objects, enabling familiar file operations.
- Flexible Operations: Supports reading, writing, iterating, and deleting sections.
- Context Management: Supports the with statement, so resources are released safely.
- Integrated Testing: Includes comprehensive unit tests, covering 100% of the functionality.
The Multi-CSV Format
The multi-CSV format is an extension of the traditional CSV
(Comma-Separated Values) format that supports dividing a single file
into multiple independent sections. Each section is demarcated by a
header enclosed in square brackets, e.g., [section_name].
This format is best known from Illumina MiSeq sample sheet files.
Conceptually, this file format makes it possible to store a whole SQL database in a single, human-readable file, with each section playing the role of a table.
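To make the database analogy concrete, here is a minimal sketch that loads sections into an in-memory SQLite database, one table per section. It does not use multicsv at all: the helper name load_into_sqlite and the sample data are hypothetical, and the sections are assumed to be already split into rows whose first row holds the column names.

```python
import sqlite3


def load_into_sqlite(sections):
    """Create one table per section; the first row supplies column names."""
    conn = sqlite3.connect(":memory:")
    for name, rows in sections.items():
        header, *data = rows
        # Quote identifiers so unusual section or column names stay valid SQL.
        columns = ", ".join('"{}"'.format(col.replace('"', '""')) for col in header)
        table = '"{}"'.format(name.replace('"', '""'))
        conn.execute("CREATE TABLE {} ({})".format(table, columns))
        placeholders = ", ".join("?" for _ in header)
        conn.executemany(
            "INSERT INTO {} VALUES ({})".format(table, placeholders), data
        )
    return conn


# Hypothetical data: two sections that were already split into rows.
sections = {
    "samples": [["sample_id", "project"], ["s1", "p1"], ["s2", "p1"]],
    "projects": [["project", "owner"], ["p1", "alice"]],
}

conn = load_into_sqlite(sections)
query = (
    "SELECT s.sample_id, p.owner FROM samples AS s "
    "JOIN projects AS p ON s.project = p.project ORDER BY s.sample_id"
)
print(conn.execute(query).fetchall())  # [('s1', 'alice'), ('s2', 'alice')]
```

Once the sections are tables, they can be joined and queried like any other relational data, which is the sense in which a multi-CSV file resembles a small database.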
Example
Here's a simplified example of a multi-CSV file:
[section1]
header1,header2,header3
value1,value2,value3
[section2]
headerA,headerB,headerC
valueA,valueB,valueC
In the example above, the file contains two sections: section1 and section2. Each section has its own headers and rows of data.
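To illustrate how simple the layout is, the following sketch splits multi-CSV text into sections using only the standard library. It is not how multicsv is implemented; split_sections is a hypothetical helper, and it assumes that a header is a line consisting solely of a bracketed name and that no quoted field spans several lines.

```python
import csv


def split_sections(text):
    """Split multi-CSV text into {section_name: list of rows}."""
    sections = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # A header line such as "[section1]" opens a new section.
            current = stripped[1:-1]
            sections[current] = []
        elif current is not None and stripped:
            # Every other non-empty line is a CSV row of the current section.
            sections[current].append(line)
    # Parse each section's lines with the standard csv module.
    return {name: list(csv.reader(lines)) for name, lines in sections.items()}


example = """\
[section1]
header1,header2,header3
value1,value2,value3
[section2]
headerA,headerB,headerC
valueA,valueB,valueC
"""

parsed = split_sections(example)
print(parsed["section2"][0])  # ['headerA', 'headerB', 'headerC']
```

Real-world files, such as sample sheets with trailing commas after the header, may need more lenient header detection than this sketch provides.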
Usage
Here's a quick example of how to use the multicsv library:
import csv
import multicsv
with multicsv.open('example.csv', mode='w+') as csv_file:
    # Write the CSV content to the file
    csv_file.section('section1').write("header1,header2,header3\nvalue1,value2,value3\n")
    csv_file.section('section2').write("header4,header5,header6\nvalue4,value5,value6\n")
    # Read a section using the csv module
    csv_reader = csv.reader(csv_file['section1'])
    assert list(csv_reader) == [['header1', 'header2', 'header3'],
                                ['value1', 'value2', 'value3']]
The multicsv module exports only two functions: open and wrap.
The latter wraps an already open text stream and is meant to be used like this:
import io
import multicsv
# Initialize the MultiCSVFile with a base CSV string
csv_content = io.StringIO("""\
[section1]
a,b,c
1,2,3
[section2]
d,e,f
4,5,6
""")
csv_file = multicsv.wrap(csv_content)
# Accessing a section
section1 = csv_file["section1"]
print(section1.read()) # Outputs: "a,b,c\n1,2,3\n"
# Adding a new section
new_section = io.StringIO("g,h,i\n7,8,9\n")
csv_file["section3"] = new_section
csv_file.flush()
# Verify the new section is added
csv_content.seek(0)
print(csv_content.read())
# Outputs:
# [section1]
# a,b,c
# 1,2,3
# [section2]
# d,e,f
# 4,5,6
# [section3]
# g,h,i
# 7,8,9
Both exported functions return a MultiCSVFile object.
Objects of that class are MutableMappings from section names (str) to section contents (TextIO).
Iterating over such an object therefore yields the section names. For instance, this is how to print the names of all sections in a multi-CSV file:
import multicsv
for section in multicsv.open("example.csv"):
    print(section)
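Because the interface is the standard mapping protocol, code written against Mapping[str, TextIO] should work with any such object. The sketch below uses a plain dict of io.StringIO streams as a stand-in for a MultiCSVFile, so it runs without the library installed. The helper count_rows is hypothetical, and it assumes the section streams are seekable.

```python
import csv
import io
from typing import Dict, Mapping, TextIO


def count_rows(sections: Mapping[str, TextIO]) -> Dict[str, int]:
    """Return the number of CSV rows in each section."""
    counts = {}
    for name, stream in sections.items():
        stream.seek(0)  # read from the start regardless of earlier reads
        counts[name] = sum(1 for _ in csv.reader(stream))
    return counts


# Stand-in for a MultiCSVFile: a dict from section names to text streams.
sections = {
    "section1": io.StringIO("a,b,c\n1,2,3\n"),
    "section2": io.StringIO("d,e,f\n4,5,6\n7,8,9\n"),
}

print(count_rows(sections))  # {'section1': 2, 'section2': 3}

# MutableMapping operations: add and delete sections by name.
sections["section3"] = io.StringIO("g,h,i\n")
del sections["section1"]
print(sorted(sections))  # ['section2', 'section3']
```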
Installation
Install the library using pip:
pip install multicsv
Development
Setting Up
Set up your environment for development as follows:
- Clone the repository:
  git clone https://github.com/cfe-lab/multicsv.git
- Navigate to the project directory:
  cd multicsv
- Create and activate a virtual environment:
  python3 -m venv venv
  source venv/bin/activate
- Install dependencies:
  pip install -e .[dev,test]
Running Tests
Run the test suite to ensure everything is functioning correctly:
pytest
Contributing
Contributions are welcome! Please follow these steps for contributions:
- Fork the repository.
- Create a new branch with a descriptive name.
- Make your changes and ensure the test suite passes.
- Open a pull request with a clear description of what you've done.
License
This project is licensed under the GPL-3.0 License - see the LICENSE file for details.