A project to improve your CSV experience.
When your statistics data is large and spans many lines, you may want to remove wrong or bad data. Other CSV modules can help with that, but csv_utils provides functions that make it easier: read the documentation, find what you want to do with your CSV file, and then implement it. This module was developed while I was taking a statistics class in college, where I found many CSV data files with wrong and bad data that I had to filter myself with Python, so why not give it back to the community.
You can install it using pip:
pip3 install csv_utils
While we are still developing new utility functions, there are already plenty of functions that may help you.
How to read a file
To read a file all you need to do is import the module and then use the function read_file. It will read the file and return a list of its values.
    import csv_utils

    f = csv_utils.read_file('/path/to/file.csv', field_delimiter=';')
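For comparison, a similar list-of-lists result can be produced with Python's standard csv module. This is only a sketch and not the implementation used by csv_utils: the helper name read_rows and the sample data are hypothetical, and it assumes that read_file returns one list per line of the file.

```python
import csv
import io


def read_rows(stream, field_delimiter=";"):
    """Return every row of a delimited text stream as a list of strings.

    Hypothetical helper built on the standard library, not part of csv_utils.
    """
    reader = csv.reader(stream, delimiter=field_delimiter)
    return [row for row in reader]


# Hypothetical in-memory data standing in for /path/to/file.csv.
sample = io.StringIO("Name;Age\nJose;35\n")
rows = read_rows(sample)
print(rows)
```

With a real file, the stream would come from open(path, newline=''), which is what the csv module documentation recommends.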
You can also read a file into the FileCSV object provided by the module.
    from csv_utils import csv_file

    f = csv_file.FileCSV()
    f.read_file('/path/to/file.csv', field_delimiter=';', word_delimiter='"')
The code above creates a FileCSV object that holds data about the header and the lines loaded by the read_file function. Let's suppose we have the CSV file with the content below:
    Name   Age  Occupation
    Jonh   16   Student
    Jose   35   Professor
    Ellen  45   Scientist
Reading the table above with the last code gives us the following:
    f
    # output> [['Name', 'Age', 'Occupation'], ['Jonh', '16', 'Student'], ['Jose', '35', 'Professor'], ['Ellen', '45', 'Scientist']]
    f.headers
    # output> ['Name', 'Age', 'Occupation']
    f.lines
    # output> [['Jonh', '16', 'Student'], ['Jose', '35', 'Professor'], ['Ellen', '45', 'Scientist']]
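The split between headers and lines shown above can be illustrated with a small stand-in class. This is a sketch under stated assumptions, not the FileCSV source: the class name TableSketch is hypothetical, and it assumes the first row of the file is always the header.

```python
import csv
import io


class TableSketch:
    """Hypothetical stand-in for FileCSV: keeps the header apart from the lines."""

    def __init__(self, stream, field_delimiter=";", word_delimiter='"'):
        reader = csv.reader(stream, delimiter=field_delimiter, quotechar=word_delimiter)
        rows = list(reader)
        # Assumption: the first row is the header, everything after it is data.
        self.headers = rows[0] if rows else []
        self.lines = rows[1:]

    def __repr__(self):
        # Show the header followed by the lines, like the output above.
        return repr([self.headers] + self.lines)


data = io.StringIO("Name;Age;Occupation\nJonh;16;Student\nJose;35;Professor\n")
table = TableSketch(data)
print(table.headers)
print(table.lines)
```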
How to filter content
When dealing with a lot of data, you may want to make the content uniform. For example, suppose you are researching the most used programming language in your field and create an online form where users enter their data. Some people may enter Python for the language used, others will enter Python3, and others may just enter Py3. When you analyze the data, your statistics tool may not let you count Python, Python3, and Py3 as a single value. Using this module, you can modify the content of the cells where these inconsistencies occur and then use the new file in your research.
To change all the values in a column named Programming Language, you can use:
    from csv_utils import csv_file

    f = csv_file.FileCSV()
    f.read_file('/path/to/file.csv')
    f.substitute_cells_content(['Py3', 'Python3'], ['Python'], 'Programming Language')
    f.save_file()
The code above changes every cell in the Programming Language column whose content is Py3 or Python3 to Python, and then saves the modification.
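To make the substitution step concrete, here is a minimal sketch of the same idea in plain Python. It is not the code behind substitute_cells_content: the function name substitute_in_column, its signature, and the sample rows are all hypothetical.

```python
def substitute_in_column(headers, lines, old_values, new_value, column_name):
    """Replace every cell in column_name whose content is one of old_values.

    Hypothetical helper; modifies the given lines in place and returns them.
    """
    index = headers.index(column_name)  # raises ValueError if the column is missing
    targets = set(old_values)
    for line in lines:
        if line[index] in targets:
            line[index] = new_value
    return lines


# Hypothetical survey answers.
headers = ["Name", "Programming Language"]
lines = [["Ana", "Py3"], ["Bo", "Python3"], ["Cy", "Rust"]]
substitute_in_column(headers, lines, ["Py3", "Python3"], "Python", "Programming Language")
print(lines)
```

Cells that match none of the old values, such as Rust here, are left untouched.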
Another situation is forgetting to make a field required and getting plenty of data with empty cells. You can filter these lines out of your data with:
    from csv_utils import csv_file

    f = csv_file.FileCSV()
    f.read_file('/path/to/file.csv')
    f.remove_lines_with_empty_cells()
    f.save_file()
The code above finds every line with an empty cell in any column and deletes it from the object; then you save your changes.
save_file() clears the content of the file and overwrites it with the object's content.
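The filter-then-overwrite flow can be sketched with the standard library as follows. The helper names drop_lines_with_empty_cells and overwrite_csv are hypothetical, and treating whitespace-only cells as empty is an assumption of this sketch; csv_utils may behave differently.

```python
import csv
import os
import tempfile


def drop_lines_with_empty_cells(lines):
    """Keep only the lines in which every cell has non-blank content.

    Assumption: a cell containing only whitespace counts as empty.
    """
    return [line for line in lines if all(cell.strip() for cell in line)]


def overwrite_csv(path, headers, lines, field_delimiter=";"):
    """Truncate the file at path and write the header and lines back to it."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter=field_delimiter)
        writer.writerow(headers)
        writer.writerows(lines)


# Hypothetical data with two incomplete lines.
headers = ["Name", "Age"]
lines = [["Jose", "35"], ["Ellen", ""], ["", "16"]]
kept = drop_lines_with_empty_cells(lines)

# Write to a temporary file so the sketch does not touch real data.
fd, path = tempfile.mkstemp(suffix=".csv")
os.close(fd)
overwrite_csv(path, headers, kept)
with open(path, newline="") as handle:
    saved = handle.read()
os.remove(path)
print(kept)
```

Because the file is opened in write mode, its previous content is discarded, so it is worth keeping a copy of the original data before overwriting.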
First version with class CSV_file and no methods implemented for csv_utils besides those in the class.
- Renamed class CSV_file to FileCSV.
- Refactored the __repr__ functions for FileCSV, enabling FileCSV objects to be printed and reproduced in the interpreter.
- Added method remove_lines_with_empty_cells, which removes all lines with empty cells.
- Added functions to the csv_utils module: read_file, which reads a file and returns it as a list; clean_file, which cleans the content of the file; and clean_lines_empty_cells, which removes all lines with an empty cell.
- Added method account_occurrences_column, which counts the occurrences of values in a column.
- Fixed methods
- Added a method to FileCSV to undo the last change made by one of its methods.
- Added a method to FileCSV to save the changes made in a new file.