Python functions for data analysis using native Python containers. Load data from csv files and work with it in an SQL-like way.
Project description
csv2sqlLike
csv2sqlLike is a package for simple data analysis on small data sets (<30MB). It provides filtering methods similar to SQL's filtering functions. We hope it is helpful for people who analyze data in the social sciences.
csv2sqlLike consists of 2 main classes:
- PseudoSQLFromCSV
- Transfer2SQLDB
PseudoSQLFromCSV is in charge of handling data:
- load the data as a nested list and the heads as a list from a csv file
- filter data under a specific condition
- group data by a specific head
- write a csv file from the data inside this object
Transfer2SQLDB is in charge of transferring data between PseudoSQLFromCSV and a DB:
- create a table in the DB from the data inside PseudoSQLFromCSV
- bring data from a table in the DB as a nested list
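The two transfer directions listed above can be illustrated with the standard library alone. The following is a minimal sketch, assuming an in-memory SQLite database and all-text columns; it does not use Transfer2SQLDB, and the helper names `create_table_from_rows` and `fetch_rows` are hypothetical, not part of csv2sqlLike.

```python
import sqlite3


def create_table_from_rows(conn, table, header, rows):
    """Create `table` with one TEXT column per head and insert every row."""
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()


def fetch_rows(conn, table):
    """Return every row of `table` as a nested list, in insertion order."""
    cursor = conn.execute(f'SELECT * FROM "{table}" ORDER BY rowid')
    return [list(row) for row in cursor.fetchall()]


# Hypothetical sample data in the "heads + nested list" shape.
header = ["region", "name", "age"]
rows = [["east-asia", "Kim", "31"], ["europe", "Anna", "27"]]

conn = sqlite3.connect(":memory:")
create_table_from_rows(conn, "people", header, rows)
restored = fetch_rows(conn, "people")
print(restored)
```

Table and column names are interpolated into the SQL text here, so this sketch is only suitable for trusted names; the row values themselves go through `?` placeholders.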
Installation
PIP:
pip3 install csv2sqllike
Usage
load data from csv file
    import csv2sqllike

    data = csv2sqllike.get_data_from_csv("[path_to_file]")

    # example
    data = csv2sqllike.get_data_from_csv("./data.csv")
    data = csv2sqllike.get_data_from_csv(
        "./test.csv",
        type_dict={'region': 'str',
                   'country': 'str',
                   'name': 'str',
                   'sex': 'str',
                   'university': 'str',
                   'age': 'int'}
    )

    # check loaded data type : nested list
    print(type(data.data))

    # check data head
    print(data.head)

    # check first row data
    print(data.data[0])

    # check data
    print(data.data)
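To make the "heads as a list, data as a nested list" layout and the role of `type_dict` concrete, here is a minimal sketch using only the standard `csv` module. It is not the package's implementation; the function `load_csv` and the `CASTS` table are hypothetical, and the assumption is that each `type_dict` value names a Python type to cast that column to.

```python
import csv
import io

# Assumed mapping from type names to converters (illustrative only).
CASTS = {"str": str, "int": int, "float": float}


def load_csv(file_obj, type_dict=None):
    """Read a csv file object into (header, rows), casting columns by name."""
    reader = csv.reader(file_obj)
    header = next(reader)  # first line holds the heads
    type_dict = type_dict or {}
    # Columns missing from type_dict stay as strings.
    casts = [CASTS[type_dict.get(name, "str")] for name in header]
    rows = [[cast(value) for cast, value in zip(casts, row)] for row in reader]
    return header, rows


# An in-memory stand-in for a csv file on disk.
sample = io.StringIO("region,name,age\neast-asia,Kim,31\neurope,Anna,27\n")
header, rows = load_csv(sample, type_dict={"region": "str", "name": "str", "age": "int"})

print(header)
print(rows[0])
```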
filtering data using condition
    data.where("[head] [operator] [specific value]")

    # example
    data.where("region == east-asia")

    # check conditions used
    print(data.condition_where)

    # check filtered data
    print(data.cache_data)
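For readers who want to see what a `"[head] [operator] [specific value]"` condition amounts to on a nested list, the sketch below evaluates one in plain Python. It is an illustration under stated assumptions, not the package's parser: the function `where`, the `OPERATORS` table, and the set of supported operators are all hypothetical.

```python
import operator

# Assumed operator set; the package may support a different one.
OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}


def where(header, rows, condition):
    """Return the rows that satisfy a 'head operator value' condition string."""
    head, symbol, raw_value = condition.split(maxsplit=2)
    index = header.index(head)
    compare = OPERATORS[symbol]
    matched = []
    for row in rows:
        cell = row[index]
        # Cast the literal from the condition to the type stored in the cell.
        if compare(cell, type(cell)(raw_value)):
            matched.append(row)
    return matched


# Hypothetical sample data.
header = ["region", "name", "age"]
rows = [["east-asia", "Kim", 31], ["europe", "Anna", 27], ["east-asia", "Sato", 24]]

east_asia = where(header, rows, "region == east-asia")
over_25 = where(header, rows, "age > 25")
print(east_asia)
print(over_25)
```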
grouping data using specific head
    data.group_by("[head]")

    # example
    data.group_by("region")

    # check heads used for grouping data
    print(data.condition_group_by)

    # check grouping data stored in dictionary
    print(data.cache_dict)
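Grouping a nested list by one head and storing the result in a dictionary can be sketched as follows. This is only an illustration of the idea; the function `group_by` below is hypothetical and the exact shape of the package's `cache_dict` is not assumed to match.

```python
from collections import defaultdict


def group_by(header, rows, head):
    """Group rows into a dict keyed by the value found under `head`."""
    index = header.index(head)
    groups = defaultdict(list)
    for row in rows:
        groups[row[index]].append(row)
    return dict(groups)


# Hypothetical sample data.
header = ["region", "name", "age"]
rows = [["east-asia", "Kim", 31], ["europe", "Anna", 27], ["east-asia", "Sato", 24]]

grouped = group_by(header, rows, "region")
for region, members in grouped.items():
    print(region, len(members))
```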
clear cache storage (storage for filtering and grouping)
    # check cache storage before clearing caches
    print(data.condition_where)
    print(data.cache_data)
    print(data.condition_group_by)
    print(data.cache_dict)

    # clear cache storage
    data.clear_cache_data()

    # check cache storage after clearing caches
    print(data.condition_where)
    print(data.cache_data)
    print(data.condition_group_by)
    print(data.cache_dict)
add head and delete head
    # check current heads
    print(data.header)

    # add new head
    data.add_head("new_head")

    # check added head
    print(data.header)

    # delete head
    data.delete_head("new_head")

    # check heads after deleting specific head
    print(data.header)
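Adding or deleting a head has to keep the head list and every row in the nested list the same length. The sketch below shows one way to do that in plain Python; it assumes a new column is filled with a default value, which may differ from what the package does, and the standalone functions `add_head` and `delete_head` here are hypothetical stand-ins rather than the package's methods.

```python
def add_head(header, rows, name, default=None):
    """Append a new head and a matching default cell to every row (in place)."""
    if name in header:
        raise ValueError(f"head already exists: {name}")
    header.append(name)
    for row in rows:
        row.append(default)


def delete_head(header, rows, name):
    """Remove a head and the matching cell from every row (in place)."""
    index = header.index(name)
    del header[index]
    for row in rows:
        del row[index]


# Hypothetical sample data.
header = ["region", "name"]
rows = [["east-asia", "Kim"], ["europe", "Anna"]]

add_head(header, rows, "new_head")
header_after_add = list(header)
rows_after_add = [list(row) for row in rows]

delete_head(header, rows, "new_head")
print(header_after_add, header)
```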
For more examples and usage, please refer to the Jupyter notebook.
Release History
- 1.0.0
  - First release
- 1.0.1
  - Add encoding option (default encoding is utf-8)
- 1.0.2
  - Add automatic installation of required packages
- 1.0.3
  - Improve precision of the data shape check function