Python functions for data analysis using Python's native containers. Load data from CSV files and work with it in an SQL-like way.

Project description

csv2sqlLike

csv2sqlLike is a package for simple data analysis on small data sets (<30MB). It provides filtering methods similar to SQL's filtering functions. We hope it is helpful for people who analyze data in the social sciences.

csv2sqlLike consists of two main classes:

  1. PseudoSQLFromCSV
  2. Transfer2SQLDB

PseudoSQLFromCSV is in charge of handling data:

  • load the heads as a list and the data as a nested list from a csv file
  • filter data under a specific condition
  • group data by a specific head
  • write the data held in the object to a csv file

Transfer2SQLDB is in charge of transferring data between PseudoSQLFromCSV and a DB:

  • create a table in the DB from the data held in PseudoSQLFromCSV
  • fetch data from a table in the DB as a nested list
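The round trip described above can be sketched with the standard library alone. This is not Transfer2SQLDB's code or API: the database backend the package targets is not stated here, so sqlite3 is used as a stand-in, and `create_table`, `fetch_table`, and the sample rows are hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical stand-ins for the head list and nested-list data.
header = ["region", "country", "name", "age"]
rows = [
    ["east-asia", "korea", "kim", 31],
    ["europe", "france", "marie", 27],
]


def create_table(conn, table, header, rows):
    """Create `table` from a head list and nested-list rows (illustrative only)."""
    columns = ", ".join(f'"{head}"' for head in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()


def fetch_table(conn, table):
    """Read every row of `table` back as a nested list."""
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    return [list(row) for row in cursor]


conn = sqlite3.connect(":memory:")  # in-memory DB, nothing is written to disk
create_table(conn, "people", header, rows)
restored = fetch_table(conn, "people")
print(restored)
```

The table and column names are interpolated into the SQL text, so a sketch like this is only suitable for trusted names; the row values go through `?` placeholders.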

Installation

PIP:

pip3 install csv2sqllike

Usage

load data from csv file

import csv2sqllike

data = csv2sqllike.get_data_from_csv("[path_to_file]")
# example
data = csv2sqllike.get_data_from_csv("./data.csv")
data = csv2sqllike.get_data_from_csv(
    "./test.csv",
    type_dict={
        'region': 'str',
        'country': 'str',
        'name': 'str',
        'sex': 'str',
        'university': 'str',
        'age': 'int',
    },
)
# check loaded data type : nested list
print(type(data.data)) 
# check data head
print(data.head)
# check first row data
print(data.data[0])
# check data
print(data.data)
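To make the role of `type_dict` concrete, here is a minimal plain-Python sketch of loading a csv into a head list and a nested list while casting each column. It is not the package's implementation: `load_csv`, the `casts` table, the fallback to `str` for unlisted heads, and the sample text are all assumptions made for illustration.

```python
import csv
import io


def load_csv(file_obj, type_dict=None):
    """Return (header, rows): a list of heads and a nested list of typed rows."""
    casts = {"str": str, "int": int, "float": float}
    type_dict = type_dict or {}
    reader = csv.reader(file_obj)
    header = next(reader)  # the first line holds the heads
    rows = []
    for record in reader:
        # Heads missing from type_dict fall back to str (an assumption).
        rows.append([
            casts[type_dict.get(head, "str")](value)
            for head, value in zip(header, record)
        ])
    return header, rows


# Hypothetical sample data standing in for a file on disk.
sample = io.StringIO("region,name,age\neast-asia,kim,31\neurope,marie,27\n")
header, rows = load_csv(sample, type_dict={"region": "str", "name": "str", "age": "int"})
print(header)   # ['region', 'name', 'age']
print(rows[0])  # ['east-asia', 'kim', 31]
```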

filtering data using condition

data.where("[head] [operator] [specific value]")
# example
data.where("region == east-asia")
# check conditions used
print(data.condition_where) 
# check filtered data
print(data.cache_data)
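A condition string of the form `[head] [operator] [specific value]` can be evaluated roughly as follows. This is a sketch under stated assumptions, not how `where` is implemented in the package: the set of supported operators, the casting of the literal to the cell's type, and the function signature are all hypothetical.

```python
import operator

# Assumed operator set; the package may support more or fewer.
OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}


def where(header, rows, condition):
    """Return the rows matching a "[head] [operator] [value]" condition."""
    head, op, raw_value = condition.split(maxsplit=2)
    index = header.index(head)
    compare = OPERATORS[op]
    matched = []
    for row in rows:
        cell = row[index]
        # Cast the literal to the cell's type so "30" compares with ints.
        if compare(cell, type(cell)(raw_value)):
            matched.append(row)
    return matched


# Hypothetical sample data.
header = ["region", "name", "age"]
rows = [
    ["east-asia", "kim", 31],
    ["europe", "marie", 27],
    ["east-asia", "li", 45],
]
print(where(header, rows, "region == east-asia"))
print(where(header, rows, "age < 30"))
```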

grouping data using specific head

data.group_by("[head]")
# example
data.group_by("region")
# check heads used for grouping data
print(data.condition_group_by)
# check grouping data stored in dictionary
print(data.cache_dict)
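Grouping a nested list by one head amounts to bucketing rows in a dictionary keyed by that column's value. The sketch below shows the idea in plain Python; the function signature and the shape of the returned dictionary are assumptions and may differ from what `cache_dict` actually holds.

```python
from collections import defaultdict


def group_by(header, rows, head):
    """Bucket rows into a dict keyed by the value found under `head`."""
    index = header.index(head)
    groups = defaultdict(list)
    for row in rows:
        groups[row[index]].append(row)
    return dict(groups)


# Hypothetical sample data.
header = ["region", "name", "age"]
rows = [
    ["east-asia", "kim", 31],
    ["europe", "marie", 27],
    ["east-asia", "li", 45],
]
grouped = group_by(header, rows, "region")
print(sorted(grouped))            # ['east-asia', 'europe']
print(len(grouped["east-asia"]))  # 2
```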

clear cache storage (storage for filtering and grouping results)

# check cache storage before clearing caches
print(data.condition_where)
print(data.cache_data)
print(data.condition_group_by)
print(data.cache_dict)
# clear cache storage
data.clear_cache_data()
# check cache storage after clearing caches
print(data.condition_where)
print(data.cache_data)
print(data.condition_group_by)
print(data.cache_dict)
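The relationship between the loaded data and the caches can be pictured with a small stand-in class. Only the attribute names are taken from the usage above; `CachedTable`, its simplified `where(head, value)` signature, and the choice of empty containers as the "cleared" state are assumptions, not the package's behavior.

```python
class CachedTable:
    """Hypothetical stand-in that keeps filter and group results in caches."""

    def __init__(self, header, rows):
        self.header = header
        self.data = rows
        self.clear_cache_data()

    def clear_cache_data(self):
        # Assumption: "cleared" means empty containers; the loaded data stays.
        self.condition_where = []
        self.cache_data = []
        self.condition_group_by = []
        self.cache_dict = {}

    def where(self, head, value):
        # Simplified equality filter that records what it was asked for.
        index = self.header.index(head)
        self.condition_where.append(f"{head} == {value}")
        self.cache_data = [row for row in self.data if row[index] == value]


table = CachedTable(["region", "name"], [["east-asia", "kim"], ["europe", "marie"]])
table.where("region", "east-asia")
print(table.condition_where, table.cache_data)
table.clear_cache_data()
print(table.condition_where, table.cache_data)  # caches are empty again
print(table.data)                               # original rows are untouched
```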

add head and delete head

print(data.header)
# add new head
data.add_head("new_head")
# check added head
print(data.header)
# delete head
data.delete_head("new_head")
# check heads after deleting specific head
print(data.header)
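With a head list and a nested list, adding or deleting a head means keeping every row the same width as the head list. The following sketch illustrates that invariant; the function signatures and the use of `None` as the fill value for a new column are assumptions, not documented behavior of `add_head` or `delete_head`.

```python
def add_head(header, rows, new_head, default=None):
    """Append a head and pad every row with `default` (an assumed fill value)."""
    header.append(new_head)
    for row in rows:
        row.append(default)


def delete_head(header, rows, head):
    """Remove a head and the matching column from every row."""
    index = header.index(head)
    del header[index]
    for row in rows:
        del row[index]


# Hypothetical sample data.
header = ["region", "name"]
rows = [["east-asia", "kim"], ["europe", "marie"]]

add_head(header, rows, "new_head")
print(header)  # ['region', 'name', 'new_head']
print(rows)    # [['east-asia', 'kim', None], ['europe', 'marie', None]]

delete_head(header, rows, "new_head")
print(header)  # ['region', 'name']
```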

For more examples and usage, please refer to the jupyter notebook.

Release History

  • 1.0.0
    • First release
  • 1.0.1
    • Add encoding option (default encoding is utf-8)
  • 1.0.2
    • Automatically install required packages
  • 1.0.3
    • Improve precision of the data shape check function

Files for csv2sqllike, version 1.6.3

  • Filename, size: csv2sqllike-1.6.3.tar.gz (10.8 kB)
  • File type: Source
  • Python version: None