A data model based on in-memory SQLite to fetch, manipulate, and push data to and from multiple sources
SQLDataModel
SQLDataModel is a fast & lightweight data model with no additional dependencies for quickly fetching and storing your tabular data to and from the most commonly used databases & data sources in a couple lines of code. It's as easy as ETL:
from SQLDataModel import SQLDataModel
# The connection arguments below are placeholders for open DB-API 2.0 connection objects.
# Do the E part:
my_table = SQLDataModel.from_sql("your_table", cx_Oracle.Connection)
# Take care of your T business:
for row in my_table.iter_rows():
    print(row)
# Finish the L and be done:
my_table.to_sql("new_table", psycopg2.Connection)
Made for those times when you just want to use raw SQL on your dataframe, or need to move data around and a full pandas, NumPy, and SQLAlchemy installation is overkill. SQLDataModel includes the most commonly used features, plus extras like pretty-printing your table, at 1/1000 the size (0.03 MB vs. 30 MB).
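The ETL snippet above passes connection placeholders. As a rough picture of the same extract, transform, load flow between two DB-API 2.0 connections, here is a sketch that uses only the standard library `sqlite3` module. It is not SQLDataModel's implementation; the sample rows are invented, and the `?` parameter style is specific to `sqlite3` (other drivers may use a different one).

```python
import sqlite3

# Two independent in-memory databases stand in for a source and a target system.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")

# Hypothetical source table, created here only so the sketch is runnable.
src_cur = source.cursor()
src_cur.execute("CREATE TABLE your_table (country TEXT, total INTEGER)")
src_cur.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [("France", 3100), ("Brazil", 4200), ("Croatia", 2900)],
)

# E: extract all rows through a cursor.
src_cur.execute("SELECT country, total FROM your_table")
rows = src_cur.fetchall()

# T: a trivial transform, upper-casing the country names.
transformed = [(country.upper(), total) for country, total in rows]

# L: load the result into a new table on the other connection.
tgt_cur = target.cursor()
tgt_cur.execute("CREATE TABLE new_table (country TEXT, total INTEGER)")
tgt_cur.executemany("INSERT INTO new_table VALUES (?, ?)", transformed)
target.commit()

# Read the loaded rows back to confirm the round trip.
tgt_cur.execute("SELECT country, total FROM new_table ORDER BY total")
loaded = tgt_cur.fetchall()
print(loaded)

source.close()
target.close()
```

Note that both connections are live objects returned by `connect()`, which is presumably what the placeholder arguments in the snippet above stand for.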
Installation
Use the package manager pip to install SQLDataModel.
$ pip install SQLDataModel
Then import the main class SQLDataModel into your local project. See usage below or go straight to the project docs.
Quick Example
A SQLDataModel can be created from any number of sources. As a quick demo, let's create one using a Wikipedia page:
>>> from SQLDataModel import SQLDataModel
>>>
>>> url = 'https://en.wikipedia.org/wiki/1998_FIFA_World_Cup'
>>>
>>> sdm = SQLDataModel.from_html(url, table_identifier=94)
>>>
>>> sdm[:4, ['R', 'Team', 'W', 'Pts.']]
┌──────┬─────────────┬──────┬──────┐
│    R │ Team        │    W │ Pts. │
├──────┼─────────────┼──────┼──────┤
│    1 │ France      │    6 │   19 │
│    2 │ Brazil      │    4 │   13 │
│    3 │ Croatia     │    5 │   15 │
│    4 │ Netherlands │    3 │   12 │
└──────┴─────────────┴──────┴──────┘
[4 rows x 4 columns]
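How `from_html` locates and parses tables is not described here. Purely as an illustration of the general idea, the following standard-library sketch collects the rows of each HTML table and selects one by its position in the page. The `TableExtractor` class and the sample markup are hypothetical and are not part of SQLDataModel; whether `table_identifier` counts tables the same way is an assumption.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the cell text of every <table> in a document (hypothetical helper).

    Nested tables are not handled in this sketch.
    """

    def __init__(self):
        super().__init__()
        self.tables = []   # list of tables, each a list of rows
        self._row = None   # row currently being filled, if any
        self._cell = None  # text fragments of the current cell, if any

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


html = """
<table><tr><th>ignored</th></tr></table>
<table>
  <tr><th>R</th><th>Team</th><th>W</th><th>Pts.</th></tr>
  <tr><td>1</td><td>France</td><td>6</td><td>19</td></tr>
  <tr><td>2</td><td>Brazil</td><td>4</td><td>13</td></tr>
</table>
"""

parser = TableExtractor()
parser.feed(html)

# Select the second table by its zero-based position and split header from body.
header, *body = parser.tables[1]
print(header)
print(body)
```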
SQLDataModel provides a quick and easy way to import, view, transform and export your data in multiple formats and sources, providing the full power of executing raw SQL against your model in the process.
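Since the model is described as being based on in-memory SQLite, running raw SQL against tabular data can be pictured with the standard library alone. The sketch below is not SQLDataModel's implementation or API; the `region_data` table and its values are invented for illustration.

```python
import sqlite3

# Invented sample rows: (region, country, total).
rows = [
    ("Europe", "France", 3100),
    ("Europe", "Croatia", 2900),
    ("Americas", "Brazil", 4200),
    ("Americas", "Chile", 1800),
]

con = sqlite3.connect(":memory:")  # the database lives only in RAM
con.execute("CREATE TABLE region_data (region TEXT, country TEXT, total INTEGER)")
con.executemany("INSERT INTO region_data VALUES (?, ?, ?)", rows)

# Filter and aggregate in a single raw SQL statement.
query = """
    SELECT region, COUNT(*) AS countries, SUM(total) AS total
    FROM region_data
    WHERE total < 3200
    GROUP BY region
    ORDER BY region
"""
summary = con.execute(query).fetchall()
print(summary)
con.close()
```

Nothing is written to disk here, so the table disappears as soon as the connection is closed.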
Usage
from SQLDataModel import SQLDataModel
# Create a SQLDataModel object from any valid source, whether csv:
sdm = SQLDataModel.from_csv('region_data.csv')
# Any DB-API 2.0 connection like psycopg2, cx_Oracle, pyodbc, sqlite3:
sdm = SQLDataModel.from_sql('region_data', psycopg2.Connection)
# Python objects like dicts, lists, tuples, iterables:
sdm = SQLDataModel.from_dict(data=region_data)
# Slice it by rows and columns
sdm_country = sdm[2:7, ['country','total']]
# Transform and filter it
sdm = sdm[sdm['total'] < 3200]
# View it
print(sdm)
# Group by single or multiple columns:
sdm_group = sdm.group_by(['region','check'])
# View output.
print(sdm_group)
# Loop through it.
for row in sdm.iter_rows():
    print(row)
# Save it for later as csv.
sdm.to_csv('region_data.csv')
# Or SQL databases like PostgreSQL, SQL Server, SQLite.
sdm.to_sql('table_data', sqlite3.Connection)
# Get it back again from any number of sources.
sdm_new = SQLDataModel.from_sql('table_data', sqlite3.Connection)
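One thing to keep in mind when saving to CSV and reading back is that CSV itself carries no type information. How SQLDataModel handles types on import is not covered here; the standard-library sketch below only shows the underlying behavior, using invented sample rows and an in-memory buffer in place of a file.

```python
import csv
import io

headers = ["country", "total"]
rows = [("France", 3100), ("Croatia", 2900)]

# Write to an in-memory buffer instead of a file so the sketch has no side effects.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(headers)
writer.writerows(rows)

# Read the same text back; csv.reader yields every field as a string.
buffer.seek(0)
reader = csv.reader(buffer)
read_headers = next(reader)
read_rows = [tuple(record) for record in reader]

# Restore the integer column explicitly.
typed_rows = [(country, int(total)) for country, total in read_rows]
print(read_headers)
print(read_rows)
print(typed_rows)
```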
Data Sources
SQLDataModel supports various data formats and sources, including:
- HTML files or websites
- SQL database connections (PostgreSQL, SQLite, Oracle DB, SQL Server, TeraData)
- CSV or delimited text files
- JSON files or objects
- LaTeX files or formatted strings
- Markdown files or formatted strings
- Numpy arrays
- Pandas dataframes
- Parquet files
- Python objects
- Pickle files
Note that SQLDataModel does not install any additional dependencies by default. This is done to keep the package as lightweight and small as possible. This means that to use package-dependent methods like to_parquet() or the inverse from_parquet(), the pyarrow package is required. The same goes for other package-dependent methods like those converting to and from pandas and numpy objects.
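Before calling one of the package-dependent methods, it can help to check whether the optional package is present. The following is a minimal sketch using only the standard library; the `is_installed` helper and the feature labels are hypothetical and not part of SQLDataModel.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if the top-level `package` can be found in this environment."""
    return importlib.util.find_spec(package) is not None


# The optional packages named above, keyed by an illustrative feature label.
optional = {"parquet": "pyarrow", "pandas": "pandas", "numpy": "numpy"}

available = {feature: is_installed(pkg) for feature, pkg in optional.items()}
for feature, ok in available.items():
    status = "available" if ok else "missing, install the package with pip first"
    print(f"{feature}: {status}")
```

The check does not import the package, so it stays cheap even for large libraries.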
Documentation
SQLDataModel's documentation can be found at https://sqldatamodel.readthedocs.io and contains detailed descriptions of the key modules in the package:
- ANSIColor for terminal styling.
- HTMLParser for parsing tabular data from the web.
- JSONEncoder for type casting and encoding JSON data.
- SQLDataModel for wrapping it all up.
To skip over the less relevant modules and jump straight to the meat of the package, see the SQLDataModel module in the docs.
Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.
License
Thank you!
Ante Tonkovic-Capin