PandaSQLite
PandaSQLite is a lightweight wrapper library that combines the power of SQLite databases with the ease of use of Python numerical libraries such as pandas, numpy, and scipy. It lets you store and manage data in an SQLite database while still using query results seamlessly in Python.
Why bother with PandaSQLite?
Features
- Fast and reliable data storage and management.
- Easy to use, with no setup required.
- Designed for use in Jupyter notebooks.
- Easily import data from different file formats.
- Leverage powerful SQL syntax for advanced data manipulation and analysis.
Advantages
- SQL is a declarative language: you describe the result you want, and the SQLite query planner decides how to compute it. This is often faster than hand-written Python scripts that parse data from disk.
- All data is stored in a single binary file, which keeps your data organized, tidy, and much easier to share.
- Materialize intermediate results efficiently to speed up your data analysis.
- Speed up your queries using indices, which SQLite applies automatically once they exist.
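The last two points can be illustrated with a minimal sketch. It uses Python's standard sqlite3 module directly rather than PandaSQLite, so it makes no assumptions about the library's API; the table and index names (measurements, sensor_means, idx_measurements_sensor) are made up for the example.

```python
import sqlite3

# In-memory database for illustration; a file path would persist the data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (sensor TEXT, value REAL)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?)",
    [("a", 1.0), ("a", 3.0), ("b", 10.0), ("b", 20.0)],
)

# Materialize an intermediate result once, so later queries can reuse it.
conn.execute(
    "CREATE TABLE sensor_means AS "
    "SELECT sensor, AVG(value) AS mean_value FROM measurements GROUP BY sensor"
)

# Create an index; the query planner decides on its own when to use it.
conn.execute("CREATE INDEX idx_measurements_sensor ON measurements (sensor)")

# Read back the materialized table as a {sensor: mean} dictionary.
means = dict(conn.execute("SELECT sensor, mean_value FROM sensor_means"))

# Ask SQLite how it would run a filtered query; the last column of each
# row is a human-readable description of the chosen plan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT value FROM measurements WHERE sensor = 'a'"
).fetchall()
plan_text = " ".join(row[-1] for row in plan)
```

With PandaSQLite, statements like the CREATE TABLE ... AS and CREATE INDEX above would presumably be passed to db.execute(), since they return no data.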
Getting Started
Requirements
PandaSQLite only supports Python 3 and is built on top of the pandas and sqlite3 packages.
Installation
To start using PandaSQLite, simply install the library using pip:
pip install PandaSQLite
Once the library is installed, you can start using it in your Python projects or Jupyter notebooks.
Basic usage
The script below shows the most basic usage of the library. Raw data needs to be imported into the database only once.
from PandaSQLite import PandaSQLiteDB
# Create/open database
db = PandaSQLiteDB("my_database.sql")
# Import raw data -- must only be done once!
# Import example CSV data
db.import_data("my_table", "my_csv.csv", format="csv")
# Execute query with no return value
db.execute("INSERT INTO my_table VALUES (...)")
# Query dataframe with return values
df = db.query("SELECT * FROM my_table")
For a more comprehensive showcase of features, check out the examples in the examples directory to get started.
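Because the raw data should be imported only once, re-running a notebook cell that contains the import step can be a problem. One way to guard against this is to check whether the table already exists before importing. The sketch below uses the standard sqlite3 module on its own; table_exists is a hypothetical helper, not part of PandaSQLite, and the CREATE/INSERT statements stand in for the one-time import.

```python
import sqlite3


def table_exists(conn: sqlite3.Connection, name: str) -> bool:
    """Return True if a table called `name` is already in the database."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row is not None


conn = sqlite3.connect(":memory:")
import_count = 0

for _ in range(2):  # simulate running the same notebook cell twice
    if not table_exists(conn, "my_table"):
        # Stand-in for the one-time import step.
        conn.execute("CREATE TABLE my_table (x INTEGER)")
        conn.execute("INSERT INTO my_table VALUES (1)")
        import_count += 1

row_count = conn.execute("SELECT COUNT(*) FROM my_table").fetchone()[0]
```

The second pass through the loop skips the import, so the table is created and filled a single time.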
Documentation
The documentation for PandaSQLite is available here
Common problems
TypeError: 'NoneType' object is not iterable
This issue is usually caused by passing a query that returns no data to db.query(), which should only be used for queries that return a table (SELECT queries). Use db.execute() for queries that return no data (e.g. INSERT, UPDATE, or ALTER queries).
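One plausible mechanism behind this error, not verified against PandaSQLite's source, is how the underlying sqlite3 module reports results: a cursor's description attribute is None for statements that produce no result set, and any code that iterates over it to build column names will fail. A minimal sketch with plain sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (x INTEGER)")

# A SELECT produces a result set, so the cursor exposes column metadata.
select_cur = conn.execute("SELECT x FROM my_table")
select_columns = [col[0] for col in select_cur.description]

# An INSERT produces no result set, so description is None.
insert_cur = conn.execute("INSERT INTO my_table VALUES (1)")
insert_description = insert_cur.description

# Iterating over that None raises the TypeError described above.
try:
    [col[0] for col in insert_cur.description]
    error_message = ""
except TypeError as exc:
    error_message = str(exc)
```

Whatever the exact cause inside the library, the remedy is the same: route statements without a result set through db.execute().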
Download files
Source Distribution
File details
Details for the file PandaSQLite-1.1.1.tar.gz.
File metadata
- Download URL: PandaSQLite-1.1.1.tar.gz
- Upload date:
- Size: 5.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 3d1100855eeb0ca97f674ad1db3692582458d594f45042608cfd5ad2cf958d6c
MD5 | f395f9bb48d053c6599e74095c6f732d
BLAKE2b-256 | 705adef0c9e4fd551f0f75b0c3b78ed17a75e217417cf79b6cba54bb1902445d