Use SQLite database to store datasets.
Reason this release was yanked:
Critical bugs.
Project description
SQLite Dataset
Use SQLite database to store datasets. Based on SQLAlchemy core.
Structure
The core of sqlite-dataset is the Class SQLiteDataset, which wraps a SQLAlchemy connection to a SQLite database.
Usage
Create a new dataset
A schema is required to create a new dataset. The schema should be a dictionary where each item is a column.
There are two ways to specify the schema:
- Extend the base SQLiteDataset class and override the schema field.
from sqlite_dataset import SQLiteDataset
from sqlalchemy import Column, String, Float
class MyIrisDataset(SQLiteDataset):
schema = {
'iris': [
Column('sepal_length_cm', String),
Column('sepal_width_cm', Float),
Column('petal_length_cm', Float),
Column('petal_width_cm', Float),
Column('class', String),
]
}
ds = MyIrisDataset.create('test.db')
- Pass schema to
SQLiteDataset.create()
schema = {
'iris': [
Column('sepal_length_cm', String),
Column('sepal_width_cm', Float),
Column('petal_length_cm', Float),
Column('petal_width_cm', Float),
Column('class', String),
]
}
ds = SQLiteDataset.create('test.db', schema=schema)
Be aware that the schema passed to SQLiteDataset.create()
will override the class field schema.
Connect to an existing dataset
An existing dataset could be one created by calling SQLiteDataset.create()
as shown above, or could be any SQLite database file.
To connect to a dataset, call the connect()
method and close()
after complete all tasks.
ds = SQLiteDataset('test.db')
ds.connect()
# do something
ds.close()
Or the dataset can be used as a context manager
with SQLiteDataset('test.db') as ds:
# do something
pass
Schema for existing dataset
SQLiteDataset object uses SQLAlchemy connection under the hood, soa schema is required to make any database queries or operations.
The way to specify schema for an existing dataset is similar to create a new dataset.
If it's a class extending the base SQLiteDataset class, and overrides the schema field, then this schema will be used.
with MyIrisDataset('test.db') as ds:
pass
Or a schema can be passed into the class constructor. The schema passed into the constructor will always override the class field schema.
with MyIrisDataset('test.db', schema=schema) as ds:
pass
with SQLiteDataset('test.db', schema=schema) as ds:
pass
If no schema provided by either of the above, a SQLAlchemy reflection is performed to load and parse schema from the existing database.
with SQLiteDataset('test.db') as ds:
pass
It is recommended to explicitly define the schema as reflection may have performance issue in some cases if the schema is very large and complex.
Add and read data
data = [
{
'sepal_length_cm': '5.1',
'sepal_width_cm': '3.5',
'petal_length_cm': '1.4',
'petal_width_cm': '0.2',
'class': 'setosa'
},
{
'sepal_length_cm': '4.9',
'sepal_width_cm': '3.0',
'petal_length_cm': '1.4',
'petal_width_cm': '0.2',
'class': 'setosa'
}
]
with MyIrisDataset('test.db') as ds:
ds.insert_data('iris', data)
with MyIrisDataset('test.db') as ds:
res = ds.read_data('iris')
Use with pandas
A pandas DataFrame can be inserted into a dataset by utilizing the to_sql()
function, and read from the dataset using read_sql
function.
Be aware that in this case, SQLiteDataset()
should be used without specifying the schema.
import seaborn as sns
import pandas as pd
df = sns.load_dataset('iris')
with SQLiteDataset('iris11.db') as ds:
df.to_sql('iris', ds.connection)
ds.connection.commit()
with SQLiteDataset('iris11.db') as ds:
res = pd.read_sql(
ds.get_table('iris').select(),
ds.connection
)
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for sqlite_dataset-0.3.0-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 6290baa2d484d64c99bb643f53883f5863e52e785f25a2cf8a9de1cbb5667581 |
|
MD5 | 46476f2a718df9fbe537b670cf95ce8e |
|
BLAKE2b-256 | eb2c9ac4d6cabf1a9a544c319afd4dd603df88a387f17f3c0ded6ff6bb63e3e0 |