Use SQLite database to store datasets.
Project description
SQLite Dataset
Use SQLite database to store datasets. Based on SQLAlchemy core.
Structure
The core of sqlite-dataset is the Class SQLiteDataset, which wraps a SQLAlchemy connection to a SQLite database.
Usage
Declare a dataset
To declare a dataset, extend the base SQLiteDataset class and specify fields.
from sqlite_dataset import SQLiteDataset, Field, String, Float
class MyIrisDataset(SQLiteDataset):
sepal_length_cm = Field(String, tablename='iris')
sepal_width_cm = Field(Float, tablename='iris')
petal_length_cm = Field(Float, tablename='iris')
petal_width_cm = Field(Float, tablename='iris')
class_field = Field(String, tablename='iris', name='class')
ds = MyIrisDataset('my_iris_dataset.db')
This will create a sqlite database file on the specified path. If a dataset already exists, it will then be loaded.
The Field object
Field
can be seen as a factory that can create a SQLAlchemy's Column object.
The Field
object's constructor takes same arguments as sqlalchemy.schema.Column
with the difference that Field
does not take positional name argument and has an extra keyword argument tablename
.
Declare a Column using sqlalchemy.Column:
from sqlalchemy import Column, String
Column('sepal_length_cm', String)
Column(name='sepal_length_cm', type_=String)
Declare a sqlite_dataset.Field:
from sqlite_dataset import Field, String
sepal_length_cm = Field(String)
The variable name will be automatically used as the column name.
Column name can also be specified using name
argument, which is useful if the column name is a Python preserved keyword:
from sqlite_dataset import SQLiteDataset, Field, String
class MyDataset(SQLiteDataset):
class_field = Field(String, name='class')
type_field = Field(String, name='type')
Table name can be specified using tablename
keyword argument:
from sqlite_dataset import SQLiteDataset, Field, String
class MyDataset(SQLiteDataset):
class_field = Field(String, name='class', tablename='table1')
type_field = Field(String, name='type', tablename='table2')
This will create two tables: table1, table2.
If tablename is not specified, the column will be created in default table data.
Field type and keyword arguments
The field type is the sqlalchemy column type. All sqlalchemy members were imported into sqlite_dataset.
from sqlalchemy import String, Integer
is exactly the same as:
from sqlite_dataset import String, Integer
Inheritance
Dataset can be inherited.
from sqlite_dataset import SQLiteDataset, Field, String, Float
class BaseDataset(SQLiteDataset):
class_field = Field(String, tablename='iris', name='class')
example_field = Field(String, tablename='example_table')
class ChildDataset(BaseDataset):
sepal_length_cm = Field(String, tablename='iris')
sepal_width_cm = Field(Float, tablename='iris')
petal_length_cm = Field(Float, tablename='iris')
petal_width_cm = Field(Float, tablename='iris')
Connect to an existing dataset
To connect to a dataset, call the connect()
method. Call close()
to close it.
ds = SQLiteDataset('test.db')
ds.connect()
# do something
ds.close()
Or the dataset can be used as a context manager
with SQLiteDataset('test.db') as ds:
# do something
pass
Schema for existing dataset
SQLiteDataset object uses SQLAlchemy connection under the hood, so a schema is required to make any database queries or operations.
If no schema provided by either of the above, a SQLAlchemy reflection is performed to load and parse schema from the existing database.
It is recommended to explicitly define the schema as reflection may have performance issue in some cases if the schema is very large and complex.
Add and read data
data = [
{
'sepal_length_cm': '5.1',
'sepal_width_cm': '3.5',
'petal_length_cm': '1.4',
'petal_width_cm': '0.2',
'class': 'setosa'
},
{
'sepal_length_cm': '4.9',
'sepal_width_cm': '3.0',
'petal_length_cm': '1.4',
'petal_width_cm': '0.2',
'class': 'setosa'
}
]
with MyIrisDataset('test.db') as ds:
ds.insert_data('iris', data)
with MyIrisDataset('test.db') as ds:
res = ds.read_data('iris')
Use with pandas
A pandas DataFrame can be inserted into a dataset by utilizing the to_sql()
function, and read from the dataset using read_sql
function.
Be aware that in this case, SQLiteDataset()
should be used without specifying the schema.
import seaborn as sns
import pandas as pd
df = sns.load_dataset('iris')
with SQLiteDataset('iris11.db') as ds:
df.to_sql('iris', ds.connection)
ds.connection.commit()
with SQLiteDataset('iris11.db') as ds:
res = pd.read_sql(
ds.get_table('iris').select(),
ds.connection
)
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file sqlite_dataset-0.5.1.tar.gz
.
File metadata
- Download URL: sqlite_dataset-0.5.1.tar.gz
- Upload date:
- Size: 6.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.4.2 CPython/3.10.4 Windows/10
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | ad50d47fe13c15f5c996fbecfa9e43c44bcf922acee1af26d7d0cbdeef8f3235 |
|
MD5 | 1c55af6da9ee88b8ebfe550a0635d5c1 |
|
BLAKE2b-256 | 727cde6b77e77a2063456deff7a2a7ceecb4384a4ffcafdc7474b60b4eaa52a8 |
File details
Details for the file sqlite_dataset-0.5.1-py3-none-any.whl
.
File metadata
- Download URL: sqlite_dataset-0.5.1-py3-none-any.whl
- Upload date:
- Size: 7.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.4.2 CPython/3.10.4 Windows/10
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 8e9764e3cbd9a22ebe01524ffc23a58a78e1d6d209fb617285b99a95fcb65cd7 |
|
MD5 | b7d42673ed49bf64a7b0405c4a26dcf8 |
|
BLAKE2b-256 | 2a816dfb17c18fb5fbde453e01ddf6c813a5ac976ccf688dbcf012b65616f28f |