
ORM for the dataset library


Fast Active Record ORM for the dataset library


The dataset library is a great and easy tool for working with any SQL database. Unfortunately, it lacks an object-relational mapper (ORM). If you need one, you are left with the complexity that is sqlalchemy.

Enter dataset-orm



$ pip install dataset-orm

Define classes that each map to a dataset.Table:

from dataset_orm import Model, Column, types, connect

class User(Model):
    username = Column(types.string, unique=True)
    data = Column(types.json)
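
How a declarative class body can become a table definition can be sketched in plain Python. The following is not dataset-orm's implementation: the `Column` and `Model` classes and the `create_table_sql` method below are hypothetical stand-ins built on the standard library's sqlite3 module. They only illustrate how class attributes can be collected into a schema.

```python
import sqlite3


class Column:
    """Hypothetical column descriptor; not the dataset_orm.Column class."""

    def __init__(self, sql_type, unique=False):
        self.sql_type = sql_type
        self.unique = unique
        self.name = None  # filled in by __set_name__

    def __set_name__(self, owner, name):
        # Python calls this hook while the owning class is being created.
        self.name = name

    def ddl(self):
        suffix = " UNIQUE" if self.unique else ""
        return f"{self.name} {self.sql_type}{suffix}"


class Model:
    """Hypothetical base class that collects Column attributes."""

    columns = ()

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # vars(cls) keeps definition order, so columns stay in source order.
        cls.columns = tuple(
            value for value in vars(cls).values() if isinstance(value, Column)
        )

    @classmethod
    def create_table_sql(cls):
        cols = ", ".join(col.ddl() for col in cls.columns)
        return (
            f"CREATE TABLE {cls.__name__.lower()} "
            f"(id INTEGER PRIMARY KEY, {cols})"
        )


class User(Model):
    username = Column("TEXT", unique=True)
    data = Column("TEXT")


sql = User.create_table_sql()
conn = sqlite3.connect(":memory:")
conn.execute(sql)
conn.execute("INSERT INTO user (username, data) VALUES (?, ?)", ("dave", "{}"))
rows = conn.execute("SELECT id, username FROM user").fetchall()
```

The sketch leaves out type mapping, queries, and migrations, which a real ORM layer has to handle.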


Alternatively, use the functional API, e.g. to create models dynamically:

import dataset_orm as ds

User = ds.Model.from_spec(name='User',
                          columns=[ds.Column(ds.types.string, 'name', unique=True),
                                   ds.Column(ds.types.json, 'data')])
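
Creating a model class at run time usually comes down to Python's built-in `type()`. Below is a minimal sketch, assuming a hypothetical `model_from_spec` helper that takes plain (name, type) pairs rather than dataset-orm's own Column objects. It shows the general technique only and is not how `Model.from_spec` is implemented.

```python
def model_from_spec(name, columns):
    """Build a class at run time from (column_name, sql_type) pairs.

    Hypothetical helper for illustration; not dataset_orm's Model.from_spec.
    """
    column_names = [col_name for col_name, _ in columns]

    def __init__(self, **values):
        unknown = set(values) - set(column_names)
        if unknown:
            raise TypeError(f"unknown columns: {sorted(unknown)}")
        # Columns that were not passed default to None.
        for col_name in column_names:
            setattr(self, col_name, values.get(col_name))

    def __repr__(self):
        pairs = ", ".join(f"{c}={getattr(self, c)!r}" for c in column_names)
        return f"{name}({pairs})"

    attrs = {
        "__init__": __init__,
        "__repr__": __repr__,
        "columns": dict(columns),
    }
    # type(name, bases, namespace) creates a new class object dynamically.
    return type(name, (object,), attrs)


User = model_from_spec("User", [("name", "TEXT"), ("data", "JSON")])
user = User(name="dave", data={"sports": ["football", "tennis"]})
```

Because the class is an ordinary Python class once created, it can be used the same way as one written out by hand.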

Then create rows directly from Python objects:

user = User(username='dave', data={'sports': ['football', 'tennis']})

user = User.objects.find_one(username='dave')
{'sports': ['football', 'tennis']}

Query existing tables, ORM-style:

User = Model.from_table(db['customer'])
[ User(pk=1), User(pk=2), User(pk=3)]

user = User.objects.find_one(name='John Walker')
1 John Walker

Update and delete

user = User.objects.find_one(name='John Walker')
user.place = 'New York'

users = User.objects.find(place='London')

Store and access any data types, including json and binary values

class User(Model):
    # in some dbs, unique strings must be limited in length
    username = Column(types.string(length=100), unique=True)
    picture = Column(types.binary)

user = User.objects.get(username='Dave')
with open('image.png', 'rb') as fimg:
    user.picture = fimg.read()

Use the file column type for transparently storing binary data:

class Image(Model):
    imagefile = Column(types.file)


img = Image()
with open('/path/to/image', 'rb') as f:
data =

Here the imagefile field provides a file-like API. This is an efficient way to store binary data in the database. The file's data is split into chunks and written to the database in multiple parts. On reading back, the chunks are retrieved from the database in parallel to improve performance for large files. Tests indicate that a speed-up of about 25% is possible compared with a binary field.
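
The chunk-and-reassemble idea can be sketched with the standard library alone. This is an illustration under stated assumptions, not dataset-orm's implementation: the `chunks` table layout, the function names, and the tiny `CHUNK_SIZE` are all hypothetical, and sqlite3 with a thread pool stands in for whatever the library does internally.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4  # tiny on purpose; real chunks would be far larger


def write_chunks(db_path, name, payload, chunk_size=CHUNK_SIZE):
    """Split payload into fixed-size chunks and store one row per chunk."""
    conn = sqlite3.connect(db_path)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS chunks "
            "(name TEXT, seq INTEGER, body BLOB, PRIMARY KEY (name, seq))"
        )
        for seq, start in enumerate(range(0, len(payload), chunk_size)):
            conn.execute(
                "INSERT INTO chunks VALUES (?, ?, ?)",
                (name, seq, payload[start:start + chunk_size]),
            )
    conn.close()


def read_chunk(db_path, name, seq):
    """Fetch a single chunk using a connection private to this call."""
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT body FROM chunks WHERE name = ? AND seq = ?", (name, seq)
        ).fetchone()
        return row[0]
    finally:
        conn.close()


def read_parallel(db_path, name, workers=4):
    """Read all chunks concurrently and reassemble them in order."""
    conn = sqlite3.connect(db_path)
    count = conn.execute(
        "SELECT COUNT(*) FROM chunks WHERE name = ?", (name,)
    ).fetchone()[0]
    conn.close()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in submission order, so seq order is preserved.
        parts = pool.map(lambda seq: read_chunk(db_path, name, seq), range(count))
        return b"".join(parts)


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "chunks.sqlite")
    original = b"some binary image data"
    write_chunks(db_path, "image.png", original)
    restored = read_parallel(db_path, "image.png")
```

Each worker opens its own connection because sqlite3 connections are not shared across threads by default. Whether parallel reads pay off depends on the database backend and the chunk size.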

You may use the dataset_orm.files API to get a filesystem-like interface to binary data stored in the database, without the need to define a model:

from dataset_orm import files


files.write('myfile', b'some data')
files.read('myfile')
=> b'some data'

=> True

=> ['myfile']

=> ['myfile']


The convenience methods put() and get() make the files API even simpler to use:

files.put(b'some data', 'myfile')
data = files.get('myfile').read()
=> b'some data'
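
What a name-to-bytes store with put() and get() semantics might look like can be sketched on top of sqlite3. The `BlobStore` class, its `blobs` table, and its method bodies are hypothetical and are not the dataset_orm.files module. The sketch only mirrors the argument order shown above and the fact that get() returns something with a read() method.

```python
import io
import sqlite3


class BlobStore:
    """Hypothetical name -> bytes store; not the dataset_orm.files module."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS blobs (name TEXT PRIMARY KEY, body BLOB)"
        )

    def put(self, data, name):
        """Store bytes or a file-like object under the given name."""
        body = data.read() if hasattr(data, "read") else data
        with self.conn:  # commit the write
            self.conn.execute(
                "INSERT OR REPLACE INTO blobs VALUES (?, ?)", (name, body)
            )

    def get(self, name):
        """Return a file-like object, so callers can use .read()."""
        row = self.conn.execute(
            "SELECT body FROM blobs WHERE name = ?", (name,)
        ).fetchone()
        if row is None:
            raise FileNotFoundError(name)
        return io.BytesIO(row[0])

    def exists(self, name):
        row = self.conn.execute(
            "SELECT 1 FROM blobs WHERE name = ?", (name,)
        ).fetchone()
        return row is not None

    def list(self):
        rows = self.conn.execute("SELECT name FROM blobs ORDER BY name")
        return [name for (name,) in rows]


store = BlobStore(sqlite3.connect(":memory:"))
store.put(b"some data", "myfile")
data = store.get("myfile").read()
```

Returning a file-like object from get() lets callers stream or read the content with the same code they would use for a file on disk.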
