This is a pre-production deployment of Warehouse, however changes made here WILL affect the production instance of PyPI.
Latest Version Dependencies status unknown Test status unknown Test coverage unknown
Project Description

About

Pythonic access to tokyo cabinet table database api. (NOTE: The original cython code was from pykesto.) The aims is to provide a simple syntax to load and query data in a table. Most of the work is handled by the Col query interface. e.g.

>>> from totable import ToTable, Col
>>> tbl = ToTable('t.tct', 'w')
>>> result = tbl.select(Col('age') > 18, Col('name').startswith('T'))

to allow querying columns with numbers and letters transparently. Even though tokyo cabinet stores all values as strings. And more syntatic sugar below.

Install

first, install Tokyo-Cabinet source, then, from a the directory containing this file:

# requires cython for now.
$ cython src/ctotable.pyx
$ python setup.py build_ext -i

# test
$ PYTHONPATH=. python totable/tests/test_totable.py

# install
$ sudo python setup.py install

Example Use

Make some fake data. Note it works just like a DBM or dictionary, except that the values themselves are dictionaries.

>>> from totable import ToTable, Col
>>> tbl = ToTable('doctest.tct', 'w')
>>> fnames = ['fred', 'jane', 'john', 'mark', 'bill', 'ted', 'ann']
>>> lnames = ['smith', 'cox', 'kit', 'ttt', 'zzz', 'ark', 'ddk']
>>> for i in range(len(fnames)):
...     tbl[str(i)] = {'fname': fnames[i], 'lname': lnames[i],
...                    'age': str((10 + i) * 2)}
...     tbl[str(i + len(fnames))] = {'fname': fnames[i],
...                                  'lname': lnames[len(lnames) - i - 1],
...                                   'age': str((30 + i) * 2)}

>>> len(tbl)
14

Col

Col, as sent to the select method makes it easy to do queries on a database the format is Col(colname) == ‘Fred’ where colname is one of the keys in the dictionary items in the database. or can use kwargs to select()

>>> tbl.select(lname='cox')
[('1', {'lname': 'cox', 'age': '22', 'fname': 'jane'}), ('12', {'lname': 'cox', 'age': '70', 'fname': 'ted'})]

though using Col gives more power

startswith

>>> results = tbl.select(Col('fname').startswith('j'))
>>> [d['fname'] + ' ' + d['lname'] for k, d in results]
['jane cox', 'jane ark', 'john kit', 'john zzz']

endswith

#and combine queries by sending them in together.
>>> results = tbl.select(Col('fname').startswith('j'), Col('lname').endswith('k'))
>>> [d['fname'] + ' ' + d['lname'] for k, d in results]
['jane ark']

like

this works like an sql query with ‘%’ on either end. (dont attach those values to the query!). so to get everyone with and ‘e’ in their firstname…

>>> r = tbl.select(Col('fname').like('e'))
>>> sorted(set([v['fname'] for k, v in r]))
['fred', 'jane', 'ted']

in_list

return row that exactly match 1 of the values in the list.

>>> r = tbl.select(Col('fname').in_list(['ted', 'fred']))
>>> sorted(set([v['fname'] for k, v in r]))
['fred', 'ted']

>>> r = tbl.select(Col('age').in_list([20, 70]))
>>> sorted(set([v['age'] for k, v in r]))
['20', '70']

between

use for number querying between a min and max. includes the endpoints.

>>> r = tbl.select(Col('age').between(68, 70))
>>> [v['age'] for k, v in r]
['68', '70']

numeric queries (richcmp)

in TC, everything is stored as strings, but you can force number based comparisons with ToTable by using (you guessed it) a number. Or using a string for non-numeric comparisons.

>>> results = tbl.select(Col('age') > 68)
>>> [d['age'] for k, d in results]
['70', '72']

combining queries

just add multiple Col() arguments to the select() call and they will be essentially and’ed together.

>>> results = tbl.select(Col('age') > 68, Col('age') < 72)
>>> [d['age'] for k, d in results]
['70']

Negate(~)

for example get everything that’s not a given value…

>>> results = tbl.select(~Col('age') <= 68)
>>> [d['age'] for k, d in results]
['70', '72']

#all rows where fname is not 'jane'
>>> results = tbl.select(~Col('fname') != 'jane')
>>> 'jane' in [d['fname'] for k, d in results]
False

Regular Expression Matching

supports normal regular expression characters “[ $ ^ | ” , etc.

>>> results = tbl.select(Col('fname').matches("a"))
>>> sorted(set([d['fname'] for k, d in results]))
['ann', 'jane', 'mark']

>>> results = tbl.select(Col('fname').matches("^a"))
>>> sorted(set([d['fname'] for k, d in results]))
['ann']

Offset/Limit

just like SQL, yo.

>>> results = tbl.select(Col('age') < 68, limit=1)
>>> len(results)
1

order

currently only works for string keys. use ‘-‘ for descending and ‘+’ for ascending

>>> [v['fname'] for k, v in tbl.select(lname='cox', order='-fname')]
['ted', 'jane']

# ascending
>>> [v['fname'] for k, v in tbl.select(lname='cox', order='+fname')]
['jane', 'ted']

values

TC is a key-value store, but it also acts as a table. it may be convenient to get just the values as you’d expect from a database table. Note in all examples above, the ‘k’ey is not used, only the value dictionary. This can be made simpler with ‘values_only’. When ‘values_only’ is True, some python call overhead is removed as well.

::
>>> tbl.select(Col('fname').matches("^a"), values_only=True)
[{'lname': 'ddk', 'age': '32', 'fname': 'ann'}, {'lname': 'smith', 'age': '72', 'fname': 'ann'}]

Schemaless

since it’s schemaless, you can add anything

>>> tbl['weird'] = {"val": "hello"}
>>> tbl['weird']
{'val': 'hello'}

delete

delete as expected for a dictionary interface.

>>> del tbl['weird']
>>> print tbl.get('weird')
None

put

encapsulates put, putkeep and putcat with a mode kwarg that takes ‘p’ or ‘k’ or ‘c’ respectively.

>>> tbl.put('a', {'a': '1'}, mode='p')
>>> tbl.put('a', {'a': '2'}, mode='k')
'keep'
>>> assert tbl['a'] == {'a': '1'}

>>> tbl.put('b', {'a': '3'}, mode='k')
'put'

>>> tbl.put('a', {'b': '99'}, 'c')
>>> assert tbl['a'] == {'a': '1', 'b': '99'}

Performance Tuning

Tokyo Cabinet allows you to tune or optimize a table. the available parameters are:

  • bnum specifies the number of elements of the bucket array. Suggested size of ‘bnum’ is about from 0.5 to 4 times of the number of all records to be stored. default is about 132K.
  • apow specifies the size of record alignment by power of 2. The default value is 4 standing for 2^4=16.
  • fpow specifies the maximum number of elements of the free block pool by power of 2. The default value is 10 standing for 2^10=1024.
  • opts specifies options by bitwise-or (|):
    • ‘TDBTLARGE’ must be specified to use a database larger than 2GB. (you must also specify a config flag when compiling the TC library to enable this)
    • ‘TDBTDEFLATE’ use Deflate encoding.
    • ‘TDBTBZIP’ use BZIP2 encoding.
    • ‘TDBTTCBS’ use TCBS encoding.

The other parameters: cache and mmap_size are explained below.

tune

The arguments can be sent to the constructor.

>>> import totable
>>> t = ToTable("some.tct", 'w', bnum=1234, fpow=6, \
...                    opts=totable.TDBTLARGE | totable.TDBTBZIP)

>>> t.close()

optimize

optimize is called on an database opened with mode=’w’. if no arguments are specified, it will automatically adjust ‘bnum’ (only) according to the number of elements in the table.

>>> t = ToTable("some.tct", 'w')

# ... add some records ...
>>> t.optimize()
True

mmap_size

mmap_size is the size of mapped memory. default is 67,108,864 (64MB) set in the constructor. this is xmsiz in TC parlance.

>>> t.close()
>>> t = ToTable("some.tct", 'w', mmap_size=128 * 1e6) # ~128MB.

cache

TC also allows setting various caching parameters. * rcnum is the max number of records to be cached. default is 0 * lcnum is the max number of leaf-nodes to be cached. default is 4096 * ncnum is the max number of non-leaf nodes cached. default is 512 these also must be set in the constructor.

>>> t.close()
>>> t = ToTable("some.tct", 'w', rcnum=1e7, lcnum=32768)

index

create or delete a ‘s’tring or ‘d’ecimal index on a column for faster queries.

# create a decimal index on the number column 'age'.
>>> tbl.create_index('age', 'd')
True

# create a 'string index on the string column 'fname'.
>>> tbl.create_index('fname', 's')
True

# remove the index.
>>> tbl.delete_index('fname')
True

# optimize the index
>>> tbl.optimize_index('age')
True

clear

remove all records from the db.

>>> len(tbl)
16
>>> tbl.clear()
>>> len(tbl)
0

transaction

do stuff in a transaction. a rollback() is performed on any exceptions.

>>> try:
...     with transaction(tbl):
...         tbl['zzz'] = {'a': '4'}
...         1/0
... except: pass

>>> 'zzz' in tbl
False

See Also

Release History

Release History

0.1

This version

History Node

TODO: Figure out how to actually get changelog content.

Changelog content for this version goes here.

Donec et mollis dolor. Praesent et diam eget libero egestas mattis sit amet vitae augue. Nam tincidunt congue enim, ut porta lorem lacinia consectetur. Donec ut libero sed arcu vehicula ultricies a non tortor. Lorem ipsum dolor sit amet, consectetur adipiscing elit.

Show More

0.0.1

History Node

TODO: Figure out how to actually get changelog content.

Changelog content for this version goes here.

Donec et mollis dolor. Praesent et diam eget libero egestas mattis sit amet vitae augue. Nam tincidunt congue enim, ut porta lorem lacinia consectetur. Donec ut libero sed arcu vehicula ultricies a non tortor. Lorem ipsum dolor sit amet, consectetur adipiscing elit.

Show More

Download Files

Download Files

TODO: Brief introduction on what you do with files - including link to relevant help section.

File Name & Checksum SHA256 Checksum Help Version File Type Upload Date
totable-0.1.tar.gz (59.1 kB) Copy SHA256 Checksum SHA256 Source Jan 7, 2010

Supported By

WebFaction WebFaction Technical Writing Elastic Elastic Search Pingdom Pingdom Monitoring Dyn Dyn DNS HPE HPE Development Sentry Sentry Error Logging CloudAMQP CloudAMQP RabbitMQ Heroku Heroku PaaS Kabu Creative Kabu Creative UX & Design Fastly Fastly CDN DigiCert DigiCert EV Certificate Rackspace Rackspace Cloud Servers DreamHost DreamHost Log Hosting