souper

Souper - Generic Indexed Storage based on ZODB

ZODB Storage for lots of (light weight) data.

Utilizes:

Souper is a tool for programmers. It offers an integrated storage tied together with indexes in a catalog. The records in the storage are generic. Any data can be stored on a record as long as it can be pickled persistently in ZODB.

Souper can be used in any Python application, either standalone using the pure ZODB or with Pyramid, Zope or Plone.

Using Souper

Providing a Locator

Soups are looked up by adapting souper.interfaces.IStorageLocator to some context. Souper does not provide any default locator, so one needs to be provided first. Let's assume context is some persistent dict-like instance:

>>> from zope.interface import implementer
>>> from zope.interface import Interface
>>> from zope.component import provideAdapter
>>> from souper.interfaces import IStorageLocator
>>> from souper.soup import SoupData
>>> @implementer(IStorageLocator)
... class StorageLocator(object):
...
...     def __init__(self, context):
...         self.context = context
...
...     def storage(self, soup_name):
...         if soup_name not in self.context:
...             self.context[soup_name] = SoupData()
...         return self.context[soup_name]

>>> provideAdapter(StorageLocator, adapts=[Interface])

So we have a locator creating soups by name on the fly. Now it's easy to get a soup by name:

>>> from souper.soup import get_soup
>>> soup = get_soup('mysoup', context)
>>> soup
<souper.soup.Soup object at 0x...>

Providing a Catalog Factory

Depending on your needs, the catalog and its indexes may look different from use case to use case. The catalog factory is responsible for creating a catalog for a soup. The factory is a named utility implementing souper.interfaces.ICatalogFactory. The name of the utility has to be the same as the name of the soup.

Here repoze.catalog is used, and the NodeAttributeIndexer lets the indexes access the data on the records by key. For special cases one may write custom indexers, but the default one is fine most of the time:

>>> from souper.interfaces import ICatalogFactory
>>> from souper.soup import NodeAttributeIndexer
>>> from souper.soup import NodeTextIndexer
>>> from zope.component import provideUtility
>>> from repoze.catalog.catalog import Catalog
>>> from repoze.catalog.indexes.field import CatalogFieldIndex
>>> from repoze.catalog.indexes.text import CatalogTextIndex
>>> from repoze.catalog.indexes.keyword import CatalogKeywordIndex

>>> @implementer(ICatalogFactory)
... class MySoupCatalogFactory(object):
...
...     def __call__(self, context=None):
...         catalog = Catalog()
...         userindexer = NodeAttributeIndexer('user')
...         catalog[u'user'] = CatalogFieldIndex(userindexer)
...         textindexer = NodeTextIndexer(['text', 'user'])
...         catalog[u'text'] = CatalogTextIndex(textindexer)
...         keywordindexer = NodeAttributeIndexer('keywords')
...         catalog[u'keywords'] = CatalogKeywordIndex(keywordindexer)
...         return catalog

>>> provideUtility(MySoupCatalogFactory(), name="mysoup")

The catalog factory is only used internally by the soup, but it can be worth checking that it works:

>>> from zope.component import getUtility
>>> catalogfactory = getUtility(ICatalogFactory, name='mysoup')
>>> catalogfactory
<MySoupCatalogFactory object at 0x...>

>>> catalog = catalogfactory()
>>> sorted(catalog.items())
[(u'keywords', <repoze.catalog.indexes.keyword.CatalogKeywordIndex object at 0x...>),
(u'text', <repoze.catalog.indexes.text.CatalogTextIndex object at 0x...>),
(u'user', <repoze.catalog.indexes.field.CatalogFieldIndex object at 0x...>)]

Adding records

As mentioned above, souper.soup.Record is the one and only kind of data added to the soup. A record has attributes containing the data:

>>> from souper.soup import get_soup
>>> from souper.soup import Record
>>> soup = get_soup('mysoup', context)
>>> record = Record()
>>> record.attrs['user'] = 'user1'
>>> record.attrs['text'] = u'foo bar baz'
>>> record.attrs['keywords'] = [u'1', u'2', u'ü']
>>> record_id = soup.add(record)

A record may contain other records, but indexing them requires a custom indexer. Contained records are therefore usually valuable for later display, not for searching:

>>> record['homeaddress'] = Record()
>>> record['homeaddress'].attrs['zip'] = '6020'
>>> record['homeaddress'].attrs['town'] = 'Innsbruck'
>>> record['homeaddress'].attrs['country'] = 'Austria'
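
The custom indexer mentioned above could look roughly like the following sketch. It relies on the repoze.catalog convention that an index discriminator is a callable taking `(object, default)` and returning the value to index. `SubrecordAttributeIndexer` and `FakeRecord` are hypothetical names used only for illustration; `FakeRecord` merely stands in for souper.soup.Record so that the snippet runs without ZODB.

```python
class SubrecordAttributeIndexer:
    """Hypothetical indexer: reads one attribute from a contained record."""

    def __init__(self, child_name, attr):
        self.child_name = child_name
        self.attr = attr

    def __call__(self, context, default):
        # Return ``default`` when the child or the attribute is missing,
        # so the index can skip the record.
        try:
            child = context[self.child_name]
        except KeyError:
            return default
        return child.attrs.get(self.attr, default)


class FakeRecord(dict):
    """Stand-in for souper.soup.Record: dict of children plus ``attrs``."""

    def __init__(self):
        super().__init__()
        self.attrs = {}


record = FakeRecord()
record["homeaddress"] = FakeRecord()
record["homeaddress"].attrs["town"] = "Innsbruck"

town_indexer = SubrecordAttributeIndexer("homeaddress", "town")
town = town_indexer(record, None)               # value found on the child
missing = town_indexer(FakeRecord(), "n/a")     # no child, default returned
```

In a catalog factory, such an indexer would presumably be handed to an index in place of NodeAttributeIndexer, for example to a CatalogFieldIndex.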

Access data

Even without any query a record can be fetched by id:

>>> from souper.soup import get_soup
>>> soup = get_soup('mysoup', context)
>>> record = soup.get(record_id)

All records can be accessed using the container BTree:

>>> soup.data.keys()[0] == record_id
True

Query data

Querying a repoze catalog is well documented, and sorting works the same way. Queries are passed to the soup's query method, which in turn uses the repoze catalog. It returns a generator:

>>> from repoze.catalog.query import Eq
>>> [r for r in soup.query(Eq('user', 'user1'))]
[<Record object 'None' at ...>]

>>> [r for r in soup.query(Eq('user', 'nonexists'))]
[]

To also get the size of the result set, pass with_size=True to the query. The first item returned by the generator is the size:

>>> [r for r in soup.query(Eq('user', 'user1'), with_size=True)]
[1, <Record object 'None' at ...>]
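
Because the size arrives as the first item of the generator, it is usually convenient to separate it from the records before iterating. The following is a minimal sketch of that idea; `split_size` and `fake_query` are hypothetical helpers, and `fake_query` only imitates the shape of a `with_size=True` result so the snippet runs without a soup.

```python
def split_size(results):
    """Return (size, remaining_iterator) for a with_size=True style result."""
    iterator = iter(results)
    size = next(iterator)  # the first item is the result-set size
    return size, iterator


def fake_query():
    # Stand-in for soup.query(..., with_size=True): size first, then records.
    yield 2
    yield "record-a"
    yield "record-b"


size, records = split_size(fake_query())
found = list(records)  # only the records remain in the iterator
```

With a real soup, the generator returned by the query call would take the place of `fake_query()`.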

To optimize the handling of large result sets, one may choose not to fetch the records themselves but a generator returning lightweight objects. Records are fetched on call:

>>> lazy = [l for l in soup.lazy(Eq('user', 'user1'))]
>>> lazy
[<souper.soup.LazyRecord object at ...>]

>>> lazy[0]()
<Record object 'None' at ...>

Here too, the size is passed as the first value of the generator if with_size=True is passed.
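
One way to benefit from lazy results is to resolve only the records on the page currently being displayed. The sketch below illustrates this with stand-in callables instead of real LazyRecord objects; `fetch_page` and `make_lazy` are hypothetical names, and the helper assumes the size item is not part of the iterable (that is, with_size was not passed or the size was already taken off).

```python
from itertools import islice


def fetch_page(lazy_records, start, count):
    """Resolve only the lazy records on one page; leave the rest unfetched."""
    page = islice(lazy_records, start, start + count)
    return [lazy() for lazy in page]


resolved = []  # names of the records that were actually fetched


def make_lazy(name):
    # Stand-in for a LazyRecord: a callable that fetches the record on call.
    def fetch():
        resolved.append(name)
        return name
    return fetch


lazy_results = (make_lazy("record-%d" % i) for i in range(100))
page = fetch_page(lazy_results, start=10, count=3)
```

Only the three entries on the requested page are called; the skipped entries are never resolved.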

Delete a record

To remove a record from the soup, Python's del statement is used, as one would do on any dict:

>>> del soup[record]

Reindex

After a record's data has changed, the record needs to be reindexed:

>>> record.attrs['user'] = 'user1'
>>> soup.reindex(records=[record])
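
Why the reindex step matters can be illustrated with a toy model of a field index. `ToyFieldIndex` is a hypothetical class written for this illustration only; it is not how repoze.catalog implements its indexes, but it shows the general effect: the index keeps answering with the value it saw last until it is told about the change.

```python
class ToyFieldIndex:
    """Toy model of a field index: maps indexed values to record ids."""

    def __init__(self, attr):
        self.attr = attr
        self.ids_by_value = {}
        self.value_by_id = {}

    def index(self, record_id, attrs):
        # Drop the old entry first so a changed value does not linger.
        self.unindex(record_id)
        value = attrs.get(self.attr)
        self.ids_by_value.setdefault(value, set()).add(record_id)
        self.value_by_id[record_id] = value

    def unindex(self, record_id):
        if record_id in self.value_by_id:
            old = self.value_by_id.pop(record_id)
            self.ids_by_value[old].discard(record_id)

    def search(self, value):
        return sorted(self.ids_by_value.get(value, ()))


attrs = {"user": "user1"}
index = ToyFieldIndex("user")
index.index(1, attrs)

attrs["user"] = "user2"          # the data changes, the index is not told
stale = index.search("user2")    # the index still holds the old value
index.index(1, attrs)            # the "reindex" step
fresh = index.search("user2")
old = index.search("user1")
```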

Sometimes one may want to reindex all data. Then reindex has to be called without parameters. It may take a while:

>>> soup.reindex()

Rebuild catalog

Usually, after the catalog factory has changed (for example, an index was added), a rebuild of the catalog is needed. It replaces the current catalog with a new one created by the catalog factory and reindexes all data. It may take a while:

>>> soup.rebuild()

Reset (or clear) the soup

To remove all data from the soup and to empty and rebuild the catalog, call clear.

Attention: All data is lost!

>>> soup.clear()

Source Code

The sources are in a Git repository, with its main branches on GitHub.

We’d be happy to see many forks and pull-requests to make souper even better.

Contributors

Robert Niederreiter <rnix [at] squarewave [dot] at>
Jens W. Klein <jk [at] kleinundpartner [dot] at>

Changelog

1.1.2 (2022-12-05)

Release wheel. [rnix]

1.1.1 (2019-09-16)

Cleanup NodeTextIndexer (one loop is enough). [jensens]

1.1.0 (2019-03-08)

Code style (black, isort, utf8headers). [jensens]
Switched to tox for testing, buildout gone. [jensens]
Python 2/3 compatibility [agitator]

1.0.2 (2015-02-25)

fix: unicode with special chars in text indexer failed. [jensens, 2014-02-25]

1.0.1

PEP-8. [rnix, 2012-10-16]
Python 2.7 Support. [rnix, 2012-10-16]
Fix documentation.

1.0

make it work [rnix, jensens, et al]

Release history

1.1.2: Dec 5, 2022
1.1.1: Sep 16, 2019
1.1.0: Mar 8, 2019
1.0.2: Feb 25, 2015
1.0.1: Dec 5, 2012
1.0: Oct 7, 2012
1.0-beta1 (pre-release): Aug 28, 2012
