
shellinford

Shellinford is an implementation of a Wavelet Matrix/Tree succinct data structure for document retrieval.

It is based on the shellinford C++ library.
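For readers new to FM-indexes, the following is a minimal pure-Python sketch of the underlying idea: a Burrows-Wheeler transform plus "backward search" to count pattern occurrences. It is not shellinford's code. The function names `build_bwt` and `count_occurrences` are hypothetical, and the `rank` helper is a naive linear scan standing in for the fast rank queries that a wavelet matrix/tree provides.

```python
def build_bwt(text):
    """Return the Burrows-Wheeler transform of text ('\0' marks the end)."""
    text += "\0"
    # Sort all rotations and keep the last character of each (naive, O(n^2 log n)).
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def count_occurrences(bwt, pattern):
    """Count occurrences of pattern in the original text by backward search."""
    # first_pos[c]: index of the first row starting with c in the sorted rotations.
    first_pos = {}
    for i, ch in enumerate(sorted(bwt)):
        first_pos.setdefault(ch, i)

    def rank(ch, i):
        # Occurrences of ch in bwt[:i]. A wavelet tree answers this quickly;
        # this illustrative version simply scans the string.
        return bwt.count(ch, 0, i)

    lo, hi = 0, len(bwt)  # current range of matching rows, half-open
    for ch in reversed(pattern):
        if ch not in first_pos:
            return 0
        lo = first_pos[ch] + rank(ch, lo)
        hi = first_pos[ch] + rank(ch, hi)
        if lo >= hi:
            return 0
    return hi - lo


bwt = build_bwt("Milky Holmes Milky")
print(count_occurrences(bwt, "Milky"))   # 2
print(count_occurrences(bwt, "Holmes"))  # 1
```

The sketch only counts matches; a real FM-index for document retrieval also maps matching rows back to document ids.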

Installation

$ pip install shellinford

Usage

Create a new FM-index instance

>>> import shellinford
>>> fm = shellinford.FMIndex()
  • shellinford.FMIndex([use_wavelet_tree=True, filename=None])
    • When a filename is given, FM-index data is loaded from that file

Build FM-index

>>> fm.build(['Milky Holmes', 'Sherlock "Sheryl" Shellingford', 'Milky'], 'milky.fm')
  • build([docs, filename])
    • When a filename is given, the FM-index data is also stored in that file

Search for a word in the FM-index

>>> for doc in fm.search('Milky'):
...     print('doc_id: {0}'.format(doc.doc_id))
...     print('count: {0}'.format(doc.count))
...     print('text: {0}'.format(doc.text))
doc_id: 0
count: 1
text: Milky Holmes
doc_id: 2
count: 1
text: Milky

>>> for doc in fm.search(['Milky', 'Holmes']):
...     print('doc_id: {0}'.format(doc.doc_id))
...     print('count: {0}'.format(doc.count))
...     print('text: {0}'.format(doc.text))
doc_id: 0
count: 1
text: Milky Holmes
  • search(query, [_or=False, ignores=[]])
    • If _or is True, an “OR” search is executed; otherwise an “AND” search is executed
    • When ignores is given, a “NOT” search is also applied
    • NOTE: search() is available only after the FM-index has been built or loaded
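To make the AND/OR/NOT options concrete, here is a small pure-Python model of the semantics described above, using plain substring tests instead of an FM-index. The function `naive_search` is hypothetical and is not part of shellinford. Substring matching and the treatment of `ignores` as "exclude any document containing one of these words" are assumptions, so the library's exact results and ordering may differ.

```python
def naive_search(docs, query, _or=False, ignores=()):
    """Return (doc_id, text) pairs for documents matching query by substring."""
    # A single string is treated as a one-word query.
    words = [query] if isinstance(query, str) else list(query)
    combine = any if _or else all  # "OR" search vs. "AND" search

    hits = []
    for doc_id, text in enumerate(docs):
        # "NOT" search: skip documents containing any ignored word (assumption).
        if any(word in text for word in ignores):
            continue
        if combine(word in text for word in words):
            hits.append((doc_id, text))
    return hits


docs = ['Milky Holmes', 'Sherlock "Sheryl" Shellingford', 'Milky']
print(naive_search(docs, ['Milky', 'Holmes']))             # AND
print(naive_search(docs, ['Holmes', 'Sheryl'], _or=True))  # OR
print(naive_search(docs, 'Milky', ignores=['Holmes']))     # NOT
```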

Add a document

>>> fm.push_back('Baritsu')
  • push_back(doc)
    • NOTE: A document added by this method is not searchable until build() is called
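The note above suggests a simple pattern: push every document first, then build once. The helper below is a sketch of that pattern. The name `build_incrementally` is hypothetical, and calling build() with no arguments relies on the bracket notation in build([docs, filename]), which marks both parameters as optional. How build() treats documents from an earlier build is not stated here and should be checked against the library.

```python
def build_incrementally(fm, docs):
    """Push each document into fm, then build the index once at the end."""
    for doc in docs:
        fm.push_back(doc)  # queued only; not searchable yet
    fm.build()             # documents become searchable after this call
    return fm


# Usage with the real library (requires `pip install shellinford`):
#   import shellinford
#   fm = build_incrementally(shellinford.FMIndex(), ['Baritsu', 'Milky Holmes'])
#   results = fm.search('Baritsu')
```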

Read FM-index from a binary file

>>> fm.read('milky_holmes.fm')
  • read(path)

Write FM-index binary to a file

>>> fm.write('milky_holmes.fm')
  • write(path)
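Since an index can be stored and read back, a common pattern is to reuse the stored file when it exists and build it otherwise. The sketch below uses only the read(path) and build(docs, filename) calls documented above. The helper name `load_or_build` is hypothetical, and the index object is passed in so the import of shellinford stays with the caller.

```python
import os


def load_or_build(fm, docs, path):
    """Load the FM-index from path if it exists; otherwise build and store it."""
    if os.path.exists(path):
        fm.read(path)         # reuse the previously stored index
    else:
        fm.build(docs, path)  # build from docs and store the result at path
    return fm


# Usage with the real library (requires `pip install shellinford`):
#   import shellinford
#   fm = load_or_build(shellinford.FMIndex(), ['Milky Holmes', 'Milky'], 'milky.fm')
```

This sketch does not detect a stale file: if docs change, delete the file or rebuild explicitly.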

License

  • Wrapper code is licensed under the New BSD License.
  • Bundled shellinford C++ library (c) 2012 echizen_tm is licensed under the New BSD License.

CHANGES

0.3.4 (2016-10-28)

  • FMIndex.search() returns a list

0.3 (2014-11-24)

  • “OR” search and “NOT” search are available in FMIndex.search().
  • FMIndex.size and FMIndex.docsize are available as properties

0.2 (2014-03-28)

  • “AND” search is available by passing a sequence (list, tuple, etc.) to FMIndex.search()

0.1 (2014-03-11)

  • First release.

Download Files

shellinford-0.3.4.tar.gz (61.1 kB), source distribution, uploaded Oct 29, 2016
