A Python client for Sherlok

Project description

Sherlok is a flexible and powerful open-source, distributed, real-time text-mining engine.

pip install --upgrade sherlok

from sherlok import Sherlok

# returns a generator of tuples (begin, end, text, annotation_type, attributes{})
print(list(Sherlok().annotate('neuroner', 'layer 4 neuron')))

[(0, 14, 'layer 4 neuron', u'Neuron', {}),
 (8, 14, 'neuron',  u'Neuron', {}),
 (8, 14, 'neuron',  u'NeuronTrigger', {}),
 (0, 7,  'layer 4', u'Layer', {u'ontologyId': u'HBP_LAYER:0000004'})]
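
The tuples above can be processed with plain Python once they have been collected. The following is a minimal sketch, not part of the sherlok package: it reuses the sample output shown above as literal data (so no server is contacted), and the helpers group_by_type and offsets_match are hypothetical names introduced only for illustration.

```python
from collections import defaultdict

# Sample data copied from the output above; nothing here calls Sherlok.
text = 'layer 4 neuron'
annotations = [
    (0, 14, 'layer 4 neuron', u'Neuron', {}),
    (8, 14, 'neuron', u'Neuron', {}),
    (8, 14, 'neuron', u'NeuronTrigger', {}),
    (0, 7, 'layer 4', u'Layer', {u'ontologyId': u'HBP_LAYER:0000004'}),
]


def group_by_type(annotations):
    """Hypothetical helper: map each annotation type to its covered texts."""
    grouped = defaultdict(list)
    for begin, end, covered, annotation_type, attributes in annotations:
        grouped[annotation_type].append(covered)
    return dict(grouped)


def offsets_match(text, annotations):
    """Hypothetical helper: check that text[begin:end] equals the covered text."""
    return all(text[begin:end] == covered
               for begin, end, covered, _, _ in annotations)


grouped = group_by_type(annotations)
print(grouped)
print(offsets_match(text, annotations))
```

For this sample, the (begin, end) offsets slice the input string back to the covered text, which suggests end is exclusive, as in Python slicing.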


# filter annotations by type and recover the matched text
s = Sherlok()
txt = 'parvalbumin-positive fast-spiking basket cells, somatostatin-positive regular-spiking bipolar and multipolar cells, and cholecystokinin-positive irregular-spiking bipolar and multipolar cells'
morphology = s.select(s.annotate('neuroner', txt), u'Morphology')
for (start, end, text, annotation_type, properties) in morphology:
    # string formatting keeps the output identical under Python 2 and 3
    print('%s %s' % (text, properties[u'ontologyId']))

basket HBP_MORPHOLOGY:0000019
bipolar HBP_MORPHOLOGY:0000006
multipolar HBP_MORPHOLOGY:0000035
bipolar HBP_MORPHOLOGY:0000006
multipolar HBP_MORPHOLOGY:0000035
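
The output above repeats "bipolar" and "multipolar" because each term occurs twice in the input text. If only the distinct terms are needed, the pairs can be deduplicated after the fact. The sketch below is an assumption-laden illustration rather than a sherlok feature: unique_in_order is a hypothetical helper, and the pairs are copied from the printed output instead of being fetched from a server.

```python
def unique_in_order(pairs):
    """Hypothetical helper: drop repeated pairs, keeping first-seen order."""
    seen = set()
    result = []
    for pair in pairs:
        if pair not in seen:
            seen.add(pair)
            result.append(pair)
    return result


# (text, ontologyId) pairs copied from the printed output above.
pairs = [
    ('basket', 'HBP_MORPHOLOGY:0000019'),
    ('bipolar', 'HBP_MORPHOLOGY:0000006'),
    ('multipolar', 'HBP_MORPHOLOGY:0000035'),
    ('bipolar', 'HBP_MORPHOLOGY:0000006'),
    ('multipolar', 'HBP_MORPHOLOGY:0000035'),
]

unique = unique_in_order(pairs)
for term, ontology_id in unique:
    print('%s %s' % (term, ontology_id))
```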

Download files


Files for sherlok, version 0.1.5
Filename, size: sherlok-0.1.5.tar.gz (2.6 kB)
File type: Source
Python version: None
