piboso·PyPI

Sentence tagger for biomedical abstracts.

Project description

This module contains a fully standalone implementation of the PIBOSO tagger that won the ALTA 2012 Shared Task [1]. The features and algorithms used are described in [2].

Installing

The tagger (including a pre-trained model) is packaged as a Python module and distributed via PyPI. Installing it should be as simple as:

pip install piboso

Dependencies

hydrat [3] - automatically installed by pip
TreeTagger [4] - must be manually installed

Configuration

The path to the folder in which TreeTagger is located must be specified in a configuration file. When invoked, piboso_tag looks for a configuration file at ~/.pibosorc and ./.pibosorc. If neither exists, it generates a blank configuration file at ./.pibosorc. The path to TreeTagger should then be set in this configuration file.

An alternative location for reading the configuration file can be specified with the -c command-line option.
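The lookup described above can be sketched in Python. This is only an illustration, not piboso's own code: the function name find_config is hypothetical, and the order in which the two default locations are consulted is an assumption, since the text does not state which one takes precedence.

```python
from pathlib import Path

# Candidate locations named in the text above. The order in which piboso_tag
# itself consults them is an assumption made for this sketch.
DEFAULT_CANDIDATES = (Path.home() / ".pibosorc", Path.cwd() / ".pibosorc")


def find_config(explicit=None, candidates=DEFAULT_CANDIDATES):
    """Return the explicit path if given, else the first existing candidate, else None."""
    if explicit is not None:
        # Mirrors the -c option: an explicitly given path takes priority.
        return Path(explicit)
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

A return value of None corresponds to the case in which piboso_tag would generate a blank ./.pibosorc.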

Using the tagger

The tagger can be invoked with the script piboso_tag, which is automatically installed when the package is installed with pip. The simplest invocation is:

piboso_tag -o <OUTPUT_PATH> <FILE TO TAG> <FILE TO TAG> …

If no files are specified on the command line, piboso_tag will read STDIN and interpret each line as a path to a file to be tagged. More detailed information about invoking piboso_tag can be obtained by invoking

piboso_tag --help
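For callers who want to drive the script from Python, the following is a minimal sketch using the standard subprocess module. It uses only the -o and -c options mentioned above. The helper names build_command and tag_files are hypothetical, and the sketch assumes piboso_tag is on the PATH.

```python
import subprocess


def build_command(output_path, files=(), config_path=None):
    """Build the argument list for piboso_tag from the options described above."""
    cmd = ["piboso_tag", "-o", str(output_path)]
    if config_path is not None:
        cmd += ["-c", str(config_path)]
    cmd += [str(f) for f in files]
    return cmd


def tag_files(output_path, files, config_path=None, via_stdin=False):
    """Run piboso_tag; with via_stdin=True, paths are sent one per line on STDIN."""
    if via_stdin:
        cmd = build_command(output_path, config_path=config_path)
        stdin_text = "\n".join(str(f) for f in files) + "\n"
        return subprocess.run(cmd, input=stdin_text, text=True, check=True)
    cmd = build_command(output_path, files, config_path)
    return subprocess.run(cmd, check=True)
```

Passing paths on STDIN may be convenient when the list of files is too long for a single command line.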

Files are assumed to be sentence-tokenized and presented in a sentence-per-line format. The output produced by piboso_tag is in CSV format, for example:

subsample/1454068-1,background
subsample/1454068-2,background
subsample/1454068-3,outcome
subsample/1454088-1,background
subsample/1454088-2,background
subsample/1454088-3,background
subsample/1454088-4,background

The first item in each record is the path of the file and the sentence number separated by a dash. Sentences are enumerated from 1. The second item is the label assigned to the sentence.
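The record layout described above can be read back with Python's csv module. The sketch below is one possible way to do it: the names TaggedSentence and parse_output are hypothetical, and splitting on the last dash is a choice made here so that file paths containing dashes are still handled.

```python
import csv
import io
from collections import namedtuple

TaggedSentence = namedtuple("TaggedSentence", "path sentence_number label")


def parse_output(lines):
    """Yield one TaggedSentence per non-empty CSV record."""
    for row in csv.reader(lines):
        if not row:
            continue
        identifier, label = row
        # Split on the last dash only, since the path itself may contain dashes.
        path, _, number = identifier.rpartition("-")
        yield TaggedSentence(path, int(number), label)


sample = io.StringIO(
    "subsample/1454068-1,background\n"
    "subsample/1454068-3,outcome\n"
)
for record in parse_output(sample):
    print(record.path, record.sentence_number, record.label)
```

In practice, an open file handle on the output path can be passed to parse_output in place of the in-memory sample.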

Contact

Marco Lui <mhlui@unimelb.edu.au>

[1] http://alta.asn.au/events/sharedtask2012/
[2] http://aclweb.org/anthology-new/U/U12/U12-1019.pdf
[3] http://hydrat.googlecode.com
[4] http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/

Project details

This version

Mar 27, 2013

Download files

Source Distribution

piboso-1.tar.gz (24.3 MB)

File metadata

Download URL: piboso-1.tar.gz
Upload date: Mar 27, 2013
Size: 24.3 MB
Tags: Source
Uploaded using Trusted Publishing? No

File hashes

Hashes for piboso-1.tar.gz
Algorithm	Hash digest
SHA256	`c140fe777b25167e7ed1b983f3df1a87508b5d3e2a87c426d393d745d82b9c1f`
MD5	`93d4def5b66add4cd10b1131715321fd`
BLAKE2b-256	`658050c408f34d67477d86fffc62dba6ddb499c7d322fafbb2534b5f06139ddc`
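A downloaded copy of piboso-1.tar.gz can be checked against the SHA256 digest listed above using Python's hashlib. This is a minimal sketch: the function names sha256_of and verify are hypothetical, and the file is assumed to have been downloaded to a local path of the caller's choosing.

```python
import hashlib

# SHA256 digest listed above for piboso-1.tar.gz.
EXPECTED_SHA256 = "c140fe777b25167e7ed1b983f3df1a87508b5d3e2a87c426d393d745d82b9c1f"


def sha256_of(path, chunk_size=1 << 20):
    """Compute the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest matches the expected value."""
    return sha256_of(path) == expected.lower()
```

Reading in chunks keeps memory use small even for the 24.3 MB archive.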
