Sequence Tagger for Partially Annotated Dataset in PyTorch

This is a CRF tagger for partially annotated datasets, implemented in PyTorch. It lets you train a CRF with the marginal log likelihood (Tsuboi et al., 2008), which sums over every tag sequence consistent with the available annotations. The implementation is based on Torch-Struct (Rush, 2020).

Usage

First, import the required modules.

import torch

from partial_tagger.crf.nn import CRF
from partial_tagger.crf import functional as F

Initialize CRF by giving it the number of tags.

num_tags = 2
crf = CRF(num_tags)

Prepare an incomplete tag sequence (a partial annotation) and convert it to a tag bitmap.
The tag bitmap is the target value for the CRF.

# 0 and 1 are known (annotated) tag indices
# -1 marks a position whose tag is unknown
incomplete_tags = torch.tensor([[0, 1, 0, 1, -1, -1, -1, 1, 0, 1]])

tag_bitmap = F.to_tag_bitmap(incomplete_tags, num_tags=num_tags, partial_index=-1)
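
To make the idea of a tag bitmap concrete, here is a minimal plain-Python sketch. It assumes the bitmap records, for each position, which tags are still allowed: exactly one tag where the annotation is known, and every tag where it is unknown. The helper name `sketch_tag_bitmap` is hypothetical and is not part of partial_tagger; the tensor returned by `F.to_tag_bitmap` may differ in dtype and layout.

```python
def sketch_tag_bitmap(tags, num_tags, partial_index=-1):
    """Return a nested list of booleans shaped (len(tags), num_tags).

    Illustrative only: this is an assumption about what a tag bitmap
    encodes, not the library's implementation.
    """
    bitmap = []
    for tag in tags:
        if tag == partial_index:
            # Unknown position: every tag is allowed.
            bitmap.append([True] * num_tags)
        else:
            # Known position: only the annotated tag is allowed.
            bitmap.append([index == tag for index in range(num_tags)])
    return bitmap


# One known 0, one known 1, and one unknown position.
bitmap = sketch_tag_bitmap([0, 1, -1], num_tags=2)
```

Under this assumption, a fully annotated sequence yields a one-hot row per position, and a fully unannotated sequence yields rows that allow every tag.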

Compute the marginal log likelihood from logits.

batch_size = 1
sequence_length = 10
# Dummy logits
logits = torch.randn(batch_size, sequence_length, num_tags)

log_potentials = crf(logits)

loss = F.marginal_log_likelihood(log_potentials, tag_bitmap).sum().neg()
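
The quantity being optimized can be illustrated with a brute-force sketch of the marginal log likelihood from Tsuboi et al. (2008): the log of the total score mass of all tag sequences compatible with the partial annotation, minus the log partition function over all sequences. The code below is plain Python and enumerates every path, so it is only practical for tiny inputs. The function names and the split of scores into `unary` and `transition` tables are assumptions made for illustration; they do not describe how partial_tagger computes the value.

```python
import itertools
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    highest = max(values)
    return highest + math.log(sum(math.exp(v - highest) for v in values))


def path_score(path, unary, transition):
    """Score of one tag sequence: per-position scores plus transition scores."""
    score = sum(unary[i][tag] for i, tag in enumerate(path))
    score += sum(transition[a][b] for a, b in zip(path, path[1:]))
    return score


def brute_force_marginal_log_likelihood(unary, transition, partial_tags,
                                        partial_index=-1):
    """log p(partial annotation | input) by enumerating every tag sequence."""
    num_tags = len(transition)
    all_paths = list(itertools.product(range(num_tags), repeat=len(unary)))
    # A path is compatible if it agrees with every known tag.
    compatible = [
        path for path in all_paths
        if all(gold == partial_index or gold == tag
               for tag, gold in zip(path, partial_tags))
    ]
    log_partition = logsumexp(
        [path_score(p, unary, transition) for p in all_paths])
    log_compatible = logsumexp(
        [path_score(p, unary, transition) for p in compatible])
    return log_compatible - log_partition


# Uniform scores: 3 positions, 2 tags, middle position unknown.
unary = [[0.0, 0.0] for _ in range(3)]
transition = [[0.0, 0.0], [0.0, 0.0]]
value = brute_force_marginal_log_likelihood(unary, transition, [0, -1, 1])
```

With uniform scores, 2 of the 8 possible paths match the annotation, so `value` equals `log(2/8)`. When every position is unknown the result is 0, and when every position is known it reduces to the ordinary CRF log likelihood. Negating the summed value, as in the snippet above, gives a loss to minimize.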

Installation

To install this package:

pip install partial-tagger

References

  • Yuta Tsuboi, Hisashi Kashima, Shinsuke Mori, Hiroki Oda, and Yuji Matsumoto. 2008. Training Conditional Random Fields Using Incomplete Annotations. In Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008), pages 897–904, Manchester, UK. Coling 2008 Organizing Committee.
  • Alexander Rush. 2020. Torch-Struct: Deep Structured Prediction Library. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 335–342, Online. Association for Computational Linguistics.
