Sequence Tagger for Partially Annotated Dataset in PyTorch
This is a CRF tagger for partially annotated datasets in PyTorch. It lets you train a CRF with the marginal log likelihood (Tsuboi et al., 2008), so positions without a gold tag can stay unlabeled. The implementation is based on Rush (2020).
Usage
First, import the required modules.
import torch

from partial_tagger.crf.nn import CRF
from partial_tagger.crf import functional as F
Initialize the CRF by giving it the number of tags.
num_tags = 2
crf = CRF(num_tags)
Prepare an incomplete tag sequence (a partial annotation) and convert it to a tag bitmap.
The tag bitmap is the target value for the CRF.
# 0 and 1 are known tag indices (0 to num_tags - 1)
# -1 marks a position whose tag is unknown
incomplete_tags = torch.tensor([[0, 1, 0, 1, -1, -1, -1, 1, 0, 1]])
tag_bitmap = F.to_tag_bitmap(incomplete_tags, num_tags=num_tags, partial_index=-1)
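To make the idea of a tag bitmap concrete, here is a minimal plain-PyTorch sketch. It assumes the bitmap is a boolean tensor of shape (batch_size, sequence_length, num_tags) in which a known position allows only its gold tag and an unknown position allows every tag. The helper name `sketch_tag_bitmap` is hypothetical and is not part of this library; the tensor returned by `F.to_tag_bitmap` may differ in dtype or layout.

```python
import torch


def sketch_tag_bitmap(
    tags: torch.Tensor, num_tags: int, partial_index: int = -1
) -> torch.Tensor:
    """Hypothetical helper: build a boolean mask of allowed tags per position."""
    unknown = tags == partial_index
    # Swap the unknown marker for a valid index so that one_hot accepts it.
    safe_tags = tags.masked_fill(unknown, 0)
    bitmap = torch.nn.functional.one_hot(safe_tags, num_tags).bool()
    # An unknown position allows every tag.
    return bitmap | unknown.unsqueeze(-1)


example_tags = torch.tensor([[0, 1, -1]])
example_bitmap = sketch_tag_bitmap(example_tags, num_tags=2)
print(example_bitmap)
```

Under this assumption, the first two positions are one-hot rows and the third row is all True.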
Compute the marginal log likelihood from logits, then negate its sum to obtain the loss.
batch_size = 1
sequence_length = 10
# Dummy logits
logits = torch.randn(batch_size, sequence_length, num_tags)
log_potentials = crf(logits)
loss = F.marginal_log_likelihood(log_potentials, tag_bitmap).sum().neg()
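The marginal log likelihood of Tsuboi et al. (2008) is the log of the total score of all tag sequences consistent with the partial annotation, minus the log of the total score of all tag sequences. The sketch below illustrates this with a plain forward algorithm. It is not this library's implementation: the function names are hypothetical, and it assumes a simplified parameterization with per-position emission scores of shape (batch_size, sequence_length, num_tags) and a single transition matrix of shape (num_tags, num_tags), rather than the `log_potentials` tensor returned by `crf(logits)`.

```python
from typing import Optional

import torch


def sketch_log_partition(
    emissions: torch.Tensor,
    transitions: torch.Tensor,
    mask: Optional[torch.Tensor] = None,
) -> torch.Tensor:
    """Hypothetical helper: log-sum of path scores, optionally restricted by a mask."""
    if mask is not None:
        # Disallowed tags get a score of -inf, so paths through them contribute nothing.
        emissions = emissions.masked_fill(~mask, float("-inf"))
    alpha = emissions[:, 0]  # (batch_size, num_tags)
    for t in range(1, emissions.size(1)):
        # Sum over the previous tag i in log space: alpha[b, i] + transitions[i, j].
        scores = alpha.unsqueeze(2) + transitions.unsqueeze(0)
        alpha = torch.logsumexp(scores, dim=1) + emissions[:, t]
    return torch.logsumexp(alpha, dim=1)  # (batch_size,)


def sketch_marginal_log_likelihood(
    emissions: torch.Tensor, transitions: torch.Tensor, tag_bitmap: torch.Tensor
) -> torch.Tensor:
    """Hypothetical helper: log Z(consistent paths) - log Z(all paths)."""
    constrained = sketch_log_partition(emissions, transitions, tag_bitmap)
    return constrained - sketch_log_partition(emissions, transitions)


torch.manual_seed(0)
emissions = torch.randn(1, 4, 2)
transitions = torch.randn(2, 2)
# Positions 0 and 3 are annotated with tag 0; positions 1 and 2 are unknown.
bitmap = torch.tensor(
    [[[True, False], [True, True], [True, True], [True, False]]]
)
mll = sketch_marginal_log_likelihood(emissions, transitions, bitmap)
```

Two sanity checks follow from the definition: when every position is unknown the value is 0, and when every position is annotated it reduces to the ordinary CRF log likelihood of that single tag sequence.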
Installation
To install this package:
pip install partial-tagger
References
- Yuta Tsuboi, Hisashi Kashima, Shinsuke Mori, Hiroki Oda, and Yuji Matsumoto. 2008. Training Conditional Random Fields Using Incomplete Annotations. In Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008), pages 897–904, Manchester, UK. Coling 2008 Organizing Committee.
- Alexander Rush. 2020. Torch-Struct: Deep Structured Prediction Library. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 335–342, Online. Association for Computational Linguistics.