Format transformer tool for doccano

doccano-transformer

Doccano Transformer helps you transform a dataset exported from doccano into the format expected by your favorite machine learning library.

Supported formats

Doccano Transformer supports the following formats:

  • CoNLL 2003
  • spaCy

Install

To install doccano-transformer, simply use pip:

pip install doccano-transformer

Examples

Named Entity Recognition

The following formats are supported:

  • CoNLL 2003
  • spaCy

from doccano_transformer.datasets import NERDataset
from doccano_transformer.utils import read_jsonl

dataset = read_jsonl(filepath='example.jsonl', dataset=NERDataset, encoding='utf-8')
dataset.to_conll2003(tokenizer=str.split)
dataset.to_spacy(tokenizer=str.split)
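
To make the idea behind the CoNLL 2003 conversion concrete, here is a minimal, standalone sketch of how character-offset annotations can be turned into BIO tags with a whitespace tokenizer. It does not use doccano-transformer and is not its implementation. The record layout (a `text` field plus `labels` as `[start, end, label]` triples) is an assumption about the exported JSONL and may differ between doccano versions; the helper names `whitespace_spans` and `to_bio` are hypothetical.

```python
import json


def whitespace_spans(text):
    """Yield (token, start, end) for each whitespace-separated token."""
    pos = 0
    for token in text.split():
        start = text.index(token, pos)  # locate the token at or after pos
        end = start + len(token)
        yield token, start, end
        pos = end


def to_bio(record):
    """Return (token, tag) pairs for one record with character-offset labels.

    Assumes record["labels"] holds [start, end, label] triples (an assumption
    about the export format, not something the library guarantees).
    """
    pairs = []
    for token, start, end in whitespace_spans(record["text"]):
        tag = "O"
        for ent_start, ent_end, label in record["labels"]:
            # A token is tagged only if it lies fully inside the entity span.
            if start >= ent_start and end <= ent_end:
                tag = ("B-" if start == ent_start else "I-") + label
                break
        pairs.append((token, tag))
    return pairs


# One JSONL line in the assumed layout.
line = '{"text": "Barack Obama visited Paris", "labels": [[0, 12, "PER"], [21, 26, "LOC"]]}'
record = json.loads(line)
bio_pairs = to_bio(record)
for token, tag in bio_pairs:
    print(token, tag)
```

With a plain whitespace tokenizer such as `str.split`, punctuation stays attached to the neighboring word, so a token like `Paris.` would extend past the annotated span and be tagged `O` in this sketch. A tokenizer that splits off punctuation avoids that mismatch.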

Contribution

We encourage you to contribute to doccano transformer! Please check out the Contributing to doccano transformer guide for guidelines about how to proceed.

License

MIT

