Library that adds FoLiA (format for linguistic annotation) support to spaCy

Convert spaCy output to FoLiA XML documents. FoLiA documents are also supported as input.

Installation

$ pip install spacy2folia

You also need to install the spaCy models you want to use, for example:

$ python -m spacy download en_core_web_sm

Usage Example

Using the command line tool on an input file named test.txt:

$ spacy2folia --model en_core_web_sm test.txt

This results in a document test.folia.xml in the current working directory.

You can also invoke the command line tool on one or more FoLiA documents as input:

$ spacy2folia --model en_core_web_sm document.folia.xml

The output file will be written to the current working directory, so it may overwrite the input if the input is in the same directory!
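To guard against that overwrite, the tool can be run from a separate output directory. The following is a minimal sketch, not part of spacy2folia itself: the helper names (expected_output, would_overwrite, run_safely) are hypothetical, and the output naming rule is an assumption inferred from the examples above (test.txt becomes test.folia.xml, while a FoLiA input keeps its file name).

```python
from pathlib import Path
import subprocess


def expected_output(input_path, workdir):
    """Guess where spacy2folia will write its result (assumed naming rule)."""
    name = Path(input_path).name
    # Assumption: "test.txt" -> "test.folia.xml"; "doc.folia.xml" keeps its name.
    if name.endswith(".folia.xml"):
        out_name = name
    else:
        out_name = Path(name).stem + ".folia.xml"
    return Path(workdir).resolve() / out_name


def would_overwrite(input_path, workdir):
    """True if the guessed output path is the input file itself."""
    return expected_output(input_path, workdir) == Path(input_path).resolve()


def run_safely(input_path, outdir, model="en_core_web_sm"):
    """Run the command line tool from `outdir` so the input stays untouched."""
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    if would_overwrite(input_path, outdir):
        raise ValueError(f"{input_path} would be overwritten; choose another outdir")
    # Only the --model option shown above is used; the input is passed as an
    # absolute path because the working directory changes.
    subprocess.run(
        ["spacy2folia", "--model", model, str(Path(input_path).resolve())],
        cwd=outdir,
        check=True,
    )
    return expected_output(input_path, outdir)
```

Calling run_safely("document.folia.xml", "out") would then leave the original file alone and, if the naming assumption holds, return the path of the annotated copy under out/.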

Usage from Python:

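import spacy
from spacy2folia import spacy2folia

text = "Input text goes here"

# Load the spaCy model and run its pipeline on the text
nlp = spacy.load("en_core_web_sm")
doc = nlp(text)

# Convert the spaCy Doc to a FoLiA document and write it to disk
foliadoc = spacy2folia.convert(doc, "example", paragraphs=True)
foliadoc.save("/tmp/output.folia.xml")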
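When converting many files, the second argument to convert ("example" above) has to be chosen per document. Assuming that argument is used as the FoLiA document ID, and therefore ends up in an xml:id attribute, it must be a valid XML name without colons and may not start with a digit. The helper below is a hypothetical sketch (make_document_id is not part of spacy2folia) that derives a conservative, ASCII-only ID from a file name.

```python
import re
from pathlib import Path


def make_document_id(filename):
    """Derive an ASCII-only ID from a file name that is safe to use as an xml:id."""
    # Keep only the part before the first dot: "notes.folia.xml" -> "notes".
    stem = Path(filename).name.split(".")[0]
    # Replace anything outside a conservative subset of allowed name characters.
    cleaned = re.sub(r"[^A-Za-z0-9_-]", "_", stem)
    # An xml:id may not start with a digit or hyphen, so add a prefix when needed.
    if not cleaned or not (cleaned[0].isalpha() or cleaned[0] == "_"):
        cleaned = "doc_" + cleaned
    return cleaned
```

For instance, make_document_id("my report.txt") gives "my_report", which could then be passed in place of "example" under the assumption stated above.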

Usage from Python with FoLiA input:

import spacy
import folia.main as folia
from spacy2folia import spacy2folia

# Load an existing FoLiA document
foliadoc = folia.Document(file="/tmp/input.folia.xml")

# Annotate the FoLiA document in place with the spaCy pipeline, then save it
nlp = spacy.load("en_core_web_sm")
spacy2folia.convert_folia(foliadoc, nlp)
foliadoc.save("/tmp/output.folia.xml")
