
Library that adds FoLiA (format for linguistic annotation) support to spaCy

Project description


Convert spaCy output to FoLiA XML documents. FoLiA input is also supported.

Installation

$ pip install spacy2folia

You also need to install the spaCy models you want to use, for example:

$ python -m spacy download en_core_web_sm

Usage Example

Using the command line tool on an input file named test.txt:

$ spacy2folia --model en_core_web_sm test.txt

This results in a document test.folia.xml in the current working directory.

You can also invoke the command line tool on one or more FoLiA documents as input:

$ spacy2folia --model en_core_web_sm document.folia.xml

The output file is written to the current working directory, so it may overwrite the input if the input is in that same directory!
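One way to sidestep the overwrite risk is to run the tool from a separate output directory. The following is only a sketch: it uses nothing beyond the `--model` option shown above, and the helper names `build_command` and `run_in_output_dir` are made up for illustration and are not part of spacy2folia. It relies on the behaviour described above, namely that output lands in the current working directory.

```python
import os
import subprocess
from pathlib import Path


def build_command(model, inputs):
    """Build the spacy2folia argument list (hypothetical helper)."""
    # Absolute input paths keep working after the working directory changes.
    return ["spacy2folia", "--model", model] + [os.path.abspath(p) for p in inputs]


def run_in_output_dir(model, inputs, outdir):
    """Run spacy2folia with outdir as the working directory (hypothetical helper)."""
    out = Path(outdir)
    out.mkdir(parents=True, exist_ok=True)
    # Assumption: the tool writes its output to the current working directory,
    # so changing cwd keeps the inputs untouched as long as outdir differs
    # from the directory that holds them.
    subprocess.run(build_command(model, inputs), cwd=out, check=True)
    return out
```

For example, `run_in_output_dir("en_core_web_sm", ["document.folia.xml"], "converted")` would leave the original file alone and place the result under `converted/`, provided the tool behaves as assumed.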

Usage from Python:

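import spacy
from spacy2folia import spacy2folia

text = "Input text goes here"

# Load the spaCy model and run the pipeline on the text
nlp = spacy.load("en_core_web_sm")
doc = nlp(text)
# Convert the spaCy Doc to a FoLiA document and write it to disk
foliadoc = spacy2folia.convert(doc, "example", paragraphs=True)
foliadoc.save("/tmp/output.folia.xml")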
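The same calls can be wrapped in a loop to convert several text files. This is a rough sketch, not part of the library: `make_document_id` and `convert_files` are hypothetical helpers, and it assumes that the second argument of `spacy2folia.convert` is used as the FoLiA document ID (as the `"example"` value above suggests). Because identifiers in XML may not contain spaces or start with a digit, the sketch derives a conservative ID from each file name.

```python
import re
from pathlib import Path


def make_document_id(name: str) -> str:
    """Turn an arbitrary name into a conservative XML-safe identifier."""
    # Keep ASCII letters, digits, '_', '-' and '.'; replace everything else.
    safe = re.sub(r"[^A-Za-z0-9_.-]", "_", name)
    # XML names may not start with a digit, '-' or '.'.
    if not safe or not (safe[0].isalpha() or safe[0] == "_"):
        safe = "_" + safe
    return safe


def convert_files(paths, model="en_core_web_sm", outdir="."):
    """Convert plain-text files to FoLiA, one output document per input."""
    # Imported here so the helper above stays usable without spaCy installed.
    import spacy
    from spacy2folia import spacy2folia

    nlp = spacy.load(model)  # load the model once and reuse it for every file
    written = []
    for path in map(Path, paths):
        doc = nlp(path.read_text(encoding="utf-8"))
        doc_id = make_document_id(path.stem)
        # Assumption: the second argument is the FoLiA document ID.
        foliadoc = spacy2folia.convert(doc, doc_id, paragraphs=True)
        target = Path(outdir) / f"{doc_id}.folia.xml"
        foliadoc.save(str(target))
        written.append(target)
    return written
```

Loading the model once outside the loop avoids repeating the most expensive step for every file.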

Usage from Python with FoLiA input:

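import spacy
import folia.main as folia
from spacy2folia import spacy2folia

# Load an existing FoLiA document and the spaCy model
foliadoc = folia.Document(file="/tmp/input.folia.xml")
nlp = spacy.load("en_core_web_sm")
# Run the spaCy pipeline over the FoLiA document, then save the result
spacy2folia.convert_folia(foliadoc, nlp)
foliadoc.save("/tmp/output.folia.xml")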


Source Distribution

Spacy2FoLiA-0.3.4.tar.gz (5.6 kB)
