Library that adds FoLiA (Format for Linguistic Annotation) support to spaCy
Project description
Convert spaCy output to FoLiA XML documents. FoLiA input is also supported.
Installation
$ pip install spacy2folia
You also need to install the spaCy models you want to use, for example:
python -m spacy download en_core_web_sm
Usage Example
Using the command line tool on an input file named test.txt:
$ spacy2folia --model en_core_web_sm test.txt
This results in a document test.folia.xml in the current working directory.
You can also invoke the command line tool on one or more FoLiA documents as input:
$ spacy2folia --model en_core_web_sm document.folia.xml
The output file is written to the current working directory, so it may overwrite the input if the input is in that same directory!
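One way to sidestep the overwrite risk is to run the tool from a separate output directory. The following is a minimal sketch, not part of spacy2folia: the helper names `build_command` and `run_in_output_dir` are hypothetical, and it relies only on the `--model` option and the "writes to the current working directory" behaviour described above.

```python
import subprocess
from pathlib import Path


def build_command(model, inputs):
    """Build the spacy2folia command line, using absolute input paths."""
    files = [str(Path(p).resolve()) for p in inputs]
    return ["spacy2folia", "--model", model, *files]


def run_in_output_dir(model, inputs, outdir):
    """Run spacy2folia with outdir as its working directory (hypothetical helper)."""
    outdir = Path(outdir).resolve()
    for p in inputs:
        # An input that lives in the output directory could be overwritten.
        if Path(p).resolve().parent == outdir:
            raise ValueError(f"{p} is inside the output directory and may be overwritten")
    outdir.mkdir(parents=True, exist_ok=True)
    # The tool writes to the current working directory, so point cwd at outdir.
    return subprocess.run(build_command(model, inputs), cwd=outdir, check=True)
```

For example, `run_in_output_dir("en_core_web_sm", ["document.folia.xml"], "converted")` would leave the original file untouched, assuming the tool behaves as described above.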
Usage from Python:
import spacy
from spacy2folia import spacy2folia
text = "Input text goes here"
nlp = spacy.load("en_core_web_sm")
doc = nlp(text)
foliadoc = spacy2folia.convert(doc, "example", paragraphs=True)
foliadoc.save("/tmp/output.folia.xml")
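When converting several text files, each document needs its own identifier. The sketch below assumes that the second argument to `convert` ("example" above) is the FoLiA document ID and that such IDs should be XML-name-like (starting with a letter or underscore). Both `make_doc_id` and `convert_text_file` are hypothetical helpers, not part of the library.

```python
import re
from pathlib import Path


def make_doc_id(path):
    """Derive an XML-name-like ID from a file name (assumed ID requirements)."""
    stem = Path(path).stem
    # Replace anything outside a conservative ASCII subset with an underscore.
    cleaned = re.sub(r"[^A-Za-z0-9_.-]", "_", stem)
    # Prefix IDs that do not start with a letter or underscore.
    if not re.match(r"[A-Za-z_]", cleaned):
        cleaned = "doc_" + cleaned
    return cleaned


def convert_text_file(path, nlp, outdir):
    """Convert one plain-text file with an already loaded spaCy pipeline."""
    # Imported lazily so make_doc_id can be used without spacy2folia installed.
    from spacy2folia import spacy2folia

    text = Path(path).read_text(encoding="utf-8")
    doc = nlp(text)
    foliadoc = spacy2folia.convert(doc, make_doc_id(path), paragraphs=True)
    target = Path(outdir) / (Path(path).stem + ".folia.xml")
    foliadoc.save(str(target))
    return target
```

Loading the model once with `spacy.load` and passing the resulting `nlp` object to `convert_text_file` for each file avoids reloading it per document.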
Usage from Python with FoLiA input:
import spacy
import folia.main as folia
from spacy2folia import spacy2folia
foliadoc = folia.Document(file="/tmp/input.folia.xml")
nlp = spacy.load("en_core_web_sm")
spacy2folia.convert_folia(foliadoc, nlp)
foliadoc.save("/tmp/output.folia.xml")
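The same pattern extends to several FoLiA documents. This is a sketch under stated assumptions: it uses only the calls shown above (`folia.Document`, `convert_folia`, `save`), while `output_path_for` and `convert_many` are hypothetical helpers that write results to a separate directory so the inputs are never overwritten.

```python
from pathlib import Path


def output_path_for(input_path, outdir):
    """Return a save path inside outdir, refusing to clobber the input file."""
    src = Path(input_path).resolve()
    dst = Path(outdir).resolve() / src.name
    if dst == src:
        raise ValueError(f"refusing to overwrite {src}")
    return dst


def convert_many(input_paths, outdir, model="en_core_web_sm"):
    """Annotate several FoLiA documents using one loaded spaCy model."""
    # Imported lazily so output_path_for works without the NLP stack installed.
    import spacy
    import folia.main as folia
    from spacy2folia import spacy2folia

    nlp = spacy.load(model)  # load once, reuse for every document
    Path(outdir).mkdir(parents=True, exist_ok=True)
    written = []
    for path in input_paths:
        target = output_path_for(path, outdir)
        foliadoc = folia.Document(file=str(path))
        spacy2folia.convert_folia(foliadoc, nlp)
        foliadoc.save(str(target))
        written.append(target)
    return written
```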
Download files
Source Distribution
File details
Details for the file Spacy2FoLiA-0.3.4.tar.gz.
File metadata
- Download URL: Spacy2FoLiA-0.3.4.tar.gz
- Upload date:
- Size: 5.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | ef983c93f5809677cd2a46602e4440ccf4cf3a270a35bed1c18d3d0280d8e275
MD5 | 8e11f9074c68d2f1e2e301faa4483913
BLAKE2b-256 | b3331039e2309a400283602fb95d65045f4f9c0bf6e821ebfd45eacf40277dcb