cldflex

Convert FLEx data to CLDF-ready CSV.

Many descriptive linguists keep annotated language data in a FLEx (SIL's FieldWorks Language Explorer) database, perhaps the most popular and accessible tool for assisted segmentation and annotation. However, a reasonably complete data export is only available in XML, which is not human-friendly and is not readily converted to other formats. A data format growing in popularity is the CLDF (Cross-Linguistic Data Formats) standard, a table-based approach with human-readable datasets, designed for use in CLLD apps and easily processed by any software that can read CSV files, including R, pandas, or spreadsheet applications. The goal of cldflex is to convert lexicon and corpus data stored in FLEx to CSV tables, primarily for use in CLDF datasets.
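As a rough illustration of why CSV tables are convenient to work with, the sketch below reads a small table using Python's standard csv module. The column names (ID, Form, Meaning) and the sample rows are hypothetical; they stand in for a table and are not the exact files or columns that cldflex writes.

```python
import csv
import io

# Hypothetical sample standing in for a CSV table. The column names and
# rows are assumptions for illustration, not actual cldflex output.
SAMPLE_CSV = """ID,Form,Meaning
m1,ka,water
m2,-ri,PL
"""


def load_rows(handle):
    """Read a CSV table into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


rows = load_rows(io.StringIO(SAMPLE_CSV))

# Build a simple lookup from row ID to form.
forms_by_id = {row["ID"]: row["Form"] for row in rows}
print(forms_by_id)
```

For a file on disk, the same function can be called on a handle opened with open(path, newline="", encoding="utf-8").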

Installation

cldflex is available on PyPI:

pip install cldflex

Usage

At the moment, there are two commands: cldflex flex2csv processes .flextext (corpora), and cldflex lift2csv processes .lift (lexica) files. Both commands create a number of CSV files. One can either use cldfbench to create one's own CLDF datasets from these files, or add the --cldf argument to create (simple) datasets.

Project-specific configuration can be passed via --conf your/config.yaml

flex2csv

Basic usage:

cldflex flex2csv texts.flextext

Connect the corpus with the lexicon:

cldflex flex2csv texts.flextext --lexicon lexicon.lift

Create a CLDF dataset:

cldflex flex2csv texts.flextext --lexicon lexicon.lift --cldf
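When several corpora need converting, the invocations above can be scripted. The following is a minimal sketch that assembles the same command line from Python and runs it with subprocess. The helper names build_flex2csv_command and run_flex2csv are hypothetical and not part of cldflex; only the flex2csv subcommand and the --lexicon and --cldf flags shown above are used.

```python
import subprocess


def build_flex2csv_command(flextext, lexicon=None, cldf=False):
    """Assemble the argument list for a `cldflex flex2csv` call.

    Hypothetical helper: it only mirrors the command lines shown in
    this README and is not part of cldflex itself.
    """
    cmd = ["cldflex", "flex2csv", str(flextext)]
    if lexicon is not None:
        cmd += ["--lexicon", str(lexicon)]
    if cldf:
        cmd.append("--cldf")
    return cmd


def run_flex2csv(flextext, **options):
    """Run the command; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_flex2csv_command(flextext, **options), check=True)


# Show the command without executing it.
print(build_flex2csv_command("texts.flextext", lexicon="lexicon.lift", cldf=True))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.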

lift2csv

Extract morphemes, morphs, and entries from lexicon.lift:

cldflex lift2csv lexicon.lift

Create a CLDF dataset with a Dictionary module:

cldflex lift2csv lexicon.lift --cldf
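Before converting, it can be useful to check what a .lift file contains. The sketch below lists entry IDs, forms, and glosses with Python's standard XML parser. It assumes the usual LIFT layout (entry, lexical-unit/form/text, sense/gloss/text) and uses a made-up two-entry sample. It is not how cldflex itself parses the file, and real exports typically contain several forms, senses, and further fields per entry.

```python
import xml.etree.ElementTree as ET

# Made-up sample following the usual LIFT layout; the entries, IDs, and
# language codes are assumptions for illustration only.
SAMPLE_LIFT = """<lift version="0.13">
  <entry id="e1">
    <lexical-unit><form lang="xx"><text>ka</text></form></lexical-unit>
    <sense id="s1"><gloss lang="en"><text>water</text></gloss></sense>
  </entry>
  <entry id="e2">
    <lexical-unit><form lang="xx"><text>-ri</text></form></lexical-unit>
    <sense id="s2"><gloss lang="en"><text>PL</text></gloss></sense>
  </entry>
</lift>
"""


def list_entries(lift_xml):
    """Return (id, first form, first gloss) for each entry in a LIFT string."""
    root = ET.fromstring(lift_xml)
    entries = []
    for entry in root.iter("entry"):
        # findtext returns the first match, or the default if none exists.
        form = entry.findtext("lexical-unit/form/text", default="")
        gloss = entry.findtext("sense/gloss/text", default="")
        entries.append((entry.get("id"), form, gloss))
    return entries


for entry_id, form, gloss in list_entries(SAMPLE_LIFT):
    print(entry_id, form, gloss)
```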
