EDITABLE French dictionaries from Laboratoire d'Automatique Documentaire et Linguistique (LADL)

Installation

pip install dict-fr-AU-DELA

DESCRIPTION

Starting from the original inflected form DELA French dictionary, provided by the former Laboratoire d'Automatique Documentaire et Linguistique (LADL), now integrated into Institut Gaspard Monge (IGM) of the Université Gustave Eiffel, this repository contains:

  • modified dictionary data, publicly editable here;
  • a Python package gathering the results for exploitation by other tools.

The selected original dictionary is the inflected form DELA French dictionary in UTF-16 LE encoding, dated March 16, 2006, with 683,824 simple entries for 102,073 different lemmas and 108,436 compound entries for 83,604 different lemmas.

FILES

All files are installed in Python's /usr/local equivalent, under share/dict.
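
As a rough sketch of how another tool might locate these files, the snippet below assumes they land in share/dict under sys.prefix, or under the per-user base for a --user install. The helper names candidate_dict_dirs and find_dict_file are illustrative and are not part of this package.

```python
import os
import site
import sys


def candidate_dict_dirs():
    """Return directories where share/dict is likely to live (assumed layout)."""
    dirs = [os.path.join(sys.prefix, "share", "dict")]
    user_dir = os.path.join(site.getuserbase(), "share", "dict")
    if user_dir not in dirs:
        dirs.append(user_dir)
    return dirs


def find_dict_file(name, dirs=None):
    """Return the full path of the first matching file, or None if absent."""
    for directory in (dirs or candidate_dict_dirs()):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    # Prints a path if the package is installed in one of the assumed places.
    print(find_dict_file("dict-fr-AU-DELA.unicode"))
```

Returning None rather than raising lets a caller fall back to another word list when the package is not installed.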

Original files

Filename                  Description
dict-fr-AU-DELA-License   Lesser General Public License For Linguistic Resources

Generated files

Filename                                        Description
dict-fr-AU-DELA                                 Modified inflected form DELA French dictionary in UTF-8 encoding with Unix-style line endings
dict-fr-AU-DELA.ascii                           French words and compound words list (unaccented)
dict-fr-AU-DELA.unicode                         French words and compound words list (accented), 742,889 entries
dict-fr-AU-DELA.combined                        French words and compound words list (with both accented and unaccented words)
dict-fr-AU-DELA-proper_nouns.ascii              French proper nouns list (unaccented, sometimes compounded)
dict-fr-AU-DELA-proper_nouns.unicode            French proper nouns list (accented, sometimes compounded), 823 entries
dict-fr-AU-DELA-proper_nouns.combined           French proper nouns list (with both accented and unaccented words, sometimes compounded)
dict-fr-AU-DELA-common-words.ascii              French common words list (unaccented)
dict-fr-AU-DELA-common-words.unicode            French common words list (accented), 641,759 entries
dict-fr-AU-DELA-common-words.combined           French common words list (with both accented and unaccented words)
dict-fr-AU-DELA-common-compound-words.ascii     French common compound words list (unaccented)
dict-fr-AU-DELA-common-compound-words.unicode   French common compound words list (accented), 100,320 entries
dict-fr-AU-DELA-common-compound-words.combined  French common compound words list (with both accented and unaccented words)
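
A spell(1)-like lookup over one of these lists could be sketched as follows. This assumes each list holds one entry per line in UTF-8, which is the usual layout for word lists but is not stated above; load_words, load_word_file and is_known are hypothetical helper names.

```python
def load_words(lines):
    """Build a set of entries from an iterable of lines, skipping blank ones."""
    return frozenset(line.rstrip("\n") for line in lines if line.strip())


def load_word_file(path):
    """Read a word list from disk (assumed: UTF-8, one entry per line)."""
    with open(path, encoding="utf-8") as handle:
        return load_words(handle)


def is_known(word, words):
    """Check a word as given, then lowercased (e.g. at the start of a sentence)."""
    return word in words or word.lower() in words
```

A frozenset keeps membership tests fast even for the larger lists, at the cost of holding every entry in memory.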

Besides manual edits, the generated files other than dict-fr-AU-DELA itself went through the following transformations:

  • removal of escape backslashes
  • removal of lemma and grammatical info from dict-fr-AU-DELA
  • lossless conversion of accents for the *.ascii versions
  • combination of the *.ascii and *.unicode versions into the *.combined ones (without duplicates)
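
The four steps above can be sketched roughly as below. This is an illustration, not the script used to build the package. It assumes the usual DELA line layout (inflected form, a comma, then the lemma and grammatical codes), assumes a backslash escapes the character that follows it, and uses Unicode decomposition plus a small ligature table as one possible way to convert accents. All function names are hypothetical.

```python
import unicodedata

# Assumed ligature expansions; combining accents are handled by decomposition.
LIGATURES = str.maketrans({"œ": "oe", "Œ": "OE", "æ": "ae", "Æ": "AE"})


def unescape(text):
    """Remove escape backslashes, keeping the character each one protects."""
    out = []
    chars = iter(text)
    for ch in chars:
        if ch == "\\":
            ch = next(chars, "")
        out.append(ch)
    return "".join(out)


def inflected_form(line):
    """Keep only the text before the first unescaped comma, unescaped."""
    i = 0
    while i < len(line):
        if line[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if line[i] == ",":
            break
        i += 1
    return unescape(line[:i])


def to_ascii(word):
    """Expand ligatures, then drop combining accent marks."""
    decomposed = unicodedata.normalize("NFD", word.translate(LIGATURES))
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def combine(accented, unaccented):
    """Merge both lists without duplicates, in sorted order."""
    return sorted(set(accented) | set(unaccented))
```

For example, under these assumptions a line such as abaissé,abaisser.V:Kms would yield abaissé for the accented list and abaisse for the unaccented one.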

SEE ALSO

spell(1)-like tools, anagram(6), conjuguer(1)

HISTORY

DELA stands for "Dictionnaire Electronique du LADL" (LADL's electronic dictionary). These dictionaries were initiated by the lab's founder, Maurice Gross.

This modified version of the original DELA dictionary was necessary because our PNU project's conjuguer command made it clear that there were errors in some verb conjugations.

It was naturally called AU-DELA, a pun meaning beyond DELA ("au-delà" in French being translated as "beyond").

I wrote a history of Unix & French dictionaries (in French only), which covers this dictionary and many others.

LICENSE

The original contents, as well as this package, are licensed under the Lesser General Public License For Linguistic Resources.

AUTHORS

Laboratoire d'Automatique Documentaire et Linguistique (LADL) for the original contents.

Hubert Tournier for the package and some initial changes.

The GitHub community for further changes.

Download files

Source Distribution

dict-fr-AU-DELA-2021.9.9.tar.gz (19.1 MB)

Uploaded Source

Built Distribution

dict_fr_AU_DELA-2021.9.9-py3-none-any.whl (19.0 MB)

Uploaded Python 3

File details

Details for the file dict-fr-AU-DELA-2021.9.9.tar.gz.

File metadata

  • Download URL: dict-fr-AU-DELA-2021.9.9.tar.gz
  • Upload date:
  • Size: 19.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.2 CPython/3.8.11

File hashes

Hashes for dict-fr-AU-DELA-2021.9.9.tar.gz
Algorithm Hash digest
SHA256 7b8764d1ff74fd83ffebd95a451184c2b6d1b5cf2f2e57fd540409ce03077df6
MD5 48d131b45536657e4cdbf1d092cb2a4d
BLAKE2b-256 314b774f623b09bca2317d8296bf2647d056c07ce0d7f11b013e7c989a3ac03c
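
A downloaded archive can be checked against the published SHA256 digest before installing. The following is a minimal sketch using only the standard library; the function names are illustrative, and the path passed to matches is wherever the download was saved.

```python
import hashlib
import io


def sha256_hex(stream, chunk_size=1 << 20):
    """Return the SHA256 hex digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's SHA256 digest with the expected hex string."""
    with open(path, "rb") as handle:
        return sha256_hex(handle) == expected_hex.lower()


# Quick self-check on in-memory data (well-known digest prefix of b"abc").
assert sha256_hex(io.BytesIO(b"abc")).startswith("ba7816bf")
```

Reading in chunks keeps memory use flat, which matters little for a 19 MB archive but costs nothing.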

File details

Details for the file dict_fr_AU_DELA-2021.9.9-py3-none-any.whl.

File metadata

  • Download URL: dict_fr_AU_DELA-2021.9.9-py3-none-any.whl
  • Upload date:
  • Size: 19.0 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.2 CPython/3.8.11

File hashes

Hashes for dict_fr_AU_DELA-2021.9.9-py3-none-any.whl
Algorithm Hash digest
SHA256 47367285bd407147985da96d740c7516048e1b534e05aa6d529f192a2d175931
MD5 d906c078a15376c5239a256cb9c8994c
BLAKE2b-256 e1d84696a1ec5e242c50fcbdea5f8bbf8124dbfdc0a683eaf1f0edd425f64ac1
