A French-English text realizer
pyrealb - A Python Bilingual Text Realizer
Version 3.2.2 - April 2025
pyrealb is a Python adaptation of the JavaScript jsRealB text realizer with the same constituent and dependency syntax notation. It can be integrated into a Python application by simply adding
from pyrealb import *
Version 3.0.0 was a major code reorganization, without any new feature, that clearly separates the language-dependent parts from the language-independent ones. This organization is described here.
The use of pyrealb for Bilingual Data-to-text generation is described in this document.
Installing the distribution package from PyPI
pip install pyrealb
Caution: do not forget the b at the end of pyrealb. On PyPI, there is an unrelated package pyreal for evaluating and deploying human readable machine learning explanations.
Upgrading the version
pip install pyrealb --upgrade
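Because the package name is easy to mistype, it can help to confirm which distribution actually got installed. The following is a minimal sketch using the standard library importlib.metadata; the helper name installed_version is only an illustration and is not part of pyrealb.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "pyrealb" is the realizer; "pyreal" is the unrelated package mentioned above.
    for name in ("pyrealb", "pyreal"):
        found = installed_version(name)
        print(name, "->", found if found else "not installed")
```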
Building and installing the package from the sources
- cd into this directory (with the pyproject.toml file)
- Build the distribution package
  python3 -m build
- Install with
  python3 -m pip install .
First realization tests at the Python 3 prompt
- from pyrealb import *
- loadEn()
- print(S(Pro("I").g("f"),VP(V("say"),"hello",PP(P("to"),NP(D("the"),N("world"))))))
  - this should print
    She says hello to the world.
- print(root(V("say").t("ps"),subj(Pro("him").c("nom")),comp(N("goodbye"))).typ({"neg":True}))
  - this should print
    He did not say goodbye.
Use pyrealb in a Jupyter notebook
Directories
- src/pyrealb
  - __init__.py: import classes and functions and export relevant symbols.
  - Constituent.py: Constituent is the top class for methods shared between Phrases and Terminals
  - ConstituentEn.py, ConstituentFr.py: English and French specific processing of Constituent
  - Dependent.py: subclass of Constituent for creating complex phrases using dependencies
  - DependentEn.py, DependentFr.py: English and French specific processing of Dependent
  - lemmatize.py: function for building the lemmata maps
  - Lexicon.py: class to access lexicon entries and syntactic rules
  - LICENSE.txt: Apache 2.0 License
  - NonTerminalEn.py, NonTerminalFr.py: language dependent processing common to Phrase and Dependent
  - Number.py: utility functions for dealing with number formatting
  - Phrase.py: subclass of Constituent for creating complex phrases
  - PhraseEn.py, PhraseFr.py: English and French specific processing of Phrase
  - Terminal.py: subclass of Constituent for creating a single unit (most often a single word)
  - TerminalEn.py, TerminalFr.py: English and French specific processing of Terminal
  - utils.py: some useful functions
- ./src/pyrealb/data: these resources are identical to the corresponding files in jsRealB
  - LICENSE.txt: Creative Commons license
  - lexicon-en.json: English lexicon (33,932 entries) in json format
  - rule-en.js: English conjugation and declension tables
  - lexicon-fr.json: French lexicon (52,547 entries) in json format
  - rule-fr.js: French conjugation and declension tables
  - lexicon-en.jsonrnc, lexicon-fr.jsonrnc: json-rnc schemas for the lexicons
  - lexicon-en.jsonrnc.json, lexicon-fr.jsonrnc.json: standard JSON schemas corresponding to the json-rnc schemas for the lexicons; these files are created by the validation process.
Nota bene:
- In the following directories, the __init__.py file is used to set the appropriate search path for pyrealb functions; this ensures that the current Python source files are used for execution and testing.
- Some directories include markup.py, which should be loaded using pip. Unfortunately, I never managed to make this "piped" version work: it does not import the name oneliner although it should. It works only if the file is in the local directory.
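The search-path adjustment mentioned in the first note can be pictured roughly as follows. This is only a sketch of the general technique, not the content of the actual __init__.py files; the function name prepend_to_search_path and the src location used in the example are assumptions.

```python
import os
import sys


def prepend_to_search_path(directory):
    """Put `directory` first on sys.path so local sources shadow an installed copy."""
    directory = os.path.abspath(directory)
    if directory in sys.path:
        sys.path.remove(directory)  # avoid a duplicate entry before moving it to the front
    sys.path.insert(0, directory)
    return directory


if __name__ == "__main__":
    # Hypothetical layout: the package sources live in ./src under the current directory.
    print(prepend_to_search_path(os.path.join(os.getcwd(), "src")))
```

With the source directory at the front of sys.path, a later import of pyrealb would pick up the working copy rather than a version installed with pip.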
- docs: The html and image files should be copied to http://www.iro.umontreal.ca/~lapalme/pyrealb/ which is used for convenient web access.
  - English and French documentation
    - documentation.html: generated documentation, DO NOT EDIT directly (Online version)
    - documentation.py: Python program for generating documentation.html using markup.py
    - style.css: style sheet for the documentation
    - userinfos.py: definitions of variables containing the examples
    - user.js: JavaScript helper script.
  - Supplements to the documentation for specific aspects. Edit a Markdown (.md) file and use the Makefile for generating the html version.
    - Hacking-pyrealb.md: tricks of the trade for dynamic constituent structure modification
    - Lexicon-Format-en.md, Lexicon-Format-fr.md: language specific detailed documentation about lexicon entries.
    - Realizer-Architecture.md: description of the class organisation of the jsRealB/pyrealb ecosystem
    - Makefile:
      - make all: update the documentation after changing documentation.py or any Markdown file
      - make export: list the files that should be present on the web consultation directory
- IDE: Integrated Development Environment
  - ide.py: built on the Python read-eval-print loop, it imports pyrealb to get the realization of an expression, to consult the lexicon, the conjugation and declension tables. It is also possible to get a lemmatization: i.e. the pyrealb expression corresponding to a form.
  - README.html: documentation and examples

  Nota bene: The evaluation demo of jsRealB is more convenient than this IDE to develop pyrealb expressions as both programs share the same formalism. The jsRealB demo provides an editor and access to the lexicons and rules.
- Notebooks: Jupyter notebooks (in English and French) which can be used as an executable introduction to pyrealb
- tests: unit tests of special features of pyrealb in both French and English. They are designed to be launched with pytest. Files have the pattern test_*_{en|fr}.py.
  - README.md: more details on the organisation and use of the test files
  - test_all.sh: run this file in a terminal to run all test files of the directory
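To run or inspect the tests for one language only, the naming pattern above can be used to sort file names. The sketch below assumes the pattern means "test_, any non-empty middle part, then _en or _fr"; the function name and the sample file names are hypothetical.

```python
import re

# Assumed reading of the pattern test_*_{en|fr}.py: a non-empty middle part
# followed by a language suffix.
TEST_FILE = re.compile(r"^test_(?P<topic>.+)_(?P<lang>en|fr)\.py$")


def group_tests_by_language(filenames):
    """Map "en" and "fr" to the sorted test files written for that language."""
    groups = {"en": [], "fr": []}
    for name in filenames:
        match = TEST_FILE.match(name)
        if match:
            groups[match.group("lang")].append(name)
    return {lang: sorted(names) for lang, names in groups.items()}


if __name__ == "__main__":
    # Hypothetical file names, used only to exercise the pattern.
    sample = ["test_numbers_en.py", "test_numbers_fr.py", "README.md", "test_all.sh"]
    print(group_tests_by_language(sample))
```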
Demos
- 99bottlesofbeer/99bottlesofbeer.py: simple generation of a classic repetitive text in English.
- basketball/sportsettsum.py: generation of French and English basketball summaries (paper describing the approach)
- Bilinguo/bilinguo.py: generation of translation drill exercises
- dev_example/dev_example.py: examples of English and French expressions to be realized and checked against expected output, useful for debugging when adding a new expression and enabling tracing.
- eliza/eliza-talk.py: French version of Eliza. It illustrates some interesting features of pyrealb. See this document (in French) for an explanation and rationale. It is a Python translation of this jsRealB demo.
- evenements/evenements.py: description (in French) of a list of events; it creates HTML.
- flight_infos/README.md: development of a RASA NLG server giving information about flights, aircraft, etc.
- gen_from_words.py: generation of English and French sentences from a plain list of words, adding some structure.
- gen_stanza_uds/*.py: various programs used for generating sentences that help the Stanza lemmatizer learn new inflected forms in French but also in English.
- gophypi/amr2text.py: generate a literal reading of an AMR (Abstract Meaning Representation) (paper describing the approach)
- inflection/inflection.py: French or English conjugation and declension of a form.
- kilometresapied/kilometresapied.py: simple generation of a classic repetitive text in French.
- methodius/methodius.py: generation of English sentences from a logical form expressed in XML.
- randomgen/randomgen.py: generation of random English sentences
- RDFpyrealb/WebGenerate.py: generation from RDF triples
- report/report.py: single sentence parameterized by language, tense and subject using two different program organizations
- variantes/variantes.py: French or English sentences realized with all possible sentence modifiers; some challenging examples are in examples.py.
- weather/Bulletin.py: French and English weather bulletins generated from information in a json-line file (weather-data.jsonl). It uses the packages in the Realization directory.
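For readers unfamiliar with the json-line format used by the weather demo, each line of such a file is one complete JSON value. The following generic sketch reads such a stream with the standard library; it is not the code of Bulletin.py, and the field names in the sample records are hypothetical.

```python
import io
import json


def read_json_lines(stream):
    """Yield one Python object per non-blank line of a JSON Lines stream."""
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            yield json.loads(line)
        except json.JSONDecodeError as err:
            raise ValueError(f"line {line_number}: invalid JSON ({err.msg})") from err


if __name__ == "__main__":
    # Hypothetical records; the real fields of weather-data.jsonl are not shown here.
    sample = io.StringIO('{"id": 1, "lang": "en"}\n\n{"id": 2, "lang": "fr"}\n')
    for record in read_json_lines(sample):
        print(record["id"], record["lang"])
```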
Licenses
- pyrealb source code is licensed under Apache-2.0
- Linguistic resources in the ./data directory are licensed under CC-BY-SA-4.0
Contact
Acknowledgement
Thanks to Fabrizio Gotti, François Lareau and Ludan Stoeckle for interesting suggestions.
For the maintainer mainly
Updating package version on PyPI
see this tutorial
These steps take for granted that the password for PyPI has already been given...
- Update version number in setup.cfg (it should be the same as python_version in src/pyrealb/utils.py and at the beginning of this document).
- Run docs/documentation.py to update the version number in docs/documentation.html
- Commit pyrealb on GitHub
- cd into the directory with the pyproject.toml file (the same as this README.md)
- Build the distribution package
  python3 -m build
- Upload to PyPI the last version I.J.K
  twine upload dist/*-I.J.K.*
- Install new version from PyPI
  python3 -m pip install pyrealb --upgrade
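The first step asks for the same version number in several places, which is easy to get wrong by hand. A small check along the following lines could be run before building. It is a sketch under two assumptions that are not confirmed by this document: that setup.cfg stores the number as version in a [metadata] section, and that utils.py assigns python_version as a quoted string at the start of a line. All function names are hypothetical.

```python
import configparser
import re


def version_from_setup_cfg(text):
    """Read `version` from the [metadata] section of setup.cfg content."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser.get("metadata", "version", fallback=None)


def version_from_utils(text):
    """Find an assignment such as python_version = "1.2.3" in Python source text."""
    match = re.search(r"^python_version\s*=\s*(['\"])(.+?)\1", text, re.MULTILINE)
    return match.group(2) if match else None


def versions_agree(setup_cfg_text, utils_text):
    """True only when both versions are present and identical."""
    cfg = version_from_setup_cfg(setup_cfg_text)
    src = version_from_utils(utils_text)
    return cfg is not None and cfg == src


if __name__ == "__main__":
    # Placeholder contents; in practice, read setup.cfg and src/pyrealb/utils.py.
    cfg_text = "[metadata]\nname = demo\nversion = 1.2.3\n"
    utils_text = 'python_version = "1.2.3"\n'
    print("versions agree" if versions_agree(cfg_text, utils_text) else "version mismatch")
```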
Useful trick for debugging with breakpoints and tracing in PyCharm
- add the pyrealb expression to debug at the end of demo/dev_example/dev_example.py
- comment out the line calling testPreviousExamples()
- debug demo/dev_example/dev_example.py
Source Distribution
File details
Details for the file pyrealb-3.2.4.tar.gz.
File metadata
- Download URL: pyrealb-3.2.4.tar.gz
- Upload date:
- Size: 766.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2c037829dcedc6a1972b2565db034ec348536a3d5fb514a7a508a474eddc6e00 |
| MD5 | 82e9323adbb99614010b30a3081be42e |
| BLAKE2b-256 | 965308133a4da74c5feaa3bb7787418ef6fe642452a033f28ee998622ad53990 |
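A downloaded archive can be checked against the SHA256 digest listed above. The sketch below uses the standard library hashlib; the function names are illustrative, and the local file path is an assumption about where the archive was saved.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed local path: adjust it to where the archive was saved.
    archive = "pyrealb-3.2.4.tar.gz"
    expected = "2c037829dcedc6a1972b2565db034ec348536a3d5fb514a7a508a474eddc6e00"
    if os.path.exists(archive):
        print("digest matches" if matches_expected(archive, expected) else "digest differs")
    else:
        print("archive not found:", archive)
```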