
A Free, Commonsense-Enriched Natural Language Understander for English

MontyLingua is a free*, commonsense-enriched, end-to-end natural language
understander for English. Feed raw English text into MontyLingua, and the output
will be a semantic interpretation of that text. Perfect for information
retrieval and extraction, request processing, and question answering. From
English sentences, it extracts subject/verb/object tuples, extracts adjectives,
noun phrases and verb phrases, and extracts people's names, places, events,
dates and times, and other semantic information. MontyLingua makes traditionally
difficult language processing tasks trivial!

Version 2.1 is substantially FASTER, MORE ACCURATE, and MORE RELIABLE than
version 1.3.1. It has now been tested on Windows, many flavors of UNIX, and
Mac OS X, and with several flavors of Java, and is in use by several university
research projects and in several commercial settings.

MontyLingua differs from other natural language processing tools because:

* it is complete end-to-end: input raw text; output semantic interpretation
* it is not many dated tools and implementations sewn together; it is one
  well-integrated implementation
* it does not require "training" and other fidgeting, and will work right
  out of the box
* it is enriched with "common sense" knowledge about the everyday world,
  allowing it to escape many stupid interpretive mistakes, e.g.:
    o "(NX the/DT mosquito/NN bit/NN NX) (NX the/DT boy/NN NX)" ==corrected==>
    o "(NX the/DT mosquito/NN NX) (VX bit/VBD VX) (NX the/DT boy/NN NX)"
* it is lightweight and portable across platforms, written in portable
  Python and also available as a compiled Java library
* it is easy to customize, since it allows for a user lexicon
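
To make the mosquito example concrete, here is a minimal Python sketch that reads the chunked notation shown above and looks for a subject/verb/object pattern. It is not MontyLingua's own code or API: the names `parse_chunks` and `first_svo` are hypothetical, and the "noun chunk, verb chunk, noun chunk" rule is a deliberately naive assumption used only for illustration.

```python
import re

# Matches one chunk such as "(NX the/DT boy/NN NX)" and captures
# its label (NX or VX) and the tagged tokens inside it.
CHUNK_RE = re.compile(r"\((NX|VX) (.*?) \1\)")

def parse_chunks(chunked):
    """Return a list of (label, [(word, tag), ...]) pairs from a chunked string."""
    chunks = []
    for label, body in CHUNK_RE.findall(chunked):
        tokens = [tuple(tok.rsplit("/", 1)) for tok in body.split()]
        chunks.append((label, tokens))
    return chunks

def chunk_words(chunk):
    """Join the words of one parsed chunk, dropping the tags."""
    return " ".join(word for word, _tag in chunk[1])

def first_svo(chunks):
    """Naive heuristic (an assumption, not MontyLingua's rule): the NX
    immediately before a VX is the subject, the NX immediately after it
    is the object."""
    for i in range(1, len(chunks) - 1):
        before, verb, after = chunks[i - 1], chunks[i], chunks[i + 1]
        if verb[0] == "VX" and before[0] == "NX" and after[0] == "NX":
            return chunk_words(before), chunk_words(verb), chunk_words(after)
    return None

wrong = "(NX the/DT mosquito/NN bit/NN NX) (NX the/DT boy/NN NX)"
right = "(NX the/DT mosquito/NN NX) (VX bit/VBD VX) (NX the/DT boy/NN NX)"

print(first_svo(parse_chunks(wrong)))  # None
print(first_svo(parse_chunks(right)))  # ('the mosquito', 'bit', 'the boy')
```

With "bit" mis-tagged as a noun there is no verb chunk, so no tuple can be found; once the tag is corrected, the tuple falls out directly.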

MontyLingua performs the following tasks over text:

1. MontyTokenizer - Tokenizes raw English text (sensitive to abbreviations),
   and resolves contractions, e.g. "you're" ==> "you are"
2. MontyTagger - Part-of-speech tagging based on Brill94, enriched with
   common sense.
3. MontyChunker - Lightning fast regular expression chunker
4. MontyExtractor - Extracts phrases and subject/verb/object triplets from
   sentences
5. MontyLemmatiser - Strips inflectional morphology, i.e. changes verbs to
   infinitive form and nouns to singular form
6. MontyNLGenerator - Uses MontyLingua's concise predicate-arg representation
   to generate naturalistic English sentences and text summaries
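
Two of the steps above, contraction resolution and regular expression chunking, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not MontyLingua's implementation: the function names, the three-entry contraction table, and the chunk patterns are all hypothetical, and the input to `chunk` is tagged by hand rather than by a tagger.

```python
import re

# Tiny illustrative contraction table (an assumption); a real tokenizer
# would need many more entries and would handle ambiguous forms.
CONTRACTIONS = {"you're": "you are", "don't": "do not", "we'll": "we will"}
CONTRACTION_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, CONTRACTIONS)) + r")\b", re.IGNORECASE
)

def expand_contractions(text):
    """Replace known contractions; the replacement is always lower case."""
    return CONTRACTION_RE.sub(lambda m: CONTRACTIONS[m.group(0).lower()], text)

# Chunk patterns over "word/TAG " tokens (each token followed by a space).
# NX: optional determiner, any adjectives, one or more common nouns.
# VX: one or more verbs. Both patterns are simplifications.
NX = r"(?:\S+/DT )?(?:\S+/JJ )*(?:\S+/NNS? )+"
VX = r"(?:\S+/VB[DGNPZ]? )+"
# The lookbehind keeps a match from starting in the middle of a token.
CHUNK_RE = re.compile(r"(?<!\S)(?:(?P<NX>%s)|(?P<VX>%s))" % (NX, VX))

def chunk(tagged):
    """Wrap noun and verb groups of a tagged sentence in (NX ... NX) / (VX ... VX)."""
    def wrap(match):
        label = match.lastgroup  # name of the group that matched: "NX" or "VX"
        return "(%s %s %s) " % (label, match.group(0).strip(), label)
    return CHUNK_RE.sub(wrap, tagged + " ").strip()

print(expand_contractions("you're late"))
# you are late
print(chunk("the/DT mosquito/NN bit/VBD the/DT boy/NN"))
# (NX the/DT mosquito/NN NX) (VX bit/VBD VX) (NX the/DT boy/NN NX)
```

Because the chunker is a single compiled regular expression applied in one pass over the tagged string, its cost grows roughly with sentence length, which is one reason regular expression chunkers tend to be fast.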
