Lexitron

A command-line regex search engine for the English language.

Requirements

The only major requirement is Python.

I don't know which versions of Python this package works on; I have only tested it on my own system, which runs Python 3.11. Any feedback about what does and doesn't work would be helpful.

I did not write Lexitron with Windows in mind, but it is a simple enough package that I don't see why it wouldn't work there.

If you try to install Lexitron and something goes wrong, let me know what your system details are and I'll try to get it fixed.

Installation

Lexitron is available on the Python Package Index (PyPI). To install it with pip, type

$ pip install lexitron

at the command line.

Once the install is complete, you can access Lexitron with the lx command at the terminal.

Usage

Usage syntax is

$ lx [options] expression

where expression is a regular expression and [options] are as follows.

option  function
-h      Print help and exit
-i      Print info header along with search results
-n      Print only the number of matches
-u      Include uppercase/proper words (like "France") in addition to lowercase/common words
-U      Search only for uppercase/proper words
-v      Print version and exit
-x      Print unformatted output, one word per line

Type $ lx -h for full help text.

If you aren't familiar with regular expressions, the basics aren't hard to learn. There are many resources online; a good starting point is the Wikipedia article on regular expressions.
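
For readers new to regular expressions, two ideas cover most of the examples in this README: a bare pattern matches anywhere inside a word, and `$` pins the match to the end of the word. The sketch below illustrates both with Python's standard `re` module on a small hand-picked word list. It assumes Lexitron applies patterns the way `re.search` does, which this README does not state; `WORDS` and `search_words` are names invented for the illustration and are not part of Lexitron.

```python
import re

# Tiny illustrative word list (hypothetical; this is not Lexitron's dictionary).
WORDS = ["birdbath", "cardboard", "homicide", "icicle", "pesticide", "yardbird"]


def search_words(pattern, words=WORDS):
    """Return the words in which `pattern` matches somewhere (re.search semantics)."""
    regex = re.compile(pattern)
    return [word for word in words if regex.search(word)]


# A bare pattern matches anywhere in the word.
print(search_words("rdb"))

# A trailing `$` only matches at the end of the word.
print(search_words("icide$"))
```

The first call keeps the words containing "rdb"; the second keeps only the words that end in "icide", so "icicle" is left out.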

Output

By default, Lexitron will output a well-formatted (potentially multi-column) list of words, along with a header describing the results.

The results are separated into "proper" words (capitalized, like "France") and "common" words (lowercase, like "boat").

Using the -x flag will return a more machine-readable output with one word per line.
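
Because the -x output is one word per line, it is easy to consume from a script. The following is a minimal sketch, assuming that `lx` is on the PATH and that `-x` writes only the matching words to standard output; `parse_words` and `run_lx` are hypothetical helper names, not part of Lexitron.

```python
import subprocess


def parse_words(text):
    """Split one-word-per-line text into a list, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def run_lx(expression, *options):
    """Run `lx -x [options] expression` and return the matches as a list.

    Requires Lexitron to be installed; raises CalledProcessError if lx fails.
    """
    result = subprocess.run(
        ["lx", "-x", *options, expression],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_words(result.stdout)
```

With Lexitron installed, `run_lx("icide$")` would return the Example 1 words as a Python list, and `run_lx("rdb", "-u")` would pass the -u option through. Passing the pattern as a list element also avoids any shell quoting issues.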

Examples

Example 1

A list of lowercase English words ending with "icide".

$ lx icide$
aborticide      germicide       ovicide         spermicide
acaricide       giganticide     parasiticide    sporicide
agricide        herbicide       parasuicide     stillicide
algicide        homicide        parricide       suicide
aphicide        infanticide     patricide       tyrannicide
aphidicide      insecticide     pesticide       uxoricide
bacillicide     larvicide       prolicide       vaticide
bactericide     liberticide     pulicide        verbicide
deicide         matricide       raticide        vermicide
feticide        medicide        regicide        viricide
filicide        menticide       rodenticide     vulpicide
foeticide       miticide        scabicide
fratricide      molluscicide    silicide
fungicide       nematicide      sororicide

Example 2

A list of lowercase English words that contain the substring "rdb", printed with info header.

$ lx -i rdb
20 matches for /rdb/

birdbath          herdbook
birdbrain         herdboy
cardboard         leopardbane
hardback          recordbook
hardbake          standardbearer
hardball          standardbred
hardbeam          swordbill
hardboard         thirdborough
hardboot          wordbook
hardbound         yardbird

Example 3

The number of lowercase English words that end in "tion".

$ lx -n ".*tion"
3837

(Take this number with a grain of salt: no dictionary is perfect, and the count depends on what you consider a valid English word and on which technical or niche jargon is included.)

Example 4

A list of English words with the same double letter appearing twice, except for those whose double letter is a vowel or the letter s (to ignore words of the form *lessness), printed with info header.

$ lx -iu "([^aeious])\1.*\1\1"
45 matches for /([^aeious])\1.*\1\1/ (9 proper, 36 common)

Allhallowmas
Allhallows
Allhallowtide
Armillariella
Chancellorsville
Dullsville
Gallirallus
Hunnemannia
Llullaillaco

acciaccatura       jellyroll          rollcollar
bellpull           kinnikinnic        rollerball
chiffchaff         kinnikinnick       scuttlebutt
dillydallier       millefeuille       shillyshally
dillydally         niffnaff           skillfully
dullsville         parallelling       snippersnapper
flibbertigibbet    pellmell           villanelle
granddaddy         pizzazz            volleyball
hallalling         pralltriller       volleyballer
hillbilly          razzamatazz        whippersnapper
huggermugger       razzmatazz         willfully
hullaballoo        riffraff           yellowbelly
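
The pattern in this example relies on a backreference: `([^aeious])` captures one letter that is neither a vowel nor "s", `\1` requires the same letter again right after it, and the final `\1\1` requires that doubled letter once more later in the word. A small sketch with Python's `re` module shows the behaviour on a few sample words; the word list and variable names are made up for the illustration, and the sketch assumes the pattern is applied with `re.search` semantics.

```python
import re

# Same pattern as in the example above.
DOUBLE_TWICE = re.compile(r"([^aeious])\1.*\1\1")

# Hand-picked samples (hypothetical; not taken from Lexitron's dictionary file).
samples = ["hillbilly", "razzmatazz", "volleyball", "coffee", "bookkeeper", "lessness"]

matches = [word for word in samples if DOUBLE_TWICE.search(word)]
print(matches)
```

"coffee" and "bookkeeper" are rejected because none of their doubled consonants repeats, and "lessness" is rejected because its only repeated double letter is "s", which the character class excludes.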

Example 5

Compare the number of lowercase English words that end in "woman" with the number that end in "man".

$ lx -n ".*woman"
107
$ lx -n '.*(?<!wo)man'
1145
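
The second pattern uses a negative lookbehind: `(?<!wo)` succeeds only where the preceding two characters are not "wo", so "man" is accepted unless it is the tail of "woman". The sketch below illustrates this with Python's `re` module. It adds an explicit `$` anchor, which is an assumption made here so that "ends in" holds under `re.search`; the sample words and names are invented for the illustration.

```python
import re

# "man" at the end of the word, not preceded by "wo".
ENDS_IN_MAN_NOT_WOMAN = re.compile(r"(?<!wo)man$")

# Hand-picked samples (hypothetical; not taken from Lexitron's dictionary file).
samples = ["fireman", "human", "policewoman", "woman", "manly", "talisman"]

matches = [word for word in samples if ENDS_IN_MAN_NOT_WOMAN.search(word)]
print(matches)
```

"policewoman" and "woman" fail the lookbehind, and "manly" fails the `$` anchor, so only the remaining three words are kept.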

Acknowledgements

For its dictionary, Lexitron uses the Automatically Generated Inflection Database (AGID) by Kevin Atkinson. See http://wordlist.sourceforge.net/.

License

Lexitron is licensed under GNU GPL Version 2.

Contact

Questions, bug reports, and feature requests can be filed on the GitHub issue tracker.
