A command-line regex search engine for the English language

Lexitron

Requirements

The only major requirement is Python.

I don't know which versions of Python this package works on; I've only tested it on my own system, which runs Python 3.11. Any feedback about what does and doesn't work would be helpful.

I did not write Lexitron with Windows in mind, although it is a simple enough package that I don't see why it wouldn't work there.

If you try to install Lexitron and something goes wrong, let me know what your system details are and I'll try to get it fixed.

Installation

Lexitron is available on the Python Package Index (PyPI). To install it with pip, simply type

$ pip install lexitron

at the command line.

Once the install is complete, you can access Lexitron with the lx command at the terminal.

Usage

Usage syntax is

$ lx [options] expression

where expression is a regular expression and [options] are as follows.

option  function
-d      Append start and end delimiters ^...$ to search query
-n      Print only the number of matches
-u      Search only for lowercase/common/uncapitalized words
-U      Search only for uppercase/proper/capitalized words
-v      Show version and exit
-x      Print unformatted output, one word per line

Type $ lx -h for full help text.
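
To make the effect of the -d option concrete, here is a minimal Python sketch of how wrapping a query in ^...$ changes what matches. It uses Python's standard re module on a tiny made-up word list. The helper name search_words and the choice of re.search are assumptions for illustration only, not Lexitron's actual implementation.

```python
import re

def search_words(expression, words, delimit=False):
    """Return the words matching `expression` (hypothetical helper)."""
    # Mimic the documented effect of -d: wrap the query in ^...$
    if delimit:
        expression = f"^{expression}$"
    pattern = re.compile(expression)
    return [w for w in words if pattern.search(w)]

# Made-up word list, not Lexitron's dictionary
words = ["cat", "cats", "scatter", "concat"]

print(search_words("cat", words))                # every word containing "cat"
print(search_words("cat", words, delimit=True))  # only the exact word "cat"
```

Without the delimiters the query behaves like a substring search; with them, the expression has to account for the whole word.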

If you aren't familiar with regular expressions, it isn't too hard to learn the basics. There are many resources online. A good starting point is the Wikipedia article.

Output

By default, Lexitron will output a well-formatted (potentially multi-column) list of words, along with a header describing the results.

The results are separated into "proper" words (capitalized, like "France") and "common" words (lowercase, like "boat").

Using the -x flag will return a more machine-readable output with one word per line.
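
Because -x prints one word per line, the output is easy to consume from a script. The following is a rough Python sketch, assuming the -x output contains only words and no header lines. The function names run_lx and parse_words are hypothetical, and the sample text is made up from a few of the words shown in the examples below.

```python
import subprocess

def run_lx(expression):
    """Run `lx -x expression` and return its raw output.

    Requires Lexitron to be installed; it is not called in this sketch.
    """
    result = subprocess.run(["lx", "-x", expression],
                            capture_output=True, text=True, check=True)
    return result.stdout

def parse_words(text):
    """Split one-word-per-line text into (proper, common) lists."""
    words = [line.strip() for line in text.splitlines() if line.strip()]
    # "Proper" words are capitalized, "common" words are lowercase
    proper = [w for w in words if w[0].isupper()]
    common = [w for w in words if not w[0].isupper()]
    return proper, common

# Sample text standing in for real `lx -x` output (an assumption)
sample = "Standardbred\nbirdbath\ncardboard\n"
proper, common = parse_words(sample)
print(proper, common)
```

With Lexitron installed, parse_words(run_lx("rdb")) should work the same way, provided the assumption about the output format holds.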

Examples

Example 1

A list of English words ending with "icide".

$ lx icide$
---------------------------------------------------------------------------
53 results for /.*icide/
0 proper ~ 53 common
---------------------------------------------------------------------------

aborticide      foeticide       matricide       pesticide       stillicide
acaricide       fratricide      medicide        prolicide       suicide
agricide        fungicide       menticide       pulicide        tyrannicide
algicide        germicide       miticide        raticide        uxoricide
aphicide        giganticide     molluscicide    regicide        vaticide
aphidicide      herbicide       nematicide      rodenticide     verbicide
bacillicide     homicide        ovicide         scabicide       vermicide
bactericide     infanticide     parasiticide    silicide        viricide
deicide         insecticide     parasuicide     sororicide      vulpicide
feticide        larvicide       parricide       spermicide
filicide        liberticide     patricide       sporicide

Example 2

A list of English words that contain the substring "rdb".

$ lx rdb
---------------------------------------------------------------------------
21 results for /rdb/
1 proper ~ 20 common
---------------------------------------------------------------------------

Standardbred

birdbath          herdbook
birdbrain         herdboy
cardboard         leopardbane
hardback          recordbook
hardbake          standardbearer
hardball          standardbred
hardbeam          swordbill
hardboard         thirdborough
hardboot          wordbook
hardbound         yardbird

Example 3

The number of lowercase English words that end in "tion".

$ lx -nxu ".*tion"
3837

(This number should be taken with a grain of salt: no dictionary is perfect, and the count depends on what you consider a valid English word and on which technical or niche jargon is included.)

Example 4

A list of English words with the same double letter appearing twice, except for those whose double letter is a vowel or the letter s (to ignore words of the form *lessness).

$ lx "([^aeious])\1.*\1\1"
---------------------------------------------------------------------------
45 results for /([^aeious])\1.*\1\1/
9 proper ~ 36 common
---------------------------------------------------------------------------

Allhallowmas
Allhallows
Allhallowtide
Armillariella
Chancellorsville
Dullsville
Gallirallus
Hunnemannia
Llullaillaco

acciaccatura       hillbilly          pellmell           shillyshally
bellpull           huggermugger       pizzazz            skillfully
chiffchaff         hullaballoo        pralltriller       snippersnapper
dillydallier       jellyroll          razzamatazz        villanelle
dillydally         kinnikinnic        razzmatazz         volleyball
dullsville         kinnikinnick       riffraff           volleyballer
flibbertigibbet    millefeuille       rollcollar         whippersnapper
granddaddy         niffnaff           rollerball         willfully
hallalling         parallelling       scuttlebutt        yellowbelly
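
The expression in this example relies on a backreference: ([^aeious]) captures one letter that is neither a vowel nor s, the first \1 requires that same letter again immediately, and the final \1\1 requires the same doubled letter somewhere later in the word. Here is a small sketch using Python's re module on a hand-picked word list, to show which kinds of words pass and fail. The word list is made up for illustration, and whether Lexitron applies the pattern in exactly this way is an assumption.

```python
import re

# Same expression as in the example above
pattern = re.compile(r"([^aeious])\1.*\1\1")

# Hand-picked test words (not taken from Lexitron's dictionary)
words = ["hillbilly", "razzmatazz", "carelessness", "balloon", "bookkeeper"]

matches = [w for w in words if pattern.search(w)]
print(matches)
```

Here "hillbilly" and "razzmatazz" match because "ll" and "zz" each appear twice. "carelessness" is rejected because s is excluded from the character class, while "balloon" and "bookkeeper" have no consonant pair that repeats.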

Example 5

Compare the number of lowercase/uncapitalized words that end in "woman" with the number that end in "man".

$ lx -nxu ".*woman"
107
$ lx -nxu '.*(?<!wo)man'
1145
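
The second expression uses a negative lookbehind, (?<!wo), which only lets "man" match when it is not directly preceded by "wo". The sketch below reproduces the idea with Python's re module on a small made-up word list. It uses re.fullmatch so that each expression has to cover the entire word; that choice is an assumption made for this illustration and not necessarily how Lexitron matches.

```python
import re

# Made-up word list, not Lexitron's dictionary
words = ["woman", "policewoman", "fireman", "human", "manual"]

woman = re.compile(r".*woman")
man_only = re.compile(r".*(?<!wo)man")  # "man" not preceded by "wo"

# fullmatch forces the expression to account for the whole word
ends_in_woman = [w for w in words if woman.fullmatch(w)]
ends_in_man = [w for w in words if man_only.fullmatch(w)]

print(ends_in_woman)
print(ends_in_man)
```

"manual" is rejected by both expressions because it does not end in "man", and the lookbehind keeps "woman" and "policewoman" out of the second list.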

Acknowledgements

For its dictionary, Lexitron uses the Automatically Generated Inflection Database (AGID) by Kevin Atkinson. See http://wordlist.sourceforge.net/.

License

Lexitron is licensed under GNU GPL Version 2.

Contact

Questions, bug reports, and feature requests can be filed on the GitHub issue tracker.
