Lexitron

A command-line regex search engine for the English language.

Requirements

The only major requirement is Python.

I don't know which versions of Python this package works on; I have only tested it on my own system, which runs Python 3.11. Any feedback about what does and doesn't work would be helpful.

I did not write Lexitron with Windows in mind, although the package is simple enough that I don't see why it wouldn't work there.

If you try to install Lexitron and something goes wrong, let me know what your system details are and I'll try to get it fixed.

Installation

Lexitron is available on the Python Package Index (PyPI). To install it with pip, simply type

$ pip install lexitron

at the command line.

Once the install is complete, you can access Lexitron with the lx command at the terminal.

Usage

Usage syntax is

$ lx [options] expression

where expression is a regular expression and [options] are as follows.

option   function
-d       Append start and end delimiters ^...$ to search query
-n       Print only the number of matches
-u       Search only for common/lowercase/non-capitalized words
-U       Search only for proper/uppercase/capitalized words
-x       Print unformatted output, one word per line

Type $ lx -h for full help text.
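
The effect of -d can be illustrated with Python's re module. This is only a sketch, assuming that a query without -d behaves like an unanchored search on each word and that -d wraps the query as ^...$. The helper name matches and the sample word list are made up for illustration and are not part of Lexitron.

```python
import re

# Hypothetical sample words; Lexitron's real dictionary is far larger.
WORDS = ["cat", "catalog", "concat", "scatter"]

def matches(expression, words, delimit=False):
    """Return the words matched by expression.

    Assumption: delimit=True mimics -d by wrapping the query in ^...$.
    """
    if delimit:
        expression = "^" + expression + "$"
    pattern = re.compile(expression)
    return [w for w in words if pattern.search(w)]

print(matches("cat", WORDS))                # every word containing "cat"
print(matches("cat", WORDS, delimit=True))  # only the exact word "cat"
```

Under these assumptions, the first call returns all four words and the second returns only "cat".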

Output

By default, Lexitron will output a well-formatted (potentially multi-column) list of words, along with a header describing the results.

The results are separated into "proper" words (capitalized, like "France") and "common" words (lowercase, like "banana").

Using the -x flag will return a more machine-readable output with one word per line.
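
One-word-per-line output is easy to consume from a script. The sketch below is a minimal illustration in Python, not part of Lexitron: the function names parse_words and split_by_case are hypothetical, and the sample string only stands in for what lx -x might print (the exact layout, such as blank lines between groups, is an assumption).

```python
def parse_words(output):
    """Split one-word-per-line text into a list, skipping blank lines."""
    return [line.strip() for line in output.splitlines() if line.strip()]

def split_by_case(words):
    """Separate capitalized ("proper") words from lowercase ("common") ones."""
    proper = [w for w in words if w[:1].isupper()]
    common = [w for w in words if not w[:1].isupper()]
    return proper, common

# Sample text standing in for the output of `lx -x rdb` (abbreviated).
sample = "Standardbred\n\nbirdbath\ncardboard\n"
words = parse_words(sample)
print(split_by_case(words))
```

To feed it real results, the text could be captured with subprocess.run(["lx", "-x", "rdb"], capture_output=True, text=True).stdout, provided lx is installed and on the PATH.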

Examples

Example 1

A list of English words ending with "icide".

$ lx icide$
---------------------------------------------------------------------------
53 results for /.*icide/
0 proper ~ 53 common
---------------------------------------------------------------------------

aborticide      foeticide       matricide       pesticide       stillicide
acaricide       fratricide      medicide        prolicide       suicide
agricide        fungicide       menticide       pulicide        tyrannicide
algicide        germicide       miticide        raticide        uxoricide
aphicide        giganticide     molluscicide    regicide        vaticide
aphidicide      herbicide       nematicide      rodenticide     verbicide
bacillicide     homicide        ovicide         scabicide       vermicide
bactericide     infanticide     parasiticide    silicide        viricide
deicide         insecticide     parasuicide     sororicide      vulpicide
feticide        larvicide       parricide       spermicide
filicide        liberticide     patricide       sporicide

Example 2

A list of English words that contain the substring "rdb".

$ lx rdb
---------------------------------------------------------------------------
21 results for /rdb/
1 proper ~ 20 common
---------------------------------------------------------------------------

Standardbred

birdbath          herdbook
birdbrain         herdboy
cardboard         leopardbane
hardback          recordbook
hardbake          standardbearer
hardball          standardbred
hardbeam          swordbill
hardboard         thirdborough
hardboot          wordbook
hardbound         yardbird

Example 3

The number of lowercase English words that end in "tion".

$ lx -nxu ".*tion"
3837

(Take this number with a grain of salt: no dictionary is perfect, and the count depends on what you consider a valid English word and on which technical or niche jargon is included.)

Example 4

A list of English words with the same double letter appearing twice, except for those whose double letter is a vowel or the letter s (to ignore words of the form *lessness).

$ lx "([^aeious])\1.*\1\1"
---------------------------------------------------------------------------
45 results for /([^aeious])\1.*\1\1/
9 proper ~ 36 common
---------------------------------------------------------------------------

Allhallowmas
Allhallows
Allhallowtide
Armillariella
Chancellorsville
Dullsville
Gallirallus
Hunnemannia
Llullaillaco

acciaccatura       hillbilly          pellmell           shillyshally
bellpull           huggermugger       pizzazz            skillfully
chiffchaff         hullaballoo        pralltriller       snippersnapper
dillydallier       jellyroll          razzamatazz        villanelle
dillydally         kinnikinnic        razzmatazz         volleyball
dullsville         kinnikinnick       riffraff           volleyballer
flibbertigibbet    millefeuille       rollcollar         whippersnapper
granddaddy         niffnaff           rollerball         willfully
hallalling         parallelling       scuttlebutt        yellowbelly
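
To see how the expression works, it can be tried on a handful of words with Python's re module. This is a sketch for illustration only: the function name has_repeated_double and the sample words are made up, and an unanchored search is assumed. The group ([^aeious]) captures one letter that is neither a vowel nor s, \1 requires it to be doubled, and .*\1\1 requires the same double to appear again later in the word.

```python
import re

# Same expression as in the example above.
PATTERN = re.compile(r"([^aeious])\1.*\1\1")

def has_repeated_double(word):
    """True if the same non-vowel, non-s double letter occurs twice."""
    return PATTERN.search(word) is not None

# Hypothetical sample words chosen to exercise each part of the expression.
for word in ["hillbilly", "razzmatazz", "bookkeeper", "carelessness", "coffee"]:
    print(word, has_repeated_double(word))
```

Here "hillbilly" and "razzmatazz" match. "bookkeeper" does not, because kk occurs only once and its other doubles are vowels. "carelessness" is excluded because its double letter is s, and "coffee" has ff only once.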

Example 5

Compare the number of lowercase/non-capitalized words that end in "woman" with the number that end in "man".

$ lx -nxu ".*woman"
107
$ lx -nxu '.*(?<!wo)man'
1145
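
The second expression relies on a negative lookbehind: (?<!wo) succeeds only where the two preceding characters are not "wo", so "man" is matched unless it is part of "woman". The Python sketch below illustrates this on a made-up word list. It is not how Lexitron itself is implemented, and a trailing $ is added here as an assumption so that only word endings are counted.

```python
import re

# Hypothetical sample words, not taken from Lexitron's dictionary.
WORDS = ["woman", "chairwoman", "fireman", "human", "postman", "manual"]

# "$" is added in this sketch so that only word endings are counted.
ENDS_IN_WOMAN = re.compile(r".*woman$")
ENDS_IN_MAN_NOT_WOMAN = re.compile(r".*(?<!wo)man$")

def count(pattern, words):
    """Count the words in which pattern finds a match."""
    return sum(1 for w in words if pattern.search(w))

print(count(ENDS_IN_WOMAN, WORDS))          # woman, chairwoman
print(count(ENDS_IN_MAN_NOT_WOMAN, WORDS))  # fireman, human, postman
```

With this list the counts are 2 and 3; "manual" is counted by neither expression because it does not end in "man".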

Acknowledgements

For its dictionary, Lexitron uses the Automatically Generated Inflection Database (AGID) by Kevin Atkinson. See http://wordlist.sourceforge.net/.

License

Lexitron is licensed under GNU GPL Version 2.

Contact

Questions, bug reports, and feature requests can be filed on the GitHub issue tracker.
