find English words having specified letter frequencies
OVERVIEW
========
The Python program ``frequencies.py`` finds English words having specified
letter frequencies. The user specifies letter frequencies via a ``dict`` (Python
dictionary), which is a series of key-value pairs. The letter frequency ``dict``
may appear on the command line; if it does not, the program prompts the user to
enter it interactively.
In each key-value pair, the key appears first, followed by a colon (:) and then
the associated value. The order of the key and the value within a pair is
critical, but the order of the pairs does not matter. Keys must be unique;
values are not required to be unique.
In each key-value pair, the key must be one of the following:
- a single letter, optionally enclosed in quotes. In this case the frequency
  specified by the value (see below) applies to that specific letter.
- an integer, in which case the frequency applies to an unspecified letter that
  must be distinct from the letters associated with other frequencies.
The value must be one of the following:
- a positive integer. This is the number of times that some letter (either the
  letter specified via the key or, if the key is an integer, some other unique
  letter) appears in a matching word.
- a range of integers of the form m-n (two integers separated by a hyphen).
  This indicates that the corresponding letter must appear at least m times but
  no more than n times. For example, the key-value pair 'e:3-99' indicates that
  the letter 'e' must appear at least 3 and no more than 99 times. The first
  integer (m) may equal zero. Three shorthands are available: an asterisk (*)
  stands for 0-99, i.e., the letter may appear any number of times (including
  zero); a question mark (?) stands for 0-1; and a plus sign (+) stands for
  1-99.
Note: When specifying the frequency as a range (including the shorthands *, ?,
and +), the key must name a specific letter. Thus, for example, the key-value
pair '1:*' is invalid.
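The rules above can be sketched as a small parser. This is only an illustration
of the specification as described here, not the code in ``frequencies.py``; the
function name ``parse_spec``, the returned data layout, and the lower-casing of
letter keys are all assumptions made for the example.

```python
import re

# Shorthand values and the inclusive ranges they stand for (from the text above).
SHORTHANDS = {"*": (0, 99), "?": (0, 1), "+": (1, 99)}


def parse_spec(text):
    """Parse a spec such as 'c:1, d:1, 1:1' into (named, wildcards).

    named maps a specific letter to an inclusive (low, high) range.
    wildcards is a sorted list of exact counts for unspecified letters.
    Both names are hypothetical, chosen for this sketch.
    """
    named, wildcards, seen = {}, [], set()
    for pair in text.split(","):
        pair = pair.strip()
        if not pair:
            continue
        key, sep, value = pair.partition(":")
        if not sep:
            raise ValueError(f"missing colon in {pair!r}")
        # Quotes around a letter key are optional; case folding is an assumption.
        key = key.strip().strip("'\"").lower()
        value = value.strip()
        if key in seen:
            raise ValueError(f"duplicate key {key!r}")
        seen.add(key)

        if value in SHORTHANDS:
            (low, high), is_range = SHORTHANDS[value], True
        elif re.fullmatch(r"\d+-\d+", value):
            low, high = map(int, value.split("-"))
            is_range = True
        elif re.fullmatch(r"\d+", value):
            low = high = int(value)
            is_range = False
            if low < 1:
                raise ValueError("an exact count must be a positive integer")
        else:
            raise ValueError(f"unrecognised value {value!r}")
        if low > high:
            raise ValueError(f"empty range {value!r}")

        if key.isdigit():
            # Integer key: an unspecified letter, which may not take a range.
            if is_range:
                raise ValueError("a range requires a specific letter as the key")
            wildcards.append(low)
        elif len(key) == 1 and key.isalpha():
            named[key] = (low, high)
        else:
            raise ValueError(f"key must be a single letter or an integer: {key!r}")
    return named, sorted(wildcards)
```

With this sketch, ``parse_spec("v:1, 2:2, 3:3, 4:4")`` yields one named letter
(``v``, exactly once) plus three unspecified letters with counts 2, 3, and 4,
while ``parse_spec("1:*")`` raises ``ValueError``.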
TO RUN THIS PROGRAM:
====================
(1) Verify that a spelling dictionary file containing one word per line exists
either in the folder containing frequencies.py or one level higher in the folder
tree.
(2) Verify that the file command_line_input.py exists either in the same folder
as frequencies.py or in a folder specified via the PYTHONPATH environment
variable.
(3) Open a command prompt (Windows) or a terminal (Linux) and make the folder
containing frequencies.py the current folder.
(4) Issue a command of the form 'python frequencies.py <freqs>' where <freqs> is
a letter frequency specification.
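The lookup described in step (1) might be sketched as follows. The file name
``words.txt`` and the helper names ``find_word_list`` and ``load_words`` are
hypothetical; this document does not state what the spelling dictionary file is
called or how ``frequencies.py`` actually locates it.

```python
from pathlib import Path


def find_word_list(script_dir, filename="words.txt"):
    """Look for the word list in script_dir, then one level higher.

    'words.txt' is a placeholder name used only for this sketch.
    Returns the path of the first match, or None if neither folder has it.
    """
    script_dir = Path(script_dir)
    for folder in (script_dir, script_dir.parent):
        candidate = folder / filename
        if candidate.is_file():
            return candidate
    return None


def load_words(path):
    """Read one word per line, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]
```

Checking the script's own folder first and its parent second mirrors the order
in which step (1) lists the two locations.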
EXAMPLES
========
(1) To find 6-letter words containing one c, one d, two o's, one r, and one
other letter, issue the following command:
    python frequencies.py c:1, d:1, o:2, r:1, 1:1
There are 4 matches: 'condor', 'cordon', 'corody', and 'doctor'.
(2) To find 10-letter words containing one v, two instances of a second letter,
three instances of a third letter, and four instances of a fourth letter, issue
the following command:
    python frequencies.py v:1, 2:2, 3:3, 4:4
There are 2 matches: 'evennesses' and 'sleeveless'.
(3) To find all words containing three or more e's and any number of m's, n's,
p's, and t's, issue the following command:
    python frequencies.py e:3-99,m:*,n:*,p:*,t:*
There are 6 matches: 'entente', 'epee', 'pentene', 'teepee', 'tenement', and
'tepee'.
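Judging from the three examples, a word appears to match when every named
letter occurs within its range and the remaining letters account exactly for
the integer-keyed frequencies, so the specification also fixes which letters
may appear at all. The sketch below implements that reading. It is inferred
from the examples rather than taken from ``frequencies.py``, and the function
name ``matches`` and its argument layout are assumptions.

```python
from collections import Counter


def matches(word, named, wildcards):
    """Test a word against a letter frequency specification.

    named:     dict mapping a letter to an inclusive (low, high) range
    wildcards: list of exact counts, one per unspecified letter
    (Both layouts are hypothetical, chosen for this sketch.)
    """
    counts = Counter(word.lower())
    # Every named letter must occur a permitted number of times.
    for letter, (low, high) in named.items():
        if not low <= counts.get(letter, 0) <= high:
            return False
    # Letters not named in the spec must pair off one-to-one with the
    # integer-keyed frequencies; distinctness from named letters follows
    # because named letters are excluded here.
    leftover = sorted(n for letter, n in counts.items() if letter not in named)
    return leftover == sorted(wildcards)


# Example (1): one c, one d, two o's, one r, and one other letter.
spec1 = {"c": (1, 1), "d": (1, 1), "o": (2, 2), "r": (1, 1)}
print([w for w in ["condor", "doctor", "record"] if matches(w, spec1, [1])])
# prints ['condor', 'doctor']

# Example (2): one v plus unspecified letters appearing 2, 3, and 4 times.
print(matches("sleeveless", {"v": (1, 1)}, [2, 3, 4]))
# prints True
```

Comparing sorted counts keeps the check independent of which unspecified
letter is assigned to which integer key.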