MREP: morpheme regular expression printer
MREP is a regular expression matcher for morpheme sequences. You can find morpheme sub-sequences that match a given pattern, such as noun sequences.
Requirements
Python >=2.7
mecab-python3 ( https://github.com/SamuraiT/mecab-python3 )
Install
$ pip install mrep
To install from source instead, run setup.py:
$ python setup.py install
Usage
usage: mrep [-h] [-o] [--color {never,auto,always}] [-n] [--mecab-arg MECAB_ARG] PATTERN [FILE [FILE ...]]
- positional arguments:
- PATTERN:
pattern
- FILE:
data file
- optional arguments:
- -h, --help
show this help message and exit
- -o, --only-matching
print only the matching parts
- --color COLOR
color mode. select from “never”, “auto” and “always”. (default: auto)
- -n, --line-number
show line numbers
- --mecab-arg MECAB_ARG
argument to pass to mecab (ex: “-r /path/to/resource/file”)
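The option list above can be sketched as an argparse declaration. This is only an illustration that mirrors the usage line, not mrep's actual source: the function name `build_parser`, the attribute names, and the empty-string default for `--mecab-arg` are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented mrep usage line (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog="mrep")
    parser.add_argument("pattern", metavar="PATTERN", help="pattern")
    parser.add_argument("file", metavar="FILE", nargs="*", help="data file")
    parser.add_argument("-o", "--only-matching", action="store_true",
                        help="print only the matching parts")
    parser.add_argument("--color", choices=["never", "auto", "always"],
                        default="auto", help="color mode (default: auto)")
    parser.add_argument("-n", "--line-number", action="store_true",
                        help="show line numbers")
    # The default value here is an assumption; the README does not state one.
    parser.add_argument("--mecab-arg", default="",
                        help="argument to pass to mecab")
    return parser


# Example: search two files for noun runs, with line numbers and no color.
args = build_parser().parse_args(
    ["-n", "--color", "never", "<pos=名詞>*", "a.txt", "b.txt"]
)
```

Here `args.pattern` holds the pattern string and `args.file` holds the list of data files, which may be empty.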
Pattern
- .
matches any morpheme
- <surface=XXX>
matches morphemes whose surface form is XXX
- <pos=XXX>
matches morphemes whose POS (part of speech) is XXX
- X*
matches repetition of a pattern X
- X|Y
matches X or Y
- (X)
matches X (parentheses group a sub-pattern)
Example
- <pos=名詞>
matches a noun
- <pos=名詞>*
matches repetition of nouns
- <pos=名詞>*<pos=助詞>
matches repetition of nouns followed by a particle
- (<pos=名詞>|<pos=動詞>)*
matches repetition of nouns or verbs
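To make the pattern semantics concrete, here is a minimal sketch of a matcher over morpheme sequences. It is not mrep's implementation: the morphemes are hand-tagged dictionaries instead of MeCab output, patterns are built with hypothetical combinator functions (`pos`, `seq`, `alt`, `star`, `find_all`) rather than parsed from a pattern string, `*` is assumed to mean zero or more repetitions, and the leftmost-longest search strategy is also an assumption.

```python
def pos(tag):
    """<pos=XXX>: match one morpheme whose POS equals tag."""
    def match(ms, i):
        return {i + 1} if i < len(ms) and ms[i]["pos"] == tag else set()
    return match


def seq(*patterns):
    """XY: match the patterns one after another."""
    def match(ms, i):
        positions = {i}
        for p in patterns:
            positions = {k for j in positions for k in p(ms, j)}
        return positions
    return match


def alt(*patterns):
    """X|Y: match any one of the patterns."""
    def match(ms, i):
        return set().union(*(p(ms, i) for p in patterns))
    return match


def star(p):
    """X*: match zero or more repetitions of p (assumed semantics)."""
    def match(ms, i):
        ends, frontier = {i}, {i}
        while frontier:
            # Extend every newly reached position by one more repetition.
            reached = {k for j in frontier for k in p(ms, j)}
            frontier = reached - ends
            ends |= frontier
        return ends
    return match


def find_all(pattern, ms):
    """Return non-overlapping, non-empty, leftmost-longest matches."""
    results, i = [], 0
    while i < len(ms):
        ends = [e for e in pattern(ms, i) if e > i]
        if ends:
            results.append(ms[i:max(ends)])
            i = max(ends)
        else:
            i += 1
    return results


# Hand-tagged stand-in for an analyzed sentence (not real MeCab output).
sentence = [
    {"surface": "東京", "pos": "名詞"},
    {"surface": "タワー", "pos": "名詞"},
    {"surface": "に", "pos": "助詞"},
    {"surface": "行く", "pos": "動詞"},
]

# <pos=名詞>*<pos=助詞>
noun_run_then_particle = seq(star(pos("名詞")), pos("助詞"))
# (<pos=名詞>|<pos=動詞>)*
nouns_or_verbs = star(alt(pos("名詞"), pos("動詞")))

matches_a = find_all(noun_run_then_particle, sentence)
matches_b = find_all(nouns_or_verbs, sentence)
```

Each matcher maps a start index to the set of possible end indices, so alternation and repetition compose without backtracking state.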
License
This program is distributed under the MIT license.
Copyright
(c) 2014, Yuya Unno.