AI/NLP (spaCy) Rule Based Matcher pattern finder

PatternOmatic 0.2.*

#AI · #EvolutionaryComputation · #NLP

Built with spaCy · License: LGPL v3

Discover spaCy linguistic patterns that match a given set of string samples

Requirements

Basic usage

From sources

Clone the official repository

git clone git@github.com:revuel/PatternOmatic.git

Play with the Makefile

  • make venv to activate the project's virtual environment*
  • make libs to install dependencies
  • make test to run the unit tests
  • make coverage to run code coverage
  • make run to run PatternOmatic's script with example parameters

* You must have created a virtual environment first.

From package

Install package

pip install PatternOmatic

Play with the CLI

# Show help
patternomatic.py -h

# Usage example 1: Basic (each sample is quoted so the shell passes it as a single argument)
patternomatic.py -s "Hello world" -s "Goodbye world"

# Usage example 2: Using a different language
python -m spacy download es_core_news_sm
patternomatic.py -s "Me llamo Miguel" -s "Se llama PatternOmatic" -l es_core_news_sm

Play with the library

""" 
PatternOmatic library client example.
Find linguistic patterns to be used by the spaCy Rule Based Matcher

"""
from PatternOmatic.api import find_patterns, Config

if __name__ == '__main__':

    my_samples = ['I am a cat!', 'You are a dog!', 'She is an owl!']

    # Optionally, let it evolve a little bit more!
    config = Config()
    config.max_generations = 150
    config.max_runs = 3

    patterns_found, _ = find_patterns(my_samples)

    print(f'Patterns found: {patterns_found}')
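
The patterns returned by find_patterns are meant to be used with spaCy's Rule Based Matcher. As a minimal sketch of that last step, the snippet below registers a hand-written pattern with spaCy's Matcher and checks it against the same samples. The pattern is a hypothetical stand-in for a discovered one, not actual PatternOmatic output. The sketch assumes spaCy 2.2.2 or later (where Matcher.add accepts a list of patterns) and uses a blank English pipeline, so no language model download is needed.

```python
import spacy
from spacy.matcher import Matcher

# Blank pipeline: tokenizer and lexical attributes only, no trained model
nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

# Hypothetical pattern, written by hand for illustration only
example_pattern = [
    {"IS_ALPHA": True},              # I / You / She
    {"IS_ALPHA": True},              # am / are / is
    {"LOWER": {"IN": ["a", "an"]}},  # a / an
    {"IS_ALPHA": True},              # cat / dog / owl
    {"IS_PUNCT": True},              # !
]
matcher.add("EXAMPLE_PATTERN", [example_pattern])

my_samples = ['I am a cat!', 'You are a dog!', 'She is an owl!']

# Collect the text of every span the pattern matches
matched_texts = []
for sample in my_samples:
    doc = nlp(sample)
    for _match_id, start, end in matcher(doc):
        matched_texts.append(doc[start:end].text)

print(matched_texts)
```

A pattern that generalizes well should match every sample it was derived from, which is what the loop above verifies.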


Features

Generic

✅ No OS dependencies, no storage or database required!

✅ Lightweight package with just a few direct pip dependencies

✅ Easy to use and highly configurable, to boost clever searches

✅ Includes a basic logging mechanism

✅ Includes basic reporting in JSON and CSV formats; the report file path is configurable

✅ Configuration file example provided (config.ini)

✅ The default configuration is used if no configuration file is provided

✅ Provides rollback actions against several possible misconfiguration scenarios

Evolutionary

✅ Basic Evolutionary (Grammatical Evolution) parameters available and configurable

✅ Supports two different Evolutionary Fitness functions

✅ Supports Binary Tournament Evolutionary Selection Type

✅ Supports Random One Point Crossover Evolutionary Recombination Type

✅ Supports "µ + λ" Evolutionary Replacement Type

✅ Supports "µ ∪ λ" with elitism Evolutionary Replacement Type

✅ Supports "µ ∪ λ" without elitism Evolutionary Replacement Type
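
The selection, recombination and replacement options listed above have common textbook definitions in evolutionary computation. The sketch below illustrates binary tournament selection, random one-point crossover and "µ + λ" replacement on plain lists of bits. It is a generic illustration, not PatternOmatic's internal code: the function names, the bit-list individuals and the sum-of-bits fitness are all assumptions made for the example.

```python
import random


def fitness(individual):
    """Toy fitness (assumption for this sketch): number of bits set to 1."""
    return sum(individual)


def binary_tournament(population, rng):
    """Draw two individuals at random and return the fitter one."""
    first, second = rng.sample(population, 2)
    return first if fitness(first) >= fitness(second) else second


def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length parents at one random cut point."""
    cut = rng.randint(1, len(parent_a) - 1)
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b


def mu_plus_lambda(parents, offspring):
    """Textbook "µ + λ": pool parents and offspring, keep the best µ."""
    pool = parents + offspring
    return sorted(pool, key=fitness, reverse=True)[:len(parents)]


rng = random.Random(42)  # fixed seed so the sketch is repeatable
parents = [[rng.randint(0, 1) for _ in range(8)] for _ in range(6)]

# Breed as many children as there are parents
offspring = []
while len(offspring) < len(parents):
    mother = binary_tournament(parents, rng)
    father = binary_tournament(parents, rng)
    offspring.extend(one_point_crossover(mother, father, rng))

next_generation = mu_plus_lambda(parents, offspring)
print(max(map(fitness, parents)), max(map(fitness, next_generation)))
```

Because "µ + λ" keeps the best individuals from the combined pool, the best fitness can never decrease from one generation to the next.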

✅ Typical evolutionary performance metrics included:

  • Success Rate (SR)
  • Mean Best Fitness (MBF)
  • Average Evaluations to Solution (AES)
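
As a rough illustration of how these three metrics are usually defined in the evolutionary computation literature, the sketch below computes them from a list of run summaries. The record layout (best_fitness, evaluations, solved) and the function names are hypothetical, chosen for this example; PatternOmatic's own report fields may differ.

```python
# Hypothetical summaries of four independent runs (made-up numbers)
runs = [
    {"best_fitness": 1.0, "evaluations": 120, "solved": True},
    {"best_fitness": 0.5, "evaluations": 300, "solved": False},
    {"best_fitness": 1.0, "evaluations": 80, "solved": True},
    {"best_fitness": 0.75, "evaluations": 300, "solved": False},
]


def success_rate(runs):
    """SR: fraction of runs that reached a solution."""
    return sum(1 for run in runs if run["solved"]) / len(runs)


def mean_best_fitness(runs):
    """MBF: average of the best fitness reached in each run."""
    return sum(run["best_fitness"] for run in runs) / len(runs)


def average_evaluations_to_solution(runs):
    """AES: average evaluations spent, counting successful runs only."""
    successful = [run["evaluations"] for run in runs if run["solved"]]
    if not successful:
        return None  # undefined when no run succeeded
    return sum(successful) / len(successful)


print(f"SR={success_rate(runs)}")
print(f"MBF={mean_best_fitness(runs)}")
print(f"AES={average_evaluations_to_solution(runs)}")
```

With the four made-up runs above, SR is 0.5, MBF is 0.8125 and AES is 100 evaluations. AES ignores unsuccessful runs, so it should always be read together with SR.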

Linguistic

✅ Compatible with any spaCy Language Model

✅ Supports all of spaCy's Rule Based Matcher standard Token attributes

✅ Supports the following non-standard Token attributes of spaCy's Rule Based Matcher (via underscore)

  • ent_id
  • ent_iob
  • ent_kb_id
  • has_vector
  • is_bracket
  • is_currency
  • is_left_punct
  • is_oov
  • is_quote
  • is_right_punct
  • lang
  • norm
  • prefix
  • sentiment
  • string
  • suffix
  • text_with_ws
  • whitespace

✅ Supports skipping boolean Token attributes

✅ Supports spaCy's Rule Based Matcher Extended Pattern Syntax

✅ Supports spaCy's Rule Based Matcher Grammar Operators and Quantifiers

✅ Supports Token Wildcard

✅ Supports defining the number of attributes per token within searched patterns

✅ Supports usage of non-repeated token attribute values
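
For readers less familiar with the Rule Based Matcher, the linguistic features above correspond to the following pattern shapes from spaCy's documented pattern syntax. The attribute values are made up for illustration, and the underscore example assumes that a custom Token extension named is_quote has already been registered, which is how spaCy exposes non-standard attributes to the Matcher.

```python
import json

# Extended pattern syntax: set membership and numeric comparison
extended_syntax = [
    {"LOWER": {"IN": ["cat", "dog", "owl"]}},
    {"LENGTH": {">=": 3}},
]

# Grammar operators and quantifiers via the "OP" key
operators = [
    {"POS": "ADJ", "OP": "*"},        # zero or more adjectives
    {"POS": "NOUN", "OP": "+"},       # one or more nouns
    {"IS_PUNCT": True, "OP": "?"},    # optional punctuation
]

# Token wildcard: an empty dict matches any single token
wildcard = [{"LOWER": "a"}, {}, {"IS_PUNCT": True}]

# Non-standard attribute via underscore (assumes a registered "is_quote" extension)
underscore = [{"_": {"is_quote": True}}]

example_patterns = {
    "extended_syntax": extended_syntax,
    "operators": operators,
    "wildcard": wildcard,
    "underscore": underscore,
}

# Patterns are plain lists of dicts, so they serialize directly to JSON
for name, pattern in example_patterns.items():
    print(name, json.dumps(pattern))
```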


Author: Miguel Revuelta Espinosa (revuel), a humble AI enthusiast
