Perform Intent Classification using an External Schema

schema-classification

This microservice classifies parse results into intents using an externally defined schema.

Usage

The input format looks like this:

input_tokens = [
    {
        "normal": "my",
    },
    {
        "normal": "late",
    },
    {
        "normal": "transport",
    },
    {
        "normal": "late_transport",
        "swaps": {
            "canon": "late_transport",
            "type": "chitchat"
        }
    },
]

Calling the service looks like this:

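import os

from schema_classification import classify

# Build an absolute path to the intent schema (YAML) file
absolute_path = os.path.normpath(
    os.path.join(os.getcwd(), 'resources/testing',
                 'test-intents-0.1.0.yaml'))

# Classify the input tokens against the schema
svcresult = classify(
    absolute_path=absolute_path,
    input_tokens=input_tokens)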

The output from this call looks like this:

{
    'result': [{
        'classification': 'Late_Transport',
        'confidence': 99 }],
    'tokens': {
        'late': '',
        'late_transport': 'chitchat',
        'my': '',
        'transport': ''}
}
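
As a rough illustration of consuming this output, the sketch below picks the highest-confidence entry and keeps only the tokens that carry a type. It works on the literal result shown above; that `result` may hold more than one candidate, or be empty, is an assumption rather than documented behavior.

```python
# Result literal copied from the example output above.
svcresult = {
    'result': [{'classification': 'Late_Transport', 'confidence': 99}],
    'tokens': {'late': '', 'late_transport': 'chitchat', 'my': '', 'transport': ''},
}

# Assumption: 'result' may contain several candidates or none at all.
candidates = svcresult.get('result') or []
best = max(candidates, key=lambda r: r['confidence'], default=None)

# Keep only tokens that carry a non-empty type.
typed_tokens = {tok: typ for tok, typ in svcresult['tokens'].items() if typ}

print(best['classification'] if best else None)  # Late_Transport
print(typed_tokens)                              # {'late_transport': 'chitchat'}
```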

Classification via Mapping

Classification of unstructured text is a mapping exercise.

The mapping is composed of these elements:

  1. Include One Of
  2. Include All Of
  3. Exclude One Of
  4. Exclude All Of

The classifier will map extracted entities from unstructured text using the listed elements.

For example:

TEST_INTENT
  - include_one_of:
    - alpha
    - apple
  - include_all_of:
    - beta
    - gamma
  - exclude_one_of:
    - delta
  - exclude_all_of:
    - epsilon
    - digamma

This intent will be selected if the set of extracted entities has either alpha or apple and has both (beta, gamma). The intent will be discarded if delta occurs or if both (epsilon, digamma) occur.

In Python, the rules and the extracted entities can be loaded into native set structures and compared with native operations such as difference, intersection, union, and symmetric difference.

Because these set operations are built in (implemented in C in CPython), finding an accurate classification is very fast.
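
The rule semantics described for TEST_INTENT can be sketched with plain set operations. This is a minimal illustration, not the package's internal implementation: the dictionary layout and the `matches` helper are hypothetical names introduced here.

```python
# Hypothetical rule layout mirroring TEST_INTENT; not the package's internal format.
TEST_INTENT = {
    "include_one_of": {"alpha", "apple"},
    "include_all_of": {"beta", "gamma"},
    "exclude_one_of": {"delta"},
    "exclude_all_of": {"epsilon", "digamma"},
}


def matches(rule: dict, entities: set) -> bool:
    """Return True if the entity set satisfies the rule (illustrative only)."""
    include_one = rule.get("include_one_of", set())
    include_all = rule.get("include_all_of", set())
    exclude_one = rule.get("exclude_one_of", set())
    exclude_all = rule.get("exclude_all_of", set())

    # At least one include_one_of entity must be present.
    if include_one and not (include_one & entities):
        return False
    # Every include_all_of entity must be present.
    if not include_all <= entities:
        return False
    # Any single exclude_one_of entity disqualifies the intent.
    if exclude_one & entities:
        return False
    # The intent is disqualified only when all exclude_all_of entities co-occur.
    if exclude_all and exclude_all <= entities:
        return False
    return True


print(matches(TEST_INTENT, {"alpha", "beta", "gamma"}))           # True
print(matches(TEST_INTENT, {"alpha", "beta", "gamma", "delta"}))  # False
```

Treating an empty rule element as "no constraint" is a design choice of this sketch; the package may handle missing elements differently.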

The system goes further by deciding what to do when a rule excludes delta and a descendant of delta is present.

The same applies when alpha should be included and a sibling or child of alpha is present, and so on.

In these cases, I usually rely on a heuristic to boost or lower confidence, and tweak it over time to get a good result.
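
One way such a heuristic could look is sketched below. Everything in it is an assumption made for illustration: the `PARENTS` taxonomy, the `ancestors` and `adjust_confidence` helpers, and the penalty of 20 are hypothetical and do not come from the package.

```python
# Hypothetical child -> parent taxonomy; the real hierarchy source is not described here.
PARENTS = {
    "delta_prime": "delta",
    "alpha_child": "alpha",
}


def ancestors(entity: str) -> set:
    """Collect every ancestor of an entity by walking the parent links."""
    found = set()
    while entity in PARENTS:
        entity = PARENTS[entity]
        found.add(entity)
    return found


def adjust_confidence(base: int, excluded: set, entities: set, penalty: int = 20) -> int:
    """Lower confidence once per entity that descends from an excluded term.

    The penalty is an arbitrary placeholder, meant to be tuned over time.
    """
    hits = sum(1 for e in entities if ancestors(e) & excluded)
    return max(0, base - penalty * hits)


print(adjust_confidence(99, {"delta"}, {"alpha", "delta_prime"}))  # 79
```

Lowering confidence instead of discarding the intent outright keeps a near-miss available as a candidate while ranking it below exact matches.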
