Perform Intent Classification using an External Schema
schema-classification
This microservice classifies parse results against an external schema.
Usage
The input format looks like this:
input_tokens = [
    {
        "normal": "my",
    },
    {
        "normal": "late",
    },
    {
        "normal": "transport",
    },
    {
        "normal": "late_transport",
        "swaps": {
            "canon": "late_transport",
            "type": "chitchat"
        }
    },
]
Calling the service looks like this:
import os

from schema_classification import classify

# Build an absolute path to the schema file, relative to the working directory
absolute_path = os.path.normpath(
    os.path.join(os.getcwd(), 'resources/testing',
                 'test-intents-0.1.0.yaml'))

svcresult = classify(
    absolute_path=absolute_path,
    input_tokens=input_tokens)
The output from this call looks like this:
{
    'result': [{
        'classification': 'Late_Transport',
        'confidence': 99}],
    'tokens': {
        'late': '',
        'late_transport': 'chitchat',
        'my': '',
        'transport': ''}
}
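Judging from the example above, the tokens map appears to pair each token's normal form with the type of its swap, or with an empty string when the token has no swap. The following is a minimal sketch of that relationship, inferred from the sample output only; summarize_tokens is a hypothetical helper and not part of the schema_classification API.

```python
def summarize_tokens(input_tokens):
    """Map each token's 'normal' form to its swap type ('' if no swap).

    Hypothetical helper: inferred from the sample output, not taken
    from the library.
    """
    summary = {}
    for token in input_tokens:
        swaps = token.get("swaps") or {}
        summary[token["normal"]] = swaps.get("type", "")
    return summary


input_tokens = [
    {"normal": "my"},
    {"normal": "late"},
    {"normal": "transport"},
    {"normal": "late_transport",
     "swaps": {"canon": "late_transport", "type": "chitchat"}},
]

tokens = summarize_tokens(input_tokens)
print(tokens)
```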
Classification via Mapping
Classification of unstructured text is a mapping exercise.
The mapping is composed of these elements:
- Include One Of
- Include All Of
- Exclude One Of
- Exclude All Of
The classifier maps the entities extracted from unstructured text against the elements listed above.
For example:
TEST_INTENT:
  - include_one_of:
      - alpha
      - apple
  - include_all_of:
      - beta
      - gamma
  - exclude_one_of:
      - delta
  - exclude_all_of:
      - epsilon
      - digamma
This intent will be selected if the set of extracted entities contains either alpha or apple, and contains both beta and gamma. The intent will be discarded if delta occurs, or if both epsilon and digamma occur.
In Python, everything can be loaded into a native set and compared using built-in operations such as difference, intersection, union, and symmetric difference.
Because these set operations are built in (implemented in C in CPython), finding an accurate classification is very fast.
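The rule described above can be sketched with plain set operations. This is an illustration under assumptions, not the library's implementation: matches is a hypothetical helper, and the schema entry is flattened into a single dict for brevity.

```python
def matches(rule, entities):
    """Return True if `entities` satisfies the include/exclude rule.

    Hypothetical helper illustrating the set logic; the library's own
    implementation may differ.
    """
    entities = set(entities)
    include_one_of = set(rule.get("include_one_of", []))
    include_all_of = set(rule.get("include_all_of", []))
    exclude_one_of = set(rule.get("exclude_one_of", []))
    exclude_all_of = set(rule.get("exclude_all_of", []))

    # At least one of the "include one of" entities must be present.
    if include_one_of and not (include_one_of & entities):
        return False
    # Every "include all of" entity must be present.
    if not include_all_of <= entities:
        return False
    # Any single "exclude one of" entity discards the intent.
    if exclude_one_of & entities:
        return False
    # The intent is discarded only when all "exclude all of" entities occur.
    if exclude_all_of and exclude_all_of <= entities:
        return False
    return True


# Assumed in-memory form of the TEST_INTENT schema entry.
TEST_INTENT = {
    "include_one_of": ["alpha", "apple"],
    "include_all_of": ["beta", "gamma"],
    "exclude_one_of": ["delta"],
    "exclude_all_of": ["epsilon", "digamma"],
}

selected = matches(TEST_INTENT, {"alpha", "beta", "gamma"})
missing_gamma = matches(TEST_INTENT, {"alpha", "beta"})
has_delta = matches(TEST_INTENT, {"apple", "beta", "gamma", "delta"})
one_of_pair = matches(TEST_INTENT, {"alpha", "beta", "gamma", "epsilon"})
both_of_pair = matches(
    TEST_INTENT, {"alpha", "beta", "gamma", "epsilon", "digamma"})
```

Here selected and one_of_pair are True, because epsilon alone does not trigger the "exclude all of" condition; the other three calls return False.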
The system adds more intelligence by deciding what to do when a rule excludes delta and a descendant of delta is present, or when alpha should be included and a sibling or child of alpha is present, and so on.
In these cases, I usually rely on a heuristic to boost or lower confidence, and tweak it over time to get a good result.
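One way such a heuristic might look is sketched below. Everything here is an assumption made for illustration: the PARENT taxonomy, the adjust_confidence helper, and the boost and penalty values are hypothetical and would need tuning. For brevity the sketch handles descendants only, not siblings.

```python
# Hypothetical child -> parent taxonomy; not part of the library.
PARENT = {
    "delta_minor": "delta",
    "alpha_child": "alpha",
    "alpha": "greek",
    "beta": "greek",
}


def ancestors(entity, parent=PARENT):
    """Return the set of all ancestors of `entity` in the taxonomy."""
    found = set()
    while entity in parent:
        entity = parent[entity]
        found.add(entity)
    return found


def adjust_confidence(base, entities, include, exclude,
                      boost=5, penalty=20):
    """Nudge a confidence score using hierarchy relationships.

    The boost and penalty defaults are arbitrary placeholders.
    """
    score = base
    for entity in entities:
        lineage = ancestors(entity)
        # A descendant of an excluded entity counts against the intent.
        if lineage & exclude:
            score -= penalty
        # A descendant of an included entity counts in its favour.
        if lineage & include:
            score += boost
    # Keep the result within the 0-100 range.
    return max(0, min(100, score))


lowered = adjust_confidence(
    80, {"delta_minor"}, include={"alpha"}, exclude={"delta"})
raised = adjust_confidence(
    80, {"alpha_child"}, include={"alpha"}, exclude={"delta"})
unchanged = adjust_confidence(
    80, {"beta"}, include={"alpha"}, exclude={"delta"})
```

With these placeholder values, a descendant of delta lowers the score from 80 to 60, a child of alpha raises it to 85, and an unrelated entity leaves it at 80.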