Perform Intent Classification using an External Schema
schema-classification
This microservice classifies parse results.
Usage
The input format looks like this:
input_tokens = [
    {
        "normal": "my",
    },
    {
        "normal": "late",
    },
    {
        "normal": "transport",
    },
    {
        "normal": "late_transport",
        "swaps": {
            "canon": "late_transport",
            "type": "chitchat"
        }
    },
]
Calling the service looks like this:
import os

from schema_classification import classify

# Build an absolute path to the intent schema file
absolute_path = os.path.normpath(
    os.path.join(os.getcwd(), 'resources/testing',
                 'test-intents-0.1.0.yaml'))

svcresult = classify(
    absolute_path=absolute_path,
    input_tokens=input_tokens)
The output from this call looks like this:
{
    'result': [{
        'classification': 'Late_Transport',
        'confidence': 99}],
    'tokens': {
        'late': '',
        'late_transport': 'chitchat',
        'my': '',
        'transport': ''}
}
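The relationship between the input tokens and the tokens map in the sample output can be sketched as follows. This is an illustrative reconstruction from the sample data only, not the package's actual code; summarize_tokens is a hypothetical helper name, and the rule "swap type if present, otherwise an empty string" is an assumption inferred from the example.

```python
def summarize_tokens(input_tokens):
    """Map each token's 'normal' form to its swap type, or '' if it has no swap.

    Hypothetical helper: mirrors the 'tokens' map shown in the sample output.
    """
    summary = {}
    for token in input_tokens:
        swaps = token.get("swaps") or {}
        summary[token["normal"]] = swaps.get("type", "")
    return summary


input_tokens = [
    {"normal": "my"},
    {"normal": "late"},
    {"normal": "transport"},
    {"normal": "late_transport",
     "swaps": {"canon": "late_transport", "type": "chitchat"}},
]

token_summary = summarize_tokens(input_tokens)
print(token_summary)
```

Only tokens that carry a swaps entry contribute an entity type; plain tokens map to an empty string.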
Classification via Mapping
Classification of unstructured text is a mapping exercise.
The mapping is composed of these elements:
- Include One Of
- Include All Of
- Exclude One Of
- Exclude All Of
The classifier will map extracted entities from unstructured text using the listed elements.
For example:
TEST_INTENT:
  - include_one_of:
      - alpha
      - apple
  - include_all_of:
      - beta
      - gamma
  - exclude_one_of:
      - delta
  - exclude_all_of:
      - epsilon
      - digamma
This intent will be selected if the set of extracted entities has either alpha or apple and has both (beta, gamma). The intent will be discarded if delta occurs or if both (epsilon, digamma) occur.
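The rule semantics described above can be sketched with plain Python sets. This is a minimal illustration, not the package's implementation: matches is a hypothetical function name, and the rule is written as a flat dict rather than loaded from the YAML schema.

```python
def matches(rule, entities):
    """Return True if the entity set satisfies the rule.

    Assumed semantics, taken from the description above:
      include_one_of: at least one listed entity is present
      include_all_of: every listed entity is present
      exclude_one_of: no listed entity may be present
      exclude_all_of: the listed entities may not all be present together
    """
    entities = set(entities)
    include_one = set(rule.get("include_one_of", ()))
    include_all = set(rule.get("include_all_of", ()))
    exclude_one = set(rule.get("exclude_one_of", ()))
    exclude_all = set(rule.get("exclude_all_of", ()))

    if include_one and entities.isdisjoint(include_one):
        return False
    if not include_all <= entities:
        return False
    if not entities.isdisjoint(exclude_one):
        return False
    if exclude_all and exclude_all <= entities:
        return False
    return True


TEST_INTENT = {
    "include_one_of": ["alpha", "apple"],
    "include_all_of": ["beta", "gamma"],
    "exclude_one_of": ["delta"],
    "exclude_all_of": ["epsilon", "digamma"],
}

print(matches(TEST_INTENT, {"alpha", "beta", "gamma"}))             # True
print(matches(TEST_INTENT, {"alpha", "beta", "gamma", "delta"}))    # False
print(matches(TEST_INTENT, {"alpha", "beta", "gamma", "epsilon"}))  # True
```

The last call still matches because only one of (epsilon, digamma) is present; the exclude_all_of clause needs both.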
In Python, everything can be loaded into native set structures, which support native operations such as difference, intersection, union, and symmetric difference.
Because these set operations are native (implemented in C in CPython), finding a matching classification is extremely fast.
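As a quick illustration of the four set operations named above, the snippet below applies each one to a small entity set. The variable names and the way each operation is used (for example, difference to find missing required entities) are assumptions chosen for the example, not a description of the package's internals.

```python
entities = {"alpha", "beta", "delta"}   # entities extracted from the text
required = {"beta", "gamma"}            # e.g. an include_all_of clause
forbidden = {"delta"}                   # e.g. an exclude_one_of clause

missing = required - entities           # difference
violations = entities & forbidden       # intersection
vocabulary = required | forbidden       # union
mismatch = entities ^ required          # symmetric difference

print(sorted(missing))      # ['gamma']
print(sorted(violations))   # ['delta']
print(sorted(vocabulary))   # ['beta', 'delta', 'gamma']
print(sorted(mismatch))     # ['alpha', 'delta', 'gamma']
```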
The system adds more smarts by figuring out what to do if the rule states that delta is excluded and a descendant of delta is present, or if alpha should be included and a sibling or child of alpha is present, and so on.
In these cases, I usually rely on a heuristic to boost or lower confidence and tweak it over time to get a good result.
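One possible shape for such a heuristic is sketched below. Everything here is hypothetical: the PARENTS taxonomy, the adjust_confidence function, and the boost and penalty weights are invented for illustration and are not taken from the package. The sketch only handles ancestor relationships in an acyclic taxonomy; siblings would need an extra lookup.

```python
# Hypothetical child -> parent taxonomy; not taken from the package.
PARENTS = {
    "delta_minor": "delta",
    "alpha_prime": "alpha",
    "apple": "fruit",
}


def ancestors(entity, parents=PARENTS):
    """Return all ancestors of an entity by walking child -> parent links."""
    found = set()
    while entity in parents:
        entity = parents[entity]
        found.add(entity)
    return found


def adjust_confidence(base, entities, include, exclude,
                      boost=5, penalty=20):
    """Nudge a base confidence using ancestor matches (assumed weights)."""
    inherited = set()
    for entity in entities:
        inherited |= ancestors(entity)
    inherited -= set(entities)  # only count entities matched indirectly

    score = base
    score += boost * len(inherited & set(include))
    score -= penalty * len(inherited & set(exclude))
    return max(0, min(100, score))  # clamp to the 0-100 range


# A child of alpha is present: confidence is boosted.
print(adjust_confidence(80, {"alpha_prime", "beta"}, {"alpha"}, {"delta"}))  # 85
# A descendant of delta is present: confidence is lowered.
print(adjust_confidence(80, {"delta_minor", "beta"}, {"alpha"}, {"delta"}))  # 60
```

The weights are the part that gets tuned over time; the set arithmetic stays the same.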