
ContentAI Activity Classification Service


activity-classifier-extractor

Generates activity classifications from low-level feature inputs in support of analytic workflows within the ContentAI Platform, published as the extractor dsai_activity_classifier.

  1. Getting Started

  2. Execution

  3. Creating Models

  4. Testing

  5. Future Development

  6. Changes

Getting Started

This library is used as a single-run executable.
Runtime parameters can be passed to configure the returned results; they can be examined in more detail in the main script.
  • verbose - (bool) - verbose input/output configuration printing (default=false)

  • path_content - (str) - input video path for files to label (default=video.mp4)

  • path_result - (str) - output path for samples (default=.)

  • path_models - (str) - manifest path for model information (default=data/models/manifest.json)

  • time_interval - (float) - time interval for predictions from models (default=3.0)

  • average_predictions - (bool) - flatten predictions across time and class (default=false)

  • round_decimals - (int) - rounding decimals for predictions (default=5)

  • score_min - (float) - apply a minimum score threshold for classes (default=0.1)
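
For example, a single run combining several of these parameters might look like the following sketch; the content and output paths are placeholders.

python -u activity_classifier/main.py --path_content video.mp4 --path_result results/ \
        --time_interval 3.0 --score_min 0.1 --verbose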

dependencies

To install package dependencies in a fresh system, the recommended technique is a set of vanilla pip packages. The latest requirements should be validated from the requirements.txt file; at the time of writing, they installed with the following command.
pip install --no-cache-dir -r requirements.txt

Execution and Deployment

This package is meant to be run as a one-off processing tool that aggregates the insights of other extractors.

command-line standalone

Run the code as if it were an extractor. In this mode, configure a few environment variables to let the code know where to look for content.

One can also run the command-line with a single argument as input and optionally add runtime configuration (see runtime variables) as part of the EXTRACTOR_METADATA variable as JSON.

EXTRACTOR_METADATA='{"compressed":true}'
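
For a fully local emulation, the metadata variable can be combined with content and result locations. The variable names EXTRACTOR_CONTENT_PATH and EXTRACTOR_RESULT_PATH below are assumptions about the ContentAI runtime, so confirm them against run_local.sh before relying on this sketch.

EXTRACTOR_CONTENT_PATH=features/ EXTRACTOR_RESULT_PATH=results/ \
EXTRACTOR_METADATA='{"verbose":true}' python -u activity_classifier/main.py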

Locally Run Classifier on Results

For utility, the above line has been wrapped in the bash script run_local.sh.

./run_local.sh <docker_image> [<source_directory> <output_data_dir> [<json_args>]] [<all_args>]
   - run activity classification on source with prior processing

  <docker_image> = 0 IF local command-line based (args using arg parse)
                 = 1 IF local docker emulation
                 = IMAGE_NAME IF docker image name to run

  ./run_local.sh 0 --path_content features/ --path_result results/ --verbose
  ./run_local.sh 1 features/ results/ 0 '{\"verbose\":true}'

Through all of the above examples, the underlying command-line execution is similar to this execution run on the testing data.

python -u activity_classifier/main.py --path_content testing/data/launch/video.mp4
        --path_result testing/class --path_models activity_classifier/data/models/manifest.json --verbose

Feature-Based Similarity

A helper script is also available to compute the similarity of clips in one or more feature files. (v1.1.0)

python -u activity_classifier/features.py --path_content testing/data/dummy.txt \
        --feature_type dsai_videocnn dsai_vggish --path_result testing/dist

ContentAI

Deployment

Deployment is easy and follows standard ContentAI steps.

contentai deploy dsai_activity_classifier
Deploying...
writing workflow.dot
done

Alternatively, you can pass an image name to avoid rebuilding the docker image.

docker build -t dsai_activity_classifier .
contentai deploy dsai_activity_classifier dsai_activity_classifier

Locally Downloading Results

You can locally download data from a specific job for this extractor to directly analyze.

contentai data wHaT3ver1t1s --dir data

Run as an Extractor

contentai run https://bucket/video.mp4  -w 'digraph { dsai_videocnn -> dsai_activity_classifier; dsai_vggish -> dsai_activity_classifier }'

JOB ID:     1Tfb1vPPqTQ0lVD1JDPUilB8QNr
CONTENT:    s3://bucket/video.mp4
STATE:      complete
START:      Fri Feb 15 04:38:05 PM (6 minutes ago)
UPDATED:    1 minute ago
END:        Fri Feb 15 04:43:04 PM (1 minute ago)
DURATION:   4 minutes

EXTRACTORS

my_extractor

TASK      STATE      START           DURATION
724a493   complete   5 minutes ago   1 minute

Or run it via the docker image. Please review the run_local.sh file for more information.

View Extractor Logs (stdout)

contentai logs -f <my_extractor>
my_extractor Fri Nov 15 04:39:22 PM writing some data
Job complete in 4m58.265737799s

Adding New Models

There are two steps to adding new models.

  1. First, train the models and serialize them into a well-known structure (this can be done exhaustively across a number of model types). See MODELS.rst for more details.

  2. Update the manifest according to the instructions below to indicate how the activity classifier should load the model (e.g. the framework), the required features, and a few descriptive fields (e.g. the name and the id).

Updating The Manifest

Adding models to the pre-determined set of models is as easy as editing a manifest file and adding a model into git LFS.

  1. Archive the new model into a serialized fileset. At the time of writing, this meant serializing models from sklearn with simple pickle load/save serialization (see the sketch after this list).

  2. Gather all of the relevant output files and compress them if you can. Currently, the library understands gzip compression extensions (e.g. “.gz”).

  3. Choose the appropriate sub-directory that corresponds to the upstream feature extractor. For example, models built on 3dcnn features require new videos to be processed (via extractor chaining) by the extractor dsai_3dcnn. If one doesn’t exist yet, please create a new directory, but remember what combination of audio and video features is required.

  4. Modify the manifest file in activity_classifier/data/models/manifest.json for your new entry. Specifically, the input video and audio features must be defined as well as the serialization library. Below is an example block that indicates 3dcnn video and vggish audio features for a model created with sklearn, where prediction results will be nested under the name Running.

    [ ...
    {
        "path": "3dcnn-vggish/lr-Running.pkl.gz",
        "name": "Running",
        "id": "ugc",
        "framework": "sklearn",
        "video": "dsai_videocnn",
        "audio": "dsai_vggish"
    },
    ... ]
  5. Prepare to add your model files to the repo. NOTE: This repo uses git-lfs (https://git-lfs.github.com/) to store all binary files like models. If your model is added with regular git tools alone, you will get a sternly worded email (and friendly advice on how to re-add it correctly).

    (from the base directory only)
    git lfs track activity_classifier/data/models/3dcnn/moonwalk_model.pkl.gz
    git add activity_classifier/data/models/3dcnn/moonwalk_model.pkl.gz
    git add activity_classifier/data/models/manifest.json
  6. Test your model with the data in the testing directory. The CI/CD process should do this too but it’s always easier to find and fix problems here than with a vague email. The features in this directory came from processing of the HBO Max Launch Video, which is publicly available as a reference.

    (from the base directory)
    
    ./run_local.sh 0 --path_content testing/data/test.mp4 --time_interval 1.5
    
    (check for predictions from your new model in data.json)
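
To illustrate steps 1 and 2 above, below is a minimal sketch of serializing an sklearn model with pickle and gzip into the layout the manifest expects; the training data and output path are hypothetical stand-ins.

import gzip
import pickle

import numpy as np
from sklearn.linear_model import LogisticRegression

# hypothetical feature matrix standing in for concatenated video + audio features
X_train = np.random.rand(100, 640)
y_train = np.random.randint(0, 2, 100)  # binary labels for one activity, e.g. "Running"

# train a simple binary classifier
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# serialize with pickle and compress with gzip; the loader understands the ".gz" extension
with gzip.open("activity_classifier/data/models/3dcnn-vggish/lr-Running.pkl.gz", "wb") as handle:
    pickle.dump(model, handle)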

Testing

Testing is included via tox. To launch testing for the entire package, just run tox at the command line. Testing can also be run for a specific file within the package by setting the environment variable TOX_ARGS.

TOX_ARGS=test_basic.py tox

Future Development

  • additional training hooks?

Changes

Generates activity classifications from low-level feature inputs in support of analytic workflows within the ContentAI Platform.

1.3

1.3.7

  • fix run_local typos

  • more verbosity checks

1.3.6

  • modeling.py separators

  • docs reorg

1.3.5

  • contentai key request fix

1.3.3

  • docs update

  • multiclass write

1.3.2

  • docker build update, run example update

1.3.1

  • docs fix for example of using package

  • bug fix for default location, change inputs to classify function

1.3.0

  • move models out of the primary package

  • breaking change, rename input param path_models to path_manifest

1.2

1.2.2

  • bump version for model migration to LFS

1.2.1

  • fix docker/deployed image run command

1.2.0

  • switch to package representation, push to pypi

  • several updates for MANIFEST definition (id)

  • inclusion of multi-parameter training and testing framework

  • safety for model loading, catch exceptions, return gracefully

  • update documents to split for binary models

1.1

1.1.1

  • cosmetic change for reuse in other libraries

1.1.0

  • refactor feature code, add utility for difference computation among segments

  • min value thresholding to avoid low scoring results in output (default=0.1)

  • refactor caching information for feature load (allow flatten, remove cache, allow multi-asset)

  • allow recursive feature load for distance compute

1.0

1.0.2

  • fixes for output, modify to require other extractors as dependencies

  • fix order of parameters for local runs

1.0.1

  • updates for integration of other models, fixes for prediction output

  • add l2norm after average/merge in time of source features

1.0.0

  • initial project merge from other sources

  • generates json prediction dict

  • callable as package

  • includes some testing routines with windowing comparison
