makes it easier to implement a ContentAI extractor
contentai-extractor-runtime-python
This is a Python package for implementing a custom extractor that runs on the ContentAI platform.
https://pypi.org/project/contentaiextractor/
Usage
pip install contentaiextractor
import contentaiextractor as contentai
# download content locally
content_path = contentai.download_content()
# access metadata that was supplied when running a job
# contentai run s3://bucket/video.mp4 -d '{ "input": "value" }'
inputData = contentai.metadata()["input"]
# get output from another extractor
csv_data = contentai.get("extractor", "data.csv")
json_data = contentai.get_json("extractor", "data.json")
# extract some data
outputData = []
outputData.append({"frameNumber": 1})
# output data from this extractor
# outputData is a list, so use set_json (set expects a string)
contentai.set_json("output", outputData)
API Documentation
ContentAIError Objects
class ContentAIError(Exception)
represents a contentai error
Fields
- extractor_name - name of the extractor being run
- job_id - current job id
- content_url - URL of the content the extractor is run against
- content_path - local path where the extractor can access the content
- result_path - local path where the extractor should write the results
- running_in_contentai - boolean set to True; useful for testing code locally
- metadata_json - raw string (or None if not set) for active extractor run (also, see parsed metadata())
Functions
download_content
download_content()
download content to work with locally
returns local path where content is written
metadata
metadata()
returns a dict containing input metadata
example:
# access metadata that was supplied when running a job
# contentai run s3://bucket/video.mp4 -d '{ "input": "value" }'
input_data = contentai.metadata()["input"]
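The relationship between the raw `metadata_json` string and the parsed `metadata()` dict can be sketched roughly as follows. This is a minimal illustration, not the package's implementation: `parse_metadata` is a hypothetical helper, it assumes the raw value is a JSON object, and it assumes a missing value should behave like an empty dict, which the package may handle differently.

```python
import json


def parse_metadata(raw):
    """Parse a raw metadata string into a dict.

    Hypothetical helper: None or an empty string yields {} (an assumption,
    not confirmed behavior of contentaiextractor).
    """
    if not raw:
        return {}
    parsed = json.loads(raw)
    if not isinstance(parsed, dict):
        raise ValueError("metadata must be a JSON object")
    return parsed


# Same payload as the `contentai run ... -d` example above.
raw = '{ "input": "value" }'
input_data = parse_metadata(raw).get("input", "default")
missing = parse_metadata(None).get("input", "default")
```

Using `.get` with a default keeps local runs working when no metadata was supplied.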
keys
keys(extractor_name)
get a list of keys for specified extractor
returns a list of keys, for example:
[
"data.json",
"data.csv",
"data.txt"
]
example:
keys = contentai.keys("azure_videoindexer")
for key in keys:
    data = contentai.get("azure_videoindexer", key)
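The loop above can be wrapped in a small helper that gathers every JSON output of another extractor. This is a minimal sketch, not part of the package: `load_json_outputs` is a hypothetical name, it assumes `keys()` returns an iterable of key strings as in the example, and it takes the API as a parameter so a stand-in object can replace `contentaiextractor` when running locally.

```python
from types import SimpleNamespace


def load_json_outputs(api, extractor_name):
    """Return {key: parsed JSON} for every *.json key of an extractor.

    `api` is expected to expose keys() and get_json() like the
    contentaiextractor module; this helper itself is hypothetical.
    """
    results = {}
    for key in api.keys(extractor_name):
        if key.endswith(".json"):
            results[key] = api.get_json(extractor_name, key)
    return results


# Local stand-in with canned data (an assumption for illustration only).
_fake_store = {"data.json": {"frames": 2}, "data.csv": "a,b"}
fake_api = SimpleNamespace(
    keys=lambda extractor_name: list(_fake_store),
    get_json=lambda extractor_name, key: _fake_store[key],
)

outputs = load_json_outputs(fake_api, "azure_videoindexer")
```

Inside a real extractor, the imported module itself could be passed in place of the stand-in, for example `load_json_outputs(contentai, "azure_videoindexer")`.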
get
get(extractor_name, key)
get the contents of a particular key
example:
# get another extractor's output
data = contentai.get("some_extractor", "output.csv")
get_json
get_json(extractor_name, key)
get the json contents of a particular key
example:
# get another extractor's output
data = contentai.get_json("some_extractor", "data.json")
get_bytes
get_bytes(extractor_name, key)
get the contents of a particular key in raw bytes
example:
# get another extractor's output
data = contentai.get_bytes("some_extractor", "output.bin")
set
set(key, value)
set results data for this extractor
can be called multiple times with different keys
value is a string
example:
contentai.set("output", "hello world")
set_json
set_json(key, value)
set results data for this extractor
can be called multiple times with different keys
value can be anything
example:
data = {}
data["foo"] = "bar"
contentai.set_json("output", data)
set_bytes
set_bytes(key, value)
set results data for this extractor
can be called multiple times with different keys
value is bytes
example:
with open("some-file", "rb") as some_file:
    contentai.set_bytes("output", some_file.read())
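Since `set`, `set_json`, and `set_bytes` differ only in the type of value they accept, the choice between them can be made from the value itself. The following is a minimal sketch under that reading of the descriptions above: `store_result` and `RecordingAPI` are hypothetical names for illustration, and `RecordingAPI` is only a local stand-in that records calls rather than writing results.

```python
class RecordingAPI:
    """Local stand-in that records calls (hypothetical, for illustration)."""

    def __init__(self):
        self.calls = []

    def set(self, key, value):
        self.calls.append(("set", key, value))

    def set_json(self, key, value):
        self.calls.append(("set_json", key, value))

    def set_bytes(self, key, value):
        self.calls.append(("set_bytes", key, value))


def store_result(api, key, value):
    """Pick the setter that matches the value's type."""
    if isinstance(value, (bytes, bytearray)):
        api.set_bytes(key, bytes(value))  # raw bytes
    elif isinstance(value, str):
        api.set(key, value)               # plain string
    else:
        api.set_json(key, value)          # anything else, stored as JSON


api = RecordingAPI()
store_result(api, "text", "hello world")
store_result(api, "frames", [{"frameNumber": 1}])
store_result(api, "blob", b"\x00\x01")
```

Each call uses a different key, in line with the note that the setters can be called multiple times with different keys.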
save_results
save_results()
save results immediately, instead of waiting until process exits
parse_content_url
parse_content_url()
extract details from content url
returns
- source_bucket_name - the s3 bucket name derived from content_url
- source_bucket_key - the s3 bucket key derived from content_url
- source_bucket_region - the s3 bucket region derived from content_url
the following content url formats are supported:
- Simple (CLI) Format - s3://{bucket}/{key}
- Virtual Hosted Format - https://{bucket}.s3.amazonaws.com/{key}
- Virtual Hosted Format with Region - https://{bucket}.s3.{region}.amazonaws.com/{key}
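To make the three supported formats concrete, here is a minimal standalone sketch of how such URLs could be split into bucket, key, and region. It is not the package's implementation: `parse_s3_url` is a hypothetical function, and returning `None` for the region when the URL does not carry one is an assumption (the package may apply a default instead).

```python
import re
from urllib.parse import urlparse

# Host pattern for virtual hosted URLs, with an optional region label.
_VIRTUAL_HOST = re.compile(
    r"^(?P<bucket>.+)\.s3(?:\.(?P<region>[a-z0-9-]+))?\.amazonaws\.com$"
)


def parse_s3_url(content_url):
    """Return (bucket, key, region) for the three formats listed above.

    Hypothetical helper; region is None when the URL does not include one.
    """
    parsed = urlparse(content_url)
    key = parsed.path.lstrip("/")
    if parsed.scheme == "s3":
        # Simple (CLI) format: s3://{bucket}/{key}
        return parsed.netloc, key, None
    if parsed.scheme == "https":
        match = _VIRTUAL_HOST.match(parsed.netloc)
        if match:
            return match.group("bucket"), key, match.group("region")
    raise ValueError(f"unsupported content url: {content_url}")


simple = parse_s3_url("s3://bucket/video.mp4")
hosted = parse_s3_url("https://bucket.s3.amazonaws.com/dir/video.mp4")
regional = parse_s3_url("https://bucket.s3.us-west-2.amazonaws.com/video.mp4")
```

Any URL outside these three shapes raises `ValueError` in this sketch, so unsupported inputs fail loudly instead of producing a wrong bucket name.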
Develop
Choose a make command to run
build    build package
deploy   upload package to pypi
docs     generates api docs in markdown
Changes
- 1.0.4
  - updated changelog
- 1.0.3
  - fixes issue where EXTRACTOR_METADATA envvar was inadvertently required
- 1.0.2
  - add safety to setting retrieval on local runs
  - documentation updates
- 1.0.1
  - api docs for publish to pypi
- 1.0.0
  - initial release