Sermos - Machine Learning for the Real World
Sermos
Note: This is Sermos' developer README. Public documentation is generated using Sphinx and located at Sermos.
Quickstart
- Add sermos as a dependency to your Python application
- Install extras depending on what you are building:
  - flask - Convenient interface for Flask applications
  - flask_auth - Utilize Sermos' authentication system (Sermos Cloud only)
  - web - Some standard web server dependencies we like
  - workers - Installs Celery and networkx, which are required if using Sermos Cloud for your deployed pipelines and scheduled tasks
  - deploy - Only required for deploying to Sermos Cloud
Overview
Sermos provides a set of tools and a design roadmap for developers to quickly and effectively integrate Data Science into real-world applications. The core design decisions and recommended technologies are based on nearly a decade of putting data science to work in demanding applications such as real-time motorsports strategy at Pit Rho, custom implementations across industries as diverse as energy, finance, and healthcare at Rho AI, and climate impact assessment tools at CRANE. That said, Sermos is built by developers and data scientists and strives to strike an appropriate balance between opinionated decisions and personal choice.
This is open source software and we look forward to seeing what you can build on top of Sermos. For those looking for quick, scalable, enterprise-grade deployments, make sure to check out Sermos Cloud, which is purpose-built for running complex, scalable, highly available Machine Learning (ML) workloads, including Natural Language Processing (NLP), Computer Vision (CV), Decision Modeling, Internet of Things (IoT), and more.
A standard Sermos application comprises two key libraries:
- Sermos: This is the base package with optional scaffolding, architecture components (e.g. DAG implementation on top of Celery), and some useful utilities.
- Sermos Tools: Tool catalog for use in Sermos (or other!) Machine Learning applications.
Sermos
- Celery Configuration
- Pipelines
- APIs
- Utilities
Sermos Tools
- NLP Tools
  - Document Classification
  - Document Similarity Searching
  - Date Finding
  - etc.
- IoT Tools
- General Tools
  - Slack Notifications
Your Application
aka "Sermos Client" or "Client Codebase"
This is where all of your code lives and only has a few requirements:
- It is a base application written in Python (additional language support slated for the future).
- Scheduled tasks and Pipeline nodes must be Python methods that accept at least one positional argument: event
- [Optional] A sermos.yaml file, which is the configuration file used if you choose to deploy using Sermos Cloud
Celery [optional]
Sermos provides sensible default configurations for the use of Celery. The default deployment uses RabbitMQ, which is recommended. This library can also be used in any other workflow (e.g. Kafka) as desired.
Sermos handles two core aspects of Celery differently from a standard Celery deployment.
ChainedTask
When celery.py is imported, it configures Celery and also runs GenerateCeleryTasks().generate(), which uses the sermos.yaml config to turn customer methods into decorated Celery tasks. Part of this process includes adding ChainedTask as the base for all of these dynamically generated tasks.
ChainedTask is a Celery Task that injects tools and event into the signature of all dynamically generated tasks.
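The injection idea described above can be sketched without Celery. This is only an illustration under stated assumptions: ChainedTaskSketch, generate_tasks, and the tools dictionary are hypothetical names, and the rule of passing tools only to functions that declare a tools parameter is an assumption made for this example. Sermos' actual ChainedTask is a Celery Task and may inject these values differently.

```python
import inspect
from typing import Any, Callable, Dict


class ChainedTaskSketch:
    """Hypothetical stand-in for a task base class (not Sermos' ChainedTask)."""

    def __init__(self, func: Callable[..., Any], tools: Dict[str, Any]):
        self.func = func
        self.tools = tools
        # Assumption: only pass tools to functions that declare a `tools` parameter.
        self.wants_tools = "tools" in inspect.signature(func).parameters

    def __call__(self, event: Dict[str, Any]) -> Any:
        # Every wrapped function receives `event` as its first positional argument.
        if self.wants_tools:
            return self.func(event, tools=self.tools)
        return self.func(event)


def generate_tasks(methods: Dict[str, Callable[..., Any]],
                   tools: Dict[str, Any]) -> Dict[str, ChainedTaskSketch]:
    """Wrap each plain function so all tasks share one calling convention."""
    return {name: ChainedTaskSketch(func, tools) for name, func in methods.items()}


def node_a(event):
    return f"node_a saw {event['id']}"


def node_b(event, tools=None):
    return tools["greeting"] + " " + str(event["id"])


tasks = generate_tasks(
    {"node_a": node_a, "node_b": node_b},
    tools={"greeting": "hello"},
)
```

Calling tasks["node_a"]({"id": 1}) invokes the plain function with the event only, while tasks["node_b"] additionally receives the shared tools object.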
SermosScheduler
We allow users to set new scheduled / recurring tasks on-the-fly. Celery's default beat_scheduler does not support this behavior and would require that the Beat process be killed/restarted upon every change. Instead, we set our custom sermos.celery_beat:SermosScheduler as the beat_scheduler, which takes care of watching the database for new/modified entries and reloads dynamically.
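The watch-and-reload behavior described above could look roughly like the following. This is a Celery-free sketch, not the sermos.celery_beat implementation: ReloadingScheduleSketch, its tick method, and the in-memory dictionary standing in for the database are all hypothetical.

```python
from typing import Callable, Dict


class ReloadingScheduleSketch:
    """Hypothetical illustration of reloading a schedule without a restart."""

    def __init__(self, load_entries: Callable[[], Dict[str, dict]]):
        self._load_entries = load_entries
        self.schedule: Dict[str, dict] = {}
        self.reload_count = 0

    def tick(self) -> bool:
        """Run once per scheduler loop; reload only when the entries changed."""
        latest = self._load_entries()
        if latest != self.schedule:
            # Copy so later changes to the store are detected on the next tick.
            self.schedule = {name: dict(entry) for name, entry in latest.items()}
            self.reload_count += 1
            return True
        return False


# An in-memory dict stands in for the database table of scheduled tasks.
database = {"nightly": {"task": "demo_pipeline_node_a", "every_seconds": 86400}}
scheduler = ReloadingScheduleSketch(lambda: database)

scheduler.tick()  # first load
database["hourly"] = {"task": "demo_pipeline_node_a", "every_seconds": 3600}
changed = scheduler.tick()  # picks up the new entry without restarting anything
```

The design point is that the schedule is compared against its source on every loop, so additions and modifications take effect while the process keeps running.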
Workers / Tasks / Pipeline Nodes
If running in Sermos Cloud, Sermos runs all work (whether scheduled tasks or pipeline nodes) as Celery workers. Sermos handles decorating the tasks, generating the correct Celery chains, etc.
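Customer code has one requirement: write a Python method that accepts one positional argument: event
e.g.
    import logging

    logger = logging.getLogger(__name__)

    def demo_pipeline_node_a(event):
        # Log the incoming event; a node may also return a value.
        logger.info(f"RUNNING demo_pipeline_node_a: {event}")
        return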
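To show why a single event argument makes nodes easy to compose, here is a minimal in-process sketch that orders nodes by their dependencies and calls each one with the same event. It uses the standard library's graphlib (Python 3.9+) rather than Celery chains or networkx, and the node names, the DEPENDS_ON mapping, and run_pipeline are hypothetical; it is not how Sermos builds or runs pipelines.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def node_a(event):
    event["trace"].append("a")


def node_b(event):
    event["trace"].append("b")


def node_c(event):
    event["trace"].append("c")


NODES = {"node_a": node_a, "node_b": node_b, "node_c": node_c}

# Hypothetical DAG: each key maps to the set of nodes it depends on.
DEPENDS_ON = {
    "node_a": set(),
    "node_b": {"node_a"},
    "node_c": {"node_a", "node_b"},
}


def run_pipeline(event):
    """Call every node once, dependencies first, passing the same event."""
    for name in TopologicalSorter(DEPENDS_ON).static_order():
        NODES[name](event)
    return event


result = run_pipeline({"trace": []})
```

Because node_c depends on both other nodes and node_b depends on node_a, the only valid execution order here is node_a, node_b, node_c.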
Sermos Tools [optional]
Sermos provides much of the scaffolding and design guidance for running machine learning workloads and has a companion project Sermos Tools, which provides a set of useful (and growing) tools intended to streamline development of machine learning workflows and tasks.
Generators
TODO: This needs to be updated both in code and documentation. Leaving here because it's valuable to update in the future.
A common task associated with processing batches of documents is generating the list of files to process. sermos.generators contains two helpful classes to generate lists of files from S3 and from a local file system.
S3KeyGenerator will produce a list of object keys in S3. Example:
    from sermos.generators import S3KeyGenerator

    gen = S3KeyGenerator('access-key', 'secret-key')
    files = gen.list_files(
        'bucket-name',
        'folder/path/',
        offset=0,
        limit=4,
        return_full_path=False
    )
LocalKeyGenerator will produce a list of file names on a local file system. Example:
    from sermos.generators import LocalKeyGenerator

    gen = LocalKeyGenerator()
    files = gen.list_files('/path/to/list/')
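Since the note above says this module is out of date, the following is a standalone sketch of what paged listing of local files can look like. list_local_files is a hypothetical helper, not the sermos.generators implementation, and applying offset, limit, and return_full_path to the local case is an assumption borrowed from the S3 example.

```python
import os
import tempfile


def list_local_files(path, offset=0, limit=None, return_full_path=False):
    """Hypothetical helper: return a sorted, paged list of files in `path`."""
    with os.scandir(path) as entries:
        # Skip subdirectories and sort so paging is stable between calls.
        names = sorted(entry.name for entry in entries if entry.is_file())
    end = None if limit is None else offset + limit
    page = names[offset:end]
    if return_full_path:
        return [os.path.join(path, name) for name in page]
    return page


# Demo against a throwaway directory.
with tempfile.TemporaryDirectory() as folder:
    for name in ("b.txt", "a.txt", "c.txt"):
        with open(os.path.join(folder, name), "w") as handle:
            handle.write("demo")
    page = list_local_files(folder, offset=1, limit=1)
    full = list_local_files(folder, limit=2, return_full_path=True)
```

Sorting before slicing keeps batches stable, so successive offsets do not skip or repeat files as long as the directory contents are unchanged.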
Testing
If you are developing Sermos core and want to test this package, install the test dependencies:
$ pip install -e .[test]
Now, run the tests:
$ tox
Contributors
Thank you to everyone who has helped in our quest to put machine learning to work in the real world!
- Kevin Lyons
- Alejandro Mesa
- Cassie Borish
- Vickram Premakumar
- Aral Tasher
- Akshay Pakhle
- Gilman Callsen
- Your Name Here!