
A radically simple framework for ML/AI model management


modsys: model management tool


ModsysML is an upstream open-source validation framework for testing, automation, and data analytics.


Docs [Coming soon]»

Join the waitlist · Report Bug · Community Discord

Why ModsysML?

Before modsys, running proactive intelligence and insights (testing data quality and automating workloads) was time-consuming. With modsys, you can simplify, accelerate, and backtest the entire process. This makes it easier to train classifiers, handle real-time changes, and make data-driven decisions.

🚀 Interesting, how can I try it?

Let's install the SDK first...

pip install modsys

Regression tests vs Automated pipelines

modsys helps you tune LLM prompts systematically across many relevant test cases. By evaluating and comparing LLM outputs, you can build decision-making workflows, test prompt quality, and catch regressions faster.

Evaluating prompt quality

With the Modsys Python library and CLI toolkit, you can:

  • Detect real-time changes in your data
  • Automate tasks against image, video, audio, or text
  • Simplify the process of back-testing quality for your AI models
  • Keep your integration robust, so you never again have to worry about stuck/stale data or false positives
  • Test multiple prompts against predefined test cases
  • Evaluate quality and catch regressions by comparing LLM outputs side by side
  • Speed up evaluations with caching and concurrent tests
  • Flag bad outputs automatically by setting "expectations"
  • Use it as a command line tool, or integrate it into your workflow with our library
  • Use any AI provider, API, or database under one API

modsys produces table views that allow you to quickly review prompt outputs across many inputs. The goal: tune prompts systematically across all relevant test cases, instead of testing prompts by trial and error.

Usage (command line)

Support for a user interface is coming soon.

It works on the command line; you can output to JSON, CSV, or YAML:

[Demo: prompt eval]

To get started, run the following command:

modsys init

This will create some templates in your current directory: prompts.txt, vars.csv, and config.json.
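
For reference, here's what a minimal setup could look like (illustrative contents; the generated templates may differ). prompts.txt holds prompts with {{variable}} placeholders, and vars.csv supplies one column per variable and one row per test case:

prompts.txt:

Rephrase this in French: {{body}}

vars.csv:

body
Hello world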

After editing the prompts and variables to your desired state, run the modsys command to kick off a prompt evaluation test:

modsys -p ./prompts.txt -v ./vars.csv -r openai:completion

If you're looking to customize your usage, you have a wide set of parameters at your disposal. See the Configuration docs for more detail:

Option | Description
------ | -----------
-p, --prompts <paths...> | Paths to prompt files, directory, or glob
-r, --providers <name or path...> | One of: openai:chat, openai:completion, openai:model-name, hive:hate, google:safety, etc. See AI Providers
-o, --output <path> | Path to output file (csv, json, yaml, html)
-v, --vars <path> | Path to file with prompt variables (csv, json, yaml)
-c, --config <path> | Path to configuration file. config.json is automatically loaded if present
-j, --max-concurrency <number> | (coming soon) Maximum number of concurrent API calls
--table-cell-max-length <number> | (coming soon) Truncate console table cells to this length
--grader | (coming soon) Provider that will grade outputs
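
The config file schema is covered in the Configuration docs. As a rough sketch only, a config.json that mirrors the CLI flags above could look like this (the key names are assumptions, not a confirmed schema):

{
  "prompts": ["./prompts.txt"],
  "providers": ["openai:completion"],
  "vars": "./vars.csv",
  "output": "./output.json"
}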

Examples

Prompt quality

In this example, we evaluate whether adding adjectives to the personality of a chat bot affects the responses:

modsys -p prompts.txt -v vars.csv -r openai:gpt-3.5-turbo

[Demo: prompt eval]

This command will evaluate the prompts in prompts.txt, substituting the variable values from vars.csv, and output results in your terminal.
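
For the adjective experiment above, prompts.txt might contain two variants of the same base prompt, one per block (illustrative only; follow the format of the templates generated by modsys init):

You are a helpful assistant. Reply to this message: {{message}}

You are a cheerful, enthusiastic, and friendly assistant. Reply to this message: {{message}}

vars.csv would then supply the {{message}} values to test against both variants.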

To write the full output to a file, pass -o:

modsys -p prompts.txt -v vars.csv -r openai:gpt-3.5-turbo -o output.json

You can output a spreadsheet, JSON, or YAML file; output.json will look something like this:

{
  "results": [
    {
      "prompt": {
        "raw": "Rephrase this in French: Hello world",
        "display": "Rephrase this in French: {{body}}"
      },
      "vars": {
        "body": "Hello world"
      },
      "response": {
        "output": "Bonjour le monde",
        "tokenUsage": {
          "total": 19,
          "prompt": 16,
          "completion": 3
        }
      }
    }
    // ...
  ],
  "stats": {
    "successes": 4,
    "failures": 0,
    "tokenUsage": {
      "total": 120,
      "prompt": 72,
      "completion": 48
    }
  }
}
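
Because the results are plain JSON, they're easy to post-process. Here's a minimal sketch (assuming you wrote the report above with -o output.json):

import json

# Load the evaluation report written by the modsys CLI
with open("output.json") as f:
    report = json.load(f)

# Show each rendered prompt next to the model's output
for result in report["results"]:
    print(result["prompt"]["raw"], "->", result["response"]["output"])

# Summarize the run using the stats block
stats = report["stats"]
total = stats["successes"] + stats["failures"]
print(f"{stats['successes']}/{total} cases passed, {stats['tokenUsage']['total']} tokens used")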

Here's an example of a side-by-side comparison of multiple prompts and inputs:

[Screenshot: side-by-side comparison]

Model quality

You can evaluate the difference between safety outputs for a specific context:

Model quality tests and the Python package for model testing are currently a beta feature; open an issue and tag us to get set up.

modsys -p prompts.txt -r hiveai:hate google:safety -o output.json

Configuration

  • Setting up a model test: Learn more about how to set up prompt files, vars file, output, etc.

Building Automated Pipelines in the User Interface or Programmatically


Let's set up your first integration!

It will pull from your local database (and keep it in sync).

# import the package
from modsys.client import Modsys

# sync data from your database instance
# (we currently support Supabase, or PostgreSQL via URI format)
Modsys.connect("postgres://username:password@hostname:port/database_name")

# If you want to test operations against your external connection
Modsys.fetch_tables()
Modsys.query("desc", "table", "column")

...and create a workflow with a simple command:

# import the package
from modsys.client import Modsys

# Use any provider
Modsys.use("google_perspective:<model name>", google_perspective_api_key="YOUR_API_TOKEN_HERE")

# For image detection, connect to the Sightengine provider or another image service first
Modsys.detectImage('https://example.com/some-endpoint') # Image Analysis/OCR

# Let's check whether a phrase contains threats
Modsys.detectText(prompt="Phrase1", content_id="content-id", community_id="user-id")

Example response:

{
  "attributeScores": {
    "THREAT": {
      "spanScores": [
        {
          "begin": 0,
          "end": 12,
          "score": { "value": 0.008090926, "type": "PROBABILITY" }
        }
      ],
      "summaryScore": { "value": 0.008090926, "type": "PROBABILITY" }
    },
    "INSULT": {
      "spanScores": [
        {
          "begin": 0,
          "end": 12,
          "score": { "value": 0.008804884, "type": "PROBABILITY" }
        }
      ],
      "summaryScore": { "value": 0.008804884, "type": "PROBABILITY" }
    },
    "SPAM" // ...
  },
  "languages": ["en"],
  "clientToken": "content_123",
  "detectedLanguages": ["en", "fil"]
}
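
The response follows Google's Perspective API format. Assuming detectText returns it as a parsed Python dict (a sketch, not confirmed SDK behavior), you can pull out the per-attribute summary scores like this:

# 'response' is the JSON document shown above, parsed into a dict
response = Modsys.detectText(prompt="Phrase1", content_id="content-id", community_id="user-id")

# Print the summary probability for each attribute (THREAT, INSULT, SPAM, ...)
for attribute, data in response["attributeScores"].items():
    print(f"{attribute}: {data['summaryScore']['value']:.4f}")

# Example: flag content whose THREAT score crosses a threshold
if response["attributeScores"]["THREAT"]["summaryScore"]["value"] >= 0.8:
    print("flagged: likely threat")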

Experimental inputs:

# Create custom rules, each of which creates a task!
Modsys.rule('Phrase1', '>=', '0.8')

Modsys.detectImage('Image1', 'contains', 'VERY_LIKELY') # Image Analysis/OCR
Modsys.detectSpeech('Audio1', 'contains', 'UNLIKELY') # Audio Processing
Modsys.detectVideo('Video1', 'contains', 'POSSIBLE') # Video Analysis
Modsys.detectText('Phrase1', 'contains', 'UNKNOWN') # Text Analysis
Modsys.test('prompt', 'expected_output') # ML Validation

That's all it takes!

In practice, you probably want to use one of our native SDKs to interact with different AI providers, or use our custom browser client so you don't have to write code. If so, sign up for the downstream Apollo ModsysML Console!

Cool, what can I build with it?
  • Modsys can help you quickly automate tasks for model management, performance, labeling, object detection and more.
  • Teams can use Modsys to build native in-app connections related to active response, content moderation, risk management, fraud detection, etc.
  • Some users automate their personal lives with Modsys by integrating it with Discord communities

Development

Contributions are welcome! Please feel free to submit a pull request or open an issue.

📦 pre-commit config

As an open source project, Modsys welcomes contributions from the community at large. This isn’t an exhaustive reference and is a living document subject to change as needed when the project formalizes any practice or pattern.

Clone the repo and start Modsys locally...

git clone https://github.com/modsysML/modsys.git
cd modsys && python3 -m venv env && source env/bin/activate && pip install -r requirements.txt
After installing system dependencies, be sure to install pre-commit for lint checks:

pip install pre-commit

pre-commit install

pre-commit run --all-files

Modsys uses commit messages to automatically generate the project changelog. For every pull request, we ask contributors to comply with the following commit message notation.

<type>: <summary>

<body>

Accepted <type> values:

  • new = newly implemented user-facing features
  • chg = changes in existing user-facing features
  • fix = user-facing bugfixes
  • oth = other changes which users should know about
  • dev = any developer-facing changes, regardless of new/chg/fix status
Summary (The first line)

The first line should be no longer than 75 characters, the second line is always blank, and the remaining lines should be wrapped at 80 characters.
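
For example, a compliant commit message might look like this (the change described is hypothetical):

fix: handle a missing vars file in the CLI

The CLI previously raised an unhandled exception when the path passed
to --vars did not exist; it now prints a clear error message and exits
with a non-zero status.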

🔍 Neat, I would like to learn more

⭐ Follow our development by starring us here on GitHub ⭐

AI Providers

We support OpenAI as well as a number of other providers and models. It's also possible to set up your own custom AI provider.
