A multimodal LLM agent that integrates ChatGPT with Azure Cognitive Services

CogsGPT

A conversational system which integrates ChatGPT with Azure Cognitive Services to achieve multimodal capabilities.

If you find this repo useful, please consider giving it a star! :)

Updates

  • [2023.04.28] You can now try the full capabilities of CogsGPT on Hugging Face Spaces. We are offering an Azure Cognitive Services resource for free in the demo, so all you need is an OpenAI API key to start chatting with CogsGPT.
  • [2023.04.25] CogsGPT now supports image output. You can ask CogsGPT to crop a thumbnail from an image or remove its background.
  • [2023.04.18] Released the first version of CogsGPT.

Overview

What is Azure Cognitive Services

(Answered by ChatGPT)

Azure Cognitive Services is a collection of pre-built machine learning models that developers can use to add intelligent features to their applications without requiring extensive knowledge of data science or machine learning. These services include vision, speech, language, and decision-making capabilities, such as text translation, speech recognition, image recognition, and sentiment analysis. Azure Cognitive Services allows developers to quickly and easily incorporate advanced AI features into their applications, reducing the time and cost of building such features from scratch. It also provides enterprise-level security, scalability, and availability for applications that require high levels of reliability and performance.

What is CogsGPT

CogsGPT is a conversational system that uses the ChatGPT model as the controller and Azure Cognitive Services as collaborative executors to provide a degree of multimodal capability. With CogsGPT, you can access Azure Cognitive Services through natural language to process image or audio inputs, without any knowledge of the underlying APIs. You can also ask CogsGPT to perform more complex tasks, such as summarizing a long speech into a short audio clip that retains the main information. CogsGPT automatically decides which services to use and how to use them to achieve the goal.

You can find the list of pre-built services supported by CogsGPT here.

How does CogsGPT work

The workflow of CogsGPT consists of three stages:

  1. Task Planning Stage: CogsGPT uses ChatGPT to parse the user's input into a sequence of Azure Cognitive Services tasks that are most likely to solve the request. Each task may depend on the results of earlier tasks.
  2. Task Execution Stage: CogsGPT executes the tasks sequentially and stores each result so that later tasks and the final stage can refer to it.
  3. Response Generation Stage: CogsGPT uses ChatGPT again to generate a final response to the user's request from the execution results of the second stage. The response may be text, an image, audio, or a combination of them.
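
The three stages above can be illustrated with a minimal, self-contained sketch. This is not CogsGPT's actual implementation: the `Task` class, the `<result-N>` placeholder syntax, the executor names, and the hard-coded plan are all hypothetical stand-ins, and no ChatGPT or Azure call is made. It only shows how a planned task list with dependencies could be executed in order and turned into a response.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    """One planned step. The field names are hypothetical, not CogsGPT's."""
    name: str             # key into the executor registry below
    args: Dict[str, str]  # values may contain "<result-N>" placeholders


def plan(user_input: str) -> List[Task]:
    """Stage 1 stand-in: a real planner would ask ChatGPT; this plan is hard-coded."""
    return [
        Task("speech_to_text", {"audio": "speech.wav"}),
        Task("summarize", {"text": "<result-0>"}),  # depends on task 0
    ]


# Stage 2 stand-ins for Azure Cognitive Services calls (no network access).
EXECUTORS: Dict[str, Callable[..., str]] = {
    "speech_to_text": lambda audio: f"transcript of {audio}",
    "summarize": lambda text: f"summary of [{text}]",
}


def execute(tasks: List[Task]) -> List[str]:
    """Run tasks in order, substituting earlier results into later arguments."""
    results: List[str] = []
    for task in tasks:
        resolved: Dict[str, str] = {}
        for key, value in task.args.items():
            for index, earlier in enumerate(results):
                value = value.replace(f"<result-{index}>", earlier)
            resolved[key] = value
        results.append(EXECUTORS[task.name](**resolved))
    return results


def respond(user_input: str, results: List[str]) -> str:
    """Stage 3 stand-in: a real system would ask ChatGPT to phrase the answer."""
    return results[-1] if results else "No task was planned."


def run(user_input: str) -> str:
    return respond(user_input, execute(plan(user_input)))
```

Calling `run("Summarize speech.wav")` walks through all three stages with the stub executors. Storing every result in a list is what lets a later task refer back to an earlier one.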

Getting Started

Prerequisites

  • Python 3.8+
  • OpenAI API key
  • Azure Cognitive Multi-Services resource (How to deploy)
  • Set the following environment variables:
    # OpenAI
    export OPENAI_API_TYPE="openai"
    export OPENAI_API_KEY="<OpenAI API Key>"
    
    # Azure Cognitive Service
    export COGS_ENDPOINT="<Azure Cognitive Service Endpoint>"
    export COGS_KEY="<Azure Cognitive Service Key>"
    export COGS_REGION="<Azure Cognitive Service Region>"
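
Before starting CogsGPT, it can help to confirm that the variables listed above are actually set. The following is a small sketch using only the Python standard library; the helper name `missing_vars` is hypothetical and is not part of the cogsgpt package.

```python
import os
from typing import List, Mapping

# Variable names taken from the export commands above.
REQUIRED_VARS = (
    "OPENAI_API_TYPE",
    "OPENAI_API_KEY",
    "COGS_ENDPOINT",
    "COGS_KEY",
    "COGS_REGION",
)


def missing_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

An empty list from `missing_vars()` means every variable is present. Passing a plain dictionary instead of `os.environ` makes the check easy to try without touching the real environment.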
    

Quick Install

pip install cogsgpt

Usage

You can use CogsGPT in your own application to process image or audio inputs with three lines of code:

from cogsgpt import CogsGPT

agent = CogsGPT(model_name="gpt-3.5-turbo")
agent.chat("What's the content in a.jpg?")

For more details on usage, please refer to the API Reference.

Gradio Demo

The CogsGPT Gradio demo is available on Hugging Face Spaces. To make it easier and more affordable to try out CogsGPT, we are offering an Azure Cognitive Services resource for free in the demo, so all you need is an OpenAI API key to start chatting with CogsGPT.

You can also use the following commands to run the demo locally with your own Azure Cognitive Service resources (Don't forget to set the environment variables first!):

pip install gradio
python app.py

Now open your favorite browser and ENJOY YOUR CHAT!

Acknowledgment

This project is inspired by HuggingGPT, and is built on top of LangChain.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contributing

As an open source project, we welcome contributions and suggestions. Please follow the fork and pull request workflow to contribute to this project. Please do not push directly to this repo unless you are a maintainer.

Contact

If you have any questions, please feel free to contact us at weitian.bnu@gmail.com.
