
A multi-modal LLM integrated ChatGPT with Azure Cognitive Service


CogsGPT

A multi-modal LLM integrated ChatGPT with Azure Cognitive Service, inspired by HuggingGPT.

Overview

This project is inspired by HuggingGPT. As the name CogsGPT suggests, it uses the ChatGPT model as the language center and integrates with Azure Cognitive Services to provide a degree of multimodal capability.

Typical use cases include:

  • Information extraction: Extract the main information from a document or an image.
  • Image translation: Translate the text in an image to another language.
  • Speech summarization: Summarize a long speech into a short audio clip while retaining the main information.
  • Speech translation: Translate input speech into another language.

There are more use cases waiting for you to explore!

Here is a demo of creating a poem based on an image and converting it into speech in another language.

[Image: demo]

Getting Started

Prerequisites

OpenAI Requirements

First, you need to register an OpenAI account or deploy an Azure OpenAI Service. Follow the official documentation to obtain the API key and other resources.

If you want to use the OpenAI API, you need to set these environment variables:

export OPENAI_API_TYPE="openai"
export OPENAI_API_KEY="<OpenAI API Key>"
export OPENAI_MODEL_NAME="<OpenAI Model Name>"

If you want to use Azure OpenAI Service, you need to set these environment variables:

export OPENAI_API_TYPE="azure"
export OPENAI_API_BASE="<Azure OpenAI Service Endpoint>"
export OPENAI_API_KEY="<Azure OpenAI Service Key>"
export OPENAI_MODEL_NAME="<Deployment Name>"
export OPENAI_MODEL_VERSION="<Model Version>"

Azure Cognitive Service Requirements

Next, you also need to deploy an Azure Cognitive Service. Follow the official documentation to obtain the deployment key and other resources, and set these environment variables:

export COGS_ENDPOINT="<Azure Cognitive Service Endpoint>"
export COGS_KEY="<Azure Cognitive Service Key>"
export COGS_REGION="<Azure Cognitive Service Region>"
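
Before starting CogsGPT, it can help to confirm that the variables listed above are actually set in the current shell. The following is a minimal sketch, not part of the package: the variable names come from the export commands above, while the helper `missing_settings` and the grouping into common and Azure-only variables are assumptions made for illustration.

```python
import os

# Variable names are taken from the export commands above. The grouping and
# the helper itself are illustrative and are not part of CogsGPT.
COMMON_VARS = ["OPENAI_API_KEY", "OPENAI_MODEL_NAME",
               "COGS_ENDPOINT", "COGS_KEY", "COGS_REGION"]
AZURE_ONLY_VARS = ["OPENAI_API_BASE", "OPENAI_MODEL_VERSION"]


def missing_settings(env=None):
    """Return the names of required variables that are unset or empty.

    If OPENAI_API_TYPE is neither "openai" nor "azure", only that name is
    returned, because the remaining requirements depend on it.
    """
    env = os.environ if env is None else env
    api_type = env.get("OPENAI_API_TYPE", "")
    if api_type not in ("openai", "azure"):
        return ["OPENAI_API_TYPE"]
    required = list(COMMON_VARS)
    if api_type == "azure":
        required += AZURE_ONLY_VARS
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing or invalid settings: " + ", ".join(missing))
    else:
        print("All required settings are present.")
```

Passing a plain dictionary instead of relying on `os.environ` makes the check easy to exercise without touching the real environment.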

Platform Requirements

Finally, follow the instructions here to check your platform requirements, which must be met to use the Azure Speech SDK for Python.

Quick Install

You can now install CogsGPT with pip:

pip install cogsgpt

Usage

You can use CogsGPT in your own application to process image or audio inputs in three lines of code:

from cogsgpt import CogsGPT

agent = CogsGPT()
agent.chat("What's the content in a.jpg?")

Or you can experience an interactive console application with the following command:

python ./tests/test_awesome_chat.py
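
If you would rather embed a chat loop in your own script, something along these lines may work. This is a hedged sketch and not the contents of `tests/test_awesome_chat.py`: the function `run_console`, the `exit`/`quit` commands, and the output prefix are all hypothetical, and it assumes that `agent.chat` returns a value that can be printed.

```python
def run_console(chat, read=input, write=print):
    """Read prompts until 'exit', 'quit', or end of input; pass each to chat()."""
    while True:
        try:
            prompt = read("You: ").strip()
        except EOFError:
            break  # input stream closed (for example, Ctrl-D)
        if prompt.lower() in ("exit", "quit"):
            break
        if not prompt:
            continue  # ignore empty lines
        write(f"CogsGPT: {chat(prompt)}")


# Usage with the package (assumes chat() returns printable text):
#     from cogsgpt import CogsGPT
#     run_console(CogsGPT().chat)
```

Taking `chat`, `read`, and `write` as parameters keeps the loop independent of the package, so it can be tried out with a stand-in function before any API keys are configured.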

Enjoy your chat!

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contributing

As an open source project, we welcome contributions and suggestions. Please follow the fork and pull request workflow to contribute to this project. Please do not push directly to this repo unless you are a maintainer.

Contact

If you have any questions, please feel free to contact us at weitian.bnu@gmail.com.
