Streamlit multimodal chat input component with text, image, and voice support

Streamlit Multimodal Chat Input

A multimodal chat input component for Streamlit that supports text input, image upload, and voice input.

Note: Voice and image features only work when the app is served over HTTPS or from localhost, because browsers restrict microphone and clipboard access to secure contexts.

Features

  • 📝 Text Input: Same usability as st.chat_input
  • 🖼️ Image File Upload: Supports jpg, png, gif, webp
  • 🎤 Voice Input: Web Speech API / OpenAI Whisper API support
  • 🎨 Streamlit Standard Theme: Fully compatible design
  • 🔄 Drag & Drop: File drag and drop support
  • ⌨️ Ctrl+V: Paste images from clipboard
  • ⚙️ Customizable: Rich configuration options

Installation

pip install st-chat-input-multimodal

Basic Usage

import base64

import streamlit as st
from st_chat_input_multimodal import multimodal_chat_input

# Basic usage
result = multimodal_chat_input()

if result:
    # Display text
    if result['text']:
        st.write(f"Text: {result['text']}")
    
    # Display uploaded files
    if result['files']:
        for file in result['files']:
            # Drop a leading "data:<mime>;base64," prefix if one is present
            base64_data = file['data'].split(',', 1)[-1]
            image_bytes = base64.b64decode(base64_data)
            st.image(image_bytes, caption=file['name'])
    
    # Display voice input metadata
    if result.get('audio_metadata'):
        st.write(f"Voice input used: {result['audio_metadata']['used_voice_input']}")
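
The decoding step above can be pulled into a small helper. This is a minimal sketch, assuming each file's `data` field is either a data URL of the form `data:<mime>;base64,<payload>` or a bare base64 string; `decode_data_url` is a hypothetical name and not part of the package.

```python
import base64
from typing import Tuple


def decode_data_url(data: str) -> Tuple[str, bytes]:
    """Split a data URL into (mime_type, raw_bytes).

    Assumes the shape "data:<mime>;base64,<payload>". A string without a
    comma is treated as a bare base64 payload with a generic MIME type.
    """
    header, sep, payload = data.partition(",")
    if not sep:
        # No comma found: the whole string is the base64 payload.
        return "application/octet-stream", base64.b64decode(data)
    # header looks like "data:image/png;base64"
    if header.startswith("data:"):
        header = header[len("data:"):]
    mime_type = header.split(";")[0] or "application/octet-stream"
    return mime_type, base64.b64decode(payload)
```

With this helper, the loop body reduces to `mime_type, image_bytes = decode_data_url(file['data'])`, and the MIME type can be used to skip anything that is not an image.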

Advanced Usage

Voice Input Features

# Enable voice input
result = multimodal_chat_input(
    enable_voice_input=True,
    voice_recognition_method="web_speech",  # or "openai_whisper"
    voice_language="ja-JP",
    max_recording_time=60
)

# Using OpenAI Whisper API
result = multimodal_chat_input(
    enable_voice_input=True,
    voice_recognition_method="openai_whisper",
    openai_api_key="sk-your-api-key",  # placeholder; load the real key from st.secrets or an environment variable
    voice_language="ja-JP"
)
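
To keep the API key out of source code, the two configurations above can be selected at runtime. The following is a sketch only: `resolve_openai_api_key` and `voice_kwargs` are hypothetical helpers, and reading the key from an `OPENAI_API_KEY` environment variable is an assumption, not something the package does for you.

```python
import os
from typing import Mapping, Optional


def resolve_openai_api_key(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the key from OPENAI_API_KEY, or None if it is unset or blank."""
    source = os.environ if env is None else env
    key = source.get("OPENAI_API_KEY", "").strip()
    return key or None


def voice_kwargs(api_key: Optional[str], language: str = "ja-JP") -> dict:
    """Build keyword arguments for multimodal_chat_input.

    Uses Whisper when a key is available, otherwise the Web Speech API.
    """
    kwargs = {"enable_voice_input": True, "voice_language": language}
    if api_key:
        kwargs["voice_recognition_method"] = "openai_whisper"
        kwargs["openai_api_key"] = api_key
    else:
        kwargs["voice_recognition_method"] = "web_speech"
    return kwargs
```

Usage would then be `result = multimodal_chat_input(**voice_kwargs(resolve_openai_api_key()))`, which falls back to browser speech recognition when no key is configured.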

Custom Configuration

result = multimodal_chat_input(
    placeholder="Please enter your message...",
    max_chars=500,
    accepted_file_types=["jpg", "png", "gif", "webp"],
    max_file_size_mb=10,
    disabled=False,
    key="custom_chat_input"
)
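
Limits such as `accepted_file_types` and `max_file_size_mb` are applied in the browser, so an app may want to re-check them in Python before using a file. The sketch below assumes each file is a dict with `name` and `data` keys, as in the Basic Usage example; `is_acceptable_file` is a hypothetical helper and not part of the package.

```python
import base64
from typing import Iterable


def is_acceptable_file(file: dict, accepted_types: Iterable[str], max_size_mb: float) -> bool:
    """Re-check the extension and decoded size of one uploaded file."""
    name = file.get("name", "")
    extension = name.rsplit(".", 1)[-1].lower() if "." in name else ""
    if extension not in {t.lower() for t in accepted_types}:
        return False

    data = file.get("data", "")
    # Drop a leading "data:<mime>;base64," prefix if one is present.
    payload = data.split(",", 1)[1] if "," in data else data
    try:
        size_bytes = len(base64.b64decode(payload, validate=True))
    except (ValueError, TypeError):
        # Payload is not valid base64, so reject the file.
        return False
    return size_bytes <= max_size_mb * 1024 * 1024
```

A typical use is to filter the result before display: `files = [f for f in result['files'] if is_acceptable_file(f, ["jpg", "png", "gif", "webp"], 10)]`.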

Chat Usage

import streamlit as st
import base64
from st_chat_input_multimodal import multimodal_chat_input

# Page configuration
st.set_page_config(
    page_title="Multimodal Chat Input Demo",
    page_icon="💬",
    layout="wide"
)

st.subheader("💭 Multimodal Chat Input Demo")
st.markdown("Simulate a chat application with voice input and file upload.")

# Manage history in session state
if "chat_history" not in st.session_state:
    st.session_state.chat_history = []

# Input for new messages
chat_result = multimodal_chat_input(
    placeholder="Enter chat message...",
    enable_voice_input=True,  # Enable voice input for chat as well
    key="chat_input"
)
if chat_result:
    st.session_state.chat_history.append(chat_result)

# Display chat history
if st.session_state.chat_history:
    for message in st.session_state.chat_history:
        with st.chat_message("user"):
            if message.get("text"):
                st.write(message["text"])
            
            if message.get("files"):
                for file in message["files"]:
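                    try:
                        # Strip the "data:<mime>;base64," prefix when present
                        base64_data = file['data'].split(',')[1] if ',' in file['data'] else file['data']
                        image_bytes = base64.b64decode(base64_data)
                        st.image(image_bytes, caption=file['name'], width=200)
                    except Exception:
                        # Not a decodable image: fall back to showing the file name
                        st.write(f"📎 {file['name']}")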
            
            # Display voice input information
            audio_metadata = message.get("audio_metadata") or {}
            if audio_metadata.get("used_voice_input"):
                st.caption(f"🎤 Voice input ({audio_metadata.get('transcription_method', 'unknown')})")

# Clear history
if st.button("Clear History"):
    st.session_state.chat_history = []
    st.rerun()
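
Because every stored message can carry base64-encoded images, the history kept in session state can grow quickly. One option is to cap its length when appending. This is a minimal sketch; `append_with_limit` and its default of 50 messages are illustrative assumptions, not part of the package.

```python
from typing import List


def append_with_limit(history: List[dict], message: dict, max_messages: int = 50) -> List[dict]:
    """Append a message and keep only the newest max_messages entries."""
    if max_messages < 1:
        raise ValueError("max_messages must be at least 1")
    history.append(message)
    # Trim from the front in place so the caller's list object is preserved.
    del history[:-max_messages]
    return history
```

In the chat example this would replace the plain append: `append_with_limit(st.session_state.chat_history, chat_result)`.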

Example Chat App

tsuzukia21/streamlit-chatbot

License

MIT License

Author

tsuzukia21
