Streamlit multimodal chat input component with text, image, and voice support

These details have been verified by PyPI

Maintainers

tsuzukia

These details have not been verified by PyPI

Project description

Demo

Streamlit Multimodal Chat Input

A multimodal chat input component for Streamlit that supports text input, image upload, and voice input.

Note: Voice and image features only work when the app is served over HTTPS or from localhost.

Features

📝 Text Input: Same usability as st.chat_input
🖼️ Image File Upload: Supports jpg, png, gif, webp
🎤 Voice Input: Web Speech API / server-side OpenAI Whisper support
🔒 Input Validation: File count, file size, MIME/content, and parameter validation
🎨 Streamlit Standard Theme: Fully compatible design
🔄 Drag & Drop: File drag and drop support
⌨️ Ctrl+V: Paste images from clipboard
🚨 Inline Errors: User-safe inline messages
⚙️ Customizable: Rich configuration options

Installation

pip install st-chat-input-multimodal

Basic Usage

import base64

import streamlit as st
from st_chat_input_multimodal import multimodal_chat_input

# Basic usage
result = multimodal_chat_input()

if result:
    # Display text
    if result['text']:
        st.write(f"Text: {result['text']}")

    # Display uploaded files
    if result['files']:
        for file in result['files']:
            # Strip the "data:<mime>;base64," prefix when the payload is a data URL
            base64_data = file['data'].split(',')[1] if ',' in file['data'] else file['data']
            image_bytes = base64.b64decode(base64_data)
            st.image(image_bytes, caption=file['name'])
    
    # Display voice input metadata
    if result.get('audio_metadata'):
        st.write(f"Voice input used: {result['audio_metadata']['used_voice_input']}")
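The data-URL handling above can be factored into a small helper. The following is a minimal sketch, assuming each file's `data` field is either a `data:<mime>;base64,<payload>` URL or a bare base64 string, as the example implies. `decode_file_data` is a hypothetical name and is not part of the package.

```python
import base64
import binascii


def decode_file_data(data: str) -> bytes:
    """Decode a base64 payload, with or without a data-URL prefix.

    Hypothetical helper, not part of st_chat_input_multimodal.
    """
    # Keep only the part after the first comma, if there is one.
    payload = data.split(",", 1)[1] if "," in data else data
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        return base64.b64decode(payload, validate=True)
    except binascii.Error as exc:
        raise ValueError("file data is not valid base64") from exc
```

With this helper, the display loop could call `st.image(decode_file_data(file['data']), caption=file['name'])` and handle `ValueError` for unreadable payloads.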

Advanced Usage

Voice Input Features

# Enable voice input
result = multimodal_chat_input(
    enable_voice_input=True,
    voice_recognition_method="web_speech",  # or "openai_whisper"
    voice_language="ja-JP",
    max_recording_time=60
)

# Using OpenAI Whisper on the server side
# Preferred: set OPENAI_API_KEY in the Python environment
result = multimodal_chat_input(
    enable_voice_input=True,
    voice_recognition_method="openai_whisper",
    voice_language="ja-JP"
)

When openai_whisper is selected, recorded audio is sent to the Python backend for transcription. The API key is used only on the server side and is never forwarded to the browser.
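Because Whisper depends on a server-side key, an app may want to decide which recognition method to request before rendering the component. The sketch below assumes the key is read from the `OPENAI_API_KEY` environment variable, as the comment in the example suggests. `choose_voice_method` is a hypothetical helper, and whether the component performs a similar fallback internally is not stated here.

```python
import os
from typing import Mapping, Optional


def choose_voice_method(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "openai_whisper" when an API key is configured, else "web_speech".

    Hypothetical helper; the package does not provide this function.
    """
    env = os.environ if env is None else env
    # Treat a missing or whitespace-only key as "not configured".
    has_key = bool(env.get("OPENAI_API_KEY", "").strip())
    return "openai_whisper" if has_key else "web_speech"
```

The returned string can then be passed as `voice_recognition_method` to `multimodal_chat_input`.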

Voice Transcription Error Handling

Runtime voice-transcription failures are converted into user-safe inline messages instead of exposing raw backend exception details.

"Voice transcription is not available in this app.": Whisper is not available for the current app configuration.
"Recorded audio could not be processed. Please try recording again.": The recorded audio payload was invalid or unreadable.
"Voice transcription is temporarily unavailable. Please try again.": Temporary API or network issue.
"Voice transcription failed. Please try again.": Final fallback for unexpected runtime errors.
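The general pattern behind this list, translating backend exceptions into a fixed set of safe messages, can be sketched as follows. The three exception classes and the `user_safe_message` function are hypothetical names invented for illustration; the package's actual internal exception types are not documented here. Only the message strings come from the list above.

```python
class WhisperUnavailableError(Exception):
    """Hypothetical: Whisper is not configured for this app."""


class InvalidAudioError(Exception):
    """Hypothetical: the recorded audio payload could not be read."""


class TransientTranscriptionError(Exception):
    """Hypothetical: a temporary API or network failure."""


# Ordered (exception type, user-facing message) pairs.
_USER_MESSAGES = (
    (WhisperUnavailableError,
     "Voice transcription is not available in this app."),
    (InvalidAudioError,
     "Recorded audio could not be processed. Please try recording again."),
    (TransientTranscriptionError,
     "Voice transcription is temporarily unavailable. Please try again."),
)
_FALLBACK_MESSAGE = "Voice transcription failed. Please try again."


def user_safe_message(exc: BaseException) -> str:
    """Map an exception to a message that hides backend details."""
    for exc_type, message in _USER_MESSAGES:
        if isinstance(exc, exc_type):
            return message
    # Anything unexpected gets the generic fallback, never str(exc).
    return _FALLBACK_MESSAGE
```

The key design point is that the raw exception text is never returned, so stack traces and API error bodies cannot leak into the UI.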

Developer-facing parameter validation still raises ValueError during multimodal_chat_input(...) initialization so configuration mistakes fail fast.
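As an illustration of fail-fast validation, the sketch below checks the constraints this README lists under Custom Configuration. It is not the package's implementation: `validate_chat_input_params` is a hypothetical function, its default values are simply the ones used in the README examples, and treating the 1 to 300 range as inclusive is an assumption.

```python
VALID_VOICE_METHODS = ("web_speech", "openai_whisper")


def validate_chat_input_params(
    max_chars=500,
    max_file_size_mb=10,
    max_files=5,
    max_recording_time=60,
    voice_recognition_method="web_speech",
) -> None:
    """Raise ValueError for configuration mistakes (hypothetical sketch)."""
    for name, value in (
        ("max_chars", max_chars),
        ("max_file_size_mb", max_file_size_mb),
        ("max_files", max_files),
    ):
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")

    # Assumption: both ends of the range are allowed.
    if (
        isinstance(max_recording_time, bool)
        or not isinstance(max_recording_time, (int, float))
        or not 1 <= max_recording_time <= 300
    ):
        raise ValueError(
            f"max_recording_time must be between 1 and 300, got {max_recording_time!r}"
        )

    if voice_recognition_method not in VALID_VOICE_METHODS:
        raise ValueError(
            "voice_recognition_method must be one of "
            f"{VALID_VOICE_METHODS}, got {voice_recognition_method!r}"
        )
```

Running such a check at startup surfaces a misconfigured value as an immediate `ValueError` rather than as a confusing runtime failure in the browser.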

Custom Configuration

result = multimodal_chat_input(
    placeholder="Please enter your message...",
    max_chars=500,
    accepted_file_types=["jpg", "png", "gif", "webp"],
    max_file_size_mb=10,
    max_files=5,
    disabled=False,
    key="custom_chat_input"
)

max_chars, max_file_size_mb, and max_files must be positive integers. max_recording_time must be between 1 and 300, and voice_recognition_method must be either "web_speech" or "openai_whisper". Uploaded images are validated by extension and file signature, and displayed filenames are sanitized before rendering.
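To show what validating by extension and file signature can look like, here is a minimal sketch that compares a file's extension with the magic bytes at the start of its content. The function names are hypothetical and this is not the package's own code; the signatures used are the standard ones for JPEG, PNG, GIF, and WebP. Treating `.jpeg` as equivalent to `.jpg` is an assumption.

```python
from typing import Optional


def sniff_image_type(data: bytes) -> Optional[str]:
    """Return "jpg", "png", "gif", or "webp" based on leading bytes, else None."""
    if data.startswith(b"\xff\xd8\xff"):
        return "jpg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "gif"
    # WebP is a RIFF container: "RIFF", 4 size bytes, then "WEBP".
    if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None


def extension_matches_content(filename: str, data: bytes) -> bool:
    """Check that the file extension agrees with the detected signature."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if ext == "jpeg":  # assumption: accept .jpeg as an alias of .jpg
        ext = "jpg"
    detected = sniff_image_type(data)
    return detected is not None and detected == ext
```

Checking both the extension and the content catches renamed files, for example a GIF uploaded as `photo.png`.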

Chat Usage

import streamlit as st
import base64
from st_chat_input_multimodal import multimodal_chat_input

# Page configuration
st.set_page_config(
    page_title="Multimodal Chat Input Demo",
    page_icon="💬",
    layout="wide"
)

st.subheader("💭 Multimodal Chat Input Demo")
st.markdown("Simulate a chat application with voice input and file upload.")

# Manage history in session state
if "chat_history" not in st.session_state:
    st.session_state.chat_history = []

# Input for new messages
chat_result = multimodal_chat_input(
    placeholder="Enter chat message...",
    enable_voice_input=True,  # Enable voice input for chat as well
    key="chat_input"
)
if chat_result:
    st.session_state.chat_history.append(chat_result)

# Display chat history
if st.session_state.chat_history:
    for message in st.session_state.chat_history:
        with st.chat_message("user"):
            if message.get("text"):
                st.write(message["text"])
            
            if message.get("files"):
                for file in message["files"]:
                    try:
                        base64_data = file['data'].split(',')[1] if ',' in file['data'] else file['data']
                        image_bytes = base64.b64decode(base64_data)
                        st.image(image_bytes, caption=file['name'], width=200)
                    except Exception:  # fall back to the filename if decoding or rendering fails
                        st.write(f"📎 {file['name']}")
            
            # Display voice input information
            if message.get("audio_metadata") and message["audio_metadata"]["used_voice_input"]:
                st.caption(f"🎤 Voice input ({message['audio_metadata']['transcription_method']})")


# Clear history
if st.button("Clear History"):
    st.session_state.chat_history = []
    st.rerun()

Example Chat App

streamlit-chatbot example

License

MIT License

Author

tsuzukia21

Project details

Release history

2.0.0 (this version): Mar 29, 2026
1.0.6: Dec 11, 2025
1.0.5: Dec 11, 2025
1.0.4: Jul 5, 2025
1.0.3: Jul 5, 2025
1.0.2: Jun 27, 2025
1.0.1: Jun 27, 2025
1.0.0: Jun 27, 2025

Download files

Source Distribution

st_chat_input_multimodal-2.0.0.tar.gz (137.7 kB)

Uploaded Mar 29, 2026 Source

Built Distribution

st_chat_input_multimodal-2.0.0-py3-none-any.whl (135.8 kB)

Uploaded Mar 29, 2026 Python 3

File details

Details for the file st_chat_input_multimodal-2.0.0.tar.gz.

File metadata

Download URL: st_chat_input_multimodal-2.0.0.tar.gz
Upload date: Mar 29, 2026
Size: 137.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for st_chat_input_multimodal-2.0.0.tar.gz
Algorithm	Hash digest
SHA256	`aa6c07ed5f4ce7de58b810925b121cd349ab6e379e27532ede2e668de19bfd79`
MD5	`50045cf77398918acbd4afb55266a92d`
BLAKE2b-256	`32e11a99e876ecc8441149a99cc0bcb0d83f9ee2d31574b31cad49869022e975`

Provenance

The following attestation bundles were made for st_chat_input_multimodal-2.0.0.tar.gz:

Publisher: release.yml on tsuzukia21/st-chat-input-multimodal

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: st_chat_input_multimodal-2.0.0.tar.gz
- Subject digest: aa6c07ed5f4ce7de58b810925b121cd349ab6e379e27532ede2e668de19bfd79
- Sigstore transparency entry: 1191972731
- Sigstore integration time: Mar 29, 2026
Source repository:
- Permalink: tsuzukia21/st-chat-input-multimodal@ac0d0dcdbac8b415f3dfd9f168ca78cea7ce0e7c
- Branch / Tag: refs/tags/v2.0.0
- Owner: https://github.com/tsuzukia21
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@ac0d0dcdbac8b415f3dfd9f168ca78cea7ce0e7c
- Trigger Event: push

File details

Details for the file st_chat_input_multimodal-2.0.0-py3-none-any.whl.

File metadata

Download URL: st_chat_input_multimodal-2.0.0-py3-none-any.whl
Upload date: Mar 29, 2026
Size: 135.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for st_chat_input_multimodal-2.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`bb48faee816702283992a7707380a71ea03dd082cc28b4b62301c3480cf937ca`
MD5	`9ed781f77f1219c563375d4962c0c627`
BLAKE2b-256	`bd6658e254d703229ac425f4f9870b58b2441937f7cfb88cc9b03db09807afe1`

Provenance

The following attestation bundles were made for st_chat_input_multimodal-2.0.0-py3-none-any.whl:

Publisher: release.yml on tsuzukia21/st-chat-input-multimodal

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: st_chat_input_multimodal-2.0.0-py3-none-any.whl
- Subject digest: bb48faee816702283992a7707380a71ea03dd082cc28b4b62301c3480cf937ca
- Sigstore transparency entry: 1191972732
- Sigstore integration time: Mar 29, 2026
Source repository:
- Permalink: tsuzukia21/st-chat-input-multimodal@ac0d0dcdbac8b415f3dfd9f168ca78cea7ce0e7c
- Branch / Tag: refs/tags/v2.0.0
- Owner: https://github.com/tsuzukia21
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@ac0d0dcdbac8b415f3dfd9f168ca78cea7ce0e7c
- Trigger Event: push
