Miscellaneous AI tools for embedding into projects.

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

pui4

Project description

MiscAI

Tools that make it easier to embed AI-related features into your existing projects.

Features

Ollama local LLM wrapper (with a built-in Jina embeddings model)
Mercury diffusion language model (uses the same embeddings model as the Ollama wrapper; requires a Mercury API key)
Easy tool function creation with both types of language models
Whisper-cpp speech to text
Text to speech voice cloning with pocket-tts (requires a Hugging Face account)
Voice activity detection with Silero
Wakeword detection with local-wake

Ollama local LLM

To use the local LLM, Ollama must already be installed on the host. The embeddings model runs on the CPU, as it is light and performant enough. Install the wrapper into your Python project with:

pip install miscai[llm]

Here is an example of how to use it:

from miscai.llm import LLM

llm = LLM(promt="You are helpful assistant.", model="qwen3:4b", convo_file="./convo.json")
print(llm.ask_LLM("Hello!"))
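Because the wrapper depends on an Ollama installation on the host, a small preflight check can fail early with a clear message. The sketch below is not part of miscai: `ollama_available` is a hypothetical helper that only checks whether an executable named `ollama` is on the PATH. It does not check that the Ollama server is running or that a model has been pulled.

```python
import shutil


def ollama_available(executable: str = "ollama") -> bool:
    """Return True if an executable with this name is found on the PATH."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if ollama_available():
        print("Ollama executable found.")
    else:
        print("Ollama not found on PATH; install it before creating the LLM object.")
```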

Diffusion Language Model (DLM)

Using the DLM module is very similar to using the local LLM. You will need a Mercury API key to use the wrapper. The embeddings model runs on the CPU, as it is light and performant enough. Install the wrapper into your Python project with:

pip install miscai[dlm]

Here is an example of how to use it:

from miscai.dlm import DLM

dlm = DLM(promt="You are helpful assistant.", model="mercury-v2", convo_file="./convo.json", api_key="123abc")
print(dlm.ask_LLM("Hello!"))
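The example above passes the API key as a literal string. One common alternative is to read it from an environment variable so that it stays out of source control. This is a sketch only: the variable name `MERCURY_API_KEY` and the helper `load_api_key` are assumptions made for illustration, not something miscai defines.

```python
import os


def load_api_key(env_var: str = "MERCURY_API_KEY") -> str:
    """Read the API key from an environment variable (the name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


if __name__ == "__main__":
    try:
        api_key = load_api_key()
    except RuntimeError as err:
        print(err)
    else:
        # Pass this value as api_key=... when constructing the DLM object.
        print("API key loaded.")
```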

Tool calling for Language models

This requires that you have installed one of the language models above. Here is an example of how to use it in a project:

from miscai.tools import ToolLoader

tool_loader = ToolLoader("./tools")

Then when you are creating your language model object, create it like this (using the Ollama one for example):

llm = LLM(promt="You are helpful assistant.", model="qwen3:4b", convo_file="./convo.json", tools=tool_loader.get_tools())

To create a tool, install its dependencies into your project and place the Python file in the directory passed to the ToolLoader object. Here is an example tool that gets the current time using the 'pytz' package:

import pytz
from datetime import datetime

tool = {
    "type": "function",
    "function": {
        "name": "get_current_time",
        "description": "Gets the current time for a provided time zone.",
        "parameters": {
            "type": "object",
            "properties": {
                "timezone": {
                    "type": "string",
                    "description": "The time zone as specified in the tz (zoneinfo) library."
                }
            },
            "required": ["timezone"]
        }
    }
}

def get_current_time(timezone: str) -> str:
    tz = pytz.timezone(timezone)
    return str(datetime.now(tz))
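The README does not describe how ToolLoader discovers tools, so the following is only a guess at the general pattern and not the library's implementation. It assumes each file in the directory defines a `tool` dict and a function whose name matches `tool["function"]["name"]`, as in the example above. `load_tools` and `call_tool` are hypothetical names, and the `shout` tool exists only for the demonstration.

```python
import importlib.util
import json
import tempfile
from pathlib import Path


def load_tools(directory: str):
    """Import every .py file in `directory`; collect tool specs and callables."""
    specs, functions = [], {}
    for path in sorted(Path(directory).glob("*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        tool = getattr(module, "tool", None)
        if tool is None:
            continue  # Not a tool file under this assumed convention.
        name = tool["function"]["name"]
        specs.append(tool)
        functions[name] = getattr(module, name)
    return specs, functions


def call_tool(functions: dict, name: str, arguments: str) -> str:
    """Dispatch a tool call; `arguments` is a JSON object encoded as a string."""
    return str(functions[name](**json.loads(arguments)))


if __name__ == "__main__":
    source = '''
tool = {"type": "function", "function": {"name": "shout",
        "description": "Upper-cases text.",
        "parameters": {"type": "object",
                       "properties": {"text": {"type": "string"}},
                       "required": ["text"]}}}

def shout(text: str) -> str:
    return text.upper()
'''
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "shout.py").write_text(source)
        specs, functions = load_tools(tmp)
        print(call_tool(functions, "shout", '{"text": "hello"}'))
```

Keeping the spec and the function in one file means a tool can be added or removed by moving a single file, which fits the directory-based setup shown above.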

Speech to text

The whisper-cpp model runs on the CPU. It is quite performant and accurate, but may use a lot of CPU. The audio is passed as a numpy array sampled at 16 kHz (16000 Hz) in 512-byte chunks. Install it into your project with:

pip install miscai[stt]

Here is an example of how to use it:

from miscai.stt import STT

stt = STT()
print(stt.transcribe(audio_bytes=audio_bytes))

Text to speech

You will have to create a Hugging Face account and accept the EULA for the pocket-tts model. Then create an API key, as you will need it later. The model runs on the CPU for the reasons given on the model's page. For best results, the base voice audio should be sampled at 16 kHz (16000 Hz). The output audio is a numpy array. Install it into your project with:

pip install miscai[tts]

Here is an example of how to use it:

from miscai.tts import TTS

tts = TTS("./voice_base.wav")
audio = tts.get_audio("Hello!")
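Since the base voice file is expected to be at 16 kHz, it can help to check the file before handing it to the TTS object. The sketch below uses only Python's standard `wave` module and is independent of miscai. `wav_sample_rate` and `write_silence` are hypothetical helper names, and the silent file is written only so the check has something to inspect.

```python
import os
import tempfile
import wave

TARGET_RATE_HZ = 16_000


def wav_sample_rate(path: str) -> int:
    """Return the sample rate in Hz stored in a WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def write_silence(path: str, rate_hz: int, seconds: int = 1) -> None:
    """Write 16-bit mono silence; used here only to create a file to inspect."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(rate_hz)
        wav_file.writeframes(b"\x00\x00" * rate_hz * seconds)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "voice_base.wav")
        write_silence(path, TARGET_RATE_HZ)
        if wav_sample_rate(path) == TARGET_RATE_HZ:
            print("Sample rate is 16 kHz.")
        else:
            print("Resample the file to 16 kHz first.")
```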

Voice activity detection

This uses Silero VAD for voice detection. It runs on the CPU, as the model is lightweight enough. The input audio is a numpy array in 512-byte chunks. For best results, the audio should be sampled at 16 kHz (16000 Hz). Install it into your project with:

pip install miscai[vad]

Here is an example of how to use it:

from miscai.vad import VAD

vad = VAD(threshold=0.5)
print(vad.is_speech(audio_bytes=audio_bytes))
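To make the chunk size concrete, here is a minimal sketch of splitting audio into 512-byte chunks. It assumes raw 16-bit mono PCM bytes at 16 kHz, which the README does not state explicitly. Under that assumption a chunk holds 256 samples, or 16 ms of audio. `iter_chunks` and `chunk_duration_ms` are hypothetical helpers, and converting each chunk to the numpy array that miscai expects is left out.

```python
from typing import Iterator

CHUNK_BYTES = 512
SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2  # Assumption: 16-bit mono PCM.


def iter_chunks(pcm: bytes, chunk_bytes: int = CHUNK_BYTES) -> Iterator[bytes]:
    """Yield fixed-size chunks, zero-padding the final one if it is short."""
    for start in range(0, len(pcm), chunk_bytes):
        chunk = pcm[start:start + chunk_bytes]
        yield chunk.ljust(chunk_bytes, b"\x00")


def chunk_duration_ms(chunk_bytes: int = CHUNK_BYTES) -> float:
    """Duration of one chunk in milliseconds under the assumptions above."""
    samples = chunk_bytes / BYTES_PER_SAMPLE
    return 1000 * samples / SAMPLE_RATE_HZ


if __name__ == "__main__":
    one_second = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)  # 32000 bytes of silence
    chunks = list(iter_chunks(one_second))
    print(len(chunks), chunk_duration_ms())
```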

Wakeword detection

This uses local-wake, so any audio file of a person saying the wakeword can serve as a reference for it. The input is an audio stream with 512-byte chunks (most easily obtained through SoundDevice). For best results, both the input audio and the reference audio files should be sampled at 16 kHz (16000 Hz). Install it into your project with:

pip install miscai[wake]

Here is an example of how to use it:

import sounddevice as sd
from miscai.wakeword import WakeWord

# Define the callback before passing it to waitForWord.
def awoke(detection: dict, stream: sd.InputStream):
    print(f"Wake word detected: {detection['wakeword']}")

wake_word = WakeWord(threshold=0.5, audio_dir="./wakeword")
print("Beginning to wait for wakeword")
# `stream` is an sd.InputStream that you have already opened.
wake_word.waitForWord(callback=awoke, stream=stream)

Final notes

This project is at quite an early stage and could do with some more polish. It is not meant to be used in production, but it is good for making small prototypes of ideas you may have. I made this to make it easier to integrate LLMs into my other projects, and it grew from there. Changes are welcome, as the documentation is hastily written and the code is arguably worse. Thanks for reading and maybe using this. :)

Project details

Release history

This version

0.1.1

May 30, 2026

0.1.0

May 30, 2026

Download files

Source Distribution

miscai-0.1.1.tar.gz (8.2 kB)

Uploaded May 30, 2026 Source

Built Distribution

miscai-0.1.1-py3-none-any.whl (11.6 kB)

Uploaded May 30, 2026 Python 3

File details

Details for the file miscai-0.1.1.tar.gz.

File metadata

Download URL: miscai-0.1.1.tar.gz
Upload date: May 30, 2026
Size: 8.2 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for miscai-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`64a82e18dd98a33d68bde4535fbb5ab1bc34fc5db2f44cb0a9f4f521d5ae8eac`
MD5	`c6832d4342bc7804baae8e03a5b96293`
BLAKE2b-256	`aea935ac2af713690925cfbd78bc5601fc0a036f5f57ab40a51990c7e46ee0a3`
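A downloaded file can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. The sketch below is a generic check, not a PyPI or pip tool. `sha256_of_file` and `matches` are hypothetical helper names, and the demonstration hashes a temporary file rather than the real archive.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path: str, block_size: int = 65536) -> str:
    """Hash a file in blocks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with the published one, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.bin")
        with open(path, "wb") as handle:
            handle.write(b"example contents")
        expected = hashlib.sha256(b"example contents").hexdigest()
        print("OK" if matches(path, expected) else "MISMATCH")
```

For the real archive, pass the path of the downloaded miscai-0.1.1.tar.gz and the SHA256 value from the table to `matches`.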

Provenance

The following attestation bundles were made for miscai-0.1.1.tar.gz:

Publisher: python-publish.yml on pui4/miscai

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: miscai-0.1.1.tar.gz
- Subject digest: 64a82e18dd98a33d68bde4535fbb5ab1bc34fc5db2f44cb0a9f4f521d5ae8eac
- Sigstore transparency entry: 1676659097
- Sigstore integration time: May 30, 2026
Source repository:
- Permalink: pui4/miscai@ee9e644717177627d13103ca415d47735ee61164
- Branch / Tag: refs/tags/0.1.1
- Owner: https://github.com/pui4
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@ee9e644717177627d13103ca415d47735ee61164
- Trigger Event: release

File details

Details for the file miscai-0.1.1-py3-none-any.whl.

File metadata

Download URL: miscai-0.1.1-py3-none-any.whl
Upload date: May 30, 2026
Size: 11.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for miscai-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e5095604efe9ecc1d036c1113c383db89d7df4cddeb2352b8a171bba3a03caca`
MD5	`d8da0b57ac36bb21e1ee6b16b46b7065`
BLAKE2b-256	`ae4e055f79ef678665e5fa9c184bc9fb855cc02231b2981745403f2cd5babb50`

Provenance

The following attestation bundles were made for miscai-0.1.1-py3-none-any.whl:

Publisher: python-publish.yml on pui4/miscai

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: miscai-0.1.1-py3-none-any.whl
- Subject digest: e5095604efe9ecc1d036c1113c383db89d7df4cddeb2352b8a171bba3a03caca
- Sigstore transparency entry: 1676659106
- Sigstore integration time: May 30, 2026
Source repository:
- Permalink: pui4/miscai@ee9e644717177627d13103ca415d47735ee61164
- Branch / Tag: refs/tags/0.1.1
- Owner: https://github.com/pui4
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@ee9e644717177627d13103ca415d47735ee61164
- Trigger Event: release
