DeepSeek Model for DashAI
Project description
DeepSeek LLM Plugin for DashAI
This plugin integrates two DeepSeek models into the DashAI framework using the llama.cpp backend. It enables text generation tasks through a lightweight and efficient inference engine with support for quantized GGUF models.
Included Models
1. DeepSeek LLM 7B Chat
- Pretrained chat-oriented model for general text generation
- Based on `TheBloke/deepseek-llm-7B-chat-GGUF`
- Uses quantized file: `deepseek-llm-7b-chat.Q5_K_M.gguf`
2. DeepSeek Coder 6.7B Instruct
- Instruction-tuned model for code-related and general instruction tasks
- Initialized from `deepseek-coder-6.7b-base` and fine-tuned on 2B instruction tokens
- Based on `TheBloke/deepseek-coder-6.7B-instruct-GGUF`
- Uses quantized file: `deepseek-coder-6.7b-instruct.Q5_K_M.gguf`
Both models use the Q5_K_M quantization method for a balance of quality and efficiency, and are compatible with both CPU and GPU inference.
Components
DeepSeekModel
- Implements the `TextToTextGenerationTaskModel` interface from DashAI
- Uses the `llama.cpp` backend with GGUF support
- Loads the model from Hugging Face at runtime
- Supports configurable generation parameters
- Automatically truncates long prompts and uses custom stop sequences for cleaner output
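As an illustration of the runtime flow above, here is a minimal sketch using `huggingface_hub` and `llama-cpp-python` directly. The prompt, variable names, and parameter values are illustrative assumptions, not the plugin's actual code, and running it downloads a multi-gigabyte model file:

```python
# Illustrative sketch only -- not the plugin's implementation.
# Requires: pip install huggingface_hub llama-cpp-python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Fetch the quantized GGUF file from Hugging Face at runtime.
model_path = hf_hub_download(
    repo_id="TheBloke/deepseek-llm-7B-chat-GGUF",
    filename="deepseek-llm-7b-chat.Q5_K_M.gguf",
)

llm = Llama(
    model_path=model_path,
    n_ctx=4096,       # context window (plugin default)
    n_gpu_layers=-1,  # offload layers to GPU when available; 0 for CPU-only
)

out = llm(
    "Q: What is the GGUF format?\nA:",
    max_tokens=100,
    temperature=0.7,
    frequency_penalty=0.1,
    stop=["Q:"],      # the plugin's custom stop sequence
)
print(out["choices"][0]["text"].strip())
```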
Features
- Configurable text generation with:
  - `max_tokens`: number of tokens to generate
  - `temperature`: controls output randomness
  - `frequency_penalty`: reduces repetition
  - `n_ctx`: context window size
  - `device`: `"cpu"` or `"gpu"`
- Efficient memory usage with GGUF quantization
- Custom stop sequence: `["Q:"]`
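The truncation and stop-sequence behavior can be illustrated with a small standalone sketch. The function names and the word-based truncation are assumptions for illustration only; the plugin itself operates on tokens via `llama.cpp`:

```python
# Standalone illustration of prompt truncation and stop-sequence trimming.
# Function names and the word-level counting are assumptions, not plugin code.

def truncate_prompt(prompt, n_ctx, max_tokens):
    """Keep the prompt within the context window, reserving room for
    max_tokens of generated output (approximating one word per token)."""
    budget = n_ctx - max_tokens
    words = prompt.split()
    return " ".join(words[-budget:]) if len(words) > budget else prompt

def apply_stop(text, stops=("Q:",)):
    """Cut generated text at the first stop sequence, if present."""
    for s in stops:
        idx = text.find(s)
        if idx != -1:
            text = text[:idx]
    return text.rstrip()

print(apply_stop("GGUF is a binary model format.\nQ: next question"))
# -> GGUF is a binary model format.
```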
Model Parameters
| Parameter | Description | Default |
|---|---|---|
| `max_tokens` | Maximum number of tokens to generate | 100 |
| `temperature` | Sampling temperature (higher = more random) | 0.7 |
| `frequency_penalty` | Penalizes repeated tokens to encourage diversity | 0.1 |
| `n_ctx` | Maximum context window (tokens in prompt) | 4096 |
| `device` | Inference device | `"gpu"` if available |
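The defaults above can be collected into a plain configuration mapping. This is a sketch; the actual DashAI parameter schema class may be structured differently:

```python
# Default generation parameters from the table above, as a plain dict.
# The key names mirror the plugin's parameters; the structure is a sketch.
DEFAULTS = {
    "max_tokens": 100,
    "temperature": 0.7,
    "frequency_penalty": 0.1,
    "n_ctx": 4096,
    "device": "gpu",  # falls back to "cpu" when no GPU is available
}

def merged(overrides=None):
    """Return the defaults with any user overrides applied on top."""
    return {**DEFAULTS, **(overrides or {})}

print(merged({"temperature": 0.2})["temperature"])  # -> 0.2
```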
Requirements
- DashAI
- llama-cpp-python
- Model files from Hugging Face: `deepseek-llm-7b-chat.Q5_K_M.gguf`, `deepseek-coder-6.7b-instruct.Q5_K_M.gguf`
Notes
This plugin uses the GGUF format, introduced by the llama.cpp team in August 2023.
GGUF replaces the older GGML format, which is no longer supported.
GGUF models are optimized for fast inference and lower memory consumption, especially on CPU- or GPU-constrained devices.
Both models (deepseek-llm-7b-chat and deepseek-coder-6.7b-instruct) are distributed in the Q5_K_M quantized format.
This quantization method offers a solid trade-off between model size and quality, making them suitable for real-time or resource-limited environments.
⚠️ These models are pretrained and instruction-tuned for inference only. They are not intended for fine-tuning.
Project details
Download files
Download the file for your platform.
Source Distribution
Built Distribution
File details
Details for the file dashai_deepseek_model_package-0.0.6.tar.gz.
File metadata
- Download URL: dashai_deepseek_model_package-0.0.6.tar.gz
- Upload date:
- Size: 21.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `c45847e37547ed6fba896ded380defe05e865154521d3da03b1f628d9ac0fdf9` |
| MD5 | `8791d9387ba780e8ce8ba13e6ca5f51f` |
| BLAKE2b-256 | `7f52579a37806f4915d5026f50ef078b343c7f49449376579f444d7f4998f7d6` |
File details
Details for the file dashai_deepseek_model_package-0.0.6-py3-none-any.whl.
File metadata
- Download URL: dashai_deepseek_model_package-0.0.6-py3-none-any.whl
- Upload date:
- Size: 4.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `ce9dc4f012b19d4cc29e2075dc1c96d8ad34f418dbdda0b54f58eb0fd3a2597a` |
| MD5 | `25a9f8f53d7b807796cc89824077bcdb` |
| BLAKE2b-256 | `3366c4563473e7a19c06f72dfcb0b7c2135c0709876f3532310b304a03b398ef` |