DeepSeek Model for DashAI
Project description
DeepSeek LLM model for DashAI
This plugin integrates the DeepSeek LLM 7B Chat model into the DashAI framework using the llama.cpp backend. It enables text generation tasks via a lightweight and efficient inference engine with support for quantized GGUF models.
Components
DeepSeekModel
- Based on the `llama.cpp` backend using the GGUF quantized format
- Loads the model from HuggingFace: `TheBloke/deepseek-llm-7B-chat-GGUF`
- Uses the `deepseek-llm-7b-chat.Q5_K_M.gguf` quantized file
- Compatible with CPU and GPU inference
- Implements the `TextToTextGenerationTaskModel` interface of DashAI
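The backend wiring can be sketched with a small helper. Note that `build_llama_kwargs` is a hypothetical name, not part of the plugin's public API; it only illustrates how the plugin's `device` setting could map onto llama-cpp-python's `Llama` constructor arguments:

```python
def build_llama_kwargs(model_path: str, device: str = "gpu", n_ctx: int = 4096) -> dict:
    """Map the plugin's settings onto llama-cpp-python's Llama constructor.

    In llama-cpp-python, n_gpu_layers=-1 offloads every layer to the GPU,
    while 0 keeps inference entirely on the CPU.
    """
    return {
        "model_path": model_path,
        "n_ctx": n_ctx,
        "n_gpu_layers": -1 if device == "gpu" else 0,
    }

# The resulting kwargs would be passed as llama_cpp.Llama(**kwargs);
# actually constructing the model requires the multi-GB GGUF file on disk.
kwargs = build_llama_kwargs("deepseek-llm-7b-chat.Q5_K_M.gguf", device="cpu")
```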
Features
- Text generation with the following configurable parameters:
  - `max_tokens`: Number of tokens to generate
  - `temperature`: Controls randomness of output
  - `frequency_penalty`: Reduces repetition in output
  - `n_ctx`: Context window size
  - `device`: Inference device (`"cpu"` or `"gpu"`)
- Efficient memory usage via quantized GGUF format
- Automatic truncation of overly long prompts
- Custom stop sequence (`["Q:"]`) for cleaner outputs
Model Parameters
| Parameter | Description | Default |
|---|---|---|
| `max_tokens` | Maximum number of tokens to generate | 100 |
| `temperature` | Sampling temperature (higher = more random) | 0.7 |
| `frequency_penalty` | Penalizes repeated tokens to encourage diversity | 0.1 |
| `n_ctx` | Maximum context window (tokens in prompt) | 4096 |
| `device` | Device for inference (`"gpu"` or `"cpu"`) | `"gpu"` if available |
Requirements
- DashAI
- `llama-cpp-python`
- Model files from HuggingFace: `TheBloke/deepseek-llm-7B-chat-GGUF` (use the quantized file `deepseek-llm-7b-chat.Q5_K_M.gguf`)
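One way to fetch the dependencies, assuming `huggingface_hub` (which provides the `huggingface-cli` tool) is installed; the `./models` target directory is an arbitrary choice:

```shell
pip install llama-cpp-python

# download only the Q5_K_M quantized file from the HuggingFace repository
huggingface-cli download TheBloke/deepseek-llm-7B-chat-GGUF \
    deepseek-llm-7b-chat.Q5_K_M.gguf --local-dir ./models
```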
Notes
This plugin uses the GGUF format, introduced by the llama.cpp team in August 2023.
GGUF replaces the older GGML format, which is no longer supported by llama.cpp.
GGUF models are optimized for fast inference and lower memory consumption, especially in CPU/GPU-constrained environments.
The file deepseek-llm-7b-chat.Q5_K_M.gguf is a quantized version of the original DeepSeek LLM 7B Chat model.
The Q5_K_M quantization offers a good trade-off between model size and quality, making it suitable for real-time or resource-limited applications.
The model used in this plugin is a pretrained chat-oriented version and is not designed for fine-tuning. It is intended for inference only.
File details
Details for the file dashai_deepseek_model_package-0.0.4.tar.gz.
File metadata
- Download URL: dashai_deepseek_model_package-0.0.4.tar.gz
- Upload date:
- Size: 19.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `2d44df678fd68e9a7d23b7bc5fb2e8c2ad5d6f02ee38402415e36cbc124891e0` |
| MD5 | `af2fcc55c9608a30a56c07178b2bfd66` |
| BLAKE2b-256 | `29cdccec8fab7b4d23f8c693302482a0b3c615846aade418fe88beb95b01719d` |
File details
Details for the file dashai_deepseek_model_package-0.0.4-py3-none-any.whl.
File metadata
- Download URL: dashai_deepseek_model_package-0.0.4-py3-none-any.whl
- Upload date:
- Size: 4.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `ec00a9aa43b888db3435f975d730c3f0e1ce29e6fe266e24ae38edf4915bcfb3` |
| MD5 | `d4faeb2436bea3e77230b0069557f780` |
| BLAKE2b-256 | `0db0b742790591617f1054eff3bcf312763b3796b2b7c2355eeacc9f0e41c535` |