DeepSeek Model for DashAI

This plugin integrates the DeepSeek LLM 7B Chat model into the DashAI framework using the llama.cpp backend. It enables text generation tasks via a lightweight and efficient inference engine with support for quantized GGUF models.

Components

DeepSeekModel

  • Based on the llama.cpp backend using the GGUF quantized format
  • Loads the model from the Hugging Face Hub: TheBloke/deepseek-llm-7B-chat-GGUF
  • Uses the deepseek-llm-7b-chat.Q5_K_M.gguf quantized file
  • Compatible with CPU and GPU inference
  • Implements the TextToTextGenerationTaskModel interface of DashAI
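For illustration, fetching the quantized file from the Hugging Face Hub and loading it with llama-cpp-python could look like the sketch below. The `load_deepseek` and `gpu_layers_for` helpers are hypothetical names for this sketch, not the plugin's actual code; `hf_hub_download` and `Llama` are the real APIs from huggingface_hub and llama-cpp-python.

```python
def gpu_layers_for(device: str) -> int:
    # llama.cpp controls GPU offloading via n_gpu_layers:
    # -1 offloads every layer to the GPU, 0 keeps inference on the CPU.
    return -1 if device == "gpu" else 0

def load_deepseek(device: str = "cpu", n_ctx: int = 4096):
    # Imports are local so the helper above works without these packages installed.
    from huggingface_hub import hf_hub_download
    from llama_cpp import Llama

    # Download (or reuse from the local cache) the quantized GGUF file.
    model_path = hf_hub_download(
        repo_id="TheBloke/deepseek-llm-7B-chat-GGUF",
        filename="deepseek-llm-7b-chat.Q5_K_M.gguf",
    )
    return Llama(model_path=model_path, n_ctx=n_ctx,
                 n_gpu_layers=gpu_layers_for(device))
```

The first call downloads roughly 5 GB; later calls hit the Hub cache.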

Features

  • Text generation with the following configurable parameters:
    • max_tokens: Number of tokens to generate
    • temperature: Controls randomness of output
    • frequency_penalty: Reduces repetition in output
    • n_ctx: Context window size
    • device: Inference device ("cpu" or "gpu")
  • Efficient memory usage via quantized GGUF format
  • Automatic truncation of overly long prompts
  • Custom stop sequence (["Q:"]) for cleaner outputs
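The prompt-truncation feature can be sketched as a small helper (the name `truncate_prompt` and the choice to keep the tail of the prompt are assumptions for this sketch): it trims the tokenized prompt so that the prompt plus the generation budget still fits inside the context window.

```python
def truncate_prompt(tokens: list[int], n_ctx: int, max_tokens: int) -> list[int]:
    # Reserve max_tokens of the context window for generation and
    # keep only the most recent prompt tokens that still fit.
    budget = n_ctx - max_tokens
    return tokens[-budget:] if len(tokens) > budget else tokens
```

The stop sequence is then passed as `stop=["Q:"]` on the completion call, so generation halts before the model starts inventing a follow-up question.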

Model Parameters

Parameter           Description                                        Default
max_tokens          Maximum number of tokens to generate               100
temperature         Sampling temperature (higher = more random)        0.7
frequency_penalty   Penalizes repeated tokens to encourage diversity   0.1
n_ctx               Maximum context window (tokens in prompt)          4096
device              Device for inference ("gpu" or "cpu")              "gpu" if available
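As a sketch, the defaults in the table map naturally onto a llama-cpp-python completion call; `GENERATION_DEFAULTS` and `generate` are illustrative names, not the plugin's actual interface.

```python
# Table defaults expressed as keyword arguments for a llama-cpp-python
# completion call (illustrative, not the plugin's actual code).
GENERATION_DEFAULTS = {
    "max_tokens": 100,
    "temperature": 0.7,
    "frequency_penalty": 0.1,
    "stop": ["Q:"],
}

def generate(llm, prompt: str, **overrides) -> str:
    # Per-call overrides win over the table defaults.
    params = {**GENERATION_DEFAULTS, **overrides}
    # llama_cpp.Llama instances are callable and return an
    # OpenAI-style completion dict.
    out = llm(prompt, **params)
    return out["choices"][0]["text"]
```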

Requirements

Notes

This plugin uses the GGUF format, introduced by the llama.cpp team in August 2023.
GGUF replaces the older GGML format, which is no longer supported by llama.cpp.

GGUF models are optimized for fast inference and lower memory consumption, especially in CPU/GPU-constrained environments.

The file deepseek-llm-7b-chat.Q5_K_M.gguf is a quantized version of the original DeepSeek LLM 7B Chat model.
The Q5_K_M quantization offers a good trade-off between model size and quality, making it suitable for real-time or resource-limited applications.
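A back-of-envelope estimate shows the trade-off. Assuming an average of about 5.5 bits per weight for Q5_K_M (an approximation; the exact per-tensor overhead varies) across roughly 7 billion parameters:

```python
# Rough size estimate for the Q5_K_M quantization of a 7B model.
params = 7e9
bits_per_weight = 5.5        # approximate average for Q5_K_M
size_gb = params * bits_per_weight / 8 / 1e9   # ≈ 4.8 GB
fp16_gb = params * 16 / 8 / 1e9                # = 14 GB in half precision
```

So the quantized file is roughly a third the size of the fp16 weights, at a modest quality cost.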

The model used in this plugin is a pretrained chat-oriented version and is not designed for fine-tuning. It is intended for inference only.

Download files

Download the file for your platform.

Source Distribution

dashai_deepseek_model_package-0.0.4.tar.gz (19.0 kB)

Built Distribution

dashai_deepseek_model_package-0.0.4-py3-none-any.whl

File details

Details for the file dashai_deepseek_model_package-0.0.4.tar.gz.

File hashes

Hashes for dashai_deepseek_model_package-0.0.4.tar.gz
Algorithm     Hash digest
SHA256        2d44df678fd68e9a7d23b7bc5fb2e8c2ad5d6f02ee38402415e36cbc124891e0
MD5           af2fcc55c9608a30a56c07178b2bfd66
BLAKE2b-256   29cdccec8fab7b4d23f8c693302482a0b3c615846aade418fe88beb95b01719d

File details

Details for the file dashai_deepseek_model_package-0.0.4-py3-none-any.whl.

File hashes

Hashes for dashai_deepseek_model_package-0.0.4-py3-none-any.whl
Algorithm     Hash digest
SHA256        ec00a9aa43b888db3435f975d730c3f0e1ce29e6fe266e24ae38edf4915bcfb3
MD5           d4faeb2436bea3e77230b0069557f780
BLAKE2b-256   0db0b742790591617f1054eff3bcf312763b3796b2b7c2355eeacc9f0e41c535
