DeepSeek Model for DashAI

Project description

DeepSeek LLM Plugin for DashAI

This plugin integrates two DeepSeek models into the DashAI framework using the llama.cpp backend. It enables text generation tasks through a lightweight and efficient inference engine with support for quantized GGUF models.

Included Models

1. DeepSeek LLM 7B Chat

  • Chat-tuned model for general conversational tasks
  • Distributed as deepseek-llm-7b-chat in the Q5_K_M quantized GGUF format

2. DeepSeek Coder 6.7B Instruct

  • Instruction-tuned model for code-related and general instruction tasks
  • Initialized from deepseek-coder-6.7b-base and fine-tuned on 2B instruction tokens
  • Based on TheBloke/deepseek-coder-6.7B-instruct-GGUF
  • Uses the quantized file deepseek-coder-6.7b-instruct.Q5_K_M.gguf

Both models use the Q5_K_M quantization method for a balance of quality and efficiency, and are compatible with both CPU and GPU inference.

Components

DeepSeekModel

  • Implements the TextToTextGenerationTaskModel interface from DashAI
  • Uses the llama.cpp backend with GGUF support
  • Loads the model from Hugging Face at runtime
  • Supports configurable generation parameters
  • Automatically truncates long prompts and uses custom stop sequences for cleaner output
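The prompt-truncation behavior described above could be sketched roughly as follows. This is a hypothetical helper, not the plugin's actual code: a real implementation would count tokens with the model's tokenizer, whereas this sketch uses a crude characters-per-token heuristic.

```python
def truncate_prompt(prompt: str, n_ctx: int, max_tokens: int,
                    chars_per_token: int = 4) -> str:
    """Trim a prompt so that prompt + generated tokens fit in the context window.

    Uses a rough characters-per-token estimate instead of a real tokenizer.
    """
    budget_tokens = max(n_ctx - max_tokens, 0)
    budget_chars = budget_tokens * chars_per_token
    # Keep the tail of the prompt, which usually carries the live instruction.
    return prompt[-budget_chars:] if len(prompt) > budget_chars else prompt
```

With the defaults from the parameters table (n_ctx=4096, max_tokens=100), a very long prompt would be cut to roughly the last 15,984 characters, while short prompts pass through unchanged.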

Features

  • Configurable text generation with:
    • max_tokens: Number of tokens to generate
    • temperature: Controls output randomness
    • frequency_penalty: Reduces repetition
    • n_ctx: Context window size
    • device: "cpu" or "gpu"
  • Efficient memory usage with GGUF quantization
  • Custom stop sequence: ["Q:"]
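The stop-sequence behavior can be illustrated with a minimal post-processing sketch (a hypothetical standalone function; in practice llama.cpp applies stop sequences during generation rather than afterwards):

```python
def apply_stop_sequences(text: str, stops=("Q:",)) -> str:
    """Cut generated text at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stops:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].rstrip()
```

For example, with the plugin's default stop sequence ["Q:"], output that begins drifting into a new question is trimmed back to the completed answer.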

Model Parameters

| Parameter | Description | Default |
| --- | --- | --- |
| max_tokens | Maximum number of tokens to generate | 100 |
| temperature | Sampling temperature (higher = more random) | 0.7 |
| frequency_penalty | Penalizes repeated tokens to encourage diversity | 0.1 |
| n_ctx | Maximum context window (tokens in prompt) | 4096 |
| device | Inference device | "gpu" if available |
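The defaults in the table above can be captured in a small configuration object. This is an illustrative sketch, not the plugin's actual class; the field names mirror the documented parameters:

```python
from dataclasses import dataclass, asdict


@dataclass
class GenerationConfig:
    """Generation parameters with the documented defaults."""
    max_tokens: int = 100
    temperature: float = 0.7
    frequency_penalty: float = 0.1
    n_ctx: int = 4096
    device: str = "gpu"  # falls back to "cpu" when no GPU is available


# asdict() yields keyword arguments suitable for passing to a backend call.
config = GenerationConfig(temperature=0.2)
```

A dataclass keeps the defaults in one place and makes overrides explicit, e.g. `GenerationConfig(temperature=0.2)` for more deterministic output.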

Requirements

The exact dependency pins ship with the package metadata; at minimum the plugin needs the DashAI framework and (presumably) the llama-cpp-python bindings for the llama.cpp backend.

Notes

This plugin uses the GGUF format, introduced by the llama.cpp team in August 2023.
GGUF replaces the older GGML format, which is no longer supported.

GGUF models are optimized for fast inference and lower memory consumption, especially on CPU- or GPU-constrained devices.

Both models (deepseek-llm-7b-chat and deepseek-coder-6.7b-instruct) are distributed in the Q5_K_M quantized format.
This quantization method offers a solid trade-off between model size and quality, making them suitable for real-time or resource-limited environments.

⚠️ These models are pretrained and instruction-tuned for inference only. They are not intended for fine-tuning.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dashai_deepseek_model_package-0.0.6.tar.gz (21.2 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

dashai_deepseek_model_package-0.0.6-py3-none-any.whl (4.4 kB)

Uploaded Python 3

File details

Details for the file dashai_deepseek_model_package-0.0.6.tar.gz.

File metadata

File hashes

Hashes for dashai_deepseek_model_package-0.0.6.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | c45847e37547ed6fba896ded380defe05e865154521d3da03b1f628d9ac0fdf9 |
| MD5 | 8791d9387ba780e8ce8ba13e6ca5f51f |
| BLAKE2b-256 | 7f52579a37806f4915d5026f50ef078b343c7f49449376579f444d7f4998f7d6 |

See more details on using hashes here.
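Verifying a downloaded archive against the published digests can be done with Python's standard library. A minimal sketch (the file path is a placeholder):

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file in chunks and return its SHA-256 hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Compare the result of `sha256_of("dashai_deepseek_model_package-0.0.6.tar.gz")` against the SHA256 value in the table above; a mismatch means the download is corrupted or tampered with.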

File details

Details for the file dashai_deepseek_model_package-0.0.6-py3-none-any.whl.

File metadata

File hashes

Hashes for dashai_deepseek_model_package-0.0.6-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | ce9dc4f012b19d4cc29e2075dc1c96d8ad34f418dbdda0b54f58eb0fd3a2597a |
| MD5 | 25a9f8f53d7b807796cc89824077bcdb |
| BLAKE2b-256 | 3366c4563473e7a19c06f72dfcb0b7c2135c0709876f3532310b304a03b398ef |

See more details on using hashes here.
