forge-infer

Metapackage bundling qwen-think and qwen3.6-mtp under a shared namespace.

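Running pip install forge-infer pulls in qwen-think and qwen3.6-mtp as dependencies and re-exports their key APIs under a single forge namespace. Note that the distribution is named forge-infer, while the import package is forge. The package contributes packaging and narrative only; it adds no new functionality.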

Why this exists

qwen-think (thinking-mode session control) and qwen3.6-mtp (MTP speculative decoding) are two focused packages that belong together. forge-infer gives them a shared identity so you can recommend, install, and document them as a unit instead of scattering links across READMEs.

Install

pip install forge-infer

This installs both qwen-think and qwen3.6-mtp automatically.
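One way to confirm that the install pulled in both dependencies is to query the installed distribution metadata. The sketch below uses only the standard library; the helper name installed_versions is illustrative and not part of forge-infer.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Distribution names as given in this README.
    wanted = ["forge-infer", "qwen-think", "qwen3.6-mtp"]
    for name, ver in installed_versions(wanted).items():
        print(f"{name}: {ver or 'not installed'}")
```

Any entry reported as not installed suggests the dependency resolution did not complete as described.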

Quick start

Thinking sessions (qwen-think)

Control when and how Qwen3.6 "thinks" -- budget tokens, toggle thinking on/off mid-conversation, route by complexity.

from forge.session import ThinkingSession

session = ThinkingSession(model="Qwen/Qwen3.6-27B")
response = session.chat("Explain merge sort", thinking=True)
print(response)
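Routing by complexity can be pictured as a small predicate that decides whether a prompt deserves thinking mode. The heuristic below is a hypothetical sketch for illustration only: should_think, its keyword list, and the min_words threshold are assumptions, and qwen-think's own routing logic may work quite differently.

```python
import re

# Hypothetical keyword list; not taken from qwen-think.
REASONING_HINTS = re.compile(
    r"\b(prove|derive|explain|why|debug|optimi[sz]e|step[- ]by[- ]step|compare)\b",
    re.IGNORECASE,
)


def should_think(prompt: str, min_words: int = 40) -> bool:
    """Return True when the prompt looks like it needs multi-step reasoning."""
    long_prompt = len(prompt.split()) >= min_words
    return long_prompt or bool(REASONING_HINTS.search(prompt))


print(should_think("Explain merge sort"))  # True
print(should_think("Hi there"))            # False
```

The result could then be passed as the thinking argument shown in the quick start, for example session.chat(prompt, thinking=should_think(prompt)).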

MTP speculative decoding (qwen3.6-mtp)

Tune multi-token prediction (MTP) for throughput, find crossover points, and generate backend configs.

from forge.mtp import recommend, quick_crossover, vllm_mtp_command, sglang_mtp_command
from forge.mtp import UseCase, Objective

# Get a recommendation for your hardware
rec = recommend(use_case=UseCase.SINGLE_USER, objective=Objective.MINIMIZE_LATENCY, gpu_id="rtx-4090")
print(rec.enable, rec.expected_gain)

# Find the batch size at which MTP's gain flips from positive to negative
for s in quick_crossover(gpu_id="rtx-3090"):
    print(f"MTP-{s.spec_tokens}: crossover at batch {s.crossover_batch_size}")

# Generate serve commands
print(vllm_mtp_command(model="Qwen/Qwen3.6-27B", num_speculative_tokens=2).command)
print(sglang_mtp_command(model="Qwen/Qwen3.6-27B", num_speculative_tokens=2).command)
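The crossover idea can be illustrated independently of the library: it is the batch size at which speculative decoding stops paying for itself. The toy function below works on made-up throughput numbers. It is not how quick_crossover is implemented, find_crossover is a hypothetical name, and the figures are placeholders rather than benchmarks.

```python
def find_crossover(baseline_tps, mtp_tps):
    """Return the smallest batch size where MTP is no faster than baseline.

    Both arguments map batch size -> tokens per second. Returns None when
    MTP stays ahead at every batch size present in both mappings.
    """
    for batch_size in sorted(baseline_tps.keys() & mtp_tps.keys()):
        if mtp_tps[batch_size] <= baseline_tps[batch_size]:
            return batch_size
    return None


# Placeholder measurements (illustrative only, not real benchmark data).
baseline = {1: 40.0, 4: 150.0, 16: 520.0, 64: 1400.0}
with_mtp = {1: 68.0, 4: 220.0, 16: 510.0, 64: 1150.0}

print(find_crossover(baseline, with_mtp))  # 16
```

Below the crossover, enabling MTP helps; at or above it, plain decoding is at least as fast in this toy data.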

Architecture

How the packages relate:

+---------------------------------------------+
|             forge (metapackage)             |
+------------------+--------------------------+
|   forge.session  |       forge.mtp          |
|  (qwen-think)    |   (qwen3.6-mtp)          |
|                  |                          |
|  Thinking-mode   |  MTP speculative decode  |
|  session control |  tuning & backend config |
+------------------+--------------------------+
|            Qwen3.6 model family             |
+---------------------------------------------+

  • forge.session -- Re-exports ThinkingSession from qwen-think.
  • forge.mtp -- Re-exports recommend, quick_crossover, vllm_mtp_command, sglang_mtp_command, UseCase, Objective from qwen3.6-mtp.
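A re-export means the names under forge.session and forge.mtp are the same objects as in the underlying packages, not wrappers around them. The sketch below demonstrates the pattern with a standard-library module standing in for a dependency, since the import names of qwen-think and qwen3.6-mtp are not given here; facade is a hypothetical name.

```python
import json
import types

# Build a small facade module that re-exports selected names from `json`,
# which stands in for an underlying dependency.
facade = types.ModuleType("facade")
facade.dumps = json.dumps
facade.loads = json.loads
facade.__all__ = ["dumps", "loads"]

# A re-export is the identical object, so behavior and docs come from upstream.
print(facade.dumps is json.dumps)            # True
print(facade.loads(facade.dumps({"k": 2})))  # {'k': 2}
```

Because the objects are identical, upgrading an underlying package changes what the facade exposes without any change to the facade itself.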

Individual packages

Package       What it does
qwen-think    Thinking-mode session management
qwen3.6-mtp   MTP speculative decoding tuner

What this package does NOT do

  • No new functionality -- strictly re-exports from the underlying packages
  • No CLI -- the libraries are Python-first
  • No model generalization -- wraps Qwen3.6-specific versions as-is

License

Apache 2.0

Download files

Source Distribution

forge_infer-0.1.1.tar.gz (7.2 kB)

Uploaded Source

Built Distribution

forge_infer-0.1.1-py3-none-any.whl (7.4 kB)

Uploaded Python 3

File details

Details for the file forge_infer-0.1.1.tar.gz.

File metadata

  • Download URL: forge_infer-0.1.1.tar.gz
  • Size: 7.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for forge_infer-0.1.1.tar.gz
Algorithm     Hash digest
SHA256        eb6638fccf54cf51a2f9740b3c03028b5255ad1759a16673677f15ee7f58fab2
MD5           e09222bcf86eea9df0e970942fc40e5a
BLAKE2b-256   02060dc8af1ef9575751c299833402385340b51a8b927de90beae17c7c6dc888
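A downloaded file can be checked against the published SHA256 digest with the standard library. This is a minimal sketch: the helper names sha256_of and matches are illustrative, and the commented-out call assumes the file has been saved to the working directory.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare the file digest to a published value (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()


# SHA256 for forge_infer-0.1.1.tar.gz as listed above.
EXPECTED = "eb6638fccf54cf51a2f9740b3c03028b5255ad1759a16673677f15ee7f58fab2"

# Assumes the sdist was downloaded to the current directory:
# print(matches("forge_infer-0.1.1.tar.gz", EXPECTED))
```

A result of False would mean the file differs from the one that was published and should not be installed.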

Provenance

The following attestation bundles were made for forge_infer-0.1.1.tar.gz:

Publisher: publish.yml on ArkaD171717/FORGE-Infer

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file forge_infer-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: forge_infer-0.1.1-py3-none-any.whl
  • Size: 7.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for forge_infer-0.1.1-py3-none-any.whl
Algorithm     Hash digest
SHA256        69b6ef5640def9be6f8942340b1d3acf9e15a5724fb99dce91bfc679144f1f35
MD5           33153c4eadc62a6fd5aa3a7f49d176b9
BLAKE2b-256   85e6ea14ebfd04a7bac919f58ee7d706a2f100447b3108eee1ad0b1ac3ed6f6e

Provenance

The following attestation bundles were made for forge_infer-0.1.1-py3-none-any.whl:

Publisher: publish.yml on ArkaD171717/FORGE-Infer

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
