enova-instrumentation-llmo
Project description
Usage
Install the wheel package:
pip install enova_instrumentation_llmo-0.0.4-py3-none-any.whl
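Configure OpenTelemetry (OTel) and enable instrumentation in your vLLM program code:
# Enable instrumentation
from enova.llmo import start
# Specify the OTel collector address and the service name
start(otlp_exporter_endpoint="localhost:4317", service_name="service_name")
####### The original program code continues below #######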
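A minimal sketch of one way to keep the collector address and service name out of the source code. Only the `start(...)` call and its two keyword arguments come from the usage above; the helper names `build_start_kwargs` and `enable_llmo` and the environment variable names `ENOVA_OTLP_ENDPOINT` and `ENOVA_SERVICE_NAME` are hypothetical and are not defined by the package.

```python
import os


def build_start_kwargs(env=None):
    """Build keyword arguments for enova.llmo.start from environment variables.

    ENOVA_OTLP_ENDPOINT and ENOVA_SERVICE_NAME are hypothetical names chosen
    for this sketch; the defaults mirror the values in the usage example.
    """
    env = os.environ if env is None else env
    return {
        "otlp_exporter_endpoint": env.get("ENOVA_OTLP_ENDPOINT", "localhost:4317"),
        "service_name": env.get("ENOVA_SERVICE_NAME", "service_name"),
    }


def enable_llmo(env=None):
    """Call start() if the package is installed; return True on success."""
    try:
        from enova.llmo import start
    except ImportError:
        # Package not installed: the program runs without instrumentation.
        return False
    start(**build_start_kwargs(env))
    return True
```

As in the usage above, `enable_llmo()` would be called at the top of the program, before the original vLLM code runs.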
Metrics
- avg_prompt_throughput: prompt input rate, in tokens/s
- avg_generation_throughput: generation rate, in tokens/s
- running_requests: number of requests currently running
- swapped_requests: number of requests currently swapped
- pending_requests: number of requests currently pending
- gpu_kv_cache_usage: GPU KV cache utilization
- cpu_kv_cache_usage: CPU KV cache utilization
- generated_tokens: number of tokens generated
- llm_engine_init_config: engine startup parameters, with the following attributes:
  - model
  - tokenizer
  - tokenizer_mode
  - revision
  - tokenizer_revision
  - trust_remote_code
  - dtype
  - max_seq_len
  - download_dir
  - load_format
  - tensor_parallel_size
  - disable_custom_all_reduce
  - quantization
  - enforce_eager
  - kv_cache_dtype
  - seed
  - max_num_batched_tokens
  - max_num_seqs
  - max_paddings
  - pipeline_parallel_size
  - worker_use_ray
  - max_parallel_loading_workers
- http.server.active_requests: number of HTTP requests FastAPI is currently processing
- http.server.duration: FastAPI server-side request processing time
- http.server.response.size: size of FastAPI HTTP response messages
- http.server.request.size: size of FastAPI HTTP requests
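The engine metrics above can be read together to judge load. The following is a rough sketch, not part of the package: `EngineSnapshot`, `total_throughput`, and `looks_saturated` are hypothetical names, the 0.9 threshold is arbitrary, and it assumes `gpu_kv_cache_usage` is reported as a fraction between 0 and 1, which the description does not state.

```python
from dataclasses import dataclass


@dataclass
class EngineSnapshot:
    """One reading of the engine metrics (hypothetical container)."""
    avg_prompt_throughput: float      # tokens/s
    avg_generation_throughput: float  # tokens/s
    running_requests: int
    swapped_requests: int
    pending_requests: int
    gpu_kv_cache_usage: float         # assumed to be a fraction in [0, 1]


def total_throughput(s: EngineSnapshot) -> float:
    """Combined prompt and generation rate in tokens/s."""
    return s.avg_prompt_throughput + s.avg_generation_throughput


def looks_saturated(s: EngineSnapshot, cache_threshold: float = 0.9) -> bool:
    """Heuristic: requests are queued or swapped while the GPU KV cache is nearly full."""
    backlog = s.pending_requests + s.swapped_requests
    return backlog > 0 and s.gpu_kv_cache_usage >= cache_threshold
```

A backlog with a nearly full KV cache suggests the engine is limited by cache memory rather than by incoming traffic; a backlog with a mostly empty cache points elsewhere.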
Trace spans
- POST /generate: the /generate request
- POST /generate prompt: carries the prompt as an attribute
- ModelRunner.execute_model: model execution; each span corresponds to the generation of one token
- CUDAGraphRunner.forward: CUDA Graph forward computation, called from within ModelRunner.execute_model
- ChatGLMForCausalLM.forward: forward pass of the ChatGLM model
- LlamaForCausalLM.forward: forward pass of the Llama model
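Because each `ModelRunner.execute_model` span corresponds to one token generation, counting those spans gives a rough number of generation steps in a set of exported spans. The sketch below assumes the spans have already been exported and loaded as dictionaries with a `"name"` key; that record shape and the function names are hypothetical, not the package's export format.

```python
from collections import Counter


def count_spans_by_name(spans):
    """Return a Counter mapping span name to number of occurrences."""
    return Counter(span["name"] for span in spans)


def generation_steps(spans):
    """Number of ModelRunner.execute_model spans, i.e. token generation steps."""
    return count_spans_by_name(spans)["ModelRunner.execute_model"]
```

Comparing this count with the number of `CUDAGraphRunner.forward` spans would show how many steps went through the CUDA Graph path.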
Download files
Source Distributions
No source distribution files available for this release.
Built Distribution
File details
Details for the file enova_instrumentation_llmo-0.0.4-py3-none-any.whl.
File metadata
- Download URL: enova_instrumentation_llmo-0.0.4-py3-none-any.whl
- Upload date:
- Size: 8.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.10.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | 96dba89b012bd31411fb14950df0d10b7e87469d03691531a4ca7056428c8190
MD5 | 4640959619f9391974d1d3ce7e7cd856
BLAKE2b-256 | cb9a118b42e9a20d7b788db2881f105b4f9b54d7d3e8dd16400f1ea650327eb1
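Before installing a downloaded wheel, its SHA256 digest can be compared with the value listed above. A minimal sketch using only the standard library; the function names are hypothetical and the file path is whatever location you downloaded the wheel to.

```python
import hashlib

# SHA256 digest listed for enova_instrumentation_llmo-0.0.4-py3-none-any.whl
EXPECTED_SHA256 = "96dba89b012bd31411fb14950df0d10b7e87469d03691531a4ca7056428c8190"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_expected("enova_instrumentation_llmo-0.0.4-py3-none-any.whl")` should return True for an intact download.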