HiQ - A Modern Observability System
🦉 Observability And Optimization In Modern AI Era
🔥 HiQ now supports GPU profiling, DNN model visualization and tracing for DNN libraries like PyTorch, transformers, LAVIS, and LLMs like LLaMA, OPT, Bloom, T5 and GPT2 in addition to Onnxruntime, FastAPI and Flask.
HiQ is a declarative, non-intrusive, dynamic and transparent tracking system for both monolithic applications and distributed systems. It brings runtime information tracking and optimization to a new level without compromising speed or system performance, and without hiding any tracking overhead. HiQ applies to both I/O-bound and CPU-bound applications. In addition to latency tracking, HiQ provides memory, disk I/O and network I/O tracking out of the box. The output can be saved as a normal line-by-line log file, a HiQ tree, or a span graph.
HiQ's philosophy is to decouple observability logic from business logic. We don't have to enter the black hole to observe it. Do you like the idea? Leave a ⭐ if you enjoy the project, and feel free to say hi to us on Slack 👋
Installation
- Basic Installation
pip install hiq-python
- HiQ also supports optional extras
pip install hiq-python[fastapi] # To support fastapi web server online tracing
pip install hiq-python[gpu] # To support GPU tracing, which will install pynvml
pip install hiq-python[lavis] # To support Salesforce LAVIS Vision Language models
pip install hiq-python[transformers] # To support tracing Hugging Face's transformers library
pip install hiq-python[full] # To support all the cases, and this will install all the dependency libraries
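After installing an extra, it can help to confirm that the optional dependency is actually importable. The following is a minimal sketch using only the standard library. The mapping from extra name to top-level module name is an assumption (only `gpu` -> `pynvml` is stated above), and `available_extras` is a hypothetical helper, not part of HiQ.

```python
import importlib.util

# Assumed mapping from extra name to the top-level module it provides.
# Only "gpu" -> "pynvml" is stated in this README; the rest are guesses.
EXTRA_MODULES = {
    "fastapi": "fastapi",
    "gpu": "pynvml",
    "lavis": "lavis",
    "transformers": "transformers",
}


def available_extras(extra_modules=EXTRA_MODULES):
    """Return {extra_name: True/False} depending on whether the module can be found."""
    status = {}
    for extra, module_name in extra_modules.items():
        try:
            status[extra] = importlib.util.find_spec(module_name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or half-initialized modules.
            status[extra] = False
    return status


if __name__ == "__main__":
    for extra, ok in available_extras().items():
        print(f"hiq-python[{extra}]: {'available' if ok else 'missing'}")
```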
Get Started
Let's start with the simplest example by running HiQ against a simple monolithic Python program, 📄 main.py:
    # this is the main.py python source code
    import time


    def func1():
        time.sleep(1.5)
        print("func1")
        func2()


    def func2():
        time.sleep(2.5)
        print("func2")


    def main():
        func1()


    if __name__ == "__main__":
        main()
In this code, there is a simple chain of function calls: main() -> func1() -> func2().
Now we want to trace these functions without modifying their code. Let's run the following:
git clone https://github.com/oracle-samples/hiq.git
cd hiq/examples/quick_start
python main_driver.py
If everything is fine, you should be able to see the output like this:
From the screenshot we can see the timestamp and the latency of each function:
| | main | func1 | func2 | tracing overhead |
|---|---|---|---|---|
| latency (seconds) | 4.0045 | 4.0044 | 2.5026 | 0.0000163 |
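
The latencies in the table are inclusive: each value covers the function and everything it calls. A small sketch of how the time spent in each function itself can be derived from those numbers is shown here. The helper name `exclusive_latency` is hypothetical and this is plain arithmetic on the table values, not a HiQ API.

```python
# Inclusive latencies in seconds, copied from the table.
inclusive = {"main": 4.0045, "func1": 4.0044, "func2": 2.5026}

# Call chain from main.py: each function's only traced callee (None for a leaf).
callee = {"main": "func1", "func1": "func2", "func2": None}


def exclusive_latency(inclusive, callee):
    """Subtract the callee's inclusive latency to get each function's own time."""
    result = {}
    for name, total in inclusive.items():
        child = callee[name]
        child_total = inclusive[child] if child else 0.0
        result[name] = round(total - child_total, 4)
    return result


self_time = exclusive_latency(inclusive, callee)
print(self_time)
# {'main': 0.0001, 'func1': 1.5018, 'func2': 2.5026}
```

The results line up with the source: func1 spends about 1.5 s in its own `time.sleep(1.5)`, func2 about 2.5 s, and main does almost nothing besides calling func1.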
HiQ just traced the main.py file running without touching one line of its code.
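
To illustrate the general idea of non-intrusive tracing, here is a minimal standard-library sketch that wraps functions of an unmodified module at runtime from a declarative list of names. It is not HiQ's implementation or API; `load_target`, `instrument` and `trace` are hypothetical names, and the sleeps are shortened so the example runs quickly.

```python
import functools
import time
import types

# Stand-in for main.py: the target code contains no tracing logic at all.
SOURCE = """
import time

def func1():
    time.sleep(0.015)
    func2()

def func2():
    time.sleep(0.025)

def main():
    func1()
"""


def load_target(source, name="target_main"):
    """Load the target source into a fresh module object."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    return module


def instrument(module, names, records):
    """Replace each named function with a wrapper that records its latency."""

    def make_wrapper(fn, label):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                records.append((label, time.perf_counter() - start))

        return wrapper

    for name in names:
        setattr(module, name, make_wrapper(getattr(module, name), name))


def trace(source, names):
    """Run main() of the target with the named functions traced."""
    records = []
    module = load_target(source)
    instrument(module, names, records)
    module.main()
    return records


if __name__ == "__main__":
    for label, seconds in trace(SOURCE, ["main", "func1", "func2"]):
        print(f"{label}: {seconds:.4f}s")
```

Because func1 looks up func2 through the module's globals at call time, replacing the module attribute is enough for nested calls to be traced too. Records are appended when a function returns, so the innermost call appears first.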
Documentation
HTML: 🔗 HiQ Online Documents | PDF: Please check 🔗 HiQ User Guide.
Logging: https://hiq.readthedocs.io/en/latest/4_o_advanced.html#log-monkey-king
Tracing: https://hiq.readthedocs.io/en/latest/5_distributed.html
- Zipkin: https://hiq.readthedocs.io/en/latest/5_distributed.html#zipkin
- Jaeger: https://hiq.readthedocs.io/en/latest/5_distributed.html#jaeger
Metrics:
Streaming:
DNN Model Observability & Visualization
HiQ can visualize DNN models. To get the following BERT model's structure, you can just run:
python -m hiq.vis
The graph is self-explanatory. There are several conventions:
- ❄️ means a frozen layer, where requires_grad is False.
- 📈 means a gradient exists for that model parameter, which usually happens after backpropagation.
- +, bold font, and an underscored dotted line mean the displayed layer is a folded version of multiple layers with the same structure.
All you need to do is call print_model(model) in your code. Refer to: here for how to use it.
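
The folding convention above can be pictured with a small standard-library sketch: consecutive layers with the same structure are collapsed into a single line prefixed with `+`. This is only an illustration of the idea, not how HiQ renders models. `fold_layers`, the layer names and the output format are all hypothetical.

```python
from itertools import groupby


def fold_layers(layers):
    """Collapse runs of consecutive layers that share the same structure.

    `layers` is a list of (name, structure) tuples in model order.
    """
    folded = []
    for structure, group in groupby(layers, key=lambda item: item[1]):
        names = [name for name, _ in group]
        if len(names) > 1:
            # A run of identical layers is shown once, marked with "+".
            folded.append(f"+ {names[0]}..{names[-1]} ({len(names)} x {structure})")
        else:
            folded.append(f"{names[0]} ({structure})")
    return folded


if __name__ == "__main__":
    # Hypothetical layer list loosely shaped like a small encoder.
    demo = [
        ("embeddings", "Embedding"),
        ("layer.0", "BertLayer"),
        ("layer.1", "BertLayer"),
        ("layer.2", "BertLayer"),
        ("pooler", "Linear"),
    ]
    for line in fold_layers(demo):
        print(line)
```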
HiQ Web UI
- Main Page
- Latency Details
Jupyter NoteBook
HiQ was originally developed to find Onnxruntime performance bottlenecks in DNN inference, and it works well for other computation-intensive applications too. The following are two examples.
Add Observability to PaddlePaddle (PaddleOCR)
- HiQ Call Graph
Add Observability to Onnxruntime (AlexNet)
Examples
Please check 🔗 examples for usage examples.
Contributing
HiQ welcomes contributions from the community. Before submitting a pull request, please review our [contribution guide](./CONTRIBUTING.md).
Security
Please consult the 🔗 security guide for our responsible security vulnerability disclosure process.
License
Copyright (c) 2022, 2023 Oracle and/or its affiliates. Released under the Universal Permissive License v1.0 as shown at https://oss.oracle.com/licenses/upl/.
Presentation and Demos