Python analytics client for https://github.com/madesroches/micromegas/
Project description
Micromegas
Python analytics client for https://github.com/madesroches/micromegas/
Example usage
Query the most recent 2 log entries from the analytics service
import datetime
import pandas as pd
import micromegas
BASE_URL = "http://localhost:8082/"
client = micromegas.client.Client(BASE_URL)
sql = """
SELECT time, process_id, level, target, msg
FROM log_entries
WHERE level <= 4
AND exe LIKE '%analytics%'
ORDER BY time DESC
LIMIT 2
"""
now = datetime.datetime.now(datetime.timezone.utc)
begin = now - datetime.timedelta(minutes=2)
end = now
client.query(sql, begin, end)
time | process_id | level | target | msg | |
---|---|---|---|---|---|
0 | 2024-10-03 18:17:56.087543714+00:00 | 1db06afc-1c88-47d1-81b3-f398c5f93616 | 4 | acme_telemetry::trace_middleware | response status=200 OK uri=/analytics/query |
1 | 2024-10-03 18:17:53.924037729+00:00 | 1db06afc-1c88-47d1-81b3-f398c5f93616 | 4 | micromegas_analytics::lakehouse::query | query sql= |
SELECT time, process_id, level, target, msg | |||||
FROM log_entries | |||||
WHERE level <= 4 | |||||
AND exe LIKE '%analytics%' | |||||
ORDER BY time DESC | |||||
LIMIT 2 |
Query the 10 slowest top level spans in a trace within a specified time window
sql = """
SELECT begin, end, duration, name
FROM view_instance('thread_spans', '{stream_id}')
WHERE depth=1
ORDER BY duration DESC
LIMIT 10
;""".format(stream_id=stream_id)
client.query(sql, begin_spans, end_spans)
begin | end | duration | name | |
---|---|---|---|---|
0 | 2024-10-03 18:00:59.308952900+00:00 | 2024-10-03 18:00:59.371890+00:00 | 62937100 | FEngineLoop::Tick |
1 | 2024-10-03 18:00:58.752476800+00:00 | 2024-10-03 18:00:58.784389+00:00 | 31912200 | FEngineLoop::Tick |
2 | 2024-10-03 18:00:58.701507300+00:00 | 2024-10-03 18:00:58.731479500+00:00 | 29972200 | FEngineLoop::Tick |
3 | 2024-10-03 18:00:59.766343100+00:00 | 2024-10-03 18:00:59.792513700+00:00 | 26170600 | FEngineLoop::Tick |
4 | 2024-10-03 18:00:59.282902100+00:00 | 2024-10-03 18:00:59.308952500+00:00 | 26050400 | FEngineLoop::Tick |
5 | 2024-10-03 18:00:59.816034500+00:00 | 2024-10-03 18:00:59.841376900+00:00 | 25342400 | FEngineLoop::Tick |
6 | 2024-10-03 18:00:58.897813100+00:00 | 2024-10-03 18:00:58.922769700+00:00 | 24956600 | FEngineLoop::Tick |
7 | 2024-10-03 18:00:59.860637+00:00 | 2024-10-03 18:00:59.885523700+00:00 | 24886700 | FEngineLoop::Tick |
8 | 2024-10-03 18:00:58.630051300+00:00 | 2024-10-03 18:00:58.654871500+00:00 | 24820200 | FEngineLoop::Tick |
9 | 2024-10-03 18:00:57.952279800+00:00 | 2024-10-03 18:00:57.977024+00:00 | 24744200 | FEngineLoop::Tick |
SQL reference
The Micromegas analytics service is built on Apache DataFusion, please see Apache DataFusion SQL Reference for details.
View sets
All view instances in a set have the same schema. Some view instances are global (their view_instance_id is 'global').
Global view instances are implicitly accessible to SQL queries. Non-global view instances are accessible using the table function view_instance
.
log_entries
client.query("DESCRIBE log_entries")
column_name | data_type | is_nullable | |
---|---|---|---|
0 | process_id | Dictionary(Int16, Utf8) | NO |
1 | exe | Dictionary(Int16, Utf8) | NO |
2 | username | Dictionary(Int16, Utf8) | NO |
3 | computer | Dictionary(Int16, Utf8) | NO |
4 | time | Timestamp(Nanosecond, Some("+00:00")) | NO |
5 | target | Dictionary(Int16, Utf8) | NO |
6 | level | Int32 | NO |
7 | msg | Utf8 | NO |
log_entries view instances
The implicit use of the log_entries
table corresponds to the 'global' instance, which contains the log entries of all the processes.
Except the 'global' instance, the instance_id refers to any process_id. view_instance('log_entries', process_id)
contains that process's log. Process-specific views are materialized just-in-time and can provide much better query performance compared to the global instance.
measures
client.query("DESCRIBE measures")
column_name | data_type | is_nullable | |
---|---|---|---|
0 | process_id | Dictionary(Int16, Utf8) | NO |
1 | exe | Dictionary(Int16, Utf8) | NO |
2 | username | Dictionary(Int16, Utf8) | NO |
3 | computer | Dictionary(Int16, Utf8) | NO |
4 | time | Timestamp(Nanosecond, Some("+00:00")) | NO |
5 | target | Dictionary(Int16, Utf8) | NO |
6 | name | Dictionary(Int16, Utf8) | NO |
7 | unit | Dictionary(Int16, Utf8) | NO |
8 | value | Float64 | NO |
measures view instances
The implicit use of the measures
table corresponds to the 'global' instance, which contains the metrics of all the processes.
Except the 'global' instance, the instance_id refers to any process_id. view_instance('measures', process_id)
contains that process's metrics. Process-specific views are materialized just-in-time and can provide much better query performance compared to the 'global' instance.
thread_spans
column_name | data_type | is_nullable | |
---|---|---|---|
0 | id | Int64 | NO |
1 | parent | Int64 | NO |
2 | depth | UInt32 | NO |
3 | hash | Uint32 | NO |
4 | begin | Timestamp(Nanosecond, Some("+00:00")) | NO |
5 | end | Timestamp(Nanosecond, Some("+00:00")) | NO |
6 | duration | Int64 | NO |
7 | name | Dictionary(Int16, Utf8) | NO |
8 | target | Dictionary(Int16, Utf8) | NO |
9 | filename | Dictionary(Int16, Utf8) | NO |
10 | line | UInt32 | NO |
thread_spans view instances
There is no 'global' instance in the 'thread_spans' view set, there is therefore no implicit thread_spans table availble.
Users can call the table function view_instance('thread_spans', stream_id)
to query the spans in the thread associated with the specified stream_id.
processes
client.query("DESCRIBE processes")
column_name | data_type | is_nullable | |
---|---|---|---|
0 | process_id | Utf8 | NO |
1 | exe | Utf8 | NO |
2 | username | Utf8 | NO |
3 | realname | Utf8 | NO |
4 | computer | Utf8 | NO |
5 | distro | Utf8 | NO |
6 | cpu_brand | Utf8 | NO |
7 | tsc_frequency | Int64 | NO |
8 | start_time | Timestamp(Nanosecond, Some("+00:00")) | NO |
9 | start_ticks | Int64 | NO |
10 | insert_time | Timestamp(Nanosecond, Some("+00:00")) | NO |
11 | parent_process_id | Utf8 | NO |
12 | properties | List(Field { name: "Property", data_type: Struct([Field { name: "key", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "value", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }) | NO |
There is only one instance in this view set and it is implicitly available.
streams
client.query("DESCRIBE streams")
column_name | data_type | is_nullable | |
---|---|---|---|
0 | stream_id | Utf8 | NO |
1 | process_id | Utf8 | NO |
2 | dependencies_metadata | Binary | NO |
3 | objects_metadata | Binary | NO |
4 | tags | List(Field { name: "tag", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }) | YES |
5 | properties | List(Field { name: "Property", data_type: Struct([Field { name: "key", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "value", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }) | NO |
6 | insert_time | Timestamp(Nanosecond, Some("+00:00")) | NO |
There is only one instance in this view set and it is implicitly available.
blocks
client.query("DESCRIBE blocks")
column_name | data_type | is_nullable | |
---|---|---|---|
0 | block_id | Utf8 | NO |
1 | stream_id | Utf8 | NO |
2 | process_id | Utf8 | NO |
3 | begin_time | Timestamp(Nanosecond, Some("+00:00")) | NO |
4 | begin_ticks | Int64 | NO |
5 | end_time | Timestamp(Nanosecond, Some("+00:00")) | NO |
6 | end_ticks | Int64 | NO |
7 | nb_objects | Int32 | NO |
8 | object_offset | Int64 | NO |
9 | payload_size | Int64 | NO |
10 | insert_time | Timestamp(Nanosecond, Some("+00:00")) | NO |
11 | streams.dependencies_metadata | Binary | NO |
12 | streams.objects_metadata | Binary | NO |
13 | streams.tags | List(Field { name: "tag", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }) | YES |
14 | streams.properties | List(Field { name: "Property", data_type: Struct([Field { name: "key", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "value", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }) | NO |
15 | processes.start_time | Timestamp(Nanosecond, Some("+00:00")) | NO |
16 | processes.start_ticks | Int64 | NO |
17 | processes.tsc_frequency | Int64 | NO |
18 | processes.exe | Utf8 | NO |
19 | processes.username | Utf8 | NO |
20 | processes.realname | Utf8 | NO |
21 | processes.computer | Utf8 | NO |
22 | processes.distro | Utf8 | NO |
23 | processes.cpu_brand | Utf8 | NO |
24 | processes.insert_time | Timestamp(Nanosecond, Some("+00:00")) | NO |
25 | processes.parent_process_id | Utf8 | NO |
26 | processes.properties | List(Field { name: "Property", data_type: Struct([Field { name: "key", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "value", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }) | NO |
There is only one instance in this view set and it is implicitly available.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for micromegas-0.2.0-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 71f339a1bd8e365ec8ce0b6296782db8b6df1ac011e4cb213ca67143052226e5 |
|
MD5 | 3effee965c1ee0f7bae8d9cc5927d78c |
|
BLAKE2b-256 | 6e84c4985a0eec70d00c8fb79f117fe280cbe3b44b22840dd9ff9e4ba2d29b05 |