WitLog is a lightweight zero-overhead static structured log library for researchers.
Features
- 🎈 Lightweight: No heavy dependencies, plug and play!
- 🎯 Zero-overhead: No overhead if you turn it off!
- 👨🎓 Structured: Use output format defined by you. You master every detail!
- 🔧 Flexible and easy to use: just call logger.log("witness", variables) and WitLog will organize the records automatically.
Installation
Simply run pip install witlog.
If you want to use WitLog in a machine learning project, try pip install witlog[full]. Note that the full version pulls in several "heavy" dependencies.
Usage
Quick Start
import witlog as wl
from witlog import StaticLogger
logger = StaticLogger([
    "layer_idx",
    ["x", "y"] * 5,
    "post_scores",
    ...
], print)  # print is used as the output function, for example
wl.register_logger('CHECK_SCORES', logger)
...  # your code
with wl.monitor('CHECK_SCORES') as logger:
    logger.log('layer_idx', layer)
for i in range(5):
    # do something
    with wl.monitor('CHECK_SCORES') as logger:
        logger.log('x', x)
        logger.log('y', y)
with wl.monitor('CHECK_SCORES') as logger:
    logger.log('post_scores', post_scores)
This example prints all logged variables, because print is passed as output_func.
Let's take a quick look at the signature of StaticLogger:
class StaticLogger:
    def __init__(self, template, output_func):
        ...
template is a list whose format is elaborated later. output_func is a callable that receives a LogBlock object (your logging result). We also offer a PickleTo helper class for saving results; you can access it as wl.PickleTo.
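As an illustration of what an output_func can look like, here is a minimal sketch of two callables: one that collects results in memory and one that appends them to a pickle file. The names collect_to, append_pickle_to and load_all are hypothetical and are not part of WitLog. The sketch only assumes that output_func is called with one picklable object per completed log, and it uses plain dicts as a stand-in for LogBlock. It does not describe how wl.PickleTo works.

```python
import os
import pickle
import tempfile


def collect_to(bucket):
    """Return a callable that appends every received object to `bucket`."""
    def output_func(block):
        bucket.append(block)
    return output_func


def append_pickle_to(path):
    """Return a callable that appends every received object to a pickle file."""
    def output_func(block):
        with open(path, "ab") as f:
            pickle.dump(block, f)
    return output_func


def load_all(path):
    """Read back every object stored by append_pickle_to, in order."""
    items = []
    with open(path, "rb") as f:
        while True:
            try:
                items.append(pickle.load(f))
            except EOFError:
                return items


# Stand-in data: plain dicts instead of real LogBlock objects.
fake_blocks = [{"layer_idx": 0}, {"layer_idx": 1}]

bucket = []
in_memory = collect_to(bucket)
path = os.path.join(tempfile.mkdtemp(), "log.pkl")
on_disk = append_pickle_to(path)

for block in fake_blocks:
    in_memory(block)
    on_disk(block)

restored = load_all(path)
```

Either callable could be passed wherever the quick start passes print, provided the object WitLog hands over is picklable, which is an assumption here.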
The overall workflow is:
- Define your logging format.
- Define your output_func.
- Register the logger.
- Insert monitor and logger.log into your code to inspect what is happening.
monitor returns a default logger if no logger is registered under the given name, and this default logger does nothing.
To disable a logger, simply call wl.remove_logger(name).
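The fallback behavior described above can be pictured with a small toy registry. This is a sketch written from scratch for illustration, not WitLog's source: the function names mirror the ones used in this README, but NullLogger, ListLogger and the internals are assumptions.

```python
from contextlib import contextmanager


class NullLogger:
    """Stand-in for the default logger: accepts log calls and discards them."""
    def log(self, name, value):
        pass


class ListLogger:
    """Toy logger that simply records (name, value) pairs."""
    def __init__(self):
        self.records = []

    def log(self, name, value):
        self.records.append((name, value))


_REGISTRY = {}
_NULL = NullLogger()


def register_logger(name, logger):
    _REGISTRY[name] = logger


def remove_logger(name):
    _REGISTRY.pop(name, None)


@contextmanager
def monitor(name):
    # Fall back to the do-nothing logger when nothing is registered.
    yield _REGISTRY.get(name, _NULL)


toy = ListLogger()
register_logger("CHECK", toy)
with monitor("CHECK") as logger:
    logger.log("x", 1)  # recorded

remove_logger("CHECK")
with monitor("CHECK") as logger:
    logger.log("x", 2)  # silently ignored
    fallback = logger
```

The point of the design is that the instrumented code never has to change: removing the logger is enough to silence it.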
WitLog is designed for inspecting existing code, not for production-level real-time logging. This narrows WitLog's scope of application, but makes it a sharp knife in the research field.
Logging
Main Concepts
WitLog views code as a set of loops (including nested loops), so logs can be structured as blocks. Each block corresponds to a loop in the code.
So, the code below
print(x)
for i in range(K):
    print(y)
print(z)
can be seen as:
[
    item,
    block,
    item
]
Naturally, this leads to the following definitions:
- The unit of a log is a Message. A Message is indivisible.
- Each Block consists of an ordered sequence of Messages and Blocks.
- A Message is a Block.
Hence, we can define the log format as a list:
[
    "name1",
    ["a", "b"] * m,
    "samples"...
]
Since Python code executes in order, messages are logged at run time in the same order as a traversal of this list. Thus, we get a well-structured log.
This has a practical advantage: you ONLY need to specify the name when logging and can ignore the actual tree-like structure. It is as convenient as:
with wl.monitor("SCORES") as logger:
    logger.log('a', a)
This means you don't have to worry about the surrounding code while developing.
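To make the "log by name, structure comes from order" idea concrete, here is a minimal toy logger that walks a nested template in order and rebuilds the nested result once every slot is filled. MiniStaticLogger, flatten and fill are hypothetical names for this sketch. Raising an error on an out-of-order name and representing messages as (name, value) tuples are assumptions, not documented WitLog behavior.

```python
def flatten(template):
    """Yield message names in the order they are expected to be logged."""
    for entry in template:
        if isinstance(entry, list):
            yield from flatten(entry)
        else:
            yield entry


def fill(template, values):
    """Rebuild the nested structure, pulling logged values from an iterator."""
    return [
        fill(entry, values) if isinstance(entry, list) else (entry, next(values))
        for entry in template
    ]


class MiniStaticLogger:
    def __init__(self, template, output_func):
        self.template = template
        self.expected = list(flatten(template))
        self.output_func = output_func
        self.values = []

    def log(self, name, value):
        want = self.expected[len(self.values)]
        if name != want:  # assumption: out-of-order names are an error
            raise ValueError(f"expected {want!r}, got {name!r}")
        self.values.append(value)
        if len(self.values) == len(self.expected):
            # Template fully traversed: emit the structured result and reset.
            self.output_func(fill(self.template, iter(self.values)))
            self.values = []


results = []
logger = MiniStaticLogger(["name1", ["a", "b"] * 2, "samples"], results.append)

logger.log("name1", 0)
for i in range(2):
    logger.log("a", 10 + i)
    logger.log("b", 20 + i)
logger.log("samples", 99)
```

The calling code only ever mentions names; the nesting in results comes entirely from the template.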
In the implementation, Block corresponds to LogBlock and Message corresponds to LogMsg. Both have a content property holding the actual object(s) they store. A LogMsg has a name, but a LogBlock is anonymous.
Post-processing
WitLog's biggest advantage lies in its simple post-processing. You can easily extract the data you need from the structured LogBlock, rather than manually writing cumbersome analysis scripts.
Say you have a LogBlock defined by:
[
    'outer_loop_idx',
    ['inner_x'] * 10,
    'final_flag'
]
and you want to extract the 5th inner_x:
print(block[1][4].content)
If you want the content associated with a unique name, such as final_flag, it is easier and more readable:
print(block['final_flag'].content)
If a name is duplicated, however, name-based indexing returns only the first match, so it is not recommended in that case.
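For duplicated names, one option is a small recursive helper that gathers every match instead of only the first. The sketch below uses stand-in classes Msg and Block rather than the real LogMsg and LogBlock. It assumes only what is stated above (messages have name and content, blocks have content) plus the assumption that a block's content is a list of its children. The helper collect is hypothetical.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Msg:
    """Stand-in for LogMsg: a named, indivisible record."""
    name: str
    content: Any


@dataclass
class Block:
    """Stand-in for LogBlock: an anonymous, ordered list of children."""
    content: List[Any]


def collect(block, name):
    """Return the content of every message called `name`, in logging order."""
    found = []
    for child in block.content:
        if isinstance(child, Block):
            found.extend(collect(child, name))
        elif child.name == name:
            found.append(child.content)
    return found


block = Block([
    Msg("outer_loop_idx", 3),
    Block([Msg("inner_x", i * i) for i in range(10)]),
    Msg("final_flag", True),
])

inner_values = collect(block, "inner_x")
flags = collect(block, "final_flag")
```

With real WitLog objects the attribute names may differ, so treat this as a pattern rather than a drop-in function.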
Timing
witlog.timing (abbreviated as wt) currently provides two ways to measure time and organize the results:
- Use the decorator @wt.timethis(name) on function definitions. We recommend writing it as the outermost decorator.
- Use the context manager wt.timing(name).
All records are aggregated into a list, records, which you can access with wt.get_records(). This function returns a list of (name, duration) pairs, sorted by end time.
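The two approaches can be sketched with the standard library alone. The code below is a from-scratch illustration, not the witlog.timing implementation: it reuses the names timethis, timing and get_records for readability, ignores hooks and CUDA synchronization, and gets "sorted by end time" simply by appending each record when its timed region finishes.

```python
import functools
import time
from contextlib import contextmanager

_records = []


@contextmanager
def timing(name):
    """Time the enclosed block and store (name, duration) when it ends."""
    start = time.perf_counter()
    try:
        yield
    finally:
        _records.append((name, time.perf_counter() - start))


def timethis(name):
    """Decorator form: time every call of the wrapped function under `name`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with timing(name):
                return func(*args, **kwargs)
        return wrapper
    return decorator


def get_records():
    return list(_records)


@timethis("inner_sum")
def inner_sum(n):
    return sum(range(n))


with timing("whole_run"):
    total = inner_sum(1000)

names = [name for name, _ in get_records()]
```

Because the inner call finishes before the enclosing block, its record appears first in the list.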
The advantages of the timing module over cProfile are:
- It can easily attach hook functions. See wt.set_config(config).
- It takes CUDA synchronization into account (requires the full version).
- It allows assigning different names to calls of the same function, thus distinguishing between them.
This last point is particularly useful when analyzing frequently called low-level modules, as these modules usually can't be made faster and can only be optimized in terms of access patterns. Distinguishing names helps you discover different access patterns.
FAQ
- Q: I only want to log when something goes wrong, so the pattern is not a perfect static loop. How should I handle this?
- A: Divide your logger into several small loggers, and ensure that each of them handles a static loop. Notably, a trivial logger, logger = StaticLogger(['single']), can handle any logging pattern of single, because it outputs immediately after receiving one message. So in theory, you can cover any pattern with enough loggers. In practice, you need to balance coding effort against the logging pattern.
Convention
- No expressions in log(key, value). Expressions are always evaluated (even if you remove the registered logger), which can cause unexpected overhead. The best practice is a plain string for key and a bare object for value. A common mistake is passing an f-string. If you really need to combine values, we recommend implementing a custom logger.
- Use with wl.monitor(name) instead of get_logger. The latter works and does save some typing, but with creates a block of code, which is more readable and leaves room for "truly zero-overhead" logging. Although letting a context manager skip the body of a with statement was proposed and rejected (PEP 377), there is still some room for hacking the call stack. That is dangerous, so it will not be integrated into WitLog, but maybe we can achieve this safely some day 😉
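The first convention follows from how Python evaluates call arguments: they are computed before the call, whether or not the receiver uses them. The sketch below demonstrates this with a hypothetical do-nothing logger and a hypothetical expensive_summary function; neither is part of WitLog.

```python
class NullLogger:
    """Stand-in for a disabled logger: every call is discarded."""
    def log(self, key, value):
        pass


calls = 0


def expensive_summary(values):
    """Pretend this is costly; count how often it actually runs."""
    global calls
    calls += 1
    return sum(values) / len(values)


logger = NullLogger()
data = [1.0, 2.0, 3.0]

# Discouraged: both arguments are evaluated before log() is even entered,
# so the work is done although the logger throws the result away.
logger.log("mean", expensive_summary(data))
logger.log(f"mean_of_{len(data)}", data)  # the f-string is formatted too
calls_after_expression = calls

# Preferred: a plain string key and a bare object cost essentially nothing.
logger.log("data", data)
calls_after_plain = calls
```

Here expensive_summary runs once even though nothing is ever logged, which is exactly the hidden overhead the convention warns about.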
Hope WitLog makes your life easier 🌹