WitLog is a lightweight, zero-overhead, static structured logging library for researchers.

Features

  • 🎈 Lightweight: No heavy dependencies, plug and play!
  • 🎯 Zero-overhead: No overhead if you turn it off!
  • 👨‍🎓 Structured: You define the output format and control every detail!
  • 🔧 Flexible and easy to use: Just call logger.log("witness", variables) and Witlog will organize the values automatically.

Installation

Simply pip install witlog

If you want support for machine learning projects, try pip install witlog[full]. The full version pulls in a number of "heavy" dependencies.

Usage

Quick Start

import witlog as wl
from witlog import StaticLogger

# The template lists the names to log, in the order they will be logged.
logger = StaticLogger(
    [
        "layer_idx",
        ["x", "y"] * 5,
        "post_scores",
        ...
    ],
    print,  # output_func; print is used here only as an example
)
wl.register_logger('CHECK_SCORES', logger)

...  # your code

with wl.monitor('CHECK_SCORES') as logger:
    logger.log('layer_idx', layer)

for i in range(5):
    # do something
    with wl.monitor('CHECK_SCORES') as logger:
        logger.log('x', x)
        logger.log('y', y)

with wl.monitor('CHECK_SCORES') as logger:
    logger.log('post_scores', post_scores)

This example prints every logged variable, because print is passed as output_func.

Let's have a quick view of the signature of StaticLogger:

class StaticLogger:
    def __init__(self, template, output_func):
        ...

template is a list; its format is described in the Logging section below. output_func is a callable that receives a LogBlock object (your logging result). We also offer a PickleTo helper class for saving results, available as wl.PickleTo.

The overall workflow is:

  1. Define your logging format.
  2. Define your output_func.
  3. Register the logger.
  4. Insert monitor and logger.log calls into your code to inspect what is happening.

If no logger is registered under the given name, monitor returns a default logger that does nothing.

If you want to disable a logger, it's quite easy: wl.remove_logger(name).

WitLog is designed for inspecting existing code; it is not intended for production-level, real-time logging. This narrows WitLog's scope of application, but makes it a sharp knife for research work.

Logging

Main Concepts

Witlog views code as a collection of loops (including nested loops), so logs can be structured as blocks. Each block corresponds to a loop in the code.

So, the code below

print(x)
for i in range(K):
    print(y)

print(z)

can be seen as:

[
    item,
    block,
    item
]

This naturally leads to the following definition:

  1. The unit of logging is a Message. A Message is indivisible.
  2. Each Block consists of an ordered sequence of Messages and Blocks.
  3. A Message is itself a Block.

Hence, we can define log format as a list:

[
    "name1",
    ["a", "b"] * m,
    "samples"...
]

Since Python code executes in order, log calls at runtime arrive in the same order as a traversal of this list. This is what yields a well-structured log.
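To make the "logging order equals traversal order" idea concrete, here is a small self-contained sketch. It is not WitLog's code: TinyStaticLogger, flatten, and fill are hypothetical names, and the choices to raise an error on an out-of-order name and to emit plain nested lists of (name, value) tuples are assumptions made for this illustration only.

```python
def flatten(template):
    """Yield slot names depth-first, which is the expected logging order."""
    for item in template:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item


def fill(template, values):
    """Rebuild the nested shape of the template from an iterator of values."""
    return [
        fill(item, values) if isinstance(item, list) else (item, next(values))
        for item in template
    ]


class TinyStaticLogger:
    def __init__(self, template, output_func):
        self.template = template
        self.slots = list(flatten(template))
        self.output_func = output_func
        self.values = []

    def log(self, name, value):
        expected = self.slots[len(self.values)]
        if name != expected:
            raise ValueError(f"expected {expected!r}, got {name!r}")
        self.values.append(value)
        if len(self.values) == len(self.slots):
            # Template complete: emit one structured record and start over.
            self.output_func(fill(self.template, iter(self.values)))
            self.values = []


results = []
logger = TinyStaticLogger(["name1", ["a", "b"] * 2, "samples"], results.append)

logger.log("name1", 0)
for i in range(2):
    logger.log("a", i)
    logger.log("b", i * 10)
logger.log("samples", [1, 2, 3])

print(results[0])
```

The caller only ever passes a name and a value; the nesting is recovered from the template once every slot has been filled.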

This has a practical advantage: when logging, you ONLY need to specify the name and can ignore the actual tree-like structure. For example:

with wl.monitor("SCORES") as logger:
    logger.log('a', a)

This means you do not need to worry about how other parts of the code log while you are developing.

In the implementation, Block corresponds to LogBlock and Message corresponds to LogMsg. Both have a content property that holds the actual object(s) they store. A LogMsg has a name, while a LogBlock is anonymous.

Post-processing

Witlog's biggest advantage lies in its simple post-processing. You can easily extract the data you need from the structured LogBlock, rather than manually writing cumbersome analysis scripts.

Say you've got a LogBlock defined by:

[
    'outer_loop_idx',
    ['inner_x'] * 10,
    'final_flag'
]

and you want to extract the 5th inner_x:

print(block[1][4].content)

If you want the content associated with a unique name, such as final_flag, indexing by name is easier and more readable:

print(block['final_flag'].content)

If a name is duplicated, however, indexing by name returns only the first match, so name-based indexing is not recommended for duplicate names.
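The indexing behavior described above can be mimicked with small stand-in classes. Msg, Block, walk, and find_all below are hypothetical and are not WitLog's LogMsg and LogBlock; in particular, the depth-first search into nested blocks for a name is an assumption of this sketch. It shows positional access, first-match name access, and one way to collect every value stored under a duplicated name.

```python
from dataclasses import dataclass, field


@dataclass
class Msg:
    name: str
    content: object


@dataclass
class Block:
    content: list = field(default_factory=list)

    def __getitem__(self, key):
        if isinstance(key, int):
            return self.content[key]
        # String key: return the first message with that name (depth-first).
        for msg in walk(self):
            if msg.name == key:
                return msg
        raise KeyError(key)


def walk(block):
    """Yield every message in the block, descending into nested blocks."""
    for item in block.content:
        if isinstance(item, Block):
            yield from walk(item)
        else:
            yield item


def find_all(block, name):
    """Collect the content of every message with the given name."""
    return [msg.content for msg in walk(block) if msg.name == name]


block = Block([
    Msg("outer_loop_idx", 3),
    Block([Msg("inner_x", i * i) for i in range(10)]),
    Msg("final_flag", True),
])

print(block[1][4].content)           # 5th inner_x, by position
print(block["final_flag"].content)   # unique name
print(block["inner_x"].content)      # duplicate name: first match only
print(find_all(block, "inner_x"))    # every inner_x value
```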

Timing

The witlog.timing module (abbreviated wt below) currently provides two ways to measure time and organize the results:

  1. Use the decorator @wt.timethis(name) on function definitions. We recommend placing it as the outermost decorator.
  2. Use the context manager wt.timing(name).

All records are aggregated into a single list. You can access it with wt.get_records(), which returns a list of (name, duration) tuples sorted by end time.
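The two approaches can be pictured with a self-contained sketch built on time.perf_counter. The functions below reuse the names timethis, timing, and get_records for readability, but they are local stand-ins written for this example, not WitLog's implementation, and they ignore hooks and CUDA synchronization entirely.

```python
import time
from contextlib import contextmanager
from functools import wraps

_records = []


@contextmanager
def timing(name):
    start = time.perf_counter()
    try:
        yield
    finally:
        # Appending on exit keeps the list ordered by end time.
        _records.append((name, time.perf_counter() - start))


def timethis(name):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            with timing(name):
                return func(*args, **kwargs)
        return wrapper
    return decorator


def get_records():
    return list(_records)


@timethis("load")
def load():
    time.sleep(0.01)


with timing("total"):
    load()
    with timing("compute"):
        sum(range(1000))

for name, duration in get_records():
    print(f"{name}: {duration * 1e3:.2f} ms")
```

Because a record is appended when its timed region ends, the enclosing "total" region appears last even though it started first.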

Compared with cProfile, the timing module has the following advantages:

  1. It can easily attach hook functions. See wt.set_config(config).
  2. It takes CUDA synchronization into account (requires the full version).
  3. It allows assigning different names to calls of the same function, so they can be told apart.

The last point is particularly useful when analyzing frequently called low-level modules. These modules usually cannot be made faster themselves and can only be optimized through their access patterns, and distinct names help you discover those patterns.

FAQ

  • Q: I only want to log when something goes wrong, so my code is not a perfectly static loop. How should I handle this?
    • A: Divide your logger into several small loggers, and make sure each one handles a static loop. Note that a trivial logger, logger = StaticLogger(['single']), can handle any logging pattern of single, because it outputs immediately after receiving one message. So in theory, many small loggers can cover any pattern. In practice, you need to balance coding effort against the logging pattern.

Convention

  1. No expressions in log(key, value). Expressions are always evaluated (even after you remove the registered logger), which can cause unexpected overhead. The best practice is a plain string for key and a bare object for value. A common mistake is passing an f-string. If you really need to combine values, we recommend implementing a custom logger.
  2. Use with wl.monitor(name) instead of get_logger. The latter works and does save some typing, but with creates a block of code, which is more readable and allows "truly zero-overhead" logging. Although skipping the body of a with statement was rejected in PEP 343, there is still some room for hacking the call stack. That is dangerous, so it will not be integrated into witlog, but maybe we can achieve it safely some day 😉
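The first convention follows from how Python evaluates call arguments: they are computed before the call happens, so even a logger that does nothing cannot avoid the cost. The sketch below demonstrates this with a hypothetical NullLogger and a counting helper, expensive_summary, both invented for this example.

```python
class NullLogger:
    """Hypothetical disabled logger: log() does nothing."""

    def log(self, key, value):
        pass


calls = 0


def expensive_summary(data):
    """Stand-in for a costly computation; counts how often it runs."""
    global calls
    calls += 1
    return sum(data) / len(data)


logger = NullLogger()
data = [1.0, 2.0, 3.0]

# Discouraged: the f-string and the function call are both evaluated
# before log() is entered, even though log() then discards them.
logger.log(f"mean_{len(data)}", expensive_summary(data))

# Preferred: a plain string key and a bare object cost next to nothing.
logger.log("data", data)

print(calls)
```

The counter shows that the expensive call ran once despite the logger being a no-op, which is exactly the overhead the convention is meant to avoid.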

Hope WitLog makes your life easier 🌹
