Structured logging to JSONL with support for asynchronous logging via asyncio
Project description
grammlog
GrammAcc's structured logging.
A pure-Python package that provides structured logging to JSONL with no dependencies.
Supports asynchronous logging via asyncio. See: Asynchronous Logging
Provides wrappers around the standard `debug`, `info`, `warning`, `error`, and `critical` functions.
Each function accepts a `Logger` as its first argument, so you can provide a custom logger with your own handlers to write structured logs to any target.
This package also provides a `make_logger` convenience function that creates and configures a file-based logger with rotating log files and a size limit.
Installation
pip install grammlog
Basic Usage
>>> import os
>>> import grammlog
>>> logger = grammlog.make_logger("app", log_dir="logs", log_level=grammlog.Level.INFO)
>>> grammlog.info(logger, "application initialized")
>>> grammlog.info(
...     logger,
...     "env vars set",
...     {"env": os.environ},
... )
>>> grammlog.set_env(grammlog_dir="logs", default_grammlog_level=grammlog.Level.INFO)
>>> auth_log = grammlog.make_logger("auth")  # Uses the log_dir and log_level from the env.
>>> try:
...     user_id = 42
...     get_user_if_logged_in(user_id)
... except NameError as e:
...     grammlog.error(
...         auth_log,
...         "User was not logged in",
...         {"user_id": user_id},
...         err=e,
...     )
>>> db_logger = grammlog.make_logger(
...     "db_queries", log_dir="logs/db", log_level=grammlog.Level.ERROR
... )
>>> try:
...     user_name = "invalid"
...     db.query(table="users", user_name=user_name)
... except NameError as e:
...     grammlog.error(
...         db_logger,
...         "Unknown error in db query",
...         {"queried_table": "users", "user_name": user_name},
...         err=e,
...     )
Structured Data
The logging functions all take an arbitrary logger as their first argument, which
allows them to output structured logs to any handlers supported by the stdlib's logging module.
But they also accept a required string `msg` and an optional `details` dictionary, as well as an optional `err` Exception.
These arguments will be merged together into a JSON object that is serialized and written as a single line of JSONL output using the provided logger.
In addition to the data provided in the function arguments, the logged object will also include the following keys:
- `level`: The logging level of the message, e.g. 'DEBUG', 'ERROR'.
- `timestamp`: The timestamp of the logging event in UTC as given by `datetime.datetime.timestamp()`.
If the `err` argument is not `None`, then the logged line will also contain the following keys:
- `err`: The `repr` of the `err` object.
- `traceback`: The formatted traceback of the exception.
The `msg` string will be merged into the object under the `msg` key, and all keys and values in the `details` dict will be merged into the resulting object as-is.
The keys in the `details` dict are assumed to be strings, but the values can be any type. If a value is not JSON-serializable, the logging function will force a string conversion by calling `str(value)` on it. This is applied recursively to any nested dictionaries as well.
The default of calling `str` on the values is fine for logging blobs of dictionary data, but it's usually best to explicitly convert an unsupported value to a JSON-serializable form before logging it so that the logs contain all of the expected information. For example, when logging dates/datetimes, you may want a high-precision POSIX timestamp, or a more human-readable ISO-formatted date string. Converting the value to the desired format before logging it is preferred.
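For instance, a short sketch of both approaches (the logger name and the rendered line below are illustrative, not verbatim library output):
>>> import datetime
>>> import grammlog
>>> logger = grammlog.make_logger("events", log_dir="logs", log_level=grammlog.Level.INFO)
>>> now = datetime.datetime.now(datetime.timezone.utc)
>>> grammlog.info(logger, "user created", {"created_at": now})  # Falls back to str(now).
>>> grammlog.info(logger, "user created", {"created_at": now.isoformat()})  # Explicit ISO string.
The second call yields a line along the lines of {"level": "INFO", "timestamp": 1735689600.0, "msg": "user created", "created_at": "2025-01-01T00:00:00+00:00"}, with the exact key order and timestamp depending on the run.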
Processing Structured Logs
Logging in a structured format like JSON is useful because we can query the logs based on keys and values instead of simply scanning through thousands of lines of text manually or with inaccurate search heuristics.
For example, using the shell tool `jq`, we can filter our logs to only the lines that have an `err` object logged.
cat debug.log | jq 'select(.err != null)'
Or we can get a list of all log entries from the last 42 minutes.
cat debug.log | jq 'select(.timestamp >= (now - 2520))'
Or we can count the number of log entries.
cat debug.log | jq --slurp 'length'
Or we can get an array of (msg, traceback) tuples for all of the ValueErrors that we've logged.
cat debug.log | jq 'select(.err != null) | select(.err | contains("ValueError")) | [.msg, .traceback]'
Or we can use any other combination of query-like filters to examine the exact messages we're concerned with.
This means that if we include queryable keys in our logging calls in the source code, it is easy to find the specific error messages we need to debug all the nasty issues our applications give us at 3am on a Saturday.
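For instance, a hypothetical handler might tag every entry it logs with a correlation id (the request_id key here is illustrative, not something grammlog provides):
>>> import grammlog
>>> logger = grammlog.make_logger("app", log_dir="logs", log_level=grammlog.Level.INFO)
>>> request_id = "abc123"  # Hypothetical id shared by all log entries for one request.
>>> try:
...     raise ValueError("card declined")
... except ValueError as e:
...     grammlog.error(logger, "payment failed", {"request_id": request_id}, err=e)
A query like cat debug.log | jq 'select(.request_id == "abc123")' then pulls out every entry for that one request.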
Asynchronous Logging
There are async versions of each of the logging functions, as well as `register_async_logger` and `deregister_async_logger` functions:
- `async_debug`
- `async_info`
- `async_warning`
- `async_error`
- `async_critical`
- `register_async_logger`
- `deregister_async_logger`
The async logging functions only work with a logger that has been registered. This is because the logging calls themselves are synchronous, and they need to be queued in order to run them concurrently with other tasks in the event loop. Registering a logger to be used asynchronously doesn't mutate the logger in any way, and async loggers are still the same `logging.Logger` objects. Calling `register_async_logger` simply creates an `asyncio.Queue` and a `Task` to run the synchronous logging functions in the background.
The async queues are managed internally by this package and run the logging events on the event loop in the background. This means that a call like `await async_info(logger, msg)` doesn't actually wait until the message is logged; it puts an event into the queue to be logged in the background at the discretion of the event loop's task scheduler. Consequently, `deregister_async_logger` needs to be called on any loggers registered as async before application shutdown in order to guarantee all log messages are flushed to their targets. Failing to deregister a logger will not raise any errors, but it may result in pending log messages being lost.
Just as any logger with any handlers can be used with the sync functions for structured logging to any target, any logger can be registered as an async logger by passing it into the `register_async_logger` function. That does not mean that registering another library's logger will cause that library's logging events to run asynchronously. The asynchronous logging only works if the `async_*` functions are used. Registering a logger that you don't control will only add overhead, because the empty task takes CPU cycles away from other background work on the event loop.
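A minimal sketch of async usage, assuming `register_async_logger` is a plain function and `deregister_async_logger` is awaitable so it can flush the queue (check the API docs for the exact signatures):
import asyncio
import grammlog

async def main() -> None:
    logger = grammlog.make_logger("app", log_dir="logs", log_level=grammlog.Level.INFO)
    grammlog.register_async_logger(logger)  # Creates the internal queue and background task.
    # Enqueues the event and returns immediately; the write happens in the background.
    await grammlog.async_info(logger, "application initialized")
    # Flush any pending messages before shutdown so none are lost.
    await grammlog.deregister_async_logger(logger)

asyncio.run(main())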
Async Performance Considerations
Using async logging won't make your logging any faster. Because writing the actual log messages is synchronous, excessive logging will still cause a CPU-bound bottleneck in your application. However, if you are using asyncio already, using async logging should make your code more efficient by giving the event loop a chance to start other background tasks in between registering a logging event and actually logging the message. In other words, the main benefit is not to make logging more efficient but instead to make sure the event loop can keep as many concurrent tasks running as possible.
One thing to consider with respect to the async event loop is the size limit for the logging queues. The queues will not block the event loop from running other tasks regardless of the size limit, but there are tradeoffs to consider.
Due to the way `asyncio.Queue` works, when the queue is full, a task awaiting `put` will keep yielding execution to other tasks until an item is removed from the queue. This means that in situations where the application is performing excessive logging due to some unforeseen usage pattern or a programming oversight, the size limit on the queue will help to throttle the CPU usage of the logging events by not enqueuing more events until the oldest one is evicted. This gives the event loop more chances to start other tasks, such as handling a request through the web framework or sending an async API response to the frontend. The logging events will still hog the CPU while they are running, but the size limit maximizes the chances the application has to start other IO-bound tasks in between logging events. The flipside is that if the async logging call happens inside a handler for a request, or before sending a response to the client, then that entire coroutine will wait until there is space in the queue to add another logging event. For this reason, some applications may want a large size limit on their logging queues, but it is very unlikely that waiting for space in the queue would cause a more significant slowdown than the CPU load that an unbounded queue would allow the logging events to accumulate.
When in doubt, profile.
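The throttling behavior described above is plain asyncio.Queue semantics, and it can be observed in isolation with a toy producer/consumer that has nothing to do with grammlog:
import asyncio

async def producer(queue: asyncio.Queue) -> None:
    for i in range(5):
        # With maxsize=1, put() suspends once the queue is full and yields
        # to other tasks until the consumer makes room.
        await queue.put(i)
        print(f"enqueued {i}")

async def consumer(queue: asyncio.Queue) -> None:
    while True:
        item = await queue.get()
        await asyncio.sleep(0.1)  # Simulate the slow, synchronous write.
        print(f"logged {item}")
        queue.task_done()

async def main() -> None:
    queue: asyncio.Queue = asyncio.Queue(maxsize=1)
    worker = asyncio.create_task(consumer(queue))
    await producer(queue)
    await queue.join()  # Wait until every queued item has been processed.
    worker.cancel()     # Analogous to deregistering: stop the background worker.

asyncio.run(main())
The interleaved output shows each enqueue waiting on the previous item being consumed, which is exactly the throttling effect a bounded logging queue provides.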