Chronologer
Python HTTP logging server
Chronologer is a counterpart of Python stdlib’s logging HTTPHandler [1]. It provides a RESTful API that accepts Python logging HTTP POST requests, and a UI for browsing and searching the logs. The idea is the same as for database backends of logging software, like rsyslog-mysql [2].
UI features are described in the frontend branch.
Scope
Chronologer is meant for small and medium-sized logging workloads. In practice it is limited by its backend’s write throughput. With the MySQL backend, vertical scaling can satisfy many types of applications. It is especially useful in microservice architectures where file logging isn’t practical.
Installation
Chronologer is available as a Python package [3] and as a Docker image [4]. The former can be installed with pip install chronologer. The latter can be used as in the following docker-compose.yml for deployment with a MySQL database.
version: '2.1'
services:
  web:
    image: saaj/chronologer
    environment:
      CHRONOLOGER_STORAGE_DSN: mysql://chronologer:pass@mysql/chronologer
    depends_on:
      mysql:
        condition: service_healthy
    ports:
      - 8080:8080
  mysql:
    image: mysql:5.7
    environment:
      MYSQL_DATABASE: chronologer
      MYSQL_USER: chronologer
      MYSQL_PASSWORD: pass
      MYSQL_ROOT_PASSWORD: pass
    healthcheck:
      test: mysqladmin ping --protocol=tcp --password=pass --silent
      interval: 5s
      retries: 4
It can be run as follows, where the second command applies database migrations:

docker-compose up -d
docker-compose run --rm web python -m chronologer -e production migrate
Chronologer’s configuration can be fine-tuned with several environment variables defined in envconf [5]. The default envconf can be overridden completely; see python -m chronologer --help.
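For example, the storage DSN from the compose file above can equally be set in the shell before applying migrations (a sketch, assuming a MySQL instance reachable on localhost):

export CHRONOLOGER_STORAGE_DSN=mysql://chronologer:pass@localhost/chronologer
python -m chronologer -e production migrate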
Logging handler
A naive imperative logging configuration looks like this:
import logging.handlers

chrono = logging.handlers.HTTPHandler(
    'localhost:8080', '/api/v1/record', 'POST', credentials = ('logger', ''))
handlers = [logging.StreamHandler(), chrono]
logging.basicConfig(level = logging.DEBUG, handlers = handlers)
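With this in place an ordinary log call goes both to the console and to Chronologer; a mapping passed as extra ends up in the record’s data (see the Record examples in the API section below):

import logging

logging.getLogger('some.module').info('Et quoniam eadem...', extra = {'foo': 387})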
The same can be expressed declaratively:
import logging.config

conf = {
    'version' : 1,
    'disable_existing_loggers' : False,
    'handlers' : {
        'console' : {
            'class' : 'logging.StreamHandler',
        },
        'http' : {
            'class' : 'logging.handlers.HTTPHandler',
            'host' : 'localhost:8080',
            'url' : '/api/v1/record',
            'method' : 'POST',
            'credentials' : ('logger', ''),
            'secure' : False
        },
    },
    'root' : {
        'handlers' : ['console', 'http'],
        'level' : 'DEBUG'
    }
}
logging.config.dictConfig(conf)
This configuration is called naive because the handler is blocking. It may work in simple cases, but it is generally discouraged because the network is not reliable [6]. Instead, Python provides logging queueing in the stdlib [7]:
Along with the QueueHandler class, QueueListener can be used to let handlers do their work on a separate thread from the one which does the logging. This is important in Web applications and also other service applications where threads servicing clients need to respond as quickly as possible, while any potentially slow operations (such as sending an email via SMTPHandler) are done on a separate thread.
An imperative configuration with in-memory queueing follows.
import queue
import logging
import logging.handlers

chrono = logging.handlers.HTTPHandler(
    'localhost:8080', '/api/v1/record', 'POST', credentials = ('logger', ''))
q = queue.Queue(maxsize = 4096)
qh = logging.handlers.QueueHandler(q)
ql = logging.handlers.QueueListener(q, chrono)
ql.start()
handlers = [logging.StreamHandler(), qh]
logging.basicConfig(level = logging.DEBUG, handlers = handlers)

# somewhere on shutdown
ql.stop()
Because the queue listener’s shutdown procedure is inconvenient this way, and such a setup is hard to express declaratively, the following QueueProxyHandler is suggested.
import logging.handlers
import logging.config


class QueueProxyHandler(logging.handlers.QueueHandler):
    '''Queue handler which creates its own ``QueueListener`` to
    proxy log records via provided ``queue`` to ``target`` handler.'''

    _listener = None
    '''Queue listener'''

    def __init__(self, queue, target = logging.handlers.HTTPHandler, **kwargs):
        # user-supplied factory is not converted by default
        if isinstance(queue, logging.config.ConvertingDict):
            queue = queue.configurator.configure_custom(queue)
        super().__init__(queue)
        self._listener = logging.handlers.QueueListener(queue, target(**kwargs))
        self._listener.start()

    def close(self):
        super().close()
        self._listener.stop()


conf = {
    'version' : 1,
    'disable_existing_loggers' : False,
    'handlers' : {
        'console' : {
            'class' : 'logging.StreamHandler',
        },
        'http' : {
            'class' : 'somemodule.QueueProxyHandler',
            'queue' : {'()': 'queue.Queue', 'maxsize': 4096},
            'host' : 'localhost:8080',
            'url' : '/api/v1/record',
            'method' : 'POST',
            'credentials' : ('logger', ''),
            'secure' : False
        },
    },
    'root' : {
        'handlers' : ['console', 'http'],
        'level' : 'DEBUG'
    }
}
logging.config.dictConfig(conf)
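A quick check that records now flow through the queue. At interpreter exit logging.shutdown(), which the logging module registers with atexit, closes all handlers and thereby stops the listener thread through QueueProxyHandler.close(); it can also be called explicitly:

import logging

logging.getLogger('some.module').warning('Delivered asynchronously via the queue')
logging.shutdown()  # closes handlers; QueueProxyHandler.close() stops the listener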
API
By default Chronologer listens on port 8080 and is protected by HTTP Basic Authentication with username “logger” and an empty password (see the environment variables above to override these).
Chronologer provides a Record resource.
Create record
URL: /api/v1/record
Method: POST
Request content-type: application/x-www-form-urlencoded
Request body: logging.LogRecord representation
Response content-type: application/json
Response body: new Record’s identifier
Successful response code: 201 Created
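stdlib’s HTTPHandler sends the urlencoded attribute dictionary of a logging.LogRecord. For manual testing the request can be reproduced like in the following sketch:

import base64
import logging
import urllib.parse
import urllib.request

record = logging.LogRecord(
    'some.module', logging.INFO, 'logtest.py', 20, 'Et quoniam eadem...', None, None)
request = urllib.request.Request(
    'http://localhost:8080/api/v1/record',
    data = urllib.parse.urlencode(record.__dict__).encode(),
    method = 'POST')
# default credentials: username "logger", empty password
request.add_header('Authorization', 'Basic ' + base64.b64encode(b'logger:').decode())
with urllib.request.urlopen(request) as response:
    print(response.status, response.read())  # 201 and the new record's identifier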
Retrieve record count
URL: /api/v1/record
Method: HEAD
Query string: optional filtering fields (see Filtering below)
Response headers: carry the record count
Successful response code: 200 OK
Retrieve record timeline
URL: /api/v1/record
Method: HEAD
Query string: required fields, plus optional filtering fields (see Filtering below)
Response headers: carry the record timeline
Successful response code: 200 OK
Retrieve record range
URL: /api/v1/record
Method: GET
Query string: required fields, plus optional filtering fields (see Filtering below)
Response content-type: application/json
Response body:

[
  {
    "name": "some.module",
    "ts": "2018-05-10 16:36:53.377493+00:00",
    "message": "Et quoniam eadem...",
    "id": 177260,
    "level": 20
  },
  ...
]

Successful response code: 200 OK
Retrieve record
URL: /api/v1/record/{id}
Method: GET
Response content-type: application/json
Response body:

{
  "name": "some.module",
  "logrec": {
    "data": {
      "foo": 387
    },
    "meta": {
      "process": 29406,
      "module": "some.module",
      "relativeCreated": 103.23762893676758,
      "msecs": 376.4379024505615,
      "pathname": "logtest.py",
      "msg": "Et quoniam eadem...",
      "stack_info": null,
      "processName": "MainProcess",
      "filename": "logtest.py",
      "thread": 140312867051264,
      "threadName": "MainThread",
      "lineno": 20,
      "funcName": "main",
      "args": null
    }
  },
  "id": 177260,
  "level": 20,
  "message": "Et quoniam eadem...",
  "ts": "2018-05-10 16:36:53.377493+00:00"
}

logrec has two nested dictionaries: data contains what was passed to extra [16], and meta contains the internal fields of logging.LogRecord.

Successful response code: 200 OK
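A minimal retrieval sketch using stdlib urllib, assuming the default credentials and the record identifier from the example above:

import json
import urllib.request

# default credentials: username "logger", empty password
password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, 'http://localhost:8080', 'logger', '')
opener = urllib.request.build_opener(urllib.request.HTTPBasicAuthHandler(password_mgr))
with opener.open('http://localhost:8080/api/v1/record/177260') as response:
    record = json.load(response)
print(record['logrec']['data'])  # {'foo': 387}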
Filtering
Filter fields have the following semantics:
after – ISO 8601 timestamp. The predicate is true for a record created after the given timestamp.
before – ISO 8601 timestamp. The predicate is true for a record created before the given timestamp.
level – integer logging level. The predicate is true for a record whose severity level is greater than or equal to the given level.
name – logger name prefix, optionally a comma-separated list of prefixes. The predicate is true for a record whose logger name starts with any of the given prefixes.
query – storage-specific expression. See the JSON path description below.
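For instance, these fields combine into an ordinary query string. A sketch follows; names come from the examples in this document, other.module is an illustrative second prefix, and any endpoint-specific required fields shown above must be added as well:

from urllib.parse import urlencode

print('/api/v1/record?' + urlencode({
    'level' : 20,                           # INFO and above
    'name'  : 'some.module,other.module',   # comma-separated logger name prefixes
    'after' : '2018-05-10T00:00:00+00:00',  # ISO 8601 timestamp
    'query' : "logrec->'$.data.foo' = 387"  # storage-specific expression (MySQL)
}))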
MySQL
Chronologer relies on a compressed InnoDB table, which provides a good compromise between reliability, data modelling, search features, performance and size of logging data. The data of logging records is written into the logrec JSON field (see the initial migration [9] and the examples above).

Because writes to the table are currently immediate, it’s recommended to let MySQL batch them by setting innodb_flush_log_at_trx_commit = 0 [10]. Disabling the performance schema [11] by setting performance_schema = 0 is also recommended, because it has significant overhead. Basic InnoDB settings should also be reasonably configured, as in the sketch below.
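A minimal my.cnf sketch combining the settings above; the buffer pool size is an illustrative value that should be tuned to the host:

[mysqld]
innodb_flush_log_at_trx_commit = 0
performance_schema = 0
innodb_buffer_pool_size = 1G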
JSON path query
query passes a storage-specific expression. In particular, it’s useful for writing post-filtering conditions on the logrec JSON field using JSON path expressions and the -> operator [15]. It may look like the following, though arbitrary WHERE clause expressions are possible.
"logrec->'$.data.foo' = 387 AND logrec->'$.meta.lineno' = 20"
"logrec->'$.meta.threadName' != 'MainThread'"
R&D roadmap
See the roadmap issue.