Skip to main content

Python HTTP logging server

Project description

https://img.shields.io/badge/dynamic/json.svg?url=https://api.bitbucket.org/2.0/repositories/saaj/chronologer/pipelines/?page=1%26pagelen=1%26sort=-created_on%26target.ref_name=backend&label=build&query=$.values[0].state.result.name&colorB=blue https://codecov.io/bitbucket/saaj/chronologer/branch/backend/graph/badge.svg https://badge.fury.io/py/Chronologer.svg https://images.microbadger.com/badges/image/saaj/chronologer.svg

Chronologer

Chronologer

Chronologer is a counterpart of Python stdlib’s logging HTTPHandler [1] and more. It provides RESTful API for accepting Python logging HTTP POST requests and optional UI for browsing and searching the logs. The idea is the same as for database backends of logging software, like rsyslog-mysql [2].

UI features are described in the frontend branch and released as ChronologerUI [17] package.

Scope

Chronologer is meant for small- and medium-sized logging workloads and audit logging. Practically it’s limited by its backend’s write throughput and capacity. In case of MySQL backend vertical scaling can suffice many types of applications.

Especially it’s useful for microservice architecture where file logging is no longer practical.

Installation

Chronologer is available as a Python package [3] and as a Docker image [4] (includes UI). The former can be installed like pip install chronologer[server,mysql,ui]. The latter can be used like in the following docker-compose.yml for deployment with MySQL database.

version: '2.1'
services:
  web:
    image: saaj/chronologer
    environment:
      CHRONOLOGER_STORAGE_DSN: mysql://chronologer:pass@mysql/chronologer
    depends_on:
      mysql:
        condition: service_healthy
    ports:
      - 8080:8080

  mysql:
    image: mysql:5.7
    environment:
      MYSQL_DATABASE: chronologer
      MYSQL_USER: chronologer
      MYSQL_PASSWORD: pass
      MYSQL_ROOT_PASSWORD: pass
    healthcheck:
      test: mysqladmin ping --protocol=tcp --password=pass --silent
      interval: 5s
      retries: 4

It can be run like the following. The second line applies database migrations.

docker-compose up -d
docker-compose run --rm web python -m chronologer -e production migrate

Chronologer’s configuration can be fine-tuned with several environment variables defined in envconf [5]. Default envconf can be overridden completely, see python -m chronologer --help.

Logging handler

The naive imperative logging configuration looks like:

import logging.handlers

chrono = logging.handlers.HTTPHandler(
  'localhost:8080', '/api/v1/record', 'POST', credentials = ('logger', ''))
handlers = [logging.StreamHandler(), chrono]
logging.basicConfig(level = logging.DEBUG, handlers = handlers)

The same can be expressed declaratively:

import logging.config

conf = {
  'version'                  : 1,
  'disable_existing_loggers' : False,
  'handlers'                 : {
    'console' : {
      'class' : 'logging.StreamHandler',
    },
    'http' : {
      'class'       : 'logging.handlers.HTTPHandler',
      'host'        : 'localhost:8080',
      'url'         : '/api/v1/record',
      'method'      : 'POST',
      'credentials' : ('logger', ''),
      'secure'      : False
    },
  },
  'root' : {
    'handlers' : ['console', 'http'],
    'level'    : 'DEBUG'
  }
}
logging.config.dictConfig(conf)

This configuration is called naive because the handler is blocking. It may work in simple cases but generally it’s discouraged because the network is not reliable [6]. Instead Python provides logging queueing in stdlib [7]:

Along with the QueueHandler class, QueueListener can be used to let handlers do their work on a separate thread from the one which does the logging. This is important in Web applications and also other service applications where threads servicing clients need to respond as quickly as possible, while any potentially slow operations (such as sending an email via SMTPHandler) are done on a separate thread.

Here follows imperative configuration with memory queueing.

chrono = logging.handlers.HTTPHandler(
  'localhost:8080', '/api/v1/record', 'POST', credentials = ('logger', ''))
q = queue.Queue(maxsize = 4096)
qh = logging.handlers.QueueHandler(q)
ql = logging.handlers.QueueListener(q, chrono)
ql.start()
handlers = [logging.StreamHandler(),  qh]
logging.basicConfig(level = logging.DEBUG, handlers = handlers)

# somewhere on shutdown
ql.stop()

Because the queue listener’s shutdown procedure is inconvenient this way and it’s hard to express declaratively, QueueProxyHandler is suggested.

import logging.handlers
import logging.config


class QueueProxyHandler(logging.handlers.QueueHandler):
  '''Queue handler which creates its own ``QueueListener`` to
  proxy log records via provided ``queue`` to ``target`` handler.'''

  _listener = None
  '''Queue listener'''


  def __init__(self, queue, target = logging.handlers.HTTPHandler, **kwargs):
    # user-supplied factory is not converted by default
    if isinstance(queue, logging.config.ConvertingDict):
      queue = queue.configurator.configure_custom(queue)

    super().__init__(queue)
    self._listener = logging.handlers.QueueListener(queue, target(**kwargs))
    self._listener.start()

  def close(self):
    super().close()
    self._listener.stop()

conf = {
  'version'                  : 1,
  'disable_existing_loggers' : False,
  'handlers'                 : {
    'console' : {
      'class' : 'logging.StreamHandler',
    },
    'http' : {
      'class'       : 'somemodule.QueueProxyHandler',
      'queue'       : {'()': 'queue.Queue', 'maxsize': 4096},
      'host'        : 'localhost:8080',
      'url'         : '/api/v1/record',
      'method'      : 'POST',
      'credentials' : ('logger', ''),
      'secure'      : False
    },
  },
  'root' : {
    'handlers' : ['console', 'http'],
    'level'    : 'DEBUG'
  }
}
logging.config.dictConfig(conf)

Client

For convenience reasons, the above is available as chronologer.client.QueueProxyHandler.

In addition it has logger name prefixing and suffixing capability. prefix is passed to QueueProxyHandler on creation. It allows many applications logging into the same Chronologer instance to have separate logger namespaces (e.g. including aiohttp logging whose namespace is fixed). suffix is an extra attribute of LogRecord which allows to fine-tune the logger namespace for easier search of the records.

import logging.config


conf = {
  'version'                  : 1,
  'disable_existing_loggers' : False,
  'handlers'                 : {
    'console' : {
      'class' : 'logging.StreamHandler',
    },
    'http' : {
      'class'       : 'chronologer.client.QueueProxyHandler',
      'queue'       : {'()': 'queue.Queue', 'maxsize': 4096},
      'prefix'      : 'appname',
      'host'        : 'localhost:8080',
      'url'         : '/api/v1/record',
      'method'      : 'POST',
      'credentials' : ('logger', ''),
      'secure'      : False
    },
  },
  'root' : {
    'handlers' : ['console', 'http'],
    'level'    : 'DEBUG'
  }
}
logging.config.dictConfig(conf)

logging.getLogger('some').info(
  'Chronologer!', extra = {'suffix': 'important.transfer'})

The LogRecord corresponding to the last line will have name equal to 'appname.some.important.transfer'. If name is modified the original is saved as origname.

JSON input support

Besides application/x-www-form-urlencoded of HTTPHandler Chronologer supports application/json of the same structure. It also supports application/x-ndjson [19] for bulk ingestion.

JSON of arbitrary structure can ingested in the raw mode. In the mode Chronologer will not classify input on logging meta, data and error and will not insist on presence of Python logging-specific keys. For example, a file containing newline separated JSON entries can be sent to Chronologer like:

curl -H "content-type: application/x-ndjson" --user logger: \
  --data-binary @/path/to/some/file.ndjson localhost:8080/api/v1/record?raw=1

Record retention

When CHRONOLOGER_RETENTION_DAYS is set, daily, around midnight a background thread will purge records older than given number of days.

Authentication

Chronologer does not provide (neither intends to) a user management. The intent is to delegate authentication. The credentials and roles used by the server can be provided by the following environment variables:

  • CHRONOLOGER_USERNAME

  • CHRONOLOGER_PASSWORD

  • CHRONOLOGER_ROLES ­– space separated role list (see below)

Alternatively a JSON file located by CHRONOLOGER_AUTHFILE of the following structure can be used to authenticate multiple users:

[
  {
    "username": "bob",
    "pbkdf2": "f57ef1e3e8f90cb367dedd44091f251b5b15c9c36ddd7923731fa7ee41cbaa82",
    "hashname": "sha256",
    "salt": "c0139cff",
    "iterations": 32,
    "roles": ["writer"]
  }, {
    "username": "obo",
    "pbkdf2": "ff680a9237549f698da5345119dec1ed314eb4fdefe59837d0724d747c3169089ae45...",
    "hashname": "sha384",
    "salt": "9230dbdd5a13f009",
    "iterations": 4096,
    "roles": ["basic-reader", "query-reader"]
  }
]

The value of pbkdf2 and keys hashname, salt, iterations correspond to Python hashlib.pbkdf2_hmac [21].

Authorisation

Chronologer defines the following roles:

  • basic-reader allows HEAD and GET to /api/v1/record

  • query-reader in combination with basic-reader allows the use query, SQL expression, to (further) filter the records

  • writer allows POST to /api/v1/record

The UI (in case chronologerui is installed) is available to every authenticated user.

API

By default Chronologer listens port 8080 and is protected by HTTP Basic Authentication, username “logger” without password (see environment variables to override these).

Chronologer provides Record resource.

Create record

URL

/api/v1/record

Method

POST

Request content-type

application/x-www-form-urlencoded, application/json, application/x-ndjson

Request body

Representation of logging.LogRecord

Response content-type

application/json

Response body

Representation of created model.Record, except for application/x-ndjson input where only a list of insert record identifiers is returned

Successful response code

201 Created

Optional raw mode, accepting arbitrary JSON documents, is supported by passing raw=1 into the query string.

application/x-ndjson request body can produce 207 Multi-Status response when a successful chunk is followed by a failed chunk, say that contained malformed a JSON line. Multi-status body looks like:

{
  "multistatus": [
    {"status": 201, "body": [1, 2, ...]},
    {"status": 400, "body": "Invalid JSON document on line 2012"},
  ]
}

Retrieve record count

URL

/api/v1/record

Method

HEAD

Query string

Optional filtering fields (see details below):

  • after – ISO8601 timestamp

  • before – ISO8601 timestamp

  • level – integer logging level

  • name – logging record prefix(es)

  • query – storage-specific expression

Response headers

  • X-Record-Count: 42

Successful response code

200 OK

Retrieve record timeline

URL

/api/v1/record

Method

HEAD

Query string

Required fields:

  • group – “day” or “hour”

  • timezonepytz-compatible one

Optional filtering fields (see details below):

  • after – ISO8601 timestamp

  • before – ISO8601 timestamp

  • level – integer logging level

  • name – logging record prefix(es)

  • query – storage-specific expression

Response headers

  • X-Record-Count: 90,236

  • X-Record-Group: 1360450800,1360537200

Successful response code

200 OK

Retrieve record range

URL

/api/v1/record

Method

GET

Query string

Required fields:

  • left – left offset in the result set

  • right – right offset in the result set

Optional filtering fields (see details below):

  • after – ISO8601 timestamp

  • before – ISO8601 timestamp

  • level – integer logging level

  • name – logging record prefix(es)

  • query – storage-specific expression

Response content-type

application/json

Response body

[
  {
    "name": "some.module",
    "ts": "2018-05-10 16:36:53.377493+00:00",
    "message": "Et quoniam eadem...",
    "id": 177260,
    "level": 20
  },
  ...
]

Successful response code

200 OK

Retrieve record

URL

/api/v1/record/{id}

Method

GET

Response content-type

application/json

Response body

{
  "name": "some.module",
  "logrec": {
    "data": {
      "foo": 387
    },
    "meta": {
      "process": 29406,
      "module": "some.module",
      "relativeCreated": 103.23762893676758,
      "msecs": 376.4379024505615,
      "pathname": "logtest.py",
      "msg": "Et quoniam eadem...",
      "stack_info": null,
      "processName": "MainProcess",
      "filename": "logtest.py",
      "thread": 140312867051264,
      "threadName": "MainThread",
      "lineno": 20,
      "funcName": "main",
      "args": null
    }
  },
  "id": 177260,
  "level": 20,
  "message": "Et quoniam eadem...",
  "ts": "2018-05-10 16:36:53.377493+00:00"
}

logrec has two nested dictionaries. data has what was passed to extra [16] and meta has internal fields of logging.LogRecord.

Successful response code

200 OK

Error representation

Errors for HTTP method requests that allow a response body are represented like:

{
  "error" : {
    "type"    : "HTTPError",
    "message" : "Nothing matches the given URI"
  }
}

Errors for HTTP method requests that don’t allow a response body are represented in the headers:

  • X-Error-Type: StorageQueryError

  • X-Error-Message: Make sure the query filter is a valid WHERE expression

Filtering

Filter fields have the following semantics:

  • after – ISO8601 timestamp. The predicate is true for a record which was created after given timestamp.

  • before – ISO8601 timestamp. The predicate is true for a record which was created before given timestamp.

  • level – integer logging level. The predicate is true for a record whose severity level is greater or equal to given level.

  • name – logging record prefix. Optionally can be a comma-separated list of prefixes. The predicate is true for a record whose logger name starts with any of given prefixes.

  • query – storage-specific expression. See JSON path description below.

MySQL

Chronologer relies on a compressed InnoDB table which provides good compromise between reliability, data modelling, search features, performance and size of logging data. The data of logging records are written into logrec JSON field (see the initial migration [9] and examples above).

Because currently there’s immediate write to the table, it’s recommended to allow MySQL to batch writes by setting innodb_flush_log_at_trx_commit = 0 [10]. Disabling performance schema [11] by setting performance_schema = 0 is also recommended, because it has significant overhead. Basic InnoDB settings should be reasonably configured:

  • innodb_buffer_pool_size [12]

  • innodb_log_buffer_size [13]

  • innodb_log_file_size [14]

It is a good idea to have dedicated MySQL instance for Chronologer.

JSON path query

query passes a storage-specific expression. Particularly, it’s useful to write post-filtering conditions for logrec JSON field using JSONPath expressions and -> operator [15]. It may look like the following, though arbitrary WHERE clause expressions are possible.

  • "logrec->'$.data.foo' = 387 AND logrec->'$.meta.lineno' = 20"

  • "logrec->'$.meta.threadName' != 'MainThread'"

Note that connection to MySQL works in ANSI_QUOTES mode [18], so " cannot be used to form string literals. ' must be used instead.

SQLite

SQLite is supported for very simple, one-off or evaluation cases. Also it doesn’t support compression. JSON1 extension [20] is required for JSON Path queries.

  • "json_extract(logrec, '$.data.foo') = 387 AND json_extract(logrec, '$.meta.lineno') = 20"

  • "json_extract(logrec, '$.meta.threadName') = 'MainThread'"

A one-off Chronologer container with SQLite storage can be run on port 8080 like:

docker run -d -p 8080:8080 --name chronologer -v /tmp \
  -e CHRONOLOGER_STORAGE_DSN=sqlite:////tmp/db.sqlite  saaj/chronologer
docker exec chronologer python -m chronologer -e production migrate
docker exec chronologer chown www-data:www-data /tmp/db.sqlite

Later, when the task is done, the container can be cleaned up like:

docker rm -fv chronologer

R&D roadmap

See the roadmap issue.

Credits

Logo is contributed by lightypaints.


Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

Chronologer-0.3.0.tar.gz (38.6 kB view hashes)

Uploaded Source

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page