Skip to main content

Middleware for setting and propagating correlation/request IDs

Project description

pypi test codecov

ASGI Correlation ID middleware

Middleware for loading or generating correlation IDs, making it simple to correlate logs from a single HTTP request.

Correlation IDs are loaded when the middleware detects a correlation ID HTTP header in an incoming request. When no header is found, a correlation ID is generated for the request instead.

The header key name is customizable. Platforms like Heroku use X-Request-ID as their standard (source), but X-Correlation-ID is also commonly used.

Once a correlation ID has been set or generated, it's added to your project logs with a log filter.

Table of contents

Installation

pip install asgi-correlation-id

Configuring the middleware

To set up the package, you need to add the middleware and configure logging.

1/3 Setting up the middleware

The middleware can be added like this

from fastapi import FastAPI
from starlette.middleware import Middleware

from asgi_correlation_id import CorrelationIdMiddleware


app = FastAPI(middleware=[Middleware(CorrelationIdMiddleware)])

or like this

from fastapi import FastAPI

from asgi_correlation_id import CorrelationIdMiddleware


app = FastAPI()
app.add_middleware(CorrelationIdMiddleware)

or any other way your framework allows.

For Starlette apps, just substitute FastAPI with Starlette in the example above.

2/3 Settings

The middleware has a few settings. These are the defaults:

class CorrelationIdMiddleware(
    header_name='X-Request-ID',
    validate_header_as_uuid=True,
): ...

header_name

The HTTP header key to read IDs from.

In addition to the default, another popular choice for header name is X-Correlation-ID, but the header name can be any string.

Defaults to X-Request-ID.

validate_header_as_uuid

By default, the middleware validates correlation IDs as valid UUIDs. If turned off, any string will be accepted. The benefit of enforcing UUID usage has to do with limiting the chance of correlation ID collisions.

When a header value is detected but found to be invalid, the value is discarded, and a fresh UUID is generated in its place. The package logs a warning in these cases.

Defaults to True.

3/3 Configuring logging

To set up logging of the correlation ID, you simply have to implement the log-filter supplied by the package.

If your logging config looks something like this:

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'web': {
            'class': 'logging.Formatter',
            'datefmt': '%H:%M:%S',
            'format': '%(levelname)s ... %(name)s %(message)s',
        },
    },
    'handlers': {
        'web': {
            'class': 'logging.StreamHandler',
            'formatter': 'web',
        },
    },
    'loggers': {
        'my_project': {
            'handlers': ['web'],
            'level': 'DEBUG',
            'propagate': True,
        },
    },
}

You simply have to add a log filter, like this

+ from asgi_correlation_id.log_filters import correlation_id_filter

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
+   'filters': {
+       'correlation_id': {'()': correlation_id_filter(uuid_length=32)},
+   },
    'formatters': {
        'web': {
            'class': 'logging.Formatter',
            'datefmt': '%H:%M:%S',
+           'format': '%(levelname)s ... [%(correlation_id)s] %(name)s %(message)s',
        },
    },
    'handlers': {
        'web': {
            'class': 'logging.StreamHandler',
+           'filters': ['correlation_id'],
            'formatter': 'web',
        },
    },
    'loggers': {
        'my_project': {
            'handlers': ['web'],
            'level': 'DEBUG',
            'propagate': True,
        },
    },
}

And your log output will go from this

INFO ... project.views This is a DRF view log, and should have a GUID.
WARNING ... project.services.file Some warning in a function
INFO ... project.views This is a DRF view log, and should have a GUID.
INFO ... project.views This is a DRF view log, and should have a GUID.
WARNING ... project.services.file Some warning in a function
WARNING ... project.services.file Some warning in a function

to this

INFO ... [773fa6885e03493498077a273d1b7f2d] project.views This is a DRF view log, and should have a GUID.
WARNING ... [773fa6885e03493498077a273d1b7f2d] project.services.file Some warning in a function
INFO ... [0d1c3919e46e4cd2b2f4ac9a187a8ea1] project.views This is a DRF view log, and should have a GUID.
INFO ... [99d44111e9174c5a9494275aa7f28858] project.views This is a DRF view log, and should have a GUID.
WARNING ... [0d1c3919e46e4cd2b2f4ac9a187a8ea1] project.services.file Some warning in a function
WARNING ... [99d44111e9174c5a9494275aa7f28858] project.services.file Some warning in a function

If you're using a json log-formatter, just add correlation-id: %(correlation_id)s to your list of properties.

Extensions

We've added a couple of (we think) nice extensions to extend the scope of correlation IDs.

Sentry

If your project has sentry-sdk installed, correlation IDs will automatically be added to Sentry events as a transaction_id.

See this blogpost for a little bit of detail.

Celery

There's one issue with Celery: when calling task.delay() from the web server, the worker process does not have access to the correlation_id context var, so it will not include the request correlation id by default.

To make sure correlation IDs are transferred to worker processes we use Celery signal hooks that transfer and receive correlation IDs, like this:

@before_task_publish.connect()
def transfer_correlation_id(headers) -> None:
    headers['REQUEST_ID'] = correlation_id.get()


@task_prerun.connect()
def load_correlation_id(task) -> None:
    id_value = task.request.get('REQUEST_ID')
    correlation_id.set(id_value)

To make the setup as simple as possible, we've put all the required code into a single setup function, so to enable correlation IDs for Celery, just add this code to your celery.py:

from asgi_correlation_id.extensions.celery import load_correlation_ids

load_correlation_ids()

Taking it one step further - Adding Celery tracing IDs

In addition to transferring request IDs to Celery workers, we've added one more log filter for improving tracing in celery processes. This is completely separate from the middleware, but is something we use personally, so maintain here regardless.

The log filter adds an ID, celery_current_id for each worker process, and an ID, celery_parent_id for the process that spawned it.

Here's a quick summary of outputs from different scenarios:

Scenario Correlation ID Celery Current ID Celery Parent ID
Request
Request -> Worker
Request -> Worker -> Another worker
Beat -> Worker ✅*
Beat -> Worker -> Worker ✅*

*When we're in a process spawned separately from an HTTP request, a correlation ID is still spawned for the first process in the chain, and passed down. You can think of the correlation ID as an origin ID, while the combination of current and parent-ids as a way of linking the chain.

To add the current and parent IDs, just alter your celery.py to this:

+ from asgi_correlation_id.extensions.celery import load_correlation_ids, load_celery_current_and_parent_ids

load_correlation_ids()
+ load_celery_current_and_parent_ids()

To set up the additional log filters, update your log config like this:

+ from asgi_correlation_id.log_filters import celery_tracing_id_filter, correlation_id_filter

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'filters': {
        'correlation_id': {'()': correlation_id_filter(uuid_length=32)},
+       'celery_tracing': {'()': celery_tracing_id_filter(uuid_length=32)},
    },
    'formatters': {
        'web': {
            'class': 'logging.Formatter',
            'datefmt': '%H:%M:%S',
            'format': '%(levelname)s ... [%(correlation_id)s] %(name)s %(message)s',
        },
+       'celery': {
+           'class': 'logging.Formatter',
+           'datefmt': '%H:%M:%S',
+           'format': '%(levelname)s ... [%(correlation_id)s] [%(celery_parent_id)s-%(celery_current_id)s] %(name)s %(message)s',
+       },
    },
    'handlers': {
        'web': {
            'class': 'logging.StreamHandler',
            'filters': ['correlation_id'],
            'formatter': 'web',
        },
+       'celery': {
+           'class': 'logging.StreamHandler',
+           'filters': ['correlation_id', 'celery_tracing'],
+           'formatter': 'celery',
+       },
    },
    'loggers': {
        'my_project': {
+           'handlers': ['celery' if any('celery' in i for i in sys.argv) else 'web'],
            'level': 'DEBUG',
            'propagate': True,
        },
    },
}

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

asgi-correlation-id-0.1.6.tar.gz (11.7 kB view details)

Uploaded Source

Built Distribution

asgi_correlation_id-0.1.6-py3-none-any.whl (9.9 kB view details)

Uploaded Python 3

File details

Details for the file asgi-correlation-id-0.1.6.tar.gz.

File metadata

  • Download URL: asgi-correlation-id-0.1.6.tar.gz
  • Upload date:
  • Size: 11.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.11 CPython/3.9.7 Linux/5.8.0-1042-azure

File hashes

Hashes for asgi-correlation-id-0.1.6.tar.gz
Algorithm Hash digest
SHA256 23e1c16ff5de963e2807e874df5683c829f7b0ade3abac1fc401eaae31fa0587
MD5 c43393f78737a385342f46d632576622
BLAKE2b-256 b3969f28cb24c2207fbd25a66491094b9f8c710b3c4d9eb07406896b387527bc

See more details on using hashes here.

File details

Details for the file asgi_correlation_id-0.1.6-py3-none-any.whl.

File metadata

File hashes

Hashes for asgi_correlation_id-0.1.6-py3-none-any.whl
Algorithm Hash digest
SHA256 7bdcee5a4b4e6d33e427e3c2f50651572cf4b658e1912e0377c5464075c1e2a8
MD5 81bfa27aeed873fe0bdf426201393fca
BLAKE2b-256 8fe80f0fbb856b05df486f9e5ce085cb88d4c860feaed47b584e9840c62c8357

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page