Django + Celery + Sentry + JSON Logging metadata
When working with a Python service, it can be daunting to make all these tools work together and output useful logs with proper metadata.
This document is a step-by-step setup guide for Django, but it can also be applied to FastAPI or Flask with minimal changes.
First we want to gather the metadata for our logs. We need a middleware that stores the request information in a ContextVar. This part is really the only Django-specific piece, and it is the part to adapt for other frameworks.
# project/request_context.py
import uuid
from contextvars import ContextVar

import sentry_sdk

# our request context variable
request_context = ContextVar("request_context", default=None)


class RequestContextMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # we collect various information that we want to see in our logs
        context = {
            "user_id": request.user and request.user.id,
            "correlation_id": str(
                request.headers.get("X-Request-Id", uuid.uuid4())
            ),
            "path": request.path,
        }
        # we save this information for later use
        request_context.set(context)
        # we can instruct Sentry to include this information as well
        sentry_sdk.set_context("request", context)
        response = self.get_response(request)
        # not strictly necessary but a good idea
        request_context.set(None)
        return response
Now add this new middleware to your Django settings:
MIDDLEWARE = [
    'django.middleware.security.SecurityMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
    # Add your middleware here
    "project.request_context.RequestContextMiddleware",
]
For now not much will happen, as nothing uses this request context variable yet, but you should already see the effect of sentry_sdk.set_context in Sentry.
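To see how the ContextVar plumbing behaves outside of Django, here is a minimal stdlib-only sketch that simulates what the middleware does; the helper name get_request_context is illustrative and not part of the code above:

```python
from contextvars import ContextVar

# same declaration as in project/request_context.py
request_context: ContextVar = ContextVar("request_context", default=None)


def get_request_context():
    # returns the context set by the middleware, or None outside a request
    return request_context.get()


# simulate what the middleware does at the start of a request
request_context.set({"user_id": 56, "path": "/api/v1/billing/test-task/"})
ctx = get_request_context()
print(ctx["user_id"])  # 56

# after the response, the middleware resets it again
request_context.set(None)
print(get_request_context())  # None
```

Any code that runs within the same request (views, serializers, loggers) can read the variable the same way, without it leaking between concurrent requests.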

Now let’s inject this metadata into your service logs so they can be exploited by a logging tool such as Google Cloud Logging.

For that we need a new JSON logging formatter; to this end, let’s use the library python-json-logger and subclass it:
# project/logger.py
from pythonjsonlogger import jsonlogger

from project.request_context import request_context


class CustomJsonFormatter(jsonlogger.JsonFormatter):
    def add_fields(self, log_record, record, message_dict):
        super().add_fields(log_record, record, message_dict)
        if rc := request_context.get():
            if uid := rc.get("user_id"):
                log_record["user_id"] = uid
            if cid := rc.get("correlation_id"):
                log_record["correlation_id"] = cid
            if path := rc.get("path"):
                log_record["path"] = path
        if log_record.get("level"):
            log_record["severity"] = log_record["level"].upper()
        else:
            log_record["severity"] = record.levelname
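If adding a dependency is not an option, the same idea can be sketched with only the standard library. This is an illustration of the mechanism, not the formatter used above; the class name StdlibJsonFormatter is made up:

```python
import io
import json
import logging
from contextvars import ContextVar

# illustrative stand-in for project.request_context.request_context
request_context: ContextVar = ContextVar("request_context", default=None)


class StdlibJsonFormatter(logging.Formatter):
    """Emits one JSON object per log record, enriched from the ContextVar."""

    def format(self, record):
        log_record = {
            "name": record.name,
            "message": record.getMessage(),
            "severity": record.levelname,
        }
        # copy the request metadata into the record, if present
        if rc := request_context.get():
            for key in ("user_id", "correlation_id", "path"):
                if value := rc.get(key):
                    log_record[key] = value
        return json.dumps(log_record)


# quick demonstration: log into a buffer and inspect the JSON
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(StdlibJsonFormatter())
logger = logging.getLogger("demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

request_context.set({"user_id": 56, "correlation_id": "abc", "path": "/test/"})
logger.info("My message")
print(buffer.getvalue().strip())
```

python-json-logger saves you from re-implementing the standard record attributes (asctime, lineno, module, …), which is why the article uses it, but the enrichment logic is the same.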
And let’s activate this new JSON formatter by changing our settings.py
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "json": {
            "()": "project.logger.CustomJsonFormatter",
            "format": "%(asctime)s %(name)s:%(lineno)d %(module)s %(message)s",
        },
    },
    "handlers": {
        "console": {
            "level": "DEBUG",
            "class": "logging.StreamHandler",
            "formatter": "json",
        }
    },
    "loggers": {
        "django": {  # to avoid duplicated Django logs
            "handlers": ["console"],
            "level": "INFO",
            "propagate": False,
        },
    },
    "root": {"level": "INFO", "handlers": ["console"]},
}
Now logging.info("My message") will yield a JSON-formatted log that contains a correlation_id, a user_id and a path, if available. It should look like this:
{
    "path": "/api/v1/billing/test-task/",
    "user_id": 56,
    "correlation_id": "ad78cc7d-7ea6-4699-b215-34393734cd54",
    "name": "root",
    "module": "request_context",
    "message": "My message",
    "severity": "INFO"
}
Once those changes are deployed you can filter your logs like so (Google Cloud Logging syntax):
jsonPayload.path="/api/v1/billing/test-task/" OR jsonPayload.user_id="56"
If you do not know how to set up Google Cloud Logging with JSON parsing, here is the Ops Agent configuration I ended up with. If you use Kubernetes, your logging setup might differ.
# /etc/google-cloud-ops-agent/config.yaml
# See https://cloud.google.com/stackdriver/docs/solutions/agents/ops-agent/configuration
# for more details.
logging:
  receivers:
    syslog:
      type: files
      include_paths:  # <- this might depend on your config/needs
        - /var/log/messages
        - /var/log/syslog
        - /var/log/celery/worker.log
        - /var/log/asgi.log
  processors:
    json_parser:
      type: parse_json
      time_key: asctime
      time_format: "%Y-%m-%d %H:%M:%S"
    move_severity:
      type: modify_fields
      fields:
        jsonPayload."logging.googleapis.com/severity":
          move_from: jsonPayload.severity
  service:
    pipelines:
      default_pipeline:
        receivers: [syslog]
        processors: [json_parser, move_severity]
metrics:
  receivers:
    hostmetrics:
      type: hostmetrics
      collection_interval: 30s
  processors:
    metrics_filter:
      type: exclude_metrics
      metrics_pattern: []
  service:
    pipelines:
      default_pipeline:
        receivers: [hostmetrics]
        processors: [metrics_filter]
Great! It is already a big step in the right direction. But something is missing: Celery logs will not work. The logs will not look like JSON and the metadata will not be automatically propagated. No problem: there is a way to fix both with little code. In your Celery setup, add something like this:
# project/celery.py
from logging.config import dictConfig

import sentry_sdk
from celery.signals import after_setup_task_logger, before_task_publish, task_prerun
from django.conf import settings

from project.request_context import request_context


# forces Celery to format logs like we want them
@after_setup_task_logger.connect
def setup_task_logger(logger, *args, **kwargs):
    dictConfig(settings.LOGGING)


# inject the request context in the headers of the task
@before_task_publish.connect
def modify_header(body, headers, **kwargs):
    if headers:
        rc = request_context.get()
        if rc:
            headers["request_context"] = rc


# on the Celery side, reestablish the information before starting the task
@task_prerun.connect
def reestablish_request_context(task_id, task, *args, **kwargs):
    rc = task.request.get("request_context", None)
    if rc:
        request_context.set(rc)
        sentry_sdk.set_context("request", rc)
    else:
        # better to reset to avoid picking up any leftover context
        request_context.set(None)
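Stripped of the Celery machinery, the propagation pattern boils down to copying the ContextVar into the message headers on the web side and restoring it on the worker side. A stdlib-only sketch, where publish_task and run_task stand in for the two signal handlers above:

```python
from contextvars import ContextVar

request_context: ContextVar = ContextVar("request_context", default=None)


def publish_task(headers: dict):
    # what before_task_publish does: copy the current context into the headers
    if rc := request_context.get():
        headers["request_context"] = rc


def run_task(headers: dict):
    # what task_prerun does: restore the context (or reset it) on the worker side
    request_context.set(headers.get("request_context"))


# web side: the middleware has set a context for this request
request_context.set({"correlation_id": "abc-123", "user_id": 56})
headers = {}
publish_task(headers)

# worker side: a fresh process with no context, restored from the headers
request_context.set(None)
run_task(headers)
print(request_context.get()["correlation_id"])  # abc-123
```

Because the restored dict lands in the same ContextVar the JSON formatter reads, task logs carry the same correlation_id and user_id as the request that enqueued them.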
Now Celery should behave exactly the same way as your service; logs and Sentry context should match. You can use the correlation_id to track all the logs produced by one request across your service and its Celery tasks, or use the user_id instead if you want to follow what a client has done over multiple requests.
Be aware that the special Sentry Django integration will not work with Celery, and this technique does not solve that. It is an alternative that always works and that can easily be adapted to other frameworks such as FastAPI.