Sentry nodestore Elasticsearch backend

sentry-nodestore-elastic


Supports Sentry 24.x and Elasticsearch 8.x

Use an Elasticsearch cluster to store Sentry node objects

By default, self-hosted Sentry uses a PostgreSQL database for both settings and the nodestore. Under high load this becomes a bottleneck: the database grows quickly and slows down the entire system.

Switching the nodestore to a dedicated Elasticsearch cluster improves scalability:

  • An Elasticsearch cluster can be scaled horizontally by adding more data nodes (Postgres cannot)
  • Data in Elasticsearch can be sharded and replicated across data nodes, which increases throughput
  • Elasticsearch rebalances automatically when new data nodes are added
  • Scheduled Sentry cleanup is much faster and more stable with the Elasticsearch nodestore, because it simply deletes old indices (cleanup of a terabyte-size PostgreSQL nodestore is a huge pain)
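To illustrate why index-based cleanup is cheap, here is a minimal sketch that picks the daily indices older than a retention window. It assumes the sentry-YYYY-MM-DD index naming used by the migration script further down; the `expired_indices` helper and the 30-day retention are hypothetical and are not the package's actual cleanup code.

```python
from datetime import date, datetime, timedelta

INDEX_PREFIX = "sentry-"  # assumed daily index naming: sentry-YYYY-MM-DD


def expired_indices(index_names, today, retention_days):
    """Return the daily indices whose date is older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in index_names:
        if not name.startswith(INDEX_PREFIX):
            continue  # not a nodestore index
        try:
            index_day = datetime.strptime(name[len(INDEX_PREFIX):], "%Y-%m-%d").date()
        except ValueError:
            continue  # suffix is not a date, leave the index alone
        if index_day < cutoff:
            expired.append(name)
    return sorted(expired)


names = ["sentry-2024-01-01", "sentry-2024-03-15", "sentry-2024-04-01", "other-index"]
print(expired_indices(names, today=date(2024, 4, 10), retention_days=30))
# → ['sentry-2024-01-01']
```

Dropping a whole index removes a day of nodes in one operation, instead of deleting rows one batch at a time as with a relational table.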

Installation

Rebuild the Sentry Docker image with the nodestore package installed

FROM getsentry/sentry:24.4.1
RUN  pip install sentry-nodestore-elastic

Configuration

Set SENTRY_NODESTORE in your sentry.conf.py

from elasticsearch import Elasticsearch
es = Elasticsearch(
        ['https://username:password@elasticsearch:9200'],
        http_compress=True,
        request_timeout=60,
        max_retries=3,
        retry_on_timeout=True,
        # ❯ openssl s_client -connect elasticsearch:9200 < /dev/null 2>/dev/null | openssl x509 -fingerprint -noout -in /dev/stdin
        ssl_assert_fingerprint=(
            "PUT_FINGERPRINT_HERE"
        )
    )
SENTRY_NODESTORE = 'sentry_nodestore_elastic.ElasticNodeStorage'
SENTRY_NODESTORE_OPTIONS = {
    'es': es,
    'refresh': False,  # ref: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-refresh.html
    # other ES related options
}

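# The default sentry.conf.py already starts with this import. Keep it above the
# SENTRY_NODESTORE settings, because a later star import would overwrite them.
from sentry.conf.server import *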
INSTALLED_APPS = list(INSTALLED_APPS)
INSTALLED_APPS.append('sentry_nodestore_elastic')
INSTALLED_APPS = tuple(INSTALLED_APPS)
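Before switching the nodestore, it can help to confirm that the client actually reaches a supported cluster. The sketch below reads the version from the client's `info()` call and checks the major version against the 8.x requirement stated above. The helper names and the `FakeClient` stand-in are hypothetical; with a real setup you would pass the `es` object from the configuration.

```python
def elasticsearch_major_version(es):
    """Return the major version reported by the cluster behind `es`."""
    info = es.info()  # cluster info endpoint (GET /)
    return int(info["version"]["number"].split(".")[0])


def check_supported(es, supported_major=8):
    """Raise if the cluster is not on the supported major version."""
    major = elasticsearch_major_version(es)
    if major != supported_major:
        raise RuntimeError(
            f"Elasticsearch {major}.x found, expected {supported_major}.x"
        )
    return major


class FakeClient:
    """Stand-in for a real client so the sketch runs without a cluster."""

    def info(self):
        return {"version": {"number": "8.13.2"}}


print(check_supported(FakeClient()))  # with a real client: check_supported(es)
# → 8
```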

Usage

Set up the Elasticsearch index template

Elasticsearch should be up and running before this step. The following command creates the index template in Elasticsearch

sentry upgrade --with-nodestore

Alternatively, you can create the index template manually from this JSON. It may be customized for your needs, but the template name must be sentry because the nodestore init script checks for it

{
  "template": {
    "settings": {
      "index": {
        "number_of_shards": "3",
        "number_of_replicas": "0",
        "routing": {
          "allocation": {
            "include": {
              "_tier_preference": "data_content"
            }
          }
        }
      }
    },
    "mappings": {
      "dynamic": "false",
      "dynamic_templates": [],
      "properties": {
        "data": {
          "type": "text",
          "index": false,
          "store": true
        },
        "timestamp": {
          "type": "date",
          "store": true
        }
      }
    },
    "aliases": {
      "sentry": {}
    }
  }
}
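If you go the manual route from Python, the template might be built and submitted roughly as follows. This is a sketch under assumptions: the JSON above has no index_patterns, so the `sentry-*` pattern here is a guess based on the sentry-YYYY-MM-DD index names used by the migration script, and `build_template_body` and `put_template` are hypothetical helpers. `put_index_template` is the composable index template call of the 8.x Python client.

```python
import json


def build_template_body(shards=3, replicas=0, tier="data_content"):
    """Build the index template shown above, with adjustable shard settings."""
    return {
        "index_patterns": ["sentry-*"],  # ASSUMPTION: not part of the JSON above
        "template": {
            "settings": {
                "index": {
                    "number_of_shards": str(shards),
                    "number_of_replicas": str(replicas),
                    "routing": {
                        "allocation": {"include": {"_tier_preference": tier}}
                    },
                }
            },
            "mappings": {
                "dynamic": "false",
                "dynamic_templates": [],
                "properties": {
                    "data": {"type": "text", "index": False, "store": True},
                    "timestamp": {"type": "date", "store": True},
                },
            },
            "aliases": {"sentry": {}},
        },
    }


def put_template(es, body, name="sentry"):
    """Submit the template; `es` is an elasticsearch 8.x client."""
    # The name must stay "sentry" because the nodestore init script checks for it.
    return es.indices.put_index_template(name=name, **body)


print(json.dumps(build_template_body(replicas=1), indent=2))
```

With a live cluster, `put_template(es, build_template_body())` would send it using the `es` client from the configuration.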

Migrate data from the default Postgres nodestore to Elasticsearch

Postgres and Elasticsearch must both be accessible from the place where you run this code
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk, BulkIndexError
import psycopg2

es = Elasticsearch(
        ['https://username:password@elasticsearch:9200'],
        http_compress=True,
        request_timeout=60,
        max_retries=3,
        retry_on_timeout=True,
        # ❯ openssl s_client -connect elasticsearch:9200 < /dev/null 2>/dev/null | openssl x509 -fingerprint -noout -in /dev/stdin
        ssl_assert_fingerprint=(
            "PUT_FINGERPRINT_HERE"
        )
    )

name = 'sentry'

conn = psycopg2.connect(dbname="sentry", user="sentry", password="password", host="hostname", port="5432")

cur = conn.cursor()
cur.execute("SELECT reltuples AS estimate FROM pg_class where relname = 'nodestore_node'")
result = cur.fetchone()
count = int(result[0])
print(f"Estimated rows: {count}")
cur.close()

cursor = conn.cursor(name='fetch_nodes')
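# Select the columns explicitly so the r[0], r[1], r[2] indexing below does not depend on table column order
cursor.execute("SELECT id, data, timestamp FROM nodestore_node ORDER BY timestamp ASC")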

while True:
    records = cursor.fetchmany(size=2000)

    if not records:
        break

    bulk_data = []

    for r in records:
        node_id = r[0]  # avoid shadowing the built-in id()
        data = r[1]
        date = r[2].strftime("%Y-%m-%d")
        ts = r[2].isoformat()
        index = f"sentry-{date}"  # one index per day

        doc = {
            'data': data,
            'timestamp': ts
        }

        action = {
            "_index": index,
            "_id": node_id,
            "_source": doc
        }

        bulk_data.append(action)

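    bulk(es, bulk_data)
    # Subtract the rows actually fetched; the last batch is usually smaller than 2000.
    # The starting count is only an estimate, so clamp at zero.
    count = max(count - len(records), 0)
    print(f"Remaining rows: {count}")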

cursor.close()
conn.close()
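The per-row mapping in the loop above can also be written as a small pure function, which makes it easy to check without a database or a cluster. This is a sketch, not part of the package: `row_to_action` is a hypothetical helper, and it assumes each row is an (id, data, timestamp) tuple as in the script. The sample row is made up.

```python
from datetime import datetime


def row_to_action(row, prefix="sentry"):
    """Map one (id, data, timestamp) row to a bulk index action."""
    node_id, data, timestamp = row
    return {
        # One index per day, matching the sentry-YYYY-MM-DD naming above
        "_index": f"{prefix}-{timestamp.strftime('%Y-%m-%d')}",
        "_id": node_id,
        "_source": {"data": data, "timestamp": timestamp.isoformat()},
    }


# Made-up sample row for illustration only
sample = ("abc123", "payload", datetime(2024, 4, 1, 12, 30, 0))
print(row_to_action(sample))
```

Since `bulk` accepts any iterable of actions, the batch could then be sent as `bulk(es, (row_to_action(r) for r in records))`.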
