Logprep

Introduction

Logprep lets you collect, process and forward log messages from various data sources. Log messages are read and written by so-called connectors. Currently, connectors exist for Kafka, Opensearch, S3, HTTP and JSON(L) files.

Log messages are processed serially by a pipeline of processors, where each processor modifies the event that is passed through it. The main idea is that each processor performs one simple task that is easy to carry out. Once a log message has passed through all processors in the pipeline, the resulting message is sent to a configured output connector.
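The serial processing described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Logprep's implementation: the names `run_pipeline`, `split_message` and `drop_message` are hypothetical, and the two toy processors merely stand in for real ones.

```python
from typing import Callable

Event = dict
Processor = Callable[[Event], Event]


def split_message(event: Event) -> Event:
    """Toy stand-in for a dissector: split 'message' into two subfields."""
    if "message" in event:
        action, _, target = event["message"].partition(" ")
        event["action"] = action
        event["target"] = target
    return event


def drop_message(event: Event) -> Event:
    """Toy stand-in for a dropper: delete the 'message' field."""
    event.pop("message", None)
    return event


def run_pipeline(event: Event, processors: list[Processor]) -> Event:
    """Pass one event through every processor, in order."""
    for processor in processors:
        event = processor(event)
    return event


result = run_pipeline({"message": "login alice"}, [split_message, drop_message])
print(result)  # {'action': 'login', 'target': 'alice'}
```

Each processor only sees the event as the previous one left it, which is why the order of the list matters.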

Logprep is primarily designed to process log messages. More generally, it can handle JSON messages, which allows applications beyond log handling.

About Logprep

Pipelines

Logprep processes incoming log messages with a configured pipeline that can be spawned multiple times via multiprocessing. The following chart shows a basic setup that represents this behaviour. The pipeline consists of three processors: the Dissector, the Geo-IP Enricher and the Dropper. Each pipeline runs concurrently and takes one event from its Input Connector. Once the log message is fully processed, the result is forwarded to the Output Connector, after which the pipeline takes the next message, repeating the processing cycle.

flowchart LR
A1[Input\nConnector] --> B
A2[Input\nConnector] --> C
A3[Input\nConnector] --> D
subgraph Pipeline 1
B[Dissector] --> E[Geo-IP Enricher]
E --> F[Dropper]
end
subgraph Pipeline 2
C[Dissector] --> G[Geo-IP Enricher]
G --> H[Dropper]
end
subgraph Pipeline n
D[Dissector] --> I[Geo-IP Enricher]
I --> J[Dropper]
end
F --> K1[Output\nConnector]
H --> K2[Output\nConnector]
J --> K3[Output\nConnector]

Processors

Every processor has one simple task to fulfill. For example, the Dissector can split up long message fields into multiple subfields to facilitate structural normalization. The Geo-IP Enricher takes an IP address and adds its geolocation to the log message, based on a configured geo-ip database. The Dropper deletes fields from the log message.

A detailed overview of all processors can be found in the processor documentation.

To influence the behaviour of those processors, each can be configured with a set of rules. These rules define two things: first, when the processor should process a log message, and second, how to process it, for example which fields should be deleted or for which IP address the geolocation should be retrieved.
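The "when" and "how" parts of a rule can be pictured as a condition paired with an action. The following is a minimal sketch under that assumption only; the `Rule` class, the `client_ip` field name and the `GEO_TABLE` lookup are hypothetical and do not reflect Logprep's actual rule classes or its geo-ip database handling.

```python
from dataclasses import dataclass
from typing import Callable

Event = dict

# Hypothetical lookup table standing in for a geo-ip database.
GEO_TABLE = {"192.0.2.1": "Exampleland"}


@dataclass
class Rule:
    matches: Callable[[Event], bool]  # "when": should this event be processed?
    apply: Callable[[Event], None]    # "how": modify the event in place


def add_geo(event: Event) -> None:
    """Add a 'geo' field based on the (hypothetical) 'client_ip' field."""
    event["geo"] = GEO_TABLE.get(event["client_ip"], "unknown")


geo_rule = Rule(matches=lambda event: "client_ip" in event, apply=add_geo)


def process(event: Event, rules: list[Rule]) -> Event:
    """Apply every rule whose condition matches the event."""
    for rule in rules:
        if rule.matches(event):
            rule.apply(event)
    return event


print(process({"client_ip": "192.0.2.1"}, [geo_rule]))
print(process({"host": "web01"}, [geo_rule]))  # no match, event is unchanged
```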

Connectors

Connectors are responsible for reading the input and writing the result to a desired output. The main connectors currently implemented and in use are a kafka-input-connector and a kafka-output-connector, which receive messages from a Kafka topic and write messages to a Kafka topic. Additionally, you can use the Opensearch output connector to ship messages directly to Opensearch after processing.

The details regarding the connectors can be found in the input connector documentation and output connector documentation.

Configuration

To run Logprep, certain configurations have to be provided. Because Logprep is designed to run in a containerized environment like Kubernetes, these configurations can be provided via the filesystem or HTTP. Providing the configuration via HTTP makes it possible to control configuration changes through a flexible HTTP API. This enables Logprep to quickly adapt to changes in your environment.

Two kinds of configuration are needed: first, a general configuration that describes the pipeline and the connectors, and second, the rules that the processors need in order to process messages correctly.

The following YAML configuration shows an example configuration for the pipeline shown in the graph above:

process_count: 3
timeout: 0.1

pipeline:
  - dissector:
      type: dissector
      rules:
        - https://your-api/dissector/
        - rules/01_dissector/rules/
  - geoip_enricher:
      type: geoip_enricher
      rules:
        - https://your-api/geoip/
        - rules/02_geoip_enricher/rules/
      tree_config: artifacts/tree_config.json
      db_path: artifacts/GeoDB.mmdb
  - dropper:
      type: dropper
      rules:
        - rules/03_dropper/rules/

input:
  mykafka:
    type: confluentkafka_input
    bootstrapservers: [127.0.0.1:9092]
    topic: consumer
    group: cgroup
    auto_commit: true
    session_timeout: 6000
    offset_reset_policy: smallest
output:
  opensearch:
    type: opensearch_output
    hosts:
        - 127.0.0.1:9200
    default_index: default_index
    error_index: error_index
    message_backlog_size: 10000
    timeout: 10000
    max_retries:
    user: the username
    secret: the password
    cert: /path/to/cert.crt
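The rules lists in the configuration above mix HTTP URLs and filesystem paths. As a rough sketch of how the two kinds of source could be told apart, assuming nothing about how Logprep itself resolves them, the hypothetical helper `rule_source_kind` below inspects the URL scheme:

```python
from urllib.parse import urlparse


def rule_source_kind(source: str) -> str:
    """Return 'http' for http(s) URLs and 'filesystem' for everything else."""
    scheme = urlparse(source).scheme
    return "http" if scheme in ("http", "https") else "filesystem"


# Rule sources copied from the dissector entry of the example configuration.
sources = ["https://your-api/dissector/", "rules/01_dissector/rules/"]
for source in sources:
    print(source, "->", rule_source_kind(source))
```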

The following YAML represents a dropper rule which, according to the previous configuration, should be placed in the rules/03_dropper/rules/ directory.

filter: "message"
drop:
  - message
description: "Drops the message field"

The filter of this rule checks whether the field message exists in the log message. If it does, the dropper deletes this field from the log message.
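The effect of this rule on an event can be sketched as follows. The sketch assumes the filter is a bare field name, so matching simply means "the field exists", as in the rule above; the real rule language is described in the rule configuration documentation, and `apply_dropper_rule` is a hypothetical helper, not part of Logprep.

```python
# The dropper rule from above, written as the dict a YAML parser would produce.
dropper_rule = {
    "filter": "message",
    "drop": ["message"],
    "description": "Drops the message field",
}


def apply_dropper_rule(event: dict, rule: dict) -> dict:
    """Delete the rule's 'drop' fields if the filter field exists in the event."""
    # Assumption: the filter is a bare field name, so a match means "field exists".
    if rule["filter"] in event:
        for field in rule["drop"]:
            event.pop(field, None)
    return event


print(apply_dropper_rule({"message": "raw text", "host": "web01"}, dropper_rule))
# {'host': 'web01'}
print(apply_dropper_rule({"host": "web01"}, dropper_rule))
# {'host': 'web01'}
```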

Details about the rule language and how to write rules for the processors can be found in the rule configuration documentation.

Documentation

The documentation for Logprep is online at https://logprep.readthedocs.io/en/latest/ or it can be built locally via:

sudo apt install pandoc
uv sync --frozen --extra doc
cd ./doc/
make html

The HTML documentation can then be found in doc/_build/html/index.html.

Container signatures

From release 15 on, Logprep containers are signed using the cosign tool. To verify the container, you can copy the following public key into a file logprep.pub:

-----BEGIN PUBLIC KEY-----
MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEgkQXDi/N4TDFE2Ao0pulOFfbGm5g
kVtARE+LJfSFI25BanOG9jaxxRGVt+Sa1KtQbMcy7Glxu0s7XgD9VFGjTA==
-----END PUBLIC KEY-----

And use it to verify the signature:

cosign verify --key logprep.pub ghcr.io/fkie-cad/logprep:py3.11-latest

The output should look like:

Verification for ghcr.io/fkie-cad/logprep:py3.11-latest --
The following checks were performed on each of these signatures:
  - The cosign claims were validated
  - Existence of the claims in the transparency log was verified offline
  - The signatures were verified against the specified public key

[{"critical":{"identity":{"docker-reference":"ghcr.io/fkie-cad/logprep"}, ...
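If the check should be scripted, the JSON part of the output can be inspected programmatically. The following is a minimal sketch, assuming only the JSON array is passed in (without the human-readable verification text) and that each entry carries the critical/identity/docker-reference structure shown above; `docker_references` is a hypothetical helper and the sample string is hand-written and shortened.

```python
import json


def docker_references(cosign_output: str) -> list[str]:
    """Collect the docker-reference of every signature payload in the JSON output."""
    payloads = json.loads(cosign_output)
    return [p["critical"]["identity"]["docker-reference"] for p in payloads]


# Shortened, hand-written sample; real output carries more fields.
sample = '[{"critical":{"identity":{"docker-reference":"ghcr.io/fkie-cad/logprep"}}}]'
print(docker_references(sample))  # ['ghcr.io/fkie-cad/logprep']
```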

Container SBOM

From release 15 on, Logprep container images are shipped with a generated SBOM. To verify the attestation and extract the SBOM, use cosign with:

cosign verify-attestation --key logprep.pub ghcr.io/fkie-cad/logprep:py3.11-latest | jq '.payload | @base64d | fromjson | .predicate | .Data | fromjson' > sbom.json

The output should look like:

Verification for ghcr.io/fkie-cad/logprep:py3.11-latest --
The following checks were performed on each of these signatures:
  - The cosign claims were validated
  - Existence of the claims in the transparency log was verified offline
  - The signatures were verified against the specified public key

Finally, you can view the extracted SBOM with:

cat sbom.json | jq
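For readers who prefer Python over jq, the decoding steps of the jq filter used above (base64-decode the payload, parse it as JSON, then parse the string in predicate.Data as JSON again) can be mirrored as follows. This is a sketch under the assumption that the envelope has exactly the fields the jq filter touches; `extract_sbom` is a hypothetical helper, and the envelope built here is synthetic test data, not a real attestation.

```python
import base64
import json


def extract_sbom(envelope_json: str) -> dict:
    """Mirror of: .payload | @base64d | fromjson | .predicate | .Data | fromjson"""
    envelope = json.loads(envelope_json)
    statement = json.loads(base64.b64decode(envelope["payload"]))
    return json.loads(statement["predicate"]["Data"])


# Build a synthetic envelope with made-up content to exercise the decoding steps.
fake_sbom = {"name": "example-sbom"}
statement = {"predicate": {"Data": json.dumps(fake_sbom)}}
payload = base64.b64encode(json.dumps(statement).encode("utf-8")).decode("ascii")
envelope_json = json.dumps({"payload": payload})

print(extract_sbom(envelope_json))  # {'name': 'example-sbom'}
```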
