LogDelta - Go Beyond Grepping with NLP-based Log File Analysis

LogDelta

LogDelta assumes that each folder holds the logs of one software run. It compares two or more folders, matching files by name. The target run is the run you want to analyze, and the comparison runs serve as the baseline. For example, the folders "My_passing_logs1", "My_passing_logs2", and "My_passing_logs3" can be comparison runs, while "My_failing_logs" is the target run that you analyze against them.
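
The name-based matching described above can be illustrated with a minimal sketch. This is not LogDelta's own code: the helper `match_files_by_name` and the file names `app.log`, `db.log`, and `crash.log` are hypothetical, and only the folder names come from the example above.

```python
import tempfile
from pathlib import Path


def match_files_by_name(target_dir, comparison_dirs):
    """Map each file name in the target run to the same-named files
    found in the comparison runs (hypothetical helper, not LogDelta API)."""
    matches = {}
    for path in sorted(Path(target_dir).iterdir()):
        if not path.is_file():
            continue
        matches[path.name] = [
            Path(comp) / path.name
            for comp in comparison_dirs
            if (Path(comp) / path.name).is_file()
        ]
    return matches


# Build a small throwaway folder layout to demonstrate the matching.
with tempfile.TemporaryDirectory() as root:
    layout = {
        "My_passing_logs1": ["app.log", "db.log"],
        "My_passing_logs2": ["app.log"],
        "My_failing_logs": ["app.log", "db.log", "crash.log"],
    }
    for run, names in layout.items():
        run_dir = Path(root) / run
        run_dir.mkdir()
        for name in names:
            (run_dir / name).write_text("example line\n")

    matches = match_files_by_name(
        Path(root) / "My_failing_logs",
        [Path(root) / "My_passing_logs1", Path(root) / "My_passing_logs2"],
    )
    # Number of comparison runs that contain each target file.
    counts = {name: len(paths) for name, paths in matches.items()}

print(counts)
```

A file that exists only in the target run (here `crash.log`) has no baseline to compare against, which is itself a useful signal at the run level.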

Installation and Example

The following commands install LogDelta, download the demo data, and run the demo:

  • pip install LogDelta
  • git clone https://github.com/EvoTestOps/LogDelta.git
  • cd LogDelta/demo
  • wget -O Hadoop.zip "https://zenodo.org/records/8196385/files/Hadoop.zip?download=1"
  • unzip Hadoop.zip -d Hadoop
  • python -m logdelta.config_runner -c config.yml

Observe the results in LogDelta/demo/Output.

For more examples, see LogDelta/demo/label_investigation and LogDelta/demo/full.

Types of Analysis

In LogDelta, three types of analysis are available:

  1. Visualize

    • Multiple log files or runs, using UMAP to project the log contents into two dimensions.
    • Individual log files with log anomaly scoring (see item 3 for the supported anomaly detection models).
  2. Measure the distance between two logs or sets of logs using:

    • Jaccard distance
    • Cosine distance
    • Containment distance
    • Compression distance
  3. Build an anomaly detection model from a set of logs and use it to score anomalies in a log file (higher scores are more anomalous) using:

    • KMeans (kmeans)
    • IsolationForest (IF)
    • RarityModel (RM)
    • Out-of-Vocabulary Detector (OOVD)
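
The four distance measures in item 2 can be sketched with the Python standard library. This is an illustration only and makes several assumptions that the text above does not confirm: whitespace tokenization, containment defined as the share of the first log's tokens missing from the second, and compression distance computed as normalized compression distance with zlib. LogDelta's own definitions may differ.

```python
import math
import zlib
from collections import Counter


def tokens(text):
    # Assumption: whitespace tokenization.
    return text.split()


def jaccard_distance(a, b):
    sa, sb = set(tokens(a)), set(tokens(b))
    union = sa | sb
    if not union:
        return 0.0
    return 1.0 - len(sa & sb) / len(union)


def cosine_distance(a, b):
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        # Two empty logs are identical; one empty log shares nothing.
        return 0.0 if norm_a == norm_b else 1.0
    return 1.0 - dot / (norm_a * norm_b)


def containment_distance(a, b):
    # Asymmetric: share of a's distinct tokens that do not appear in b.
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa:
        return 0.0
    return 1.0 - len(sa & sb) / len(sa)


def compression_distance(a, b):
    # Normalized compression distance, with zlib as the assumed compressor.
    size_a = len(zlib.compress(a.encode()))
    size_b = len(zlib.compress(b.encode()))
    size_ab = len(zlib.compress((a + b).encode()))
    return (size_ab - min(size_a, size_b)) / max(size_a, size_b)


passing = "INFO task started id=1\n" * 50
failing = "".join(f"ERROR disk failure code={i * 7919}\n" for i in range(50))

print(jaccard_distance(passing, failing))
print(cosine_distance(passing, failing))
print(containment_distance(passing, failing))
print(compression_distance(passing, failing))
```

Jaccard and containment look only at which tokens occur, cosine also weighs how often they occur, and the compression-based measure needs no tokenization at all.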

Levels of Analysis

Analysis can be done at four different levels:

  1. Run (folder) level, investigating the names of files without looking at their contents.
  2. Run (folder) level, investigating run contents (slower than level 1).
  3. File level, investigating file contents (files are matched by name across runs).
  4. Line level, investigating line contents (files are matched by name across runs).
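
As an illustration of line-level analysis, the sketch below scores each line of a target file by the share of its tokens that never occur in the comparison file. It is a simplified stand-in for an out-of-vocabulary style detector, not LogDelta's or LogLead's implementation. The function names `build_vocabulary` and `oov_scores` and the sample log lines are hypothetical.

```python
def build_vocabulary(comparison_lines):
    """Collect every whitespace-separated token seen in the baseline lines."""
    vocab = set()
    for line in comparison_lines:
        vocab.update(line.split())
    return vocab


def oov_scores(target_lines, vocab):
    """Score each line by the fraction of its tokens missing from the vocabulary.
    Higher scores are more anomalous; empty lines score 0.0."""
    scores = []
    for line in target_lines:
        toks = line.split()
        unseen = sum(1 for tok in toks if tok not in vocab)
        scores.append(unseen / len(toks) if toks else 0.0)
    return scores


# Hypothetical lines from two same-named files in a comparison and a target run.
baseline = ["INFO job started", "INFO job finished"]
target = ["INFO job started", "ERROR job crashed unexpectedly"]

vocab = build_vocabulary(baseline)
scores = oov_scores(target, vocab)

for line, score in zip(target, scores):
    print(f"{score:.2f}  {line}")
```

Lines that also appear in the baseline score zero, so sorting the target lines by score brings the unfamiliar ones to the top.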

LogDelta is built on top of LogLead[^1] (https://pypi.org/project/LogLead/).

[^1]: Mäntylä MV, Wang Y, Nyyssölä J. LogLead - fast and integrated log loader, enhancer, and anomaly detector. In 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), 2024 Mar 12 (pp. 395-399). IEEE.
