pat-cli

Overview

pat-cli is a tool for clustering logs based on their textual content. The tool uses a two-step process to achieve this:

Vectorization: In this step, each log statement is converted into a vector in an n-dimensional space. This way, we can treat logs just like points on a 2D graph and try to cluster them. The only difference is the dimension of the space, which can be much larger than two.

To achieve this vectorization, a few algorithms can be employed. The available algorithms are listed on the help page of the tool; one example is TF-IDF.

Usually, vectorization involves a sub-step called tokenization. In this step, the logs are broken down into a set of tokens. For example, a log like "Writing output to file" can be broken down into the following tokens: "writing", "output", "to", "file". This allows the tool to understand the textual content of the logs. For example, in the TF-IDF algorithm, the frequencies of the words (tokens) making up each log statement are used to determine how important each word in the log is.
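To make the tokenization and TF-IDF description above concrete, here is a minimal sketch in plain Python. It is not pat-cli's actual implementation: the function names `tokenize` and `tfidf_vectors`, the regular expression used to split words, and the unsmoothed `log(N / df)` weighting are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(log: str) -> list[str]:
    """Lowercase a log line and split it into word tokens (hypothetical helper)."""
    return re.findall(r"[a-z0-9]+", log.lower())


def tfidf_vectors(logs: list[str]) -> tuple[list[str], list[list[float]]]:
    """Return the vocabulary and one TF-IDF vector per log line."""
    docs = [tokenize(log) for log in logs]
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: in how many logs does each token appear?
    df = Counter(tok for doc in docs for tok in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty lines
        # Term frequency times inverse document frequency, one entry per vocab token.
        vectors.append([
            (counts[tok] / total) * math.log(n_docs / df[tok])
            for tok in vocab
        ])
    return vocab, vectors


logs = [
    "Writing output to file",
    "Writing output to socket",
    "Connection closed by peer",
]
vocab, vectors = tfidf_vectors(logs)
print(tokenize(logs[0]))  # ['writing', 'output', 'to', 'file']
```

In this toy corpus, "file" appears in only one log while "writing" appears in two, so "file" receives the larger weight in the first vector. That is the sense in which token frequencies determine how important each word is.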

Clustering: In this step, a clustering algorithm like K-Means or Birch is used to group the logs into clusters whose members are likely to be similar to each other.
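As a rough illustration of the clustering step, the sketch below runs a tiny K-Means loop over hand-written vectors. It is an assumption-laden toy, not the code pat-cli runs: the `kmeans` function, its deterministic "first k points" initialization, and the bag-of-words vectors are all hypothetical and chosen only to keep the example short and reproducible.

```python
import math


def kmeans(points, k, iterations=20):
    """Tiny K-Means sketch: returns a cluster label for each point."""
    # Deterministic start for the sketch: the first k points act as centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy bag-of-words vectors over the hypothetical vocabulary
# ["writing", "output", "file", "socket", "connection", "closed"].
vectors = [
    [1, 1, 1, 0, 0, 0],  # "Writing output file"
    [0, 0, 0, 0, 1, 1],  # "Connection closed"
    [1, 1, 0, 1, 0, 0],  # "Writing output socket"
    [0, 0, 0, 1, 1, 1],  # "Socket connection closed"
]
labels = kmeans(vectors, k=2)
print(labels)  # [0, 1, 0, 1]
```

The two "writing output" lines end up in one cluster and the two "connection closed" lines in the other, even though the third and fourth lines share the token "socket".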

Usage

pat-cli is available on PyPI, so you can install it with pip:

pip install pat-cli

The pat command will then be available in your environment. To see how to use it, run:

pat --help

Issue

If you run into any issues, feel free to create a GitHub Issue in this repository, and I will try to address or respond to it as soon as possible.

Contribution

Contributions are welcome. If you have an interesting addition to the tool, be it another vectorization or clustering algorithm, feel free to open a PR.
