Parse and slice Hadoop logs

Yarn RM

Dataset

from khadoop.yarn import logrm

Parse all files that look like a regular ResourceManager log with the default name.

logrm.FILEPATTERN is a Unix-style glob pattern that matches them.

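# LOGFOLDER is assumed to be a pathlib.Path pointing at the folder with the RM logs
parsed = []
for filelog in LOGFOLDER.glob(logrm.FILEPATTERN):
    print(filelog)
    # close each file once it has been parsed
    with filelog.open() as fh:
        parsed += logrm.process(fh)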

logrm.process parses each line and returns a list of dicts holding the useful information.

Each dict looks like:

{
    'accepted_to_running': 6,  # seconds between ACCEPTED and RUNNING
    'id_application': 'application_1596547077642_6854',
    'accept_to_running_ts': '2020-08-06 14:59:59,119',  # timestamp of the log line 'State change from ACCEPTED to RUNNING'
}

Here, accepted_to_running is the number of seconds between these two timestamps in the YARN aggregated RM log:

2020-08-06 14:59:52,756 INFO  rmapp.RMAppImpl (RMAppImpl.java:handle(779)) - application_1596547077642_6854 State change from SUBMITTED to ACCEPTED
...
2020-08-06 14:59:59,119 INFO  rmapp.RMAppImpl (RMAppImpl.java:handle(779)) - application_1596547077642_6854 State change from ACCEPTED to RUNNING
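
As an illustration of how the two log lines above yield a value of 6, here is a minimal sketch that pairs the ACCEPTED and RUNNING state changes per application and takes the whole number of seconds between them. It is not khadoop's own implementation: the regular expression, the function name `accepted_to_running` and the truncation to whole seconds are assumptions made for this example. Only the dict keys are taken from the output described above.

```python
import re
from datetime import datetime

# Hypothetical pattern for RMAppImpl state-change lines (an assumption, not khadoop's).
STATE_CHANGE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) .*? - "
    r"(?P<app>application_\d+_\d+) State change from "
    r"(?P<old>\w+) to (?P<new>\w+)"
)
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def accepted_to_running(lines):
    """Return one dict per application seen going ACCEPTED -> RUNNING."""
    accepted_at = {}  # application id -> time it entered ACCEPTED
    results = []
    for line in lines:
        match = STATE_CHANGE.match(line)
        if match is None:
            continue  # not a state-change line
        ts = datetime.strptime(match["ts"], TS_FORMAT)
        app = match["app"]
        if match["new"] == "ACCEPTED":
            accepted_at[app] = ts
        elif match["old"] == "ACCEPTED" and match["new"] == "RUNNING" and app in accepted_at:
            delta = ts - accepted_at.pop(app)
            results.append({
                "id_application": app,
                # truncated to whole seconds (assumed rounding rule)
                "accepted_to_running": int(delta.total_seconds()),
                "accept_to_running_ts": match["ts"],
            })
    return results


sample = [
    "2020-08-06 14:59:52,756 INFO  rmapp.RMAppImpl (RMAppImpl.java:handle(779)) - "
    "application_1596547077642_6854 State change from SUBMITTED to ACCEPTED",
    "2020-08-06 14:59:59,119 INFO  rmapp.RMAppImpl (RMAppImpl.java:handle(779)) - "
    "application_1596547077642_6854 State change from ACCEPTED to RUNNING",
]
print(accepted_to_running(sample))
```

The two timestamps are 6.363 seconds apart, so truncating gives the 6 shown in the example dict.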

Related

Setup dev

Env variables:

HIVESERVER_TEST= # raw hiveserver log file
YARNLOG= # folder with RM logs
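
One possible way to pick these variables up in a dev script is sketched below. The helper name `dev_paths` and the choice to return None for unset variables are assumptions for illustration. How the project itself reads these variables is not documented here.

```python
import os
from pathlib import Path


def dev_paths(environ=os.environ):
    """Map each dev variable to a Path, or to None when it is unset or empty."""
    names = ("HIVESERVER_TEST", "YARNLOG")
    return {name: Path(environ[name]) if environ.get(name) else None for name in names}


paths = dev_paths()
print(paths["YARNLOG"])  # folder with RM logs, or None if YARNLOG is not set
```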


Source Distribution

khadoop-1.4.0.tar.gz (9.1 kB)

Uploaded Source

Built Distribution

khadoop-1.4.0-py3-none-any.whl (10.0 kB)

Uploaded Python 3

File details

Details for the file khadoop-1.4.0.tar.gz.

File metadata

  • Download URL: khadoop-1.4.0.tar.gz
  • Upload date:
  • Size: 9.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.4 CPython/3.8.5 Linux/4.19.128-microsoft-standard

File hashes

Hashes for khadoop-1.4.0.tar.gz
Algorithm Hash digest
SHA256 49d7eeb7c5bc433efabed99e9f43a2199e210282c58c6a181ebf8375b512c174
MD5 59d37ddec55b2498eedf2d912ec7453e
BLAKE2b-256 c95896b0573bd1db118e538b2f719ad0ef5b726155a465b2a3a834eb2526fed0

File details

Details for the file khadoop-1.4.0-py3-none-any.whl.

File metadata

  • Download URL: khadoop-1.4.0-py3-none-any.whl
  • Upload date:
  • Size: 10.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.4 CPython/3.8.5 Linux/4.19.128-microsoft-standard

File hashes

Hashes for khadoop-1.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 443478183abac2e3e8f501cb7e3963e3905e5316547a07dd0b255bed49f92ba8
MD5 0c4d16691ee93f3cfb616eba0be99f21
BLAKE2b-256 915e263f55205147d721e60a45e867f9764fe1db8ed39bd670f5190174434259
