yahi · PyPI

Versatile parallel log parser

These details have not been verified by PyPI

Project links

Homepage

Development Status
- 4 - Beta
Intended Audience
- Developers
License
- OSI Approved :: Python Software Foundation License
Operating System
Programming Language
- Python

Project description

source: https://github.com/jul/yahi
doc: http://yahi.readthedocs.org/
ticketing: https://github.com/jul/yahi/issues

Versatile log parser (providing default extractors for apache/lighttpd)

Command line usage

Example of data parsed with yahi: http://wwwstat.julbox.fr/

Simplest usage is:

speed_shoot -g /usr/local/data/geoIP /var/www/apache/access*log

This will not work until you have fetched the geoIP data file:

wget -O- "http://www.maxmind.com/download/geoip/database/GeoLiteCountry/GeoIP.dat.gz" | zcat > /usr/local/data/GeoIP.dat

This is the GeoLite database. The data is not included in the package, since geoIP data must be updated often to stay accurate.

The default path for the geoIP file is data/GeoIP.dat.

Use as a script

speed_shoot is in fact a template showing how to use yahi as a module:

#!/usr/bin/env python
from archery.bow import Hankyu as _dict
from yahi import notch, shoot
from datetime import datetime


######################## Setting UP ##################################
# parsing command line & default settings. Return a context
context=notch()
##### OKAY, now we can do the job ####################################
date_formater= lambda dt :"%s-%s-%s" % ( dt.year, dt.month, dt.day)
context.output(
    shoot(
        context,
        lambda data : _dict({
            'by_country': _dict({data['_country']: 1}),
            'by_date': _dict({date_formater(data['_datetime']): 1 }),
            'by_hour': _dict({data['_datetime'].hour: 1 }),
            'by_os': _dict({data['_os_name']: 1 }),
            'by_dist': _dict({data['_dist_name']: 1 }),
            'by_browser': _dict({data['_browser_name']: 1 }),
            'by_ip': _dict({data['ip']: 1 }),
            'by_status': _dict({data['status']: 1 }),
            'by_url': _dict({data['uri']: 1}),
            'by_agent': _dict({data['agent']: 1}),
            'by_referer': _dict({data['referer']: 1}),
            'ip_by_url': _dict({data['uri']: _dict( {data['ip']: 1 })}),
            'bytes_by_ip': _dict({data['ip']: int(data['bytes'])}),
            'week_browser' : _dict({data['_datetime'].weekday():
                _dict({data["_browser_name"] :1 })}),
            'total_line' : 1,
        }),
    ),
)
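
The lambda above maps each parsed log line to a nested dict of counters, and those dicts are then added together. As a rough illustration of that map-then-sum idea using only the standard library (this is not yahi's or archery's actual implementation; `add_nested`, `aggregate`, and the sample records are hypothetical):

```python
from functools import reduce


def add_nested(left, right):
    """Return a new dict in which values under shared keys are summed recursively."""
    result = dict(left)
    for key, value in right.items():
        if key in result and isinstance(result[key], dict) and isinstance(value, dict):
            # Both sides hold a nested counter: merge them one level deeper.
            result[key] = add_nested(result[key], value)
        elif key in result:
            # Both sides hold a number: add them.
            result[key] = result[key] + value
        else:
            result[key] = value
    return result


def aggregate(records, mapper):
    """Map every record to a nested dict of counters, then sum them all."""
    return reduce(add_nested, (mapper(r) for r in records), {})


# Hypothetical, already-parsed log records (field names follow the example above).
records = [
    {"ip": "10.0.0.1", "status": "200", "bytes": "512"},
    {"ip": "10.0.0.2", "status": "404", "bytes": "128"},
    {"ip": "10.0.0.1", "status": "200", "bytes": "256"},
]

totals = aggregate(
    records,
    lambda data: {
        "by_ip": {data["ip"]: 1},
        "by_status": {data["status"]: 1},
        "bytes_by_ip": {data["ip"]: int(data["bytes"])},
        "total_line": 1,
    },
)
print(totals)
```

Each key of the mapped dict becomes one independent metric, so adding a metric only means adding one more entry to the dict returned by the lambda.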

Recommended usage

for basic log aggregation, I recommend using the command line;
for one-shot metrics, I recommend an interactive console (bpython or ipython);
for specific metrics or elaborate filters, I recommend using the API.
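
For the interactive, one-shot case, a small helper is often enough once the aggregated result is available as a plain nested dict. A minimal sketch, assuming the result has that shape (the `top_n` helper and the sample `result` are hypothetical and not part of yahi):

```python
from collections import Counter


def top_n(counts, n=3):
    """Return the n largest (key, count) pairs from a flat dict of counters."""
    return Counter(counts).most_common(n)


# Hypothetical aggregate, shaped like the 'by_status' and 'by_ip' metrics.
result = {
    "by_status": {"200": 120, "404": 7, "301": 15, "500": 2},
    "by_ip": {"10.0.0.1": 80, "10.0.0.2": 64},
}

top_statuses = top_n(result["by_status"], n=2)
print(top_statuses)
```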

Release history

0.1.4

Oct 26, 2018

0.1.3a0 pre-release

Oct 26, 2018

0.1.2

Sep 3, 2012

This version

0.1.1

Sep 1, 2012

0.1.0

Aug 31, 2012

Download files

Source Distribution

yahi-0.1.1.tar.gz (13.8 kB)

Uploaded Sep 1, 2012 Source

Built Distribution

yahi-0.1.1.linux-x86_64.tar.gz (26.3 kB)

Uploaded Sep 1, 2012 Source

Hashes for yahi-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`a7df06bb9083730bd87ab1eec5127d6bb7675b68a9d2ab6e9da368774a9e180a`
MD5	`516159184b5eea75767ee98479b41e40`
BLAKE2b-256	`11be4acefcd70f073f46baea077da1e5ac4d810a0d473eb34f8eeb970eb61eab`

Hashes for yahi-0.1.1.linux-x86_64.tar.gz
Algorithm	Hash digest
SHA256	`2631e55271c58eff0b8d3c4e55c0b6407b52349bce55ebbc1bf02b2f29b331ef`
MD5	`75d4020f62aba8aa6e40173c793daed5`
BLAKE2b-256	`7c0e0fdd6dad52104bd03e8adf5ea272d3dc6652b6a92308d42417bedb0df472`