Parser Nginx logs like SQL
Project description
tailparse
Meant to mimic, sort of, how the OG logparser for MS Server worked.
Very much a work-in-progress atm. Goal is to make this a standalone python executable for deployment on servers. Will probably have to change the name to make space on package repos.
Installation
tailparse
has no dependencies to install. It only works with
python 3 and has only been tested in python 3.10. Presumably it would
work with any python 3.
python3 setup.py install --user
This will install it in your home directory
(i.e. home/<username>/.local/bin/
) and doesn't require sudo
privileges.
Usage
tailparse -q <sqlite-query> your.logs
Help:
$ tailparse --help
usage: tailparse [-h] [-p] [-i INPUT_FORMAT] [-q QUERY] [-r MAX_ROWS] [-s SAVE_DB] [-f FILE] [logs]
Process logs as if they were SQL.
positional arguments:
logs The path to the input log we're processing. If not present, will use stdin.
options:
-h, --help show this help message and exit
-p, --print Print out the requisite columns
-i INPUT_FORMAT, --input-format INPUT_FORMAT
The format of the log we're processing. Defaults to 'nginx'. Options include ['nginx']
-q QUERY, --query QUERY
The query to execute. Don't include any 'FROM' statement as this is added automatically. If not
included, make sure to include a -f/--file arugmenet
-r MAX_ROWS, --max-rows MAX_ROWS
Number of max rows to print. Defaults to 20. Put 0 to print all.
-s SAVE_DB, --save-db SAVE_DB
Whether to save the resulting SQLite data file. Defaults to not saving it and using ':memory:'
instead. If the database exists, then the log file will not be used to populate it and instead it
will be read from. This can be helpful if you're running a lot of queries as the log file doesn't
need to be re-parsed everytime.
-f FILE, --file FILE Execute multiple queries contained with a file. Can be used in place of -q/--query
Caching the database on-disk
It can help to write the SQL database to disk instead of reading from
memory. This can be done with the -s
or --save-db
arguments:
$ time tailparse -s tmp.sqlite3.db -q "SELECT COUNT(*) FROM logs" sample.logs
COUNT(*)
100090
real 0m1.026s
user 0m0.835s
sys 0m0.117s
$ time tailparse -s tmp.sqlite3.db -q "SELECT COUNT(*) FROM logs" sample.logs
COUNT(*)
100090
real 0m0.648s
user 0m0.533s
sys 0m0.095s
# and without any caching
$ time tailparse -q "SELECT COUNT(*) FROM logs" sample.logs
COUNT(*)
100090
real 0m0.910s
user 0m0.866s
sys 0m0.045s
Note that if you do this, tailparse
will not attempt to rewrite the database
Running Tests
To run tests, at the top level run the following:
./test.sh
To get coverage reports, which requires the coverage
module, you can run:
./test.sh -h
While testing relies entirely on unittest
, which is built-in,
coverage
and mypy
are used to do coverage reports and type
checking, respectively. ./test.sh
will check for these installs at
runtime.
Contributing
Feel free to make a PR!
Note that this library is trying very hard to avoid any dependencies and stay core-python-only. If there is a strong reason to include one, please include a reason why.
black
is used for all code formatting. Testing will be required for
any contributions.
Todos
-
write proper tests
-
split the
logparser.py
file up into separate chunks -
tailparse.execute.execute_query
: write integration tests against this and check for the output string -- all for 'nginx'- include
sample.log
viashuf -n N input > output
-
SELECT * FROM logs LIMIT 1
-
SELECT * FROM logs LIMIT 2
-
SELECT * FROM logs
: stops at 20 by default -
SELECT * FROM logs
, w/max_rows=0: does not stop at 20 -
SELECT *
: complains before it executes - w/
query_file
: write to/tmp
and clean up after - w/
save_db
: write to/tmp
and clean up after - w/
print_columns
: asTrue
, should print just columns
- include
-
tailparse.print_output.print_output
: write unit tests against this and check for output string -
write testing shell script -- should fail if
mypy
fails or ifcoverage
andmypy
aren't installed
-
-
update the README to use current version screenshots
-
add
--version
argument -
write proper contribution guides, especially for new log formats
-
write proper
Make
file to make it easy for people to see whats going on -
remove pandas dependency and only use pure python
-
support writing to disk for the sqlite3 database, not just to memory
-
support other formats, not just Nginx
- apache
- morgan
-
Ability to process multiple SQL commands in a text file, separated by line
-
infer the table (e.g.
FROM logs
) so I don't have to specify it everytime
License
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file tailparse-0.2.tar.gz
.
File metadata
- Download URL: tailparse-0.2.tar.gz
- Upload date:
- Size: 23.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.10.6
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 618ff065d67679d1a308096aa9c469789f2a3374fc7c367bddfd35d4aec56689 |
|
MD5 | eaacac0f1ddd8a8d6310bc44064a9500 |
|
BLAKE2b-256 | ac6c86832608fbf24406436b156810392b0ef866d6d9d7fab9d1ab33071637f8 |
File details
Details for the file tailparse-0.2-py3-none-any.whl
.
File metadata
- Download URL: tailparse-0.2-py3-none-any.whl
- Upload date:
- Size: 23.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.10.6
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 25598fa76712788d05ff0c151ebdf6c672452ae7fdc709325da2cc8f21098735 |
|
MD5 | cce3b8575eeb8a4642c2690d137e1146 |
|
BLAKE2b-256 | f545c80e1e0185f3bf88ee1d1404cfa29560374c2f162e4ec6f93d8eb0bf33d6 |