Parse Nginx logs like SQL
Project description
tailparse
Meant to loosely mimic how the original LogParser for Microsoft Server worked.
Very much a work in progress at the moment. The goal is to make this a standalone Python executable for deployment on servers. The name will probably have to change to make space on package repos.
Installation
tailparse has no dependencies to install. It only works with Python 3 and has only been tested on Python 3.10, though it should presumably work with any Python 3.
python3 setup.py install --user
This will install it in your home directory (i.e. /home/<username>/.local/bin/) and doesn't require sudo privileges.
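Since the project is also published on PyPI (see the file details below), installing with pip should presumably work as well:
$ python3 -m pip install --user tailparse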
Usage
tailparse -q <sqlite-query> your.logs
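For example, a minimal sketch of a grouped query. The status column name and the /var/log/nginx/access.log path are assumptions, not part of tailparse's documented schema; use -p/--print to see the actual columns:
# hypothetical: count requests per status code, most frequent first
$ tailparse -q "SELECT status, COUNT(*) AS hits FROM logs GROUP BY status ORDER BY hits DESC" /var/log/nginx/access.log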
Help:
$ tailparse --help
usage: tailparse [-h] [-p] [-i INPUT_FORMAT] [-q QUERY] [-r MAX_ROWS] [-s SAVE_DB] [-f FILE] [logs]

Process logs as if they were SQL.

positional arguments:
  logs                  The path to the input log we're processing. If not present, will use stdin.

options:
  -h, --help            show this help message and exit
  -p, --print           Print out the requisite columns
  -i INPUT_FORMAT, --input-format INPUT_FORMAT
                        The format of the log we're processing. Defaults to 'nginx'. Options include ['nginx']
  -q QUERY, --query QUERY
                        The query to execute. Don't include any 'FROM' statement as this is added automatically. If not
                        included, make sure to include a -f/--file argument
  -r MAX_ROWS, --max-rows MAX_ROWS
                        Number of max rows to print. Defaults to 20. Put 0 to print all.
  -s SAVE_DB, --save-db SAVE_DB
                        Whether to save the resulting SQLite data file. Defaults to not saving it and using ':memory:'
                        instead. If the database exists, then the log file will not be used to populate it and instead it
                        will be read from. This can be helpful if you're running a lot of queries as the log file doesn't
                        need to be re-parsed every time.
  -f FILE, --file FILE  Execute multiple queries contained within a file. Can be used in place of -q/--query
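As a sketch of the -f/--file option, you can put several queries in a file, one per line (the queries.sql name is just an illustration, and the one-query-per-line format follows the description in the Todos below):
# hypothetical queries.sql with one query per line
$ printf 'SELECT COUNT(*) FROM logs\nSELECT * FROM logs LIMIT 5\n' > queries.sql
$ tailparse -f queries.sql sample.logs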
Caching the database on-disk
It can help to write the SQLite database to disk instead of rebuilding it in memory on every run. This can be done with the -s or --save-db argument:
$ time tailparse -s tmp.sqlite3.db -q "SELECT COUNT(*) FROM logs" sample.logs
COUNT(*)
100090
real 0m1.026s
user 0m0.835s
sys 0m0.117s
$ time tailparse -s tmp.sqlite3.db -q "SELECT COUNT(*) FROM logs" sample.logs
COUNT(*)
100090
real 0m0.648s
user 0m0.533s
sys 0m0.095s
# and without any caching
$ time tailparse -q "SELECT COUNT(*) FROM logs" sample.logs
COUNT(*)
100090
real 0m0.910s
user 0m0.866s
sys 0m0.045s
Note that if you do this, tailparse will not attempt to rewrite the database on later runs; it reads from the existing file instead of re-parsing the log.
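Because the saved file is an ordinary SQLite database, you can also inspect it directly with the sqlite3 command-line shell. A sketch, reusing the tmp.sqlite3.db file and logs table from the example above:
# query the cached database without going through tailparse
$ sqlite3 tmp.sqlite3.db "SELECT COUNT(*) FROM logs"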
Running Tests
To run tests, at the top level run the following:
./test.sh
To get coverage reports, which requires the coverage
module, you can run:
./test.sh -h
Testing relies entirely on unittest, which is built-in; coverage and mypy are used for coverage reports and type checking, respectively. ./test.sh will check for these installs at runtime.
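Neither tool is part of the standard library, so a quick sketch for installing both (assuming pip is available):
$ python3 -m pip install --user coverage mypy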
Contributing
Feel free to make a PR!
Note that this library is trying very hard to avoid any dependencies and stay core-Python-only. If there is a strong reason to add one, please explain that reason in your PR.
black is used for all code formatting. Testing will be required for
any contributions.
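For example, a sketch of formatting a change before opening a PR (assuming you install black yourself, since it isn't a runtime dependency):
$ python3 -m pip install --user black
$ black .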
Todos
- write proper tests
- split the logparser.py file up into separate chunks
- tailparse.execute.execute_query: write integration tests against this and check for the output string -- all for 'nginx'
  - include sample.log via shuf -n N input > output
  - SELECT * FROM logs LIMIT 1
  - SELECT * FROM logs LIMIT 2
  - SELECT * FROM logs: stops at 20 by default
  - SELECT * FROM logs, w/ max_rows=0: does not stop at 20
  - SELECT *: complains before it executes
  - w/ query_file: write to /tmp and clean up after
  - w/ save_db: write to /tmp and clean up after
  - w/ print_columns: as True, should print just columns
- tailparse.print_output.print_output: write unit tests against this and check for the output string
- write testing shell script -- should fail if mypy fails or if coverage and mypy aren't installed
- update the README to use current version screenshots
- add --version argument
- write proper contribution guides, especially for new log formats
- write proper Makefile to make it easy for people to see what's going on
- remove pandas dependency and only use pure python
- support writing to disk for the sqlite3 database, not just to memory
- support other formats, not just Nginx
  - apache
  - morgan
- ability to process multiple SQL commands in a text file, separated by line
- infer the table (e.g. FROM logs) so I don't have to specify it every time
License
Project details
Download files
Source Distribution
Built Distribution
File details
Details for the file tailparse-0.2.tar.gz.
File metadata
- Download URL: tailparse-0.2.tar.gz
- Upload date:
- Size: 23.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.10.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 618ff065d67679d1a308096aa9c469789f2a3374fc7c367bddfd35d4aec56689 |
| MD5 | eaacac0f1ddd8a8d6310bc44064a9500 |
| BLAKE2b-256 | ac6c86832608fbf24406436b156810392b0ef866d6d9d7fab9d1ab33071637f8 |
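To check a downloaded sdist against the digest above, a sketch using the standard sha256sum tool:
$ sha256sum tailparse-0.2.tar.gz   # expect the SHA256 digest from the table above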
File details
Details for the file tailparse-0.2-py3-none-any.whl.
File metadata
- Download URL: tailparse-0.2-py3-none-any.whl
- Upload date:
- Size: 23.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.10.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 25598fa76712788d05ff0c151ebdf6c672452ae7fdc709325da2cc8f21098735 |
| MD5 | cce3b8575eeb8a4642c2690d137e1146 |
| BLAKE2b-256 | f545c80e1e0185f3bf88ee1d1404cfa29560374c2f162e4ec6f93d8eb0bf33d6 |