Web server HTTP access log parsing, filtering, and SQL database storage.
Parses your bloated HTTP access logs to extract the info you want about hits from (hopefully) real people instead of just the endless stream of hackers and bots that passes for web traffic nowadays. Stores the info in a relational database where you can access it using all the power of SQL.
Uses the power of your multicore CPU with Twisted, AsynQueue, and sAsync to process log files concurrently and fast. Duplicate entries are ignored, so you don’t need to fret about redundancies in your logfiles. (It happens.) The filtering goes forwards and backwards; once an entry has been determined to come from a bad actor, all log entries from that IP address are purged and ignored.
If you see bot garbage getting through and polluting your logs with some new attempt at an exploit, just add a rule for it to your rules lists, starting with what logalyzer comes prepackaged with. The next time you run it, those entries will get purged as well.
Optionally produces a list of offender IP addresses that you can use to deny access to your web server entirely.