Parse log files from an ERDDAP server
erddaplogs
Quick utilities for parsing nginx and apache logs.
This script takes apache and/or nginx logs as input. It is made to analyse visitors to an ERDDAP server, but should work on any web traffic.
The Jupyter notebook performs the following steps:
- Read in apache and nginx logs and combine them into one consistent dataframe
- Find the IP addresses that made the greatest number of requests and look up their info from ip-api.com
- Remove suspected spam/bot requests
- Perform basic analysis to graph the number of requests and users over time, the most popular datasets/datatypes, and the geographic distribution of users
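The first three steps can be sketched with the standard library alone. This is a minimal illustration, not the notebook's actual code: it assumes the logs use the "combined" format that both nginx and apache can emit, and the names `LOG_PATTERN`, `BOT_MARKERS`, `parse_lines`, `top_ips` and `drop_bots` are hypothetical helpers invented for this example. The bot filter is a simple user-agent substring check, which is only one of many possible heuristics.

```python
import re
from collections import Counter

# Assumed layout: the "combined" access log format.
# ip - user [time] "METHOD url PROTOCOL" status size "referrer" "user agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

# Hypothetical markers; a real filter would likely need a longer list.
BOT_MARKERS = ("bot", "crawler", "spider")


def parse_lines(lines):
    """Turn raw log lines into dicts, silently skipping lines that do not match."""
    records = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            records.append(match.groupdict())
    return records


def top_ips(records, n=3):
    """Return the n IP addresses with the most requests as (ip, count) pairs."""
    return Counter(rec["ip"] for rec in records).most_common(n)


def drop_bots(records):
    """Remove requests whose user agent contains a known bot marker."""
    return [
        rec for rec in records
        if not any(marker in rec["user_agent"].lower() for marker in BOT_MARKERS)
    ]


# Made-up sample lines using documentation-reserved IP ranges.
sample = [
    '203.0.113.5 - - [01/Jan/2023:00:00:01 +0000] "GET /erddap/index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [01/Jan/2023:00:00:02 +0000] "GET /erddap/info/index.html HTTP/1.1" 200 900 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [01/Jan/2023:00:00:03 +0000] "GET /erddap/tabledap/demo.csv HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [01/Jan/2023:00:01:00 +0000] "GET /robots.txt HTTP/1.1" 200 64 "-" "ExampleBot/1.0"',
    '198.51.100.7 - - [01/Jan/2023:00:01:05 +0000] "GET /erddap/index.html HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    'this line is not a valid log entry',
]

records = parse_lines(sample)
busiest = top_ips(records, n=1)
humans = drop_bots(records)
print(len(records), busiest, len(humans))
```

The resulting list of dicts could be passed to `pandas.DataFrame` to get the single consistent dataframe that the later analysis steps work on.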
A blog post explaining this notebook in more detail can be found at https://callumrollo.com/weblogparse.html
A note on example data
If you don't have your own ERDDAP logs to hand, you can use the example data in example_data/nginx_example_logs. This is anonymised data from a production ERDDAP server, erddap.observations.voiceoftheocean.org. The IP addresses have been randomly generated, as have the user agents. All subscription emails have been replaced with fake@example.com.
License
This project is licensed under the MIT License.
Hashes for erddaplogs-0.0.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | b34fee4b8ba7a5ce5485b898e79f53c6d8fe9d5bbe4df7e3100b71ab6630a5d3
MD5 | a7b8075819d7ccbfcb9b9ef5777e1f30
BLAKE2b-256 | 30f0521412198e608ffc1eb743fa2df0e2d84f9b339fcf37835e4bcca79998b8