Functions required by the access-logs-local-driver

Project description

# Access Logs Driver

Load the content of gzipped Apache HTTP log files. Exclude bots, scrapers, etc., select URLs matching the provided regex(es), and generate a CSV of the relevant log entries.
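The filtering step described above can be sketched roughly as follows. This is a minimal illustration, not the driver's actual code: it assumes the Apache "combined" log format, and the names `filter_log_lines`, `filter_gzipped_log`, `entries_to_csv` and the `BOT_HINTS` list are hypothetical.

```python
import csv
import gzip
import io
import re

# Assumption: logs use the Apache "combined" format; the real driver may differ.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Illustrative substrings only; a real bot list would be far longer.
BOT_HINTS = ("bot", "crawler", "spider", "scraper")


def filter_log_lines(lines, url_patterns):
    """Yield parsed entries that are not from bots and match any URL regex."""
    compiled = [re.compile(pattern) for pattern in url_patterns]
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not parse
        entry = match.groupdict()
        agent = entry["agent"].lower()
        if any(hint in agent for hint in BOT_HINTS):
            continue  # looks like a bot or scraper
        if any(pattern.search(entry["url"]) for pattern in compiled):
            yield entry


def filter_gzipped_log(path, url_patterns):
    """Apply filter_log_lines to a gzipped log file, read as text."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        yield from filter_log_lines(handle, url_patterns)


def entries_to_csv(entries):
    """Serialise the kept entries to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["time", "ip", "agent", "url"])
    for entry in entries:
        writer.writerow([entry["time"], entry["ip"], entry["agent"], entry["url"]])
    return buffer.getvalue()
```

Keeping the line filter separate from the file reader means the same logic can be exercised on plain lists of strings in tests.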

Take the postprocessed logs, strip out repeated hits within a session, and resolve URLs to the chosen URI_SCHEME (e.g. info:doi).

We strip out entries where the same (IP address, user agent) pair has already accessed the URL within the last SESSION_TIMEOUT (e.g. half an hour).
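One way to picture the session rule is the sketch below. It makes two assumptions not stated in the text: hits arrive sorted by time, and every access (kept or dropped) refreshes the window, so continuous activity extends the session. The function name `strip_session_repeats` and the tuple layout are hypothetical.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # the "half-hour" example value


def strip_session_repeats(hits, timeout=SESSION_TIMEOUT):
    """Drop hits where the same (ip, agent) pair requested the same URL
    within `timeout` before the hit.

    `hits` is an iterable of (when, ip, agent, url) tuples sorted by `when`.
    """
    last_seen = {}
    kept = []
    for when, ip, agent, url in hits:
        key = (ip, agent, url)
        previous = last_seen.get(key)
        if previous is None or when - previous > timeout:
            kept.append((when, ip, agent, url))
        # Assumption: any access, kept or not, restarts the timeout window.
        last_seen[key] = when
    return kept
```

If the window should instead be measured from the last kept hit, updating `last_seen` only inside the `if` branch would give that behaviour.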

Additionally, we convert the URLs to ISBNs and collate request data by date, outputting a CSV for ingest via the stats system.
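The collation step might look something like the following sketch. How URLs are mapped to ISBNs is not described here, so the example simply assumes a lookup dictionary is supplied; `collate_by_date`, `counts_to_csv`, the column names and the ISBN values in any usage are all placeholders.

```python
import csv
import io
from collections import Counter
from datetime import datetime


def collate_by_date(hits, url_to_isbn):
    """Count requests per (date, ISBN).

    `hits` is an iterable of (when, url) pairs; `url_to_isbn` is a
    hypothetical mapping from URL to ISBN. URLs without a mapping are skipped.
    """
    counts = Counter()
    for when, url in hits:
        isbn = url_to_isbn.get(url)
        if isbn is None:
            continue
        counts[(when.date().isoformat(), isbn)] += 1
    return counts


def counts_to_csv(counts):
    """Write the collated counts as CSV text, ordered by date then ISBN."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["date", "isbn", "requests"])
    for (day, isbn), total in sorted(counts.items()):
        writer.writerow([day, isbn, total])
    return buffer.getvalue()
```

Using a `Counter` keyed by (date, ISBN) keeps the aggregation to a single pass over the deduplicated hits.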

Release Notes: [0.0.5] - 2023-07-03

Changed:
  • Added start_date and end_date for searching in the log files

  • Added the measure_uri to the result

Release Notes: [0.0.4] - 2023-07-31

Changed:
  • Update file structure and name of the driver

Release Notes: [0.0.3] - 2023-07-25

Changed:
  • Update requirements

  • Updated to use a pyproject.toml file and the new deployment structure

Release Notes: [0.0.2] - 2023-07-11

Added:
  • Unit tests

Changed:
  • Moved the files out of the package; the functions now receive the file data as parameters and return the filtered data.

  • Renamed the plugin to access-logs-local

Download files

Source Distribution

access_logs_local-0.0.5.tar.gz (10.5 kB)

Uploaded Source

Built Distribution

access_logs_local-0.0.5-py3-none-any.whl (7.0 kB)

Uploaded Python 3
