Functions required by the access-logs-local-driver
Project description
Load the content of gzipped Apache HTTP log files. Exclude bots, scrapers, etc., select URLs matching the provided regex(es), and generate a CSV of the relevant log entries.
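The filtering step described above can be sketched roughly as follows. This is a minimal illustration, not the driver's actual code: the names `LOG_LINE`, `BOT_PATTERN`, `filter_log` and `write_csv` are hypothetical, the log lines are assumed to be in the Apache combined log format, and the bot check is a deliberately crude user-agent keyword match.

```python
import csv
import gzip
import io
import re

# Assumption: lines follow the Apache "combined" log format:
# host ident user [time] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<user_agent>[^"]*)"'
)

# Hypothetical, very rough bot detector based on user-agent keywords.
BOT_PATTERN = re.compile(r"bot|crawl|spider|scrape", re.IGNORECASE)


def filter_log(path, url_patterns):
    """Yield (timestamp, ip, url, user_agent) for matching non-bot requests."""
    compiled = [re.compile(pattern) for pattern in url_patterns]
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            match = LOG_LINE.match(line)
            if match is None:
                continue  # skip lines that are not in the expected format
            if BOT_PATTERN.search(match["user_agent"]):
                continue  # skip requests that look like bots or scrapers
            if not any(p.search(match["url"]) for p in compiled):
                continue  # skip URLs that match none of the regexes
            yield (match["timestamp"], match["ip"],
                   match["url"], match["user_agent"])


def write_csv(rows, out):
    """Write the filtered rows to a file-like object as CSV with a header."""
    writer = csv.writer(out)
    writer.writerow(["timestamp", "ip", "url", "user_agent"])
    writer.writerows(rows)
```

For example, `write_csv(filter_log("access.log.gz", [r"^/books/"]), sys.stdout)` (after `import sys`) would print the matching entries of a hypothetical `access.log.gz` as CSV.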
Take the postprocessed logs, strip out multiple hits within a session, and resolve URLs to the chosen URI_SCHEME (e.g. info:doi).
We strip out entries where the same (IP address, user agent) pair has accessed a URL within the last SESSION_TIMEOUT (e.g. half an hour).
Additionally, we convert the URLs to ISBNs and collate request data by date, outputting a CSV for ingest via the stats system.
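The session de-duplication and per-date collation could look something like the sketch below. It is an illustration under stated assumptions rather than the driver's implementation: `strip_session_hits` and `count_by_date` are hypothetical names, requests are assumed to arrive as `(timestamp, ip, user_agent, url)` tuples sorted by time, and every hit (kept or dropped) is assumed to refresh the session window. Resolving URLs to ISBNs or to the chosen URI scheme is not covered here.

```python
from collections import Counter
from datetime import datetime, timedelta

# Assumption: the "half an hour" example value for SESSION_TIMEOUT.
SESSION_TIMEOUT = timedelta(minutes=30)


def strip_session_hits(requests, timeout=SESSION_TIMEOUT):
    """Drop repeat hits on a URL by the same (IP, user agent) pair.

    `requests` is an iterable of (timestamp, ip, user_agent, url) tuples
    sorted by timestamp. A hit is dropped when the same pair requested the
    same URL less than `timeout` before it.
    """
    last_seen = {}
    for timestamp, ip, user_agent, url in requests:
        key = (ip, user_agent, url)
        previous = last_seen.get(key)
        # Assumption: every hit, kept or dropped, refreshes the window.
        last_seen[key] = timestamp
        if previous is not None and timestamp - previous < timeout:
            continue
        yield timestamp, ip, user_agent, url


def count_by_date(requests):
    """Return sorted ((date, url), count) pairs for the given requests."""
    counts = Counter(
        (timestamp.date(), url) for timestamp, _ip, _agent, url in requests
    )
    return sorted(counts.items())
```

Chaining the two, `count_by_date(strip_session_hits(requests))` gives one count per date and URL, which could then be written out as CSV rows.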
Release Notes:
[0.0.6] - 2023-08-13
- Changed:
Refactored driver logic
- breaking | Changed parameters for the Request.__init__() method
Removed re_match_dict parameter
Added timestamp and user_agent parameters
Changed Request.timestamp from type time to datetime
Changed LogStream to use the new Request.__init__()
Expanded range for LogStream.logfile_names logic to include files within 1 day of the search_date
LogStream.lines() yields Request objects, not str values
LogStream.filter_in_line_request() only yields one line per measure
[0.0.5] - 2023-07-03
- Changed:
Added start_date and end_date for searching in the log files
Added the measure_uri to the result
[0.0.4] - 2023-07-31
- Changed:
Update file structure and name of the driver
[0.0.3] - 2023-07-25
- Changed:
Update requirements
Update using a pyproject.toml file as well as the new deployment structure
[0.0.2] - 2023-07-11
- Added:
Unit tests
- Changed:
Moved the files out of the package; the plugin now receives the file data as parameters and returns the filtered data.
Renamed the plugin to access-logs-local
File details
Details for the file access_logs_local-0.0.6.tar.gz.
File metadata
- Download URL: access_logs_local-0.0.6.tar.gz
- Upload date:
- Size: 10.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | bc960acd37f0b2287161feac22d12c640985a32e263b267cd49042028da4a91c
MD5 | 94ba4bbed758fb3674545fde0c38cd03
BLAKE2b-256 | 53ce7fb8d33bedb8414ed293665ff9c803fbdbda25324ba4d86f7bb607536fed
File details
Details for the file access_logs_local-0.0.6-py3-none-any.whl.
File metadata
- Download URL: access_logs_local-0.0.6-py3-none-any.whl
- Upload date:
- Size: 7.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | d9fc4c16818379244d8d1846022754185ab7aa861977ef288c8a227f3f940ee8
MD5 | a5d7dcf4ffb310d24f81f4577047dece
BLAKE2b-256 | 5236946e8d055b22c3063f134ab5202b199bc26902d94f2f6918e7dd584c9387