Parse lines from an s3 log file

Project description

Parses log lines from an s3 log file.


import s3_log_parser
line_parser = s3_log_parser.make_parser("%BO %B %t %a %r %si %o %k \"%R\" %s %e %b %y %m %n \"%{Referer}i\" \"%{User-Agent}i\" %v")

make_parser creates and returns a function, line_parser, which accepts a line from an S3 log file in that format and returns the parsed values in a dictionary.
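To illustrate the idea of turning a format string into a line-parsing function, here is a minimal sketch that compiles a few directives into a regular expression with named groups. It is not the library's implementation: the function name make_parser_sketch, the DIRECTIVES table, the dictionary keys, and the per-field patterns are all assumptions made for this example, and only a subset of the directives is covered.

```python
import re

# Hypothetical directive table for illustration only: directive -> (key, regex).
# The keys and patterns are assumptions, not the library's documented output.
DIRECTIVES = {
    "%BO": ("bucket_owner", r"\S+"),
    "%B": ("bucket", r"\S+"),
    "%t": ("time", r"\[[^\]]+\]"),
    "%a": ("remote_ip", r"\S+"),
    "%R": ("request_line", r'[^"]*'),
    "%s": ("status", r"\d{3}|-"),
}


def make_parser_sketch(format_string):
    """Return a function that parses one log line into a dictionary."""
    # Try longer directives first so "%BO" is not read as "%B" followed by "O".
    names = sorted(DIRECTIVES, key=len, reverse=True)
    splitter = re.compile("(" + "|".join(re.escape(n) for n in names) + ")")

    pattern = ""
    for piece in splitter.split(format_string):
        if piece in DIRECTIVES:
            key, regex = DIRECTIVES[piece]
            pattern += "(?P<%s>%s)" % (key, regex)
        else:
            # Spaces, quotes and other literal text must match exactly.
            pattern += re.escape(piece)
    compiled = re.compile("^" + pattern + "$")

    def parse(line):
        match = compiled.match(line.strip())
        if match is None:
            raise ValueError("line does not match format: %r" % line)
        return match.groupdict()

    return parse


parse = make_parser_sketch('%BO %B %t %a "%R" %s')
sample = ('79a5 mybucket [06/Feb/2019:00:00:38 +0000] 192.0.2.3 '
          '"GET /mybucket?versioning HTTP/1.1" 200')
print(parse(sample))
```

Building the regular expression once and returning a closure means the format string is processed a single time, no matter how many lines are parsed afterwards.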


%BO - bucket owner - The canonical user ID of the owner of the source bucket.
%B - bucket - The name of the bucket that the request was processed against. If the system receives a malformed
request and cannot determine the bucket, the request will not appear in any server access log.
%t - date/time - The time at which the request was received. The format, using strftime() terminology, is
[%d/%b/%Y:%H:%M:%S %z]
%a - remote ip - The apparent Internet address of the requester. Intermediate proxies and
firewalls might obscure the actual address of the machine making the request.
%r - requester_id - The canonical user ID of the requester, or the string "Anonymous" for unauthenticated requests.
If the requester was an IAM user, this field will return the requester's IAM user name along with the AWS root
account that the IAM user belongs to. This identifier is the same one used for access control purposes.
%si - s3_request_id - The request ID is a string generated by Amazon S3 to uniquely identify each request.
%o - operation - The operation listed here is declared as SOAP.operation, REST.HTTP_method.resource_type,
WEBSITE.HTTP_method.resource_type, or BATCH.DELETE.OBJECT.
%k - key - The "key" part of the request, URL encoded, or "-" if the operation does not take a key parameter.
\"%R\" - request_firs_line - The first line of the request: the Request-URI part of the HTTP request message.
%s - status - The numeric HTTP status code of the response.
%e - error - The Amazon S3 Error Code, or "-" if no error occurred.
%b - bytes - The number of response bytes sent, excluding HTTP protocol overhead, or "-" if zero.
%y - total_bytes - The total size of the object in question.
%m - total_time - The number of milliseconds the request was in flight from the server's perspective. This value is
measured from the time your request is received to the time that the last byte of the response is sent. Measurements
made from the client's perspective might be longer due to network latency.
%n - turnaround_time - The number of milliseconds that Amazon S3 spent processing your request. This value is
measured from the time the last byte of your request was received until the time the first byte of the response was
sent.
\"%{Referer}i\" - referer - The value of the HTTP Referer header, if present. HTTP user-agents (e.g. browsers)
typically set this header to the URL of the linking or embedding page when making a request.
\"%{User-Agent}i\" - user agent - The value of the HTTP User-Agent header.
%v - version_id - The version ID in the request, or "-" if the operation does not take a versionId parameter.
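Several of the descriptions above imply conversions a caller may want to apply to the raw strings: the strftime() pattern for %t, "-" meaning zero for %b, URL encoding for %k, and the dotted form of %o. The sketch below shows one way to do that with the standard library. The helper names are hypothetical and are not part of s3_log_parser, and the sketch assumes the raw values arrive as strings exactly as they appear in the log line.

```python
from datetime import datetime
from urllib.parse import unquote

# strptime() pattern taken from the %t description above.
S3_TIME_FORMAT = "[%d/%b/%Y:%H:%M:%S %z]"


def parse_time(value):
    """Convert a raw %t value into a timezone-aware datetime."""
    # %b matches English month abbreviations under the default C locale.
    return datetime.strptime(value, S3_TIME_FORMAT)


def parse_bytes(value):
    """Convert a raw %b value to an int; "-" means zero bytes were sent."""
    return 0 if value == "-" else int(value)


def decode_key(value):
    """URL-decode a raw %k value; "-" means the operation takes no key."""
    return None if value == "-" else unquote(value)


def split_operation(value):
    """Split a raw %o value such as "REST.GET.OBJECT" into its dotted parts."""
    return value.split(".")


print(parse_time("[06/Feb/2019:00:00:38 +0000]").isoformat())
print(parse_bytes("-"), parse_bytes("113"))
print(decode_key("photos/my%20cat.jpg"))
print(split_operation("REST.GET.OBJECT"))
```

Keeping these helpers separate from the parser leaves the parsed dictionary untouched, so callers can convert only the fields they need.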

Download files

Files for s3-log-parser, version 1.0.0
Filename, size: s3-log-parser-1.0.0.tar.gz (6.5 kB)
File type: Source
Python version: None