Skip to main content

Audit Archivematica user activities via nginx access logs.

Project description

auditmatica

About

Audit Archivematica user activities via nginx access logs.

Overview

auditmatica is intended to facilitate auditing of user activities in Archivematica and the Archivematica Storage Service. It uses nginx access logs as its data source, and outputs either logs in Common Event Format (CEF) or a JSON overview of user activities.

Usage

auditmatica has two subcommands, write-cef and overview.

Usage: auditmatica [OPTIONS] COMMAND [ARGS]...

  Auditmatica: Archivematica auditing package

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  overview   Print overview of user activities from nginx access log.
  write-cef  Write Common Event Format (CEF) log from nginx access log.

write-cef

To write CEF events, use the write-cef subcommand. E.g.:

auditmatica write-cef /path/to/nginx/access.log

or

cat /var/log/nginx/access.log | auditmatica write-cef

CEF is a widely used standard for network and security analysis. CEF events can be sent to applications for review, monitoring, and visualization via a file or over syslog. CEF events written by auditmatica include an event name, signature (unique identifier), and severity level (0-10), which are determined based on details from the nginx access log such as URL, HTTP method, and HTTP return code.

A sample CEF event written by auditmatica looks like the following:

CEF:0|Artefactual Systems, Inc.|Archivematica|hosted|16|AIP downloaded from Archival Storage|3|cs1Label=requestClientApplication cs1="Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0" rt=Jan 13 2021 20:01:33 requestMethod=GET request=/archival-storage/download/aip/8fa54cfc-f5c5-4673-b44e-fc514496bad7/ src=172.19.0.1 suser=test msg=UUID:8fa54cfc-f5c5-4673-b44e-fc514496bad7

For a comprehensive list of Archivematica CEF events, see IMPLEMENTATION.md.

For more details on CEF, see the CEF specification.

This command accepts several optional arguments:

Usage: auditmatica write-cef [OPTIONS] [LOG]

  Write Common Event Format (CEF) log from nginx access log.

Options:
  -o, --output PATH       Filepath for output CEF file (default=None, print to
                          stdout)

  -s, --syslog            Write CEF events to syslog instead of file
  --syslog-address TEXT   Address for syslog connection (default='/dev/log')
  --syslog-facility TEXT  Facility for syslog messages (default='USER')
  --syslog-port INTEGER   Port for remote syslog connections
  --ss-base-url TEXT      Override the Storage Service URL to scan for
                          (default='http://127.0.0.1:62081')

  --suppress              Suppress log lines that do not map to a specific
                          event instead of reverting to default event

  -v, --verbose           Enable verbose error message reporting
  --help                  Show this message and exit.


Storage Service

auditmatica looks for Storage Service events in the nginx access log by checking each URL to determine if it begins with the expected base URL of the Storage Service. By default, this is http://127.0.0.1:62081.

To override the Storage Service URL to scan for, use --ss-base-url. E.g.:

auditmatica --ss-base-url http://archivematica.example.com:8000

Output

By default, auditmatica writes CEF events to stdout and some end-user facing messages to stderr.

To write CEF events to a file, use the -o/--output option to specify a filepath for the output file. E.g.:

auditmatica write-cef /path/to/nginx/access.log --output my-output-file.log

To write CEF events to syslog, use the -s/--syslog option. By default, this will write syslog messages to /dev/log/ using the USER facility. The --syslog-address, --syslog-port, and --syslog-facility options can be used to customize the syslog connection. E.g.:

auditmatica write-cef -s \
  --syslog-address localhost \
  --syslog-port 514 \
  --syslog-facility local0 \
  /path/to/nginx/access.log

--syslog-port will only be used if an address other than /dev/log is also passed with--syslog-address.

Valid --syslog-facility values are local0-local7, which are reserved by syslog for local use.

overview

To generate a high-level JSON overview of Archivematica user activities, use the overview subcommand. E.g.:

cat access.log | auditmatica overview
Usage: auditmatica overview [OPTIONS] [LOG]

  Write JSON overview of user activities from nginx access log.

Options:
  --ss-base-url TEXT  Override the Storage Service URL to scan for
                      (default='http://127.0.0.1:62081')
  --help              Show this message and exit.

Storage Service

auditmatica looks for Storage Service events in the nginx access log by checking each URL to determine if it begins with the expected base URL of the Storage Service. By default, this is http://127.0.0.1:62081.

To override the Storage Service URL to scan for, use --ss-base-url. E.g.:

auditmatica --ss-base-url http://archivematica.example.com:8000

Download

Download from GitHub:

git clone https://github.com/artefactual-labs/auditmatica.git

Install

General

auditmatica requires Python 3.6+.

Change into the cloned directory and install:

cd auditmatica/
pip install .

Usernames

Including authenticated usernames in auditmatica's outputs requires some additional setup:

  1. Enable auditing middleware in Archivematica and the Storage Service via environment variables
# Archivematica 1.13+
ARCHIVEMATICA_DASHBOARD_DASHBOARD_AUDIT_LOG_MIDDLEWARE: "true"

# Storage Service 0.18+
SS_AUTH_LOG_MIDDLEWARE: "true"
  1. Restart Archivematica and Storage Service services
sudo service archivematica-dashboard restart
sudo service archivematica-storage-service restart
  1. Add the following configuration to the http block of the nginx.conf configuration file (likely /etc/nginx/nginx.conf, though this may vary)
log_format main '$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" "$http_x_forwarded_for" user=$upstream_http_x_username';
access_log "/var/log/nginx/access.log" main;

This format must be exact, as it maps directly to a pattern auditmatica uses to parse nginx access log lines with usernames. If an access_log is already specified, replace it or name it something other than main. The access_log path (by default /var/log/nginx/access.log) can be changed as needed.

  1. Optionally, add proxy_hide_header x-username; to the nginx server blocks to prevent the authenticated username from being sent back with each response to the client device.

  2. Restart nginx service:

sudo service nginx restart

If the above is configured correctly, the resulting nginx access log lines should look as follows:

172.10.5.1 - - [13/Jan/2021:19:53:10 +0000] "GET /backlog/download/2e28b8a9-351c-4da7-92d2-837ac04cd2d9/ HTTP/1.1" 200 19436360 "http://127.0.0.1:62080/backlog/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0" "-" user=test

Development

For development, it may be useful to install auditmatica with pip install -e ., which will apply changes made to the source code immediately.

To run all tests with tox: tox

Or run tests directly with pytest:

pip install -r requirements/test.txt
pytest

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

auditmatica-0.1.0.tar.gz (25.0 kB view hashes)

Uploaded Source

Built Distribution

auditmatica-0.1.0-py2.py3-none-any.whl (36.4 kB view hashes)

Uploaded Python 2 Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page