Skip to main content

A tool to import Nginx logs into SQLite for easy querying.

Project description

nginx-sqlize

A tool for importing Nginx logs into SQLite for easy querying and analysis.

Quick Start

Installation from PyPI

pip install nginx-sqlize

Installation from Source

git clone https://github.com/fidacura/nginx-sqlize.git
cd nginx-sqlize

# create venv (recommended)
python -m venv venv
source venv/bin/activate

# install in editable mode
pip install -e .

Basic Usage

# process nginx logs into sqlite
nginx-sqlize ingest /var/log/nginx/access.log

# analyze the data
nginx-sqlize query --top-paths 20
nginx-sqlize query --top-ips 15
nginx-sqlize query --status-codes

# check database status
nginx-sqlize status

# clean and optimize
nginx-sqlize clean --duplicates --vacuum

Detailed Command Usage

Ingest Logs

Process nginx log files into a SQLite database with automatic optimization:

# single file
nginx-sqlize ingest /var/log/nginx/access.log

# multiple files with pattern
nginx-sqlize ingest "/var/log/nginx/*.log"

# gzipped files supported
nginx-sqlize ingest "/var/log/nginx/access.log*.gz"

# custom database name
nginx-sqlize ingest /path/to/logs --output mysite

# force reprocess all files
nginx-sqlize ingest /path/to/logs --force

# verbose output with detailed progress
nginx-sqlize ingest /path/to/logs --verbose

Query and Analytics

Powerful querying with pre-built analytics:

# top requested paths
nginx-sqlize query --top-paths 20

# most active ip addresses
nginx-sqlize query --top-ips 15

# http status distribution
nginx-sqlize query --status-codes

# traffic patterns by hour
nginx-sqlize query --traffic hour

# error analysis
nginx-sqlize query --errors

# bot activity detection
nginx-sqlize query --bots 10

# potential attack patterns
nginx-sqlize query --attacks 20

# export results to json
nginx-sqlize query --top-paths 50 --export results.json

Database Status

Get comprehensive database information:

# overview with statistics
nginx-sqlize status

# specific database
nginx-sqlize status --db mysite.sqlite

Database Maintenance

Keep your database optimized and clean:

# remove duplicates and optimize
nginx-sqlize clean --duplicates --vacuum

# remove old logs
nginx-sqlize clean --older-than 30d

# remove logs older than 1 year
nginx-sqlize clean --older-than 1y

# skip confirmation
nginx-sqlize clean --duplicates --yes

Command Reference

Ingestion

Command Description Example
ingest <logs> Process nginx logs into SQLite database nginx-sqlize ingest /var/log/nginx/*.log
ingest --output <name> Specify custom database name nginx-sqlize ingest logs/ --output mysite
ingest --force Reprocess all files (ignore tracking) nginx-sqlize ingest logs/ --force
ingest --verbose Show detailed processing information nginx-sqlize ingest logs/ --verbose
ingest --batch-size <n> Set processing batch size nginx-sqlize ingest logs/ --batch-size 5000

Analytics

Command Description Example
query --top-paths <n> Most requested paths nginx-sqlize query --top-paths 20
query --top-ips <n> Most active IP addresses nginx-sqlize query --top-ips 15
query --status-codes HTTP status distribution nginx-sqlize query --status-codes
query --methods HTTP method distribution nginx-sqlize query --methods
query --referrers <n> Top referrer sources nginx-sqlize query --referrers 10
query --response-sizes <n> Paths by response size nginx-sqlize query --response-sizes 15
query --traffic <period> Traffic patterns (hour/day) nginx-sqlize query --traffic hour
query --errors Error analysis and patterns nginx-sqlize query --errors
query --bots <n> Bot activity detection nginx-sqlize query --bots 10
query --attacks <n> Potential attack patterns nginx-sqlize query --attacks 20

Database

Command Description Example
query --export <file> Export results to JSON nginx-sqlize query --top-paths 50 --export report.json
query --combine Combine multiple database results nginx-sqlize query --db "*.sqlite" --combine --status-codes
query --limit <n> Limit number of results nginx-sqlize query --top-paths 5 --limit 5

Management

Command Description Example
status Show database statistics nginx-sqlize status
status --db <path> Status for specific database nginx-sqlize status --db mysite.sqlite
clean --duplicates Remove duplicate entries nginx-sqlize clean --duplicates
clean --vacuum Optimize database storage nginx-sqlize clean --vacuum
clean --older-than <period> Remove old logs nginx-sqlize clean --older-than 30d
clean --yes Skip confirmation prompts nginx-sqlize clean --duplicates --yes

Architecture

Database Schema

-- main logs table with optimized indexes
CREATE TABLE logs (
    id INTEGER PRIMARY KEY,
    timestamp TEXT NOT NULL,
    remote_addr TEXT NOT NULL,
    request_method TEXT,
    request_path TEXT,
    status INTEGER,
    bytes_sent INTEGER,
    -- ... additional fields
);

-- composite index for time-series queries
CREATE INDEX idx_logs_composite ON logs(timestamp, remote_addr, status);

-- path-specific queries
CREATE INDEX idx_logs_path ON logs(request_path) WHERE request_path != '';

-- file processing tracking
CREATE TABLE processed_files (
    filename TEXT PRIMARY KEY,
    last_position INTEGER,
    lines_processed INTEGER,
    file_hash TEXT,
    processed_at TEXT
);

Use Cases

System Administration

  • Traffic Monitoring: Real-time insights into web server performance
  • Error Tracking: Identify and resolve 4xx/5xx error patterns
  • Capacity Planning: Analyze traffic trends for resource allocation

Security Analysis

  • Attack Detection: Monitor for suspicious access patterns
  • Bot Management: Identify and analyze automated traffic
  • Threat Intelligence: Track attack sources and methods

Business Intelligence

  • Content Performance: Most popular pages and resources
  • User Behavior: Traffic sources and navigation patterns
  • Marketing Analytics: Referrer analysis and campaign tracking

DevOps Integration

  • Log Aggregation: Centralized log storage and analysis
  • Alerting: Query-based monitoring and notifications
  • Reporting: Automated analytics and dashboard data

Development

Setup Development Environment

git clone https://github.com/fidacura/nginx-sqlize.git
cd nginx-sqlize

# create virtual environment
python -m venv venv
source venv/bin/activate

# install with development dependencies
pip install -e ".[dev]"

Project Structure

nginx-sqlize/
├── nginx_sqlize/
│   ├── __init__.py      # version and package info
│   ├── main.py          # cli interface with typer
│   ├── core.py          # log processing engine
│   └── queries.py       # analytics and query engine
├── tests/               # comprehensive test suite
├── pyproject.toml       # modern python packaging
└── README.md

License

This project is licensed under the GNU General Public License v2.0 ~ see the LICENSE file for details.

Copyright (C) 2025 fidacura. This is free software distributed under GPL v2.0 terms.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

nginx_sqlize-0.2.2.tar.gz (28.0 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

nginx_sqlize-0.2.2-py3-none-any.whl (29.5 kB view details)

Uploaded Python 3

File details

Details for the file nginx_sqlize-0.2.2.tar.gz.

File metadata

  • Download URL: nginx_sqlize-0.2.2.tar.gz
  • Upload date:
  • Size: 28.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.3

File hashes

Hashes for nginx_sqlize-0.2.2.tar.gz
Algorithm Hash digest
SHA256 8c54ae5fc297041a2eb3f2daca8f15106420b88c6646f91bb7615c5e3ca6681c
MD5 e827a6af46979a4a2e807bd4cce227b4
BLAKE2b-256 fa1cad78358067e8c0c7c2a7be04e1dedb107f991409f1b9420b78aadae4d64c

See more details on using hashes here.

File details

Details for the file nginx_sqlize-0.2.2-py3-none-any.whl.

File metadata

  • Download URL: nginx_sqlize-0.2.2-py3-none-any.whl
  • Upload date:
  • Size: 29.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.3

File hashes

Hashes for nginx_sqlize-0.2.2-py3-none-any.whl
Algorithm Hash digest
SHA256 18bf519d1e15d819ee29b0f1b147ba613fa3187b5c957a0bb85795716fb7bef4
MD5 ad19b473593b10eaf10d6de5cb23c68d
BLAKE2b-256 4945b28f7f2f2ab07cc0b271bb9657399d1cf1d5932d0a9bb60dd8456895f8cd

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page