
Advanced logging utilities for robust, standardized logs in Python projects, APIs, data engineering, and more.

Project description


logging-metrics

Utilities Library for Logging Configuration and Management

A library for configuring and managing logs in Python, focused on simplicity, performance, and observability, with support for PySpark integration.


✨ Features

  • 🎨 Colored logs for the terminal with different levels
  • 📁 Automatic file rotation by time or size
  • ⚡ PySpark DataFrame integration
  • 📊 JSON format for observability systems
  • ⏱️ Timing with LogTimer
  • 📈 Metrics monitoring with LogMetrics
  • 🔧 Hierarchical logger configuration
  • 🚀 Optimized performance for critical applications

📦 Installation

From PyPI:

pip install logging-metrics

For development:

git clone https://github.com/ThaissaTeodoro/logging-metrics.git
cd logging-metrics
pip install -e ".[dev]"

📋 Functions and Classes Overview

Name                      Type      Description
configure_basic_logging   Function  Configures the root logger for colored console logging.
setup_file_logging        Function  Configures a logger with file output (rotation), optional console output, and JSON formatting.
LogTimer                  Class     Context manager and decorator that logs the execution time of code blocks or functions.
log_spark_dataframe_info  Function  Logs the schema, row count, sample rows, and statistics of a PySpark DataFrame.
LogMetrics                Class     Utility for collecting, incrementing, timing, and logging custom processing metrics.
get_logger                Function  Returns a logger with custom handlers and a caplog-friendly mode for pytest.

🚀 Quick Start

import logging
from logging_metrics import setup_file_logging, LogTimer

logger = setup_file_logging(
    logger_name="my_app",
    log_dir="./logs",
    console_level=logging.INFO,
    level=logging.DEBUG
)

logger.info("Application started!")

with LogTimer(logger, "Critical operation"):
    # your code here
    pass

📖 Main Features

  1. Logging configuration:
import logging
from logging_metrics import configure_basic_logging

logger = configure_basic_logging()
logger.debug("Debug message")     # Gray
logger.info("Info")               # Green
logger.warning("Warning")         # Yellow
logger.error("Error")             # Red
logger.critical("Critical")       # Bold red
  2. Automatic Log Rotation:
from logging_metrics import setup_file_logging

# Size-based rotation
logger = setup_file_logging(
    logger_name="app",
    log_dir="./logs",
    max_bytes=10*1024*1024,  # 10 MB
    rotation='size'
)

# Time-based rotation
logger = setup_file_logging(
    logger_name="app",
    log_dir="./logs",
    rotation='time'
)
  3. Spark/Databricks Integration:
from pyspark.sql import SparkSession
from logging_metrics import configure_basic_logging, log_spark_dataframe_info

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(1, "Ana"), (2, "Bruno")], ["id", "name"])

logger = configure_basic_logging()

log_spark_dataframe_info(df=df, logger=logger, name="spark_app")

logger.info("Spark processing started")
  1. โฑ Timing with LogTimer:
from logging_metrics import LogTimer, configure_basic_logging

logger = configure_basic_logging()
# As a context manager
with LogTimer(logger, "DB query"):
    logger.info("Test")

# As a decorator
@LogTimer.as_decorator(logger, "Data processing")
def process_data(data):
  return data.transform()
  5. 📈 Metrics Monitoring:
from logging_metrics import LogMetrics, configure_basic_logging
import time

logger = configure_basic_logging()

metrics = LogMetrics(logger)

items = [10, 5, 80, 60, 'test1', 'test2']

# Start timer for total operation
metrics.start('total_processing')


for item in items:
    # Increments the processed records counter
    metrics.increment('records_processed')

    # If it is an error (simulation)
    if isinstance(item, str):
        metrics.increment('errors')

    # Simulates item processing
    time.sleep(0.1)

    # Custom value example
    metrics.set('last_item', item)


# Finalize and log all metrics
elapsed = metrics.stop('total_processing')

# Logs all collected metrics
metrics.log_all()

# Output:
# --- Processing Metrics ---
# Counters:
#   - records_processed: 6
#   - errors: 2
# Values:
#   - last_item: test2
# Completed timers:
#   - total_processing: 0.60 seconds
  6. Hierarchical Configuration:
from logging_metrics import setup_file_logging
import logging

# Main logger
main_logger = setup_file_logging("my_app", log_dir="./logs")

# Sub-loggers organized hierarchically
db_logger = logging.getLogger("my_app.database")
api_logger = logging.getLogger("my_app.api")
auth_logger = logging.getLogger("my_app.auth")

# Module-specific configuration
db_logger.setLevel(logging.DEBUG)      # More verbose for DB
api_logger.setLevel(logging.INFO)      # Normal for API
auth_logger.setLevel(logging.WARNING)  # Only warnings/errors for auth

db_logger.debug("Querying the database")
db_logger.info("Query completed successfully")
db_logger.error("Error connecting to database!")

auth_logger.debug("Authenticating user")
auth_logger.info("Authentication completed successfully")
auth_logger.error("Auth error!")

api_logger.debug("Querying the API")
api_logger.info("API query completed successfully")
api_logger.error("Error querying the API")
  7. 📊 JSON Format for Observability:
from logging_metrics import setup_file_logging

# JSON logs for integration with ELK, Grafana, etc.
logger = setup_file_logging(
    logger_name="microservice",
    log_dir="./logs",
    json_format=True
)

logger.info("User logged in", extra={"user_id": 12345, "action": "login"})

# Example JSON output:
# {
#   "timestamp": "2024-08-05T10:30:00.123Z",
#   "level": "INFO", 
#   "name": "microservice",
#   "message": "User logged in",
#   "module": "user-api",
#   "function": "<module>",
#   "line": 160,
#   "taskName": null,
#   "user_id": 12345,
#   "action": "login"
# }
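Because each record is a single JSON object per line, downstream tooling can consume the file with nothing but the standard library. A minimal sketch (the log line below mirrors the example output above):

```python
import json

# One line from a JSON-formatted log file (fields mirror the example above).
line = ('{"timestamp": "2024-08-05T10:30:00.123Z", "level": "INFO", '
        '"name": "microservice", "message": "User logged in", '
        '"user_id": 12345, "action": "login"}')

record = json.loads(line)

# Fields passed via `extra=` appear as top-level keys, so log shippers
# and dashboards can filter on them directly.
print(record["level"], record["user_id"])  # INFO 12345
```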

๐Ÿ† Best Practices

  1. Configure logging once at the start:
# In main.py or __init__.py
logger = setup_file_logging("my_app", log_dir="./logs")
  2. Use logger hierarchy:
# Organize by modules/features
db_logger = logging.getLogger("app.database")
api_logger = logging.getLogger("app.api")
  3. Different levels for console and file:
logger = setup_file_logging(
    console_level=logging.WARNING,  # Less verbose in console
    level=logging.DEBUG             # More detailed in the file
)
  4. Use LogTimer for critical operations:
with LogTimer(logger, "Complex query"):
    result = run_heavy_query()
  5. Monitor metrics in long processes:
metrics = LogMetrics(logger)
for batch in batches:
    with metrics.timer('batch_processing'):
        process_batch(batch)
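Practices 2 and 3 rest entirely on the standard library's propagation model; a plain-stdlib sketch (no logging-metrics calls) combining a DEBUG file handler with a WARNING console handler:

```python
import logging
import os
import tempfile

app_logger = logging.getLogger("app")
app_logger.setLevel(logging.DEBUG)

console = logging.StreamHandler()
console.setLevel(logging.WARNING)  # less verbose in console

log_path = os.path.join(tempfile.gettempdir(), "app.log")
file_h = logging.FileHandler(log_path)
file_h.setLevel(logging.DEBUG)     # more detailed in the file

fmt = logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
for handler in (console, file_h):
    handler.setFormatter(fmt)
    app_logger.addHandler(handler)

# Child loggers propagate to "app" and reuse its handlers.
logging.getLogger("app.database").debug("goes to the file only")
logging.getLogger("app.api").warning("goes to both file and console")
```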

โŒ Avoid

  • Configuring loggers multiple times
  • Using print() instead of logger
  • Excessive logging in critical loops
  • Exposing sensitive information in logs
  • Ignoring log file rotation

🔧 Advanced Configuration

Example of full configuration:

from logging_metrics import setup_file_logging, LogMetrics
import logging

# Main configuration with all options
logger = setup_file_logging(
    logger_name="my_app",
    log_dir="./logs",
    level=logging.DEBUG,
    console_level=logging.INFO,
    rotation='time',
    log_format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
    date_format="%Y-%m-%d %H:%M:%S",
    max_bytes=50*1024*1024,  # 50 MB
    backup_count=10,
    add_console=True
)

# Sub-module configuration
modules = ['database', 'api', 'auth', 'cache']
for module in modules:
    module_logger = logging.getLogger(f"my_app.{module}")
    module_logger.setLevel(logging.INFO)

🧪 Complete Example

import logging
from logging_metrics import setup_file_logging, LogTimer, LogMetrics

def main():
    # Initial configuration
    logger = setup_file_logging(
        logger_name="data_processor",
        log_dir="./logs",
        console_level=logging.INFO,
        level=logging.DEBUG
    )
    
    # Sub-loggers
    db_logger = logging.getLogger("data_processor.database")
    api_logger = logging.getLogger("data_processor.api")
    
    # Metrics
    metrics = LogMetrics(logger)
    
    logger.info("Application started")
    
    try:
        # Main processing with timing
        with LogTimer(logger, "Full processing"):
            metrics.start('total_processing')
            
            # Simulate processing
            for i in range(1000):
                metrics.increment('records_processed')
                
                if i % 100 == 0:
                    logger.info(f"Processed {i} records")
                
                # Simulate occasional error
                if i % 250 == 0:
                    metrics.increment('errors_recovered')
                    logger.warning(f"Recovered error at record {i}")
            
            metrics.stop('total_processing')
            metrics.log_all()
            
        logger.info("Processing successfully completed")
        
    except Exception as e:
        logger.error(f"Error during processing: {e}", exc_info=True)
        raise

if __name__ == "__main__":
    main()

🧪 Tests

The library has a complete test suite to ensure quality and reliability.

# Install development dependencies
pip install -e ".[dev]"

# Run all tests
make test

# Tests with coverage
make test-cov

Test structure:

test/
├── conftest.py
├── pytest.ini
├── test-requirements.txt
└── test_logging_metrics.py
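The caplog-friendly mode mentioned for get_logger targets exactly this kind of test; a generic pytest sketch using the built-in caplog fixture (plain stdlib logging, not the library's own API):

```python
import logging

def process(logger):
    # Code under test: emits a warning the test wants to observe.
    logger.warning("low disk space")

def test_process_logs_warning(caplog):
    # caplog is pytest's built-in fixture for capturing log records.
    with caplog.at_level(logging.WARNING, logger="my_app"):
        process(logging.getLogger("my_app"))
    assert "low disk space" in caplog.text
```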

โš™๏ธ CI/CD

This project uses GitHub Actions for continuous integration and delivery.

CI Workflow (tests.yml):

  • Runs on push and PR to main/master.
  • Steps:
    1. Install dependencies and package in editable mode.
    2. Lint code with ruff and black.
    3. Run tests with pytest and measure coverage.
    4. Fail build if coverage < 85%.
    5. Upload HTML coverage report and send to Codecov.

CD Workflow (publish.yml):

  • Triggered on push tags v*.*.*.
  • Steps:
    1. Build wheel and sdist.
    2. Check version tag matches pyproject.toml.
    3. Publish to PyPI using TWINE_USERNAME=__token__ and TWINE_PASSWORD from secrets.

Run CI locally:

make test-ci     # Full pipeline
make test-local  # Install + tests with coverage

How to publish a new version

  1. Update the version in pyproject.toml (version field).
  2. Update the CHANGELOG with the release notes.
  3. Create and push the tag:
 git add .
 git commit -m "release: v0.1.0"
 git tag -a v0.1.0 -m "release: v0.1.0"
 git push origin v0.1.0

This will automatically trigger the publish.yml workflow, which builds the package and uploads it to PyPI.

🔧 Requirements

  • Python >= 3.8
  • Dependencies: pytz, pyspark

๐Ÿ“ Changelog

v0.2.0 (Current)

  • Initial stable version
  • LogTimer and LogMetrics
  • Spark integration
  • Colored logs
  • JSON log support
  • Fixed file rotation bug on Windows
  • Expanded documentation with more examples

๐Ÿค Contributing

  1. Fork the project
  2. Create your feature branch
  3. Commit your changes
  4. Push to your branch
  5. Open a Pull Request

📄 License

MIT License. See LICENSE for details.



Download files

Download the file for your platform.

Source Distribution

logging_metrics-0.2.1.tar.gz (19.9 kB)


Built Distribution


logging_metrics-0.2.1-py3-none-any.whl (13.4 kB)


File details

Details for the file logging_metrics-0.2.1.tar.gz.

File metadata

  • Download URL: logging_metrics-0.2.1.tar.gz
  • Upload date:
  • Size: 19.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for logging_metrics-0.2.1.tar.gz
Algorithm    Hash digest
SHA256       18431ae436b351092b752c99a1d75def78578fa0d89a4553207dd578c7bb5824
MD5          be97ef0cd6c8844fb710647f8ce8f577
BLAKE2b-256  cce92e71b8ba7341b29fb6d35447e2a9cc01bc30652d86d9cc0cb5a86c2d1af5


Provenance

The following attestation bundles were made for logging_metrics-0.2.1.tar.gz:

Publisher: publish-to-pypi.yml on ThaissaTeodoro/logging-metrics

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file logging_metrics-0.2.1-py3-none-any.whl.

File metadata

File hashes

Hashes for logging_metrics-0.2.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       a0a88e57e4dabcd8d24a2a7912dc734d440c962fda558fd011c7566e898e7875
MD5          7a5a510453cca29a5e5155a807822c75
BLAKE2b-256  415a082d7d8ef4534bbc0861b2def938f0ce22b0355391e8ff3a3799d2f5a13c


Provenance

The following attestation bundles were made for logging_metrics-0.2.1-py3-none-any.whl:

Publisher: publish-to-pypi.yml on ThaissaTeodoro/logging-metrics

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
