
Advanced logging utilities for robust, standardized logs in Python projects, APIs, data engineering, and more.


logging-metrics

Utilities Library for Logging Configuration and Management

A library for configuring and managing logs in Python, focused on simplicity, performance, and observability, with support for PySpark integration.


✨ Features

  • 🎨 Colored terminal logs with a distinct color per level
  • 📁 Automatic file rotation by time or size
  • ⚡ PySpark DataFrame integration
  • 📊 JSON format for observability systems
  • ⏱️ Timing with LogTimer
  • 📈 Metrics monitoring with LogMetrics
  • 🔧 Hierarchical logger configuration
  • 🚀 Optimized performance for critical applications

📦 Installation

From PyPI:

pip install logging-metrics

For development:

git clone https://github.com/ThaissaTeodoro/logging-metrics.git
cd logging-metrics
pip install -e ".[dev]"

📋 Functions and Classes Overview

| Name | Type | Description |
|------|------|-------------|
| configure_basic_logging | Function | Configures the root logger for colored console logging. |
| setup_file_logging | Function | Configures a logger with rotating file output, optional console output, and optional JSON formatting. |
| LogTimer | Class | Context manager and decorator that logs the execution time of code blocks or functions. |
| log_spark_dataframe_info | Function | Logs the schema, row count, sample rows, and statistics of a PySpark DataFrame. |
| LogMetrics | Class | Collects, increments, times, and logs custom processing metrics. |
| get_logger | Function | Returns a logger with custom handlers and a caplog-friendly mode for pytest. |

🚀 Quick Start

import logging
from logging_metrics import setup_file_logging, LogTimer

logger = setup_file_logging(
    logger_name="my_app",
    log_dir="./logs",
    console_level=logging.INFO,
    level=logging.DEBUG
)

logger.info("Application started!")

with LogTimer(logger, "Critical operation"):
    # your code here
    pass

📖 Main Features

  1. Logging configuration:
from logging_metrics import configure_basic_logging
logger = configure_basic_logging()
logger.debug("Debug message")     # Gray
logger.info("Info")               # Green
logger.warning("Warning")         # Yellow
logger.error("Error")             # Red
logger.critical("Critical")       # Bold red
  2. Automatic log rotation:
from logging_metrics import setup_file_logging
# Size-based rotation
logger = setup_file_logging(
    logger_name="app",
    log_dir="./logs",
    max_bytes=10*1024*1024,  # 10MB
    rotation='size'
)

# Time-based rotation
logger = setup_file_logging(
    logger_name="app",
    log_dir="./logs",
    rotation='time'
)
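Under the hood, size- and time-based rotation of this kind corresponds to the standard library's rotating handlers. A minimal stdlib-only sketch of the same two policies (the handler names and parameters below come from `logging.handlers`, not from logging-metrics):

```python
import logging
import tempfile
from logging.handlers import RotatingFileHandler, TimedRotatingFileHandler
from pathlib import Path

log_dir = Path(tempfile.mkdtemp())

# Size-based: start a new file once app.log reaches ~10 MB, keep 5 backups
size_handler = RotatingFileHandler(
    log_dir / "app.log", maxBytes=10 * 1024 * 1024, backupCount=5
)

# Time-based: roll over at midnight, keep 7 days of history
time_handler = TimedRotatingFileHandler(
    log_dir / "app_timed.log", when="midnight", backupCount=7
)

logger = logging.getLogger("rotation_demo")
logger.setLevel(logging.INFO)
logger.addHandler(size_handler)
logger.addHandler(time_handler)

logger.info("hello")
```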
  3. Spark/Databricks integration:
from pyspark.sql import SparkSession
from logging_metrics import configure_basic_logging, log_spark_dataframe_info

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(1, "Ana"), (2, "Bruno")], ["id", "name"])

logger = configure_basic_logging()

log_spark_dataframe_info(df=df, logger=logger, name="spark_app")

logger.info("Spark processing started")
  1. โฑ Timing with LogTimer:
from logging_metrics import LogTimer, configure_basic_logging

logger = configure_basic_logging()
# As a context manager
with LogTimer(logger, "DB query"):
    logger.info("Test")

# As a decorator
@LogTimer.as_decorator(logger, "Data processing")
def process_data(data):
  return data.transform()
  5. 📈 Metrics monitoring:
from logging_metrics import LogMetrics, configure_basic_logging
import time

logger = configure_basic_logging()
metrics = LogMetrics(logger)

items = [10, 5, 80, 60, 'test1', 'test2']

# Start the timer for the whole operation
metrics.start('total_processing')

for item in items:
    # Count every processed record
    metrics.increment('records_processed')

    # Count errors (strings simulate bad records here)
    if isinstance(item, str):
        metrics.increment('errors')

    # Simulate per-item processing
    time.sleep(0.1)

    # Store a custom value
    metrics.set('last_item', item)

# Stop the timer, then log all collected metrics
elapsed = metrics.stop('total_processing')
metrics.log_all()

# Output:
# --- Processing Metrics ---
# Counters:
#   - records_processed: 6
#   - errors: 2
# Values:
#   - last_item: test2
# Completed timers:
#   - total_processing: 0.60 seconds
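To make the counters/values/timers model concrete, here is a toy stand-in for LogMetrics built only on the standard library. `SimpleMetrics` and its attribute names are invented for illustration; the real class may differ:

```python
import logging
import time
from collections import Counter

class SimpleMetrics:
    """Toy stand-in for LogMetrics: counters, values, and named timers."""

    def __init__(self, logger):
        self.logger = logger
        self.counters = Counter()
        self.values = {}
        self._starts = {}
        self.timers = {}

    def increment(self, name, by=1):
        self.counters[name] += by

    def set(self, name, value):
        self.values[name] = value

    def start(self, name):
        self._starts[name] = time.perf_counter()

    def stop(self, name):
        # Close a named timer and record the elapsed seconds
        self.timers[name] = time.perf_counter() - self._starts.pop(name)
        return self.timers[name]

    def log_all(self):
        for name, count in self.counters.items():
            self.logger.info("counter %s = %d", name, count)
        for name, value in self.values.items():
            self.logger.info("value %s = %r", name, value)
        for name, secs in self.timers.items():
            self.logger.info("timer %s = %.2f s", name, secs)

demo_metrics = SimpleMetrics(logging.getLogger("metrics_demo"))
demo_metrics.start('batch')
for item in [1, 2, 'bad']:
    demo_metrics.increment('records_processed')
    if isinstance(item, str):
        demo_metrics.increment('errors')
demo_metrics.stop('batch')
demo_metrics.log_all()
```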
  6. Hierarchical configuration:
from logging_metrics import setup_file_logging
import logging

# Main logger
main_logger = setup_file_logging("my_app", log_dir="./logs")

# Sub-loggers organized hierarchically
db_logger = logging.getLogger("my_app.database")
api_logger = logging.getLogger("my_app.api")
auth_logger = logging.getLogger("my_app.auth")

# Module-specific levels
db_logger.setLevel(logging.DEBUG)      # More verbose for DB
api_logger.setLevel(logging.INFO)      # Normal for API
auth_logger.setLevel(logging.WARNING)  # Only warnings/errors for auth

db_logger.debug("Querying the database")
db_logger.info("Query completed successfully")
db_logger.error("Error connecting to the database!")

api_logger.debug("Calling the API")
api_logger.info("API call completed successfully")
api_logger.error("Error calling the API")

auth_logger.debug("Authenticating")
auth_logger.info("Authentication completed successfully")
auth_logger.error("Auth error!")
  7. 📊 JSON format for observability:
from logging_metrics import setup_file_logging

# JSON logs for integration with ELK, Grafana, etc.
logger = setup_file_logging(
    logger_name="microservice",
    log_dir="./logs",
    json_format=True
)

logger.info("User logged in", extra={"user_id": 12345, "action": "login"})

# Example JSON output:
# {
#   "timestamp": "2024-08-05T10:30:00.123Z",
#   "level": "INFO",
#   "name": "microservice",
#   "message": "User logged in",
#   "module": "user-api",
#   "function": "<module>",
#   "line": 160,
#   "taskName": null,
#   "user_id": 12345,
#   "action": "login"
# }
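The `extra={...}` mechanics can be reproduced with a plain stdlib formatter. This `JsonFormatter` is an illustrative sketch (logging-metrics ships its own formatter when `json_format=True`); it relies on the fact that `extra` keys become non-reserved attributes on the LogRecord:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line (illustrative only)."""

    # Attributes every LogRecord has by default; anything else came from `extra=`
    RESERVED = set(logging.LogRecord("", 0, "", 0, "", (), None).__dict__)

    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "name": record.name,
            "message": record.getMessage(),
        }
        # Copy across the custom fields supplied via `extra=`
        for key, value in record.__dict__.items():
            if key not in self.RESERVED:
                payload[key] = value
        return json.dumps(payload)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
demo = logging.getLogger("json_demo")
demo.setLevel(logging.INFO)
demo.addHandler(handler)
demo.propagate = False
demo.info("User logged in", extra={"user_id": 12345})
```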

๐Ÿ† Best Practices

  1. Configure logging once at the start:
# In main.py or __init__.py
logger = setup_file_logging("my_app", log_dir="./logs")
  1. Use logger hierarchy:
# Organize by modules/features
db_logger = logging.getLogger("app.database")
api_logger = logging.getLogger("app.api")
  1. Different levels for console and file:
logger = setup_file_logging(
    console_level=logging.WARNING,  # Less verbose in console
    level=logging.DEBUG             # More detailed in the file
)
  1. Use LogTimer for critical operations:
with LogTimer(logger, "Complex query"):
    result = run_heavy_query()
  1. Monitor metrics in long processes:
metrics = LogMetrics(logger)
for batch in batches:
    with metrics.timer('batch_processing'):
        process_batch(batch)
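Different levels for console and file work because the logger's level and each handler's level filter independently: the logger lets a record through, then every handler applies its own threshold. A stdlib-only sketch of that split:

```python
import logging
import sys
import tempfile
from pathlib import Path

logger = logging.getLogger("levels_demo")
logger.setLevel(logging.DEBUG)           # the logger passes everything on

console = logging.StreamHandler(sys.stderr)
console.setLevel(logging.WARNING)        # terse console

log_file = Path(tempfile.mkdtemp()) / "app.log"
file_handler = logging.FileHandler(log_file)
file_handler.setLevel(logging.DEBUG)     # full detail on disk

logger.addHandler(console)
logger.addHandler(file_handler)

logger.debug("only in the file")
logger.warning("in both places")
```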

โŒ Avoid

  • Configuring loggers multiple times
  • Using print() instead of logger
  • Excessive logging in critical loops
  • Exposing sensitive information in logs
  • Ignoring log file rotation
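On avoiding excessive logging in critical loops: %-style lazy arguments and an `isEnabledFor` guard keep disabled log calls cheap. A small stdlib example (the function names are illustrative):

```python
import logging

logger = logging.getLogger("hot_loop")
logger.setLevel(logging.INFO)            # DEBUG is disabled here

def expensive_summary(batch):
    # Stands in for an expensive computation over the whole batch
    return sum(batch)

batch = list(range(1000))

# Bad: an f-string evaluates expensive_summary() even when DEBUG is off:
#   logger.debug(f"batch summary: {expensive_summary(batch)}")

# Better: the guard skips the expensive call entirely, and %-style
# arguments are only rendered if the record is actually emitted
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("batch summary: %s", expensive_summary(batch))
```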

🔧 Advanced Configuration

Example of a full configuration:

from logging_metrics import setup_file_logging
import logging

# Main configuration with all options
logger = setup_file_logging(
    logger_name="my_app",
    log_dir="./logs",
    level=logging.DEBUG,
    console_level=logging.INFO,
    rotation='time',
    log_format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
    date_format="%Y-%m-%d %H:%M:%S",
    max_bytes=50*1024*1024,  # 50MB
    backup_count=10,
    add_console=True
)

# Sub-module configuration
modules = ['database', 'api', 'auth', 'cache']
for module in modules:
    module_logger = logging.getLogger(f"my_app.{module}")
    module_logger.setLevel(logging.INFO)

🧪 Complete Example

import logging
from logging_metrics import setup_file_logging, LogTimer, LogMetrics

def main():
    # Initial configuration
    logger = setup_file_logging(
        logger_name="data_processor",
        log_dir="./logs",
        console_level=logging.INFO,
        level=logging.DEBUG
    )
    
    # Sub-loggers
    db_logger = logging.getLogger("data_processor.database")
    api_logger = logging.getLogger("data_processor.api")
    
    # Metrics
    metrics = LogMetrics(logger)
    
    logger.info("Application started")
    
    try:
        # Main processing with timing
        with LogTimer(logger, "Full processing"):
            metrics.start('total_processing')
            
            # Simulate processing
            for i in range(1000):
                metrics.increment('records_processed')
                
                if i % 100 == 0:
                    logger.info(f"Processed {i} records")
                
                # Simulate occasional error
                if i % 250 == 0:
                    metrics.increment('errors_recovered')
                    logger.warning(f"Recovered error at record {i}")
            
            metrics.stop('total_processing')
            metrics.log_all()
            
        logger.info("Processing successfully completed")
        
    except Exception as e:
        logger.error(f"Error during processing: {e}", exc_info=True)
        raise

if __name__ == "__main__":
    main()

🧪 Tests

The library has a complete test suite to ensure quality and reliability.

# Install development dependencies
pip install -e ".[dev]"

# Run all tests
make test

# Tests with coverage
make test-cov

Test structure:

test/
├── conftest.py
├── pytest.ini
├── test-requirements.txt
└── test_logging_metrics.py

โš™๏ธ CI/CD

This project uses GitHub Actions for continuous integration and delivery.

CI Workflow (ci.yml):

  • Runs on push and PR to main/master.
  • Steps:
    1. Install dependencies and package in editable mode.
    2. Lint code with ruff and black.
    3. Run tests with pytest and measure coverage.
    4. Fail build if coverage < 85%.
    5. Upload HTML coverage report and send to Codecov.

CD Workflow (publish-to-pypi.yml):

  • Triggered on push tags v*.*.*.
  • Steps:
    1. Build wheel and sdist.
    2. Check version tag matches pyproject.toml.

Run CI locally:

make test-ci     # Full pipeline
make test-local  # Install + tests with coverage

How to publish a new version

  1. Update the version in pyproject.toml (version field).
  2. Update the CHANGELOG with the release notes.
  3. Create and push the tag:
 git add .
 git commit -m "release: v0.1.0"
 git tag -a v0.1.0 -m "release: v0.1.0"
 git push origin v0.1.0

This automatically triggers the publish workflow, which builds the package and uploads it to PyPI.

🔧 Requirements

  • Python >= 3.9
  • Dependencies: pytz, pyspark

๐Ÿ“ Changelog

v0.2.2 (Current)

  • Initial stable version
  • LogTimer and LogMetrics
  • Spark integration
  • Colored logs
  • JSON log support
  • Fixed file rotation bug on Windows
  • Expanded documentation with more examples

๐Ÿค Contributing

  1. Fork the project
  2. Create your feature branch
  3. Commit your changes
  4. Push to your branch
  5. Open a Pull Request

📄 License

MIT License. See LICENSE for details.

Download files

  • Source distribution: logging_metrics-0.2.2.tar.gz (19.8 kB)
  • Built distribution: logging_metrics-0.2.2-py3-none-any.whl (13.3 kB)
