Advanced logging utilities for robust, standardized logs in Python projects, APIs, data engineering, and more.
logging-metrics
Utilities Library for Logging Configuration and Management
A library for configuring and managing logs in Python, focused on simplicity, performance, and observability, with support for PySpark integration.
📋 Table of Contents
- ✨ Features
- 📦 Installation
- 📋 Functions and Classes Overview
- 🚀 Quick Start
- 📚 Main Features
- ✅ Best Practices
- ❌ Avoid
- 🔧 Advanced Configuration
- 🧪 Complete Example
- 🧪 Tests
- ⚙️ CI/CD
- 🔧 Requirements
- 📝 Changelog
- 🤝 Contributing
- 📄 License
✨ Features
- 🎨 Colored logs for the terminal with different levels
- 🔄 Automatic file rotation by time or size
- ⚡ PySpark DataFrame integration
- 📊 JSON format for observability systems
- ⏱️ Timing with LogTimer
- 📈 Metrics monitoring with LogMetrics
- 🔧 Hierarchical logger configuration
- 🚀 Optimized performance for critical applications
📦 Installation
From PyPI:
pip install logging-metrics
For development:
git clone https://github.com/ThaissaTeodoro/logging-metrics.git
cd logging-metrics
pip install -e ".[dev]"
📋 Functions and Classes Overview

| Name | Type | Description |
|---|---|---|
| configure_basic_logging | Function | Configures the root logger for colored console logging. |
| setup_file_logging | Function | Configures a logger with rotating file output, optional console output, and optional JSON formatting. |
| LogTimer | Class | Context manager and decorator that logs the execution time of code blocks or functions. |
| log_spark_dataframe_info | Function | Logs schema, row count, sample rows, and statistics of a PySpark DataFrame. |
| LogMetrics | Class | Utility for collecting, incrementing, timing, and logging custom processing metrics. |
| get_logger | Function | Returns a logger with custom handlers and a caplog-friendly mode for pytest. |
🚀 Quick Start
import logging
from logging_metrics import setup_file_logging, LogTimer
logger = setup_file_logging(
    logger_name="my_app",
    log_dir="./logs",
    console_level=logging.INFO,
    level=logging.DEBUG
)

logger.info("Application started!")

with LogTimer(logger, "Critical operation"):
    # your code here
    pass
📚 Main Features
- Logging configuration:
import logging
from logging_metrics import configure_basic_logging
logger = configure_basic_logging()
logger.debug("Debug message") # Gray
logger.info("Info") # Green
logger.warning("Warning") # Yellow
logger.error("Error") # Red
logger.critical("Critical") # Bold red
- Automatic Log Rotation:
from logging_metrics import setup_file_logging
# Size-based rotation
logger = setup_file_logging(
    logger_name="app",
    log_dir="./logs",
    max_bytes=10*1024*1024,  # 10MB
    rotation='size'
)

# Time-based rotation
logger = setup_file_logging(
    logger_name="app",
    log_dir="./logs",
    rotation='time'
)
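Under the hood, rotation like this is typically built on the standard library's RotatingFileHandler and TimedRotatingFileHandler (the library's exact internals are not shown here). A stdlib-only sketch of size-based rollover, with a hypothetical helper name:

```python
import logging
import tempfile
from logging.handlers import RotatingFileHandler
from pathlib import Path

def make_rotating_logger(name: str, log_dir: str,
                         max_bytes: int = 1024, backup_count: int = 3) -> logging.Logger:
    """Create a logger whose file rolls over once it exceeds max_bytes."""
    handler = RotatingFileHandler(Path(log_dir) / f"{name}.log",
                                  maxBytes=max_bytes, backupCount=backup_count)
    handler.setFormatter(logging.Formatter("%(asctime)s - %(levelname)s - %(message)s"))
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.addHandler(handler)
    return logger

log_dir = tempfile.mkdtemp()
logger = make_rotating_logger("rotation_demo", log_dir)
for i in range(100):
    logger.info("filler message %d to force rollover", i)

# The directory now holds the live log plus up to backup_count rotated backups.
rotated = sorted(Path(log_dir).glob("rotation_demo.log*"))
```

Time-based rotation works the same way, swapping in TimedRotatingFileHandler with `when` and `interval` arguments.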
- Spark/Databricks Integration:
from pyspark.sql import SparkSession
from logging_metrics import configure_basic_logging, log_spark_dataframe_info
spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(1, "Ana"), (2, "Bruno")], ["id", "nome"])
logger = configure_basic_logging()
print("Logger:", logger)
log_spark_dataframe_info(df=df, logger=logger, name="spark_app")
logger.info("Spark processing started")
- ⏱️ Timing with LogTimer:
from logging_metrics import LogTimer, configure_basic_logging

logger = configure_basic_logging()

# As a context manager
with LogTimer(logger, "DB query"):
    logger.info("Test")

# As a decorator
@LogTimer.as_decorator(logger, "Data processing")
def process_data(data):
    return data.transform()
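For intuition, a timer with this dual context-manager/decorator behavior can be sketched with the stdlib's contextlib.ContextDecorator. This is an illustrative stand-in, not the library's actual LogTimer implementation:

```python
import logging
import time
from contextlib import ContextDecorator

class SimpleLogTimer(ContextDecorator):
    """Log the wall-clock duration of a block or function (illustrative only)."""

    def __init__(self, logger: logging.Logger, label: str):
        self.logger = logger
        self.label = label

    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, *exc):
        self.elapsed = time.perf_counter() - self.start
        self.logger.info("%s: finished in %.3fs", self.label, self.elapsed)
        return False  # never swallow exceptions from the timed block

log = logging.getLogger("timer_demo")

# As a context manager
with SimpleLogTimer(log, "sleep") as t:
    time.sleep(0.05)

# As a decorator (ContextDecorator provides this for free)
@SimpleLogTimer(log, "sum")
def compute():
    return sum(range(1000))

result = compute()
```

Inheriting from ContextDecorator is what lets one class serve both roles: the decorator form simply wraps the function call in `with self:`.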
- 📈 Metrics Monitoring:
from logging_metrics import LogMetrics, configure_basic_logging
import time

logger = configure_basic_logging()
metrics = LogMetrics(logger)

items = [10, 5, 80, 60, 'test1', 'test2']

# Start timer for the total operation
metrics.start('total_processing')

for item in items:
    # Increment the processed-records counter
    metrics.increment('records_processed')

    # Count errors (simulated: string items are treated as errors)
    if isinstance(item, str):
        metrics.increment('errors')

    # Simulate item processing
    time.sleep(0.1)

    # Custom value example
    metrics.set('last_item', item)

# Stop the timer
elapsed = metrics.stop('total_processing')

# Log all collected metrics
metrics.log_all()
# Output:
# --- Processing Metrics ---
# Counters:
# - records_processed: 6
# - errors: 2
# Values:
# - last_item: test2
# Completed timers:
# - total_processing: 0.60 seconds
- Hierarchical Configuration:
from logging_metrics import setup_file_logging
import logging
# Main logger
main_logger = setup_file_logging("my_app", log_dir="./logs")
# Sub-loggers organized hierarchically
db_logger = logging.getLogger("my_app.database")
api_logger = logging.getLogger("my_app.api")
auth_logger = logging.getLogger("my_app.auth")
# Module-specific configuration
db_logger.setLevel(logging.DEBUG) # More verbose for DB
api_logger.setLevel(logging.INFO) # Normal for API
auth_logger.setLevel(logging.WARNING) # Only warnings/errors for auth
db_logger.debug("Querying the database")
db_logger.info("Query completed successfully")
db_logger.error("Error connecting to the database!")
auth_logger.debug("Authenticating user")
auth_logger.info("Authentication completed successfully")
api_logger.debug("Querying the API")
api_logger.info("Query completed successfully")
api_logger.error("Error querying the API")
auth_logger.error("Auth error!")
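The reason the sub-loggers above need no handlers of their own is stdlib propagation: records travel up the dotted name hierarchy and are emitted by the handlers configured on the parent. A minimal stdlib illustration:

```python
import io
import logging

# One handler, configured once on the top-level "app" logger.
stream = io.StringIO()
parent = logging.getLogger("app")
parent.setLevel(logging.DEBUG)
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s:%(levelname)s:%(message)s"))
parent.addHandler(handler)

# The child has no handler; its records propagate up to "app".
child = logging.getLogger("app.database")
child.setLevel(logging.INFO)
child.info("connected")

output = stream.getvalue().strip()
print(output)  # app.database:INFO:connected
```

This is why configuring logging once at the top and using `logging.getLogger("app.module")` everywhere else is the recommended pattern.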
- 📊 JSON Format for Observability:
from logging_metrics import setup_file_logging
# JSON logs for integration with ELK, Grafana, etc.
logger = setup_file_logging(
    logger_name="microservice",
    log_dir="./logs",
    json_format=True
)
logger.info("User logged in", extra={"user_id": 12345, "action": "login"})
# Example JSON output:
# {
# "timestamp": "2024-08-05T10:30:00.123Z",
# "level": "INFO",
# "name": "microservice",
# "message": "User logged in",
# "module": "user-api",
# "function": "<module>",
# "line": 160,
# "taskName": null,
# "user_id": 12345,
# "action": "login"
# }
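A formatter with this behavior can be sketched using only the stdlib: subclass logging.Formatter, emit json.dumps, and copy across any non-standard record attributes supplied via `extra`. This is an illustrative stand-in, not the library's actual formatter:

```python
import io
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line (illustrative sketch)."""

    # Attributes present on every plain LogRecord; anything else came from `extra`.
    _STANDARD = set(logging.makeLogRecord({}).__dict__)

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "name": record.name,
            "message": record.getMessage(),
        }
        for key, value in record.__dict__.items():
            if key not in self._STANDARD:
                payload[key] = value
        return json.dumps(payload)

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("json_demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("User logged in", extra={"user_id": 12345, "action": "login"})
record = json.loads(stream.getvalue())
```

One JSON object per line ("ndjson") is what log shippers for ELK, Loki, and similar systems expect to ingest.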
✅ Best Practices
- Configure logging once at the start:
# In main.py or __init__.py
logger = setup_file_logging("my_app", log_dir="./logs")
- Use logger hierarchy:
# Organize by modules/features
db_logger = logging.getLogger("app.database")
api_logger = logging.getLogger("app.api")
- Different levels for console and file:
logger = setup_file_logging(
    console_level=logging.WARNING,  # Less verbose in the console
    level=logging.DEBUG  # More detailed in the file
)
- Use LogTimer for critical operations:
with LogTimer(logger, "Complex query"):
    result = run_heavy_query()
- Monitor metrics in long processes:
metrics = LogMetrics(logger)
for batch in batches:
    with metrics.timer('batch_processing'):
        process_batch(batch)
❌ Avoid
- Configuring loggers multiple times
- Using print() instead of logger
- Excessive logging in critical loops
- Exposing sensitive information in logs
- Ignoring log file rotation
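On excessive logging in critical loops: prefer lazy %-style arguments (`logger.debug("value: %s", obj)`) over f-strings, because the message is only formatted when a handler will actually emit it. A stdlib demonstration (the Expensive class is a contrived stand-in for an object with a costly string form):

```python
import logging

class Expensive:
    """Stand-in for an object whose string form is costly to compute."""
    calls = 0

    def __str__(self):
        Expensive.calls += 1
        return "expensive-repr"

log = logging.getLogger("lazy_demo")
log.setLevel(logging.WARNING)       # DEBUG is disabled
log.addHandler(logging.NullHandler())

obj = Expensive()

for _ in range(1000):
    log.debug("value: %s", obj)     # lazy: DEBUG is filtered early, __str__ never runs
lazy_calls = Expensive.calls        # 0

for _ in range(3):
    log.debug(f"value: {obj}")      # eager: the f-string calls __str__ every time
eager_calls = Expensive.calls       # 3
```

The lazy form pays essentially nothing for disabled levels, which is what makes debug logging in hot loops affordable.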
🔧 Advanced Configuration
Example of full configuration:
from logging_metrics import setup_file_logging, LogMetrics
import logging
# Main configuration with all options
logger = setup_file_logging(
    logger_name="my_app",
    log_dir="./logs",
    level=logging.DEBUG,
    console_level=logging.INFO,
    rotation='time',
    log_format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
    date_format="%Y-%m-%d %H:%M:%S",
    max_bytes=50*1024*1024,  # 50MB
    backup_count=10,
    add_console=True
)
# Sub-module configuration
modules = ['database', 'api', 'auth', 'cache']
for module in modules:
    module_logger = logging.getLogger(f"my_app.{module}")
    module_logger.setLevel(logging.INFO)
🧪 Complete Example
import logging
from logging_metrics import setup_file_logging, LogTimer, LogMetrics
def main():
    # Initial configuration
    logger = setup_file_logging(
        logger_name="data_processor",
        log_dir="./logs",
        console_level=logging.INFO,
        level=logging.DEBUG
    )

    # Sub-loggers
    db_logger = logging.getLogger("data_processor.database")
    api_logger = logging.getLogger("data_processor.api")

    # Metrics
    metrics = LogMetrics(logger)

    logger.info("Application started")

    try:
        # Main processing with timing
        with LogTimer(logger, "Full processing"):
            metrics.start('total_processing')

            # Simulate processing
            for i in range(1000):
                metrics.increment('records_processed')

                if i % 100 == 0:
                    logger.info(f"Processed {i} records")

                # Simulate an occasional recovered error
                if i % 250 == 0:
                    metrics.increment('errors_recovered')
                    logger.warning(f"Recovered error at record {i}")

            metrics.stop('total_processing')
            metrics.log_all()

        logger.info("Processing completed successfully")

    except Exception as e:
        logger.error(f"Error during processing: {e}", exc_info=True)
        raise

if __name__ == "__main__":
    main()
🧪 Tests
The library has a complete test suite to ensure quality and reliability.
# Install development dependencies
pip install -e ".[dev]"
# Run all tests
make test
# Tests with coverage
make test-cov
Test structure:
test/
├── conftest.py
├── pytest.ini
├── test-requirements.txt
└── test_logging_metrics.py
⚙️ CI/CD
This project uses GitHub Actions for continuous integration and delivery.
CI Workflow (ci.yml):
- Runs on push and pull requests to main/master.
- Steps:
  - Install dependencies and the package in editable mode.
  - Lint code with ruff and black.
  - Run tests with pytest and measure coverage.
  - Fail the build if coverage < 85%.
  - Upload the HTML coverage report and send it to Codecov.

CD Workflow (publish-to-pypi.yml):
- Triggered on pushing tags matching v*.*.*.
- Steps:
  - Build wheel and sdist.
  - Check that the version tag matches pyproject.toml.
Run CI locally:
make test-ci # Full pipeline
make test-local # Install + tests with coverage
How to publish a new version
- Update the version in pyproject.toml (version field).
- Update the CHANGELOG with the release notes.
- Create and push the tag:
git add .
git commit -m "release: v0.1.0"
git tag -a v0.1.0 -m "release: v0.1.0"
git push origin v0.1.0
This will automatically trigger the publish-to-pypi.yml workflow, which builds the package and uploads it to PyPI.
🔧 Requirements
- Python >= 3.9
- Dependencies: pytz, pyspark
📝 Changelog
v0.2.3 (Current)
- Initial stable version
- LogTimer and LogMetrics
- Spark integration
- Colored logs
- JSON log support
- Fixed file rotation bug on Windows
- Expanded documentation with more examples
🤝 Contributing
- Fork the project
- Create your feature branch
- Commit your changes
- Push to your branch
- Open a Pull Request
📄 License
MIT License. See LICENSE for details.