🚀 PipeLogger Library 🚀
Simplify the generation and management of logs in your data pipelines.
📖 What is PipeLogger?
PipeLogger is a library designed to standardize the creation of logs in data pipelines, providing a consistent format that facilitates problem identification and troubleshooting. With PipeLogger, you can manage detailed and structured logs, enabling more effective tracking of operations and deeper analysis of data ingestion processes.
🚀 Main features
- Log standardization: PipeLogger creates detailed logs that follow a consistent format, making them easy to read and analyze.
- Integration with Google Cloud Platform (GCP): Designed for pipelines deployed on GCP, supporting Cloud Functions and Cloud Run.
- BigQuery Table Monitoring: Logs and monitors the size of BigQuery tables over time.
- Storage in Google Cloud Storage: Automatically stores logs in a GCP bucket for centralized access and management.
🌟 Example of a Generated Log
PipeLogger creates logs in a clear and structured JSON format as follows:
```json
{
  "PipelineLogs": {
    "PipelineID": "Pipeline-Example",
    "Timestamp": "MM-DD-YY-THH:MM:SS",
    "Status": "Success",
    "Message": "Data uploaded successfully",
    "ExecutionTime": 20.5075738430023
  },
  "BigQueryLogs": [
    {
      "BigQueryID": "project.pipeline-example.table_1",
      "Size": 1555
    },
    {
      "BigQueryID": "project.pipeline-example.table_2",
      "Size": 3596
    }
  ],
  "Details": [
    {
      "additional_info": [
        "Data downloaded successfully",
        "Data processed successfully",
        "Data uploaded successfully"
      ]
    }
  ]
}
```
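Because the generated log is plain JSON, downstream tooling can consume it with Python's standard library alone. The sketch below parses a payload with the same shape as the example above and aggregates the recorded BigQuery table sizes; it assumes nothing beyond the field names shown:

```python
import json

# Example log payload (same shape as the document's sample log)
log_text = """
{
  "PipelineLogs": {"PipelineID": "Pipeline-Example", "Status": "Success"},
  "BigQueryLogs": [
    {"BigQueryID": "project.pipeline-example.table_1", "Size": 1555},
    {"BigQueryID": "project.pipeline-example.table_2", "Size": 3596}
  ]
}
"""

log = json.loads(log_text)

# Map each monitored table to its recorded size
sizes = {entry["BigQueryID"]: entry["Size"] for entry in log["BigQueryLogs"]}

# Total footprint of the monitored tables for this run
total_size = sum(sizes.values())
```

Comparing `total_size` (or the per-table entries) across runs is one way to track table growth over time, which is what the BigQuery monitoring feature records.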
💻 Implementation
📋 Prerequisites
Before implementing PipeLogger, make sure you meet the following requirements:
- The pipeline must be deployed on Google Cloud Platform (GCP), using Cloud Functions or Cloud Run.
- The pipeline must interact with BigQuery tables.
- A bucket on Google Cloud Storage is required to store the generated logs.
🛠️ How to Implement PipeLogger in your Pipeline
Follow the steps detailed in our Official Documentation to integrate PipeLogger into your pipeline projects.
🧑‍💻 Basic Usage Example
```python
from pipelogger import logsformatter
import time

# Initialize the log formatter
logger = logsformatter(
    pipeline_id="Pipeline-Example",
    table_ids=["project.pipeline-example.table_1", "project.pipeline-example.table_2"],
    project_id="your-gcp-project-id",
    bucket_name="your-gcs-bucket",
    folder_bucket="logs_folder"
)

# Simulate pipeline execution
start_time = time.time()

# Simulation of pipeline operations...

# Generate and upload logs
logger.generate_the_logs(
    execution_status="Success",
    msg="Data uploaded successfully",
    start_timer=start_time,
    logs_details=["Process completed without errors."]
)
```
📦 Installation
You can easily install PipeLogger from PyPI using pip:
```shell
pip install pipelogger
```
📚 Complete Documentation
For complete details on implementation, advanced configuration, and more usage examples, visit the Official Documentation.
🤝 Contribute
Contributions are welcome! If you have ideas or improvements, or have found a bug, please open an issue or submit a pull request in our GitHub repository.
📄 License
This project is licensed under the terms of the MIT License.
📧 Contact
If you have any questions, feel free to contact us through our GitHub page or send us an email.