Library for connecting Python/Django modules to Redshift
Weni Data Lake SDK
The Weni Data Lake SDK is a Python library that provides an interface to interact with Weni's data lake services. It supports operations for sending data, managing message templates, and handling traces.
Installation
pip install weni-datalake-sdk
If you use Poetry, add the package to your project with:
poetry add weni-datalake-sdk
Environment Variables
To insert data into the data lake, set the following environment variable:
DATALAKE_SERVER_ADDRESS=your_server_address
To get data from the data lake, you need to set the following environment variables:
REDSHIFT_QUERY_BASE_URL=your_redshift_url
REDSHIFT_SECRET=your_secret
REDSHIFT_ROLE_ARN=your_role_arn
MESSAGE_TEMPLATES_METRIC_NAME=your_metric_name (if you want to get message templates)
TRACES_METRIC_NAME=your_trace_metric_name (if you want to get traces)
EVENTS_METRIC_NAME=your_event_metric_name (if you want to get events)
You also need AWS credentials to get data from the data lake. You can provide them through the following environment variables:
AWS_ACCESS_KEY_ID=your_access_key_id
AWS_SECRET_ACCESS_KEY=your_secret_access_key
AWS_DEFAULT_REGION=your_region
Note that the SDK uses an assumed role to get data from the data lake.
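A missing variable usually shows up only when a call fails, so it can help to check the configuration once at startup. The sketch below is not part of the SDK: `missing_env_vars` is a hypothetical helper, and the grouping of names into write and read lists simply mirrors the lists above.

```python
import os

# Names copied from the lists above; the write/read grouping is an assumption.
WRITE_VARS = ["DATALAKE_SERVER_ADDRESS"]
READ_VARS = ["REDSHIFT_QUERY_BASE_URL", "REDSHIFT_SECRET", "REDSHIFT_ROLE_ARN"]


def missing_env_vars(required, environ=None):
    """Return the names in `required` that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars(WRITE_VARS + READ_VARS)
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
```

Passing a plain dict as `environ` makes the check easy to exercise in tests without touching the real environment.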
Usage Examples
1. Sending Data
from weni_datalake_sdk.clients.client import send_data
from weni_datalake_sdk.paths.your_path import YourPath
# Prepare your data
data = {
    "field1": "value1",
    "field2": "value2"
}
# Send data using a path class
send_data(YourPath, data)
# Or using an instantiated path
path = YourPath()
send_data(path, data)
2. Send Event Data
from weni_datalake_sdk.clients.client import send_event_data
from weni_datalake_sdk.paths.events_path import EventPath
# Prepare your data
data = {
    "event_name": "event_name",
    "key": "key",
    "value": "value",
    "value_type": "value_type",
    "date": "2021-01-01",
    "project": "project_uuid",
    "contact_urn": "contact_urn",
    "metadata": {
        "field1": "value1",
        "field2": "value2"
    }
}
# Send event data
send_event_data(EventPath, data)
3. Send Commerce Webhook Data
from weni_datalake_sdk.clients.client import send_commerce_webhook_data
from weni_datalake_sdk.paths.commerce_webhook import CommerceWebhookPath
from datetime import datetime
# Prepare your data (all fields are optional)
data = {
    "status": 1,
    "template": "template_name",
    "template_variables": {"foo": "bar"},
    "contact_urn": "whatsapp:+55123456789",
    "error": {"msg": "error"},
    "data": {"foo": "bar"},
    "date": datetime.now().isoformat(),
    "project": "your-project-uuid",
    "request": {"req": "value"},
    "response": {"res": "value"},
    "agent": "some-uuid"
}
# Send commerce webhook data
send_commerce_webhook_data(CommerceWebhookPath, data)
All fields are optional. For Struct fields, use dicts. For date, use an ISO string. If you don't want to send a field, omit it or set it to None.
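One way to follow the rule above is to build the payload through a small helper that omits unset fields and serializes dates. This is a minimal sketch, not SDK code: `build_payload` is a hypothetical name, and dropping `None` values is just one of the two options the text allows.

```python
from datetime import datetime, timezone


def build_payload(**fields):
    """Drop fields set to None and convert datetime values to ISO strings."""
    payload = {}
    for name, value in fields.items():
        if value is None:
            continue  # omit the field entirely
        if isinstance(value, datetime):
            value = value.isoformat()
        payload[name] = value
    return payload


data = build_payload(
    status=1,
    template="template_name",
    error=None,  # dropped from the payload
    date=datetime(2025, 6, 3, 12, 0, tzinfo=timezone.utc),
    request={"req": "value"},  # Struct fields stay as dicts
)
print(data)
```

The resulting dict can then be passed to `send_commerce_webhook_data` in place of a hand-written one.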
4. Get Message Templates
from weni_datalake_sdk.clients.redshift.message_templates import get_message_templates
# Get templates with specific parameters
result = get_message_templates(
    contact_urn="contact123",
    template_uuid="template_uuid"
)
5. Get Traces
from weni_datalake_sdk.clients.redshift.traces import get_traces
# Get traces with query parameters
result = get_traces(
    query_params={
        "message_uuid": "123e4567-e89b-12d3-a456-426614174000"
    }
)
6. Get Events
from weni_datalake_sdk.clients.redshift.events import get_events
# Get events with query parameters
result = get_events(
    query_params={
        "date_start": "2021-01-01",  # date_start is required
        "date_end": "2021-01-01",  # date_end is required
        "project": "project_uuid",  # project is optional
        "event_type": "event_type",  # event_type is optional
        "contact_urn": "contact_urn",  # contact_urn is optional
        "event_name": "event_name",  # event_name is optional
        "key": "key",  # key is optional
        "value": "value",  # value is optional
        "value_type": "value_type"  # value_type is optional
    }
)
7. Get Events Count
from weni_datalake_sdk.clients.redshift.events import get_events_count
# Get events count with required and optional parameters
result = get_events_count(
    project="your_project_uuid",  # project is required
    date_start="2025-06-03T00:00:00Z",  # date_start is required
    date_end="2025-07-30T23:59:59Z",  # date_end is required
    event_type="event_type",  # event_type is optional
    event_name="event_name",  # event_name is optional
    key="topics",  # key is optional
    value="value",  # value is optional
    value_type="value_type",  # value_type is optional
    contact_urn="contact_urn",  # contact_urn is optional
)
print(result)
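The `date_start` and `date_end` values in the example are UTC timestamps that cover whole days. A small helper can produce that shape from plain dates. This is only a sketch: `day_range_utc` is a hypothetical name, and it simply mirrors the format of the example values; whether the SDK accepts other formats is not stated here.

```python
from datetime import date, datetime, time, timezone


def day_range_utc(start, end):
    """Return (date_start, date_end) strings covering whole days in UTC."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    first = datetime.combine(start, time.min, tzinfo=timezone.utc)
    last = datetime.combine(end, time(23, 59, 59), tzinfo=timezone.utc)
    return first.strftime(fmt), last.strftime(fmt)


date_start, date_end = day_range_utc(date(2025, 6, 3), date(2025, 7, 30))
print(date_start, date_end)
```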
8. Get Events Count By Group
from weni_datalake_sdk.clients.redshift.events import get_events_count_by_group
# Get events count grouped by a metadata key
result = get_events_count_by_group(
    project="your_project_uuid",  # project is required
    date_start="2025-06-03T00:00:00Z",  # date_start is required
    date_end="2025-07-30T23:59:59Z",  # date_end is required
    metadata_key="topic_uuid",  # metadata_key is required
    event_type="event_type",  # event_type is optional
    event_name="event_name",  # event_name is optional
    key="topics",  # key is optional
    value="value",  # value is optional
    value_type="value_type",  # value_type is optional
    contact_urn="contact_urn",  # contact_urn is optional
    group_by="subtopic_uuid",  # group_by is optional
    metadata_value="uuid"  # metadata_value is optional
)
print(result)
If you don't pass a group_by value, the results are aggregated by value.
9. Get Events from silver tables
from weni_datalake_sdk.clients.redshift.events import get_events_silver
# Get events from a silver table
result = get_events_silver(
    project="your_project_uuid",  # project is required
    date_start="2025-06-03T00:00:00Z",  # date_start is required
    date_end="2025-07-30T23:59:59Z",  # date_end is required
    table="topics",  # table is required
    # ... other parameters are optional
)
print(result)
10. Get Events Count from silver tables
from weni_datalake_sdk.clients.redshift.events import get_events_silver_count
# Get events count from a silver table
result = get_events_silver_count(
    project="your_project_uuid",  # project is required
    date_start="2025-06-03T00:00:00Z",  # date_start is required
    date_end="2025-07-30T23:59:59Z",  # date_end is required
    table="topics",  # table is required
    # ... other parameters are optional
)
print(result)
11. Get Events Count from silver tables by group
from weni_datalake_sdk.clients.redshift.events import get_events_silver_count_by_group
# Get events count from a silver table, grouped by a metadata key
result = get_events_silver_count_by_group(
    project="your_project_uuid",  # project is required
    date_start="2025-06-03T00:00:00Z",  # date_start is required
    date_end="2025-07-30T23:59:59Z",  # date_end is required
    table="topics",  # table is required
    # ... other parameters are optional
)
print(result)
The valid tables are: "topics", "weni_csat", "weni_nps", "conversation_classification", "conversion_lead"
This function is used to get events from silver tables. You can use the same parameters as the get_events function.
Don't forget to set the following environment variables to get silver data:
EVENTS_SILVER_METRIC_NAME
EVENTS_SILVER_COUNT_METRIC_NAME
EVENTS_SILVER_COUNT_BY_GROUP_METRIC_NAME
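Since the `table` argument only accepts the names listed above, a typo is easier to catch before the request is sent. The following is a minimal sketch, not SDK behavior: `check_silver_table` is a hypothetical helper, and the tuple is copied from the list of valid tables in this section.

```python
# Table names copied from the list of valid tables above.
VALID_SILVER_TABLES = (
    "topics",
    "weni_csat",
    "weni_nps",
    "conversation_classification",
    "conversion_lead",
)


def check_silver_table(table):
    """Return `table` unchanged if it is a listed name, else raise ValueError."""
    if table not in VALID_SILVER_TABLES:
        valid = ", ".join(VALID_SILVER_TABLES)
        raise ValueError(f"Unknown silver table {table!r}; expected one of: {valid}")
    return table
```

A call such as `get_events_silver(..., table=check_silver_table("topics"))` would then fail early on a misspelled name.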
Error Handling
The SDK includes proper error handling. Always wrap your calls in try-except blocks:
from weni_datalake_sdk.clients.redshift.message_templates import get_message_templates
try:
    result = get_message_templates(template_uuid="template_uuid")
except Exception as e:
    print(f"Error: {e}")
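When many calls need the same treatment, the try-except pattern can be factored into one wrapper that logs the failure and returns a fallback value. This is a sketch under assumptions: `call_safely` is a hypothetical helper, and `failing_query` is a stand-in used only so the example runs without the SDK or a data lake connection.

```python
import logging

logger = logging.getLogger("datalake")


def call_safely(func, *args, default=None, **kwargs):
    """Call `func`; on any exception, log it and return `default`."""
    try:
        return func(*args, **kwargs)
    except Exception:
        logger.exception("Call to %s failed", getattr(func, "__name__", func))
        return default


# Stand-in for an SDK call, used only to make the sketch runnable.
def failing_query(**params):
    raise RuntimeError("connection refused")


result = call_safely(failing_query, default=[], contact_urn="contact123")
print(result)
```

Returning a fallback hides the failure from the caller, so this fits read paths where an empty result is acceptable; for writes it is usually safer to let the exception propagate.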
Best Practices
- Environment Variables: Always ensure all required environment variables are set before using the SDK.
- Path Validation: Use proper path classes instead of raw strings.
- Error Handling: Implement proper error handling in your code.
- Data Types: Ensure you're passing the correct data types for each parameter.
- Security: Never hardcode sensitive information like tokens or credentials.
Common Issues and Solutions
- Connection Issues
  - Ensure DATALAKE_SERVER_ADDRESS is correct and accessible
  - Check your network connectivity
- Authentication Errors
  - Verify your AWS credentials are properly configured
  - Check if REDSHIFT_SECRET and REDSHIFT_ROLE_ARN are correct
- Missing Environment Variables
  - Double-check all required environment variables are set
  - Use a .env file for local development
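For local development, a .env file can be loaded with a few lines of standard-library code. The sketch below is an illustration, not part of the SDK: `parse_env_lines` and `load_env_file` are hypothetical helpers that handle only simple KEY=value lines, and a dedicated library such as python-dotenv covers more of the format.

```python
import os


def parse_env_lines(lines):
    """Parse KEY=value lines, skipping blanks and lines starting with '#'."""
    values = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path=".env"):
    """Copy variables from `path` into os.environ without overriding existing ones."""
    with open(path, encoding="utf-8") as handle:
        for key, value in parse_env_lines(handle).items():
            os.environ.setdefault(key, value)
```

Using `setdefault` means variables already set in the shell take precedence over the file.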
Contributing
For contributing to this SDK, please follow these steps:
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.