Various utilities for IBIS applications in data science and engineering
Project description
i38e-utils
i38e-utils is a collection of utility functions and classes that I use in my projects. It is a work in progress and will be updated as I add more functionality.
Currently, it includes the following:
- DfHelper: A class that provides a set of utility functions for working with pandas DataFrames.
- GeoPyHelper: A class that provides a set of utility functions for working with GeoPy.
- OsmxHelper: A class that provides a set of utility functions for working with OSMnx.
- data_utils: A set of utility functions for working with data.
- date_utils: A set of utility functions for working with dates.
- df_utils: A set of utility functions for working with pandas DataFrames.
- file_utils: A set of utility functions for working with files.
- log_utils: A set of utility functions for working with logs.
Installation
Install the package from PyPI with pip:
pip install i38e-utils
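After installing, it can be useful to confirm which version is present in the active environment. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of i38e-utils.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


# Prints the version string if the package is installed, otherwise None.
print(installed_version("i38e-utils"))
```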
Usage
DfHelper: Dataframe Helper Class
Scenarios:
- Connect to a database table through Django's ORM connection, query and transform the data, and convert it to a pandas DataFrame.
import numpy as np
import pandas as pd
from i38e_utils import DfHelper

# Maps source column names to the column names used in the resulting DataFrame.
phone_mobile_gps_fields = {
    'id_tracking': 'id',
    'id_producto': 'product_id',
    'pk_empleado': 'associate_id',
    'latitud': 'latitude',
    'longitud': 'longitude',
    'fecha_hora_servidor': 'server_dt',
    'fecha_hora': 'date_time',
    'accion': 'action',
    'descripcion': 'description',
    'imei': 'imei'
}


class GpsCube(DfHelper):
    df: pd.DataFrame = None
    live: bool = False
    save_parquet = True
    config = {
        'connection_name': 'replica',
        'table': 'asm_tracking_movil_gps',
        'field_map': phone_mobile_gps_fields,
        'legacy_filters': True,
    }

    def __init__(self, **opts):
        # Options passed by the caller override the class-level defaults.
        config = {**self.config, **opts}
        super().__init__(**config)

    def load(self, **kwargs):
        self.df = super().load(**kwargs)
        self.fix_data()
        return self.df

    def fix_data(self):
        # Cast the coordinate columns to float64.
        self.df['latitude'] = self.df['latitude'].astype(np.float64)
        self.df['longitude'] = self.df['longitude'].astype(np.float64)

```python
gps_cube = GpsCube(live=True, debug=False)
df = gps_cube.load(date_time__date='2023-03-04')
# to save to a parquet file
gps_cube.save_parquet(df, parquet_full_path='gpscube.parquet')
```
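The `GpsCube` class above relies on two plain-Python idioms: merging caller options over class-level defaults, and a `field_map` dictionary of column names. The sketch below illustrates both on ordinary dictionaries. It assumes that `field_map` renames source columns to target names; `merge_config`, `rename_fields`, and `DEFAULTS` are hypothetical names for illustration and not part of i38e-utils.

```python
# Hypothetical stand-ins for illustration; not part of i38e-utils.
DEFAULTS = {"connection_name": "replica", "live": False}


def merge_config(defaults, **opts):
    """Return a new dict in which caller options override the defaults."""
    # Later keys win in a dict display, so opts take precedence over defaults.
    return {**defaults, **opts}


def rename_fields(row, field_map):
    """Rename the keys of one record; keys missing from field_map are kept."""
    return {field_map.get(key, key): value for key, value in row.items()}


config = merge_config(DEFAULTS, live=True, debug=False)
row = rename_fields(
    {"latitud": "9.93", "longitud": "-84.08", "imei": "x"},
    {"latitud": "latitude", "longitud": "longitude"},
)
print(config)
print(row)
```

Because the merge builds a new dictionary, the class-level defaults are left untouched and can be reused by other instances.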
- Use a Parquet storage file or folder structure to load data and perform some transformations.
import pandas as pd
from i38e_utils import DfHelper


class GpsParquetCube(DfHelper):
    df: pd.DataFrame = None
    config = {
        'df_as_dask': True,
        'parquet_storage_path': '/storage/data/parquet/gps_cube_data/',
        'parquet_start_date': '2024-01-01',
        'parquet_end_date': '2024-03-31',
    }

    def __init__(self, **opts):
        # Options passed by the caller override the class-level defaults.
        config = {**self.config, **opts}
        super().__init__(**config)

    def load(self, **kwargs):
        self.df = super().load(**kwargs)
        return self.df


# This would load all the parquet files in the folder structure that match the
# date range and return a single dask dataframe for employee_id 27 for the
# month of January.
params = {
    'employee_id': 27,
    'date_time__date__range': ['2024-01-01', '2024-01-31']
}
dask_df = GpsParquetCube().load(**params)
# Materialize the dask dataframe as a pandas DataFrame.
df = dask_df.compute()
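The filter keys above follow Django's double-underscore lookup convention, for example `date_time__date__range`. The following sketch shows how such a key can be split and applied to in-memory records. It is an illustration under stated assumptions, not how DfHelper implements filtering: `split_lookup` and `matches` are hypothetical helpers, only exact match, `__date`, and `__range` are handled, and filter values for `__date` are assumed to be ISO date strings.

```python
from datetime import date, datetime


def split_lookup(key):
    """Split 'date_time__date__range' into ('date_time', ['date', 'range'])."""
    field, *lookups = key.split("__")
    return field, lookups


def matches(record, key, expected):
    """Check one record against one filter (exact, __date and __range only)."""
    field, lookups = split_lookup(key)
    value = record[field]
    if lookups and lookups[0] == "date":
        value = value.date()           # reduce the datetime to its date part
        lookups = lookups[1:]
        parse = date.fromisoformat     # filter values are ISO date strings
    else:
        parse = lambda v: v            # compare filter values unchanged
    if lookups == ["range"]:
        low, high = (parse(v) for v in expected)
        return low <= value <= high    # inclusive on both ends
    if not lookups:
        return value == parse(expected)
    raise ValueError(f"unsupported lookup: {key}")


records = [
    {"employee_id": 27, "date_time": datetime(2024, 1, 15, 8, 30)},
    {"employee_id": 27, "date_time": datetime(2024, 2, 1, 9, 0)},
    {"employee_id": 31, "date_time": datetime(2024, 1, 20, 10, 0)},
]
params = {
    "employee_id": 27,
    "date_time__date__range": ["2024-01-01", "2024-01-31"],
}
# A record is selected only if it satisfies every filter.
selected = [r for r in records if all(matches(r, k, v) for k, v in params.items())]
print(selected)
```

Only the first record satisfies both filters: the second falls outside January and the third belongs to a different employee.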
Download files
File details
Details for the file i38e_utils-1.0.3.tar.gz.
File metadata
- Download URL: i38e_utils-1.0.3.tar.gz
- Upload date:
- Size: 23.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.5.1 CPython/3.11.2 Darwin/23.2.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b9262e76754d6122d5e3c5cb339a31b565a238959b35665559702401ca914355 |
| MD5 | febad148d753039f2b34f9ba58a24194 |
| BLAKE2b-256 | eb2cad7c54d0dcc0fc255e559c59bdb0fc51125e107b81b6156d0add5262048e |
File details
Details for the file i38e_utils-1.0.3-py3-none-any.whl.
File metadata
- Download URL: i38e_utils-1.0.3-py3-none-any.whl
- Upload date:
- Size: 27.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.5.1 CPython/3.11.2 Darwin/23.2.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 36911a1d6c0b870a6b034a031cea3f68938abb6575feba57dd2f1f55b040e25a |
| MD5 | d0d908aaf58c167f0d6f886e2ea31f7e |
| BLAKE2b-256 | a706ce48959c7cb30a25480228ab734ecdf62a4d25da0864d6e810d2ddb0959d |
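The SHA256 digests listed for each file can be used to check a download before installing it. The sketch below computes a file's SHA256 with the standard library and compares it to an expected value; `sha256_of` and `verify` are hypothetical helper names, and the file path in the usage note is assumed to point at a locally downloaded copy.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Return True if the file's SHA256 matches the expected hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `verify("i38e_utils-1.0.3-py3-none-any.whl", "36911a1d6c0b870a6b034a031cea3f68938abb6575feba57dd2f1f55b040e25a")` should return `True` for an unmodified copy of the wheel.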