mlplatformutils
mlplatformutils package for observability and ML Pipeline Processing
Structure
.
|-- LICENSE.txt
|-- README.rst
|-- setup.cfg
|-- setup.py
|-- src
|   |-- mlplatformutils
|   |   |-- __init__.py
|   |   |-- core
|   |   |   |-- __init__.py
|   |   |   |-- sparkcoreutils.py
|   |   |   |-- sparkutils.py
|   |   |   |-- platformutils.py
|   |   |   |-- pandascoreutils.py
|   |   |   |-- pandasutils.py
|   |   |   |-- lineagegraph.py
|   |   |   |-- app_insights_logger.py
|-- tests
|   |-- __init__.py
|   |-- core
|   |   |-- __init__.py
|   |   |-- sparkcoreutils.py
|   |   |-- sparkutils.py
|   |   |-- platformutils.py
|   |   |-- pandascoreutils.py
|   |   |-- pandasutils.py
|   |   |-- lineagegraph.py
|   |   |-- app_insights_logger.py
Instructions
Install twine - twine is a utility package used for publishing Python packages on PyPI
python -m pip install twine
Build Package - create the source distribution of the package
python setup.py sdist
Upload Package to PyPI
python -m twine upload dist/*
Description
app_insights_logger - Contains telemetrylogger Class with Functions to Manage and Log Telemetry into Azure Application Insights
lineagegraph - Contains LineageGraph Class with functions to manage Graph on Azure Cosmos DB enabled with Gremlin
platformutils - Contains platform utility functions to check and install dependencies and to check Azure ML Compute
- is_package_installed
- install_pip
- get_environment
- set_environment
- assert_amlcompute
- read_setup_ini
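As a rough illustration of what a check such as is_package_installed does, the sketch below uses only the standard library. The function name is_package_installed_sketch is hypothetical, and this is an assumption about the general technique, not the package's actual implementation.

```python
import importlib.util


def is_package_installed_sketch(package_name: str) -> bool:
    """Return True if `package_name` can be located in this environment.

    Hypothetical stand-in, not the mlplatformutils implementation.
    """
    try:
        # find_spec returns None when a top-level module cannot be found
        return importlib.util.find_spec(package_name) is not None
    except (ImportError, ValueError):
        # Raised for malformed names or when a parent package is missing
        return False


print(is_package_installed_sketch("json"))
print(is_package_installed_sketch("surely_not_a_real_package_xyz"))
```

The check locates the module without importing it, so it has no import side effects.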
sparkutils - Contains functions to read data from sources (Azure Data Lake Gen2, Azure Data Explorer (Kusto), Azure SQL Server) and to write data (Azure Data Lake Gen2) with integrated Lineage Graph Logging.
- read_from_adls_gen2
- write_to_adls_gen2
- read_from_kusto
- read_from_azsql
sparkcoreutils - Contains functions to read data from sources (Azure Data Lake Gen2, Azure Data Explorer (Kusto), Azure SQL Server) and to write data (Azure Data Lake Gen2) without integrated Lineage Graph Logging.
- read_from_adls_gen2
- write_to_adls_gen2
- read_from_kusto
- read_from_azsql
pandasutils - Contains functions to read data from Azure Data Lake Gen2 (in Delta or Parquet format) into a pandas DataFrame without Spark, and to write a pandas DataFrame back to Azure Data Lake Gen2 as a Parquet file, with integrated Lineage Graph Logging.
- read_from_delta_as_pandas
- read_parquet_file_from_adlsgen2_as_pandas
- read_parquet_directory_from_adlsgen2_as_pandas
- write_pandas_as_parquet_file_to_adlsgen2
pandascoreutils - Contains functions to read data from Azure Data Lake Gen2 (in Delta or Parquet format) into a pandas DataFrame without Spark, and to write a pandas DataFrame back to Azure Data Lake Gen2 as a Parquet file, without integrated Lineage Graph Logging.
- read_from_delta_as_pandas
- read_parquet_file_from_adlsgen2_as_pandas
- read_parquet_directory_from_adlsgen2_as_pandas
- write_pandas_as_parquet_file_to_adlsgen2
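The split between the "core" modules and their lineage-aware counterparts can be pictured as a thin wrapper around a plain I/O function. The sketch below is only an illustration of that idea: with_lineage, core_read, lineage_events and the sample path are all hypothetical names, an in-memory list stands in for the Cosmos DB graph, and nothing here reflects how the package is actually wired internally.

```python
from functools import wraps
from typing import Callable, Dict, List

# In-memory stand-in for the lineage graph (hypothetical; the real package
# manages its graph on Azure Cosmos DB through the LineageGraph class).
lineage_events: List[Dict[str, str]] = []


def with_lineage(operation: str) -> Callable:
    """Wrap a core I/O function so each successful call records an event."""
    def decorator(core_func: Callable) -> Callable:
        @wraps(core_func)
        def wrapper(path: str, *args, **kwargs):
            result = core_func(path, *args, **kwargs)
            # Record the event only after the core call has succeeded
            lineage_events.append({"operation": operation, "path": path})
            return result
        return wrapper
    return decorator


def core_read(path: str) -> List[int]:
    """Hypothetical core reader: returns dummy rows and logs nothing."""
    return [1, 2, 3]


# Lineage-aware variant built from the core function
read_with_lineage = with_lineage("read")(core_read)

rows = read_with_lineage("container/folder/data.parquet")
print(rows)
print(lineage_events)
```

Keeping the core function free of logging lets callers choose either variant without duplicating the I/O logic.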
Examples
# Check whether a package is installed
from mlplatformutils.core.platformutils import is_package_installed
print(is_package_installed("pandas"))
# Telemetry and lineage graph classes
from mlplatformutils.core.app_insights_logger import telemetrylogger
from mlplatformutils.core.lineagegraph import LineageGraph
# Readers and writers with integrated Lineage Graph Logging
from mlplatformutils.core.sparkutils import write_to_adls_gen2, read_from_adls_gen2
from mlplatformutils.core.pandasutils import write_pandas_as_parquet_file_to_adlsgen2, read_parquet_directory_from_adlsgen2_as_pandas
# Core variants without lineage logging. They share the same function names,
# so if both variants are imported into one namespace the later import wins.
from mlplatformutils.core.sparkcoreutils import write_to_adls_gen2, read_from_adls_gen2
from mlplatformutils.core.pandascoreutils import write_pandas_as_parquet_file_to_adlsgen2, read_parquet_directory_from_adlsgen2_as_pandas
# Print the package version
import mlplatformutils.core.version as vr
print(vr.version)
Notes
When running this lineage package from a Jupyter Notebook, the three lines below help overcome the error "RuntimeError: Cannot run the event loop while another loop is running":
import asyncio
import nest_asyncio
nest_asyncio.apply()
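For context, the error can be reproduced with the standard library alone. In the minimal sketch below, the coroutine outer plays the role of a notebook cell in which an event loop is already running; the names inner and outer are illustrative only. Starting a second loop from inside it raises the RuntimeError, and this kind of nesting is what nest_asyncio.apply() patches asyncio to permit.

```python
import asyncio


async def inner() -> int:
    return 42


async def outer():
    # A loop is already running here, as it is inside a Jupyter Notebook.
    second_loop = asyncio.new_event_loop()
    coro = inner()
    try:
        # Attempting to drive a second loop from within a running one
        second_loop.run_until_complete(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return str(exc)
    finally:
        second_loop.close()
    return None


message = asyncio.run(outer())
print(message)
```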