Logging for pandas dataframes

pdlog

pdlog provides logging for pandas DataFrames to help you monitor and debug your data pipelines.

For example:

>>> import pdlog
>>> df = df.log.dropna()
2020-05-26 20:55:30,049 INFO <pdlog> dropna: dropped 1 row (17%), 5 rows remaining
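
The message reports how many rows the call removed, what share of the original rows that is, and how many rows are left. The arithmetic can be sketched as below. This is a minimal illustration, not pdlog's implementation; the function names `describe_drop` and `_rows` are hypothetical.

```python
def _rows(n: int) -> str:
    """Return 'row' or 'rows' depending on the count (hypothetical helper)."""
    return "row" if n == 1 else "rows"


def describe_drop(rows_before: int, rows_after: int) -> str:
    """Build a one-line summary in the style of the message shown above."""
    dropped = rows_before - rows_after
    # Guard against an empty input frame to avoid dividing by zero.
    share = dropped / rows_before if rows_before else 0.0
    return (
        f"dropped {dropped} {_rows(dropped)} ({share:.0%}), "
        f"{rows_after} {_rows(rows_after)} remaining"
    )


# Six rows going down to five matches the figures in the sample message.
print(describe_drop(6, 5))
```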

Example

The above assumes that the logging module has been configured and that data has been loaded into a pandas DataFrame. Let's walk through those steps with a simple example.

  1. Configure logging:

    >>> import logging
    >>> fmt = "{asctime} {levelname} <{name}> {message}"
    >>> logging.basicConfig(format=fmt, style="{", level=logging.INFO)
    
  2. Load data into a pandas.DataFrame:

    >>> import pandas as pd
    >>> df = pd.DataFrame([0, 1, 2, None, 4])
    >>> df.head()
         0
    0  0.0
    1  1.0
    2  2.0
    3  NaN
    4  4.0
    
  3. Import pdlog and call a method under the log accessor:

    >>> import pdlog
    >>> df = df.log.dropna()
    2020-05-26 20:55:30,049 INFO <pdlog> dropna: dropped 1 row (20%), 4 rows remaining
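
The `<pdlog>` field in the output is the logger name, so the messages can also be addressed through the standard `logging` module instead of `basicConfig`. The sketch below attaches a dedicated handler to that logger. It uses an in-memory stream as a stand-in for a real log sink and emits a record by hand to check the wiring, so it runs without pdlog installed. It is one possible setup, not a configuration that pdlog requires.

```python
import io
import logging

# Assumption: pdlog logs under the name "pdlog", as the "<pdlog>" field suggests.
pdlog_logger = logging.getLogger("pdlog")

stream = io.StringIO()  # stand-in for a file or another log sink
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("{asctime} {levelname} <{name}> {message}", style="{")
)
pdlog_logger.addHandler(handler)
pdlog_logger.setLevel(logging.INFO)

# Emit a record by hand to verify the handler; pdlog itself is not involved here.
pdlog_logger.info("dropna: dropped 1 row (17%), 5 rows remaining")

captured = stream.getvalue()
print(captured, end="")
```

Raising the level on this logger, for example to `logging.WARNING`, would silence INFO messages sent to it without affecting other loggers.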
    

Supported methods

pdlog currently supports the following pandas.DataFrame methods:

  • Filter rows and select columns:
    • drop_duplicates
    • drop
    • dropna
    • head
    • query
    • sample
    • tail
  • (Re-)set indexes:
    • reset_index
    • set_index
  • Rename indexes:
    • rename
  • Reshape:
    • melt
    • pivot
  • Impute:
    • bfill
    • ffill
    • fillna
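
Each of these methods is called under the `log` accessor and returns the same result as the plain pandas method. As a rough illustration of the pattern, the sketch below registers a small accessor with pandas' public extension API and wraps a single method. It is not pdlog's source code: the accessor name `demo_log`, the logger name, and the class are hypothetical, and the `inplace=True` case is not handled.

```python
import logging

import pandas as pd

logger = logging.getLogger("demo_log")  # hypothetical logger name


@pd.api.extensions.register_dataframe_accessor("demo_log")  # hypothetical accessor name
class DemoLogAccessor:
    """Wrap a DataFrame method and log how it changed the row count."""

    def __init__(self, df: pd.DataFrame) -> None:
        self._df = df

    def dropna(self, **kwargs) -> pd.DataFrame:
        before = len(self._df)
        result = self._df.dropna(**kwargs)  # delegate to the pandas method
        logger.info(
            "dropna: dropped %d rows, %d rows remaining",
            before - len(result),
            len(result),
        )
        return result


df = pd.DataFrame([0, 1, 2, None, 4])
out = df.demo_log.dropna()
print(len(out))
```

Because the wrapper returns an ordinary DataFrame, further calls can be chained on the result in the usual way.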

Related Work

pandas-log

pandas-log is aimed at interactive usage. Its messages are friendlier and more verbose than pdlog's aim to be. Ideally, each pdlog message is a single line of dense information that helps you check whether your production code is doing what you think it is, without flooding your logs. Such messages tend not to be particularly friendly.

tidylog

pdlog can be considered a port of tidylog (R package) to pandas. Their goals align with ours, and we think they've done a great job at reaching those goals.
