PySpark DataFrame Analyzer - Smartest DataFrame Analysis

Project description

DFAnalyzer

DFAnalyzer Python is a Python package for data analysis, built on top of the popular DFAnalyzer for Excel. It provides a powerful set of tools for importing, exploring, cleaning, transforming, and visualizing data. It also offers features such as filtering, sorting, grouping, and performing calculations on data. DFAnalyzer Python is designed to enable users to quickly and easily analyze large amounts of data and extract meaningful insights.

  • Find details and insights about each column.
  • Makes it easy to run analysis cycles over PySpark DataFrames.
  • Percentage statistics for NaN, blank, and null values.
  • Describes the data types of a PySpark DataFrame.
  • Helps with a proof of concept (POC) on data.
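
To make the three kinds of missing value in the list above concrete, here is a minimal plain-Python sketch of the per-column percentages. It is not DFAnalyzer's implementation and does not use PySpark; the function name `missing_value_stats`, the result keys, and the row format (a list of dicts) are assumptions made for illustration only.

```python
import math


def missing_value_stats(rows, column):
    """Return the percentage of null, NaN, and blank values in one column.

    Hypothetical helper for illustration; `rows` is a list of dicts.
    A row that lacks the column entirely is counted as null.
    """
    total = len(rows)
    if total == 0:
        return {"null_pct": 0.0, "nan_pct": 0.0, "blank_pct": 0.0}

    nulls = nans = blanks = 0
    for row in rows:
        value = row.get(column)
        if value is None:
            nulls += 1  # null: the value is missing entirely
        elif isinstance(value, float) and math.isnan(value):
            nans += 1  # NaN: a float that is "not a number"
        elif isinstance(value, str) and value.strip() == "":
            blanks += 1  # blank: empty or whitespace-only string

    return {
        "null_pct": 100.0 * nulls / total,
        "nan_pct": 100.0 * nans / total,
        "blank_pct": 100.0 * blanks / total,
    }


rows = [
    {"name": "a", "score": 1.0},
    {"name": "", "score": float("nan")},
    {"name": None, "score": None},
    {"name": "d", "score": 4.0},
]
name_stats = missing_value_stats(rows, "name")
score_stats = missing_value_stats(rows, "score")
```

The three categories are kept separate because they usually call for different cleaning steps: a null is an absent value, a NaN is a numeric placeholder, and a blank is a string that carries no text.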

Who Should Use DFAnalyzer

  • Developers working with big data.
  • Developers using PySpark for data exploration.
  • Developers who need to run a POC over raw data.

Usage

PySpark

Install the DFAnalyzer package with pip, then import it and call it on an existing PySpark DataFrame:

  1. Install the package:

    pip install dfanalyzer
    
  2. Import it:

    import DFAnalyzer as dfa
    
  3. Use it on an existing PySpark DataFrame:

    # Flag order: [isHavingNullData, %NullData, isHavingNanValues, %NanValues,
    #              isHavingBlankValues, %BlankValues, DataType]
    options = [1, 1, 1, 1, 1, 1, 1]  # one flag per kind of analysis you need
    dfa.analyze(df, options)  # df is an existing PySpark DataFrame
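
Because the options list is positional, it is easy to set the wrong flag. The sketch below builds the list from flag names, using the order given in the comment of the snippet above. `FLAG_ORDER` and `build_options` are hypothetical names and are not part of DFAnalyzer. That a 0 switches the corresponding analysis off is also an assumption, since the example above only shows every flag set to 1.

```python
# Flag order copied from the comment in the usage snippet.
FLAG_ORDER = [
    "isHavingNullData",
    "%NullData",
    "isHavingNanValues",
    "%NanValues",
    "isHavingBlankValues",
    "%BlankValues",
    "DataType",
]


def build_options(enabled):
    """Return a 0/1 list in FLAG_ORDER (hypothetical helper, not part of DFAnalyzer).

    Assumes 1 means "run this analysis" and 0 means "skip it".
    """
    unknown = set(enabled) - set(FLAG_ORDER)
    if unknown:
        raise ValueError(f"Unknown flags: {sorted(unknown)}")
    return [1 if name in enabled else 0 for name in FLAG_ORDER]


# Request only the percentage of nulls, the percentage of NaNs, and the data type.
options = build_options({"%NullData", "%NanValues", "DataType"})
```

The resulting list can then be passed as the second argument in step 3, in place of the hand-written one.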
    

More features are coming. Stay tuned.

Download files

Source Distribution

dfanalyzer-0.0.4.tar.gz (3.1 kB)

Uploaded Source

Built Distribution

dfanalyzer-0.0.4-py3-none-any.whl (3.5 kB)

Uploaded Python 3

File details

Details for the file dfanalyzer-0.0.4.tar.gz.

File metadata

  • Download URL: dfanalyzer-0.0.4.tar.gz
  • Upload date:
  • Size: 3.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.8

File hashes

Hashes for dfanalyzer-0.0.4.tar.gz
Algorithm    Hash digest
SHA256       9227c71619e1f6ffb579c080c5cc079cdeea7b29ceaff0d146437a750984c8f6
MD5          b6c369d735022f8392e9a28faaaa6cab
BLAKE2b-256  e196a7fd59b6041305ea52f36b39d6083d478381412ad43ff940af9eef85464f
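
A downloaded file can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. The following is a minimal sketch; the helper names `sha256_of_file` and `matches_published_hash` are assumptions introduced here, and the file path is whatever location you saved the download to.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published one, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the source distribution, the call would look like `matches_published_hash("dfanalyzer-0.0.4.tar.gz", "9227c71619e1f6ffb579c080c5cc079cdeea7b29ceaff0d146437a750984c8f6")`, assuming the file sits in the current directory.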

File details

Details for the file dfanalyzer-0.0.4-py3-none-any.whl.

File metadata

  • Download URL: dfanalyzer-0.0.4-py3-none-any.whl
  • Upload date:
  • Size: 3.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.8

File hashes

Hashes for dfanalyzer-0.0.4-py3-none-any.whl
Algorithm    Hash digest
SHA256       0903a2aa1bcb2a010b6c54e9890fff7d5f8973b2bc960b93280b0c391560d316
MD5          4d840a175a8020be15dd3d961899d0d0
BLAKE2b-256  7e5438a3ef27336e56c79a610eae092c104af388993e57c867acbcc09bfcd93f
