A package for effortless data management.

Data Manager

Data Manager is a Python package designed to provide a comprehensive suite of tools specialized in managing and analyzing data. It offers a user-friendly toolkit suitable for individuals of all levels of expertise. This initial release encompasses the following features:

  • Effortless reading of datafiles with support for Pandas-compatible extensions.
  • Simplified dataframe creation by specifying column names and corresponding values.
  • Automatic generation of hexadecimal and numeric identifiers.
  • Provision of common statistical analyses for specified columns.
  • Histogram plotting functionality.
  • Easy visualization of class distribution within the dataset.
  • Data standardization capabilities.
  • Feature discretization support.
  • Creation of fold columns for conducting stratified k-fold analyses.
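To make a few of these features concrete, here is a minimal sketch of summary statistics, class distribution, and discretization written in plain pandas. It does not use this package's API; the toy data and the column names (value, label, value_bin) are assumptions made for illustration only.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "value": [10, 20, 30, 40, 50, 60],
    "label": ["a", "a", "a", "b", "b", "c"],
})

# Common summary statistics for one numeric column.
stats = df["value"].agg(["mean", "std", "min", "max"])

# Class distribution expressed as relative frequencies.
class_share = df["label"].value_counts(normalize=True)

# Discretize the numeric column into three equal-width bins.
df["value_bin"] = pd.cut(df["value"], bins=3, labels=["low", "mid", "high"])

print(stats)
print(class_share)
print(df)
```

Equal-width binning is only one option; quantile-based binning with pd.qcut may suit skewed columns better.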

Components

'CreateDataPD' Class

This class enables the creation and manipulation of Pandas DataFrames. It supports the following functionalities:

  • Reading data from various file formats: CSV, Excel, JSON, Parquet, Feather, and Pickle.
  • Generating unique identifiers for rows.
  • Adding new data to the DataFrame.
  • Displaying the DataFrame.
  • Retrieving column names.
  • Printing specific columns.
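As a rough illustration of the first two items, the sketch below picks a pandas reader from the file extension and attaches numeric and hexadecimal identifiers to each row. The helpers read_any and add_ids, the extension map, and the column names num_id and hex_id are hypothetical; they are not taken from CreateDataPD, whose actual method names may differ.

```python
import os
import uuid

import pandas as pd

# Map file extensions to the pandas reader that handles them.
# Excel, Parquet, and Feather need optional engines at call time.
READERS = {
    ".csv": pd.read_csv,
    ".xlsx": pd.read_excel,
    ".json": pd.read_json,
    ".parquet": pd.read_parquet,
    ".feather": pd.read_feather,
    ".pkl": pd.read_pickle,
}


def read_any(path):
    """Load a file into a DataFrame based on its extension (hypothetical helper)."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in READERS:
        raise ValueError(f"Unsupported file extension: {ext!r}")
    return READERS[ext](path)


def add_ids(df):
    """Return a copy with one numeric and one hexadecimal identifier per row."""
    out = df.copy()
    out["num_id"] = range(len(out))                              # 0, 1, 2, ...
    out["hex_id"] = [uuid.uuid4().hex for _ in range(len(out))]  # 32 hex characters
    return out


frame = add_ids(pd.DataFrame({"Category": ["A", "B", "C"]}))
print(frame)
```

Random UUIDs are used here so identifiers stay unique when frames are merged; a counter formatted as hexadecimal would work just as well for a single frame.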

'PDNumericAnalysis' Class

This class extends the CreateDataPD class and provides additional functionalities for numerical analysis and preprocessing. It includes the following features:

  • Computing statistics for numerical columns.
  • Plotting histograms.
  • Analyzing data balance.
  • Standardizing numerical data.
  • Reconstructing data.
  • Converting non-numeric columns to numeric values.
  • Creating folds for cross-validation.
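The preprocessing steps listed above can be sketched in plain pandas as follows. This is not the PDNumericAnalysis API; the data and the column names are assumptions, "reconstructing data" is read here as reversing the standardization, and the fold assignment is a simple round-robin within each class rather than whatever strategy the package implements.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "value": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "label": ["a", "a", "a", "b", "b", "b"],
})

# Standardize to zero mean and unit variance, keeping the parameters.
mean = df["value"].mean()
std = df["value"].std(ddof=0)
df["value_std"] = (df["value"] - mean) / std

# Reconstruct the original values from the standardized ones.
restored = df["value_std"] * std + mean

# Convert a non-numeric column to integer codes.
df["label_code"], label_names = pd.factorize(df["label"])

# Assign folds round-robin within each class, so every fold keeps
# roughly the same class proportions (a simple stratified split).
k = 3
df["fold"] = df.groupby("label").cumcount() % k

print(df)
```

The round-robin assignment is deterministic and does not shuffle rows. Where shuffling matters, StratifiedKFold from scikit-learn is a common alternative.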

Dependencies

This toolkit requires the following dependencies:

  • pandas
  • numpy
  • matplotlib
  • scikit-learn

Installation

To install the dependencies, run:

pip install -r requirements.txt

Usage

You can use this toolkit in your Python projects by importing the necessary classes and functions. Here's an example of how to use the CreateDataPD class:

from data_analysis_toolkit import CreateDataPD

# Create a DataFrame from data
data = [[1, 'A', 10], [2, 'B', 20], [3, 'C', 30]]
columns = ['ID', 'Category', 'Value']
data_pd = CreateDataPD(data=data, columns=columns)

# Display DataFrame
data_pd.show_dataset()

# Add new data
new_data = [[4, 'D', 40], [5, 'E', 50]]
data_pd.add_data(new_data)

# Display DataFrame with added data
data_pd.show_dataset()

Work in progress...

Stay tuned for upcoming releases, which will incorporate the following enhancements:

  • Missing data completion functionality.
  • Conversion of dataframes into datasets suitable for machine learning inputs.
  • Introduction of PDTextAnalysis for handling features related to text manipulation.

Download files

Source Distribution

pd-data-manager-0.1.1.tar.gz (21.9 kB view details)

Uploaded Source

Built Distribution

pd_data_manager-0.1.1-py3-none-any.whl (21.5 kB view details)

Uploaded Python 3
