An advanced Python data analysis library with enhanced cleaning, transformation, and visualization.

Project description

DataAnalysts Package

DataAnalysts is a Python library designed to simplify and streamline data analysis tasks, including data cleaning, transformation, and visualization. Whether you're a student, a data analyst, or a researcher, this package is built to handle datasets efficiently and interactively.


🚀 Key Features

  • Data Cleaning:
    • Handle missing values (mean, median, mode strategies).
    • Remove duplicates, manage outliers, and preprocess raw datasets.
  • Data Transformation:
    • Scale (standard, min-max, robust) and normalize datasets.
    • Encode categorical data and apply dimensionality reduction (PCA).
  • Data Visualization:
    • Generate professional plots: Histogram, Bar Chart, Line Plot, Scatter Plot, Heatmap, Pair Plot, Box Plot, Violin Plot.
    • Supports interactive and customizable visualizations.
  • Data Loading:
    • Easily load datasets from CSV and Excel files.
  • Error Handling:
    • Robust exception handling with clear error messages.
  • Interactive Tools:
    • Interactive cleaning, transformation, and plotting tools for hands-on data analysis.

🛠️ Installation Steps

1. Install the Package from PyPI

To use the library in Google Colab or your local environment, install it directly from PyPI:

pip install dataanalysts
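
After installing, it can be useful to confirm that the package is visible to the interpreter you are actually running (this matters in notebooks, where `pip` may target a different environment). The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of DataAnalysts.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("dataanalysts")
    if version is None:
        print("dataanalysts is not installed; run: pip install dataanalysts")
    else:
        print(f"dataanalysts {version} is installed")
```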

💡 Usage Examples

1. Import the Library

import dataanalysts as da
import pandas as pd

2. Load Data

df = da.load_csv('data.csv')
df_excel = da.load_excel('data.xlsx', sheet_name='Sheet1')
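
For readers who want to see what loading with clear error messages can look like, here is a rough sketch built directly on pandas. It is not the implementation of `da.load_csv` or `da.load_excel`; the function name `load_table` and its behavior are assumptions made for illustration.

```python
from pathlib import Path

import pandas as pd


def load_table(path, **kwargs):
    """Load a CSV or Excel file into a DataFrame, with explicit error messages."""
    path = Path(path)
    if not path.exists():
        raise FileNotFoundError(f"No such file: {path}")
    suffix = path.suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(path, **kwargs)
    if suffix in {".xlsx", ".xls"}:
        # Reading Excel files requires an engine such as openpyxl to be installed.
        return pd.read_excel(path, **kwargs)
    raise ValueError(f"Unsupported file type: {suffix!r}")
```

Extra keyword arguments are passed through unchanged, so a call such as `load_table('data.xlsx', sheet_name='Sheet1')` forwards `sheet_name` to `pandas.read_excel`.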

3. Data Cleaning

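# Clean with the default settings
df_cleaned = da.clean(df)
# Clean and also handle outliers
df_cleaned_outliers = da.clean(df, handle_outliers=True)
# Launch the interactive cleaning tool
df_interactive_clean = da.interactive_clean(df)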
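
To make the cleaning steps listed under Key Features concrete, the sketch below performs duplicate removal, missing-value imputation (mean, median, or mode), and optional outlier clipping with plain pandas. It is an illustration only: the function `clean_sketch`, its parameters, and the 1.5 × IQR outlier rule are assumptions, and `da.clean` may work differently.

```python
import pandas as pd


def clean_sketch(df, strategy="mean", handle_outliers=False):
    """Drop duplicates, impute missing numeric values, optionally clip outliers."""
    out = df.drop_duplicates().copy()
    numeric_cols = out.select_dtypes(include="number").columns
    for col in numeric_cols:
        if strategy == "mean":
            fill = out[col].mean()
        elif strategy == "median":
            fill = out[col].median()
        elif strategy == "mode":
            # Assumes the column has at least one non-missing value.
            fill = out[col].mode().iloc[0]
        else:
            raise ValueError(f"Unknown strategy: {strategy!r}")
        out[col] = out[col].fillna(fill)
        if handle_outliers:
            # Clip to the 1.5 * IQR fences, a common (assumed) outlier rule.
            q1, q3 = out[col].quantile([0.25, 0.75])
            iqr = q3 - q1
            out[col] = out[col].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return out


raw = pd.DataFrame({
    "age": [20.0, 22.0, None, 22.0, 21.0, 95.0],
    "city": ["A", "B", "C", "B", "A", "D"],
})
cleaned = clean_sketch(raw, strategy="median", handle_outliers=True)
print(cleaned)
```

Clipping keeps every row and only pulls extreme values back to the fence; dropping outlier rows instead is an equally reasonable choice when the extreme values are known to be errors.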

4. Data Transformation

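# Scale the data with the 'standard' strategy
df_transformed = da.transform(df, strategy='standard')
# Reduce the scaled data to 3 components (PCA)
df_pca = da.transform(df_transformed, reduce_dimensionality=True, n_components=3)
# Launch the interactive transformation tool
df_interactive_transform = da.interactive_transform(df)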
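
The two transformation steps above, standard scaling followed by PCA, can be sketched with NumPy alone. This is a hedged illustration of the underlying math, not the code behind `da.transform`; the helper names `standard_scale` and `pca_project` are hypothetical.

```python
import numpy as np


def standard_scale(X):
    """Center each column to mean 0 and scale it to unit variance."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant columns become all zeros instead of dividing by 0
    return (X - X.mean(axis=0)) / std


def pca_project(X, n_components):
    """Project the rows of X onto the top principal components via SVD."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
scaled = standard_scale(X)
reduced = pca_project(scaled, n_components=3)
print(reduced.shape)  # (50, 3)
```

Scaling before PCA is the usual order because PCA is sensitive to column variance: an unscaled column with large units would otherwise dominate the first component.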

5. Data Visualization

da.histogram(df, column='age', bins=30, kde=True)
da.barchart(df, x_col='city', y_col='population')
da.linechart(df, x_col='date', y_col='sales')
da.scatter(df, x_col='height', y_col='weight', hue='gender')
da.heatmap(df)
da.pairplot(df, hue='category')
da.boxplot(df, x_col='region', y_col='sales')
da.violinplot(df, x_col='region', y_col='sales')
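
It can help to know which numbers sit behind two of these plots. The sketch below computes the bin counts that a histogram draws and the correlation matrix that a heatmap of a DataFrame commonly shows. The sample data is made up, and the assumption that `da.heatmap(df)` plots pairwise correlations is not confirmed by this page.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "height": [150.0, 160.0, 170.0, 180.0, 190.0],
    "weight": [50.0, 58.0, 69.0, 80.0, 92.0],
    "gender": ["F", "F", "M", "M", "M"],
})

# Histogram: count how many values fall into each of `bins` equal-width bins.
counts, edges = np.histogram(df["height"], bins=4)
print("bin edges:", edges)
print("counts per bin:", counts)

# Heatmap input (assumed): pairwise Pearson correlations of the numeric columns.
corr = df.select_dtypes(include="number").corr()
print(corr)
```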

6. Interactive Visualization

da.interactive_plot(df)

🤝 Contributing

Contributions are welcome! Please submit a pull request via our GitHub Repository.


📜 License

This project is licensed under the MIT License. See the LICENSE file for details.


🛠️ Support

If you encounter any issues, feel free to open an issue on our GitHub Issues page.

Download files

Source Distribution

dataanalysts-2.0.0.tar.gz (10.1 kB view details)

Uploaded Source

Built Distribution

dataanalysts-2.0.0-py3-none-any.whl (10.3 kB view details)

Uploaded Python 3

File details

Details for the file dataanalysts-2.0.0.tar.gz.

File metadata

  • Download URL: dataanalysts-2.0.0.tar.gz
  • Upload date:
  • Size: 10.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for dataanalysts-2.0.0.tar.gz
  • SHA256: 49694139439432b2478acc883e7ed4006cbfe94156b979eccfb56b52a975e026
  • MD5: d6fb9f65b37f8f397a1eff0d7f6be9c2
  • BLAKE2b-256: f2e76e75c1c285e0531747c05184cc93fd76915f9c466e4f5b61dee26d6abc4f

File details

Details for the file dataanalysts-2.0.0-py3-none-any.whl.

File metadata

  • Download URL: dataanalysts-2.0.0-py3-none-any.whl
  • Upload date:
  • Size: 10.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for dataanalysts-2.0.0-py3-none-any.whl
  • SHA256: 040c26c3ff7cfc351bd69666b956fdad23daa7438c286cd5fc36743ff9804488
  • MD5: 91279f260dbfa028eceab233e8fa9651
  • BLAKE2b-256: 650ee8714fd8fa986b29c789a1a971b873a2378ef0d42b12703ed62912bf2ca7
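
The published digests can be used to check that a downloaded file is intact. Below is a minimal sketch using Python's standard `hashlib` module; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the file path in the usage note assumes the archive was saved to the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True when the file's SHA256 equals the published digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("dataanalysts-2.0.0.tar.gz", "49694139439432b2478acc883e7ed4006cbfe94156b979eccfb56b52a975e026")` should return `True` for an unmodified copy of the source distribution.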
