An advanced Python data analysis library with enhanced cleaning, transformation, and visualization.

DataAnalysts Package

DataAnalysts is a Python library that simplifies common data analysis tasks: data cleaning, transformation, and visualization. It is aimed at students, data analysts, and researchers who want to work with datasets efficiently and interactively.


🚀 Key Features

  • Data Cleaning:
    • Handle missing values (mean, median, mode strategies).
    • Remove duplicates, manage outliers, and preprocess raw datasets.
  • Data Transformation:
    • Scale (standard, min-max, robust) and normalize datasets.
    • Encode categorical data and apply dimensionality reduction (PCA).
  • Data Visualization:
    • Generate professional plots: Histogram, Bar Chart, Line Plot, Scatter Plot, Heatmap, Pair Plot, Box Plot, Violin Plot.
    • Supports interactive and customizable visualizations.
  • Data Loading:
    • Easily load datasets from CSV and Excel files.
  • Error Handling:
    • Robust exception handling with clear error messages.
  • Interactive Tools:
    • Interactive cleaning, transformation, and plotting tools for hands-on data analysis.

🛠️ Installation Steps

1. Install the Package from PyPI

To use the library in Google Colab or your local environment, install it directly from PyPI (in a Colab notebook cell, prefix the command with !):

pip install dataanalysts
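
To confirm the install worked, a small check like the following may help. It is a minimal sketch that uses only the standard library; the helper name installed_version is hypothetical and is not part of DataAnalysts.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent.

    Hypothetical helper for illustration; not part of the dataanalysts API.
    """
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("dataanalysts")
    print(found if found else "dataanalysts is not installed in this environment")
```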

💡 Usage Examples

1. Import the Library

import dataanalysts as da
import pandas as pd

2. Load Data

# Load a CSV file
df = da.load_csv('data.csv')
# Load a named sheet from an Excel workbook
df_excel = da.load_excel('data.xlsx', sheet_name='Sheet1')
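
The examples in this README refer to a data.csv file that is not provided. If you want something to experiment with, the sketch below writes a tiny placeholder file using only the standard library. The column names are taken from the plotting examples; every value is made up for illustration, and two cells are left empty so the cleaning step has something to do. The helper name write_sample_csv is hypothetical.

```python
import csv

# Column names mirror those used in the visualization examples.
COLUMNS = ["age", "city", "population", "date", "sales",
           "height", "weight", "gender", "category", "region"]

# Placeholder rows (invented values). Empty strings become missing cells.
ROWS = [
    [34, "City A", 120000, "2024-01-01", 250.0, 170, 65, "F", "x", "North"],
    [45, "City B", 80000, "2024-01-02", 310.5, 182, 80, "M", "y", "South"],
    ["", "City A", 120000, "2024-01-03", 275.0, 165, 58, "F", "x", "North"],
    [29, "City C", 45000, "2024-01-04", "", 176, 72, "M", "y", "East"],
]


def write_sample_csv(path="data.csv"):
    """Write the placeholder rows to `path` and return the path."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(COLUMNS)
        writer.writerows(ROWS)
    return path


if __name__ == "__main__":
    print("wrote", write_sample_csv())
```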

3. Data Cleaning

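# Clean with the default settings
df_cleaned = da.clean(df)
# Clean and also handle outliers
df_cleaned_outliers = da.clean(df, handle_outliers=True)
# Choose cleaning options interactively
df_interactive_clean = da.interactive_clean(df)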
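
For readers unfamiliar with the mean, median, and mode strategies listed under Key Features, here is a plain-Python sketch of what each one computes on a single column. It illustrates the concept only and makes no claim about how da.clean is implemented; the function name fill_missing and its strategy values are assumptions.

```python
import statistics


def fill_missing(values, strategy="mean"):
    """Replace None entries with the mean, median, or mode of the observed values.

    Illustrative only; not the dataanalysts implementation.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    if strategy == "mean":
        fill = statistics.mean(observed)
    elif strategy == "median":
        fill = statistics.median(observed)
    elif strategy == "mode":
        fill = statistics.mode(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    # Keep observed values untouched; only the gaps receive the fill value.
    return [fill if v is None else v for v in values]


if __name__ == "__main__":
    ages = [20, None, 30, 100]
    print(fill_missing(ages, "mean"))
    print(fill_missing(ages, "median"))
```

The mean is pulled toward extreme values such as the 100 above, while the median is not, which is why the median is often preferred for skewed columns. The mode is the usual choice for categorical data.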

4. Data Transformation

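# Apply standard scaling
df_transformed = da.transform(df, strategy='standard')
# Reduce the scaled data to 3 components with PCA
df_pca = da.transform(df_transformed, reduce_dimensionality=True, n_components=3)
# Choose transformation options interactively
df_interactive_transform = da.interactive_transform(df)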
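
The three scaling options named under Key Features (standard, min-max, robust) differ only in which center and spread they use. The sketch below shows the usual definitions on a single column in plain Python. It is not the library's code: the function name scale is hypothetical, only strategy='standard' appears in this README (the names 'minmax' and 'robust' are assumptions), and the choice of population standard deviation and inclusive quartiles is mine.

```python
import statistics


def scale(values, strategy="standard"):
    """Scale a list of numbers as (value - center) / spread.

    standard: center = mean,   spread = population standard deviation
    minmax:   center = min,    spread = max - min
    robust:   center = median, spread = interquartile range (Q3 - Q1)
    """
    if strategy == "standard":
        center = statistics.mean(values)
        spread = statistics.pstdev(values)
    elif strategy == "minmax":
        center = min(values)
        spread = max(values) - min(values)
    elif strategy == "robust":
        center = statistics.median(values)
        q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
        spread = q3 - q1
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    if spread == 0:
        raise ValueError("cannot scale a column with zero spread")
    return [(v - center) / spread for v in values]


if __name__ == "__main__":
    column = [1, 2, 3, 4, 5]
    for name in ("standard", "minmax", "robust"):
        print(name, scale(column, name))
```

Robust scaling relies on the median and quartiles, so a single extreme value shifts the result far less than it does under standard or min-max scaling.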

5. Data Visualization

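da.histogram(df, column='age', bins=30, kde=True)             # distribution of 'age' in 30 bins, with KDE enabled
da.barchart(df, x_col='city', y_col='population')             # population per city
da.linechart(df, x_col='date', y_col='sales')                 # sales over time
da.scatter(df, x_col='height', y_col='weight', hue='gender')  # height vs. weight, colored by gender
da.heatmap(df)                                                # heatmap of the DataFrame
da.pairplot(df, hue='category')                               # pairwise plots, colored by category
da.boxplot(df, x_col='region', y_col='sales')                 # sales distribution per region
da.violinplot(df, x_col='region', y_col='sales')              # violin plot of sales per region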

6. Interactive Visualization

da.interactive_plot(df)

🤝 Contributing

Contributions are welcome! Please submit a pull request via our GitHub Repository.


📜 License

This project is licensed under the MIT License. See the LICENSE file for details.


🛠️ Support

If you encounter any issues, feel free to open an issue on our GitHub Issues page.

Download files

Source Distribution

dataanalysts-0.2.0.tar.gz (10.0 kB view details)

Uploaded Source

Built Distribution

dataanalysts-0.2.0-py3-none-any.whl (10.3 kB view details)

Uploaded Python 3

File details

Details for the file dataanalysts-0.2.0.tar.gz.

File metadata

  • Download URL: dataanalysts-0.2.0.tar.gz
  • Upload date:
  • Size: 10.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for dataanalysts-0.2.0.tar.gz
Algorithm Hash digest
SHA256 7c8462edd617531b113ba9f1ee6fda4cd1aade783df74631335445c8df26c648
MD5 a8020ca27bc0a9b7d464a2fde7e0f78f
BLAKE2b-256 2939114057949851a304a0fde18fa0727c875c7f7ec13b7ce378cfb674de0fa9
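
If you download the source archive manually, you can compare its SHA256 digest with the value listed above. The following is a minimal sketch using the standard library's hashlib; the helper names are hypothetical, and the expected digest is copied from the table above.

```python
import hashlib

# SHA256 digest published above for dataanalysts-0.2.0.tar.gz.
EXPECTED_SHA256 = "7c8462edd617531b113ba9f1ee6fda4cd1aade783df74631335445c8df26c648"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected value."""
    return sha256_of_file(path) == expected.lower()
```

After downloading the archive, matches_published_hash('dataanalysts-0.2.0.tar.gz') should return True if the file is intact.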

File details

Details for the file dataanalysts-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: dataanalysts-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 10.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for dataanalysts-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 5d9b87b8145223645d9a6cdfa0ab421658594961816bbedbb983a7885923ed9a
MD5 504db57e21c193a9cf48fcae1e3ed5c8
BLAKE2b-256 2216d9fa8f9915b27be2f5a976b07513c287f49cf4489852d9b088bcc8a53566
