A framework that uses multiple agents to run a systematic data science pipeline from just two user inputs.

MADS - Multi-Agents for Data Science! Our goal is to enable everyone to apply machine learning with just two inputs!

[Our Website] | [Our Research Website] | [Preprint of the MADS paper]

Overview

MADS is a project aimed at creating a platform where users can upload a dataset and have our agents execute all the necessary steps in the data science pipeline. The user only needs to define the goal of the project; the agents handle the rest. In the end, the user has access to a trained model, its predictions, and a report that includes insights from each agent.

Roadmap

  • Implement a Reinforcement Learning agent to improve the current prompts.
  • Create tools for the agents to use.
    • Example 1: For the model-building agent, provide a tool like one from Nixtla's library to improve forecasting for time series problems.
    • Example 2: For the model consultant agent, create a tool to optimize the selected model.
  • Introduce a new agent to interact with the data analyst and generate visualizations useful for the final report.

Installation

MADS requires Python 3.11.7 or later. We recommend setting up a virtual environment before you start working with MADS:

  • Create a virtual environment: python -m venv .venv
  • Activate your virtual environment: .\.venv\Scripts\activate (Windows) or source .venv/bin/activate (macOS/Linux)

To begin using our library, simply install it via pip: pip install pymads
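
If pip refuses to install the package, an interpreter that is too old is a likely cause. The following is a minimal sketch for checking the running interpreter against the minimum version stated above; python_is_supported is a hypothetical helper written for this example and is not part of pymads.

```python
import sys

# Minimum interpreter version stated in the installation note above.
MIN_PYTHON = (3, 11, 7)


def python_is_supported(version=None, minimum=MIN_PYTHON):
    """Return True if `version` (default: the running interpreter) meets `minimum`."""
    current = tuple(version if version is not None else sys.version_info[:3])
    return current >= minimum


if __name__ == "__main__":
    found = ".".join(map(str, sys.version_info[:3]))
    if python_is_supported():
        print(f"Python {found} meets the stated minimum for MADS.")
    else:
        print(f"Python {found} found; MADS requires 3.11.7 or later.")
```

Run it with the interpreter from your virtual environment, so that the version checked is the one pip will install into.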

Setting Up MADS

Before running the script below, create a .env file and place your API key inside it: GROQ_API_KEY="your_api_key" or OPENAI_API_KEY="your_api_key". If Docker is not installed on your system, also add AUTOGEN_USE_DOCKER="False" to your .env.
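
A missing or misspelled key in .env is easy to overlook, so a pre-flight check can help. The sketch below uses only the standard library and is separate from the MADS script that follows. It assumes the simple KEY="value" layout described above. read_env_file and missing_api_key are hypothetical helpers, not part of pymads, and how pymads itself reads the .env file is not covered here.

```python
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY=value lines from a .env file into a dict.

    A deliberately small parser for illustration: it skips blank lines and
    comments, and strips surrounding spaces and quotes. It is not a
    replacement for a full dotenv implementation.
    """
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_api_key(values):
    """Return True if neither GROQ_API_KEY nor OPENAI_API_KEY has a value."""
    return not (values.get("GROQ_API_KEY") or values.get("OPENAI_API_KEY"))


if __name__ == "__main__":
    settings = read_env_file()
    if missing_api_key(settings):
        print("No GROQ_API_KEY or OPENAI_API_KEY found in .env.")
    else:
        print("An API key is present in .env.")
```

The check only confirms that a key is present; it does not verify that the key is valid for the provider.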

import os
from mads.chat_manager import ChatManager
from mads.config import configure_llm, create_task_folders

# Check if the 'tasks' directory exists, if not, create it.
if not os.path.exists('tasks'):
    create_task_folders()

# After creating the folder, place the dataset (a CSV file) you want to use in tasks/datasets.
# Note: If there is no dataset in tasks/datasets, the pipeline will proceed but yield no results.

# Read your API key from the environment (see the .env file described above).
# The key may be from Groq or OpenAI; current configurations are better suited for Groq.

GROQ_API_KEY = os.getenv("GROQ_API_KEY")
# OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

# Select the model to use (tested options include 'llama3-70b-8192' and 'gpt-3.5-turbo-0125').
# To test additional models, modify the config.py script accordingly.
model = configure_llm("llama3-70b-8192", GROQ_API_KEY)

# Define the supervised ML problem that you intend to solve.
problem = "I want to predict wine quality"

# Specify the filename of your data
dataset = "winequality-red.csv"

# Choose the agents to deploy.
# Here, we select all six available agents.
agents = [1, 2, 3, 4, 5, 6]

# Configure the ChatManager class.
chat_manager = ChatManager(dataset, problem, model, agents)

# Begin the chat sessions.
chat_results = chat_manager.initiate_chats()
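
The comments in the script note that a missing dataset lets the pipeline run but produce no results. A small check before constructing ChatManager can turn that silent failure into an explicit error. This is a sketch under the assumption that datasets live in tasks/datasets, as the script comments describe; find_dataset is a hypothetical helper, not part of pymads.

```python
from pathlib import Path


def find_dataset(filename, tasks_dir="tasks"):
    """Return the path to `filename` inside <tasks_dir>/datasets, or raise."""
    dataset_path = Path(tasks_dir) / "datasets" / filename
    # The script above expects a CSV file, so reject other extensions early.
    if dataset_path.suffix.lower() != ".csv":
        raise ValueError(f"Expected a .csv file, got: {filename}")
    if not dataset_path.is_file():
        raise FileNotFoundError(f"Dataset not found: {dataset_path}")
    return dataset_path
```

Calling find_dataset("winequality-red.csv") before creating the ChatManager would stop the run immediately if the file is missing, instead of starting chat sessions that have nothing to work on.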

Contributing

MADS is open-source and we welcome contributions. If you're looking to contribute, please:

  • Fork the repository.
  • Create a new branch for your feature.
  • Add your feature or improvement.
  • Send a pull request.

Issue Reporting

If you encounter any problems while using our library, please don't hesitate to create an issue on GitHub. When reporting issues, please provide as much detail as possible, including:

  • Steps to reproduce the issue
  • Expected behavior
  • Actual behavior
  • Any error messages or stack traces

Your feedback is valuable to us and will help improve the library for everyone. Thank you for contributing to the project!

Contact Us

If you have any questions, suggestions, or feedback regarding MADS, please feel free to reach out to us:

Main Contributors

We are committed to improving MADS based on your input and look forward to hearing from you!

Acknowledgments

A heartfelt thank you to all the contributors of the AutoGen framework. Your dedication and hard work have been instrumental in making this project possible. We deeply appreciate the entire community's support and involvement.

License

MADS is released under the MIT License.
