
A Python package for low-code analysis of tabular data


TableMage   🧙‍♂️📊


TableMage is a Python package for low-code/conversational clinical data science. TableMage can help you quickly explore tabular datasets, easily perform regression analyses, and effortlessly benchmark machine learning models.

Installation

We recommend installing TableMage in a new virtual environment.

To install TableMage:

git clone https://github.com/ajy25/TableMage.git
cd TableMage
pip install .

TableMage supports Python versions 3.10 through 3.12.
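Before installing, it can help to confirm that the active interpreter falls in the supported range. The following is a minimal sketch using only the standard library; the helper name is_supported is hypothetical and is not part of TableMage.

```python
import sys

# Supported range as stated above: Python 3.10 through 3.12 (inclusive).
SUPPORTED_MIN = (3, 10)
SUPPORTED_MAX = (3, 12)


def is_supported(version=None):
    """Return True if the (major, minor) version lies in the supported range.

    `version` defaults to the running interpreter's version.
    """
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX


if not is_supported():
    found = ".".join(str(part) for part in sys.version_info[:2])
    print(f"Python {found} is outside the supported range 3.10 through 3.12.")
```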

[!NOTE] For macOS users: You might run into an error involving XGBoost, one of TableMage's dependencies, when using TableMage for the first time. To resolve this error, install libomp by running brew install libomp. This requires Homebrew.

Quick start (low-code)

You'll likely use TableMage for machine learning model benchmarking. Here's how to do it.

import tablemage as tm
import pandas as pd
import joblib

# load table (assume 'y' is a numeric variable we wish to predict)
df = ...

# initialize an Analyzer object
analyzer = tm.Analyzer(df, test_size=0.2)

# preprocess data, taking care to exclude the target variable 'y' from the operations
c

# train regressors
reg_report = analyzer.regress(  # categorical variables are automatically one-hot encoded
    models=[                    # hyperparameter tuning is preset and automatic
        tm.ml.LinearR('l2', name='ridge'),
        tm.ml.TreesR('random_forest', name='rf'),
        tm.ml.TreesR('xgboost', name='xgb'),
    ],
    target='y',                 # automatically drops examples with missing values in target variable
    predictors=None,            # None signifies all variables except target variable
    feature_selectors=[
        tm.fs.BorutaFSR()       # select subset of predictors prior to training
    ]
)

# view model metrics
print(reg_report.metrics('test'))

# predict on new data
new_df = ...
ridge_model = reg_report.model('ridge').sklearn_pipeline()
y_pred = ridge_model.predict(new_df)

# save as sklearn pipeline
joblib.dump(ridge_model, 'ridge.joblib')
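A pipeline saved this way can later be restored with joblib.load and used for prediction without retraining. The sketch below illustrates the round trip under stated assumptions: the pipeline is a stand-in built directly with scikit-learn (StandardScaler plus Ridge) on synthetic data, not the object TableMage returns, and the column names are made up for illustration.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in for the pipeline returned by reg_report.model('ridge').sklearn_pipeline().
# It is built with scikit-learn here so the sketch runs without TableMage.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(50, 3)), columns=["a", "b", "c"])
y = 2.0 * X["a"] - X["b"] + rng.normal(scale=0.1, size=50)
pipeline = make_pipeline(StandardScaler(), Ridge(alpha=1.0)).fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "ridge.joblib"
    joblib.dump(pipeline, path)    # same call as in the quick start
    restored = joblib.load(path)   # e.g. later, in a separate session

# New rows must carry the same columns the pipeline was fitted on.
new_df = X.head(5)
preds_before = pipeline.predict(new_df)
preds_after = restored.predict(new_df)
```

Only load joblib files from sources you trust, since joblib uses pickle under the hood and unpickling can execute arbitrary code.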

Quick start (conversational)

First, install the required additional dependencies.

pip install '.[agents]'

Next, add your API key. You only need to do this once; your API key will be written to a local .env file.

import tablemage as tm
tm.use_agents()                                             # import the agents module
tm.agents.set_key("openai", "add-your-api-key-here")        # set API key
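To avoid pasting the key into source code or a notebook, one option is to read it from an environment variable and pass the result to tm.agents.set_key. This is a minimal sketch; the helper read_api_key and the variable name OPENAI_API_KEY are assumptions chosen for illustration, not part of TableMage.

```python
import os


def read_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    Raises RuntimeError if the variable is unset or blank.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


# Usage with the calls shown above (requires TableMage with the agents extra):
# import tablemage as tm
# tm.use_agents()
# tm.agents.set_key("openai", read_api_key())
```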

You can open a chat user interface by running the following code and navigating to the URL that appears in the terminal. Your conversation with ChatDA, the AI agent, appears on the left, while ChatDA's analyses (figures made, tables produced, TableMage commands used) appear on the right.

import tablemage as tm
tm.use_agents()
tm.agents.options.set_llm(
    llm_type="openai", 
    model_name="gpt-4o-mini", 
    temperature=0.1
)
# optionally, multimodal ChatDA can interpret figures
tm.agents.options.set_multimodal_llm(
    llm_type="openai",
    model_name="gpt-4o-mini",
    temperature=0.1
)                           # multimodal LLM must be specified for multimodal ChatDA
tm.agents.App(
    multimodal=True         # additional parameters can be set, e.g. memory type, 
).run(debug=False)          # disabling/enabling Python environment, etc.

Or, you can chat with the AI agent directly in Python:

import pandas as pd
import tablemage as tm
tm.use_agents()
tm.agents.options.set_llm(
    llm_type="openai", 
    model_name="gpt-4o-mini", 
    temperature=0.1
)

# load table
df = ...

# initialize a ChatDA object
agent = tm.agents.ChatDA(
    df,                     # additional parameters can be set, e.g. memory type, 
    test_size=0.2           # disabling/enabling Python environment, etc.
)

# chat with the agent
print(agent.chat("Compute the summary statistics for the numeric variables."))

[!NOTE] You must be connected to the internet to use the agents module, even if you are using Ollama to run a locally hosted LLM. TableMage's agent, ChatDA, relies on FastEmbed for retrieval-augmented generation, and it may need to download the FastEmbed model from the internet prior to use. ChatDA can be run with a local LLM and FastEmbed, ensuring total data privacy.

Updates

TableMage is under active development.

