
CARTE-AI: Context Aware Representation of Table Entries for AI


Downloads PyPI Version Python Version Code Style: Black License: MIT

CARTE: Pretraining and Transfer for Tabular Learning


This repository contains the implementation of the paper CARTE: Pretraining and Transfer for Tabular Learning.

CARTE is a pretrained model for tabular data. It treats each table row as a star graph and trains a graph transformer on top of this representation.
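To make the star-graph idea concrete, here is a minimal sketch of turning one row into a center node with one leaf per cell. The `StarGraph` class and `row_to_star_graph` function are hypothetical names invented for this illustration, and the choices made here (column names stored on the edges, missing cells skipped) are assumptions; this is not the library's internal representation.

```python
from dataclasses import dataclass, field


@dataclass
class StarGraph:
    """One table row as a star graph: a center node linked to one leaf per cell."""
    center: str
    leaves: list = field(default_factory=list)  # cell values (leaf node content)
    edges: list = field(default_factory=list)   # (center, leaf index, column name)


def row_to_star_graph(row: dict, center: str = "row") -> StarGraph:
    """Build a star graph from a {column: value} mapping (illustrative only)."""
    graph = StarGraph(center=center)
    for column, value in row.items():
        if value is None:  # assumption: a missing cell produces no leaf
            continue
        graph.leaves.append(value)
        graph.edges.append((center, len(graph.leaves) - 1, column))
    return graph


# Made-up example row
row = {"name": "Chateau Example", "country": "France", "price": 42.0, "vintage": None}
g = row_to_star_graph(row)
print(len(g.leaves), "leaves;", "edge labels:", [label for _, _, label in g.edges])
```

Because every row becomes its own small graph, tables with different columns can share one model: the column name travels with the edge instead of being a fixed feature position.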

Colab examples (give them a try):

  • CARTERegressor on Wine Poland dataset
  • CARTEClassifier on Spotify dataset

01 Install 🚀

The library has been tested on Linux, macOS and Windows.

CARTE-AI can be installed from PyPI:

pip install carte-ai

Post installation check

After a correct installation, you should be able to import the module without errors:

import carte_ai
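If you also want to confirm which release is installed, the standard library can report it. The sketch below assumes the distribution name is `carte-ai`, as used in the pip command above; `installed_version` is a helper name made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


carte_version = installed_version("carte-ai")
print("carte-ai version:", carte_version or "not installed")
```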

02 CARTE-AI example on sampled data step by step ➡️

1️⃣ Load the Data 💽

import pandas as pd
from carte_ai.data.load_data import *

num_train = 128  # Example: set the number of training groups/entities
random_state = 1  # Set a random seed for reproducibility
X_train, X_test, y_train, y_test = wina_pl(num_train, random_state)
print("Wina Poland dataset:", X_train.shape, X_test.shape)
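The internals of the loader are not shown here. As one plausible reading of its two arguments, the sketch below draws `num_train` rows at random for training with a fixed seed and keeps the rest for testing. `sample_train_test` and the toy rows are hypothetical and use plain Python lists rather than the library's own data structures.

```python
import random


def sample_train_test(rows, targets, num_train, random_state):
    """Pick num_train rows at random for training; the rest form the test set."""
    if len(rows) != len(targets):
        raise ValueError("rows and targets must have the same length")
    if not 0 < num_train < len(rows):
        raise ValueError("num_train must be between 1 and len(rows) - 1")
    indices = list(range(len(rows)))
    random.Random(random_state).shuffle(indices)  # seeded, so reproducible
    train_idx, test_idx = indices[:num_train], indices[num_train:]
    X_train = [rows[i] for i in train_idx]
    X_test = [rows[i] for i in test_idx]
    y_train = [targets[i] for i in train_idx]
    y_test = [targets[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


rows = [{"wine_id": i} for i in range(10)]  # made-up rows
targets = [float(i) for i in range(10)]     # made-up targets
split_a = sample_train_test(rows, targets, num_train=4, random_state=1)
split_b = sample_train_test(rows, targets, num_train=4, random_state=1)
print(len(split_a[0]), len(split_a[1]))
```

Calling the function twice with the same `random_state` returns the same split, which is the point of fixing the seed.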


2️⃣ Convert Table 2 Graph 🪵

The basic steps are:

  • preprocess the raw data
  • load the prepared data and configs, and set the train/test split
  • generate a graph for each table entry (row) using the Table2GraphTransformer
  • create an estimator and run inference

import fasttext
from huggingface_hub import hf_hub_download
from carte_ai import Table2GraphTransformer

model_path = hf_hub_download(repo_id="hi-paris/fastText", filename="cc.en.300.bin")

preprocessor = Table2GraphTransformer(fasttext_model_path=model_path)

# Fit and transform the training data
X_train = preprocessor.fit_transform(X_train, y=y_train)

# Transform the test data
X_test = preprocessor.transform(X_test)
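The fit-on-train, transform-on-test pattern used above follows the usual scikit-learn convention: whatever the transformer learns comes from the training data only, and the test data is mapped with that learned state. The toy class below illustrates the convention with a simple category encoder; it is a stand-in written for this example and does not reflect how Table2GraphTransformer works internally.

```python
class ToyCategoryEncoder:
    """Toy transformer: maps each category to an integer code learned in fit()."""

    def fit(self, values, y=None):
        # Codes are learned from the data passed to fit() only
        self.codes_ = {v: i for i, v in enumerate(sorted(set(values)))}
        return self

    def transform(self, values):
        # Categories never seen during fit() get the placeholder code -1
        return [self.codes_.get(v, -1) for v in values]

    def fit_transform(self, values, y=None):
        return self.fit(values, y).transform(values)


encoder = ToyCategoryEncoder()
train_codes = encoder.fit_transform(["red", "white", "red", "rose"])
test_codes = encoder.transform(["white", "sparkling"])
print(train_codes, test_codes)  # prints [0, 2, 0, 1] [2, -1]
```

Calling `fit_transform` on the test set instead would relearn the codes from test data, so train and test rows would no longer share one encoding.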


3️⃣ Make Predictions🔮

For learning, CARTE currently exposes the scikit-learn interface (fit/predict). The process is:

  • Define the parameters
  • Set the estimator
  • Run 'fit' to train the model and 'predict' to make predictions

from sklearn.metrics import r2_score
from carte_ai import CARTERegressor, CARTEClassifier

# Define the parameters
fixed_params = dict()
fixed_params["num_model"] = 10  # 10 models for the bagging strategy
fixed_params["disable_pbar"] = False  # Set to True to hide the progress bar
fixed_params["random_state"] = 0
fixed_params["device"] = "cpu"
fixed_params["n_jobs"] = 10
# config_directory is assumed to be in scope via the wildcard import in step 1
fixed_params["pretrained_model_path"] = config_directory["pretrained_model"]

# Define the estimator and run fit/predict
estimator = CARTERegressor(**fixed_params)  # CARTERegressor for regression
estimator.fit(X=X_train, y=y_train)
y_pred = estimator.predict(X_test)

# Compute the R2 score of the predictions
score = r2_score(y_test, y_pred)
print(f"\nThe R2 score for CARTE: {score:.4f}")
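The `num_model` parameter above refers to a bagging strategy, and the result is scored with R2. The sketch below shows both ideas on a toy problem: several simple models are fitted and their predictions averaged, then R2 is computed as 1 - SS_res / SS_tot. Bootstrap resampling is used here because it is the classic form of bagging; how CARTE builds its ensemble members is not described in this README and may differ. All function names and the synthetic data are made up for illustration.

```python
import random
from statistics import fmean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:  # degenerate resample: fall back to a constant predictor
        return 0.0, mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def fit_bagged(xs, ys, num_model, random_state):
    """Fit num_model lines, each on its own bootstrap resample of the data."""
    models = []
    for k in range(num_model):
        rng = random.Random(random_state + k)
        idx = [rng.randrange(len(xs)) for _ in xs]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict_bagged(models, xs):
    """Average the predictions of all fitted lines."""
    return [fmean(s * x + b for s, b in models) for x in xs]


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic data: y = 2x + 1 plus Gaussian noise
noise = random.Random(0)
x_train = list(range(0, 40, 2))
x_test = list(range(1, 40, 2))
y_train = [2.0 * x + 1.0 + noise.gauss(0.0, 1.0) for x in x_train]
y_test = [2.0 * x + 1.0 + noise.gauss(0.0, 1.0) for x in x_test]

models = fit_bagged(x_train, y_train, num_model=10, random_state=0)
y_pred = predict_bagged(models, x_test)
score = r2(y_test, y_pred)
print(f"R2 of the bagged toy model: {score:.4f}")
```

Averaging across members reduces the variance of any single fit, which is why a larger `num_model` tends to give steadier predictions at the cost of more training time.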


03 CARTE-AI references 📚

@article{kim2024carte,
  title={CARTE: pretraining and transfer for tabular learning},
  author={Kim, Myung Jun and Grinsztajn, L{\'e}o and Varoquaux, Ga{\"e}l},
  journal={arXiv preprint arXiv:2402.16785},
  year={2024}
}
