Skip to main content

CARTE-AI: Context Aware Representation of Table Entries for AI

Project description

Downloads PyPI Version Python Version Code Style: Black License: MIT

CARTE:
Pretraining and Transfer for Tabular Learning

CARTE_outline

This repository contains the implementation of the paper CARTE: Pretraining and Transfer for Tabular Learning.

CARTE is a pretrained model for tabular data by treating each table row as a star graph and training a graph transformer on top of this representation.

Colab Examples (Give it a test):

Open In Colab

  • CARTERegressor on Wine Poland dataset
  • CARTEClassifier on Spotify dataset

01 Install 🚀

The library has been tested on Linux, MacOSX and Windows.

CARTE-AI can be installed from PyPI:

pip install carte-ai

Post installation check

After a correct installation, you should be able to import the module without errors:

import carte_ai

02 CARTE-AI example on sampled data step by step ➡️

1️⃣ Load the Data 💽

import pandas as pd
from carte_ai.data.load_data import *

num_train = 128  # Example: set the number of training groups/entities
random_state = 1  # Set a random seed for reproducibility
X_train, X_test, y_train, y_test = wina_pl(num_train, random_state)
print("Wina Poland dataset:", X_train.shape, X_test.shape)

sample

2️⃣ Convert Table 2 Graph 🪵

The basic preparations are:

  • preprocess raw data
  • load the prepared data and configs; set train/test split
  • generate graphs for each table entries (rows) using the Table2GraphTransformer
  • create an estimator and make inference
import fasttext
from huggingface_hub import hf_hub_download
from carte_ai import Table2GraphTransformer

model_path = hf_hub_download(repo_id="hi-paris/fastText", filename="cc.en.300.bin")

preprocessor = Table2GraphTransformer(fasttext_model_path=model_path)

# Fit and transform the training data
X_train = preprocessor.fit_transform(X_train, y=y_train)

# Transform the test data
X_test = preprocessor.transform(X_test)

sample

3️⃣ Make Predictions🔮

For learning, CARTE currently runs with the sklearn interface (fit/predict) and the process is:

  • Define parameters
  • Set the estimator
  • Run 'fit' to train the model and 'predict' to make predictions
from carte_ai import CARTERegressor, CARTEClassifier

# Define some parameters
fixed_params = dict()
fixed_params["num_model"] = 10 # 10 models for the bagging strategy
fixed_params["disable_pbar"] = False # True if you want cleanness
fixed_params["random_state"] = 0
fixed_params["device"] = "cpu"
fixed_params["n_jobs"] = 10
fixed_params["pretrained_model_path"] = config_directory["pretrained_model"]


# Define the estimator and run fit/predict

estimator = CARTERegressor(**fixed_params) # CARTERegressor for Regression
estimator.fit(X=X_train, y=y_train)
y_pred = estimator.predict(X_test)

# Obtain the r2 score on predictions

score = r2_score(y_test, y_pred)
print(f"\nThe R2 score for CARTE:", "{:.4f}".format(score))

sample

03 CARTE-AI references 📚

@article{kim2024carte,
  title={CARTE: pretraining and transfer for tabular learning},
  author={Kim, Myung Jun and Grinsztajn, L{\'e}o and Varoquaux, Ga{\"e}l},
  journal={arXiv preprint arXiv:2402.16785},
  year={2024}
}

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

carte_ai-0.0.14.tar.gz (40.3 MB view details)

Uploaded Source

Built Distribution

carte_ai-0.0.14-py3-none-any.whl (40.3 MB view details)

Uploaded Python 3

File details

Details for the file carte_ai-0.0.14.tar.gz.

File metadata

  • Download URL: carte_ai-0.0.14.tar.gz
  • Upload date:
  • Size: 40.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.14

File hashes

Hashes for carte_ai-0.0.14.tar.gz
Algorithm Hash digest
SHA256 0594b27a0945f2ac1e73ac8247cdaf07ebbaea2f6e8fd56998aeab1f7cf53bf3
MD5 0a67db037d7415027969f8a8f0ba7936
BLAKE2b-256 0abd48a9ee0c9bafafc7c08f8c0055b2f28346255a98e20566c720b118f4d7aa

See more details on using hashes here.

File details

Details for the file carte_ai-0.0.14-py3-none-any.whl.

File metadata

  • Download URL: carte_ai-0.0.14-py3-none-any.whl
  • Upload date:
  • Size: 40.3 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.14

File hashes

Hashes for carte_ai-0.0.14-py3-none-any.whl
Algorithm Hash digest
SHA256 2af1f92db279bccedf0fa36b0d7c341589c96aba82414ea36b644fc1c6d06008
MD5 11a17d9cda87982c1f7c72a6bb5ceb25
BLAKE2b-256 2972637a0af88cd7c96b91cc955f7514f71b8d9f6cada7151ddc91dd265885af

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page