CARTE-AI: Context Aware Representation of Table Entries for AI

CARTE: Pretraining and Transfer for Tabular Learning

This repository contains the implementation of the paper CARTE: Pretraining and Transfer for Tabular Learning.

CARTE is a pretrained model for tabular data. It treats each table row as a star graph and trains a graph transformer on top of this representation.
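
To make the star-graph idea concrete, here is a minimal sketch in plain Python. It is not the library's implementation: the function name `row_to_star_graph`, the example row, and the dictionary layout are hypothetical. It assumes one center node per row, one leaf per non-missing cell, and edges labeled with the column name. The embedding step that the library performs with a language model is left out.

```python
from __future__ import annotations

from typing import Any


def row_to_star_graph(row: dict[str, Any], center: str = "row") -> dict[str, list]:
    """Turn one table row into a star graph: a center node plus one leaf per cell."""
    nodes = [center]
    edges = []  # each edge is (source, target, edge_label)
    for column, value in row.items():
        if value is None:
            continue  # assumption: missing cells contribute no leaf
        leaf = f"{column}={value}"
        nodes.append(leaf)
        edges.append((center, leaf, column))  # the column name labels the edge
    return {"nodes": nodes, "edges": edges}


# Hypothetical row, for illustration only
row = {"name": "Riesling", "country": "Poland", "price": 42.0, "vintage": None}
graph = row_to_star_graph(row)
print(graph["nodes"])
print(graph["edges"])
```

Every leaf connects only to the center, so the graph for a row with k non-missing cells has k + 1 nodes and k edges.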

Colab examples (give it a try):

  • CARTERegressor on Wine Poland dataset
  • CARTEClassifier on Spotify dataset

Other datasets are available for testing: datasets

01 Install 🚀

The library has been tested on Linux, macOS and Windows.

CARTE-AI can be installed from PyPI:

pip install carte-ai

Post installation check

After a correct installation, you should be able to import the module without errors:

import carte_ai
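
If you prefer a check that reports a result instead of raising, the sketch below uses only the standard library. The helper name `check_install` is hypothetical. It assumes the importable module is `carte_ai` and the distribution name is `carte-ai`, as in the commands above.

```python
import importlib.util
from importlib import metadata


def check_install(module_name: str, dist_name: str) -> dict:
    """Report whether a module is importable and which distribution version is installed."""
    importable = importlib.util.find_spec(module_name) is not None
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # the distribution is not installed in this environment
    return {"importable": importable, "version": version}


status = check_install("carte_ai", "carte-ai")
print(status)
```

A result with `importable` set to False or `version` set to None suggests the package was installed into a different environment than the one running the interpreter.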

02 CARTE-AI example on sampled data step by step ➡️

1️⃣ Load the Data 💽

import pandas as pd
from carte_ai.data.load_data import wina_pl  # import the loader explicitly rather than with *

num_train = 128  # Example: set the number of training groups/entities
random_state = 1  # Set a random seed for reproducibility
X_train, X_test, y_train, y_test = wina_pl(num_train, random_state)
print("Wina Poland dataset:", X_train.shape, X_test.shape)
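
The loader's internals are not shown here. As an illustration of what `num_train` and `random_state` typically control, the sketch below performs a seeded split in plain Python. The function `seeded_split` and the stand-in rows are hypothetical and are not part of carte_ai.

```python
import random


def seeded_split(rows, num_train, random_state):
    """Shuffle row indices with a fixed seed; the first num_train go to training."""
    indices = list(range(len(rows)))
    random.Random(random_state).shuffle(indices)  # local RNG, global state untouched
    train_idx, test_idx = indices[:num_train], indices[num_train:]
    return [rows[i] for i in train_idx], [rows[i] for i in test_idx]


rows = [{"id": i} for i in range(10)]  # hypothetical stand-in rows
train_a, test_a = seeded_split(rows, num_train=4, random_state=1)
train_b, test_b = seeded_split(rows, num_train=4, random_state=1)
print(len(train_a), len(test_a), train_a == train_b)
```

Calling the function twice with the same seed yields the same split, which is the property that makes results reproducible.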

2️⃣ Convert Table 2 Graph 🪵

The basic preparations are:

  • preprocess raw data
  • load the prepared data and configs; set train/test split
  • generate a graph for each table entry (row) using the Table2GraphTransformer
  • create an estimator and run inference

import fasttext
from huggingface_hub import hf_hub_download
from carte_ai import Table2GraphTransformer

model_path = hf_hub_download(repo_id="hi-paris/fastText", filename="cc.en.300.bin")

preprocessor = Table2GraphTransformer(fasttext_model_path=model_path)

# Fit and transform the training data
X_train = preprocessor.fit_transform(X_train, y=y_train)

# Transform the test data
X_test = preprocessor.transform(X_test)
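
The code above calls `fit_transform` on the training data and only `transform` on the test data. The sketch below illustrates that convention with a toy scaler in plain Python. The `SimpleScaler` class is hypothetical and says nothing about how Table2GraphTransformer works internally. It only shows why state learned from the training data is reused on the test data.

```python
from statistics import mean, pstdev


class SimpleScaler:
    """Toy transformer: learns mean and scale in fit, applies them in transform."""

    def fit(self, values):
        self.mean_ = mean(values)
        self.scale_ = pstdev(values) or 1.0  # avoid dividing by zero on constant input
        return self

    def transform(self, values):
        # Uses statistics learned in fit; never recomputes them from the input
        return [(v - self.mean_) / self.scale_ for v in values]

    def fit_transform(self, values):
        return self.fit(values).transform(values)


scaler = SimpleScaler()
train_scaled = scaler.fit_transform([1.0, 2.0, 3.0])  # statistics come from train only
test_scaled = scaler.transform([2.0, 4.0])  # test data reuses the train statistics
print(train_scaled, test_scaled)
```

Fitting on the test data as well would leak information about the test set into the preprocessing step.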

3️⃣ Make Predictions🔮

CARTE currently exposes a scikit-learn-style interface (fit/predict). The process is:

  • Define parameters
  • Set the estimator
  • Run 'fit' to train the model and 'predict' to make predictions

from sklearn.metrics import r2_score  # needed for the score computed below
from carte_ai import CARTERegressor, CARTEClassifier

# Define some parameters
fixed_params = dict()
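fixed_params["num_model"] = 10  # 10 models for the bagging strategy
fixed_params["disable_pbar"] = False  # Set to True to hide progress bars
fixed_params["random_state"] = 0
fixed_params["device"] = "cpu"
fixed_params["n_jobs"] = 10
# NOTE: config_directory is not defined in this snippet. It is expected to be a
# mapping whose "pretrained_model" entry is the path to the pretrained CARTE weights.
fixed_params["pretrained_model_path"] = config_directory["pretrained_model"]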


# Define the estimator and run fit/predict

estimator = CARTERegressor(**fixed_params) # CARTERegressor for Regression
estimator.fit(X=X_train, y=y_train)
y_pred = estimator.predict(X_test)

# Obtain the r2 score on predictions

score = r2_score(y_test, y_pred)
print(f"\nThe R2 score for CARTE: {score:.4f}")
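
For readers unfamiliar with the metric, the sketch below computes R2 by hand as 1 - SS_res / SS_tot, using plain Python. The function name `r2` and the numbers are hypothetical and are not results from CARTE.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical targets and predictions, for illustration only
y_true = [3.0, 5.0, 7.0, 9.0]
y_hat = [2.5, 5.0, 7.5, 9.0]
score_by_hand = r2(y_true, y_hat)
print(f"{score_by_hand:.4f}")
```

A score of 1 means perfect predictions, and a score of 0 means the model does no better than always predicting the mean of the targets.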

03 Reproducing paper results ⚙️

➡️ See the installation instructions for the paper setup

04 Contribute to the package 🚀

➡️ Read the contribution guidelines

05 CARTE-AI references 📚

@article{kim2024carte,
  title={CARTE: pretraining and transfer for tabular learning},
  author={Kim, Myung Jun and Grinsztajn, L{\'e}o and Varoquaux, Ga{\"e}l},
  journal={arXiv preprint arXiv:2402.16785},
  year={2024}
}
