Deep Feature Synthesis in Rust

This is a project to implement the Deep Feature Synthesis algorithm in Rust.

Roadmap

  • (30JUL2022[^1]) Parity with Featuretools in creating feature definitions (no calculation) on a single table
  • Parity with Featuretools in creating feature definitions (no calculation) on multiple tables
  • Explore calculation on a single table using Polars
  • Explore calculation on multiple tables using Polars

[^1]: Three primitives are not yet passing tests: 'diff_datetime', 'geomidpoint', and 'age'.

Running in Python

pip install rust_dfs

Using from Python

# Import Featuretools, rust_dfs, and some other utility functions
import featuretools as ft
from rust_dfs.utils import convert_features, convert_primitives, dataframe_to_features, df_to_es
from rust_dfs import generate_features, compare_featuresets, Feature
from rust_dfs.generate_fake_dataframe import generate_pandas_fake_dataframe

# Generate a fake dataset with 4 Numeric columns
df = generate_pandas_fake_dataframe(
    n_rows=10,
    col_defs=[
        ("Numeric", 4),
    ]
)

# pick some primitives
f_primitives = [
    ft.primitives.GreaterThan,
    # ft.primitives.LessThan
]

# or use all of them
# f_primitives = list(ft.primitives.utils.get_transform_primitives().values())

# convert dataframe to an entityset
es = df_to_es(df)

# run dfs with features_only=True
ft_feats = ft.dfs(
    entityset=es, 
    target_dataframe_name="nums", 
    trans_primitives=f_primitives, 
    features_only=True,
    max_depth=1
)

# Example output, abbreviated to two columns:
# ft_feats = [<Feature: F_0>, <Feature: F_1>, <Feature: F_0 > F_1>, <Feature: F_1 > F_0>]

# Convert back into a format that we can use to compare with rust
c_feats = list(convert_features(ft_feats).values())

# Now run using Rust

# convert featuretools primitives to rust primitives
r_primitives = convert_primitives(f_primitives)

# convert dataframe to rust features
r_features = dataframe_to_features(es.dataframes[0])

# generate engineered features using Rust (create new features only)
r_derived_feats = generate_features(r_features, r_primitives)


# compare the two feature sets: a = only in Featuretools, b = only in Rust
a, b = compare_featuresets(c_feats, r_derived_feats)

print("=== Features generated by Featuretools, that were NOT generated by Rust ===")
print(a)
print()
print("=== Features generated by Rust, that were NOT generated by Featuretools ===")
print(b)


# Persist Rust Features to disk
Feature.save_features(r_derived_feats, "all_features.json")
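
To make the "feature definitions only, no calculation" step above more concrete, here is a minimal pure-Python sketch of what enumerating depth-1 definitions for a binary primitive such as GreaterThan could look like, together with a two-way comparison in the spirit of the a, b pair above. The tuple layout and the helper names generate_binary_features and compare_sets are hypothetical illustrations; they are not part of the rust_dfs or Featuretools APIs, and the real implementations may work differently.

```python
from itertools import permutations
from typing import List, Set, Tuple

# A feature definition is modelled here as (primitive_name, input_column_names).
# This tuple layout is hypothetical; it is not rust_dfs's internal representation.
FeatureDef = Tuple[str, Tuple[str, ...]]


def generate_binary_features(columns: List[str], primitive: str) -> List[FeatureDef]:
    """Enumerate depth-1 definitions for a non-commutative binary primitive.

    Order matters (F_0 > F_1 differs from F_1 > F_0), so every ordered pair
    of distinct columns yields one definition. Nothing is calculated.
    """
    return [(primitive, pair) for pair in permutations(columns, 2)]


def compare_sets(
    left: List[FeatureDef], right: List[FeatureDef]
) -> Tuple[Set[FeatureDef], Set[FeatureDef]]:
    """Return (only in left, only in right)."""
    left_set, right_set = set(left), set(right)
    return left_set - right_set, right_set - left_set


columns = ["F_0", "F_1"]
derived = generate_binary_features(columns, "greater_than")
print(derived)
# [('greater_than', ('F_0', 'F_1')), ('greater_than', ('F_1', 'F_0'))]

# Pretend a second implementation missed one definition.
reference = derived[:1]
only_derived, only_reference = compare_sets(derived, reference)
print(only_derived)    # {('greater_than', ('F_1', 'F_0'))}
print(only_reference)  # set()
```

With four columns, the same sketch yields 12 ordered pairs, which is why the number of definitions grows quickly as columns and primitives are added.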

Develop Guide

Install pipenv

pip install pipenv
pipenv install
pipenv shell

Ensure Cargo.toml is configured

[lib]
name = "rust_dfs"
crate-type = ["cdylib"]

Run maturin

maturin develop

Run main.rs

To run as a Rust binary:

cargo run --no-default-features

Test

cargo test --no-default-features

Why --no-default-features? Because this crate builds with PyO3's extension-module feature as a default feature. As the PyO3 documentation puts it:

Finally, don't forget that on MacOS the extension-module feature will cause cargo test to fail without the --no-default-features flag (see the FAQ).
