
Python SDK for Chalk


Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk handles data infrastructure with a best-in-class developer experience. Here’s how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

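from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    # Transaction is assumed to be another @features class defined elsewhere.
    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)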

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.

SQL

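from chalk.features import Features, online
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    # Column names in the query match the feature names being resolved.
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()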

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    return (
        # The key must be the string "id"; a bare `id` is Python's builtin function.
        requests.get("https://api.socure.com", json={
            "id": uid
        }).json()['socure_score']
    )

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.

Chalk has built-in support for feature engineering workflows -- no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.
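To make the orchestration idea concrete, here is a minimal pure-Python sketch. It is not Chalk's engine and does not use the Chalk SDK: the `Resolver` class, the `plan` and `execute` functions, and the feature names are all hypothetical. It only illustrates how resolvers that depend on each other's outputs can be put in a runnable order, skipping any resolver the requested feature does not need.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Resolver:
    """Hypothetical stand-in for a feature resolver (not a Chalk class)."""
    output: str                  # feature this resolver produces
    inputs: Tuple[str, ...]      # features it needs before it can run
    fn: Callable[..., object]    # plain Python function computing the output


def plan(resolvers: List[Resolver], wanted: str) -> List[Resolver]:
    """Return only the resolvers needed for `wanted`, dependencies first."""
    by_output = {r.output: r for r in resolvers}
    graph: Dict[str, Tuple[str, ...]] = {}
    stack = [wanted]
    while stack:
        name = stack.pop()
        if name in graph or name not in by_output:
            continue  # already visited, or a raw input supplied by the caller
        graph[name] = by_output[name].inputs
        stack.extend(by_output[name].inputs)
    order = TopologicalSorter(graph).static_order()
    return [by_output[n] for n in order if n in by_output]


def execute(resolvers: List[Resolver], wanted: str, known: Dict[str, object]) -> object:
    """Run the planned resolvers, feeding each one the values it depends on."""
    values = dict(known)
    for r in plan(resolvers, wanted):
        values[r.output] = r.fn(*(values[i] for i in r.inputs))
    return values[wanted]


resolvers = [
    Resolver("user.email", ("user.id",), lambda uid: f"user{uid}@example.com"),
    Resolver("user.email_domain", ("user.email",), lambda email: email.split("@")[1]),
    Resolver("user.unused", ("user.id",), lambda uid: uid),  # never planned below
]
print(execute(resolvers, "user.email_domain", {"user.id": 7}))  # prints "example.com"
```

A real engine would also run independent resolvers in parallel; this sketch runs them one at a time for clarity.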

Caching

Many data sources (like vendor APIs) are too slow for online use cases or charge a high cost per call. Chalk lets you optimize latency and cost by defining declarative caching policies that are integrated throughout the system. You no longer have to manage Redis, Memcached, or DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
  return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

ChalkClient().query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"}
)
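The staleness semantics described above can be sketched in plain Python. This is an illustration only, not Chalk's cache: `StalenessCache`, `parse_duration`, and the `slow_balance` function are hypothetical names, and the duration format (a number followed by `s`, `m`, `h`, or `d`) is an assumption based on the strings used in this section. A cached value is reused only while it is younger than the tolerance, and a caller can pass a tighter tolerance for a single lookup.

```python
import re
import time
from typing import Callable, Dict, Optional, Tuple

_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text: str) -> int:
    """Convert strings like "1d" or "10m" to seconds (assumed format)."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


class StalenessCache:
    """Toy cache: reuse a value only while it is younger than max_staleness."""

    def __init__(self, resolver: Callable[[object], object], max_staleness: str,
                 clock: Callable[[], float] = time.time) -> None:
        self._resolver = resolver
        self._default = parse_duration(max_staleness)
        self._clock = clock
        self._store: Dict[object, Tuple[float, object]] = {}

    def get(self, key: object, max_staleness: Optional[str] = None) -> object:
        # A per-call tolerance overrides the default set at definition time.
        limit = self._default if max_staleness is None else parse_duration(max_staleness)
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] <= limit:
            return hit[1]
        value = self._resolver(key)  # too old or missing: recompute and store
        self._store[key] = (now, value)
        return value


calls = []

def slow_balance(account_id: str) -> int:
    """Pretend vendor call; records how often it is really invoked."""
    calls.append(account_id)
    return 100 * len(calls)

fake_now = [0.0]  # controllable clock so the example does not sleep
cache = StalenessCache(slow_balance, "1d", clock=lambda: fake_now[0])
cache.get("acct-1")                      # miss: calls slow_balance
fake_now[0] = 3600.0                     # one hour later
cache.get("acct-1")                      # hit: one hour is within "1d"
cache.get("acct-1", max_staleness="1m")  # miss: one hour exceeds "1m"
print(len(calls))                        # prints 2
```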

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources -- join data from services like S3, Redshift, BQ, Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
  return redshift.query(...).incremental()

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.
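Incremental reads are commonly built on a high-water mark: each run fetches only rows newer than the latest timestamp seen so far. The sketch below shows that general technique in plain Python. How Chalk tracks progress internally is not described here, so `incremental_read`, the `updated_at` column, and the sample rows are hypothetical.

```python
from datetime import datetime
from typing import Dict, Iterable, List, Optional, Tuple

Row = Dict[str, object]


def incremental_read(
    rows: Iterable[Row],
    watermark: Optional[datetime],
    ts_column: str = "updated_at",
) -> Tuple[List[Row], Optional[datetime]]:
    """Return rows newer than `watermark` plus the advanced watermark."""
    fresh = [r for r in rows if watermark is None or r[ts_column] > watermark]
    if fresh:
        watermark = max(r[ts_column] for r in fresh)
    return fresh, watermark


table: List[Row] = [
    {"id": 1, "score": 0.2, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "score": 0.5, "updated_at": datetime(2024, 1, 2)},
]

# First scheduled run: no watermark yet, so every row is ingested.
batch, mark = incremental_read(table, None)
print([r["id"] for r in batch])  # prints [1, 2]

# A new row lands in the warehouse before the next run.
table.append({"id": 3, "score": 0.9, "updated_at": datetime(2024, 1, 3)})

# Second run: only the row newer than the stored watermark is ingested.
batch, mark = incremental_read(table, mark)
print([r["id"] for r in batch])  # prints [3]
```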

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally-consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match and dramatically cuts development time.

Online inference

Chalk's online store & feature computation engine make it easy to query features with ultra-low latency, so you can use your feature pipelines to serve online inference use-cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API:

from chalk.client import ChalkClient

result = ChalkClient().query(
    input={
        User.name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

# offline_query is an instance method, so construct a client first.
X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
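A small pure-Python sketch can show what temporal consistency means in practice. It is an illustration of an "as-of" lookup, not Chalk's implementation: `value_as_of`, `training_rows`, and the sample history are hypothetical. For each label timestamp, the lookup returns the most recent value observed at or before that time and never a later one.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Dict, List, Optional, Tuple

# Observation log per (entity id, feature name): (observed_at, value), sorted by time.
History = Dict[Tuple[int, str], List[Tuple[datetime, float]]]


def value_as_of(history: History, entity_id: int, feature: str,
                at: datetime) -> Optional[float]:
    """Latest value observed at or before `at`, or None if nothing was known yet."""
    observations = history.get((entity_id, feature), [])
    times = [t for t, _ in observations]
    idx = bisect_right(times, at)  # count of observations with time <= at
    return observations[idx - 1][1] if idx else None


def training_rows(history: History, labels: List[Tuple[int, datetime]],
                  features: List[str]) -> List[Dict[str, object]]:
    """Build one row per label using only data available at the label's time."""
    rows = []
    for entity_id, at in labels:
        row: Dict[str, object] = {"id": entity_id, "ts": at}
        for name in features:
            row[name] = value_as_of(history, entity_id, name, at)
        rows.append(row)
    return rows


history: History = {
    (1, "credit_score"): [
        (datetime(2023, 1, 1), 600.0),
        (datetime(2023, 3, 1), 720.0),
    ],
}
labels = [
    (1, datetime(2022, 12, 1)),  # before any observation
    (1, datetime(2023, 2, 1)),   # between the two observations
    (1, datetime(2023, 4, 1)),   # after both observations
]
for row in training_rows(history, labels, ["credit_score"]):
    print(row["ts"].date(), row["credit_score"])
# prints:
# 2022-12-01 None
# 2023-02-01 600.0
# 2023-04-01 720.0
```

The February label gets 600.0 even though 720.0 exists in the log, because the higher score was not yet known at that time.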

