Python SDK for Chalk

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk handles data infrastructure with a best-in-class developer experience. Here's how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

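from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

# Transaction is another @features class, defined elsewhere in the project.
@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)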

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.

SQL

pg = PostgreSQLSource()

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    return (
        requests.get("https://api.socure.com", json={
            "id": uid
        }).json()['socure_score']
    )

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.
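One way to picture this orchestration is as a dependency graph: each resolver declares which features it needs, and a planner runs upstream resolvers before the ones that consume their outputs. The sketch below illustrates that idea with the standard library only. It is not Chalk's engine, and the registry, feature names, and `resolve` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: output feature -> (input features, function).
# These names are illustrative and are not part of the Chalk SDK.
RESOLVERS = {
    "user.full_name": (["user.id"], lambda uid: f"user-{uid}"),
    "user.name_length": (["user.full_name"], lambda name: len(name)),
    "user.is_long_name": (["user.name_length"], lambda n: n > 6),
}


def resolve(requested, known):
    """Compute the requested features, running upstream resolvers first."""
    values = dict(known)

    # Walk backwards from the requested features to collect only the
    # resolvers that are actually needed.
    graph = {}
    stack = list(requested)
    while stack:
        feature = stack.pop()
        if feature in values or feature in graph:
            continue
        deps, _ = RESOLVERS[feature]
        graph[feature] = set(deps)
        stack.extend(deps)

    # static_order() yields every feature after all of its dependencies.
    for feature in TopologicalSorter(graph).static_order():
        if feature in values:
            continue  # supplied as an input, nothing to compute
        deps, fn = RESOLVERS[feature]
        values[feature] = fn(*(values[d] for d in deps))

    return {name: values[name] for name in requested}
```

Calling `resolve(["user.is_long_name"], {"user.id": 42})` runs three resolvers in dependency order. A real engine could also run independent branches of the graph in parallel, which this sequential sketch does not attempt.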

Chalk has built-in support for feature engineering workflows -- no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases or charge a high cost per call. Chalk lets you optimize latency and cost by defining declarative caching policies that are integrated throughout the system. You no longer have to manage Redis, Memcached, or DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
  return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

ChalkClient().query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"}
)
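To make the staleness semantics concrete, here is a small sketch of a cache in which each read states how old a value may be: a default tolerance is set once, and a single read can tighten it. This is a conceptual illustration, not Chalk's implementation. The `StalenessCache` class and `parse_duration` helper are hypothetical, and the duration format (an integer followed by `s`, `m`, `h`, or `d`) is an assumption based on the strings used in the examples above.

```python
import time

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text):
    """Convert a string such as "10m" or "1d" into seconds (assumed format)."""
    return int(text[:-1]) * UNIT_SECONDS[text[-1]]


class StalenessCache:
    """Cache whose reads state how old a returned value may be."""

    def __init__(self, compute, default_max_staleness, clock=time.monotonic):
        self._compute = compute  # slow or costly function, e.g. a vendor API call
        self._default = parse_duration(default_max_staleness)
        self._clock = clock      # injectable so tests can control time
        self._entries = {}       # key -> (value, time stored)

    def get(self, key, max_staleness=None):
        # A per-read tolerance overrides the default one.
        limit = self._default if max_staleness is None else parse_duration(max_staleness)
        now = self._clock()
        if key in self._entries:
            value, stored_at = self._entries[key]
            if now - stored_at <= limit:
                return value  # fresh enough, skip the slow call
        value = self._compute(key)
        self._entries[key] = (value, now)
        return value
```

With a default of `"1d"`, repeated reads within a day reuse the stored value, while `cache.get(key, max_staleness="1m")` forces a recompute whenever the stored value is more than a minute old.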

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources: join data from services like S3, Redshift, BigQuery, and Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
  return redshift.query(...).incremental()

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.
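The idea behind an incremental read can be sketched with a watermark: remember the newest timestamp already ingested and, on the next scheduled run, take only rows newer than it. The `IncrementalReader` class below is a hypothetical, in-memory illustration and does not reflect how Chalk tracks ingestion state. In a real warehouse the filter would normally be pushed into the query rather than applied in Python.

```python
from datetime import datetime


class IncrementalReader:
    """Return only rows updated since the previous read."""

    def __init__(self):
        self._watermark = datetime.min  # nothing ingested yet

    def read(self, rows):
        # `rows` stands in for a warehouse table: (updated_at, payload) pairs.
        fresh = [row for row in rows if row[0] > self._watermark]
        if fresh:
            # Advance the watermark to the newest row just ingested.
            self._watermark = max(updated_at for updated_at, _ in fresh)
        return fresh
```

Reading the same table twice returns every row the first time and an empty list the second time, until new rows with later timestamps arrive.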

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.
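Conceptually, a reverse ETL step copies the newest value for each key from the slow store into a fast key-value store, without overwriting anything fresher that is already online. The function below is a hypothetical sketch of that single step, using plain lists and dictionaries in place of real stores. It is not Chalk's pipeline.

```python
def sync_offline_to_online(offline_rows, online_store):
    """Copy the newest value per key from offline rows into the online store.

    offline_rows: iterable of (key, observed_at, value) tuples.
    online_store: dict mapping key -> (observed_at, value).
    """
    # Find the most recent offline observation for each key.
    newest = {}
    for key, observed_at, value in offline_rows:
        if key not in newest or observed_at > newest[key][0]:
            newest[key] = (observed_at, value)

    # Write it online only if it is newer than what is already served.
    for key, (observed_at, value) in newest.items():
        current = online_store.get(key)
        if current is None or observed_at > current[0]:
            online_store[key] = (observed_at, value)
    return online_store
```

Running a step like this on an interval (such as the `"5m"` in the decorator above) keeps the online copy close to the warehouse while reads stay fast.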


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally-consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match and dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with very low latency, so you can use your feature pipelines to serve online inference use cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API:

result = ChalkClient().query(
    input={
        User.name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
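A point-in-time lookup is the core of this guarantee: for each label timestamp, take the latest feature value observed at or before that time and never a later one. The sketch below shows the lookup with the standard library. The function names and the use of integer timestamps are illustrative assumptions, not Chalk's API.

```python
from bisect import bisect_right


def value_as_of(history, ts):
    """Return the latest value observed at or before `ts`, else None.

    `history` is a list of (observed_at, value) pairs sorted by observed_at.
    """
    times = [observed_at for observed_at, _ in history]
    index = bisect_right(times, ts)  # number of observations with time <= ts
    return history[index - 1][1] if index else None


def build_training_rows(labels, history_by_user):
    """Attach to each (user_id, ts, label) the feature value known at ts."""
    return [
        (user_id, ts, value_as_of(history_by_user.get(user_id, []), ts), label)
        for user_id, ts, label in labels
    ]
```

A label at time 25 for a user whose score was recorded at times 10, 20, and 30 receives the value from time 20. The value recorded at 30 is "future" data relative to that label and is never used.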

Download files

chalkpy 2.119.2 is published as a source distribution (chalkpy-2.119.2.tar.gz, 1.6 MB) and as built wheels for CPython 3.10, 3.11, 3.12, and 3.13 on each of these platforms:

  • Windows x86-64 (win_amd64), 3.5 MB
  • Linux with glibc 2.28+, x86-64 (manylinux_2_28_x86_64), 3.8 MB
  • Linux with glibc 2.28+, ARM64 (manylinux_2_28_aarch64), 3.8 MB
  • macOS 11.0+ ARM64 (macosx_11_0_arm64), 3.6 MB
  • macOS 10.13+ x86-64 (macosx_10_13_x86_64), 3.7 MB
