Python SDK for Chalk

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk seamlessly handles data infrastructure with a best-in-class developer experience. Here's how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

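from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

# Transaction is assumed to be another @features class defined elsewhere.
@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    # One-to-many relationship: all transactions whose user_id matches this user.
    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)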

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.

SQL

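from chalk import online
from chalk.features import Features
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    # Look up a single user row by primary key.
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()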

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    return (
        requests.get("https://api.socure.com", json={
            "id": uid
        }).json()['socure_score']
    )
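
Because resolvers can depend on the outputs of other resolvers, something has to decide the order in which they run. The sketch below is not Chalk's engine. It is a minimal pure-Python illustration that assumes a hypothetical registry (`RESOLVERS`) mapping each feature name to its input features and a function, and it uses the standard-library `graphlib` module to order the calls. All feature names and functions here are made up for the example.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: output feature -> (input features, function).
# These names are illustrative and are not part of the Chalk SDK.
RESOLVERS = {
    "user.full_name": (["user.id"], lambda uid: f"User {uid}"),
    "user.email": (["user.id"], lambda uid: f"user{uid}@example.com"),
    "user.greeting": (
        ["user.full_name", "user.email"],
        lambda name, email: f"{name} <{email}>",
    ),
}


def resolve(target, known):
    """Compute `target` by running registered functions in dependency order."""
    values = dict(known)
    # Map each feature to the set of features it depends on.
    graph = {out: set(deps) for out, (deps, _) in RESOLVERS.items()}
    # For simplicity this walks every registered feature, not only the
    # ones that `target` actually needs.
    for feature in TopologicalSorter(graph).static_order():
        if feature in values:
            continue  # supplied by the caller or already computed
        deps, fn = RESOLVERS[feature]
        values[feature] = fn(*(values[d] for d in deps))
    return values[target]


print(resolve("user.greeting", {"user.id": 7}))
```

Features with no dependency on each other (here `user.full_name` and `user.email`) are the ones an engine could run in parallel; this sketch runs them sequentially.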

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.

Chalk has built-in support for feature engineering workflows -- no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases, charge a high cost per call, or both. Chalk lets you optimize latency and cost by defining declarative caching policies that are integrated throughout the system. You no longer have to manage Redis, Memcached, or DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
    return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

ChalkClient().query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"}
)
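
To make the idea of a staleness tolerance concrete, here is a minimal sketch of a cache that serves a stored value only while it is fresh enough and lets a caller tighten the limit per request. It is an illustration under stated assumptions, not how Chalk implements caching: `StalenessCache`, `slow_vendor_call`, and the use of seconds instead of duration strings such as "1d" are all hypothetical.

```python
import time
from typing import Callable, Optional


class StalenessCache:
    """Store values with a timestamp and serve them only while fresh enough."""

    def __init__(self, default_max_staleness: float):
        self.default_max_staleness = default_max_staleness  # seconds
        self._store = {}  # key -> (value, computed_at)

    def get(self, key, compute: Callable[[], object],
            max_staleness: Optional[float] = None,
            now: Optional[float] = None):
        now = time.time() if now is None else now
        # A per-call limit, when given, overrides the default policy.
        limit = self.default_max_staleness if max_staleness is None else max_staleness
        hit = self._store.get(key)
        if hit is not None and now - hit[1] <= limit:
            return hit[0]  # cached value is fresh enough
        value = compute()  # missing or too stale: recompute and store
        self._store[key] = (value, now)
        return value


calls = []


def slow_vendor_call():
    """Stand-in for an expensive API call; counts how often it runs."""
    calls.append(1)
    return 700 + len(calls)


cache = StalenessCache(default_max_staleness=86_400)  # one day, in seconds
a = cache.get("user:1", slow_vendor_call, now=0)      # miss: computes
b = cache.get("user:1", slow_vendor_call, now=3_600)  # hit: one hour old
c = cache.get("user:1", slow_vendor_call, max_staleness=60, now=3_600)  # override: recomputes
print(a, b, c, len(calls))
```

The third call asks for data at most 60 seconds old, so the hour-old entry is rejected and the vendor is called a second time.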

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources -- join data from services like S3, Redshift, BQ, Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
    return redshift.query(...).incremental()

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.
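
Incremental reads are commonly built on a high-water mark: each run remembers the newest row it has seen and later runs fetch only rows beyond it. The following is a sketch of that general technique, not of Chalk's implementation. The `IncrementalReader` class, the `updated_at` column, and the in-memory `warehouse` list standing in for a real table are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class IncrementalReader:
    """Remember the largest `updated_at` seen so each run reads only newer rows."""
    high_water_mark: int = 0
    ingested: list = field(default_factory=list)

    def run(self, table):
        # Select only rows newer than anything ingested so far.
        new_rows = [r for r in table if r["updated_at"] > self.high_water_mark]
        if new_rows:
            self.high_water_mark = max(r["updated_at"] for r in new_rows)
            self.ingested.extend(new_rows)
        return new_rows


warehouse = [
    {"user_id": 1, "value": 0.2, "updated_at": 100},
    {"user_id": 2, "value": 0.9, "updated_at": 150},
]
reader = IncrementalReader()
first = reader.run(warehouse)   # first run reads both rows
warehouse.append({"user_id": 3, "value": 0.5, "updated_at": 200})
second = reader.run(warehouse)  # second run reads only the new row
print(len(first), len(second), reader.high_water_mark)
```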

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.
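
Conceptually, a reverse ETL job copies the most recent value for each key from the slow store into a fast key-value store. The sketch below illustrates only that idea: `sync_offline_to_online`, the row layout, and the plain dictionary standing in for an online store are hypothetical and do not reflect how Chalk's managed pipeline works.

```python
def sync_offline_to_online(offline_rows, online_store):
    """Copy the most recent offline value per key into a fast key-value store."""
    latest = {}
    for row in offline_rows:
        key = row["user_id"]
        # Keep only the newest row for each key.
        if key not in latest or row["updated_at"] > latest[key]["updated_at"]:
            latest[key] = row
    for key, row in latest.items():
        online_store[key] = row["value"]
    return len(latest)


offline_rows = [
    {"user_id": 1, "value": 0.2, "updated_at": 100},
    {"user_id": 1, "value": 0.7, "updated_at": 300},
    {"user_id": 2, "value": 0.9, "updated_at": 150},
]
online_store = {}
synced = sync_offline_to_online(offline_rows, online_store)
print(synced, online_store)
```

Running a function like this on a fixed interval is what a setting such as "5m" would correspond to in this simplified picture.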


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally-consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match and dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with ultra-low latency, so you can use your feature pipelines to serve online inference use cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API:

result = ChalkClient().query(
    input={
        User.name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
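
A point-in-time lookup is one way to picture temporal consistency: for each label timestamp, take the latest feature value observed at or before that time and never a later one. The sketch below shows the technique with the standard-library `bisect` module. The `history` layout, `value_as_of`, and the integer timestamps are assumptions for illustration, not Chalk's storage format or API.

```python
from bisect import bisect_right

# Hypothetical feature history: (observed_at, value), sorted by observed_at.
history = {
    ("user", 1): [(10, 0.1), (20, 0.4), (30, 0.9)],
}


def value_as_of(key, ts):
    """Return the latest value observed at or before `ts`, or None if none exists."""
    rows = history.get(key, [])
    times = [t for t, _ in rows]
    i = bisect_right(times, ts)  # number of observations with time <= ts
    return rows[i - 1][1] if i else None


labels = [(1, 5), (1, 20), (1, 25), (1, 99)]  # (user id, label timestamp)
training_rows = [(uid, ts, value_as_of(("user", uid), ts)) for uid, ts in labels]
print(training_rows)
```

The label at time 25 receives the value observed at time 20, not the one from time 30, so no "future" data leaks into the training row. The label at time 5 predates every observation and gets `None`.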

Download files

Source distribution

  • chalkpy-2.110.6.tar.gz (1.3 MB): source

Built distributions

  • chalkpy-2.110.6-cp313-cp313-win_amd64.whl (3.2 MB): CPython 3.13, Windows x86-64
  • chalkpy-2.110.6-cp313-cp313-manylinux_2_28_x86_64.whl (3.5 MB): CPython 3.13, manylinux (glibc 2.28+) x86-64
  • chalkpy-2.110.6-cp313-cp313-manylinux_2_28_aarch64.whl (3.5 MB): CPython 3.13, manylinux (glibc 2.28+) ARM64
  • chalkpy-2.110.6-cp313-cp313-macosx_11_0_arm64.whl (3.3 MB): CPython 3.13, macOS 11.0+ ARM64
  • chalkpy-2.110.6-cp313-cp313-macosx_10_13_x86_64.whl (3.4 MB): CPython 3.13, macOS 10.13+ x86-64
  • chalkpy-2.110.6-cp312-cp312-win_amd64.whl (3.2 MB): CPython 3.12, Windows x86-64
  • chalkpy-2.110.6-cp312-cp312-manylinux_2_28_x86_64.whl (3.5 MB): CPython 3.12, manylinux (glibc 2.28+) x86-64
  • chalkpy-2.110.6-cp312-cp312-manylinux_2_28_aarch64.whl (3.5 MB): CPython 3.12, manylinux (glibc 2.28+) ARM64
  • chalkpy-2.110.6-cp312-cp312-macosx_11_0_arm64.whl (3.3 MB): CPython 3.12, macOS 11.0+ ARM64
  • chalkpy-2.110.6-cp312-cp312-macosx_10_13_x86_64.whl (3.4 MB): CPython 3.12, macOS 10.13+ x86-64
  • chalkpy-2.110.6-cp311-cp311-win_amd64.whl (3.2 MB): CPython 3.11, Windows x86-64
  • chalkpy-2.110.6-cp311-cp311-manylinux_2_28_x86_64.whl (3.5 MB): CPython 3.11, manylinux (glibc 2.28+) x86-64
  • chalkpy-2.110.6-cp311-cp311-manylinux_2_28_aarch64.whl (3.5 MB): CPython 3.11, manylinux (glibc 2.28+) ARM64
  • chalkpy-2.110.6-cp311-cp311-macosx_11_0_arm64.whl (3.3 MB): CPython 3.11, macOS 11.0+ ARM64
  • chalkpy-2.110.6-cp311-cp311-macosx_10_13_x86_64.whl (3.4 MB): CPython 3.11, macOS 10.13+ x86-64
  • chalkpy-2.110.6-cp310-cp310-win_amd64.whl (3.2 MB): CPython 3.10, Windows x86-64
  • chalkpy-2.110.6-cp310-cp310-manylinux_2_28_x86_64.whl (3.5 MB): CPython 3.10, manylinux (glibc 2.28+) x86-64
  • chalkpy-2.110.6-cp310-cp310-manylinux_2_28_aarch64.whl (3.5 MB): CPython 3.10, manylinux (glibc 2.28+) ARM64
  • chalkpy-2.110.6-cp310-cp310-macosx_11_0_arm64.whl (3.3 MB): CPython 3.10, macOS 11.0+ ARM64
  • chalkpy-2.110.6-cp310-cp310-macosx_10_13_x86_64.whl (3.4 MB): CPython 3.10, macOS 10.13+ x86-64
