Python SDK for Chalk

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk seamlessly handles data infrastructure with a best-in-class developer experience. Here's how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

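from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

# Transaction is assumed to be another @features class defined elsewhere.
@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)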

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.

SQL

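from chalk import online
from chalk.features import Features
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()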

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    return (
        requests.get("https://api.socure.com", json={
            "id": uid  # quoted key; a bare `id` would be Python's builtin function
        }).json()['socure_score']
    )

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.
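
One way to picture this orchestration, as a toy sketch and not Chalk's engine, is to treat each resolver as a node whose inputs are other features and then run the resolvers in dependency order. The resolver functions, the `RESOLVERS` table, and `compute` below are hypothetical names, and the sketch runs sequentially rather than in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical resolvers: plain functions standing in for @online resolvers.
def get_full_name(user_id):
    return f"user-{user_id}"

def get_nickname(full_name):
    return full_name.split("-")[0]

# Output feature name -> (input feature names, resolver function).
RESOLVERS = {
    "full_name": (("id",), get_full_name),
    "nickname": (("full_name",), get_nickname),
}

def compute(requested, known):
    """Return the requested features, running upstream resolvers first.

    For simplicity this computes every feature that has a resolver,
    not only the ones the request needs.
    """
    graph = {out: set(inputs) for out, (inputs, _) in RESOLVERS.items()}
    values = dict(known)
    for feature in TopologicalSorter(graph).static_order():
        if feature in values:
            continue  # supplied by the caller or already computed
        inputs, fn = RESOLVERS[feature]
        values[feature] = fn(*(values[name] for name in inputs))
    return {name: values[name] for name in requested}

result = compute(["nickname"], {"id": 7})
```

Here `nickname` is only computable once `full_name` exists, so the sorter places `get_full_name` first. A real engine could additionally run independent branches of the graph concurrently.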

Chalk has built-in support for feature engineering workflows -- no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases, charge a high cost per call, or both. Chalk lets you optimize latency and cost by defining declarative caching policies that are integrated throughout the system. You no longer have to manage Redis, Memcached, or DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
    return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

chalk.query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"}
)
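
To make the staleness semantics concrete, here is a minimal sketch of a cache with a default tolerance and a per-query override. It is an illustration under stated assumptions, not Chalk's implementation: `StalenessCache`, `parse_duration`, and `slow_balance` are hypothetical names, the duration suffixes are assumed to mean seconds, minutes, hours, and days, and time is passed in explicitly as seconds so the behaviour is deterministic.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def parse_duration(text):
    """Convert a string such as "10m" or "1d" to seconds (assumed format)."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]

class StalenessCache:
    """Serve a cached value unless it is older than the allowed staleness."""

    def __init__(self, resolver, max_staleness):
        self.resolver = resolver
        self.default_limit = parse_duration(max_staleness)
        self.entries = {}  # key -> (value, computed_at_seconds)

    def get(self, key, now, max_staleness=None):
        limit = self.default_limit if max_staleness is None else parse_duration(max_staleness)
        cached = self.entries.get(key)
        if cached is not None and now - cached[1] <= limit:
            return cached[0]
        value = self.resolver(key)  # slow or expensive call
        self.entries[key] = (value, now)
        return value

calls = []

def slow_balance(account_id):
    # Stand-in for a vendor API; returns a new value on every call.
    calls.append(account_id)
    return 100 * len(calls)

cache = StalenessCache(slow_balance, max_staleness="1d")
first = cache.get("acct-1", now=0)       # miss: resolver runs
second = cache.get("acct-1", now=3600)   # one hour old, within "1d": cache hit
third = cache.get("acct-1", now=3600, max_staleness="1m")  # too stale for "1m": resolver runs again
```

The per-query override only tightens or loosens the freshness check for that one read; the stored entry is shared, so a fresh value fetched for a strict query also benefits later, more tolerant queries.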

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources: join data from services like S3, Redshift, BigQuery, Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
    return redshift.query(...).incremental()
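
The idea behind an incremental read can be sketched with a high-water mark: each run remembers the newest timestamp it has seen and the next run only picks up rows after it. This is a simplified illustration, not how Chalk implements `.incremental()`; `IncrementalReader` and the `updated_at` column are assumptions made for the example.

```python
from datetime import datetime

class IncrementalReader:
    """Return only rows newer than the newest row seen by earlier runs."""

    def __init__(self):
        self.high_water_mark = None  # newest updated_at ingested so far

    def read_new(self, rows):
        fresh = [
            row for row in rows
            if self.high_water_mark is None or row["updated_at"] > self.high_water_mark
        ]
        if fresh:
            self.high_water_mark = max(row["updated_at"] for row in fresh)
        return fresh

# Hypothetical warehouse table, represented as a list of dicts.
warehouse = [
    {"id": 1, "datawarehouse_feature": 0.2, "updated_at": datetime(2024, 1, 1, 9)},
    {"id": 2, "datawarehouse_feature": 0.7, "updated_at": datetime(2024, 1, 1, 10)},
]

reader = IncrementalReader()
first_run = reader.read_new(warehouse)   # both rows are new

warehouse.append(
    {"id": 1, "datawarehouse_feature": 0.3, "updated_at": datetime(2024, 1, 1, 11)}
)
second_run = reader.read_new(warehouse)  # only the row added since the first run
```

In a real warehouse the filter would be pushed into the query rather than applied in Python, so each scheduled run scans only the new slice of the table.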

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.
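
Conceptually, a reverse ETL job copies the most recent value per entity from the slow store into a key-value store that can answer point lookups quickly. The sketch below shows that reduction with plain Python structures; `load_online_store` and the row layout are hypothetical, and the managed pipeline Chalk builds is not shown here.

```python
def load_online_store(offline_rows):
    """Keep the newest value per entity id, ready for key-value serving."""
    latest = {}
    for row in offline_rows:
        current = latest.get(row["id"])
        if current is None or row["ts"] > current["ts"]:
            latest[row["id"]] = row
    return {entity_id: row["value"] for entity_id, row in latest.items()}

# Hypothetical offline rows: several historical observations per entity.
offline_rows = [
    {"id": 1, "ts": 10, "value": 0.2},
    {"id": 2, "ts": 12, "value": 0.7},
    {"id": 1, "ts": 15, "value": 0.3},
]

online_store = load_online_store(offline_rows)
```

After the load, a lookup such as `online_store[1]` is a dictionary access instead of a warehouse query.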


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally-consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match and dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with very low latency, so you can use your feature pipelines to serve online inference use cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API:

from chalk.client import ChalkClient

result = ChalkClient().query(
    input={
        User.name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
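
A point-in-time lookup of this kind can be sketched as an "as of" search over each entity's observation history: for every label timestamp, take the latest value observed at or before it and never a later one. The `HISTORY` log, `value_as_of`, and the integer timestamps are hypothetical and only illustrate the idea; they are not Chalk's offline store.

```python
from bisect import bisect_right

# Hypothetical observation log: per user, (timestamp, value) pairs sorted by timestamp.
HISTORY = {
    "u1": [(10, 0.1), (20, 0.4), (30, 0.9)],
    "u2": [(15, 0.5)],
}

def value_as_of(user_id, ts):
    """Latest value observed at or before `ts`, or None if nothing was known yet."""
    observations = HISTORY.get(user_id, [])
    timestamps = [t for t, _ in observations]
    index = bisect_right(timestamps, ts)
    return observations[index - 1][1] if index else None

# Labels with different past timestamps, as in the offline query above.
labels = [("u1", 25), ("u1", 5), ("u2", 15)]
training_rows = [(user, ts, value_as_of(user, ts)) for user, ts in labels]
```

The row for `("u1", 25)` receives the value recorded at time 20, not the later value from time 30, which is exactly the "future" leakage that temporal consistency rules out. The row for `("u1", 5)` has no value because nothing had been observed yet.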
