Python SDK for Chalk

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk handles data infrastructure with a best-in-class developer experience. Here's how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

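from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

@features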
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.

SQL

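from chalk import online
from chalk.features import Features
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()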

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    return (
        # The JSON key must be the string "id", not the built-in function id.
        requests.get("https://api.socure.com", json={
            "id": uid
        }).json()['socure_score']
    )
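
To illustrate how resolvers can depend on the outputs of other resolvers, here is a minimal sketch in plain Python. It does not use the Chalk SDK: the `resolver` decorator, the `RESOLVERS` registry, the `compute` function, and the feature names are all hypothetical, and the sketch runs resolvers sequentially rather than in parallel. It only shows the general idea of planning a pipeline from declared inputs and outputs.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: output feature name -> (input feature names, function).
RESOLVERS = {}

def resolver(inputs, output):
    """Register a plain function as the way to compute `output` from `inputs`."""
    def wrap(fn):
        RESOLVERS[output] = (tuple(inputs), fn)
        return fn
    return wrap

@resolver(inputs=["user.id"], output="user.email")
def get_email(uid):
    return f"user{uid}@example.com"

@resolver(inputs=["user.email"], output="user.email_domain")
def get_domain(email):
    return email.split("@")[1]

def compute(target, known):
    """Compute `target`, running only the resolvers it transitively needs."""
    values = dict(known)
    graph = {}
    stack = [target]
    # Walk backwards from the target to collect the required resolvers.
    while stack:
        name = stack.pop()
        if name in values or name in graph:
            continue
        inputs, _ = RESOLVERS[name]
        graph[name] = set(inputs)
        stack.extend(inputs)
    # Run resolvers in dependency order so inputs exist before they are used.
    for name in TopologicalSorter(graph).static_order():
        if name in values:
            continue
        inputs, fn = RESOLVERS[name]
        values[name] = fn(*(values[i] for i in inputs))
    return values[target]

result = compute("user.email_domain", {"user.id": 7})
print(result)
```

Only `user.id` is supplied here; the sketch works out that `user.email` has to be resolved first. A feature with no registered resolver and no supplied value raises `KeyError` in this toy version.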

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.

Chalk has built-in support for feature engineering workflows, so there is no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases or charge a high cost per call. Chalk lets you optimize latency and cost by defining declarative caching policies that are integrated throughout the system. You no longer have to manage Redis, Memcached, or DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
  return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

chalk.query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"}
)
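
The caching behavior described above can be pictured with a small stand-alone sketch. It is not Chalk's implementation: `parse_duration`, `StalenessCache`, and the injected `clock` are hypothetical names, and the duration format is assumed to be an integer followed by `s`, `m`, `h`, or `d`. The sketch shows a default staleness tolerance and a per-query override.

```python
import re
import time

_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def parse_duration(text):
    """Convert strings such as "1d" or "10m" into seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]

class StalenessCache:
    """Serve cached values until they are older than the allowed staleness."""

    def __init__(self, resolver, max_staleness, clock=time.time):
        self._resolver = resolver
        self._default = parse_duration(max_staleness)
        self._clock = clock
        self._store = {}  # key -> (value, computed_at)
        self.calls = 0    # how often the slow resolver actually ran

    def get(self, key, max_staleness=None):
        # A per-query tolerance, if given, replaces the default.
        limit = self._default if max_staleness is None else parse_duration(max_staleness)
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[1] <= limit:
            return hit[0]
        self.calls += 1
        value = self._resolver(key)
        self._store[key] = (value, now)
        return value

# A fake clock keeps the example deterministic.
now = [0.0]
cache = StalenessCache(lambda uid: uid * 10, "1d", clock=lambda: now[0])
a = cache.get(3)                      # miss: resolver runs
now[0] = 3600.0                       # one hour later
b = cache.get(3)                      # hit: one hour is within "1d"
c = cache.get(3, max_staleness="1m")  # override: one hour exceeds "1m", so recompute
print(cache.calls)
```

Passing the clock in as a parameter is a design choice for testability; a real cache would also need eviction and concurrency control, which this sketch omits.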

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources: join data from services like S3, Redshift, BigQuery, Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
  return redshift.query(...).incremental()

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.
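
One common way to implement the incremental reads mentioned above is a high-water mark: each run fetches only rows newer than the newest row seen so far. The sketch below illustrates that idea with in-memory data. It is an assumption about the general technique, not a description of how Chalk's `.incremental()` works, and `ROWS`, `read_incremental`, and the `updated_at` column are hypothetical.

```python
from datetime import datetime, timezone

def utc(day):
    """Build a UTC timestamp in January 2024 for the example rows."""
    return datetime(2024, 1, day, tzinfo=timezone.utc)

# Stand-in for a warehouse table with a last-modified column.
ROWS = [
    {"user_id": 1, "value": 0.2, "updated_at": utc(1)},
    {"user_id": 2, "value": 0.5, "updated_at": utc(2)},
    {"user_id": 1, "value": 0.9, "updated_at": utc(3)},
]

def read_incremental(rows, watermark):
    """Return rows newer than `watermark` and the advanced watermark."""
    fresh = [r for r in rows if watermark is None or r["updated_at"] > watermark]
    if fresh:
        watermark = max(r["updated_at"] for r in fresh)
    return fresh, watermark

first, mark = read_incremental(ROWS, None)   # initial run reads everything
ROWS.append({"user_id": 2, "value": 0.7, "updated_at": utc(4)})
second, mark = read_incremental(ROWS, mark)  # later run reads only the new row
print(len(first), len(second))
```

The strict `>` comparison means rows that share the watermark's exact timestamp but arrive later would be missed; production systems usually add a tie-breaker or an overlap window.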

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.
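
Conceptually, a reverse ETL job copies the newest offline value for each key into a fast online store on a schedule. The sketch below shows one such sync pass over in-memory data. `sync_to_online`, the row layout, and the dictionary used as the online store are hypothetical stand-ins, not Chalk's pipeline.

```python
def sync_to_online(offline_rows, online_store):
    """Upsert the newest offline value per key; return the number of writes."""
    written = 0
    for row in sorted(offline_rows, key=lambda r: r["ts"]):
        current = online_store.get(row["key"])
        # Only overwrite when the offline row is newer than what is online.
        if current is None or row["ts"] > current["ts"]:
            online_store[row["key"]] = {"value": row["value"], "ts": row["ts"]}
            written += 1
    return written

offline = [
    {"key": "u1", "value": 1.0, "ts": 1},
    {"key": "u1", "value": 2.0, "ts": 5},
    {"key": "u2", "value": 3.0, "ts": 2},
]
online = {}
first_pass = sync_to_online(offline, online)   # populates the online store
second_pass = sync_to_online(offline, online)  # nothing new, so nothing is written
print(online["u1"]["value"], first_pass, second_pass)
```

Comparing timestamps before writing makes repeated runs idempotent, which is what allows a job like this to run on a fixed interval such as every five minutes.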


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally-consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match and dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with ultra-low latency, so you can use your feature pipelines to serve online inference use-cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API, shown here through the Python client:

result = ChalkClient().query(
    input={
        User.full_name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
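
Temporal consistency can be illustrated as an "as-of" lookup: for each label timestamp, take the most recent feature value observed at or before that time, and never a later one. The sketch below is a minimal illustration in plain Python. `HISTORY`, `value_as_of`, and the integer timestamps are hypothetical and do not reflect how Chalk stores observations.

```python
from bisect import bisect_right

# Hypothetical observation log: (timestamp, value) pairs sorted by timestamp.
HISTORY = {
    ("user_1", "credit_score"): [(10, 640.0), (20, 655.0), (30, 700.0)],
}

def value_as_of(entity, feature, ts):
    """Return the latest value observed at or before `ts`, or None."""
    observations = HISTORY.get((entity, feature), [])
    times = [t for t, _ in observations]
    idx = bisect_right(times, ts)  # number of observations with time <= ts
    return observations[idx - 1][1] if idx else None

# Labels carry different past timestamps; each gets the value known at that time.
labels = [("user_1", 5), ("user_1", 20), ("user_1", 25), ("user_1", 99)]
rows = [value_as_of(entity, "credit_score", ts) for entity, ts in labels]
print(rows)  # [None, 655.0, 655.0, 700.0]
```

The label at time 25 receives 655.0 rather than 700.0 because the later observation did not exist yet at that time, which is what keeps "future" data out of training rows.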

Download files

Download the file for your platform.

Source Distribution

chalkpy-2.123.9.tar.gz (1.6 MB): Source

Built Distributions

chalkpy-2.123.9-cp313-cp313-win_amd64.whl (3.6 MB): CPython 3.13, Windows x86-64
chalkpy-2.123.9-cp313-cp313-manylinux_2_28_x86_64.whl (3.8 MB): CPython 3.13, manylinux: glibc 2.28+ x86-64
chalkpy-2.123.9-cp313-cp313-manylinux_2_28_aarch64.whl (3.8 MB): CPython 3.13, manylinux: glibc 2.28+ ARM64
chalkpy-2.123.9-cp313-cp313-macosx_11_0_arm64.whl (3.7 MB): CPython 3.13, macOS 11.0+ ARM64
chalkpy-2.123.9-cp313-cp313-macosx_10_13_x86_64.whl (3.7 MB): CPython 3.13, macOS 10.13+ x86-64
chalkpy-2.123.9-cp312-cp312-win_amd64.whl (3.6 MB): CPython 3.12, Windows x86-64
chalkpy-2.123.9-cp312-cp312-manylinux_2_28_x86_64.whl (3.8 MB): CPython 3.12, manylinux: glibc 2.28+ x86-64
chalkpy-2.123.9-cp312-cp312-manylinux_2_28_aarch64.whl (3.8 MB): CPython 3.12, manylinux: glibc 2.28+ ARM64
chalkpy-2.123.9-cp312-cp312-macosx_11_0_arm64.whl (3.7 MB): CPython 3.12, macOS 11.0+ ARM64
chalkpy-2.123.9-cp312-cp312-macosx_10_13_x86_64.whl (3.7 MB): CPython 3.12, macOS 10.13+ x86-64
chalkpy-2.123.9-cp311-cp311-win_amd64.whl (3.6 MB): CPython 3.11, Windows x86-64
chalkpy-2.123.9-cp311-cp311-manylinux_2_28_x86_64.whl (3.9 MB): CPython 3.11, manylinux: glibc 2.28+ x86-64
chalkpy-2.123.9-cp311-cp311-manylinux_2_28_aarch64.whl (3.8 MB): CPython 3.11, manylinux: glibc 2.28+ ARM64
chalkpy-2.123.9-cp311-cp311-macosx_11_0_arm64.whl (3.7 MB): CPython 3.11, macOS 11.0+ ARM64
chalkpy-2.123.9-cp311-cp311-macosx_10_13_x86_64.whl (3.7 MB): CPython 3.11, macOS 10.13+ x86-64
chalkpy-2.123.9-cp310-cp310-win_amd64.whl (3.6 MB): CPython 3.10, Windows x86-64
chalkpy-2.123.9-cp310-cp310-manylinux_2_28_x86_64.whl (3.9 MB): CPython 3.10, manylinux: glibc 2.28+ x86-64
chalkpy-2.123.9-cp310-cp310-manylinux_2_28_aarch64.whl (3.8 MB): CPython 3.10, manylinux: glibc 2.28+ ARM64
chalkpy-2.123.9-cp310-cp310-macosx_11_0_arm64.whl (3.7 MB): CPython 3.10, macOS 11.0+ ARM64
chalkpy-2.123.9-cp310-cp310-macosx_10_13_x86_64.whl (3.7 MB): CPython 3.10, macOS 10.13+ x86-64
