Python SDK for Chalk

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk handles data infrastructure with a best-in-class developer experience. Here's how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

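from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    # Transaction is another @features class (definition not shown here).
    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)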

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.
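To illustrate how resolvers that depend on each other's outputs can be ordered into a pipeline, here is a minimal standard-library sketch. It is not Chalk's engine: the `RESOLVERS` registry, the `plan` helper, and the `get_name_match` resolver are hypothetical names invented for this example, and features are represented as plain strings.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: resolver name -> (input features, output features).
# Names echo the README's examples but are illustrative only.
RESOLVERS = {
    "get_user": ({"User.id"}, {"User.full_name", "User.email"}),
    "get_socure_score": ({"User.id"}, {"User.socure_score"}),
    "get_name_match": (
        {"User.full_name", "User.socure_score"},
        {"User.name_match_score"},
    ),
}


def plan(resolvers, known, wanted):
    """Return resolver names in an order that computes `wanted` from `known`."""
    # Map each feature to the resolver that produces it.
    producer = {out: name for name, (_, outs) in resolvers.items() for out in outs}

    graph = {}  # resolver name -> resolver names it depends on
    stack = [f for f in wanted if f not in known]
    while stack:
        feature = stack.pop()
        if feature not in producer:
            raise KeyError(f"no resolver produces {feature}")
        name = producer[feature]
        if name in graph:
            continue  # already scheduled
        inputs, _ = resolvers[name]
        missing = [f for f in inputs if f not in known]
        graph[name] = {producer[f] for f in missing if f in producer}
        stack.extend(missing)
    # Dependencies come before the resolvers that consume their outputs.
    return list(TopologicalSorter(graph).static_order())


order = plan(RESOLVERS, known={"User.id"}, wanted={"User.name_match_score"})
print(order)  # get_name_match is last; the other two may appear in either order
```

Resolvers with no dependency on each other (here `get_user` and `get_socure_score`) are the ones an engine could run in parallel.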

SQL

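from chalk import online
from chalk.features import Features
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    # Look up a single user row by primary key.
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()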

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    return (
        # The key must be the string "id", not the builtin function id.
        requests.get("https://api.socure.com", json={
            "id": uid
        }).json()['socure_score']
    )

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.

Chalk has built-in support for feature engineering workflows, so there is no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases, charge a high cost per call, or both. Chalk lets you optimize latency and cost by defining declarative caching policies that are integrated throughout the system. You no longer have to manage Redis, Memcached, or DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
    return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

ChalkClient().query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"}
)
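The staleness semantics described in this section can be sketched with the standard library alone. This is a conceptual model, not Chalk's cache: `StalenessCache` and `parse_duration` are hypothetical helpers, and the duration grammar (an integer followed by `s`, `m`, `h`, or `d`) is an assumption based on the strings used in the examples above.

```python
import re
from datetime import datetime, timedelta

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text):
    """Parse strings such as "1d" or "10m" into a timedelta (assumed format)."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})


class StalenessCache:
    """Serves a cached value only while it is younger than max_staleness."""

    def __init__(self, resolver, max_staleness):
        self.resolver = resolver
        self.max_staleness = parse_duration(max_staleness)
        self._store = {}  # key -> (value, computed_at)

    def get(self, key, now, max_staleness=None):
        # A per-query override takes precedence over the declared policy.
        limit = parse_duration(max_staleness) if max_staleness else self.max_staleness
        cached = self._store.get(key)
        if cached is not None and now - cached[1] <= limit:
            return cached[0]
        # Too old or never computed: call the slow source and remember when.
        value = self.resolver(key)
        self._store[key] = (value, now)
        return value


# Usage: the lambda stands in for a slow vendor API.
cache = StalenessCache(lambda account_id: 100, max_staleness="1d")
start = datetime(2024, 1, 1)
cache.get("acct-1", now=start)                          # calls the resolver
cache.get("acct-1", now=start + timedelta(hours=1))     # served from cache
cache.get("acct-1", now=start + timedelta(hours=1), max_staleness="1m")  # recomputed
```

Passing `now` explicitly keeps the sketch deterministic; a real cache would read the clock itself.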

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources: join data from services like S3, Redshift, BigQuery, Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
    return redshift.query(...).incremental()
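One common way to implement incremental reads is a high-water mark: each run ingests only rows newer than the newest row seen so far. The sketch below shows that idea with in-memory rows. `IncrementalReader` and the `updated_at` column are hypothetical, and this is not a description of how Chalk tracks progress internally.

```python
from datetime import datetime


class IncrementalReader:
    """Returns only rows newer than the newest row seen on earlier runs."""

    def __init__(self, timestamp_column):
        self.timestamp_column = timestamp_column
        self.watermark = None  # timestamp of the newest row ingested so far

    def read(self, rows):
        col = self.timestamp_column
        fresh = [r for r in rows if self.watermark is None or r[col] > self.watermark]
        if fresh:
            # Advance the watermark so the next run skips these rows.
            self.watermark = max(r[col] for r in fresh)
        return fresh


# Usage: the list stands in for a warehouse table that grows between runs.
table = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 2)},
]
reader = IncrementalReader("updated_at")
first_run = reader.read(table)    # both rows
table.append({"id": 3, "updated_at": datetime(2024, 1, 3)})
second_run = reader.read(table)   # only the new row
```

A strict `>` comparison assumes timestamps are unique per row; sources with ties at the watermark need a tie-breaker such as a primary key.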

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.
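Conceptually, a reverse ETL step reduces an append-only offline log to the newest value per entity and loads that into a key-value store for fast lookup. The following sketch assumes rows shaped as `(entity_id, observed_at, value)` tuples. The `latest_per_entity` helper is hypothetical and the dictionary stands in for an online store.

```python
def latest_per_entity(offline_rows):
    """Reduce an append-only offline log to the newest value for each entity."""
    newest = {}  # entity id -> (observed_at, value)
    for entity_id, observed_at, value in offline_rows:
        current = newest.get(entity_id)
        # Later observations replace earlier ones; ties keep the later row.
        if current is None or observed_at >= current[0]:
            newest[entity_id] = (observed_at, value)
    return {entity_id: value for entity_id, (_, value) in newest.items()}


# Usage: observed_at is an integer here for brevity.
offline_rows = [(1, 10, 0.2), (2, 5, 0.9), (1, 20, 0.4)]
online_store = latest_per_entity(offline_rows)
print(online_store)  # {1: 0.4, 2: 0.9}
```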


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Notably, Chalk uses the exact same source code to serve temporally consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match, and it dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine serve feature queries at low latency, so you can use your feature pipelines for online inference use cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API, shown here through the Python client:

result = ChalkClient().query(
    input={
        User.name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
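Temporal consistency amounts to an "as of" lookup: for each label timestamp, take the latest feature value observed at or before that time and never a later one. The sketch below shows the lookup with the standard library. `as_of` and `build_training_set` are hypothetical helpers, timestamps are plain integers, and this is a conceptual illustration rather than Chalk's offline query implementation.

```python
from bisect import bisect_right


def as_of(history, timestamp):
    """Return the latest value observed at or before `timestamp`, else None.

    `history` is a list of (observed_at, value) pairs sorted by observed_at.
    """
    times = [observed_at for observed_at, _ in history]
    index = bisect_right(times, timestamp)
    return history[index - 1][1] if index else None


def build_training_set(labels, feature_history):
    """Attach to each (entity_id, timestamp, label) the feature value as of that time."""
    return [
        (entity_id, ts, as_of(feature_history.get(entity_id, []), ts), label)
        for entity_id, ts, label in labels
    ]


# Usage: credit scores observed for user 1 at times 10, 20, and 30.
feature_history = {1: [(10, 600), (20, 650), (30, 700)]}
labels = [(1, 25, True), (1, 5, False)]
rows = build_training_set(labels, feature_history)
print(rows)  # [(1, 25, 650, True), (1, 5, None, False)]
```

The label at time 25 sees 650 rather than the later 700, and the label at time 5 sees no value because nothing had been observed yet.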
