Python SDK for Chalk

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk handles data infrastructure with a best-in-class developer experience. Here's how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

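from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

@features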
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.
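To make the idea of resolvers depending on other resolvers concrete, here is a toy sketch in plain Python. It is not Chalk's engine or API: the `resolver` decorator, the `RESOLVERS` registry, and the feature names are all hypothetical, and the sketch resolves features one at a time with no parallelism or cycle detection.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical registry: output feature name -> (function, names of input features).
RESOLVERS: Dict[str, Tuple[Callable[..., Any], Tuple[str, ...]]] = {}


def resolver(output: str, inputs: Tuple[str, ...] = ()):
    """Register a function as the way to compute `output` from `inputs`."""
    def register(fn):
        RESOLVERS[output] = (fn, inputs)
        return fn
    return register


def resolve(feature: str, known: Dict[str, Any]) -> Any:
    """Compute `feature`, recursively resolving any inputs that are not yet known."""
    if feature not in known:
        fn, inputs = RESOLVERS[feature]
        known[feature] = fn(*(resolve(name, known) for name in inputs))
    return known[feature]


@resolver("user.full_name", inputs=("user.id",))
def full_name(uid: int) -> str:
    return {1: "Katherine Johnson"}[uid]  # stand-in for a database lookup


@resolver("user.initials", inputs=("user.full_name",))
def initials(name: str) -> str:
    return "".join(part[0] for part in name.split())


values = {"user.id": 1}
result = resolve("user.initials", values)  # runs full_name first, then initials
```

Asking for `user.initials` triggers the `user.full_name` resolver first because it is listed as an input, which is the same dependency-following behavior described above, only without any real infrastructure.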

SQL

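from chalk import online
from chalk.features import Features
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()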

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()
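The `:id` placeholder in the query above is a named bind parameter. To experiment with the same query shape without a Chalk deployment, the following sketch uses Python's built-in `sqlite3` module, which accepts the same `:name` syntax. The in-memory table, its row, and the `fetch_user` helper are made up for illustration and are not part of Chalk.

```python
import sqlite3


def fetch_user(conn: sqlite3.Connection, uid: int):
    """Return (email, full_name) for one user, or None if the id is unknown."""
    cursor = conn.execute(
        "select email, full_name from users where id=:id",
        {"id": uid},  # values are bound by name, never spliced into the SQL text
    )
    return cursor.fetchone()


conn = sqlite3.connect(":memory:")
conn.execute("create table users (id integer primary key, email text, full_name text)")
conn.execute("insert into users values (1, 'kj@example.com', 'Katherine Johnson')")

row = fetch_user(conn, 1)
```

Binding values by name keeps the query text constant and avoids string formatting, which is why the resolver passes `args=dict(id=uid)` rather than building the SQL string itself.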

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    return (
        requests.get("https://api.socure.com", json={
            "id": uid
        }).json()['socure_score']
    )

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.

Chalk has built-in support for feature engineering workflows -- no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases or charge a high cost per call. Chalk lets you optimize latency and cost by defining declarative caching policies that are integrated throughout the system. You no longer have to manage Redis, Memcached, or DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
  return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

chalk.query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"}
)
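For intuition about what a staleness tolerance means, here is a minimal stdlib-only sketch of a cache that reuses a value only while it is younger than the allowed age, with an optional per-call override. The `StalenessCache` class, `parse_duration`, and `slow_balance_lookup` are hypothetical names for this illustration; Chalk's actual caching is managed for you and may work differently.

```python
import time
from typing import Any, Callable, Dict, Hashable, Optional, Tuple

SECONDS_PER_UNIT = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text: str) -> float:
    """Convert a single-unit duration such as '10m' or '1d' to seconds."""
    return float(text[:-1]) * SECONDS_PER_UNIT[text[-1]]


class StalenessCache:
    """Serve a cached value unless it is older than the allowed staleness."""

    def __init__(self, compute: Callable[[Hashable], Any],
                 default_max_staleness: str,
                 clock: Callable[[], float] = time.time):
        self._compute = compute
        self._default = default_max_staleness
        self._clock = clock
        self._store: Dict[Hashable, Tuple[Any, float]] = {}  # key -> (value, computed_at)

    def get(self, key: Hashable, max_staleness: Optional[str] = None) -> Any:
        limit = parse_duration(max_staleness or self._default)
        now = self._clock()
        cached = self._store.get(key)
        if cached is not None and now - cached[1] <= limit:
            return cached[0]  # fresh enough: skip the expensive call
        value = self._compute(key)
        self._store[key] = (value, now)
        return value


calls = []


def slow_balance_lookup(account_id):
    calls.append(account_id)  # stand-in for a slow or costly vendor call
    return 100 * len(calls)


fake_now = [0.0]  # controllable clock so the example is deterministic
cache = StalenessCache(slow_balance_lookup, "1d", clock=lambda: fake_now[0])
first = cache.get("acct-1")         # miss: computes the value
fake_now[0] = 3600.0                # one hour later
second = cache.get("acct-1")        # hit: one hour old is within the 1d default
third = cache.get("acct-1", "1m")   # per-call override: too old, so recompute
```

The default tolerance plays the role of the policy set on the feature definition, and the optional argument to `get` plays the role of the query-time override.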

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources: join data from services like S3, Redshift, BigQuery, Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver, and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
  return redshift.query(...).incremental()

Chalk makes this data available for point-in-time-correct dataset generation for data science use cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.
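One common way to implement incremental reads is a high-water mark: remember the newest timestamp already ingested and, on the next run, read only rows updated after it. The sketch below shows that idea on an in-memory list. The `read_incremental` function and the `updated_at` column are assumptions for illustration; how Chalk tracks progress internally is not stated here.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional, Tuple

Row = Dict[str, Any]


def read_incremental(rows: List[Row],
                     watermark: Optional[datetime]) -> Tuple[List[Row], Optional[datetime]]:
    """Return rows updated after `watermark`, plus the advanced watermark."""
    fresh = [r for r in rows if watermark is None or r["updated_at"] > watermark]
    # If nothing new arrived, keep the old watermark unchanged.
    new_watermark = max((r["updated_at"] for r in fresh), default=watermark)
    return fresh, new_watermark


table = [
    {"user_id": 1, "value": 0.2, "updated_at": datetime(2024, 1, 1, 9)},
    {"user_id": 2, "value": 0.7, "updated_at": datetime(2024, 1, 1, 10)},
]
batch1, mark = read_incremental(table, None)   # first run reads everything

table.append({"user_id": 1, "value": 0.3, "updated_at": datetime(2024, 1, 1, 11)})
batch2, mark = read_incremental(table, mark)   # second run reads only the new row
```

Each scheduled run then costs time proportional to the new data rather than to the full table.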

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally-consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match and dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with ultra-low latency, so you can use your feature pipelines to serve online inference use cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API:

result = ChalkClient().query(
    input={
        User.full_name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.id, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
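Temporal consistency can be pictured as an "as of" lookup: for each label timestamp, take the most recent feature value observed at or before that time and never a later one. The following sketch illustrates the lookup with the standard `bisect` module. The `value_as_of` helper and the sample credit-score history are invented for this example and are not part of Chalk's API.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Any, List, Optional, Tuple


def value_as_of(history: List[Tuple[datetime, Any]], ts: datetime) -> Optional[Any]:
    """Return the latest value observed at or before `ts`.

    `history` must be sorted by observation time. Returns None when the
    feature had not been observed yet at `ts`.
    """
    times = [observed_at for observed_at, _ in history]
    index = bisect_right(times, ts)  # count of observations with time <= ts
    return history[index - 1][1] if index else None


credit_score_history = [
    (datetime(2023, 1, 1), 640),
    (datetime(2023, 3, 1), 700),
]

feb = value_as_of(credit_score_history, datetime(2023, 2, 1))    # sees only the January value
apr = value_as_of(credit_score_history, datetime(2023, 4, 1))    # sees the March update
early = value_as_of(credit_score_history, datetime(2022, 12, 1)) # nothing observed yet
```

A label dated February is paired with the January score even though a newer score exists, which is what keeps "future" data out of the training row.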
