Column-wise type annotations for pyspark DataFrames

Typedspark: column-wise type annotations for pyspark DataFrames

We love Spark! But in production code we're wary when we see:

from pyspark.sql import DataFrame

def foo(df: DataFrame) -> DataFrame:
    # do stuff
    return df

Because… How do we know which columns are supposed to be in df?

Using typedspark, we can be more explicit about what these data should look like.

from typedspark import Column, DataSet, Schema
from pyspark.sql.types import LongType, StringType

class Person(Schema):
    id: Column[LongType]
    name: Column[StringType]
    age: Column[LongType]

def foo(df: DataSet[Person]) -> DataSet[Person]:
    # do stuff
    return df

The advantages include:

Improved readability of the code
Type checking, both at runtime and during linting
Auto-complete of column names
Easy refactoring of column names
Easier unit testing through the generation of empty DataSets based on their schemas
Improved documentation of tables
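To make the runtime-checking idea concrete, here is a minimal plain-Python sketch of what "knowing which columns are supposed to be in df" can mean. It is not typedspark's implementation and does not use its API: `PersonColumns` and `missing_columns` are hypothetical names invented for this illustration, and the column list is a plain list of strings standing in for `df.columns`.

```python
from typing import get_type_hints


class PersonColumns:
    """Hypothetical stand-in for a schema: attribute names are column names."""

    id: int
    name: str
    age: int


def missing_columns(schema: type, actual_columns: list[str]) -> list[str]:
    """Return the columns declared on `schema` that are absent from `actual_columns`."""
    expected = list(get_type_hints(schema))
    return [col for col in expected if col not in actual_columns]


# In real code, the second argument would come from df.columns.
print(missing_columns(PersonColumns, ["id", "name"]))
```

Declaring the columns once, as class attributes, is also what lets an IDE auto-complete and rename them, which a bare `DataFrame` annotation cannot offer.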

Documentation

Please see our documentation on readthedocs.

Installation

You can install typedspark from PyPI by running:

pip install typedspark

By default, typedspark does not list pyspark as a dependency, since many platforms (e.g. Databricks) come with pyspark preinstalled. If you want to install typedspark with pyspark, you can run:

pip install "typedspark[pyspark]"
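Since pyspark is an optional extra, it can be useful to check whether the current environment already provides it before choosing between the two commands. The sketch below uses only the standard library; `is_installed` is a hypothetical helper name, not part of typedspark.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if a top-level package can be found on the import path."""
    return importlib.util.find_spec(package) is not None


if is_installed("pyspark"):
    print("pyspark found: `pip install typedspark` is enough")
else:
    print('pyspark not found: consider `pip install "typedspark[pyspark]"`')
```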

Compatibility

Typedspark is tested in CI with PySpark 3.5.7 and 4.1.0. Spark Connect is supported when using PySpark 4.x, and the Connect-specific test runs if SPARK_CONNECT_URL is set.
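One common way to make a test run only when an environment variable is set is `unittest.skipUnless`. The sketch below shows that pattern for `SPARK_CONNECT_URL`; it is an assumption about how such gating could look, not a copy of typedspark's CI code, and `connect_url` and `ConnectOnlyTest` are hypothetical names.

```python
import os
import unittest
from typing import Mapping


def connect_url(env: Mapping[str, str] = os.environ) -> str:
    """Return the configured Spark Connect URL, or "" if none is set."""
    return env.get("SPARK_CONNECT_URL", "").strip()


@unittest.skipUnless(connect_url(), "SPARK_CONNECT_URL is not set")
class ConnectOnlyTest(unittest.TestCase):
    """Hypothetical placeholder for a test that needs a Spark Connect server."""

    def test_url_is_configured(self):
        self.assertTrue(connect_url())
```

With the variable unset, a test runner reports this class as skipped rather than failed, so the same suite can run in both classic and Connect environments.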

Demo videos

IDE demo

https://github.com/kaiko-ai/typedspark/assets/47976799/e6f7fa9c-6d14-4f68-baba-fe3c22f75b67

You can find the corresponding code here.

Jupyter / Databricks notebooks demo

https://github.com/kaiko-ai/typedspark/assets/47976799/39e157c3-6db0-436a-9e72-44b2062df808

You can find the corresponding code here.

FAQ

I found a bug! What should I do?
Great! Please open an issue and we'll look into it.

I have a great idea to improve typedspark! How can we make this work?
Awesome, please open an issue and let us know!

Release history

Version	Release date	Notes
1.7.0	Apr 28, 2026	This version
1.6.3	Mar 6, 2026
1.6.2	Jan 5, 2026
1.6.1	Dec 28, 2025
1.6.0	Dec 24, 2025
1.5.5	Sep 19, 2025
1.5.4	Aug 9, 2025
1.5.3	Apr 9, 2025
1.5.2	Mar 10, 2025
1.5.1	Dec 17, 2024
1.5.0	Aug 12, 2024
1.4.4	Jun 8, 2024
1.4.3	Jun 3, 2024
1.4.2	Apr 30, 2024
1.4.1	Apr 13, 2024
1.3.3	Apr 5, 2024
1.3.2	Feb 27, 2024
1.3.1	Feb 26, 2024
1.3.0	Jan 24, 2024
1.2.3	Dec 22, 2023
1.2.2	Nov 13, 2023
1.2.1	Nov 8, 2023
1.2.0	Nov 8, 2023	Yanked. Reason: support for DataSet Implements was prematurely released
1.1.3	Nov 7, 2023
1.1.1	Sep 26, 2023
1.1.0	Sep 12, 2023
1.0.19	Aug 8, 2023
1.0.18	Aug 8, 2023
1.0.17	Aug 2, 2023
1.0.16	Jul 24, 2023
1.0.15	Jul 19, 2023
1.0.14	Jul 10, 2023
1.0.13	Jul 6, 2023
1.0.12	Jul 3, 2023
1.0.11	Jun 29, 2023
1.0.10	Jun 28, 2023
1.0.9	Jun 28, 2023
1.0.8	Jun 28, 2023
1.0.7	Jun 26, 2023
1.0.6	Jun 2, 2023
1.0.5	May 25, 2023
1.0.4	May 24, 2023
1.0.3	Apr 29, 2023
1.0.2	Apr 18, 2023
1.0.1	Mar 30, 2023
1.0.0	Mar 30, 2023
0.0.4	Mar 29, 2023
0.0.3	Mar 29, 2023
0.0.2	Mar 29, 2023
0.0.1	Mar 28, 2023
0.0.0	Apr 12, 2024

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

typedspark-1.7.0.tar.gz (31.5 kB)

Uploaded Apr 28, 2026 Source

Built Distribution

typedspark-1.7.0-py3-none-any.whl (39.1 kB)

Uploaded Apr 28, 2026 Python 3

File details

Details for the file typedspark-1.7.0.tar.gz.

File metadata

Download URL: typedspark-1.7.0.tar.gz
Upload date: Apr 28, 2026
Size: 31.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for typedspark-1.7.0.tar.gz
Algorithm	Hash digest
SHA256	`7e8448095b92059d857df264b3a77f2671d1906d12f252fa7fc8bdd9bcb05833`
MD5	`702ac98fe8a2d9aa5d0c3cce382c0b66`
BLAKE2b-256	`a6c90a0dd53706885af16f96de32a5828292f3023e2887d32aeaa54e7e0cc635`

See more details on using hashes here.

File details

Details for the file typedspark-1.7.0-py3-none-any.whl.

File metadata

Download URL: typedspark-1.7.0-py3-none-any.whl
Upload date: Apr 28, 2026
Size: 39.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for typedspark-1.7.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6c6cc354e812a4fae91676bdd4ba8e2142e274d70c8a983b2ec67378fd0e12bc`
MD5	`c116e5a88ab159a0d4849beb4a645688`
BLAKE2b-256	`f2b3b9d7487f3b27b49a28904b80ec007a689fc68ceb6ba39b4dc6a21521ed85`

See more details on using hashes here.
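The published digests can be used to check a downloaded file before installing it. Below is a minimal standard-library sketch: the SHA256 values are copied from the tables above, while `sha256_of` and `matches_published_hash` are hypothetical helper names and the file is assumed to have been downloaded under its original name.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the hash tables above, keyed by file name.
EXPECTED_SHA256 = {
    "typedspark-1.7.0.tar.gz": (
        "7e8448095b92059d857df264b3a77f2671d1906d12f252fa7fc8bdd9bcb05833"
    ),
    "typedspark-1.7.0-py3-none-any.whl": (
        "6c6cc354e812a4fae91676bdd4ba8e2142e274d70c8a983b2ec67378fd0e12bc"
    ),
}


def sha256_of(path: Path) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path) -> bool:
    """Return True only if the file name is known and its digest matches."""
    expected = EXPECTED_SHA256.get(path.name)
    return expected is not None and sha256_of(path) == expected
```

For example, `matches_published_hash(Path("typedspark-1.7.0.tar.gz"))` should return `True` for an intact download of the source distribution and `False` for a corrupted or unrecognised file.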
