Custom Spark data sources for reading and writing data in Apache Spark, using the Python Data Source API
pyspark-data-sources
This repository showcases custom Spark data sources built using the new Python Data Source API for the upcoming Apache Spark 4.0 release. For an in-depth understanding of the API, please refer to the API source code.
Installation
pip install pyspark-data-sources
Usage
Note: Currently the following code only works with the Apache Spark master branch.
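from pyspark.sql import SparkSession
from pyspark_datasources.github import GithubDataSource

# Create or reuse a Spark session
spark = SparkSession.builder.getOrCreate()
# Register the data source
spark.dataSource.register(GithubDataSource)
spark.read.format("github").load("apache/spark").show()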
Contributing
We welcome and appreciate any contributions to enhance and expand the custom data sources. If you're interested in contributing:
- Add New Data Sources: Want to add a new data source using the Python Data Source API? Submit a pull request or open an issue.
- Suggest Enhancements: If you have ideas to improve a data source or the API, we'd love to hear them!
- Report Bugs: Found something that doesn't work as expected? Let us know by opening an issue.
Need help or have questions? Don't hesitate to open a new issue, and we'll do our best to assist you.
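For contributors considering a new data source, the following is a minimal sketch of what one might look like with the Python Data Source API. The `CounterDataSource` and `CounterReader` names, the `count` option, and the schema are hypothetical and not part of this package. The `try`/`except` fallback is also an assumption added only so the sketch can be imported and unit-tested on machines without Spark 4.0.

```python
from typing import Iterator, Tuple

try:
    # Provided by the Python Data Source API in Apache Spark 4.0.
    from pyspark.sql.datasource import DataSource, DataSourceReader
except ImportError:
    # Assumption: fall back to plain base classes when Spark 4.0 is unavailable,
    # so the reader logic below can still be exercised on its own.
    DataSource = DataSourceReader = object


class CounterReader(DataSourceReader):
    """Hypothetical reader that emits the integers 0..count-1 with their squares."""

    def __init__(self, options: dict):
        # Option values arrive as strings, so convert explicitly.
        self.count = int(options.get("count", "5"))

    def read(self, partition) -> Iterator[Tuple[int, int]]:
        # Each yielded tuple is one row matching the schema declared below.
        for i in range(self.count):
            yield (i, i * i)


class CounterDataSource(DataSource):
    """Hypothetical data source, addressed by the short name "counter"."""

    @classmethod
    def name(cls) -> str:
        # The string passed to spark.read.format(...).
        return "counter"

    def schema(self) -> str:
        # DDL-formatted schema of the rows produced by the reader.
        return "id INT, squared INT"

    def reader(self, schema) -> CounterReader:
        # self.options is populated by Spark from .option(...) calls.
        return CounterReader(self.options)
```

On a Spark build that includes the API, such a class would presumably be registered the same way as `GithubDataSource`, with `spark.dataSource.register(CounterDataSource)`, and then read with `spark.read.format("counter").option("count", "3").load()`.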
Download files
Source Distribution
Hashes for pyspark_data_sources-0.1.2.tar.gz
Algorithm | Hash digest |
---|---|
SHA256 | 6aa8528e561513c8a04936072e3787d81a6287ef6d000dcb54ef246d4b0ec7b8 |
MD5 | b0a642f1626bdc5e7b74ea303c242fc8 |
BLAKE2b-256 | bfe91e2cdb1357a5005f55de05bbd0eb9610f6efc82ce5ad7286002b33384fc6 |
Built Distribution
Hashes for pyspark_data_sources-0.1.2-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 15e1f85b45ccb1536359937bb6e414dd772220daa004ab20e25c5af5d9af855b |
MD5 | cf9c9b894c8af2bf04ddc461ee86f1cf |
BLAKE2b-256 | 9b665a49ea8abfbf905d0df7860ad387af614f9ddb05e2e918f65ec1f004a15b |
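The published digests can be used to check a downloaded file before installing it. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of` and `matches_published_digest` are hypothetical and not part of this package.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        # iter(callable, sentinel) keeps reading until f.read returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex value."""
    # Normalize whitespace and case so copied values compare cleanly.
    return sha256_of(path) == expected_hex.strip().lower()
```

To use it, pass the path of the downloaded archive together with the SHA256 value listed for that exact file name; a `False` result suggests the download is corrupted or is a different file.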