Custom Spark data sources for reading and writing data in Apache Spark, using the Python Data Source API
pyspark-data-sources
This repository showcases custom Spark data sources built using the new Python Data Source API for the upcoming Apache Spark 4.0 release. For an in-depth understanding of the API, please refer to the API source code.
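To give a feel for what a data source built on this API looks like, here is a minimal sketch. It is not taken from this package: `CounterDataSource`, `CounterReader`, the `counter` short name, and the `count` option are all hypothetical names chosen for illustration. The `try`/`except` fallback is likewise an assumption, added only so the sketch can be imported without a Spark build that ships `pyspark.sql.datasource`.

```python
try:
    from pyspark.sql.datasource import DataSource, DataSourceReader
except ImportError:
    # Assumption: minimal stand-ins that mimic only the constructor, so the
    # sketch can be read and exercised without Spark installed.
    class DataSource:
        def __init__(self, options):
            self.options = options

    class DataSourceReader:
        pass


class CounterReader(DataSourceReader):
    """Hypothetical reader that yields (id, squared) rows."""

    def __init__(self, options):
        # Options set with .option(...) arrive as strings, hence int().
        self.count = int(options.get("count", 3))

    def read(self, partition):
        # Each yielded tuple is one row matching the schema declared below.
        for i in range(self.count):
            yield (i, i * i)


class CounterDataSource(DataSource):
    """Hypothetical data source with the short name 'counter'."""

    @classmethod
    def name(cls):
        return "counter"

    def schema(self):
        # Schema given as a DDL string.
        return "id int, squared int"

    def reader(self, schema):
        return CounterReader(self.options)
```

On a Spark build that includes the API, such a class would be registered with `spark.dataSource.register(CounterDataSource)` and read with `spark.read.format("counter").option("count", 5).load()`.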
Installation
pip install pyspark-data-sources
Usage
Note: Currently the following code only works with the Apache Spark master branch.
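from pyspark.sql import SparkSession
from pyspark_datasources.github import GithubDataSource
# Create (or reuse) a Spark session
spark = SparkSession.builder.getOrCreate()
# Register the data source
spark.dataSource.register(GithubDataSource)
# Read from the apache/spark repository and display the result
spark.read.format("github").load("apache/spark").show()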
Contributing
We welcome and appreciate any contributions to enhance and expand the custom data sources. If you're interested in contributing:
- Add New Data Sources: Want to add a new data source using the Python Data Source API? Submit a pull request or open an issue.
- Suggest Enhancements: If you have ideas to improve a data source or the API, we'd love to hear them!
- Report Bugs: Found something that doesn't work as expected? Let us know by opening an issue.
Need help or have questions? Don't hesitate to open a new issue, and we'll do our best to assist you.
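If you are considering a data source that writes data, the following sketch may serve as a starting point. It is an illustration under stated assumptions, not code from this repository: `CountingDataSource`, `CountingWriter`, `RowCountMessage`, and the `counting` short name are hypothetical, and the writer only counts rows rather than persisting them. The import fallback is an assumption as well, there only so the sketch can run without Spark installed.

```python
from dataclasses import dataclass

try:
    from pyspark.sql.datasource import (
        DataSource,
        DataSourceWriter,
        WriterCommitMessage,
    )
except ImportError:
    # Assumption: minimal stand-ins so the sketch runs without Spark.
    class DataSource:
        def __init__(self, options):
            self.options = options

    class DataSourceWriter:
        pass

    class WriterCommitMessage:
        pass


@dataclass
class RowCountMessage(WriterCommitMessage):
    """Hypothetical commit message: number of rows one task wrote."""
    count: int


class CountingWriter(DataSourceWriter):
    """Hypothetical writer that counts rows instead of storing them."""

    def write(self, iterator):
        # Called once per partition; returns a message for the driver.
        return RowCountMessage(count=sum(1 for _ in iterator))

    def commit(self, messages):
        # Called on the driver after all write tasks have succeeded.
        total = sum(m.count for m in messages if m is not None)
        print(f"committed {total} rows")

    def abort(self, messages):
        # Called on the driver if any write task failed.
        print("write aborted")


class CountingDataSource(DataSource):
    """Hypothetical data source with the short name 'counting'."""

    @classmethod
    def name(cls):
        return "counting"

    def writer(self, schema, overwrite):
        return CountingWriter()
```

Once registered, a data source like this would typically be used with `df.write.format("counting").mode("append").save()`.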