Pandas and AWS interoperability for data science.
Red Panda 🐼😊
=============
Data science on AWS without frustration.
Features
--------
- Move DataFrames and files to and from S3 and Redshift.
- Run queries on Redshift in Python.
- Manage files on S3.
Installation
------------
.. code-block:: console

   $ pip install red-panda

Examples
--------
TODO
----
In no particular order:
- Improve tests and docs.
- Handle DataFrames whose index is a meaningful column. Currently the index is dropped automatically.
- Better ways of inferring Redshift column types from DataFrame dtypes.
- Take advantage of Redshift slices for parallel processing. Split files for COPY.
- Explore using the S3 Transfer Manager's ``upload_fileobj`` for ``df_to_s3`` to take advantage of automatic multipart upload.
- More options for data consistency management. Currently the file is deleted from S3 after COPY.
- Add encryption options for files uploaded to S3.
- Add COPY from S3 manifest file, in addition to COPY from S3 source path.
- Support more data formats.
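
As a rough illustration of the manifest item above, the following sketch builds a Redshift COPY manifest and a matching COPY statement using only the standard library. It is not part of red-panda: the function names ``build_copy_manifest`` and ``build_copy_statement``, the bucket, the table, and the IAM role are all hypothetical placeholders. The manifest layout (an ``entries`` list of ``url``/``mandatory`` objects) follows the format Redshift documents for COPY.

```python
import json


def build_copy_manifest(s3_urls, mandatory=True):
    """Return a Redshift COPY manifest as a JSON string.

    Hypothetical helper, not a red-panda API. Each entry names one S3
    object; mandatory=True makes COPY fail if that object is missing.
    """
    entries = [{"url": url, "mandatory": mandatory} for url in s3_urls]
    return json.dumps({"entries": entries}, indent=2)


def build_copy_statement(table, manifest_url, iam_role):
    """Return a COPY statement that reads the files listed in a manifest.

    Hypothetical helper. The MANIFEST keyword tells COPY that the FROM
    path points at a manifest file rather than a data prefix.
    """
    return (
        f"COPY {table} FROM '{manifest_url}' "
        f"IAM_ROLE '{iam_role}' FORMAT AS CSV MANIFEST"
    )


# Placeholder object keys, for example one file per Redshift slice.
manifest = build_copy_manifest(
    [
        "s3://example-bucket/data/part-0.csv",
        "s3://example-bucket/data/part-1.csv",
    ]
)
statement = build_copy_statement(
    "example_table",
    "s3://example-bucket/data/load.manifest",
    "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
```

A manifest pairs naturally with the split-files item: once a DataFrame is written as several objects, listing them explicitly lets COPY load exactly those files in parallel instead of everything under a shared prefix.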