A Python wrapper for querying AWS RDS with mysql-connector-python, returning Pandas or Spark DataFrames.
Project description
rdsclient is a Python package that simplifies interaction with AWS RDS instances using the mysql-connector-python package. It executes SQL queries and returns the results as Pandas or Spark DataFrames.
Features
- Singleton RDS connection: Ensures a single connection to the RDS instance.
- Query execution: Execute SELECT, INSERT, UPDATE, DELETE queries on your RDS database.
- Return data as Pandas DataFrame: Fetch query results directly as Pandas DataFrames for easy data manipulation.
- Spark DataFrame support: Optionally convert query results to Spark DataFrames for distributed computing.
- Context-managed MySQL connection: Automatically manage the connection lifecycle with a Python context manager.
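The singleton and context-manager features above can be illustrated with a minimal sketch. This is not rdsclient's actual implementation: the class name `SingletonConnection` and its `cursor()` helper are hypothetical, and the standard-library `sqlite3` module stands in for a MySQL connection so the example runs without a database server.

```python
import sqlite3
from contextlib import contextmanager


class SingletonConnection:
    """Hypothetical stand-in: one shared connection per process."""

    _instance = None

    def __new__(cls):
        # Open the connection only on the first instantiation;
        # every later call returns the same object.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._conn = sqlite3.connect(":memory:")
        return cls._instance

    @contextmanager
    def cursor(self):
        # Commit on success, roll back on error, always close the cursor.
        cur = self._conn.cursor()
        try:
            yield cur
            self._conn.commit()
        except Exception:
            self._conn.rollback()
            raise
        finally:
            cur.close()


a = SingletonConnection()
b = SingletonConnection()  # same object as `a`

with a.cursor() as cur:
    cur.execute("CREATE TABLE t (x INTEGER)")
    cur.execute("INSERT INTO t VALUES (1)")

# Data written through `a` is visible through `b`: both share one connection.
with b.cursor() as cur:
    cur.execute("SELECT x FROM t")
    rows = cur.fetchall()
```

Because both names refer to the same instance, no second connection is opened, which is the behavior the singleton feature describes.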
Installation
You can install rdsclient directly from PyPI:
pip install rdsclient
Prerequisites
- Python 3.7+
- Dependencies: these are installed automatically when you install rdsclient.
Usage
Simple Query Example
from rdsclient import query_rds
# Define your RDS connection details
host = 'your-rds-host'
user = 'your-username'
password = 'your-password'
database = 'your-database'
# Execute the query and get the result as a Pandas DataFrame
df = query_rds(
    query='SELECT * FROM your_table',
    host=host,
    user=user,
    password=password,
    database=database
)
# Display the result
print(df.head())
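The connection details in the example above are hard-coded for brevity. One possible way to keep them out of source code is to read them from environment variables. The sketch below is an assumption, not part of rdsclient: the helper `rds_params_from_env` and the variable names (`RDS_HOST`, `RDS_USER`, and so on) are hypothetical.

```python
import os


def rds_params_from_env(env=os.environ, prefix="RDS_"):
    """Collect connection details from environment variables.

    The variable names (RDS_HOST, RDS_USER, ...) are hypothetical.
    """
    keys = ("host", "user", "password", "database")
    missing = [prefix + k.upper() for k in keys if prefix + k.upper() not in env]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {k: env[prefix + k.upper()] for k in keys}


# Demonstration with an explicit mapping instead of the real environment.
params = rds_params_from_env({
    "RDS_HOST": "your-rds-host",
    "RDS_USER": "your-username",
    "RDS_PASSWORD": "your-password",
    "RDS_DATABASE": "your-database",
})
```

The resulting dictionary has the same keys as the keyword arguments shown above, so it could be passed on as `query_rds(query='SELECT * FROM your_table', **params)`.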
Using the RDS Class Directly
from rdsclient import RDS
# Create an RDS instance
rds = RDS(
    host='your-rds-host',
    user='your-username',
    password='your-password',
    database='your-database'
)
# Execute a query and fetch results as a Pandas DataFrame
df = rds.execute_query('SELECT * FROM your_table')
# Optionally, get the results as a Spark DataFrame
spark_df = rds.query_to_spark_df('SELECT * FROM your_table')
# Display the Pandas DataFrame
print(df.head())
Executing Non-Select Queries
You can also execute non-SELECT queries (such as INSERT, UPDATE, DELETE):
# Example for executing an UPDATE query
affected_rows = rds.execute_update('UPDATE your_table SET column_name = %s WHERE condition = %s', ('new_value', 'condition_value'))
print(f'{affected_rows} rows affected.')
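To show what a parameterized update and its affected-row count look like at the driver level, here is a small runnable sketch. It uses the standard-library `sqlite3` module as a stand-in, so it is not rdsclient code; the table and column names are made up. Note the placeholder difference: `sqlite3` uses `?`, while mysql-connector-python uses `%s` as in the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (name TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [("old", "active"), ("old", "inactive"), ("old", "active")],
)

# Values are passed separately from the SQL text, so the driver
# handles quoting and the values are never interpolated into the string.
cur = conn.execute(
    "UPDATE your_table SET name = ? WHERE status = ?",
    ("new_value", "active"),
)
conn.commit()

affected_rows = cur.rowcount  # rows modified by the UPDATE
print(f"{affected_rows} rows affected.")
```

Passing values as a tuple rather than formatting them into the query string is what protects against SQL injection, whichever placeholder style the driver uses.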
Contributing
Contributions are welcome! If you find a bug or have a feature request, please feel free to open an issue or submit a pull request.
Steps for Contributing:
- Fork the repository.
- Create a feature branch (git checkout -b feature-name).
- Commit your changes (git commit -am 'Add feature').
- Push to the branch (git push origin feature-name).
- Create a new pull request.
License
This project is licensed under the MIT License - see the LICENSE file for details.
File details
Details for the file rdsclient-0.1.0.tar.gz.
File metadata
- Download URL: rdsclient-0.1.0.tar.gz
- Upload date:
- Size: 3.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | 793e0c2155deae8f4da9922634b8705ea53fef9b5677cead84cd4773e23afc35
MD5 | 6a959fe878a07d4b9a80910f6fdcc684
BLAKE2b-256 | b2a7f5ce3f60607786e02c35f58dd49ec99b00bab5d99f5b1e71f8a8dabc8731
File details
Details for the file rdsclient-0.1.0-py3-none-any.whl.
File metadata
- Download URL: rdsclient-0.1.0-py3-none-any.whl
- Upload date:
- Size: 4.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | e9cf896e5fea980d3c5350cf4f6f8697191881beb771d2ba834a73c02a406350
MD5 | 2b3ba1f71a2258cce87a9d61c6ac2785
BLAKE2b-256 | 00eb13e102c7824260e8b35dbd3ee18867cc5e936f36fa17e459b20ed043fdd8