A Python client library for interacting with the scidx POP and creating streams.
scidx Streaming
A Python library for managing streaming data using the sciDX platform and a Point of Presence. This library provides easy-to-use methods for creating, consuming, and managing Kafka streams and related resources.
Installation
Ensure you have Python 3.7 or higher installed. Using a virtual environment is recommended.
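If you want to confirm the interpreter version programmatically before installing, a minimal sketch follows. The helper name `python_is_supported` is hypothetical and not part of this library; the minimum of 3.7 is taken from the requirement stated above.

```python
import sys

MIN_PYTHON = (3, 7)  # minimum version stated in this README


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


# Stop early with a readable message on an unsupported interpreter.
if not python_is_supported():
    raise SystemExit(
        f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, "
        f"found {sys.version_info.major}.{sys.version_info.minor}"
    )
```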
Option 1: Install from GitHub
- Clone the repository:
git clone https://github.com/sci-ndp/streaming-py.git
cd streaming-py
- Create and activate a virtual environment:
python3 -m venv .venv
source .venv/bin/activate
- Install the package in editable mode:
pip install -e .
- Install development dependencies (optional, for testing):
pip install -r requirements.txt
Option 2: Install via pip
The package is published on PyPI, so you can install it directly using pip:
pip install scidx-streaming
Tutorial
For a step-by-step guide on how to use the streaming library, check out our comprehensive tutorial: 10 Minutes for Streaming POP Data.
Running Tests
To run the tests, navigate to the project root and execute:
pytest
Usage examples
Below is an example showcasing how to set up the library, register a data object, create a filtered stream, and consume its data.
1. Set up the POP and Streaming libraries
Initialize the APIClient, then use it to initialize the StreamingClient:
from streaming import StreamingClient
from ndp_ep import APIClient
# Replace these placeholders with your own endpoint and credentials
API_URL = "http://your-api-url.com"
USERNAME = "your_username"
PASSWORD = "your_password"
# Create the API client, then pass it to the streaming client
client = APIClient(base_url=API_URL, username=USERNAME, password=PASSWORD)
streaming = StreamingClient(client)
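Hardcoding credentials in a script is easy to leak through version control. As one possible alternative, the sketch below reads the same three settings from environment variables. The function `load_credentials` and the variable names `SCIDX_API_URL`, `SCIDX_USERNAME`, and `SCIDX_PASSWORD` are hypothetical; neither library defines them.

```python
import os


def load_credentials(env=os.environ):
    """Read connection settings from environment variables.

    The variable names below are hypothetical; use whatever names
    suit your deployment.
    """
    names = ("SCIDX_API_URL", "SCIDX_USERNAME", "SCIDX_PASSWORD")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Keys match the keyword arguments used in the APIClient call above.
    return {
        "base_url": env["SCIDX_API_URL"],
        "username": env["SCIDX_USERNAME"],
        "password": env["SCIDX_PASSWORD"],
    }


# Usage, assuming the variables are set and both libraries are installed:
# client = APIClient(**load_credentials())
# streaming = StreamingClient(client)
```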
2. Register a data object
data_object_metadata = {
    "name": "sample_data_object",
    "type": "url",
    "url": "http://example.com/data.csv",
    "description": "Sample data object for streaming demo"
}
client.register_url(data_object_metadata)
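A quick local check of the metadata dict before calling `register_url` can catch typos early. The sketch below is only an illustration: the helper `check_data_object_metadata` is hypothetical, and treating `name`, `type`, and `url` as required is an assumption based on the example above, not a documented rule of the API.

```python
from urllib.parse import urlparse

# Assumed minimum set of keys, taken from the example above.
REQUIRED_KEYS = ("name", "type", "url")


def check_data_object_metadata(metadata):
    """Return a list of problems found in a metadata dict (empty if none)."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if not metadata.get(key)]
    url = metadata.get("url", "")
    if url:
        parsed = urlparse(url)
        # Require an absolute http(s) URL with a host part.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"not an absolute http(s) URL: {url}")
    return problems


# Usage:
# problems = check_data_object_metadata(data_object_metadata)
# if problems:
#     raise ValueError("; ".join(problems))
```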
3. Create a filtered data stream
# Define filters
filters = [
    "column_name > 100",
    "IF column_name < 50 THEN alert = 'low' ELSE alert = 'high'"
]
# Create a Kafka stream with filters
# Note: `await` needs an async context, such as a Jupyter notebook or an async function
stream = await streaming.create_kafka_stream(
    keywords=["sample_data_object"],
    match_all=True,
    filter_semantics=filters
)
print(f"Stream created with topic: {stream.data_stream_id}")
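The library evaluates the filter strings itself, so the following is not its implementation. It is a plain-Python sketch of one plausible reading of the two example rules, assuming they are applied in order: the comparison rule keeps only matching records, and the IF/THEN/ELSE rule adds an `alert` field. The function names `keep_above` and `label_alert` are hypothetical.

```python
def keep_above(records, column, threshold):
    """Mimic a rule such as "column_name > 100": keep matching records only."""
    return [r for r in records if r[column] > threshold]


def label_alert(records, column, threshold):
    """Mimic "IF column_name < 50 THEN alert = 'low' ELSE alert = 'high'"."""
    return [
        {**r, "alert": "low" if r[column] < threshold else "high"}
        for r in records
    ]


records = [{"column_name": 20}, {"column_name": 120}, {"column_name": 300}]

# Apply the two rules in the same order as the `filters` list.
filtered = label_alert(keep_above(records, "column_name", 100), "column_name", 50)
print(filtered)
```

Under this reading, every record that survives the first rule is above 100, so the `'low'` branch of the second rule never fires. If you expect both labels in the output, check how the platform orders and combines filters.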
4. Consuming the filtered data stream
# Consume stream data
consumer = streaming.consume_kafka_messages(stream.data_stream_id)
print(consumer.dataframe.head())
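Messages may take a moment to arrive, so `consumer.dataframe` can still be empty right after the consumer starts. The sketch below polls until enough rows are present or a timeout expires. It assumes that `consumer.dataframe` grows as messages arrive and supports `len()`; the helper `wait_for_rows` is hypothetical and not part of the library.

```python
import time


def wait_for_rows(get_frame, min_rows=1, timeout=30.0, poll_interval=0.5):
    """Poll get_frame() until it has at least min_rows rows or timeout expires.

    get_frame is any zero-argument callable returning an object that
    supports len(), such as a pandas DataFrame. Returns the last frame seen.
    """
    deadline = time.monotonic() + timeout
    frame = get_frame()
    while len(frame) < min_rows and time.monotonic() < deadline:
        time.sleep(poll_interval)
        frame = get_frame()
    return frame


# Usage, assuming `consumer` from the step above:
# df = wait_for_rows(lambda: consumer.dataframe, min_rows=10)
# print(df.head())
```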
5. Cleaning up
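# Stop the consumer
consumer.stop()
# Delete the stream and the data object
await streaming.delete_stream(stream)
# Note: search_results is not defined in the steps above; it is assumed to hold
# the results of a search that returned the registered data object.
client.delete_resource_by_id(search_results[0]["id"])
print("Cleanup completed.")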
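The `await` calls in steps 3 and 5 work as written only where top-level `await` is available, for example in a Jupyter notebook. In a regular script they must run inside a coroutine started with `asyncio.run`. The sketch below shows one way to structure that so the stream is deleted even if the processing step raises. The helper `run_with_cleanup` is hypothetical; it takes the create, use, and delete steps as callables so it does not depend on any particular client.

```python
import asyncio


async def run_with_cleanup(create_stream, use_stream, delete_stream):
    """Create a stream, use it, and always delete it afterwards."""
    stream = await create_stream()
    try:
        return use_stream(stream)
    finally:
        # Runs whether use_stream succeeded or raised.
        await delete_stream(stream)


# Usage, assuming `streaming` and `filters` from the steps above:
# asyncio.run(run_with_cleanup(
#     lambda: streaming.create_kafka_stream(
#         keywords=["sample_data_object"],
#         match_all=True,
#         filter_semantics=filters,
#     ),
#     lambda stream: print(stream.data_stream_id),
#     streaming.delete_stream,
# ))
```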
Contributing
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create a new branch (git checkout -b feature/new-feature)
- Make your changes and commit (git commit -m 'Add new feature')
- Push to the branch (git push origin feature/new-feature)
- Open a Pull Request
Publishing to PyPI
To publish the library to PyPI, follow these steps:
- Ensure setup.py is correctly configured.
- Build the distribution files:
python setup.py sdist bdist_wheel
- Upload to PyPI using twine:
twine upload dist/*
- Verify the package on PyPI by visiting https://pypi.org/ and checking your package listing.
If you need to update the library on PyPI:
- Make your changes and update the version in setup.py.
- Run the above steps to rebuild and upload the new version.
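PyPI does not allow re-uploading a file for a version that already exists, so each release needs a new version string. The sketch below computes the next version for a simple `major.minor.patch` scheme. The helper `bump_version` is hypothetical, and it assumes the project uses plain three-part numeric versions such as 0.1.6.

```python
def bump_version(version, part="patch"):
    """Return the next version string for a 'major.minor.patch' version."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part}")


print(bump_version("0.1.6"))
```

The result still has to be written into setup.py by hand or by your release tooling before rebuilding.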
License
This project is licensed under the MIT License. See LICENSE.md for more details.
Contact
For any questions or suggestions, please open an issue on GitHub.
File details
Details for the file scidx_streaming-0.1.6.tar.gz.
File metadata
- Download URL: scidx_streaming-0.1.6.tar.gz
- Upload date:
- Size: 33.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7aebc7ee357f2bfb8f4a8baf19a3793957df1e39bff7442fa5de63a77e5da604 |
| MD5 | a4bc923333f81a039d6e0b74dd76b35d |
| BLAKE2b-256 | f9ee075165fce34043826a81ed135ac69efe65760448ccbc32242c8d2b66a276 |
File details
Details for the file scidx_streaming-0.1.6-py3-none-any.whl.
File metadata
- Download URL: scidx_streaming-0.1.6-py3-none-any.whl
- Upload date:
- Size: 43.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0c45613b9daf7dd6b1faed6aa89aaaf2c7919f85451ed84a7af09fa2f343f440 |
| MD5 | d98042279a12af7c946cbcd5c3024663 |
| BLAKE2b-256 | d441d5436a308d6e7dd07d1f8aca7935bca8287c6b46229c007c2c4b952c8578 |
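To check a downloaded file against the SHA256 digests listed above, something like the following sketch can be used. It relies only on the standard library; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the file path in the usage comment assumes the archive was saved to the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage, assuming the source distribution was downloaded locally:
# matches_published_hash(
#     "scidx_streaming-0.1.6.tar.gz",
#     "7aebc7ee357f2bfb8f4a8baf19a3793957df1e39bff7442fa5de63a77e5da604",
# )
```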