
A Python client library for interacting with the scidx POP and creating streams.


scidx Streaming

A Python library for managing streaming data using the sciDX platform and a Point of Presence. This library provides easy-to-use methods for creating, consuming, and managing Kafka streams and related resources.

Installation

Ensure you have Python 3.7 or higher installed. Using a virtual environment is recommended.

Option 1: Install from GitHub

  1. Clone the repository:

    git clone https://github.com/sci-ndp/streaming-py.git
    cd streaming-py
    
  2. Create and activate a virtual environment:

    python3 -m venv .venv
    source .venv/bin/activate
    
  3. Install the package in editable mode:

    pip install -e .
    
  4. Install development dependencies (optional, for testing):

    pip install -r requirements.txt
    

Option 2: Install via pip

The package is published on PyPI as scidx-streaming, so you can install it directly using pip:

pip install scidx-streaming

Tutorial

For a step-by-step guide on how to use the streaming library, check out our comprehensive tutorial: 10 Minutes for Streaming POP Data.

Running Tests

To run the tests, navigate to the project root and execute:

pytest

Usage examples

Below is an example showcasing how to set up the library, register a data object, create a filtered stream, and consume its data.

1. Set up the POP and Streaming libraries

Initialize the APIClient, then use it to initialize the StreamingClient:

from streaming import StreamingClient
from pointofpresence import APIClient

API_URL = "http://your-api-url.com"
USERNAME = "your_username"
PASSWORD = "your_password"

client = APIClient(base_url=API_URL, username=USERNAME, password=PASSWORD)
streaming = StreamingClient(client)
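
To keep credentials out of source files, the connection settings can be read from environment variables before the clients are built. This is a minimal sketch, not part of the library: the helper `load_settings` and the variable names `SCIDX_API_URL`, `SCIDX_USERNAME`, and `SCIDX_PASSWORD` are assumptions chosen for illustration.

```python
import os


def load_settings(env=None):
    """Return (api_url, username, password) read from environment variables.

    The variable names are hypothetical; adapt them to your deployment.
    """
    env = os.environ if env is None else env
    names = ("SCIDX_API_URL", "SCIDX_USERNAME", "SCIDX_PASSWORD")
    # Collect every missing or empty variable so the error lists them all at once.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in names)
```

The three returned values can then be passed to APIClient as base_url, username, and password, as in the snippet above.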

2. Register a data object

data_object_metadata = {
    "name": "sample_data_object",
    "type": "url",
    "url": "http://example.com/data.csv",
    "description": "Sample data object for streaming demo"
}
client.register_url(data_object_metadata)

3. Create a filtered data stream

# Define filters
filters = [
    "column_name > 100",
    "IF column_name < 50 THEN alert = 'low' ELSE alert = 'high'"
]

# Create a Kafka stream with filters.
# The call is awaited, so run it inside an async function or a notebook cell.
stream = await streaming.create_kafka_stream(
    keywords=["sample_data_object"],
    match_all=True,
    filter_semantics=filters
)
print(f"Stream created with topic: {stream.data_stream_id}")
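
The `await` calls in this step and in the cleanup step only work inside a coroutine or in an environment with a running event loop, such as a Jupyter notebook. In a plain script, one option is to wrap them in an `async` function and run it with `asyncio.run`. The sketch below shows the pattern with a stand-in `FakeStreamingClient`, which is hypothetical and only mimics the awaitable `create_kafka_stream` call so the example runs without a POP; the topic name it returns is made up.

```python
import asyncio
from types import SimpleNamespace


class FakeStreamingClient:
    """Hypothetical stand-in for StreamingClient, used only for this sketch."""

    async def create_kafka_stream(self, keywords, match_all, filter_semantics):
        await asyncio.sleep(0)  # yield to the event loop, as a network call would
        return SimpleNamespace(data_stream_id="data_stream_demo")


async def main(streaming):
    # Same call shape as in the example above, now inside a coroutine.
    stream = await streaming.create_kafka_stream(
        keywords=["sample_data_object"],
        match_all=True,
        filter_semantics=["column_name > 100"],
    )
    return stream.data_stream_id


topic = asyncio.run(main(FakeStreamingClient()))
print(f"Stream created with topic: {topic}")
```

With the real library, the StreamingClient instance built in step 1 would be passed to `main` instead of the stand-in.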

4. Consume the filtered data stream

# Consume stream data
consumer = streaming.consume_kafka_messages(stream.data_stream_id)
print(consumer.dataframe.head())
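
Messages may take a moment to arrive, so `consumer.dataframe` can still be empty right after the consumer starts. One simple approach is to poll until rows appear or a timeout expires. The helper `wait_for_rows` below is a hypothetical sketch, not part of the library; it only assumes that the consumer exposes a `dataframe` attribute that supports `len()`, as in the example above.

```python
import time


def wait_for_rows(consumer, min_rows=1, timeout=30.0, interval=0.5):
    """Poll consumer.dataframe until it holds at least min_rows rows.

    Returns True if enough rows arrived before the timeout, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if len(consumer.dataframe) >= min_rows:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For example, calling `wait_for_rows(consumer, timeout=60)` before printing `consumer.dataframe.head()` avoids inspecting an empty frame.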

5. Clean up

consumer.stop()
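# Delete the stream and the data object
await streaming.delete_stream(stream)
# NOTE: search_results is not defined in the steps above; it is assumed to hold
# the result of a prior search for the registered data object.
client.delete_resource_by_id(search_results[0]["id"])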
print("Cleanup completed.")
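
If processing fails midway, the consumer and the stream would otherwise be left running. One way to guard against that is to put the cleanup calls in a `finally` block. The sketch below uses hypothetical stand-ins (`FakeConsumer`, `FakeStreamingClient`) that only mimic the `stop` and `delete_stream` calls shown above; the helper `process_then_cleanup` is likewise an illustration, not part of the library.

```python
import asyncio


class FakeConsumer:
    """Hypothetical stand-in for the object returned by consume_kafka_messages."""

    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True


class FakeStreamingClient:
    """Hypothetical stand-in that records which streams were deleted."""

    def __init__(self):
        self.deleted = []

    async def delete_stream(self, stream):
        self.deleted.append(stream)


async def process_then_cleanup(streaming, stream, consumer, work):
    """Run work(consumer), then always stop the consumer and delete the stream."""
    try:
        return work(consumer)
    finally:
        # Runs whether work() returned normally or raised.
        consumer.stop()
        await streaming.delete_stream(stream)
```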

Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a new branch (git checkout -b feature/new-feature)
  3. Make your changes and commit (git commit -m 'Add new feature')
  4. Push to the branch (git push origin feature/new-feature)
  5. Open a Pull Request

Publishing to PyPI

To publish the library to PyPI, follow these steps:

  1. Ensure setup.py is correctly configured.

  2. Build the distribution files:

    python setup.py sdist bdist_wheel

  3. Upload to PyPI using twine:

    twine upload dist/*

  4. Verify the package on PyPI by visiting https://pypi.org/ and checking your package listing.

If you need to update the library on PyPI:

  • Make your changes and update the version in setup.py.
  • Run the above steps to rebuild and upload the new version.

License

This project is licensed under the MIT License. See LICENSE.md for more details.

Contact

For any questions or suggestions, please open an issue on GitHub.

