A Python client library for interacting with the sciDX POP and creating streams.

scidx Streaming

A Python library for managing streaming data using the sciDX platform and a Point of Presence. This library provides easy-to-use methods for creating, consuming, and managing Kafka streams and related resources.

Installation

Ensure you have Python 3.7 or higher installed. Using a virtual environment is recommended.

Option 1: Install from GitHub

  1. Clone the repository:

    git clone https://github.com/sci-ndp/streaming-py.git
    cd streaming-py
    
  2. Create and activate a virtual environment:

    python3 -m venv .venv
    source .venv/bin/activate
    
  3. Install the package in editable mode:

    pip install -e .
    
  4. Install development dependencies (optional, for testing):

    pip install -r requirements.txt
    

Option 2: Install via pip

The package is published on PyPI, so you can install it directly with pip:

pip install scidx-streaming
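
After either install option, a quick import check can confirm that the environment is set up. The sketch below is only an illustration: it assumes the import names `streaming` and `pointofpresence` used in the usage examples, and it does not assume that `pointofpresence` is installed automatically alongside this package.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(module_name) is not None


# Module names taken from the import statements in the usage examples.
for name in ("streaming", "pointofpresence"):
    status = "found" if is_importable(name) else "NOT found"
    print(f"{name}: {status}")
```

If either module is reported as not found, check that the virtual environment is activated before retrying the install.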

Tutorial

For a step-by-step guide on how to use the streaming library, check out our comprehensive tutorial: 10 Minutes for Streaming POP Data.

Running Tests

To run the tests, navigate to the project root and execute:

pytest

Usage examples

Below is an example showcasing how to set up the library, register a data object, create a filtered stream, and consume its data.

1. Set up the POP and Streaming libraries

Initialize the APIClient, then use it to initialize the StreamingClient:

from streaming import StreamingClient
from pointofpresence import APIClient

# Replace these placeholders with your own POP endpoint and credentials
API_URL = "http://your-api-url.com"
USERNAME = "your_username"
PASSWORD = "your_password"

client = APIClient(base_url=API_URL, username=USERNAME, password=PASSWORD)
streaming = StreamingClient(client)
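
The placeholders above are hard-coded for readability. One common alternative is to read them from environment variables, sketched below. The variable names `SCIDX_API_URL`, `SCIDX_USERNAME`, and `SCIDX_PASSWORD` and the helper `load_settings` are hypothetical and not part of the library; only the keyword names `base_url`, `username`, and `password` come from the example above.

```python
import os

# Hypothetical variable names, chosen for this example only.
REQUIRED_VARS = ("SCIDX_API_URL", "SCIDX_USERNAME", "SCIDX_PASSWORD")


def load_settings(env=None):
    """Return base_url/username/password read from an environment mapping."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "base_url": env["SCIDX_API_URL"],
        "username": env["SCIDX_USERNAME"],
        "password": env["SCIDX_PASSWORD"],
    }


# Demonstration with an explicit mapping instead of the real environment.
demo = load_settings({
    "SCIDX_API_URL": "http://your-api-url.com",
    "SCIDX_USERNAME": "your_username",
    "SCIDX_PASSWORD": "your_password",
})
print(sorted(demo))
```

With the real environment populated, the result could be passed on as `APIClient(**load_settings())`.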

2. Register a data object

data_object_metadata = {
    "name": "sample_data_object",
    "type": "url",
    "url": "http://example.com/data.csv",
    "description": "Sample data object for streaming demo"
}
client.register_url(data_object_metadata)

3. Create a filtered data stream

# Define filters
filters = [
    "column_name > 100",
    "IF column_name < 50 THEN alert = 'low' ELSE alert = 'high'"
]

# Create a Kafka stream with filters
stream = await streaming.create_kafka_stream(
    keywords=["sample_data_object"],
    match_all=True,
    filter_semantics=filters
)
print(f"Stream created with topic: {stream.data_stream_id}")
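
To make the intent of the two filter strings concrete, the sketch below applies equivalent logic to a few in-memory rows in plain Python. It only illustrates what the filters appear to express; it is not the library's implementation, and how the library parses, orders, or combines filters is not stated here. The helper names `apply_threshold` and `label_alert` are hypothetical.

```python
rows = [{"column_name": 30}, {"column_name": 120}, {"column_name": 250}]


def apply_threshold(rows, column, minimum):
    """Keep rows whose value is strictly greater than minimum (cf. "column_name > 100")."""
    return [row for row in rows if row[column] > minimum]


def label_alert(rows, column, cutoff):
    """Add alert='low' when the value is below cutoff, otherwise alert='high'."""
    return [
        {**row, "alert": "low" if row[column] < cutoff else "high"}
        for row in rows
    ]


print(apply_threshold(rows, "column_name", 100))
print(label_alert(rows, "column_name", 50))
```

In this illustration, if both steps were applied in sequence, every row surviving the `> 100` threshold would be labeled 'high', since none of them can be below 50.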

4. Consume the filtered data stream

# Consume stream data
consumer = streaming.consume_kafka_messages(stream.data_stream_id)
print(consumer.dataframe.head())
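
If the consumer fills its dataframe in the background (an assumption; the behavior is not documented here), the dataframe may still be empty immediately after the consumer is created. A generic polling helper such as the hypothetical `wait_until` below can wait for the first rows before printing.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.5):
    """Poll predicate() until it returns True or timeout seconds have passed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    # One final check so a result arriving right at the deadline is not missed.
    return bool(predicate())


# Hypothetical usage with the consumer from the example above:
#   if wait_until(lambda: not consumer.dataframe.empty, timeout=30):
#       print(consumer.dataframe.head())
```

The timeout keeps a script from hanging indefinitely when no messages match the filters.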

5. Clean up

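# Stop consuming messages
consumer.stop()
# Delete the stream and the data object
await streaming.delete_stream(stream)
# NOTE: search_results is not defined in the steps above. It is assumed to hold
# the result of a prior search for the registered data object.
client.delete_resource_by_id(search_results[0]["id"])
print("Cleanup completed.")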
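
Steps 3 and 5 use `await`, which works only inside a coroutine or an async-aware shell such as a notebook. The sketch below shows one way to run the create and delete calls from a plain script with `asyncio.run`. The classes `FakeStream` and `FakeStreamingClient` are hypothetical stand-ins so the example runs without a POP; they mimic only the two awaited calls shown above and say nothing about the real client's internals.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeStream:
    """Stand-in for the object returned by create_kafka_stream."""
    data_stream_id: str


class FakeStreamingClient:
    """Hypothetical stub that mimics only the two awaited calls used above."""

    def __init__(self):
        self.deleted = []

    async def create_kafka_stream(self, keywords, match_all=True, filter_semantics=None):
        await asyncio.sleep(0)  # yield control, as a real network call would
        return FakeStream(data_stream_id="data_stream_demo")

    async def delete_stream(self, stream):
        await asyncio.sleep(0)
        self.deleted.append(stream.data_stream_id)


async def run_lifecycle(streaming):
    """Create a stream and always delete it, even if the work in between fails."""
    stream = await streaming.create_kafka_stream(
        keywords=["sample_data_object"],
        match_all=True,
        filter_semantics=["column_name > 100"],
    )
    try:
        return stream.data_stream_id
    finally:
        await streaming.delete_stream(stream)


if __name__ == "__main__":
    topic = asyncio.run(run_lifecycle(FakeStreamingClient()))
    print(f"Stream lifecycle finished for topic: {topic}")
```

Placing the delete call in a `finally` block is a design choice for this sketch: it keeps streams from being left behind when consuming raises an exception.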

Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a new branch (git checkout -b feature/new-feature)
  3. Make your changes and commit (git commit -m 'Add new feature')
  4. Push to the branch (git push origin feature/new-feature)
  5. Open a Pull Request

Publishing to PyPI

To publish the library to PyPI, follow these steps:

  1. Ensure setup.py is correctly configured.

  2. Build the distribution files:

    python setup.py sdist bdist_wheel

  3. Upload to PyPI using twine:

    twine upload dist/*

  4. Verify the package on PyPI: visit https://pypi.org/ and check your package listing.

If you need to update the library on PyPI:

  • Make your changes and update the version in setup.py.
  • Run the above steps to rebuild and upload the new version.

License

This project is licensed under the MIT License. See LICENSE.md for more details.

Contact

For any questions or suggestions, please open an issue on GitHub.
