
A Python client library for interacting with the scidx POP and creating streams.


scidx Streaming

A Python library for managing streaming data using the sciDX platform and a Point of Presence. This library provides easy-to-use methods for creating, consuming, and managing Kafka streams and related resources.


Installation

Ensure you have Python 3.7 or higher installed. Using a virtual environment is recommended.

Option 1: Install from GitHub

  1. Clone the repository:

    git clone https://github.com/sci-ndp/streaming-py.git
    cd streaming-py
    
  2. Create and activate a virtual environment:

    python3 -m venv .venv
    source .venv/bin/activate
    
  3. Install the package in editable mode:

    pip install -e .
    
  4. Install development dependencies (optional, for testing):

    pip install -r requirements.txt
    

Option 2: Install via pip

The package is published on PyPI, so you can install it directly using pip:

pip install scidx-streaming

Tutorial

For a step-by-step guide on how to use the streaming library, check out our comprehensive tutorial: 10 Minutes for Streaming POP Data.

Running Tests

To run the tests, navigate to the project root and execute:

pytest

Usage examples

Below is an example showcasing how to set up the library, register a data object, create a filtered stream, and consume its data.

1. Set up the POP and Streaming libraries

This can be done by initializing the APIClient and using it to initialize the StreamingClient:

from streaming import StreamingClient
from pointofpresence import APIClient

API_URL = "http://your-api-url.com"
USERNAME = "your_username"
PASSWORD = "your_password"

client = APIClient(base_url=API_URL, username=USERNAME, password=PASSWORD)
streaming = StreamingClient(client)
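
Hard-coding credentials as in the snippet above is fine for a demo, but a script that is shared or committed may prefer to read them from the environment. The following is a minimal sketch under that assumption. The variable names SCIDX_API_URL, SCIDX_USERNAME, and SCIDX_PASSWORD and the helper load_credentials are hypothetical and are not defined by the library.

```python
import os

# Hypothetical environment variable names; the library does not define them.
REQUIRED_VARS = ("SCIDX_API_URL", "SCIDX_USERNAME", "SCIDX_PASSWORD")


def load_credentials(env=None):
    """Return APIClient keyword arguments read from the environment.

    Raises KeyError naming every missing or empty variable.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        "base_url": env["SCIDX_API_URL"],
        "username": env["SCIDX_USERNAME"],
        "password": env["SCIDX_PASSWORD"],
    }


# Demo with an explicit mapping so the sketch runs without real credentials.
demo_env = {
    "SCIDX_API_URL": "http://your-api-url.com",
    "SCIDX_USERNAME": "your_username",
    "SCIDX_PASSWORD": "your_password",
}
credentials = load_credentials(demo_env)
print(sorted(credentials))
```

The returned dictionary could then be passed as APIClient(**load_credentials()), matching the keyword arguments used above.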

2. Register a data object

data_object_metadata = {
    "name": "sample_data_object",
    "type": "url",
    "url": "http://example.com/data.csv",
    "description": "Sample data object for streaming demo"
}
client.register_url(data_object_metadata)

3. Create a filtered data stream

# Define filters
filters = [
    "column_name > 100",
    "IF column_name < 50 THEN alert = 'low' ELSE alert = 'high'"
]

# Create a Kafka stream with filters.
# Note: await needs an async context (a notebook cell or an async function).
stream = await streaming.create_kafka_stream(
    keywords=["sample_data_object"],
    match_all=True,
    filter_semantics=filters
)
print(f"Stream created with topic: {stream.data_stream_id}")
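
The snippet above uses await at the top level, which works in a Jupyter notebook or another running event loop but not in a plain Python script. A minimal sketch of wrapping such a call in asyncio.run follows. Here create_stream_stub is a hypothetical stand-in for streaming.create_kafka_stream, so the example runs without a sciDX deployment, and the topic name it returns is made up.

```python
import asyncio
from types import SimpleNamespace


async def create_stream_stub(keywords, match_all, filter_semantics):
    """Hypothetical stand-in for streaming.create_kafka_stream(...)."""
    await asyncio.sleep(0)  # yield to the event loop, as a real network call would
    return SimpleNamespace(data_stream_id="data_stream_demo")


async def main():
    # In real code this line would await streaming.create_kafka_stream(...).
    stream = await create_stream_stub(
        keywords=["sample_data_object"],
        match_all=True,
        filter_semantics=["column_name > 100"],
    )
    return stream.data_stream_id


# asyncio.run starts an event loop, runs main() to completion, and closes the loop.
topic = asyncio.run(main())
print(f"Stream created with topic: {topic}")
```

Keeping all awaited calls inside a single main() coroutine means the script needs only one asyncio.run call.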

4. Consume the filtered data stream

# Consume stream data
consumer = streaming.consume_kafka_messages(stream.data_stream_id)
print(consumer.dataframe.head())

5. Clean up

consumer.stop()
# Delete the stream and the data object
await streaming.delete_stream(stream)
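# NOTE: search_results is not defined in the steps above; it is assumed to hold
# the result of a prior search for the registered data object.
client.delete_resource_by_id(search_results[0]["id"])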
print("Cleanup completed.")
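
If processing the consumed data raises an exception, the cleanup calls above would be skipped and the stream could be left running. One common way to guard against this is a try/finally block. The sketch below uses hypothetical stand-ins (StubConsumer and delete_stream_stub) in place of the real consumer and streaming.delete_stream, so it runs on its own and makes no claim about the library's internals.

```python
import asyncio


class StubConsumer:
    """Hypothetical stand-in for the object returned by consume_kafka_messages."""

    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True


deleted_streams = []


async def delete_stream_stub(stream_id):
    """Hypothetical stand-in for streaming.delete_stream(stream)."""
    await asyncio.sleep(0)
    deleted_streams.append(stream_id)


async def run_with_cleanup(stream_id, work):
    consumer = StubConsumer()
    try:
        return work(consumer)
    finally:
        # Runs whether work() returned normally or raised.
        consumer.stop()
        await delete_stream_stub(stream_id)
        print("Cleanup completed.")


def failing_work(consumer):
    raise RuntimeError("processing failed")


try:
    asyncio.run(run_with_cleanup("data_stream_demo", failing_work))
except RuntimeError as exc:
    error_message = str(exc)
```

The exception still propagates to the caller, but only after the consumer has been stopped and the stream deleted.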

Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a new branch (git checkout -b feature/new-feature)
  3. Make your changes and commit (git commit -m 'Add new feature')
  4. Push to the branch (git push origin feature/new-feature)
  5. Open a Pull Request

Publishing to PyPI

To publish the library to PyPI, follow these steps:

  1. Ensure setup.py is correctly configured.

  2. Build the distribution files:

    python setup.py sdist bdist_wheel

  3. Upload to PyPI using twine:

    twine upload dist/*

  4. Verify the package on PyPI by visiting https://pypi.org/ and checking your package listing.

If you need to update the library on PyPI:

  • Make your changes and update the version in setup.py.
  • Run the above steps to rebuild and upload the new version.

License

This project is licensed under the MIT License. See LICENSE.md for more details.

Contact

For any questions or suggestions, please open an issue on GitHub.

