A simple Spark Streaming handler.

Project description

Spark Streaming Package

Package: SparkStream-pypi

What is it?

It is a handler that processes streaming text data from a Kafka topic and stores the results in Cassandra and Redis.

How does it work?

The stream is processed in the following steps:

  1. Read data from a Kafka topic
  2. Parse the data into a Spark DataFrame with a schema
  3. Clean the data: remove unwanted characters, fix abbreviations, remove stop-words, and remove empty fields
  4. Save the data into Cassandra and Redis
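
The parsing and cleaning steps (2 and 3) can be illustrated with a minimal plain-Python sketch. This is not the package's implementation, which runs on Spark. The JSON message format, the `SCHEMA` fields, the `ABBREVIATIONS` and `STOP_WORDS` tables, and all function names are assumptions made up for illustration. "Remove empty fields" is interpreted here as dropping records in which a field is empty after cleaning.

```python
import json
import re

# Hypothetical lookup tables; the package's real lists are not documented here.
ABBREVIATIONS = {"u": "you", "pls": "please", "thx": "thanks"}
STOP_WORDS = {"a", "an", "the", "is", "to", "and", "of"}

# Assumed schema: field name -> type used to cast the incoming value.
SCHEMA = {"id": str, "text": str}


def parse_message(raw: bytes) -> dict:
    """Step 2: decode a message value (assumed to be JSON) and apply the schema."""
    payload = json.loads(raw.decode("utf-8"))
    return {name: cast(payload.get(name, "")) for name, cast in SCHEMA.items()}


def clean_text(text: str) -> str:
    """Step 3: strip unwanted characters, expand abbreviations, drop stop-words."""
    # Lowercase, then replace anything that is not a letter, digit, or space.
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    words = [ABBREVIATIONS.get(word, word) for word in text.split()]
    return " ".join(word for word in words if word not in STOP_WORDS)


def clean_record(record: dict):
    """Return the cleaned record, or None if any field ends up empty."""
    cleaned = dict(record, text=clean_text(record["text"]))
    if any(value == "" for value in cleaned.values()):
        return None
    return cleaned


def process(messages):
    """Run steps 2 and 3 over a batch of raw messages; the sinks (step 4) are left out."""
    records = (clean_record(parse_message(raw)) for raw in messages)
    return [record for record in records if record is not None]
```

For example, calling `process` on the two messages `b'{"id": "1", "text": "Thx, the stream is GREAT!!!"}'` and `b'{"id": "2", "text": "the ... and"}'` keeps only the first record, with its text cleaned to `thanks stream great`. The second record is dropped because nothing remains once punctuation and stop-words are removed.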

How do I use it?

Use it through its API: SparkStream-API (GitHub)

Dependency

The package requires the following dependency:

It is needed so that the package can write data into Redis.


Download files

Source Distribution

SparkStream-1.3.0.tar.gz (1.2 MB)

Uploaded Source

Built Distribution

SparkStream-1.3.0-py3-none-any.whl (8.0 kB)

Uploaded Python 3

File details

Details for the file SparkStream-1.3.0.tar.gz.

File metadata

  • Download URL: SparkStream-1.3.0.tar.gz
  • Upload date:
  • Size: 1.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.12

File hashes

Hashes for SparkStream-1.3.0.tar.gz
  • SHA256: 21f469564de160ada453493268819a8eb7657703cc8b5db4f767a4f10cc3f509
  • MD5: 681887fa137c0c73078ed5e2aaf5c225
  • BLAKE2b-256: b00d85c90186cfa7cb1e011e4c33ad4db6ae90903b7dbcca9670b8b2bfe77d1c

File details

Details for the file SparkStream-1.3.0-py3-none-any.whl.

File metadata

  • Download URL: SparkStream-1.3.0-py3-none-any.whl
  • Upload date:
  • Size: 8.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.12

File hashes

Hashes for SparkStream-1.3.0-py3-none-any.whl
  • SHA256: 2ec79c6457dfb826534a17d3e5ecf511a8fdfe97b8b02614c95dc15b7e9de573
  • MD5: 82ef110612c62b8072b90079d7962c14
  • BLAKE2b-256: 824d87d7466c45922d4122d8227dc3829d65ba35b60ecfb4e65b2d4696ddb557
