udata-analysis-service

This service analyses files from the udata datalake to enrich their metadata, starting with CSV files. It uses csv-detective to detect the type and format of each CSV column by checking both the headers and the contents.
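
To give a rough idea of what detecting column types from headers and contents involves, here is a toy sketch using only the Python standard library. It is not csv-detective's implementation or API: the function names (guess_type, detect_columns), the three type labels and the header hint rule are all assumptions made for illustration. The max_rows default mirrors the ROWS_TO_ANALYSE_PER_FILE value from the sample configuration; that the service uses the setting this way is assumed from its name.

```python
import csv
import io
from itertools import islice


def guess_type(values):
    """Return a coarse type label for a list of non-empty string cells."""
    def all_parse(cast):
        try:
            for v in values:
                cast(v)
            return True
        except ValueError:
            return False

    if not values:
        return "string"
    if all_parse(int):
        return "int"
    if all_parse(float):
        return "float"
    return "string"


def detect_columns(csv_text, max_rows=500):
    """Guess a type per column from the first `max_rows` data rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(islice(reader, max_rows))
    report = {}
    for name in reader.fieldnames or []:
        # Ignore empty or missing cells when guessing the content type.
        cells = [row[name] for row in rows if row[name]]
        report[name] = {
            "type": guess_type(cells),
            # Hypothetical header check: flag names that suggest a date.
            "header_hint": "date" if "date" in name.lower() else None,
        }
    return report


sample = "id,price,created_date\n1,9.5,2021-01-01\n2,12,2021-02-03\n"
report = detect_columns(sample)
```

A real detector distinguishes many more formats (dates, postal codes, identifiers and so on); the point here is only that content-based and header-based signals are collected per column.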

Installation

Install udata-analysis-service:

pip install udata-analysis-service

Rename the .env.sample file to .env and fill in the values for your environment:

REDIS_URL=redis://localhost:6381/0
REDIS_HOST=localhost
REDIS_PORT=6381
KAFKA_HOST=localhost
KAFKA_PORT=9092
KAFKA_API_VERSION=2.5.0
MINIO_URL=https://object.local.dev/
MINIO_USER=sample_user
MINIO_PWD=sample_pwd
ROWS_TO_ANALYSE_PER_FILE=500
CSV_DETECTIVE_REPORT_BUCKET=benchmark-de
CSV_DETECTIVE_REPORT_FOLDER=report
TABLESCHEMA_BUCKET=benchmark-de
TABLESCHEMA_FOLDER=schemas
UDATA_INSTANCE_NAME=udata
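
Before starting the service, it can help to check that the .env file parses and that related values agree with each other. The sketch below is a standard-library parser for the simple KEY=value layout shown above, plus a hypothetical check_redis helper. Both function names are illustrative assumptions; how the service itself loads its configuration is not described here.

```python
from urllib.parse import urlparse


def parse_env(text):
    """Parse simple KEY=value lines, tolerating spaces around '='."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def check_redis(settings):
    """Return True if REDIS_URL agrees with REDIS_HOST and REDIS_PORT."""
    url = urlparse(settings["REDIS_URL"])
    return (
        url.hostname == settings["REDIS_HOST"]
        and url.port == int(settings["REDIS_PORT"])
    )


sample_env = """
REDIS_URL = redis://localhost:6381/0
REDIS_HOST = localhost
REDIS_PORT = 6381
ROWS_TO_ANALYSE_PER_FILE=500
"""
settings = parse_env(sample_env)
```

Calling check_redis(settings) on the sample values confirms that the URL, host and port describe the same Redis instance, which is an easy mismatch to introduce when editing the three settings separately.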

Usage

Start the Kafka consumer:

udata-analysis-service consume

Start the Celery worker:

udata-analysis-service work
