udata-analysis-service

This service analyses files in the udata datalake to enrich their metadata, starting with CSVs. It uses csv-detective to detect the type and format of each CSV column by checking both the headers and the contents.
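To make the idea of checking both headers and contents concrete, here is a minimal sketch using only the standard library. It is not csv-detective's algorithm or API: the function `guess_column_types`, the three content checks, and the `HEADER_HINTS` table are all hypothetical and only illustrate the general approach of sampling the first rows and combining a content-based guess with a header-based hint.

```python
import csv
import io
from datetime import datetime


def _parses(converter, value: str) -> bool:
    """Return True if `converter(value)` does not raise ValueError."""
    try:
        converter(value)
        return True
    except ValueError:
        return False


# Content checks, ordered from most to least specific ("string" is the fallback).
CONTENT_CHECKS = [
    ("int", lambda v: _parses(int, v)),
    ("float", lambda v: _parses(float, v)),
    ("date", lambda v: _parses(lambda s: datetime.strptime(s, "%Y-%m-%d"), v)),
]

# Hypothetical header hints: a header containing the keyword suggests a type.
HEADER_HINTS = {"date": "date", "price": "float"}


def guess_column_types(csv_text: str, max_rows: int = 500) -> dict:
    """Guess a type per column from the header and the first `max_rows` rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = {name: [] for name in (reader.fieldnames or [])}
    for i, row in enumerate(reader):
        if i >= max_rows:
            break
        for name in columns:
            value = row.get(name)
            if value:  # ignore empty or missing cells
                columns[name].append(value)

    result = {}
    for name, values in columns.items():
        content_type = "string"
        for type_name, check in CONTENT_CHECKS:
            if values and all(check(v) for v in values):
                content_type = type_name
                break
        header_hint = next(
            (t for key, t in HEADER_HINTS.items() if key in name.lower()), None
        )
        result[name] = {"content": content_type, "header_hint": header_hint}
    return result


sample = (
    "id,price,created_date,label\n"
    "1,2.5,2021-01-02,foo\n"
    "2,3,2021-02-03,bar\n"
)
types = guess_column_types(sample)
```

The `max_rows` parameter plays the same role as a row limit such as ROWS_TO_ANALYSE_PER_FILE: only a sample of the file is inspected, which keeps the analysis cheap on large files at the cost of missing anomalies further down.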

Installation

Install udata-analysis-service:

pip install udata-analysis-service

Rename .env.sample to .env and set the values for your environment:

REDIS_URL=redis://localhost:6381/0
REDIS_HOST=localhost
REDIS_PORT=6381
KAFKA_HOST=localhost
KAFKA_PORT=9092
KAFKA_API_VERSION=2.5.0
MINIO_URL=https://object.local.dev/
MINIO_USER=sample_user
MINIO_PWD=sample_pwd
ROWS_TO_ANALYSE_PER_FILE=500
CSV_DETECTIVE_REPORT_BUCKET=benchmark-de
CSV_DETECTIVE_REPORT_FOLDER=report
TABLESCHEMA_BUCKET=benchmark-de
TABLESCHEMA_FOLDER=schemas
UDATA_INSTANCE_NAME=udata
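As an illustration of how values like these might be validated and typed once they are in the environment, here is a minimal sketch. It does not show how the service itself reads its configuration, which the source does not describe: the `Settings` dataclass, the `load_settings` function, and the choice of required keys are assumptions. Splitting KAFKA_API_VERSION into a tuple of integers reflects the form that clients such as kafka-python accept for their `api_version` argument.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple

# Keys this sketch treats as required (an assumption, not the service's rule).
REQUIRED_KEYS = (
    "REDIS_URL",
    "KAFKA_HOST",
    "KAFKA_PORT",
    "KAFKA_API_VERSION",
    "MINIO_URL",
    "ROWS_TO_ANALYSE_PER_FILE",
)


@dataclass(frozen=True)
class Settings:
    redis_url: str
    kafka_bootstrap_server: str
    kafka_api_version: Tuple[int, ...]
    minio_url: str
    rows_to_analyse_per_file: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build typed settings from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise ValueError("Missing settings: " + ", ".join(missing))
    return Settings(
        redis_url=env["REDIS_URL"],
        # int() rejects a non-numeric port early instead of failing at connect time.
        kafka_bootstrap_server=f"{env['KAFKA_HOST']}:{int(env['KAFKA_PORT'])}",
        # "2.5.0" becomes (2, 5, 0).
        kafka_api_version=tuple(int(p) for p in env["KAFKA_API_VERSION"].split(".")),
        minio_url=env["MINIO_URL"],
        rows_to_analyse_per_file=int(env["ROWS_TO_ANALYSE_PER_FILE"]),
    )


sample_env = {
    "REDIS_URL": "redis://localhost:6381/0",
    "KAFKA_HOST": "localhost",
    "KAFKA_PORT": "9092",
    "KAFKA_API_VERSION": "2.5.0",
    "MINIO_URL": "https://object.local.dev/",
    "ROWS_TO_ANALYSE_PER_FILE": "500",
}
settings = load_settings(sample_env)
```

Failing fast on a missing or malformed value gives a clearer error at startup than a connection failure later in the consumer or the worker.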

Usage

Start the Kafka consumer:

udata-analysis-service consume

Start the Celery worker:

udata-analysis-service work
