udata-analysis-service

This service analyses files in the udata datalake to enrich their metadata, starting with CSV files. It uses csv-detective to detect the type and format of each CSV column by inspecting both headers and contents.
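
To make the idea of column type detection concrete, here is a minimal standard-library sketch. It is not csv-detective's algorithm or API: the function names `guess_column_type` and `analyse_csv` are hypothetical, the sketch looks only at cell contents (not headers), and it distinguishes just a few coarse types. The `max_rows` parameter assumes that the analysis can be limited to the first rows of a file, in the spirit of the ROWS_TO_ANALYSE_PER_FILE setting.

```python
import csv
import io


def guess_column_type(values):
    """Return a coarse type label for a list of string cell values."""
    non_empty = [v.strip() for v in values if v.strip()]
    if not non_empty:
        return "empty"
    # Try the strictest type first; fall back to the next one on failure.
    for label, caster in (("int", int), ("float", float)):
        try:
            for v in non_empty:
                caster(v)
            return label
        except ValueError:
            continue
    return "string"


def analyse_csv(text, max_rows=500):
    """Map each column name to a guessed type, reading at most max_rows rows."""
    reader = csv.DictReader(io.StringIO(text))
    columns = {name: [] for name in reader.fieldnames or []}
    for index, row in enumerate(reader):
        if index >= max_rows:
            break
        for name in columns:
            # Short rows yield None for missing cells; treat them as empty.
            columns[name].append(row.get(name) or "")
    return {name: guess_column_type(cells) for name, cells in columns.items()}


sample = "id,price,city\n1,9.5,Paris\n2,12,Lyon\n"
report = analyse_csv(sample)
print(report)  # {'id': 'int', 'price': 'float', 'city': 'string'}
```

A real detector such as csv-detective recognises far more formats than this toy example; the sketch only illustrates the principle of inferring a type from sampled values.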

Installation

Install udata-analysis-service:

pip install udata-analysis-service

Rename .env.sample to .env and fill in the values for your environment:

REDIS_URL = redis://localhost:6381/0
REDIS_HOST = localhost
REDIS_PORT = 6381
KAFKA_HOST = localhost
KAFKA_PORT = 9092
KAFKA_API_VERSION = 2.5.0
MINIO_URL = https://object.local.dev/
MINIO_USER = sample_user
MINIO_PWD = sample_pwd
ROWS_TO_ANALYSE_PER_FILE=500
CSV_DETECTIVE_REPORT_BUCKET = benchmark-de
CSV_DETECTIVE_REPORT_FOLDER = report
TABLESCHEMA_BUCKET = benchmark-de
TABLESCHEMA_FOLDER = schemas
UDATA_INSTANCE_NAME=udata
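
The following sketch shows one way such a file could be read and turned into typed settings. How the service itself loads its configuration is not documented here, so the `parse_env` helper is hypothetical. The tuple form of the Kafka API version is an assumption based on what some clients, for example kafka-python, expect.

```python
def parse_env(text):
    """Parse KEY=value lines, tolerating spaces around '=' and blank lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


sample_env = """
REDIS_URL = redis://localhost:6381/0
KAFKA_HOST = localhost
KAFKA_PORT = 9092
KAFKA_API_VERSION = 2.5.0
ROWS_TO_ANALYSE_PER_FILE=500
"""

settings = parse_env(sample_env)

# Derived values (assumed shapes, not taken from the service's code).
bootstrap_server = f"{settings['KAFKA_HOST']}:{settings['KAFKA_PORT']}"
api_version = tuple(int(part) for part in settings["KAFKA_API_VERSION"].split("."))
rows_to_analyse = int(settings["ROWS_TO_ANALYSE_PER_FILE"])

print(bootstrap_server, api_version, rows_to_analyse)
# localhost:9092 (2, 5, 0) 500
```

Both `KEY = value` and `KEY=value` spellings from the sample file are accepted by this parser.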

Usage

Start the Kafka consumer:

udata-analysis-service consume

Start the Celery worker:

udata-analysis-service work

Logging & Debugging

Set the log level with the LOGLEVEL environment variable. For example, to consume Kafka messages with DEBUG logging, run LOGLEVEL="DEBUG" udata-analysis-service consume.
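
As an illustration of how a Python program can honour a variable like LOGLEVEL, here is a minimal sketch using the standard logging module. It is not the service's actual code: the `resolve_log_level` helper is hypothetical, and falling back to INFO when the variable is unset or unrecognised is an assumption.

```python
import logging
import os


def resolve_log_level(environ, default=logging.INFO):
    """Translate a LOGLEVEL value such as 'DEBUG' into a numeric logging level."""
    name = environ.get("LOGLEVEL", "").strip().upper()
    # getLevelName returns an int for known names and a string otherwise.
    level = logging.getLevelName(name) if name else default
    return level if isinstance(level, int) else default


logging.basicConfig(level=resolve_log_level(os.environ))
logging.getLogger("example").debug("only shown when LOGLEVEL=DEBUG")
```

Reading the level once at start-up keeps the behaviour the same whether the process is started as a consumer or as a worker.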


