
ETL pipeline for investigations with Follow the Money data


investigraph

Research and implementation of an ETL process for a curated and up-to-date public and open-source data catalog of frequently used datasets in investigative journalism.

Uses prefect.io for Follow the Money (ftm) pipeline processing.

Documentation

Tutorial

installation

pip install investigraph

example datasets

There is a dedicated repo for example datasets that can be used as a Block within the prefect.io deployment.

deployment

docker

Use docker-compose.yml for local development and testing, and docker-compose.prod.yml as a starting point for a production setup. More instructions here.
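
A rough sketch of how a helper script might choose between the two compose files. The function name `compose_command` and its `production` flag are hypothetical, and the `docker-compose -f <file> up` invocation is ordinary Docker Compose usage rather than something this project documents. Since the production file is only described as a starting point, adapt it before relying on it.

```python
import shlex

# The two file names come from the text above; the rest is an illustrative assumption.
DEV_COMPOSE_FILE = "docker-compose.yml"
PROD_COMPOSE_FILE = "docker-compose.prod.yml"


def compose_command(production: bool = False) -> list:
    """Build a docker-compose invocation for the chosen setup (hypothetical helper)."""
    compose_file = PROD_COMPOSE_FILE if production else DEV_COMPOSE_FILE
    return ["docker-compose", "-f", compose_file, "up"]


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is started by accident.
    print(shlex.join(compose_command()))
```

Printing the command first makes it easy to confirm which file would be used before starting any containers.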

run locally

Clone the repository first.

Install app and dependencies (use a virtualenv):

pip install -e .

After installation, the investigraph command should be available:

investigraph --help
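
If the command is not found, the virtualenv is probably not activated. A minimal sketch for checking this from Python, using only the standard library; the helper name `cli_available` is an assumption made for illustration and is not part of investigraph:

```python
import shutil
import subprocess


def cli_available(command: str = "investigraph") -> bool:
    """Return True if `command` can be found on the current PATH."""
    return shutil.which(command) is not None


if __name__ == "__main__":
    if cli_available():
        # `--help` is the invocation shown above; it only prints usage text.
        subprocess.run(["investigraph", "--help"], check=False)
    else:
        print("investigraph not found on PATH; is the virtualenv activated?")
```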

Quickly run a local dataset definition:

investigraph run <dataset_name> -c ./path/to/config.yml

Register a local datasets block:

investigraph add-block -b local-file-system/investigraph-local -u ./datasets

Register a GitHub datasets block:

investigraph add-block -b github/investigraph-datasets -u https://github.com/investigativedata/investigraph-datasets.git

Run a dataset pipeline from a dataset defined in a registered block:

investigraph run ec_meetings
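
The commands above could be scripted along the following lines. This is a sketch under stated assumptions: it only assembles the subcommands and flags shown in this README (`add-block -b ... -u ...`, `run <dataset_name> -c ...`), and the helper names `add_block_command`, `run_command`, and `execute` are hypothetical. By default it prints the commands without running them.

```python
import shlex
import subprocess
from typing import List, Optional


def add_block_command(block: str, uri: str) -> List[str]:
    """Build the `investigraph add-block` call shown above (hypothetical helper)."""
    return ["investigraph", "add-block", "-b", block, "-u", uri]


def run_command(dataset: str, config: Optional[str] = None) -> List[str]:
    """Build the `investigraph run` call, optionally with a local config file."""
    cmd = ["investigraph", "run", dataset]
    if config is not None:
        cmd += ["-c", config]
    return cmd


def execute(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; run it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    execute([
        add_block_command(
            "github/investigraph-datasets",
            "https://github.com/investigativedata/investigraph-datasets.git",
        ),
        run_command("ec_meetings"),
    ])
```

Passing `dry_run=False` to `execute` runs the commands for real, which requires investigraph to be installed and a Prefect setup to be reachable.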

View the Prefect dashboard:

make server

test

make install
make test

supported by

Media Tech Lab Bayern batch #3

