Skip to main content

followthemoney query dsl and io helpers

Project description

ftmq on pypi Python test and package pre-commit Coverage Status MIT License

ftmq

An attempt towards a followthemoney query dsl.

This library provides methods to query and filter entities formatted as followthemoney data, either from a json file/stream or using a SQL backend via followthemoney-store

It also provides a Query class that can be used in other libs to work with SQL queries or api queries.

Minimum Python version: 3.11

Installation

pip install ftmq

Usage

ftmq accepts either a line-based input stream or an argument with a file uri. (For integration with followthemoney-store, see below)

Input stream:

cat entities.ftm.json | ftmq <filter expression> > output.ftm.json

URI argument:

Under the hood, ftmq uses smart_open to be able to interpret arbitrary file uris as argument -i:

ftmq <filter expression> -i ~/Data/entities.ftm.json
ftmq <filter expression> -i https://example.org/data.json.gz
ftmq <filter expression> -i s3://data-bucket/entities.ftm.json
ftmq <filter expression> -i webhdfs://host:port/path/file

...and so on

Of course, the same is possible for output -o:

cat data.json | ftmq <filter expression> -o s3://data-bucket/output.json

Filter for a dataset:

cat entities.ftm.json | ftmq -d ec_meetings

Filter for a schema:

cat entities.ftm.json | ftmq -s Person

Filter for a schema and all it's descendants or ancestors:

cat entities.ftm.json | ftmq -s LegalEntity --schema-include-descendants
cat entities.ftm.json | ftmq -s LegalEntity --schema-include-ancestors

Filter for properties:

Properties are options via --<prop>=<value>

cat entities.ftm.json | ftmq -s Company --country=de

Comparison lookups for properties:

cat entities.ftm.json | ftmq -s Company --incorporationDate__gte=2020 --address__ilike=berlin

Possible lookups:

  • gt - greater than
  • lt - lower than
  • gte - greater or equal
  • lte - lower or equal
  • like - SQLish LIKE (use % placeholders)
  • ilike - SQLish ILIKE, case-insensitive (use % placeholders)
  • [] - usage: prop[]=foo evaluates if foo is member of array prop

ftmq apply

"Uplevel" an entity input stream to nomenklatura.entity.CompositeEntity and optionally apply a dataset.

ftmq apply -i ./entities.ftm.json -d <aditional_dataset>

Overwrite datasets:

ftmq apply -i ./entities.ftm.json -d <aditional_dataset> --replace-dataset

Coverage / Statistics

Often in ftm scripting, we are iterating through all the proxies (e.g. during aggregation). Why not use this to collect statistics on the way? There is a context manager for this, which turns into the Coverage model:

Print coverage to stdout (and filtered entities to nowhere):

cat entities.ftm.json | ftmq -s Event -o /dev/null --coverage-uri -

Within code:

from ftmq.coverage import Collector

fragments = [...]
buffer = {}

c = Collector()
for proxy in fragments:
    if proxy.id in buffer:
        buffer[proxy.id].merge(proxy)
    else:
        buffer[proxy.id] = proxy
        # here collect stats:
        c.collect(proxy)

coverage = c.export()

ftmstore (database read)

NOT IMPLEMENTED YET

The same cli logic applies:

ftmq store iterate -d ec_meetings -s Event --date__gte=2019 --date__lte=2020

Python Library

from ftmq import Query

q = Query() \
    .where(dataset="ec_meetings", date__lte=2020) \
    .where(schema="Event") \
    .order_by("date", ascending=False)

assert q.apply(proxy)

support

This project is part of investigraph

Media Tech Lab Bayern batch #3

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ftmq-0.6.14.tar.gz (28.5 kB view details)

Uploaded Source

Built Distribution

ftmq-0.6.14-py3-none-any.whl (35.0 kB view details)

Uploaded Python 3

File details

Details for the file ftmq-0.6.14.tar.gz.

File metadata

  • Download URL: ftmq-0.6.14.tar.gz
  • Upload date:
  • Size: 28.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.12.5 Linux/6.10.6-amd64

File hashes

Hashes for ftmq-0.6.14.tar.gz
Algorithm Hash digest
SHA256 58b7c343ece2de3e41cc9f5b74a1d4964646b7c67599d7ad8a3247831239403a
MD5 471c71875ca55a6f7aa77f468a75c65c
BLAKE2b-256 124e7399afc6c6b8ce3cb57a3bd695e49be2a9bcad6db28105a2d25cf178242b

See more details on using hashes here.

File details

Details for the file ftmq-0.6.14-py3-none-any.whl.

File metadata

  • Download URL: ftmq-0.6.14-py3-none-any.whl
  • Upload date:
  • Size: 35.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.12.5 Linux/6.10.6-amd64

File hashes

Hashes for ftmq-0.6.14-py3-none-any.whl
Algorithm Hash digest
SHA256 c18f7c75faefaec6091f9338c577092d9ea4d239022d62f5b935dede42e2a6f6
MD5 bb93d3b70f2740029aecbcedf48e2a87
BLAKE2b-256 f3484dee3716a2c9306f13d360fed9cd9ab32dd684ad7ee2ef148482e2c7d78d

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page