followthemoney query dsl and io helpers
ftmq
An attempt towards a followthemoney query dsl.
This library provides methods to query and filter entities formatted as followthemoney data, either from a json file/stream or using a SQL backend via followthemoney-store.
It also provides a Query class that can be used in other libs to work with SQL queries or api queries.
Minimum Python version: 3.11
Installation
pip install ftmq
Usage
ftmq accepts either a line-based input stream or an argument with a file uri. (For integration with followthemoney-store, see below.)
Input stream:
cat entities.ftm.json | ftmq <filter expression> > output.ftm.json
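To illustrate what filtering a line-based stream amounts to, here is a minimal sketch in plain Python. It assumes one JSON-encoded entity per line with a top-level "schema" key; the helper names `iter_entities` and `filter_stream` are hypothetical and not part of ftmq.

```python
import io
import json
from typing import Callable, Iterator, TextIO


def iter_entities(stream: TextIO) -> Iterator[dict]:
    """Yield one entity dict per non-empty line of a JSON-lines stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def filter_stream(
    source: TextIO, sink: TextIO, predicate: Callable[[dict], bool]
) -> int:
    """Copy entities matching `predicate` from source to sink; return the count."""
    written = 0
    for entity in iter_entities(source):
        if predicate(entity):
            sink.write(json.dumps(entity) + "\n")
            written += 1
    return written


# Two in-memory entities stand in for stdin; a StringIO stands in for stdout.
source = io.StringIO(
    '{"id": "a", "schema": "Person", "properties": {"name": ["Jane Doe"]}}\n'
    '{"id": "b", "schema": "Company", "properties": {"name": ["ACME"]}}\n'
)
sink = io.StringIO()
count = filter_stream(source, sink, lambda e: e["schema"] == "Person")
```

Because entities are processed one line at a time, memory use stays flat regardless of input size, which is what makes piping through `cat` practical.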
URI argument:
Under the hood, ftmq uses smart_open to interpret arbitrary file uris passed via the -i argument:
ftmq <filter expression> -i ~/Data/entities.ftm.json
ftmq <filter expression> -i https://example.org/data.json.gz
ftmq <filter expression> -i s3://data-bucket/entities.ftm.json
ftmq <filter expression> -i webhdfs://host:port/path/file
Of course, the same is possible for output via -o:
cat data.json | ftmq <filter expression> -o s3://data-bucket/output.json
Filter for a dataset:
cat entities.ftm.json | ftmq -d ec_meetings
Filter for a schema:
cat entities.ftm.json | ftmq -s Person
Filter for a schema and all its descendants or ancestors:
cat entities.ftm.json | ftmq -s LegalEntity --schema-include-descendants
cat entities.ftm.json | ftmq -s LegalEntity --schema-include-ancestors
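The effect of the two flags can be pictured as expanding one schema name into a set of names before matching. The sketch below uses a small hand-written, simplified excerpt of a schema hierarchy rather than the real followthemoney model, and it assumes the named schema itself is always part of the result; `PARENTS` and `expand` are hypothetical names, not ftmq internals.

```python
# Simplified, hand-written excerpt of a schema hierarchy (child -> parents).
# Assumption: illustrative only; the real followthemoney model is larger.
PARENTS = {
    "Thing": [],
    "LegalEntity": ["Thing"],
    "Person": ["LegalEntity"],
    "Organization": ["LegalEntity"],
    "Company": ["Organization"],
}


def ancestors(schema: str) -> set[str]:
    """All schemata that `schema` inherits from, directly or transitively."""
    found: set[str] = set()
    stack = list(PARENTS[schema])
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(PARENTS[parent])
    return found


def descendants(schema: str) -> set[str]:
    """All schemata that inherit from `schema`, directly or transitively."""
    return {name for name in PARENTS if schema in ancestors(name)}


def expand(
    schema: str, *, include_descendants: bool = False, include_ancestors: bool = False
) -> set[str]:
    """Set of schema names an entity may have to pass the schema filter."""
    names = {schema}
    if include_descendants:
        names |= descendants(schema)
    if include_ancestors:
        names |= ancestors(schema)
    return names


with_children = expand("LegalEntity", include_descendants=True)
with_parents = expand("LegalEntity", include_ancestors=True)
```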
Filter for properties:
Properties are passed as options via --<prop>=<value>:
cat entities.ftm.json | ftmq -s Company --country=de
Comparison lookups for properties:
cat entities.ftm.json | ftmq -s Company --incorporationDate__gte=2020 --address__ilike=berlin
Possible lookups:
- gt - greater than
- lt - lower than
- gte - greater or equal
- lte - lower or equal
- like - SQLish LIKE (use % placeholders)
- ilike - SQLish ILIKE, case-insensitive (use % placeholders)
- [] - usage: prop[]=foo evaluates if foo is member of array prop
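The lookups listed above follow the familiar `prop__operator` naming pattern. The following sketch shows one way such keys could be parsed and evaluated against multi-valued properties. It is not ftmq's implementation: the helper names are hypothetical, and comparing values as plain strings is an assumption that happens to work for ISO-formatted dates.

```python
import operator
import re

# Comparison operators keyed by lookup suffix; "eq" is the default.
COMPARATORS = {
    "eq": operator.eq,
    "gt": operator.gt,
    "lt": operator.lt,
    "gte": operator.ge,
    "lte": operator.le,
}


def split_lookup(key: str) -> tuple[str, str]:
    """Split 'prop__gte' into ('prop', 'gte'); 'prop[]' means array membership."""
    if key.endswith("[]"):
        return key[:-2], "in_array"
    prop, sep, op = key.partition("__")
    return (prop, op) if sep else (prop, "eq")


def like_to_regex(pattern: str, ignore_case: bool) -> re.Pattern:
    """Translate a LIKE pattern with % placeholders into a compiled regex."""
    body = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.compile(body, re.IGNORECASE if ignore_case else 0)


def matches(values: list[str], op: str, expected: str) -> bool:
    """True if any value of a multi-valued property satisfies the lookup."""
    if op in COMPARATORS:
        return any(COMPARATORS[op](value, expected) for value in values)
    if op in ("like", "ilike"):
        rx = like_to_regex(expected, ignore_case=(op == "ilike"))
        return any(rx.fullmatch(value) for value in values)
    if op == "in_array":
        return expected in values
    raise ValueError(f"unknown lookup: {op}")


def entity_matches(properties: dict, filters: dict) -> bool:
    """All filters must hold; a missing property never matches."""
    return all(
        matches(properties.get(prop, []), op, expected)
        for (prop, op), expected in (
            (split_lookup(key), value) for key, value in filters.items()
        )
    )


company = {
    "incorporationDate": ["2021-03-01"],
    "address": ["Unter den Linden 1, Berlin"],
    "country": ["de"],
}
result = entity_matches(
    company, {"incorporationDate__gte": "2020", "address__ilike": "%berlin%"}
)
```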
ftmq apply
"Uplevel" an entity input stream to nomenklatura.entity.CompositeEntity and optionally apply a dataset.
ftmq apply -i ./entities.ftm.json -d <additional_dataset>
Overwrite datasets:
ftmq apply -i ./entities.ftm.json -d <additional_dataset> --replace-dataset
Coverage / Statistics
Often in ftm scripting, we are iterating through all the proxies (e.g. during aggregation). Why not use this to collect statistics on the way? There is a context manager for this, which turns into the Coverage model:
Print coverage to stdout (and filtered entities to nowhere):
cat entities.ftm.json | ftmq -s Event -o /dev/null --coverage-uri -
Within code:
from ftmq.coverage import Collector

fragments = [...]
buffer = {}
c = Collector()

for proxy in fragments:
    if proxy.id in buffer:
        buffer[proxy.id].merge(proxy)
    else:
        buffer[proxy.id] = proxy
    # here collect stats:
    c.collect(proxy)

coverage = c.export()
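To make the idea of collecting statistics during an existing loop concrete, here is a toy stand-in written with the standard library only. `TinyCollector` is hypothetical and much simpler than ftmq's Collector; which figures the real Coverage model contains is not assumed here. It only shows the pattern: one `collect` call per entity, one `export` at the end.

```python
from collections import Counter


class TinyCollector:
    """Hypothetical stand-in, not ftmq's Collector: tallies while iterating."""

    def __init__(self) -> None:
        self.entities = 0
        self.schemata: Counter = Counter()
        self.countries: Counter = Counter()

    def collect(self, entity: dict) -> None:
        """Update the running tallies with one entity."""
        self.entities += 1
        self.schemata[entity["schema"]] += 1
        self.countries.update(entity.get("properties", {}).get("country", []))

    def export(self) -> dict:
        """Return the tallies as plain dicts, e.g. for JSON serialization."""
        return {
            "entities": self.entities,
            "schemata": dict(self.schemata),
            "countries": dict(self.countries),
        }


entities = [
    {"id": "a", "schema": "Event", "properties": {"country": ["be"]}},
    {"id": "b", "schema": "Event", "properties": {"country": ["be", "de"]}},
    {"id": "c", "schema": "Person", "properties": {}},
]

collector = TinyCollector()
for entity in entities:
    # The loop would do its real work here (merging, writing, ...).
    collector.collect(entity)

stats = collector.export()
```

The statistics come at no extra pass over the data, since the loop already touches every entity once.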
ftmstore (database read)
NOT IMPLEMENTED YET
The same cli logic applies:
ftmq store iterate -d ec_meetings -s Event --date__gte=2019 --date__lte=2020
Python Library
from ftmq import Query

q = Query() \
    .where(dataset="ec_meetings", date__lte=2020) \
    .where(schema="Event") \
    .order_by("date", ascending=False)

assert q.apply(proxy)
Support
This project is part of investigraph
Media Tech Lab Bayern batch #3