followthemoney query dsl and io helpers
Project description
ftmq
An attempt towards a followthemoney query dsl.
This library provides methods to query and filter entities formatted as followthemoney data, either from a json file/stream or using a SQL backend via followthemoney-store
It also provides a Query
class that can be used in other libs to work with
SQL queries or api queries.
Minimum Python version: 3.10
Installation
pip install ftmq
Usage
ftmq
accepts either a line-based input stream or an argument with a file uri.
(For integration with followthemoney-store
, see below)
Input stream:
cat entities.ftm.json | ftmq <filter expression> > output.ftm.json
URI argument:
Under the hood, ftmq
uses
smart_open to be able to
interpret arbitrary file uris as argument -i
:
ftmq <filter expression> -i ~/Data/entities.ftm.json
ftmq <filter expression> -i https://example.org/data.json.gz
ftmq <filter expression> -i s3://data-bucket/entities.ftm.json
ftmq <filter expression> -i webhdfs://host:port/path/file
Of course, the same is possible for output -o
:
cat data.json | ftmq <filter expression> -o s3://data-bucket/output.json
Filter for a dataset:
cat entities.ftm.json | ftmq -d ec_meetings
Filter for a schema:
cat entities.ftm.json | ftmq -s Person
Filter for a schema and all it's descendants or ancestors:
cat entities.ftm.json | ftmq -s LegalEntity --schema-include-descendants
cat entities.ftm.json | ftmq -s LegalEntity --schema-include-ancestors
Filter for properties:
Properties are options via --<prop>=<value>
cat entities.ftm.json | ftmq -s Company --country=de
Comparison lookups for properties:
cat entities.ftm.json | ftmq -s Company --incorporationDate__gte=2020 --address__ilike=berlin
Possible lookups:
gt
- greater thanlt
- lower thangte
- greater or equallte
- lower or equallike
- SQLishLIKE
(use%
placeholders)ilike
- SQLishILIKE
, case-insensitive (use%
placeholders)[]
- usage:prop[]=foo
evaluates iffoo
is member of arrayprop
ftmstore (database read)
The same cli logic applies:
ftmq store iterate -d ec_meetings -s Event --date__gte=2019 --date__lte=2020
Python Library
from ftmq import Query
q = Query(engine="sqlite") \
.where(dataset="ec_meetings", date__lte=2020) \
.where(schema="Event") \
.order_by("date", ascending=False)
# resulting sqlite query:
str(q)
"""
SELECT t.id,
t.schema,
t.entity,
json_extract(t.entity, '$.properties.date') AS date
FROM ec_meetings t
WHERE
(EXISTS (SELECT 1 FROM json_each(date) WHERE value <= ?)) AND (t.schema = ?)
ORDER BY date DESC
"""
# parameterized
[p for p in q.parameters]
[2020, 'Event']
support
This project is part of investigraph
Media Tech Lab Bayern batch #3
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.