Tools to interact with and deploy CouncilDataProject instances
Project description
cdptools
Making City Council data more accessible and actions taken by city council members more discoverable and trackable.
About
We wondered why it was so hard to find out what was being discussed in Seattle City Council about a specific topic, so we set out to solve that. The first step is basic data processing: automatically creating transcripts for city council events, indexing those transcripts, and finally making them available on the web via our website and database. We also wanted the entire system to be low cost, modular, and open access, so that it would be relatively easy for other CDP instances to be created and maintained. For us, that means databases and file stores are open to read from, the websites through which users interact with the data can run on free hosting services such as GitHub Pages, and computation choices are flexible so that cost isn't a barrier.
The first CDP instance to be deployed was for Seattle and an example of the data that is produced and available from these systems can be seen on our Seattle instance website. The repository and code for the instance website can be found here.
This repository and Python package is a collection of tools, pipelines, and processing functions used by servers to retrieve, package, store, and process the data required by CouncilDataProject instances.
While this package primarily targets developers of CDP instances and backend services, a main mission of CDP is to make city council data easier to access in all forms, on the web and programmatically. This package therefore includes objects to do just that: connect to and request data from CDP instance databases and file stores (examples below).
User Features
- Plain text query for events or minutes items
- Database schema allows for simple querying of:
  - events (meetings)
  - voting history of a city council or city council member
  - bodies (committees)
  - members
  - minutes items
  - event transcripts
- File stores and databases can be used in combination to download audio or the entire transcript of a meeting
Quickstart
Search for events using plain text:
from cdptools import CDPInstance, configs
seattle = CDPInstance(configs.SEATTLE)
matching_events = seattle.database.search_events("bicycle infrastructure, pedestrian mobility")
# Returns list of Match objects sorted most to least relevant
# [Match, Match, ...]
# Use the `Match.data` attribute to view the match's data
matching_events[0].data
# {
# 'event_id': '05258417-9ad3-4d42-be1d-95eafcfa03c5',
# 'legistar_event_id': 4053,
# 'event_datetime': datetime.datetime(2019, 8, 5, 9, 30),
# ...
# }
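Because each result exposes its record through `Match.data`, a small helper can pull out just the fields you care about. The following is a minimal sketch, not part of cdptools: `summarize_matches` is a hypothetical helper, and the stand-in objects below only imitate the `data` attribute so the example runs without a database connection. The second event id is made up for illustration.

```python
from datetime import datetime
from itertools import islice
from types import SimpleNamespace
from typing import Any, Dict, Iterable, List


def summarize_matches(matches: Iterable[Any], limit: int = 5) -> List[Dict[str, Any]]:
    """Collect the event id and datetime of the first `limit` matches.

    Hypothetical helper: it only assumes each match has a `data` dictionary,
    as described in the quickstart above.
    """
    summaries = []
    for match in islice(matches, limit):
        data = match.data
        summaries.append(
            {
                "event_id": data.get("event_id"),
                "event_datetime": data.get("event_datetime"),
            }
        )
    return summaries


# Stand-ins for Match objects (assumption: real results carry the same keys).
fake_matches = [
    SimpleNamespace(
        data={
            "event_id": "05258417-9ad3-4d42-be1d-95eafcfa03c5",
            "legistar_event_id": 4053,
            "event_datetime": datetime(2019, 8, 5, 9, 30),
        }
    ),
    SimpleNamespace(
        data={
            "event_id": "hypothetical-event-2",
            "event_datetime": datetime(2019, 9, 9, 14, 0),
        }
    ),
]

for summary in summarize_matches(fake_matches):
    print(summary["event_id"], summary["event_datetime"])
```

With a real instance, the same helper should accept the list returned by `seattle.database.search_events(...)` in place of `fake_matches`.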
Search for bills, appointments, and other 'minutes items' using plain text:
from cdptools import CDPInstance, configs
seattle = CDPInstance(configs.SEATTLE)
matching_minutes_items = seattle.database.search_minutes_items("bicycle infrastructure")
# Returns list of Match objects sorted most to least relevant
# [Match, Match, ...]
Get all data from a table:
from cdptools import CDPInstance, configs
seattle = CDPInstance(configs.SEATTLE)
all_events = seattle.database.select_rows_as_list("event")
# Returns list of dictionaries with event information
# [{"event_id": "0123", ...}, ...]
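Since the rows come back as plain dictionaries, ordinary Python is enough to filter them. The sketch below is illustrative only: `events_between` is a hypothetical helper, and it assumes each event row has an `event_datetime` key holding a `datetime`, as in the search result shown earlier. The sample rows are invented so the example runs on its own.

```python
from datetime import datetime
from typing import Any, Dict, List


def events_between(
    events: List[Dict[str, Any]], start: datetime, end: datetime
) -> List[Dict[str, Any]]:
    """Return events with start <= event_datetime < end, oldest first.

    Hypothetical helper; assumes every row has an `event_datetime` datetime.
    """
    selected = [e for e in events if start <= e["event_datetime"] < end]
    return sorted(selected, key=lambda e: e["event_datetime"])


# Invented sample rows standing in for select_rows_as_list("event") output.
sample_events = [
    {"event_id": "c", "event_datetime": datetime(2019, 9, 16, 14, 0)},
    {"event_id": "a", "event_datetime": datetime(2019, 8, 5, 9, 30)},
    {"event_id": "b", "event_datetime": datetime(2019, 8, 12, 14, 0)},
]

august = events_between(sample_events, datetime(2019, 8, 1), datetime(2019, 9, 1))
print([e["event_id"] for e in august])
```

Passing `all_events` from the example above instead of `sample_events` should work the same way, provided the rows carry that key.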
Download the highest confidence transcript for each event:
from cdptools import CDPInstance, configs
seattle = CDPInstance(configs.SEATTLE)
event_corpus_map = seattle.download_transcripts()
# Returns a dictionary mapping event id to a local path of the transcript
# {"0123abc...": "~/4567def..."}
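The returned mapping can then be walked to load each transcript from disk. This is a rough sketch under stated assumptions: `load_transcript_texts` is a hypothetical helper, and it treats every transcript as a UTF-8 text file without interpreting its format, which this README does not specify.

```python
from pathlib import Path
from typing import Dict


def load_transcript_texts(event_corpus_map: Dict[str, str]) -> Dict[str, str]:
    """Map each event id to the raw text of its transcript file.

    Hypothetical helper; paths that do not point to a file are skipped.
    """
    texts = {}
    for event_id, path in event_corpus_map.items():
        transcript_path = Path(path).expanduser()
        if transcript_path.is_file():
            # Assumption: transcripts are stored as UTF-8 text.
            texts[event_id] = transcript_path.read_text(encoding="utf-8")
    return texts
```

For example, `load_transcript_texts(event_corpus_map)` would return a dictionary keyed by event id that you can search or feed into further text processing.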
See the examples directory, which contains Jupyter notebooks with more examples of how to use CDP databases and file stores.
Installation
Stable Release: pip install cdptools
Development Head: pip install git+https://github.com/CouncilDataProject/cdptools.git
All City Installation:
pip install cdptools[all]
Individual City Installation:
- Seattle:
pip install cdptools[seattle]
Developer Features
- Modular system for gathering city council events, transcribing them, and indexing them to make them searchable.
- Data pipelines are highly customizable to fit your city's needs.
- Deploy and run pipelines using Docker to ensure your system has everything it needs.
For additional information on system design please refer to our documentation.
Documentation
For full package documentation please visit CouncilDataProject.github.io/cdptools.
Development
See CONTRIBUTING.md for information related to developing the code.
Free software: BSD-3-Clause license
This package was created with Cookiecutter.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file cdptools-2.0.6.tar.gz.
File metadata
- Download URL: cdptools-2.0.6.tar.gz
- Upload date:
- Size: 246.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/50.3.0 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.8.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | f4a7c5ea427c8a8762031840cdb7c982437437840dcd3a1ace75c4d7256ff3e7
MD5 | 619253fdc88083bf82adade09291f6d5
BLAKE2b-256 | 6a2daa4aadaed1ac899af392799de5c64654185d49bb4c2dd4fd44e47e0d7712
File details
Details for the file cdptools-2.0.6-py2.py3-none-any.whl.
File metadata
- Download URL: cdptools-2.0.6-py2.py3-none-any.whl
- Upload date:
- Size: 74.9 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/50.3.0 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.8.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | d0f77e5f8630d0758a1508dbd9ed2f726cafb8fe89118dcfb3c526e17f34e9fa
MD5 | ffaed4b6cb2fe62f791db3e5a03a49df
BLAKE2b-256 | e60da7052cc4f9732bc3060ea1703f3e3b4861601be6169bc43cef5417d54b0b
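If you download one of these files manually, you can compare its SHA256 digest with the value listed above. The following is a minimal sketch using only the Python standard library; `sha256_of` and `matches_expected` are hypothetical helper names, and the local file path is whatever location you saved the download to.

```python
import hashlib
import hmac


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected_hex: str) -> bool:
    """Check a file's SHA256 digest against an expected hex string."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

For instance, `matches_expected("cdptools-2.0.6.tar.gz", "f4a7c5ea...")` with the full digest from the table should return `True` for an intact download of the source distribution.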