Add codification and statute tables to pre-existing corpus-base database.
corpus-x
Concept
corpus-pax + corpus-base + statute-trees = corpus-x. Together they convert the raw yaml-based corpus repository into its database variant, corpus-x; see details. Once all of the required tables have been constructed, the raw data can be evaluated.
Flow
Local files
Download *.yaml files from repository:
flowchart LR
repo(((github/corpus))) --download---> local(local machine)
Local database
Set up local db:
flowchart LR
local(local corpus)--add corpus-pax tables--->db
local--add corpus-base tables-->db
local--format trees with statute-trees-->trees
trees(tree structures)--add corpus-x tables-->db[(sqlite.db)]
Replicated database
Store backup db on aws:
flowchart LR
db[(sqlite.db)]--litestream replicate-->aws
aws--litestream restore-->lawdata.xyz
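The replicate and restore arrows above can be sketched as two Litestream invocations. This is only an illustration: the bucket URL `s3://example-bucket/x` and the helper names are hypothetical, AWS credentials are assumed to be configured in the environment, and the actual project may drive Litestream through a config file instead.

```python
import subprocess

DB_PATH = "x.db"  # database file named in the Mode table
REPLICA_URL = "s3://example-bucket/x"  # hypothetical bucket, replace with your own


def replicate_cmd(db_path: str = DB_PATH, replica_url: str = REPLICA_URL) -> list[str]:
    """Build the command that streams the local db to the replica."""
    return ["litestream", "replicate", db_path, replica_url]


def restore_cmd(output_path: str, replica_url: str = REPLICA_URL) -> list[str]:
    """Build the command that restores the replica into output_path."""
    return ["litestream", "restore", "-o", output_path, replica_url]


def run(cmd: list[str]) -> None:
    """Execute a command built above; raises CalledProcessError on failure."""
    subprocess.run(cmd, check=True)
```

Calling `run(replicate_cmd())` on the local machine and `run(restore_cmd("x.db"))` on the server would mirror the two arrows; the commands are built separately from execution so they can be inspected first.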
Mode
Order | Time | Instruction | Docs |
---|---|---|---|
0 | ~6sec (if with test data) | corpus-pax is a prerequisite for corpus-base. | Setup |
1 | ~20-40min | corpus-base is a prerequisite for corpus-x. | Setup |
2 | ~120-130min | If the inclusion files are not yet created, run the script to generate them. | Pre-inclusions |
3 | ~10min | Assuming the inclusion files are already created, populate the various tables under corpus-x. | Post-inclusions |
4 | ~60min | Litestream output x.db on AWS bucket. | Replicated db |
Gotchas
The statutory event data contained in the units field does not yet contain the statute_ids. Prior to database insertion, we only know the statute label but not the id. Once the statute has been inserted, we can match the statute label to the id:
for row in c.db[CodeRow.__tablename__].rows:
    obj = CodeRow.set_update_units(c, row["id"])
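To illustrate the label-to-id matching in isolation, here is a minimal sketch using only the standard library. It is not the implementation of `set_update_units`: the table names `statutes` and `codes`, the flat list of units, and the `statute` / `statute_id` keys are all assumptions made for the example (the real units are likely nested trees).

```python
import json
import sqlite3


def backfill_statute_ids(conn: sqlite3.Connection) -> int:
    """Fill statute_id in each row's units after statutes are inserted."""
    # Build a label -> id lookup from the already-populated statutes table.
    lookup = dict(conn.execute("SELECT label, id FROM statutes"))
    updated = 0
    rows = conn.execute("SELECT id, units FROM codes").fetchall()
    for row_id, raw_units in rows:
        units = json.loads(raw_units)
        for unit in units:
            # Unknown labels stay unmatched (None) rather than raising.
            unit["statute_id"] = lookup.get(unit["statute"])
        conn.execute(
            "UPDATE codes SET units = ? WHERE id = ?",
            (json.dumps(units), row_id),
        )
        updated += 1
    conn.commit()
    return updated


# Hypothetical data: one statute and one code row that knows only the label.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE statutes (id TEXT PRIMARY KEY, label TEXT)")
conn.execute("CREATE TABLE codes (id TEXT PRIMARY KEY, units TEXT)")
conn.execute("INSERT INTO statutes VALUES ('ra-386', 'Republic Act No. 386')")
conn.execute(
    "INSERT INTO codes VALUES (?, ?)",
    ("civil-code", json.dumps([{"statute": "Republic Act No. 386"}])),
)
backfill_statute_ids(conn)
```

The key point is ordering: the lookup can only be built after the statute rows exist, which is why the update runs as a separate pass over the code rows.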