"Python package classifications of the util workflow stage of the BONSAI database"

BONSAI classifications

The BONSAI classifications Python package is a part of the Getting The Data Right project.

This package creates all classifications used in the Bonsai database and stores them as CSV files. The CSV files can be found under /src/classifications/data. The files are organised according to the Bonsai ontology, which gives the following folders:

activitytype (includes: industry_activity, government_activity, treatment_activity, non_profit_institution_serving_household, household_production, household_consumption, market_activity, natural_activity, auxiliary_production_activity, change_in_stock_activity, other_activity)
flowobject (includes: industry_product, material_for_treatment, market_product, government_product, household_product, needs_satisfaction, emission, direct_physical_change, natural_resource, economic_flow, social_flow)
location
unit
time

Since the Bonsai ontology does not cover all required topics, additional folders are added:

dataquality
uncertainty

Comprehensive documentation of the classifications package is available here

Format

The CSV files (tables) of each folder (datapackage) are organised in tabular format. Each of the folders mentioned above represents a valid dataio.datapackage created with the Python package dataio. The following table types are used, each identified by its file-name prefix:

tree table: tree_
concordance table: conc_
dimension table: dim_
pairwise concordance table: concpair_
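
As a minimal sketch of how the prefixes above can be used, the helper below derives the table type from a file name. The function name table_type and the .csv file names are assumptions made for illustration; they are not part of the package API.

```python
from pathlib import Path

# Prefixes listed above, mapped to the table type they denote.
PREFIX_TO_TYPE = {
    "tree_": "tree table",
    "conc_": "concordance table",
    "dim_": "dimension table",
    "concpair_": "pairwise concordance table",
}


def table_type(filename: str) -> str:
    """Return the table type implied by the file-name prefix (hypothetical helper)."""
    stem = Path(filename).stem
    for prefix, kind in PREFIX_TO_TYPE.items():
        if stem.startswith(prefix):
            return kind
    raise ValueError(f"unknown table prefix in {filename!r}")


print(table_type("tree_bonsai.csv"))
print(table_type("conc_bonsai_nace_rev2.csv"))
```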

tree table

Tree tables are used for classifications which have a tree structure, meaning that the classification is structured hierarchically with multiple levels. The classification starts with broad categories at the top level and then branches out into more specific subcategories as you move down the hierarchy.

The following column names are used:

code: code of the item
parent_code: code of the item's parent
name: name of the item
level: the item's level in the tree structure (from 0 to n)
prefixed_id: unique id (uuid4)
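
To make the column semantics concrete, here is a small sketch of a tree table held as plain Python dictionaries. All codes and names are invented, the empty parent_code for the root is an assumption, and prefixed_id is assumed to be a plain uuid4 string. The helpers are illustrative and are not the package's own methods.

```python
import uuid

# Hypothetical tree table; codes and names are made up for illustration.
rows = [
    {"code": "A", "parent_code": "", "name": "all activities", "level": 0},
    {"code": "A1", "parent_code": "A", "name": "agriculture", "level": 1},
    {"code": "A1a", "parent_code": "A1", "name": "crop farming", "level": 2},
    {"code": "A2", "parent_code": "A", "name": "mining", "level": 1},
]
for row in rows:
    # Assumption: the id is a plain uuid4 string.
    row["prefixed_id"] = str(uuid.uuid4())


def children(rows, parent_code):
    """Codes whose parent_code equals the given code (direct children only)."""
    return [r["code"] for r in rows if r["parent_code"] == parent_code]


def descendants(rows, code):
    """All codes below the given code, found by following parent_code links."""
    found = []
    for child in children(rows, code):
        found.append(child)
        found.extend(descendants(rows, child))
    return found


def levels_consistent(rows):
    """Check that every non-root item sits exactly one level below its parent."""
    level_of = {r["code"]: r["level"] for r in rows}
    return all(
        r["level"] == level_of[r["parent_code"]] + 1
        for r in rows
        if r["parent_code"]
    )


print(children(rows, "A"))     # ['A1', 'A2']
print(descendants(rows, "A"))  # ['A1', 'A1a', 'A2']
print(levels_consistent(rows)) # True
```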

concordance table

A concordance table is used to establish equivalences or relationships between different classification systems. It provides mappings between the codes of one classification system and the codes of another. A relationship between codes can be one of four types:

One-to-One (1:1) Concordance: In a one-to-one concordance, each category or code in one classification system is mapped to exactly one category or code in another classification system, and vice versa. This type of mapping implies a direct and unambiguous correspondence between the two systems.
One-to-Many (1:M) Concordance: In a one-to-many concordance, each category or code in one classification system is mapped to multiple categories or codes in another classification system. However, each category or code in the second system is only mapped to one category or code in the first system. This type of mapping implies that one category or code in the first system may encompass multiple categories or codes in the second system.
Many-to-One (M:1) Concordance: In a many-to-one concordance, multiple categories or codes in one classification system are mapped to a single category or code in another classification system. However, each category or code in the first system is only mapped to one category or code in the second system. This type of mapping implies that multiple categories or codes in the first system are aggregated or collapsed into a single category or code in the second system.
Many-to-Many (M:M) Concordance: In a many-to-many concordance, multiple categories or codes in one classification system are mapped to multiple categories or codes in another classification system. This type of mapping indicates complex relationships where there isn't a straightforward one-to-one correspondence between the categories or codes in the two systems.
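
The four relationship types can be derived mechanically from the mapped pairs. The sketch below labels each pair by counting how many partners its two codes have. The function name relation_types and the sample codes are assumptions for illustration, and the labels it produces are not necessarily the values stored in the comment column.

```python
from collections import defaultdict


def relation_types(pairs):
    """Label each (code_a, code_b) pair as 1:1, 1:M, M:1 or M:M."""
    targets = defaultdict(set)  # code in A -> codes in B it maps to
    sources = defaultdict(set)  # code in B -> codes in A mapped to it
    for a, b in pairs:
        targets[a].add(b)
        sources[b].add(a)

    labels = {}
    for a, b in pairs:
        left = "1" if len(sources[b]) == 1 else "M"   # A codes behind this B code
        right = "1" if len(targets[a]) == 1 else "M"  # B codes behind this A code
        labels[(a, b)] = f"{left}:{right}"
    return labels


# Invented codes covering all four cases.
pairs = [
    ("a1", "x1"),                # one-to-one
    ("a2", "x2"), ("a2", "x3"),  # one-to-many
    ("a3", "x4"), ("a4", "x4"),  # many-to-one
    ("a5", "x5"), ("a5", "x6"),  # many-to-many
    ("a6", "x5"), ("a6", "x6"),
]
for pair, label in relation_types(pairs).items():
    print(pair, label)
```

Note that the label is computed per pair, so a table can contain several relationship types at once.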

The following column names are used:

<tree_classification_A>: code of classification A
<tree_classification_B>: code of classification B which is mapped to the code of classification A
comment: comment on the type of concordance
prefixed_id: unique id (uuid4)

The requirements for these table types are specified here.

dimension table

A dimension table is used for classifications which do not have a tree structure.

The following column names are used:

code: code of the item
name: name of the item
description: description of the item
prefixed_id: unique id (uuid4)

pairwise concordance table (for Bonsai)

This type of concordance table maps pairs of codes. For instance, some data providers, such as UNdata and IEA, use combined codes for an activity (e.g. "production of", "electricity production by") and a flowobject (e.g. "coal") that together correspond to a Bonsai activitytype (e.g. "A_COAL", "A_PowC"). When the conc_ tables for activitytype and flowobject, which map single relations, are not sufficient to derive these pairwise concordances, it is reasonable to state them explicitly. The mapping relationships between the pairwise codes can be of the same types as in the conc_ tables.

The following column names are used:

tree_bonsai_activitytype: code of the Bonsai activitytype
tree_bonsai_flowobject: code of the Bonsai flowobject
tree_other_activitytype: code of the other classification activitytype
tree_other_flowobject: code of the other classification flowobject
other_classification: name of the other classification schema
comment: comment on the type of concordance
prefixed_id: unique id (created by uuid4)
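
The sketch below illustrates why the pair matters: the external flowobject "coal" alone does not determine the Bonsai activitytype, but the (activity, flowobject) pair does. Only "A_COAL", "A_PowC" and the provider labels come from the text above. The Bonsai flowobject codes, the classification name "undata", the comment values, the pairing of labels to codes, and the helper to_bonsai are assumptions made for illustration.

```python
# Hypothetical rows of a concpair_ table (prefixed_id omitted for brevity).
rows = [
    {
        "tree_bonsai_activitytype": "A_COAL",
        "tree_bonsai_flowobject": "FO_1",  # invented placeholder code
        "tree_other_activitytype": "production of",
        "tree_other_flowobject": "coal",
        "other_classification": "undata",  # assumed schema name
        "comment": "one-to-one",
    },
    {
        "tree_bonsai_activitytype": "A_PowC",
        "tree_bonsai_flowobject": "FO_2",  # invented placeholder code
        "tree_other_activitytype": "electricity production by",
        "tree_other_flowobject": "coal",
        "other_classification": "undata",
        "comment": "one-to-one",
    },
]


def to_bonsai(rows, classification, activity, flowobject):
    """Return all Bonsai (activitytype, flowobject) pairs for one external pair."""
    return [
        (r["tree_bonsai_activitytype"], r["tree_bonsai_flowobject"])
        for r in rows
        if r["other_classification"] == classification
        and r["tree_other_activitytype"] == activity
        and r["tree_other_flowobject"] == flowobject
    ]


print(to_bonsai(rows, "undata", "electricity production by", "coal"))
print(to_bonsai(rows, "undata", "production of", "coal"))
```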

Usage

To use the classifications, install the package via pip. To install from the repository, replace <version> with a specific tag or branch name:

pip install git+ssh://git@gitlab.com/bonsamurais/bonsai/util/classifications@<version>

To install from PyPI, run:

pip install bonsai_classifications

All classifications are provided as dataio.datapackage objects, which include the tables as pandas.DataFrame. For example, to get the classification tree for industry activities of Bonsai, you can do the following:

import classifications

classifications.activitytype.datapackage.tree_bonsai

:::{note} The datapackage object also includes the tables of other classifications. :::

You can also get the concordance tables and external classifications in a similar way, using the datapackage object.

The activities and flowobjects of Bonsai can also be used directly as objects. The following returns the name of the A_Chick activity:

import classifications

classifications.activitytype.bonsai.A_Chick.name

Special methods

lookup() for searching strings in code names
get_children() to get all codes that have the same parent code
create_conc() to create a concordance table
disaggregate_bonsai() for adding new codes, which disaggregate an existing code

To search for certain key words in a table, you can use the line of code below. This returns a pandas DataFrame with rows that have "coal" in the name column.

classifications.flowobject.datapackage.tree_bonsai.lookup("coal")

To get all children of a certain code (here for treatment activities in Bonsai), you can do:

classifications.activitytype.datapackage.tree_bonsai.get_children(parent_code="at")

The package also helps to create new concordance tables. Given two concordance tables, one mapping codes of classification a to b and the other mapping b to c, you can derive a table that maps a to c as follows:

df_1:

a	b
01.01	x
...	...

df_2:

b	c
x	YXDA
...	...

df_3 = classifications.create_conc(df_1, df_2, source="a", target="c", intermediate="b")

df_3:

a	c
01.01	YXDA
...	...
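
Conceptually, chaining two concordances is a join on the intermediate classification. The sketch below shows that idea with plain lists of dictionaries. It is not the implementation of create_conc(), and the function name chain_concordance is hypothetical.

```python
def chain_concordance(rows_ab, rows_bc, source, target, intermediate):
    """Join two concordances on the intermediate column and keep source/target."""
    # Index the second table by its intermediate code.
    by_intermediate = {}
    for row in rows_bc:
        by_intermediate.setdefault(row[intermediate], []).append(row[target])

    result = []
    for row in rows_ab:
        for tgt in by_intermediate.get(row[intermediate], []):
            pair = {source: row[source], target: tgt}
            if pair not in result:  # drop duplicate source/target pairs
                result.append(pair)
    return result


df_1 = [{"a": "01.01", "b": "x"}]
df_2 = [{"b": "x", "c": "YXDA"}]

print(chain_concordance(df_1, df_2, source="a", target="c", intermediate="b"))
# [{'a': '01.01', 'c': 'YXDA'}]
```

Codes of a that have no partner in b to c are dropped in this sketch; whether the package keeps or reports them is not stated here.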

To disaggregate existing codes of the Bonsai classification, you can use the disaggregate_bonsai() method. The method is called on the datapackage of the relevant category, e.g. activitytype or flowobject. To indicate which code you want to disaggregate, you need to provide a dictionary. In the example below, this dictionary has a "disaggregations" key holding a list of entries. Each entry names the existing Bonsai code under "old_code" and lists its replacements under "new_codes". Each new code has a "code", a "description", and a "mappings" dictionary. The mapping dictionary has the name of another classification scheme (other than Bonsai) as key, and as value a list of strings, which are the codes of the other classification now represented by the new code.

codes = {"disaggregations":
          [
            {"old_code" : "A_Paper",
             "new_codes":
               [
                {"code": "New_Paper1",
                 "description": "new paper production 1",
                 "mappings": {"nace_rev2": ["10.02","01.13"]}
                },
                {"code": "New_Paper2",
                 "description": "new paper production 2",
                 "mappings": {}
                }
                ]
            }
          ]
}

d = classifications.activitytype.datapackage.disaggregate_bonsai(codes)

The modified pandas DataFrames can then be retrieved from the result:

d["tree_bonsai"]
d["conc_bonsai_nace_rev2"]

To use that function via terminal, execute python disaggregate_bonsai.py <bonsai_category> <path/to/disaggregation.yaml> <directory/for/updated/files>. <bonsai_category> can be for instance activitytype or flowobject.

:::{note} To disaggregate an existing code, you need to provide at least 2 new codes. It is assumed that the new codes together cover exactly the same entities as the existing code. :::
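
The minimum of two new codes can be checked before disaggregate_bonsai() is called. The following pre-check is a sketch only: the function check_disaggregation is hypothetical and not part of the package, and it assumes the dictionary layout used in the example above.

```python
def check_disaggregation(spec):
    """Return a list of problems found in a disaggregation dictionary."""
    problems = []
    for entry in spec.get("disaggregations", []):
        old_code = entry["old_code"]
        new_codes = [c["code"] for c in entry.get("new_codes", [])]
        if len(new_codes) < 2:
            problems.append(
                f"{old_code}: needs at least 2 new codes, got {len(new_codes)}"
            )
        if len(set(new_codes)) != len(new_codes):
            problems.append(f"{old_code}: duplicate new codes")
    return problems


codes = {
    "disaggregations": [
        {
            "old_code": "A_Paper",
            "new_codes": [
                {"code": "New_Paper1", "description": "new paper production 1",
                 "mappings": {"nace_rev2": ["10.02", "01.13"]}},
                {"code": "New_Paper2", "description": "new paper production 2",
                 "mappings": {}},
            ],
        }
    ]
}

print(check_disaggregation(codes))  # [] means no problems were found
```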

Release history

Version	Released
2.1.2	Feb 26, 2026
0.6.0	Mar 9, 2026
0.5.2	Feb 10, 2026
0.5.1	Jan 27, 2026
0.5.0	Jan 27, 2026
0.4.19	Oct 20, 2025
0.4.18	Oct 7, 2025
0.4.17	Sep 2, 2025
0.4.16	Aug 20, 2025
0.4.15	Aug 11, 2025
0.4.14	Aug 8, 2025
0.4.13	Aug 4, 2025
0.4.12	Jun 19, 2025
0.4.11	Jun 3, 2025
0.4.10	Jun 3, 2025
0.4.9	May 28, 2025
0.4.8	May 28, 2025
0.4.7	May 27, 2025
0.4.6	May 23, 2025
0.4.5	May 21, 2025
0.4.4	May 9, 2025
0.4.3	May 2, 2025
0.4.2	May 1, 2025
0.4.1	Apr 29, 2025
0.3.11	Mar 10, 2025
0.3.10	Jan 16, 2025
0.3.9	Jan 7, 2025
0.3.8	Dec 5, 2024
0.3.7	Dec 2, 2024 (this version)
0.3.6	Nov 11, 2024
0.3.5	Sep 25, 2024

Download files

Source Distribution: bonsai_classifications-0.3.7.tar.gz (2.9 MB), uploaded Dec 2, 2024
Built Distribution: bonsai_classifications-0.3.7-py3-none-any.whl (2.0 MB), uploaded Dec 2, 2024, Python 3

File details

Details for the file bonsai_classifications-0.3.7.tar.gz.

File metadata

Download URL: bonsai_classifications-0.3.7.tar.gz
Upload date: Dec 2, 2024
Size: 2.9 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.0.1 CPython/3.12.7

File hashes

Hashes for bonsai_classifications-0.3.7.tar.gz
Algorithm	Hash digest
SHA256	`bfba706c85b4089e888fc35419ced6212be6a6d6761dbf2eb78f7aecd9dda57f`
MD5	`0f19c36186b42df36e7066b98caa71cf`
BLAKE2b-256	`aebecbe88a8179f41a717dfe5cb64bd28da56ec114a841cad27d23951c32ff69`


File details

Details for the file bonsai_classifications-0.3.7-py3-none-any.whl.

File metadata

Download URL: bonsai_classifications-0.3.7-py3-none-any.whl
Upload date: Dec 2, 2024
Size: 2.0 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.0.1 CPython/3.12.7

File hashes

Hashes for bonsai_classifications-0.3.7-py3-none-any.whl
Algorithm	Hash digest
SHA256	`66d7235d4792b25029681ea624bdfb27560a968e4b49073fdf1ebd505a14ab40`
MD5	`9f38798e9e72a76f2600f9ce9bbc6b2f`
BLAKE2b-256	`3c994d67f25f878f625b9a69906ddcf7c44736d2b91d9f0ecf797c28eed75440`

