A Python package for creating, analyzing, and visualizing decision trees with expected value calculations

DTrees: Decision Tree Analyzer

A Python package for creating, analyzing, and visualizing decision trees.

In case of questions or ideas to improve the library, reach out to me at:

Forks are welcome, but reach out to me to discuss the updates.

Install

To install the library, run the following command (prefix it with ! when running inside a Jupyter notebook cell):

pip install dtrees-analyzer

For more information about versions, changes, etc., visit the project page on PyPI.

Usage

The decision tree itself takes a single, optional parameter:

utility_function: The utility function, if one is used. It should take a float value as input

For each node you need to add the following parameters:

node_id: The ID used to connect and identify nodes
node_name: The name of the node to describe it, shown in the graphs
value: The value of the node, applies only to terminal nodes

For each edge you need to add the following parameters:

from_node: The starting node
to_node: The node it should be connected to
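probability: The probability of following the edge, applies only to edges that leave a chance node:
- Chance node is connected to Chance node
- Chance node is connected to Terminal node
- Chance node is connected to Decision node (as with the D -> G edge in the land investment example)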
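
Since every probability-carrying edge leaves a chance node, a natural sanity check before analyzing a tree is that the probabilities out of each chance node sum to 1. The sketch below is a standalone illustration and not part of the library: the helper name invalid_chance_nodes and the (from_node, to_node, probability) tuple layout are assumptions made for this example, and whether DecisionTree performs a similar check internally is not stated here.

```python
import math
from collections import defaultdict


def invalid_chance_nodes(edges):
    """Return the from_node IDs whose outgoing probabilities do not sum to 1.

    `edges` is a list of (from_node, to_node, probability) tuples. This layout
    is an assumption for the sketch; probability is None for edges that leave
    a decision node.
    """
    totals = defaultdict(float)
    for from_node, _to_node, probability in edges:
        if probability is not None:
            totals[from_node] += probability
    # math.isclose avoids false alarms caused by floating-point rounding.
    return [node for node, total in totals.items() if not math.isclose(total, 1.0)]


edges = [
    ("I", "B", None),   # decision -> chance: no probability
    ("B", "PI", 0.6),   # chance -> terminal
    ("B", "PD", 0.4),   # chance -> terminal
]
print(invalid_chance_nodes(edges))  # → []
```

An empty list means every chance node in the edge list is consistent; any ID returned points at a node whose branches need to be revisited.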

To describe a decision tree with the library, follow these steps:

Defining the decision tree
If you want to use a utility function, pass it when you create the decision tree.

dt = DecisionTree()

# def any_utility_function(x: float):
#     return <Some utility function>
# dt = DecisionTree(utility_function=any_utility_function)

Adding decision nodes
The probabilities of the options are unknown here because they are choices we have to make. When expected values (EV) are calculated, a decision node takes the value of the child node with the highest expected value (or the highest utility value if a utility function is provided).

dt.add_decision_node("I", "Decision")

Adding chance nodes
These are intermediate nodes that lead to other nodes, each with a certain probability.

dt.add_chance_node("B", "Buy TSLA stocks")

Adding terminal nodes
These are the final nodes; each one carries the known value of its branch.

dt.add_terminal_node("PI", "The price increases", 1_000)

Adding edges between nodes
These are the connections between nodes. Edges that leave a chance node (towards another chance node or a terminal node) also carry a probability.

# Connecting nodes with probability [Chance node -> Chance node & Chance node -> Terminal node]
dt.add_edge("P", "PE", 0.3)

# Connecting nodes without probability [Decision node -> Any]
dt.add_edge("I", "B")
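
The three node types above imply a simple backward-induction ("rollback") rule: a terminal node returns its value, a chance node returns the probability-weighted sum of its children, and a decision node returns the best child. The following is a minimal sketch of that rule on plain dictionaries. It is not the library's implementation; the function name rollback, the "kind" field, the node IDs, and the payoffs are all hypothetical and chosen only for illustration.

```python
def rollback(node_id, nodes, edges, utility=None):
    """Return the expected value of node_id by backward induction."""
    node = nodes[node_id]
    if node["kind"] == "terminal":
        return float(node["value"])

    # Evaluate every child first, keeping the probability of each edge.
    children = [(to, p) for frm, to, p in edges if frm == node_id]
    child_values = [(rollback(to, nodes, edges, utility), p) for to, p in children]

    if node["kind"] == "chance":
        # Probability-weighted sum of the children.
        return sum(p * value for value, p in child_values)

    # Decision node: keep the child that scores best, by utility if one is given.
    score = utility if utility is not None else (lambda x: x)
    return max((value for value, _ in child_values), key=score)


# Hypothetical tree: decision R chooses between chance node C and terminal T.
nodes = {
    "R": {"kind": "decision"},
    "C": {"kind": "chance"},
    "T": {"kind": "terminal", "value": 50},
    "W": {"kind": "terminal", "value": 200},
    "L": {"kind": "terminal", "value": -60},
}
edges = [("R", "C", None), ("R", "T", None), ("C", "W", 0.5), ("C", "L", 0.5)]

print(rollback("R", nodes, edges))  # → 70.0
```

Here the chance node is worth 0.5 * 200 + 0.5 * (-60) = 70, which beats the sure payoff of 50, so the decision node takes the chance branch.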

Example

# Import the class and define the decision tree
from dtree import DecisionTree

dt = DecisionTree()

# Build decision tree
dt.add_decision_node("D", "Decision")
dt.add_chance_node("B", "Buy TSLA stocks")
dt.add_terminal_node("NB", "Don't buy TSLA stocks", 0)
dt.add_edge("D", "B")
dt.add_edge("D", "NB")

dt.add_terminal_node("PI", "The price increases", 1_000)
dt.add_terminal_node("PD", "The price decreases", -2_000)
dt.add_edge("B", "PI", 0.6)
dt.add_edge("B", "PD", 0.4)

dt.save_mermaid_graph("./images/example.png")

[Figure: Example]
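
As a quick check of what the analysis should yield for this small tree, the expected values can be worked out by hand from the probabilities and payoffs defined above:

$$
\mathrm{EV}(B) = 0.6 \cdot 1000 + 0.4 \cdot (-2000) = -200, \qquad \mathrm{EV}(D) = \max\{\mathrm{EV}(B),\ \mathrm{EV}(NB)\} = \max\{-200,\ 0\} = 0
$$

Under these numbers, not buying is the better option at the decision node.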

Comprehensive Example: Land Investment Decision

Newox is considering whether or not to drill on its own land in search of natural gas. If the company decides to drill, the cost is $40,000. If gas is found, Newox has two options: it can either sell the land to West Gas for $200,000 or develop the site itself. If no gas is found, there are no additional costs or revenues beyond the initial drilling cost.

The other option is to skip drilling entirely and sell the land as-is for $22,000.

At current natural gas prices, a producing well would be worth $150,000 on the open market. However, there's a chance gas prices could double, in which case the well would be worth $300,000.

Company engineers estimate a 30% chance of finding gas. Meanwhile, the company's economist believes there's a 60% chance that gas prices will double.

What decision should Newox make to maximize its expected profits?

This example demonstrates a more complex decision tree for a land investment scenario, both with and without a utility function.

Without Utility Function

from dtree import DecisionTree

# Create decision tree
dt = DecisionTree()

# Add nodes
dt.add_decision_node("I", "Decision")
dt.add_terminal_node("S", "Sell land", 22_000)
dt.add_chance_node("D", "Drill land")
dt.add_edge("I", "S")
dt.add_edge("I", "D")

dt.add_decision_node("G", "Gas found")
dt.add_terminal_node("NG", "No gas found", -40_000)
dt.add_edge("D", "G", 0.3)
dt.add_edge("D", "NG", 0.7)

dt.add_terminal_node("GS", "Sell land to West Gas", 200_000-40_000)
dt.add_chance_node("GD", "Develop the site")
dt.add_edge("G", "GD")
dt.add_edge("G", "GS")

dt.add_terminal_node("NM", "Normal market conditions", 150_000-40_000)
dt.add_terminal_node("GM", "Good market conditions", 300_000-40_000)
dt.add_edge("GD", "NM", 0.4)
dt.add_edge("GD", "GM", 0.6)

# Create graph
dt.save_mermaid_graph("./images/case_without_utility_func.png")

[Figure: Case without utility function]
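
To see where the expected values in this tree come from, the rollback can also be reproduced with plain arithmetic, independently of the library. The sketch below only uses the figures from the problem statement; the variable names are chosen for this illustration.

```python
DRILL_COST = 40_000

# Terminal payoffs, net of the drilling cost where drilling took place.
sell_as_is = 22_000
no_gas = -DRILL_COST
sell_to_west_gas = 200_000 - DRILL_COST
normal_market = 150_000 - DRILL_COST
good_market = 300_000 - DRILL_COST

# Chance node GD: probability-weighted average of the market outcomes.
ev_develop = 0.4 * normal_market + 0.6 * good_market
# Decision node G: the better of developing the site or selling to West Gas.
ev_gas_found = max(ev_develop, sell_to_west_gas)
# Chance node D: 30% chance of finding gas, 70% chance of finding none.
ev_drill = 0.3 * ev_gas_found + 0.7 * no_gas
# Decision node I: drill, or sell the land as-is.
ev_root = max(sell_as_is, ev_drill)

print(round(ev_develop), round(ev_gas_found), round(ev_drill), round(ev_root))
# → 200000 200000 32000 32000
```

Drilling has an expected value of 32,000 against 22,000 for selling the land as-is, so on expected profit alone Newox should drill and, if gas is found, develop the site.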

With Utility Function

The utility function is:

$u(x) = \sqrt[3]{x}$

import numpy as np
from dtree import DecisionTree

# Utility function
def utility(x):
    return np.cbrt(x)

# Create decision tree
dt = DecisionTree(utility_function=utility)

# Add nodes
dt.add_decision_node("I", "Decision")
dt.add_terminal_node("S", "Sell land", 22_000)
dt.add_chance_node("D", "Drill land")
dt.add_edge("I", "S")
dt.add_edge("I", "D")

dt.add_decision_node("G", "Gas found")
dt.add_terminal_node("NG", "No gas found", -40_000)
dt.add_edge("D", "G", 0.3)
dt.add_edge("D", "NG", 0.7)

dt.add_terminal_node("GS", "Sell land to West Gas", 200_000-40_000)
dt.add_chance_node("GD", "Develop the site")
dt.add_edge("G", "GD")
dt.add_edge("G", "GS")

dt.add_terminal_node("NM", "Normal market conditions", 150_000-40_000)
dt.add_terminal_node("GM", "Good market conditions", 300_000-40_000)
dt.add_edge("GD", "NM", 0.4)
dt.add_edge("GD", "GM", 0.6)
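
One practical benefit of np.cbrt in the utility above is that the tree contains a negative payoff (-40,000): in Python, raising a negative float to a fractional power yields a complex number rather than the real cube root. If NumPy is not available, a sign-preserving cube root can be sketched with the standard library alone. The function name cube_root_utility is an assumption for this illustration.

```python
import math


def cube_root_utility(x: float) -> float:
    """Real cube root that keeps the sign of x (stdlib stand-in for np.cbrt)."""
    return math.copysign(abs(x) ** (1 / 3), x)


print(type((-8.0) ** (1 / 3)).__name__)      # → complex
print(round(cube_root_utility(-40_000), 2))  # → -34.2
print(round(cube_root_utility(22_000), 2))   # → 28.02
```

Such a function could be passed as utility_function in the same way as utility in the example above.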

To analyze the decision tree you can:

Create a mermaid graph using save_mermaid_graph.
The thicker line marks the optimal path, which shows the best option at each decision node given the quantified information.

dt.save_mermaid_graph("./images/case_with_utility_func.png")

[Figure: Case with utility function]

Create a markdown representation of the mermaid graph using save_mermaid_diagram, so you can customize the graph with services such as Mermaid.live.

dt.save_mermaid_diagram("./images/case_with_utility_func.md")

# Output
# graph LR
#     classDef decision fill:#4e79a7,stroke:#2c5f85,stroke-width:3px,color:#ffffff,font-weight:bold,font-size:12px
#     classDef chance fill:#f28e2c,stroke:#d4751a,stroke-width:3px,color:#ffffff,font-weight:bold,font-size:12px
#     classDef terminal fill:#59a14f,stroke:#3f7a37,stroke-width:3px,color:#ffffff,font-weight:bold,font-size:12px
#     I["<b>Decision</b><br/>U: 31.75<br/>EV: 32,000.00"]
#     class I decision
#     S["<b>Sell land</b><br/>U: 28.02<br/>EV: 22,000.00"]
#     class S terminal
#     D(["<b>Drill land</b><br/>U: 31.75<br/>EV: 32,000.00"])
#     class D chance
#     G["<b>Gas found</b><br/>U: 58.48<br/>EV: 200,000.00"]
#     class G decision
#     NG["<b>No gas found</b><br/>U: -34.20<br/>EV: -40,000.00"]
#     class NG terminal
#     GS["<b>Sell land to West Gas</b><br/>U: 54.29<br/>EV: 160,000.00"]
#     class GS terminal
#     GD(["<b>Develop the site</b><br/>U: 58.48<br/>EV: 200,000.00"])
#     class GD chance
#     NM["<b>Normal market conditions</b><br/>U: 47.91<br/>EV: 110,000.00"]
#     class NM terminal
#     GM["<b>Good market conditions</b><br/>U: 63.83<br/>EV: 260,000.00"]
#     class GM terminal
#     I ==> S
#     I ==> D
#     D ==>|<b>30.0%</b>| G
#     D ==>|<b>70.0%</b>| NG
#     G ==> GD
#     G ==> GS
#     GD ==>|<b>40.0%</b>| NM
#     GD ==>|<b>60.0%</b>| GM
#     linkStyle default stroke:#666,stroke-width:2px
#     %%{init: {'theme':'base', 'themeVariables': {'primaryColor':'#ffffff', 'primaryTextColor':'#333333', 'primaryBorderColor':'#dddddd', 'lineColor':'#666666'}}}%%
#     linkStyle 1 stroke:#e15759,stroke-width:5px;
#     linkStyle 2 stroke:#e15759,stroke-width:5px;
#     linkStyle 4 stroke:#e15759,stroke-width:5px;
#     linkStyle 7 stroke:#e15759,stroke-width:5px;
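
The numbered linkStyle lines at the end of the diagram are what draw the thicker path: Mermaid numbers links from zero in the order in which they are defined. As a small, library-independent sketch, the highlighted edges can be recovered from the diagram text with two regular expressions. The variable names and patterns are assumptions made for this illustration.

```python
import re

# Edge and linkStyle lines copied from the diagram output above.
diagram = """\
    I ==> S
    I ==> D
    D ==>|<b>30.0%</b>| G
    D ==>|<b>70.0%</b>| NG
    G ==> GD
    G ==> GS
    GD ==>|<b>40.0%</b>| NM
    GD ==>|<b>60.0%</b>| GM
    linkStyle default stroke:#666,stroke-width:2px
    linkStyle 1 stroke:#e15759,stroke-width:5px;
    linkStyle 2 stroke:#e15759,stroke-width:5px;
    linkStyle 4 stroke:#e15759,stroke-width:5px;
    linkStyle 7 stroke:#e15759,stroke-width:5px;
"""

# Links in definition order; the optional |...| part is the probability label.
edges = re.findall(
    r"^[ \t]*(\w+)[ \t]*==>(?:\|[^|\n]*\|)?[ \t]*(\w+)[ \t]*$",
    diagram,
    re.MULTILINE,
)
# Only numbered linkStyle lines select individual links ("default" is skipped).
highlighted = [
    int(i) for i in re.findall(r"^[ \t]*linkStyle[ \t]+(\d+)\b", diagram, re.MULTILINE)
]
highlighted_edges = [edges[i] for i in highlighted]

print(highlighted_edges)
# → [('I', 'D'), ('D', 'G'), ('G', 'GD'), ('GD', 'GM')]
```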

The calculate_expected_values method allows you to get all the values as a dictionary.

# Calculate the expected values for all nodes
dt.calculate_expected_values()

# Output
# {
#     'I': {'expected_value': 32000.0, 'utility_value': 31.74802103936399},
#     'S': {'expected_value': 22000, 'utility_value': 28.02039330655387},
#     'D': {'expected_value': 32000.0, 'utility_value': 31.74802103936399},
#     'G': {'expected_value': 200000.0, 'utility_value': 58.480354764257314},
#     'NG': {'expected_value': -40000, 'utility_value': -34.19951893353394},
#     'GS': {'expected_value': 160000, 'utility_value': 54.28835233189813},
#     'GD': {'expected_value': 200000.0, 'utility_value': 58.480354764257314},
#     'NM': {'expected_value': 110000, 'utility_value': 47.91419857062784},
#     'GM': {'expected_value': 260000, 'utility_value': 63.82504298859907}
# }
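
Judging only from the numbers in this output (not from the library's source), the utility_value of each node appears to be the utility function applied to that node's expected value, rather than a probability-weighted average of the children's utilities. The sketch below checks this against the reported figures using a standard-library cube root; the helper and variable names are assumptions for this illustration.

```python
import math


def cbrt(x):
    """Sign-preserving real cube root (stdlib stand-in for np.cbrt)."""
    return math.copysign(abs(x) ** (1 / 3), x)


# (expected_value, utility_value) pairs copied from the output above.
reported = {
    "I": (32000.0, 31.74802103936399),
    "S": (22000, 28.02039330655387),
    "D": (32000.0, 31.74802103936399),
    "G": (200000.0, 58.480354764257314),
    "NG": (-40000, -34.19951893353394),
    "GS": (160000, 54.28835233189813),
    "GD": (200000.0, 58.480354764257314),
    "NM": (110000, 47.91419857062784),
    "GM": (260000, 63.82504298859907),
}

# Largest gap between u(expected_value) and the reported utility_value.
max_gap = max(abs(cbrt(ev) - u) for ev, u in reported.values())
print(max_gap < 1e-6)  # → True

# For comparison: probability-weighted utility of GD's two children.
expected_utility_gd = 0.4 * reported["NM"][1] + 0.6 * reported["GM"][1]
print(round(expected_utility_gd, 2), round(reported["GD"][1], 2))  # → 57.46 58.48
```

The two figures for GD differ (57.46 against 58.48), which is worth keeping in mind when interpreting utility_value for chance nodes.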

The get_optimal_path method returns the optimal path from a given starting node.

dt.get_optimal_path("I")

# Output
# ['I', 'D', 'G', 'GD', 'NM']

The get_children method shows you all the child nodes based on a node ID you provide.

dt.get_children("GD")

# Output
# [
#     ('NM', 0.4),
#     ('GM', 0.6)
# ]

Release history

0.1.4: Jul 6, 2025
0.1.3: Jul 4, 2025
0.1.2: Jul 4, 2025 (this version)
0.1.1: Jul 4, 2025
0.1.0: Jul 4, 2025

Download files

Source Distribution

dtrees_analyzer-0.1.2.tar.gz (1.0 MB)

Uploaded Jul 4, 2025 Source

Built Distribution

dtrees_analyzer-0.1.2-py3-none-any.whl (11.0 kB)

Uploaded Jul 4, 2025 Python 3

File details

Details for the file dtrees_analyzer-0.1.2.tar.gz.

File metadata

Download URL: dtrees_analyzer-0.1.2.tar.gz
Upload date: Jul 4, 2025
Size: 1.0 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.11

File hashes

Hashes for dtrees_analyzer-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`399017bc4e1ab62d4694bc4b6734edf5e3ff110858b16d9fcdaadc40db9b5b78`
MD5	`52438861b617753c9441bb638c66e680`
BLAKE2b-256	`88cd7c3fe8755c4a8150cb0e56e6a7306b6bb395da0ec186d52c473697b8da42`

File details

Details for the file dtrees_analyzer-0.1.2-py3-none-any.whl.

File metadata

Download URL: dtrees_analyzer-0.1.2-py3-none-any.whl
Upload date: Jul 4, 2025
Size: 11.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.11

File hashes

Hashes for dtrees_analyzer-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`30f5ec857df4e83c4bafa4b0acd1cc865812090e57e9642e82d62e0c081fc3f7`
MD5	`ca8535feea9c8930946f2ed21e669487`
BLAKE2b-256	`1e03acf11b29d7d097a19326e024156fdac6a9a30124cfaba8fbd72ec0dcd44e`
