A library implementing the Partial Least Squares Path Model algorithm
plspm
Please note: This is not an officially supported Google product.
plspm is a Python 3 package dedicated to Partial Least Squares Path Modeling (PLS-PM) analysis. It is a partial port of the R package plspm.
It currently calculates modes A (for reflective relationships) and B (for formative relationships) with metric and non-metric numerical data, using the centroid, factorial, or path scheme. The library does not yet perform bootstrapping or handle missing values in non-metric data.
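To make the difference between the schemes concrete, here is a minimal sketch of how the centroid and factorial schemes are usually defined in the PLS-PM literature: connected latent variables get an inner weight equal to the sign of their correlation (centroid) or to the correlation itself (factorial). The function `inner_weights` and the numbers are illustrative assumptions, not part of the plspm API and not necessarily how the package implements the schemes; the path scheme is left out for brevity.

```python
def inner_weights(path, corr, scheme):
    """Inner weights between connected latent variables (textbook definition).

    path   -- square 0/1 matrix; path[i][j] == 1 means LV j points to LV i
    corr   -- correlations between the current latent variable scores
    scheme -- "centroid" or "factorial"
    """
    n = len(path)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Two LVs are neighbours if a path links them in either direction.
            if not (path[i][j] or path[j][i]):
                continue
            r = corr[i][j]
            if scheme == "centroid":
                weights[i][j] = float((r > 0) - (r < 0))  # sign of the correlation
            elif scheme == "factorial":
                weights[i][j] = r  # the correlation itself
            else:
                raise ValueError(f"unsupported scheme: {scheme}")
    return weights


# Made-up example: the third LV depends on the first two, which are not linked.
path = [[0, 0, 0],
        [0, 0, 0],
        [1, 1, 0]]
corr = [[1.0, 0.2, -0.3],
        [0.2, 1.0, 0.6],
        [-0.3, 0.6, 1.0]]

centroid = inner_weights(path, corr, "centroid")
factorial = inner_weights(path, corr, "factorial")
```

The centroid scheme discards the strength of each correlation and keeps only its direction, while the factorial scheme keeps both.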
Installation
You can install the latest version of this package using pip:
python3 -m pip install --user plspm
It is hosted on PyPI: https://pypi.org/project/plspm/
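To confirm that the install succeeded, a quick check from Python can help. This is a minimal sketch that uses only the standard library (`importlib.metadata`, available from Python 3.8); the helper name `installed_version` is illustrative and not part of plspm.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("plspm")
    if version is None:
        print("plspm is not installed")
    else:
        print(f"plspm {version} is installed")
```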
Examples
PLS-PM with metric data
Typical example with a Customer Satisfaction Model
#!/usr/bin/env python3
import pandas as pd, plspm.config as c
from plspm.plspm import Plspm
from plspm.scheme import Scheme
from plspm.mode import Mode
satisfaction = pd.read_csv("file:tests/data/satisfaction.csv", index_col=0)
lvs = ["IMAG", "EXPE", "QUAL", "VAL", "SAT", "LOY"]
sat_path_matrix = pd.DataFrame(
    [[0, 0, 0, 0, 0, 0],
     [1, 0, 0, 0, 0, 0],
     [0, 1, 0, 0, 0, 0],
     [0, 1, 1, 0, 0, 0],
     [1, 1, 1, 1, 0, 0],
     [1, 0, 0, 0, 1, 0]],
    index=lvs, columns=lvs)
config = c.Config(sat_path_matrix, scaled=False)
config.add_lv_with_columns_named("IMAG", Mode.A, satisfaction, "imag")
config.add_lv_with_columns_named("EXPE", Mode.A, satisfaction, "expe")
config.add_lv_with_columns_named("QUAL", Mode.A, satisfaction, "qual")
config.add_lv_with_columns_named("VAL", Mode.A, satisfaction, "val")
config.add_lv_with_columns_named("SAT", Mode.A, satisfaction, "sat")
config.add_lv_with_columns_named("LOY", Mode.A, satisfaction, "loy")
plspm_calc = Plspm(satisfaction, config, Scheme.CENTROID)
print(plspm_calc.inner_summary())
print(plspm_calc.path_coefficients())
This will produce the output:
            type  r_squared  block_communality  mean_redundancy       ave
EXPE  Endogenous   0.335194           0.616420         0.206620  0.616420
IMAG   Exogenous   0.000000           0.582269         0.000000  0.582269
LOY   Endogenous   0.509923           0.639052         0.325867  0.639052
QUAL  Endogenous   0.719688           0.658572         0.473966  0.658572
SAT   Endogenous   0.707321           0.758891         0.536779  0.758891
VAL   Endogenous   0.590084           0.664416         0.392061  0.664416

          IMAG      EXPE      QUAL       VAL       SAT  LOY
IMAG  0.000000  0.000000  0.000000  0.000000  0.000000    0
EXPE  0.578959  0.000000  0.000000  0.000000  0.000000    0
QUAL  0.000000  0.848344  0.000000  0.000000  0.000000    0
VAL   0.000000  0.105478  0.676656  0.000000  0.000000    0
SAT   0.200724 -0.002754  0.122145  0.589331  0.000000    0
LOY   0.275150  0.000000  0.000000  0.000000  0.495479    0
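The path matrix in this example reads row by row: a 1 in row i, column j means that the column's latent variable points to the row's latent variable, which matches the nonzero entries in the path coefficients. A latent variable whose row is all zeros has no predecessors, which is consistent with IMAG being the only one reported as Exogenous. The sketch below derives both facts from the matrix using plain Python lists, so it runs without pandas; `lv_types` and `paths` are illustrative helpers, not part of plspm.

```python
lvs = ["IMAG", "EXPE", "QUAL", "VAL", "SAT", "LOY"]
rows = [
    [0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
]


def lv_types(lvs, rows):
    """Label each latent variable by whether its row contains any incoming path."""
    return {lv: "Endogenous" if any(row) else "Exogenous"
            for lv, row in zip(lvs, rows)}


def paths(lvs, rows):
    """List every (source, target) pair encoded in the matrix."""
    return [(lvs[j], lvs[i])
            for i, row in enumerate(rows)
            for j, flag in enumerate(row) if flag]


if __name__ == "__main__":
    for lv, kind in lv_types(lvs, rows).items():
        print(lv, kind)
    for source, target in paths(lvs, rows):
        print(source, "->", target)
```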
PLS-PM with nonmetric data
Example with the classic Russett data (original data set)
#!/usr/bin/env python3
import pandas as pd, plspm.config as c
from plspm.plspm import Plspm
from plspm.scale import Scale
from plspm.scheme import Scheme
from plspm.mode import Mode
russa = pd.read_csv("file:tests/data/russa.csv", index_col=0)
lvs = ["AGRI", "IND", "POLINS"]
rus_path = pd.DataFrame(
    [[0, 0, 0],
     [0, 0, 0],
     [1, 1, 0]],
    index=lvs,
    columns=lvs)
config = c.Config(rus_path, default_scale=Scale.NUM)
config.add_lv("AGRI", Mode.A, c.MV("gini"), c.MV("farm"), c.MV("rent"))
config.add_lv("IND", Mode.A, c.MV("gnpr"), c.MV("labo"))
config.add_lv("POLINS", Mode.A, c.MV("ecks"), c.MV("death"), c.MV("demo"), c.MV("inst"))
plspm_calc = Plspm(russa, config, Scheme.CENTROID, 100, 0.0000001)
print(plspm_calc.inner_summary())
print(plspm_calc.effects())
This will produce the output:
              type  r_squared  block_communality  mean_redundancy       ave
AGRI     Exogenous   0.000000           0.739560         0.000000  0.739560
IND      Exogenous   0.000000           0.907524         0.000000  0.907524
POLINS  Endogenous   0.592258           0.565175         0.334729  0.565175

   from      to    direct  indirect     total
0  AGRI  POLINS  0.225639       0.0  0.225639
1   IND  POLINS  0.671457       0.0  0.671457
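In the Russett model every path goes straight to POLINS, so the indirect effects are zero and the total equals the direct effect. In a model with chains of paths, indirect effects are conventionally obtained by multiplying coefficients along each route and summing over routes, which amounts to adding up powers of the path coefficient matrix. The sketch below applies that standard decomposition to the path coefficients printed in the Customer Satisfaction example; `decompose_effects` is a hypothetical helper written for illustration and may differ from how plspm computes `effects()` internally.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def decompose_effects(lvs, coef):
    """Map (source, target) to (direct, indirect, total).

    coef[i][j] is the path coefficient from lvs[j] to lvs[i].
    """
    n = len(lvs)
    indirect = [[0.0] * n for _ in range(n)]
    power = coef
    # Accumulate coef^2 .. coef^n; for an acyclic model the higher powers are zero.
    for _ in range(n - 1):
        power = matmul(power, coef)
        for i in range(n):
            for j in range(n):
                indirect[i][j] += power[i][j]
    effects = {}
    for i in range(n):
        for j in range(n):
            if coef[i][j] != 0 or indirect[i][j] != 0:
                effects[(lvs[j], lvs[i])] = (
                    coef[i][j], indirect[i][j], coef[i][j] + indirect[i][j])
    return effects


lvs = ["IMAG", "EXPE", "QUAL", "VAL", "SAT", "LOY"]
# Path coefficients copied from the Customer Satisfaction output.
coef = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.578959, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.848344, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.105478, 0.676656, 0.0, 0.0, 0.0],
    [0.200724, -0.002754, 0.122145, 0.589331, 0.0, 0.0],
    [0.275150, 0.0, 0.0, 0.0, 0.495479, 0.0],
]

if __name__ == "__main__":
    for (source, target), (direct, indirect, total) in decompose_effects(lvs, coef).items():
        print(f"{source} -> {target}: direct={direct:.6f} "
              f"indirect={indirect:.6f} total={total:.6f}")
```

Here IMAG has no direct path to QUAL, so its whole effect on QUAL is indirect and runs through EXPE.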
Example 2
PLS-PM using the russa data set with different scaling
#!/usr/bin/env python3
import pandas as pd, plspm.config as c
from plspm.plspm import Plspm
from plspm.scale import Scale
from plspm.scheme import Scheme
from plspm.mode import Mode
def russa_path_matrix():
    lvs = ["AGRI", "IND", "POLINS"]
    return pd.DataFrame(
        [[0, 0, 0],
         [0, 0, 0],
         [1, 1, 0]],
        index=lvs, columns=lvs)
russa = pd.read_csv("file:tests/data/russa.csv", index_col=0)
config = c.Config(russa_path_matrix(), default_scale=Scale.NUM)
config.add_lv("AGRI", Mode.A, c.MV("gini"), c.MV("farm"), c.MV("rent"))
config.add_lv("IND", Mode.A, c.MV("gnpr", Scale.ORD), c.MV("labo", Scale.ORD))
config.add_lv("POLINS", Mode.A, c.MV("ecks"), c.MV("death"), c.MV("demo", Scale.NOM), c.MV("inst"))
plspm_calc = Plspm(russa, config, Scheme.CENTROID, 100, 0.0000001)
Maintainers
Jez Humble (humble at google.com)
Nicole Forsgren (nicolefv at google.com)