Projection of High-Dimensional Data Using Multivariate Decision Trees and UMAP

TreeOrdination

Implementation of a wrapper which creates unsupervised projections using LANDMark and UMAP.

Install Dependencies

The LANDMark package is needed for TreeOrdination to work. It is available at: https://github.com/jrudar/LANDMark

Install

From PyPI:

pip install TreeOrdination

From source:

git clone https://github.com/jrudar/TreeOrdination.git
cd TreeOrdination
pip install .
# or, to install into a virtual environment instead:
python -m venv venv
source venv/bin/activate
pip install .
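
After installing, it can help to confirm that the packages are visible to the interpreter before running anything heavier. The following is a minimal sketch using only the standard library. The helper `missing_modules` is hypothetical (it is not part of TreeOrdination), and the import names in `required` are assumptions: `LANDMark` and `TreeOrdination` follow the package names used in this README, while `umap` and `sklearn` are the usual import names for UMAP and scikit-learn.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the names from `names` that cannot be located for import."""
    missing = []
    for name in names:
        try:
            found = find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or partially initialised modules
            found = False
        if not found:
            missing.append(name)
    return missing


# Assumed import names; adjust if your installation exposes different ones.
required = ["LANDMark", "TreeOrdination", "umap", "sklearn"]

missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found.")
```

The check only looks the modules up and does not import them, so it is quick and has no side effects.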

Example Usage

    from TreeOrdination import TreeOrdination
    from sklearn.datasets import make_classification

    # Create the dataset. n_features is set explicitly because the scikit-learn
    # default (20) cannot hold 20 informative plus the default 2 redundant
    # features; the value 30 is an arbitrary choice.
    X, y = make_classification(n_samples=200, n_features=30, n_informative=20)

    # Give each feature (column) a name
    f_names = ["Feature %d" % i for i in range(X.shape[1])]

    tree_ord = TreeOrdination(feature_names=f_names).fit(X, y)

    # This is the LANDMark embedding of the dataset. It is used to train
    # the supervised model (the 'supervised_clf' parameter).
    landmark_embedding = tree_ord.R_final

    # This is the UMAP projection of the LANDMark embedding
    umap_projection = tree_ord.tree_emb

    # This is the PCA projection of the UMAP embedding
    pca_projection = tree_ord.R_PCA_emb
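
The list passed as `feature_names` presumably needs one entry per feature, that is, one per column of `X` rather than one per sample. The following is a minimal sketch of that bookkeeping in plain Python, with a nested list standing in for the data matrix. The helper `make_feature_names` is hypothetical and not part of TreeOrdination.

```python
def make_feature_names(X, prefix="Feature"):
    """Build one name per column of a row-major table X (rows are samples)."""
    if not X:
        raise ValueError("X must contain at least one sample")
    n_features = len(X[0])
    if any(len(row) != n_features for row in X):
        raise ValueError("all rows must have the same number of columns")
    return [f"{prefix} {i}" for i in range(n_features)]


# Toy table: 3 samples (rows) by 2 features (columns).
X_toy = [[0.1, 1.2], [0.4, 0.9], [0.7, 0.3]]
names = make_feature_names(X_toy)
print(names)  # ['Feature 0', 'Feature 1']
```

With a NumPy array, the same column count is available as `X.shape[1]`.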

Notebooks and Other Examples

Coming soon. When available, examples of how to use TreeOrdination will be found here.

Interface

An overview of the API can be found here.

Contributing

To contribute to the development of TreeOrdination, please read our contributing guide.

References

Rudar, J., Porter, T.M., Wright, M., Golding G.B., Hajibabaei, M. LANDMark: an ensemble approach to the supervised selection of biomarkers in high-throughput sequencing data. BMC Bioinformatics 23, 110 (2022). https://doi.org/10.1186/s12859-022-04631-z

Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research. 2011;12:2825–30.

Geurts P, Ernst D, Wehenkel L. Extremely Randomized Trees. Machine Learning. 2006;63(1):3–42.

Rudar, J., Golding, G.B., Kremer, S.C., Hajibabaei, M. (2023). Decision Tree Ensembles Utilizing Multivariate Splits Are Effective at Investigating Beta Diversity in Medically Relevant 16S Amplicon Sequencing Data. Microbiology Spectrum e02065-22.

Klaise J., Van Looveren A., Vacanti G., and Coca A. Alibi Explain: Algorithms for Explaining Machine Learning Models. Journal of Machine Learning Research. 2021;22(181):1-7.
