Projection of High-Dimensional Data Using Multivariate Decision Trees and UMAP

TreeOrdination

Implementation of a wrapper which creates unsupervised projections using LANDMark and UMAP.

Install

From PyPI:

pip install TreeOrdination

From source:

git clone https://github.com/jrudar/TreeOrdination.git
cd TreeOrdination
# optional: create and activate a virtual environment before installing
python -m venv venv
source venv/bin/activate
pip install .
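
After either install route, it can be useful to confirm that the package is visible to the interpreter you intend to use. The snippet below is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of TreeOrdination, and the distribution name is assumed to match the name used in the `pip install` command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Assumption: the distribution is registered under the name used with pip.
print(installed_version("TreeOrdination"))
```

If this prints `None` inside a virtual environment, the package was most likely installed into a different interpreter.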

Example Usage

    from TreeOrdination import TreeOrdination
    from sklearn.datasets import make_classification

    # Create the dataset. n_features is set explicitly because the scikit-learn
    # default (20) cannot hold 20 informative features plus the 2 default redundant ones.
    X, y = make_classification(n_samples=200, n_features=30, n_informative=20)

    # Give each feature (column) a name
    f_names = ["Feature %s" % i for i in range(X.shape[1])]

    tree_ord = TreeOrdination(feature_names=f_names).fit(X, y)

    # This is the LANDMark embedding of the dataset. This embedding is used to train the supervised model ('supervised_clf' parameter)
    landmark_embedding = tree_ord.LM_emb

    # This is the UMAP projection of the LANDMark embedding
    umap_projection = tree_ord.UMAP_emb

    # This is the PCA projection of the UMAP embedding
    pca_projection = tree_ord.PCA_emb
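
For readers who want a feel for the "tree embedding, then low-dimensional projection" idea without installing anything beyond scikit-learn, the sketch below mimics the overall shape of the pipeline. It is not TreeOrdination's implementation: `RandomTreesEmbedding` is swapped in as a stand-in for LANDMark, and a single PCA step stands in for the UMAP and PCA projections, purely so that the example has no extra dependencies. The variable names are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomTreesEmbedding

# Synthetic data with the same shape as in the usage example above
X, y = make_classification(
    n_samples=200, n_features=30, n_informative=20, random_state=0
)

# Stand-in for the LANDMark embedding: each sample is encoded by the
# leaf it reaches in every tree (one-hot encoded, one leaf per tree).
tree_emb = (
    RandomTreesEmbedding(n_estimators=50, random_state=0)
    .fit_transform(X)
    .toarray()
)

# Stand-in for the UMAP and PCA steps: project the tree embedding to 2-D.
projection = PCA(n_components=2).fit_transform(tree_emb)

print(tree_emb.shape[0], projection.shape)
```

The resulting two-column array can be passed to any scatter-plot routine, in the same way as the projections exposed by a fitted `TreeOrdination` object.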

Notebooks and Other Examples

Coming soon. When available, examples of how to use TreeOrdination will be found here.

Interface

An overview of the API can be found here.

Contributing

To contribute to the development of TreeOrdination, please read our contributing guide.

References

Rudar, J., Porter, T.M., Wright, M., Golding, G.B., Hajibabaei, M. LANDMark: an ensemble approach to the supervised selection of biomarkers in high-throughput sequencing data. BMC Bioinformatics 23, 110 (2022). https://doi.org/10.1186/s12859-022-04631-z

Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research. 2011;12:2825–30.

Geurts P, Ernst D, Wehenkel L. Extremely Randomized Trees. Machine Learning. 2006;63(1):3–42.

Rudar, J., Golding, G.B., Kremer, S.C., Hajibabaei, M. (2023). Decision Tree Ensembles Utilizing Multivariate Splits Are Effective at Investigating Beta Diversity in Medically Relevant 16S Amplicon Sequencing Data. Microbiology Spectrum e02065-22.

Jai Ram Rideout, Greg Caporaso, Evan Bolyen, Daniel McDonald, Yoshiki Vázquez Baeza, Jorge Cañardo Alastuey, Anders Pitman, Jamie Morton, Qiyun Zhu, Jose Navas, Kestrel Gorlick, Justine Debelius, Zech Xu, Matt Aton, llcooljohn, Joshua Shorenstein, Laurent Luce, Will Van Treuren, John Chase, … Dr. K. D. Murray. (2025). scikit-bio/scikit-bio: scikit-bio 0.6.3 (0.6.3). Zenodo. https://doi.org/10.5281/zenodo.14640761
