
ML models for materials property prediction

Project description

Research Find

Materials property prediction models and training scripts.

Overview

  • Collection of training and evaluation scripts for materials datasets (181600, 50000, 5000).

Quick start

  1. Create a Python virtual environment and activate it.
  2. Install dependencies: list the project's requirements in requirements.txt, then run pip install -r requirements.txt.
  3. Run training or evaluation scripts, e.g.:
python train_181600_models.py

Files of interest

  • train_181600_models.py: training script for the 181600 dataset.
  • models/: saved models (excluded from version control via .gitignore).
  • outputs/: predictions and reports (excluded from version control via .gitignore).

How to publish to GitHub

  1. Create a new repo on GitHub (choose a name like research-find).
  2. From this folder run:
git init
git add .
git commit -m "Initial commit"
git branch -M main
git remote add origin https://github.com/your-username/repo-name.git
git push -u origin main

Replace https://github.com/your-username/repo-name.git with the URL of your own repository.
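Because a push can be rejected when a tracked file exceeds GitHub's per-file size limit, it can help to look for oversized files before committing. The following is a minimal sketch, not part of this project: the function name find_large_files and the constant are hypothetical, and the scan does not read .gitignore, so files under models/ and outputs/ are reported even though they are excluded from commits.

```python
import os

# Assumption: GitHub's per-file limit, taken here as 100 MiB.
GITHUB_FILE_LIMIT_BYTES = 100 * 1024 * 1024


def find_large_files(root=".", limit_bytes=GITHUB_FILE_LIMIT_BYTES):
    """Return (path, size_in_bytes) pairs for files under root above limit_bytes.

    Results are sorted from largest to smallest. The .git directory is skipped.
    """
    large = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune Git's internal directory so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # ignore symlinks; their targets are scanned elsewhere
            size = os.path.getsize(path)
            if size > limit_bytes:
                large.append((path, size))
    return sorted(large, key=lambda item: item[1], reverse=True)
```

Calling find_large_files(".") from the project folder returns an empty list when nothing exceeds the limit. Any path it does return is a candidate for .gitignore or Git LFS.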

Model description

  • Purpose: Build accurate, generalizable machine-learning predictors for thermodynamic and mechanical materials properties to accelerate screening and discovery.
  • Data: Trained on curated datasets (Materials_Dataset_181600.csv, Materials_Dataset_50000.csv, Materials_Dataset_FIXED.csv).
  • Model types: Ensemble models (XGBoost / CatBoost / RandomForest-style) trained per-property with cross-validation and ensembling.
  • Inputs: Composition-based features and engineered descriptors produced by preprocessing scripts.
  • Outputs: Per-property CSV predictions under outputs/ and evaluation summaries (R², RMSE) stored in outputs/ and models/.
  • Usage: Create a Python environment, install dependencies, then run training and evaluation scripts such as python train_181600_models.py.
  • Notes: Large model files and outputs are excluded via .gitignore. If datasets or model artifacts exceed GitHub's per-file size limit (100 MB), enable Git LFS for those paths.
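For readers unfamiliar with the two metrics named under Outputs, the sketch below shows how R² and RMSE are conventionally computed from true and predicted values. It uses only the standard library and is an illustration of the definitions, not the project's evaluation code. The function names and the sample numbers are made up for the example.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error: sqrt(mean((y_true - y_pred)^2))."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


# Illustrative values only; these are not results from this project.
example_true = [1.0, 2.0, 3.0, 4.0]
example_pred = [1.0, 2.0, 3.0, 5.0]
example_rmse = rmse(example_true, example_pred)    # 0.5
example_r2 = r2_score(example_true, example_pred)  # approximately 0.8
```

RMSE is in the same units as the predicted property, so it is not comparable across properties. R² is unitless, which makes it the easier number to compare across per-property models.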

Download files

Source Distribution

thermoverse-0.1.0.tar.gz (30.6 MB view details)

Uploaded Source

Built Distribution

thermoverse-0.1.0-py3-none-any.whl (30.8 MB view details)

Uploaded Python 3

File details

Details for the file thermoverse-0.1.0.tar.gz.

File metadata

  • Download URL: thermoverse-0.1.0.tar.gz
  • Upload date:
  • Size: 30.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for thermoverse-0.1.0.tar.gz
Algorithm Hash digest
SHA256 ee0864a608f1bce5901a32f3e0d7287e5a609b7637c29f3bc21ef78370c39bac
MD5 491b39f47649d00c5fa9e2e8bcb014ce
BLAKE2b-256 32879dc1a14346f5cc0b277d009a12ad2cfba47e91b411e78157820fb64c6d3a
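A downloaded archive can be checked against the SHA256 digest listed above. The sketch below is one way to do that with Python's standard hashlib module. The helper names sha256_of_file and matches_expected are hypothetical, and the local file path is assumed to be wherever the archive was saved.

```python
import hashlib

# SHA256 digest listed above for thermoverse-0.1.0.tar.gz.
EXPECTED_SHA256 = "ee0864a608f1bce5901a32f3e0d7287e5a609b7637c29f3bc21ef78370c39bac"


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex=EXPECTED_SHA256):
    """Return True when the file's SHA256 digest equals expected_hex."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, matches_expected("thermoverse-0.1.0.tar.gz") returns True only if the file in the current directory is byte-for-byte identical to the published archive. To check the wheel, pass its own listed digest as expected_hex.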

File details

Details for the file thermoverse-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: thermoverse-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 30.8 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for thermoverse-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 3435526cc9de98e51c80c661f9dcaa07a9b0d37ef361ebe8e7a37581801d79b8
MD5 340f23582a99d0b973fb6290a1cdb0ab
BLAKE2b-256 45001d9fdfd593f0a364f1c9e99c477258d54374f816f1f8828aecc5ef4b3352
