
Transformers at zeta scales

Project description

Minerva: Unleashing the Secrets of Advanced Mathematics 🏛️🔢

Minerva is a language model that pushes the boundaries of mathematical understanding and problem solving. Designed around an advanced-mathematics theme, Minerva embodies the spirit of renowned mathematicians such as Euclid, Pythagoras, and Archimedes, and aims to offer strong capabilities for mathematical reasoning and exploration.


Install

pip install minerva-torch
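
If the installation succeeds, the package should be importable under the name minerva, which the usage example below assumes:

python -c "import minerva"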

Usage

import torch
from minerva import Minerva, Train

# Create a batch of random token IDs (batch size 1, sequence length 1024)
x = torch.randint(0, 20000, (1, 1024))

# Instantiate the model with its default configuration and run a forward pass
model = Minerva()
output = model(x)

# Or launch training
Train()

Training

To train Minerva, follow these steps:

  1. Configure the training settings by setting the environment variables:

    • ENTITY_NAME: your Weights & Biases (wandb) entity or project name
    • OUTPUT_DIR: the directory where model weights are saved (e.g., ./weights)
  2. Launch the training process with Hugging Face Accelerate (which can be configured to use DeepSpeed), as shown below:

accelerate config
accelerate launch train_distributed_accelerate.py
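
Putting the steps together, a full launch might look like the following sketch (both environment variable values are placeholders; substitute your own wandb name and output path):

# Placeholder values; substitute your own
export ENTITY_NAME="my-wandb-project"
export OUTPUT_DIR="./weights"

# One-time interactive setup, then launch the distributed run
accelerate config
accelerate launch train_distributed_accelerate.py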

Dataset Building

To build a custom dataset for Minerva, preprocess your data with the build_dataset.py script. The script handles pre-tokenization, data chunking, and uploading to the Hugging Face Hub.
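
An invocation might look like the following sketch. The flag names are illustrative assumptions rather than the script's documented interface, so check python build_dataset.py --help for the actual options:

# All flags below are hypothetical, shown for illustration only
python build_dataset.py \
    --tokenizer "EleutherAI/gpt-neox-20b" \
    --seq_len 1024 \
    --hf_repo "your-username/minerva-math-dataset"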

Dataset Description
  • Mathematical Web Pages: web pages containing mathematical expressions in MathJax format, cleaned to preserve math notation
  • arXiv: 2 million arXiv papers up to February 2021, in LaTeX format
  • General Natural Language Data: the same dataset used to pretrain the PaLM models

The mathematical web pages and arXiv datasets focus on technical and mathematical content, while the general natural language data provides broad coverage of everyday language.

According to the paper, the mathematical web pages and the arXiv papers each account for 47.5% of the total data; the remaining 5% is general natural language data, a subset of the corpus used for PaLM pretraining.

Roadmap 🗺️📍

  • Create a dataset of arXiv papers


Download files

Download the file for your platform.

Source Distribution

minerva_torch-0.0.1.tar.gz (15.4 kB)


Built Distribution


minerva_torch-0.0.1-py3-none-any.whl (15.1 kB)


File details

Details for the file minerva_torch-0.0.1.tar.gz.

File metadata

  • Download URL: minerva_torch-0.0.1.tar.gz
  • Upload date:
  • Size: 15.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.3.2 CPython/3.11.0 Darwin/22.4.0

File hashes

Hashes for minerva_torch-0.0.1.tar.gz:

  • SHA256: 27941333b7b0bcc1292cdf86864c4d192d52d1f498d6716c45206f8388ce6155
  • MD5: 3e46e888f2284947c7a444d7b413439d
  • BLAKE2b-256: 0285b2c35adb9f9147c40fa1dda5c9eb4cf511d441a1f659564ae3b65018a2a3

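To check a download against these digests, one approach on a system with GNU coreutils is the sketch below (pip fetches only the source distribution into the current directory):

# Download just the sdist, without dependencies
pip download minerva-torch==0.0.1 --no-deps --no-binary :all:

# Compare the file against the SHA256 digest listed above
echo "27941333b7b0bcc1292cdf86864c4d192d52d1f498d6716c45206f8388ce6155  minerva_torch-0.0.1.tar.gz" | sha256sum --check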

File details

Details for the file minerva_torch-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: minerva_torch-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 15.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.3.2 CPython/3.11.0 Darwin/22.4.0

File hashes

Hashes for minerva_torch-0.0.1-py3-none-any.whl:

  • SHA256: 5f9e67cc4c424c93ef06bf4da12b76d0fa210f308b31e6c2391096a7baa762d0
  • MD5: fdc3c23fa9d0680ff606c0ef2360be72
  • BLAKE2b-256: f33d3470d7082580683ebf6c12798d910105f0d4db38558403b4d3f95ac35edf

