A collection of tools for low-resource indie machine learning development

Project description

A collection of machine learning tools for low-resource research and experiments

Note: THIS LIBRARY IS AN UNFINISHED WORK IN PROGRESS

Description

pip install ml-indie-tools

This module contains a collection of tools for researchers who have limited access to compute resources and who switch between laptops, Colab instances, and local workstations with a graphics card.

env_tools checks the current environment and populates a number of flags that identify the run-time environment and the available accelerator hardware. For Colab instances, it provides tools to mount Google Drive for persistent data and model storage.

The usage scenarios are:

Env	Tensorflow TPU	Tensorflow GPU	Pytorch TPU	Pytorch GPU	Jax TPU	Jax GPU
Colab	x	x	/	x	x	x
Workstation with Nvidia	/	x	/	x	/	x
Apple Silicon	/	x	/	/	/	/
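The table above can also be written down as data, which makes it easy to query from a script. The following is a minimal sketch and not part of ml_indie_tools; it assumes that "x" marks a supported combination and "/" an unsupported one, and the names COLUMNS, ROWS and supported_targets are made up for this illustration.

```python
# Columns and rows transcribed from the usage-scenario table.
# Assumption: "x" = supported, "/" = not supported.
COLUMNS = [
    "Tensorflow TPU", "Tensorflow GPU",
    "Pytorch TPU", "Pytorch GPU",
    "Jax TPU", "Jax GPU",
]
ROWS = {
    "Colab": "xx/xxx",
    "Workstation with Nvidia": "/x/x/x",
    "Apple Silicon": "/x////",
}


def supported_targets(env):
    """Return the platform/accelerator columns marked 'x' for the given environment."""
    return [col for col, mark in zip(COLUMNS, ROWS[env]) if mark == "x"]


print(supported_targets("Apple Silicon"))  # ['Tensorflow GPU']
```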

Gutenberg_Dataset and Text_Dataset are NLP libraries that provide text data and can be used in conjunction with Huggingface Datasets or directly with ML libraries.

ALU_Dataset is a toy-dataset that allows training of integer arithmetic and logical (ALU) operations.

env_tools

A collection of tools for moving machine learning projects between local hardware and Colab instances.

Examples

Local laptop:

from ml_indie_tools.env_tools import MLEnv
ml_env = MLEnv(platform='tf', accelerator='fastest')
ml_env.describe()  # -> 'OS: Darwin, Python: 3.9.9 (Conda) Tensorflow: 2.7.0, GPU: METAL'
ml_env.is_gpu   # -> True
ml_env.is_tensorflow  # -> True
ml_env.gpu_type  # -> 'METAL'

Colab instance:

# !pip install -U ml_indie_tools
from ml_indie_tools.env_tools import MLEnv
ml_env = MLEnv(platform='tf', accelerator='fastest')
print(ml_env.describe())
print(ml_env.gpu_type)

Output:

DEBUG:MLEnv:Tensorflow version: 2.7.0
DEBUG:MLEnv:GPU available
DEBUG:MLEnv:You are on a Jupyter instance.
DEBUG:MLEnv:You are on a Colab instance.
INFO:MLEnv:OS: Linux, Python: 3.7.12, Colab Jupyter Notebook Tensorflow: 2.7.0, GPU: Tesla K80
The tensorboard extension is already loaded. To reload it, use:
  %reload_ext tensorboard
OS: Linux, Python: 3.7.12, Colab Jupyter Notebook Tensorflow: 2.7.0, GPU: Tesla K80
Tesla K80
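The log output above shows the kind of facts being collected: operating system, Python version, and whether the code runs in a Jupyter or Colab session. The sketch below shows how similar checks can be done with the standard library alone. It is an illustration under stated assumptions, not the implementation of MLEnv: the function name describe_runtime and the dictionary keys are made up, and the sys.modules checks are common heuristics rather than guarantees.

```python
import importlib.util
import platform
import sys


def describe_runtime():
    """Collect a few facts about the current runtime using only the standard library."""
    return {
        "os": platform.system(),              # e.g. 'Darwin' or 'Linux'
        "python": platform.python_version(),  # e.g. '3.9.9'
        # Heuristic: Colab sessions have the google.colab package loaded.
        "is_colab": "google.colab" in sys.modules,
        # Heuristic: a Jupyter kernel has ipykernel loaded, a plain interpreter usually not.
        "is_notebook": "ipykernel" in sys.modules,
        # find_spec reports whether a package is installed without importing it.
        "has_tensorflow": importlib.util.find_spec("tensorflow") is not None,
    }


info = describe_runtime()
print(f"OS: {info['os']}, Python: {info['python']}, Colab: {info['is_colab']}")
```

Checking for an installed package with find_spec avoids the cost of importing a large framework just to learn whether it is available.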

Project paths

ml_env.init_paths('my_project', 'my_model') returns a tuple of paths that are adapted for local or Colab usage.

Local project:

ml_env.init_paths("my_project", "my_model")  # -> ('.', '.', './model/my_model', './data', './logs')

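The result contains, in order: the root path and the project path (both are the current directory for local projects), the model path for saving the model and weights, the data path for training data, and the log path for logs.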

Those paths (with the exception of ./logs) are moved to Google Drive for Colab instances:

On Google Colab:

# INFO:MLEnv:You will now be asked to authenticate Google Drive access in order to store training data (cache) and model state.
# INFO:MLEnv:Changes will only happen within Google Drive directory `My Drive/Colab Notebooks/<project-name>`.
# DEBUG:MLEnv:Root path: /content/drive/My Drive
# Mounted at /content/drive
('/content/drive/My Drive',
 '/content/drive/My Drive/Colab Notebooks/my_project',
 '/content/drive/My Drive/Colab Notebooks/my_project/model/my_model',
 '/content/drive/My Drive/Colab Notebooks/my_project/data',
 './logs')
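The two outputs above follow a simple pattern, which the sketch below reproduces. It is only an illustration of the path layout and not the body of MLEnv.init_paths: the function name sketch_init_paths and the parameters on_colab and drive_root are hypothetical, and no Google Drive mounting takes place.

```python
import posixpath


def sketch_init_paths(project_name, model_name, on_colab=False,
                      drive_root="/content/drive/My Drive"):
    """Return (root, project, model, data, logs) paths matching the documented layout."""
    if on_colab:
        # On Colab, everything except the logs lives below the mounted Google Drive.
        root_path = drive_root
        project_path = posixpath.join(root_path, "Colab Notebooks", project_name)
    else:
        # Locally, root and project path are both the current directory.
        root_path = "."
        project_path = "."
    model_path = posixpath.join(project_path, "model", model_name)
    data_path = posixpath.join(project_path, "data")
    log_path = "./logs"  # logs stay on the local file system in both cases
    return root_path, project_path, model_path, data_path, log_path


print(sketch_init_paths("my_project", "my_model"))
# ('.', '.', './model/my_model', './data', './logs')
```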

See the env_tools API documentation for details.

Gutenberg_Dataset

Gutenberg_Dataset makes books from Project Gutenberg available as a dataset.

This module can either work with a local mirror of Project Gutenberg, or download files on demand. Files that are downloaded are cached to prevent unnecessary load on Gutenberg's servers.
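The download-then-cache behaviour described above can be sketched in a few lines. This is a generic illustration, not the caching code of Gutenberg_Dataset: the function cached_fetch, its parameters, and the hashed file names are assumptions, and the demo uses a stand-in fetch function so that no network access is needed.

```python
import hashlib
import tempfile
from pathlib import Path


def cached_fetch(url, cache_dir, fetch):
    """Return the content for url, calling fetch(url) only on a cache miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so that any URL maps to a safe, fixed-length file name.
    name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".txt"
    cache_file = cache_dir / name
    if cache_file.exists():
        return cache_file.read_text(encoding="utf-8")
    text = fetch(url)
    cache_file.write_text(text, encoding="utf-8")
    return text


# Demo with a stand-in for the real download (hypothetical URL, no network).
calls = []


def fake_fetch(url):
    calls.append(url)
    return "book text for " + url


with tempfile.TemporaryDirectory() as tmp:
    first = cached_fetch("https://example.org/book.txt", tmp, fake_fetch)
    second = cached_fetch("https://example.org/book.txt", tmp, fake_fetch)

print(first == second, len(calls))  # True 1
```

In real use, the fetch argument could wrap a download call such as urllib.request.urlopen; the second request for the same URL is then served from disk and causes no further server load.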

Working with a local mirror of Project Gutenberg

If you plan to use a lot of files (hundreds or more) from Gutenberg, a local mirror might be the best solution. Have a look at Project Gutenberg's notes on mirrors.

A mirror image suitable for this project can be made with:

rsync -zarv --dry-run --prune-empty-dirs --del --include="*/" --include='*.'{txt,pdf,ALL} --exclude="*" aleph.gutenberg.org::gutenberg ./gutenberg_mirror

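It's not mandatory to include pdf-files, since they are currently not used. Note that the command above contains the --dry-run flag, so rsync only lists what it would transfer; remove the flag to actually create the mirror.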

Once the mirror contains at least all of Gutenberg's *.txt files and the index file GUTINDEX.ALL, it can be used via:

from ml_indie_tools.Gutenberg_Dataset import Gutenberg_Dataset
gd = Gutenberg_Dataset(root_url='./gutenberg_mirror')  # Assuming this is the file-path to the mirror image

Working without a local mirror

from ml_indie_tools.Gutenberg_Dataset import Gutenberg_Dataset
gd = Gutenberg_Dataset()  # The default Gutenberg site is used. Alternatively, specify a mirror with `root_url=http://...`.

Getting Gutenberg books

After using one of the two methods to instantiate the gd object:

gd.load_index()  # load the index of books

Then search the index for books. The result, search_result, is a list of dictionaries, one per book, containing metadata but not yet the actual book text.

search_result = gd.search({'author': ['kant', 'goethe'], 'language': ['german', 'english']})

Insert the actual book text into the dictionaries. Note that the number of downloads is limited when using a remote server.

search_result = gd.insert_book_texts(search_result)
# search_result entries now contain an additional field `text` with the filtered text of the book.
import pandas as pd
df = pd.DataFrame(search_result)  # Display results as Pandas DataFrame
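Because the search result is a plain list of dictionaries, it can also be processed without Pandas. The sketch below joins the book texts into one training corpus. The records are made up for illustration: only the keys 'author', 'language' and 'text' appear in the description above, while the 'title' key, the sample values, and the helper build_corpus are assumptions.

```python
# Made-up records standing in for search_result after the texts were inserted.
sample_result = [
    {"title": "Book A", "author": "Author One", "language": "german", "text": "eins zwei drei"},
    {"title": "Book B", "author": "Author Two", "language": "english", "text": "one two"},
    {"title": "Book C", "author": "Author Two", "language": "english"},  # no text available
]


def build_corpus(records, language=None):
    """Join the texts of all records that have a 'text' field, optionally filtered by language."""
    texts = [
        rec["text"]
        for rec in records
        if "text" in rec and (language is None or rec.get("language") == language)
    ]
    return "\n".join(texts)


corpus = build_corpus(sample_result, language="english")
print(len(corpus.split()))  # 2
```

Skipping records without a 'text' field keeps the helper usable on results where some downloads did not succeed.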

See the Gutenberg_Dataset API documentation for details.

Text_Dataset

See the Text_Dataset API documentation for details.

ALU_Dataset

See the ALU_Dataset API documentation for details.
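To give an idea of what a toy dataset for integer arithmetic and logical operations can look like, here is an independent sketch. It does not use or reproduce the ALU_Dataset API: the operation set, the bit encoding (least significant bit first, results wrapped to a fixed width), and the names OPS, to_bits and make_samples are all assumptions made for this illustration.

```python
import random

# Hypothetical operation set: two arithmetic and three logical operations.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "&": lambda a, b: a & b,
    "|": lambda a, b: a | b,
    "^": lambda a, b: a ^ b,
}


def to_bits(value, bit_count):
    """Encode an integer as a list of bits, least significant bit first.

    The value is wrapped into bit_count bits, so overflows and negative
    results wrap around like in a fixed-width register.
    """
    value %= 1 << bit_count
    return [(value >> i) & 1 for i in range(bit_count)]


def make_samples(n, bit_count=8, seed=0):
    """Generate n samples of the form (a_bits, op, b_bits, result_bits)."""
    rng = random.Random(seed)  # seeded for reproducible samples
    samples = []
    for _ in range(n):
        a = rng.randrange(1 << bit_count)
        b = rng.randrange(1 << bit_count)
        op = rng.choice(sorted(OPS))
        result = OPS[op](a, b)
        samples.append(
            (to_bits(a, bit_count), op, to_bits(b, bit_count), to_bits(result, bit_count))
        )
    return samples


print(to_bits(5, 4))  # [1, 0, 1, 0]
```

A model trained on such samples receives the operand bits and the operation as input and is asked to predict the result bits.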

keras_custom_layers

A collection of Keras residual- and self-attention layers

See the keras_custom_layers API documentation for details.

History

(2021-12-26, 0.0.x) First pre-alpha versions published for testing purposes, not ready for use.

Download files

Source Distribution

ml-indie-tools-0.0.59.tar.gz (34.3 kB)

Uploaded Mar 6, 2022 Source

Built Distribution

ml_indie_tools-0.0.59-py3-none-any.whl (32.9 kB)

Uploaded Mar 6, 2022 Python 3

Hashes for ml-indie-tools-0.0.59.tar.gz

Algorithm	Hash digest
SHA256	`64b026c3a1a715bf7dd24f7662bf3bae399af12e30306a2449e909aa4dcd8333`
MD5	`f984be76aafe9a90af3da26de09e1b2b`
BLAKE2b-256	`85e357e3007bd903d92a669d5c244abfe8aa02d41a8a3939d8d5b665928a5934`

Hashes for ml_indie_tools-0.0.59-py3-none-any.whl

Algorithm	Hash digest
SHA256	`67a99efd2b1328da1d17f1a329f2abdbfb5106493f31935577cf70c5503ee4a3`
MD5	`4dcfac61e5c4d2cb8be4a1997ecce6f4`
BLAKE2b-256	`3cb1fdae17340258d30c8a500adbf6b196c1e1bbd0136813d11593a038fe0436`