37 machine learning algorithms reconstructed from expired US patents. Zero dependencies, pure Python stdlib.

patentml

Machine learning from expired patents. Zero dependencies. Pure Python stdlib.

37 modules, 131 classes and functions — every algorithm reconstructed from a United States patent that has expired into the public domain. The patents that built modern ML were filed by IBM, Bell Labs, Microsoft Research, Lockheed, AT&T and Lucent between 1995 and 2006. They have all expired. This library is what they describe, as clean modern Python, with no imports beyond the standard library.

pip install patentml

No numpy. No scipy. No compiled extensions. If it runs Python 3.8+, it runs patentml — locked-down corporate machines, serverless functions, air-gapped environments, Pyodide in the browser, and (with light trimming) MicroPython boards.

Quick start

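from patentml import RandomForest, ScalableKMeans, ThompsonSampling, KalmanFilter

# Toy inputs so the snippet runs end to end (assumed format: lists of numeric rows)
X_train = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
y_train = [0, 0, 1, 1]
X_test = [[0.1, 0.1], [1.0, 1.0]]
points = X_train + X_test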

# Classification — US6816847 (Microsoft, 1999)
forest = RandomForest(n_trees=25)
forest.fit(X_train, y_train)
labels = [forest.predict(x) for x in X_test]

# Clustering — US6012058 (Microsoft, 1998)
km = ScalableKMeans(k=3)
km.fit(points)

# Bandits — US6981040 (Utopy, 2000) [919 forward citations]
bandit = ThompsonSampling(n_arms=4)
arm = bandit.select()
bandit.update(arm, reward=1.0)

# State estimation — US6795794 (Univ. Illinois, 2002)
kf = KalmanFilter(dim_state=2, dim_obs=1)
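For readers wondering what a select/update bandit loop like the one above does internally, here is a minimal stdlib-only sketch of Beta-Bernoulli Thompson sampling. It is an illustration under stated assumptions, not patentml's implementation: the class name `BetaBernoulliThompson`, the `seed` parameter, and the `simulate` helper are hypothetical, and rewards are assumed to lie in [0, 1].

```python
import random


class BetaBernoulliThompson:
    """Thompson sampling for 0/1 rewards with a Beta(1, 1) prior per arm."""

    def __init__(self, n_arms, seed=None):
        self.successes = [1.0] * n_arms  # Beta alpha parameters
        self.failures = [1.0] * n_arms   # Beta beta parameters
        self.rng = random.Random(seed)

    def select(self):
        # Draw one plausible success rate per arm and play the best draw.
        draws = [self.rng.betavariate(a, b)
                 for a, b in zip(self.successes, self.failures)]
        return max(range(len(draws)), key=draws.__getitem__)

    def update(self, arm, reward):
        # Reward is assumed to lie in [0, 1]; 1.0 is success, 0.0 failure.
        self.successes[arm] += reward
        self.failures[arm] += 1.0 - reward


def simulate(true_rates, rounds=2000, seed=0):
    """Run the bandit against simulated Bernoulli arms; return pull counts."""
    env = random.Random(seed + 1)
    bandit = BetaBernoulliThompson(len(true_rates), seed=seed)
    pulls = [0] * len(true_rates)
    for _ in range(rounds):
        arm = bandit.select()
        reward = 1.0 if env.random() < true_rates[arm] else 0.0
        bandit.update(arm, reward)
        pulls[arm] += 1
    return pulls
```

Calling `simulate([0.1, 0.2, 0.8, 0.3])` returns how often each arm was played. Because arms with more observed successes produce higher draws more often, play should concentrate on the arm with the highest true rate as evidence accumulates.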

What's inside

| Family | Modules |
|---|---|
| Evolutionary & global optimisation | genetic algorithm, genetic programming, grammar GP / grammatical evolution, linear GP, particle swarm, differential evolution, CMA-ES, simulated annealing, ant colony, Bayesian optimiser / EDA, neuroevolution |
| Neural networks | mini neural net (mini-batch backprop), Conv1D, SimpleRNN, GRU cell, SGD/RMSProp/Adam/AdamW optimisers |
| Classifiers | decision tree, random forest, AdaBoost, SVM (SMO), online Bayes, naive Bayes, KNN (+ BallTree), gradient boosting |
| Ensembles | voting, stacking, bagging, weighted |
| Clustering | scalable & hierarchical k-means, DBSCAN, OPTICS, EM / Gaussian mixture, spectral, mean shift |
| Reinforcement learning | Q-learning, SARSA, function-approximation Q, actor-critic A2C, PPO-lite, ε-greedy / UCB1 / Thompson / EXP3 / LinUCB bandits |
| Probabilistic | Bayesian network, hidden Markov model, Gaussian process regression & classification, kernel density estimation |
| Anomaly detection | isolation forest, one-class SVM |
| NLP | TF-IDF + naive Bayes text pipeline, word2vec SGNS, PMI embeddings |
| Recommenders | memory-based & Bayesian collaborative filtering |
| Dimensionality & features | PCA, randomised SVD, vector quantisation (LBG, product quantiser), scalers, mutual-information ranking, forward selection |
| State estimation | Kalman filter, extended Kalman filter |
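To give a sense of how compact some of these algorithms are without numpy, here is a scalar Kalman filter in plain Python. It is a sketch for illustration only and makes no claim about how patentml's `KalmanFilter` is written; the function name `kalman_1d`, its parameters, and the random-walk state model are all assumptions.

```python
def kalman_1d(measurements, process_var=1e-4, measurement_var=0.25,
              initial_estimate=0.0, initial_var=1.0):
    """Filter noisy readings of a slowly drifting scalar; return estimates."""
    estimate, variance = initial_estimate, initial_var
    estimates = []
    for z in measurements:
        # Predict: the state model is a random walk, so only uncertainty grows.
        variance += process_var
        # Update: blend prediction and measurement by the Kalman gain.
        gain = variance / (variance + measurement_var)
        estimate += gain * (z - estimate)
        variance *= 1.0 - gain
        estimates.append(estimate)
    return estimates
```

Feeding it a constant reading, such as `kalman_1d([5.0] * 50)`, shows the estimate moving from the initial guess toward the measured value while the gain shrinks as confidence grows.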

Provenance

Every module documents its source patent: number, assignee, filing year, and forward-citation count. Highlights:

| Patent | Assignee | Filing year | Algorithm | Forward citations |
|---|---|---|---|---|
| US5613012 | SmartTouch | 1995 | Voting ensemble | 1,182 |
| US6981040 | Utopy | 2000 | Bandit selection | 919 |
| US6161130 | Microsoft | 1998 | Online classifier | 896 |
| US6556983 | Microsoft | 2000 | Word embeddings (PMI + SGNS) | 645 |
| US6192360 | Microsoft | 1998 | TF-IDF + naive Bayes | 364 |
| US6317707 | AT&T | 1998 | Mean shift + KDE | 269 |
| US6931384 | Microsoft | 2001 | Gaussian process regression | 258 |

The full list of ~40 source patents is in the package docstring: python -c "import patentml; print(patentml.__doc__)".

All source patents are expired. The implementations are original code, MIT licensed.

Why

Modern ML stacks are heavy, opaque, and a supply-chain risk. Sometimes you need one algorithm (a Kalman filter on a microcontroller, a bandit in a serverless function, k-means in a browser) without 200 MB of compiled wheels. And sometimes you want code you can actually read: every module here is a single self-contained file you can audit in one sitting.

These algorithms earned their citations the hard way. They still work.
