Project Description

Overview

There are tons of amazing algorithms and machine learning tools for detecting patterns in data. However, what most of these lack is a useful framework and UI for managing the often complicated setup of the data flow and predictions.

This package provides several tools for utilizing Django’s admin interface and ORM to help organize and manage machine learning setups.

The framework revolves around two basic objects:

  1. A problem, which organizes solutions to achieve some prediction goal. This is mainly implemented as a genetic algorithm.
  2. A predictor, which organizes a specific solution to either guess a numeric value (i.e. regression) or a label (i.e. classification).

I made this separation to help myself with maintenance over the lifetime of an application. Often, I’d want to monitor the accuracy of a solution while also evaluating other potential solutions, without interrupting the solution used for production predictions. Once a superior solution was found, I’d want to push it into production use with as little effort as possible. By explicitly representing different solutions as different records in the database, I found I could easily monitor them and slip them in and out of use as needed.

Problem

The problem represents a domain where we’re attempting to solve some prediction task, by either guessing a number or guessing a label. In the code, this is referred to as the Genome. A record in the Genome table represents a distinct problem domain and stores all the parameters used to control and manage the search for solutions.

From the Genome you define Genes, which are parameters available for use when attempting to solve the problem.

Specific solutions to the problem are represented by the Genotype model, which contains a list of genes and their associated values as key/value pairs.

To search for the best solution to a problem, you first implement a custom evaluation function, which takes a genotype as an argument and returns a non-negative number, called the fitness, representing its overall suitability in solving the problem. By default, a value of 0 is interpreted as the worst possible fitness, with increasing values representing increasing levels of suitability. Personally, I find it convenient and intuitive to bound fitness between 0 and 1, but this is not strictly enforced.
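As a rough illustration of what such an evaluation function might look like, here is a minimal sketch. It assumes the genotype can be read as a plain dict of gene names to values; the toy data, the gene name threshold, and the function name evaluate are all hypothetical and are not part of this package's API.

```python
# Toy (feature, label) pairs standing in for real training data (hypothetical).
SAMPLES = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]


def evaluate(genotype):
    """Return a fitness in [0, 1]: the fraction of samples labeled correctly.

    `genotype` is assumed to behave like a dict of gene name -> value.
    """
    threshold = float(genotype["threshold"])
    # Predict label 1 when the feature reaches the threshold, else 0.
    correct = sum(1 for x, label in SAMPLES if int(x >= threshold) == label)
    return correct / len(SAMPLES)


if __name__ == "__main__":
    for candidate in ({"threshold": 0.3}, {"threshold": 0.5}):
        print(candidate, evaluate(candidate))
```

Using accuracy as the fitness keeps the value between 0 and 1, with 0 as the worst case, which matches the convention described above.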

You then set this function in your Genome's evaluator field and run the management command:

python manage.py evolve_population --genome=<genome_id>

Depending on the other settings in the genome, this will run for a maximum predetermined number of iterations or until improvement of the fitness has stalled. From the genome’s admin change page, you can browse the list of generated genotypes and inspect their fitness, possibly selecting one for production use.
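The two stopping conditions can be pictured with a small sketch. This is not the command's actual implementation; the function and the parameter names max_iterations and max_stalls are hypothetical, and the best fitness of each generation is supplied as a plain list instead of being computed.

```python
def run_until_stalled(fitness_per_generation, max_iterations, max_stalls):
    """Consume per-generation best fitness values until a stop condition hits.

    Stops after `max_iterations` generations, or once `max_stalls`
    consecutive generations fail to improve on the best fitness so far.
    Returns (best_fitness, generations_run).
    """
    best = 0.0
    stalls = 0
    iterations = 0
    for fitness in fitness_per_generation:
        if iterations >= max_iterations or stalls >= max_stalls:
            break
        iterations += 1
        if fitness > best:
            best = fitness
            stalls = 0  # improvement resets the stall counter
        else:
            stalls += 1
    return best, iterations


if __name__ == "__main__":
    history = [0.1, 0.2, 0.2, 0.2, 0.9]
    print(run_until_stalled(history, max_iterations=10, max_stalls=2))
```

Note that a stall-based stop can end the search before a later improvement appears, as with the final 0.9 in the sample history, so the stall limit trades search time against the chance of missing a better genotype.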

For example, a simple genome might consist of a single gene called algorithm, which contains one of several algorithm names (e.g. ‘Bayesian’, ‘LinearSVC’, ‘RandomForest’, etc.). You would write your evaluation function to read this string and instantiate the appropriate class associated with the name. You could then add additional genes representing parameters common to multiple algorithms or unique to only a few. The Genotype model will generate a unique hash based on which genes it contains, and use this to avoid creating duplicate genotypes.
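The sketch below illustrates both ideas from this example: dispatching on the algorithm gene, and fingerprinting a set of genes to skip duplicates. It is only an approximation under stated assumptions. The registry, the placeholder factories, the n_estimators gene, and the SHA-256-over-JSON fingerprint are all hypothetical, and the package's real hashing scheme may differ. A real evaluation function would instantiate actual estimator classes where the placeholders return descriptive dicts.

```python
import hashlib
import json

# Hypothetical registry: algorithm name -> factory taking the gene values.
# The factories return descriptive dicts in place of real model instances.
ALGORITHMS = {
    "Bayesian": lambda genes: {"kind": "Bayesian"},
    "LinearSVC": lambda genes: {"kind": "LinearSVC"},
    "RandomForest": lambda genes: {
        "kind": "RandomForest",
        "n_estimators": genes.get("n_estimators", 10),
    },
}


def build_model(genes):
    """Read the `algorithm` gene and call the matching factory."""
    name = genes["algorithm"]
    if name not in ALGORITHMS:
        raise ValueError("unknown algorithm: %r" % (name,))
    return ALGORITHMS[name](genes)


def fingerprint(genes):
    """Hash the gene values in a key-order-independent way."""
    canonical = json.dumps(genes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def add_if_new(population, genes):
    """Store `genes` under its fingerprint; return False for a duplicate."""
    key = fingerprint(genes)
    if key in population:
        return False
    population[key] = dict(genes)
    return True


if __name__ == "__main__":
    population = {}
    print(add_if_new(population, {"algorithm": "RandomForest", "n_estimators": 50}))
    print(add_if_new(population, {"n_estimators": 50, "algorithm": "RandomForest"}))
    print(build_model({"algorithm": "RandomForest", "n_estimators": 50}))
```

Sorting the keys before hashing means two genotypes with the same genes and values get the same fingerprint regardless of the order in which the genes were added.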

Predictor

todo

Usage

todo

Release History

0.4.23

This version

0.4.22

0.4.21

0.4.20

0.4.19

0.4.18

0.4.17

0.4.14

Download Files

File Name                               File Type   Upload Date
django-analyze-0.4.23.tar.gz (85.9 kB)  Source      Apr 18, 2015
