Hufflepuffs are particularly good finders.

Project description

Hufflepuffs are particularly good finders.

Hufflepuff is a program for exploring large mostly-convex search spaces. The algorithm it uses is basically beam search, except that it’s expected that exploring all successors of any node is too computationally expensive, and so it explores a sample of them (with larger sample sizes for higher-scoring nodes).

You need to provide three things to Hufflepuff:

The initial state(s) to explore from.
A program for generating a successor of a given state. (the same state may be given multiple times to get multiple successors)
A program for assigning a score to any state.

Hufflepuff has been used to:

Group students into roughly equal teams, given approximate skill levels.
Assign students to classes, such that class sizes are roughly equal and students’ preferences are taken into account.
Generate source code for an esoteric programming language that produced a desired output.

In those three cases, the initial states were randomly generated (with virtually no domain knowledge), the successor generation just randomly changed elements in the state, leaving the scoring program as the single place for specifying the desired properties of the solution.

Invocation

With a config file:

$ hufflepuff -c config.cfg

config.cfg:

initial-file = initial.json
initial-binary = cat $initial-file
mutate-binary = ./mutate
score-binary = ./score

beam-width = 5
expand-mantissa = 1.25
expand-multiplier = 3
target = max

num-iterations = 20
parallelism = NUM_CPUS

The config file is a shorthand for providing the args on the command line:

$ hufflepuff --initial-binary=./initial

Search parameters

By default, hufflepuff will attempt to find states with the highest scores: this is specified by --target=max. If your score metric is a “cost”, then setting --target=min will make it attempt to find states with the lowest scores.

The algorithm followed for searching is:

Get the states from --initial-binary.
For --num-iterations iterations:
Generate a new set of states from the current set of states. * The best-scored state will have --expand-multiplier states generated from it. * Following states will have an exponentially decreasing number of states generated from them (rounding down, but there will always be at least one new state for each current state).
Score the new states.
Throw away all but the best --beam-width states.

However, in addition, the best --beam-width states ever seen are kept, for output at the end of the search.

Communication

All binaries are expected to use newline-separated JSON values, either read from stdin, or written to stdout, encoded in UTF-8.

In order to fully utilise multiple cores, multiple instances of the mutation and scoring binaries are run, and input is provided simultaneously (so these binaries can be written as naïve single-threaded processes but still scale seamlessly).

If this is not desired (for instance, if there is a shared resource, or if you want to implement an more sophisticated form of parallelism), set --parallelism=1. Future lines of input may be provided before a given line has finished processing to aid in this, but this should not be relied upon (don’t block a line of output on reading the next line of input).

Initial binary

The initial-binary is run to provide the set of starting values for the search. For convenience, the substring "$initial-file" is replaced with the value of the initial-file argument, so the default behaviour is to read from that file.

A particularly useful thing to do with this is to store the results of a search in best.txt, and then use --initial-file=best.json to continue the search from that point.

Mutation binary

The mutate-binary is run to take values and turn them into slightly different values (e.g., swap two values in a list, or add/subtract some amount from a number). Usually this should be done entirely at random, rather than trying to make a “smart” decision: the search will quickly eliminate really bad options, and you may need intermediate okay-ish states.

Each line of input should correspond to exactly one line of output (if hufflepuff wants multiple mutations, it will enter the value multiple times).

Scoring binary

The score-binary is run to provide a number (floating point or integral) value for a value. The default meaning is that larger numbers are better, but this is controlled by --target={max,min}.

Each line of input should correspond to exactly one line of output, and this program should be deterministic (a given value produces the same score every time).

Project details

Release history Release notifications | RSS feed

This version

1.4

May 13, 2019

1.3.2

May 10, 2018

1.3

Jan 8, 2018

1.2.1

Sep 15, 2017

1.1

Jan 5, 2016

1.0

Oct 7, 2015

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

hufflepuff-1.4.tar.gz (7.9 kB view details)

Uploaded May 13, 2019 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

hufflepuff-1.4-py3-none-any.whl (13.4 kB view details)

Uploaded May 13, 2019 Python 3

File details

Details for the file hufflepuff-1.4.tar.gz.

File metadata

Download URL: hufflepuff-1.4.tar.gz
Upload date: May 13, 2019
Size: 7.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.13.0 pkginfo/1.4.2 requests/2.21.0 setuptools/40.8.0 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.7.3rc1

File hashes

Hashes for hufflepuff-1.4.tar.gz
Algorithm	Hash digest
SHA256	`d206181c2f1eb75d74e689f4f8b47d51ede799a1a22dc3da7bab97126c8d5f4c`
MD5	`170c0b4f87274ffa280aff03ee63a9d8`
BLAKE2b-256	`a17685a1bfce25ae2f845f3a46a77d1e58422aa54ee70e4a6c02b53a3d829cce`

See more details on using hashes here.

File details

Details for the file hufflepuff-1.4-py3-none-any.whl.

File metadata

Download URL: hufflepuff-1.4-py3-none-any.whl
Upload date: May 13, 2019
Size: 13.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.13.0 pkginfo/1.4.2 requests/2.21.0 setuptools/40.8.0 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.7.3rc1

File hashes

Hashes for hufflepuff-1.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`16b038144ba3dc881329e0a1a8006772c1813fc135c07de15daff39d347d3f70`
MD5	`ac0e5cde56c8393505e49edad0c6c8ee`
BLAKE2b-256	`6d346b321a9d3ca7a6fbd66b6556c6c1525955d888dfa5e342b8e52aac3b276f`

See more details on using hashes here.

hufflepuff 1.4

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Project description

Invocation

Search parameters

Communication

Initial binary

Mutation binary

Scoring binary

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes