CombOL - the Combinatorial Objects Library.
Project description
The Combinatorial Objects Library
Introduction
CombOL is an open-source library for the enumeration and Boltzmann sampling of combinatorial classes, written in Rust and Python.
Classes can be specified programmatically through Python, via JSON files, or parsed from a concise string syntax, and may be defined in terms of an arbitrary number of parameters. From a specification, CombOL automatically derives the generating functions, enabling the generation of counting sequences. The library supports exact and approximate-size Boltzmann rejection sampling with automatic parameter tuning to target specific sizes. In addition to implementing established methods, CombOL features a novel early rejection scheme, as well as guaranteed statistical correctness by dynamically increasing the numerical precision, eliminating bias due to floating-point rounding errors.
Through the Python interface, users can define application-specific functions for combinatorial operators, enabling direct sampling of domain objects such as graphs, chemical structure representations, or other complex data types.
Paper
TODO
Tutorial
Quickstart
import combol
# Classes are specified by parsing a string written in syntax similar to `combstruct`.
btrees = combol.parse('B = z + (z * B * B)')
# CombOL can produce the counting sequence for univariate or multivariate classes.
counting_sequence = combol.enumerate(10) # Poly: z + z^3 + 2z^5 + 5z^7 + ...
# We can also sample structures from the class.
sampler = btrees.sampler(0.2)
sampled_structure = sampler.sample() # Product(z, Product(z, z, z), z)
This creates a combinatorial class object which can be used for various purposes.
Specifying classes
Classes are specified by parsing strings with a syntax similar to that of the combstruct maple package.
In this syntax, classes must start with uppercase letters, and variables with lowercase letters.
The table below shows the supported combinatorial constructions and their syntax.
We can also define classes that depend on each other, e.g.
mappings = combol.parse([
'Y = CYC(C)',
'C = z * SET(C)'
])
When a variable is used in a class definition, an atom with size 1 in that variable is implicitly added.
We can also define atoms explicitly. In the following example, we define the class of unary-binary trees,
marking binary nodes with an additional parameter u:
ubtrees = combol.parse([
'leafnode = atom(z: 1)',
'unarynode = atom(z: 1)',
'binarynode = atom(z: 1, u: 1)',
'T = leafnode + (unarynode * T) + (binarynode * T * T)'
])
Here, we specified two different atoms with the same size. This can useful for the structure mapping feature, described below.
Enumeration
We can enumerate the class by calling enumerate. The argument is the maximum power to which the generating function is
expanded.
The result is a Poly object, which may be a UnivariatePolynomial if the class is univariate.
btree_poly = btree.enumerate(10)
btree_poly.get_coefficients() # -> [0, 1, 0, 1, 0, 2, 0, 5, 0, 14, 0]
For multivariate classes, the terms cannot be represented as a simple list. Instead, the coefficients are stored as a list of monomials, each represented as a tuple of the variables with exponents, and the coefficent.
Sampling
To sample from a class, a sampler object needs to be created, instantiated with the desired values for the control parameter(s).
# Here, we define a sampler for z = 0.2
btree_sampler = btrees.sampler(0.2)
btree_sampler.sample() # -> Product(z, Product(z, z, z), z)
By default, the output of the sampler will be an abstract tree-like representation of the sampled structure, such as
Product(z, Product(z, z, z), z).
Note that For multivariate classes, the control parameter must be given for each variable, e.g.:
ubtree_sampler = ubtrees.sampler({'z': 0.3, 'u': 0.2})
If structures within a range of sizes is desired, CombOL can perform rejection sampling (with an early rejection method described in the paper) by setting the desired bounds, e.g.:
btree_sampler.sample(100, bounds_upper=100)
btree_sampler.sample(100, bounds_lower=85, bounds_upper=115)
ubtree_sampler.sample(100, bounds_lower={'z': 10, 'u': 10}, bounds_upper={'z': 100, 'u': None})
Control parameter tuning
The number of expected rejections can be significantly reduced by selecting the control paramters carefully. CombOL supports automatic parameter tuning, e.g.:
btree_params = btrees.tune_control_parameters(target_size={'z': 100})
ubtree_params = ubtrees.tune_control_parameters(target_size={'z': 100, 'u': 40})
# Sample with tuned variables
tuned_sampler = btrees.sampler(btree_params)
tuned_sampler.sample(1000)
Here, tuning a variable to None produces a singlular sampler in that variable.
Structure Mappings
For domain-specific applications, CombOL suppports the mapping of structures to arbitrary domain objects. The user just has to define the bijection by specifying which objects correspond to the atoms, and how the combinatorial constructions act on these objects.
To do this, subclass the StructureBuilder class.
Here, we define a bijection that produces Newick-formatted string representations of the sampled trees:
class NewickNotationBuilder(StructureBuilder[str, str]):
def atom(self, atom: Atom) -> str:
return atom.key
@classmethod
def product(cls, node: SpecificationNodeConstructor, children: list[str]) -> str:
# Assumed first given structure is internal node, the rest are children.
return ('(' + ','.join(children[1:]) + ')' + children[0]) if len(children) > 1 else children[0]
@classmethod
def sequence(cls, node: SpecificationNodeConstructor, children: list[str]) -> str:
return ','.join(children)
Now, we can use this bijection to directly sample Newick-formatted strings:
newick_tree = btree_sampler.sample(1, bounds_lower=4, structure_builder=NewickNotationBuilder())
# -> (z,(z,z)z)z
This can of course also be applied to other domain objects, for instance chemical structures. CombOL will only apply the defined bijection on accepted and completed sampling attempts, so no cycles are wasted constructing structures which are going to be rejected.
Notice that we subclass StructureBuilder with two generic paramters, T and TFinal.
In most cases, these will be the same type (as in our example ablove), but in some cases, different intermediate and
final types may be needed.
In this case, the finalize method needs to be implemented, taking an object of type T and returning a TFinal.
Combinatorial Constructions
CombOL supports the following combinatorial constructions:
| Construction | Semantics | CombOL | Enumeration | Sampling | Parameter Tuning |
|---|---|---|---|---|---|
| Disjoint Union | A structure chosen from A or B | A + B |
✓ | ✓ | ✓ |
| Cartesian Product | A tuple of structures (a,b) | A * B |
✓ | ✓ | ✓ |
| Difference | A structure in A \ B | A \ B |
✓ | ||
| Diagonal | k identical structures from A | DIAG_k(A) |
✓ | ||
| Sequence | A (possibly empty) ordered sequence of structures from A | ||||
| - Unbounded | SEQ(A) |
✓ | ✓ | ✓ | |
| - Exact | SEQ_k(A) |
✓ | ✓ | ✓ | |
| - Lower bounded | SEQ_>=k(A) |
✓ | ✓ | ✓ | |
| - Upper bounded | SEQ_<=k(A) |
✓ | ✓ | ✓ | |
| - Range bounded | SEQ_k..l(A) |
✓ | ✓ | ||
| Cycle | A sequence of structures from A, taken up to cyclic shift | ||||
| - Unbounded | CYC(A) |
||||
| - Exact | CYC_k(A) |
✓, Up to 4 | ✓ | ||
| - Lower bounded | CYC_>=k(A) |
||||
| - Upper bounded | CYC_<=k(A) |
✓, Up to 4 | |||
| - Range bounded | CYC_k..l(A) |
✓, Up to 4 | |||
| Multiset | A sequence of structures from A, taken up to permutation | ||||
| - Unbounded | MSET(A) |
✓ | |||
| - Exact | MSET_k(A) |
✓, Up to 4 | ✓ | ||
| - Lower bounded | MSET_>=k(A) |
||||
| - Upper bounded | MSET_<=k(A) |
✓, Up to 4 | |||
| - Range bounded | MSET_k..l(A) |
✓, Up to 4 |
Support for more combinations of constructors and constraints is planned. Parameter tuning is provided by the Paganini library.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distributions
Built Distributions
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file combol-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: combol-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 9.0 MB
- Tags: CPython 3.12, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.13.1
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
28a6a06854e1769c27cd03477692c9f9bbe54544d1144ea6cc7ba0ecd0ec7538
|
|
| MD5 |
bece56143b62ec6133e651dc1f65a0ba
|
|
| BLAKE2b-256 |
f3bc6c0e8a621dece317d7bc1ca9cf8374d986072da48bdaa7cfe8b8fcacb827
|
File details
Details for the file combol-0.1.0-cp312-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: combol-0.1.0-cp312-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 9.0 MB
- Tags: CPython 3.12+, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.13.1
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
9170b9886ea382b8ea722278a37db9779eb40f88f50369859cdfe8eca797d237
|
|
| MD5 |
3d2bdc01977726646d2e3c75ae69fd4c
|
|
| BLAKE2b-256 |
88036737268d51e019782dbf7dd1672267027772197d54c3b06ed21bb0c213fa
|