Skip to main content

CombOL - the Combinatorial Objects Library.

Project description

combol_wordmark.svg

The Combinatorial Objects Library


Introduction

CombOL is an open-source library for the enumeration and Boltzmann sampling of combinatorial classes, written in Rust and Python.

Classes can be specified programmatically through Python, via JSON files, or parsed from a concise string syntax, and may be defined in terms of an arbitrary number of parameters. From a specification, CombOL automatically derives the generating functions, enabling the generation of counting sequences. The library supports exact and approximate-size Boltzmann rejection sampling with automatic parameter tuning to target specific sizes. In addition to implementing established methods, CombOL features a novel early rejection scheme, as well as guaranteed statistical correctness by dynamically increasing the numerical precision, eliminating bias due to floating-point rounding errors.

Through the Python interface, users can define application-specific functions for combinatorial operators, enabling direct sampling of domain objects such as graphs, chemical structure representations, or other complex data types.


Paper

TODO


Tutorial

Quickstart

import combol

# Classes are specified by parsing a string written in syntax similar to `combstruct`.
btrees = combol.parse('B = z + (z * B * B)')

# CombOL can produce the counting sequence for univariate or multivariate classes.
counting_sequence = combol.enumerate(10)  # Poly: z + z^3 + 2z^5 + 5z^7 + ...

# We can also sample structures from the class.
sampler = btrees.sampler(0.2)
sampled_structure = sampler.sample()  # Product(z, Product(z, z, z), z)

This creates a combinatorial class object which can be used for various purposes.

Specifying classes

Classes are specified by parsing strings with a syntax similar to that of the combstruct maple package. In this syntax, classes must start with uppercase letters, and variables with lowercase letters.

The table below shows the supported combinatorial constructions and their syntax.

We can also define classes that depend on each other, e.g.

mappings = combol.parse([
    'Y = CYC(C)',
    'C = z * SET(C)'
])

When a variable is used in a class definition, an atom with size 1 in that variable is implicitly added. We can also define atoms explicitly. In the following example, we define the class of unary-binary trees, marking binary nodes with an additional parameter u:

ubtrees = combol.parse([
    'leafnode = atom(z: 1)',
    'unarynode = atom(z: 1)',
    'binarynode = atom(z: 1, u: 1)',
    'T = leafnode + (unarynode * T) + (binarynode * T * T)'
])

Here, we specified two different atoms with the same size. This can useful for the structure mapping feature, described below.

Enumeration

We can enumerate the class by calling enumerate. The argument is the maximum power to which the generating function is expanded. The result is a Poly object, which may be a UnivariatePolynomial if the class is univariate.

btree_poly = btree.enumerate(10)
btree_poly.get_coefficients()  # -> [0, 1, 0, 1, 0, 2, 0, 5, 0, 14, 0]

For multivariate classes, the terms cannot be represented as a simple list. Instead, the coefficients are stored as a list of monomials, each represented as a tuple of the variables with exponents, and the coefficent.

Sampling

To sample from a class, a sampler object needs to be created, instantiated with the desired values for the control parameter(s).

# Here, we define a sampler for z = 0.2
btree_sampler = btrees.sampler(0.2)
btree_sampler.sample()  # -> Product(z, Product(z, z, z), z)

By default, the output of the sampler will be an abstract tree-like representation of the sampled structure, such as Product(z, Product(z, z, z), z).

Note that For multivariate classes, the control parameter must be given for each variable, e.g.:

ubtree_sampler = ubtrees.sampler({'z': 0.3, 'u': 0.2})

If structures within a range of sizes is desired, CombOL can perform rejection sampling (with an early rejection method described in the paper) by setting the desired bounds, e.g.:

btree_sampler.sample(100, bounds_upper=100)
btree_sampler.sample(100, bounds_lower=85, bounds_upper=115)
ubtree_sampler.sample(100, bounds_lower={'z': 10, 'u': 10}, bounds_upper={'z': 100, 'u': None})

Control parameter tuning

The number of expected rejections can be significantly reduced by selecting the control paramters carefully. CombOL supports automatic parameter tuning, e.g.:

btree_params = btrees.tune_control_parameters(target_size={'z': 100})
ubtree_params = ubtrees.tune_control_parameters(target_size={'z': 100, 'u': 40})

# Sample with tuned variables
tuned_sampler = btrees.sampler(btree_params)
tuned_sampler.sample(1000)

Here, tuning a variable to None produces a singlular sampler in that variable.

Structure Mappings

For domain-specific applications, CombOL suppports the mapping of structures to arbitrary domain objects. The user just has to define the bijection by specifying which objects correspond to the atoms, and how the combinatorial constructions act on these objects.

To do this, subclass the StructureBuilder class. Here, we define a bijection that produces Newick-formatted string representations of the sampled trees:

class NewickNotationBuilder(StructureBuilder[str, str]):
    def atom(self, atom: Atom) -> str:
        return atom.key

    @classmethod
    def product(cls, node: SpecificationNodeConstructor, children: list[str]) -> str:
        # Assumed first given structure is internal node, the rest are children.
        return ('(' + ','.join(children[1:]) + ')' + children[0]) if len(children) > 1 else children[0]

    @classmethod
    def sequence(cls, node: SpecificationNodeConstructor, children: list[str]) -> str:
        return ','.join(children)

Now, we can use this bijection to directly sample Newick-formatted strings:

newick_tree = btree_sampler.sample(1, bounds_lower=4, structure_builder=NewickNotationBuilder())
# -> (z,(z,z)z)z

This can of course also be applied to other domain objects, for instance chemical structures. CombOL will only apply the defined bijection on accepted and completed sampling attempts, so no cycles are wasted constructing structures which are going to be rejected.

Notice that we subclass StructureBuilder with two generic paramters, T and TFinal. In most cases, these will be the same type (as in our example ablove), but in some cases, different intermediate and final types may be needed. In this case, the finalize method needs to be implemented, taking an object of type T and returning a TFinal.


Combinatorial Constructions

CombOL supports the following combinatorial constructions:

Construction Semantics CombOL Enumeration Sampling Parameter Tuning
Disjoint Union A structure chosen from A or B A + B
Cartesian Product A tuple of structures (a,b) A * B
Difference A structure in A \ B A \ B
Diagonal k identical structures from A DIAG_k(A)
Sequence A (possibly empty) ordered sequence of structures from A
- Unbounded SEQ(A)
- Exact SEQ_k(A)
- Lower bounded SEQ_>=k(A)
- Upper bounded SEQ_<=k(A)
- Range bounded SEQ_k..l(A)
Cycle A sequence of structures from A, taken up to cyclic shift
- Unbounded CYC(A)
- Exact CYC_k(A) ✓, Up to 4
- Lower bounded CYC_>=k(A)
- Upper bounded CYC_<=k(A) ✓, Up to 4
- Range bounded CYC_k..l(A) ✓, Up to 4
Multiset A sequence of structures from A, taken up to permutation
- Unbounded MSET(A)
- Exact MSET_k(A) ✓, Up to 4
- Lower bounded MSET_>=k(A)
- Upper bounded MSET_<=k(A) ✓, Up to 4
- Range bounded MSET_k..l(A) ✓, Up to 4

Support for more combinations of constructors and constraints is planned. Parameter tuning is provided by the Paganini library.


Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

combol-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.0 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.17+ x86-64

combol-0.1.0-cp312-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.0 MB view details)

Uploaded CPython 3.12+manylinux: glibc 2.17+ x86-64

File details

Details for the file combol-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for combol-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 28a6a06854e1769c27cd03477692c9f9bbe54544d1144ea6cc7ba0ecd0ec7538
MD5 bece56143b62ec6133e651dc1f65a0ba
BLAKE2b-256 f3bc6c0e8a621dece317d7bc1ca9cf8374d986072da48bdaa7cfe8b8fcacb827

See more details on using hashes here.

File details

Details for the file combol-0.1.0-cp312-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for combol-0.1.0-cp312-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 9170b9886ea382b8ea722278a37db9779eb40f88f50369859cdfe8eca797d237
MD5 3d2bdc01977726646d2e3c75ae69fd4c
BLAKE2b-256 88036737268d51e019782dbf7dd1672267027772197d54c3b06ed21bb0c213fa

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page