
A Python module for data fusion built on top of factorized models.

Project Description


scikit-fusion is a Python module for data fusion based on recent collective latent factor models.


scikit-fusion is tested to work under Python 3.

The required dependencies to build the software are NumPy >= 1.7, SciPy >= 0.12, PyGraphviz >= 1.3 (needed only for drawing data fusion graphs), and Joblib >= 0.8.4.
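Before installing, it can help to see which of these dependencies are already present. The following is a minimal sketch, not part of scikit-fusion: the helper names `parse_version` and `check_requirements` are hypothetical, the distribution names used for lookup are assumptions, and `importlib.metadata` requires Python 3.8 or later.

```python
from importlib import metadata

# Minimum versions as listed above. The distribution names used for the
# lookup are assumptions about how each package is published.
REQUIRED = {
    "numpy": (1, 7),
    "scipy": (0, 12),
    "pygraphviz": (1, 3),   # only needed for drawing data fusion graphs
    "joblib": (0, 8, 4),
}


def parse_version(text):
    """Turn '1.21.0rc1' into (1, 21, 0), ignoring non-numeric suffixes."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_requirements(required=REQUIRED):
    """Return {name: 'ok' | 'missing' | 'too old (found X)'}."""
    report = {}
    for name, minimum in required.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if parse_version(found) >= minimum:
            report[name] = "ok"
        else:
            report[name] = "too old (found %s)" % found
    return report


if __name__ == "__main__":
    for name, status in check_requirements().items():
        print("%-12s %s" % (name, status))
```

A "missing" entry for pygraphviz is harmless unless you intend to draw fusion graphs.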


This package uses distutils, which is the default way of installing Python modules. To install in your home directory, use:

python setup.py install --user

To install for all users on Unix/Linux:

python setup.py build
sudo python setup.py install

For development mode use:

python setup.py develop


Let’s generate three random data matrices describing three different object types:

>>> import numpy as np
>>> R12 = np.random.rand(50, 100)
>>> R13 = np.random.rand(50, 40)
>>> R23 = np.random.rand(100, 40)
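The shapes above are not arbitrary: every matrix that involves the same object type must agree on how many objects that type has (50, 100, and 40 here). A minimal sketch of that consistency check, using plain tuples and string labels as illustrative stand-ins rather than scikit-fusion objects (`object_counts` is a hypothetical helper):

```python
import numpy as np

R12 = np.random.rand(50, 100)
R13 = np.random.rand(50, 40)
R23 = np.random.rand(100, 40)

# (matrix, row object type, column object type). The string labels are
# illustrative stand-ins, not scikit-fusion objects.
relations = [(R12, "Type 1", "Type 2"),
             (R13, "Type 1", "Type 3"),
             (R23, "Type 2", "Type 3")]


def object_counts(relations):
    """Infer the number of objects per type, failing on any mismatch."""
    counts = {}
    for matrix, row_type, col_type in relations:
        for obj_type, size in ((row_type, matrix.shape[0]),
                               (col_type, matrix.shape[1])):
            # setdefault records the first size seen for this type.
            if counts.setdefault(obj_type, size) != size:
                raise ValueError(
                    "%s has %d objects in one relation and %d in another"
                    % (obj_type, counts[obj_type], size))
    return counts


counts = object_counts(relations)
print(counts)
```

Running a check like this before building a graph surfaces transposed or mismatched matrices early.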

Next, we define our data fusion graph:

>>> from skfusion import fusion
>>> t1 = fusion.ObjectType('Type 1', 10)
>>> t2 = fusion.ObjectType('Type 2', 20)
>>> t3 = fusion.ObjectType('Type 3', 30)
>>> relations = [fusion.Relation(R12, t1, t2),
...              fusion.Relation(R13, t1, t3),
...              fusion.Relation(R23, t2, t3)]
>>> fusion_graph = fusion.FusionGraph()
>>> fusion_graph.add_relations_from(relations)

and then collectively infer the latent data model:

>>> fuser = fusion.Dfmf()
>>> fuser.fuse(fusion_graph)
>>> print(fuser.factor(t1).shape)
(50, 10)
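The shape (50, 10) reflects the 50 objects of Type 1 and the rank 10 given to `ObjectType('Type 1', 10)`. The sketch below illustrates where such shapes come from, assuming a tri-factorization of the form R_ij ≈ G_i S_ij G_j^T, which is the usual form of collective latent factor models but is not stated on this page. The factors are random placeholders, not fitted values, and none of the names below are scikit-fusion API.

```python
import numpy as np

n_objects = {"Type 1": 50, "Type 2": 100, "Type 3": 40}   # from R12, R13, R23
ranks = {"Type 1": 10, "Type 2": 20, "Type 3": 30}         # from ObjectType(...)

rng = np.random.default_rng(0)

# One latent factor per object type: (number of objects) x (rank).
G = {t: rng.random((n_objects[t], ranks[t])) for t in n_objects}

# One backbone matrix per relation: (rank of row type) x (rank of column type).
pairs = [("Type 1", "Type 2"), ("Type 1", "Type 3"), ("Type 2", "Type 3")]
S = {(a, b): rng.random((ranks[a], ranks[b])) for a, b in pairs}

for a, b in pairs:
    # Assumed model form: the product has the shape of the data matrix.
    approx = G[a] @ S[(a, b)] @ G[b].T
    print(a, b, approx.shape)
```

Under this assumption the factor for each object type is shared by every relation that type takes part in, which is what makes the inference collective.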

Afterwards new data might arrive:

>>> new_R12 = np.random.rand(10, 100)
>>> new_R13 = np.random.rand(10, 40)

for which we define the fusion graph:

>>> new_relations = [fusion.Relation(new_R12, t1, t2),
...                  fusion.Relation(new_R13, t1, t3)]
>>> new_graph = fusion.FusionGraph(new_relations)

and transform new objects to the latent space induced by the fuser:

>>> transformer = fusion.DfmfTransform()
>>> transformer.transform(t1, new_graph, fuser)
>>> print(transformer.factor(t1).shape)
(10, 10)
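Here the result has one row per new object (10) and one column per latent dimension of Type 1 (also 10). To build intuition for what projecting new objects into an existing latent space can mean, the sketch below uses ordinary least squares as a stand-in. It is not the algorithm behind `DfmfTransform`, it ignores any constraints the library may impose, it assumes the model form R_ij ≈ G_i S_ij G_j^T, and the factors are random placeholders for what a fitted model would provide.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for quantities a fitted model would provide (random here).
G2 = rng.random((100, 20))    # latent factor of Type 2
G3 = rng.random((40, 30))     # latent factor of Type 3
S12 = rng.random((10, 20))    # backbone linking Type 1 and Type 2
S13 = rng.random((10, 30))    # backbone linking Type 1 and Type 3

new_R12 = rng.random((10, 100))
new_R13 = rng.random((10, 40))

# Assumed model: new_R1j ~ new_G1 @ S1j @ Gj.T for j = 2, 3. Stack both
# relations side by side so one least-squares problem uses all the data.
basis = np.hstack([S12 @ G2.T, S13 @ G3.T])       # shape (10, 140)
targets = np.hstack([new_R12, new_R13])           # shape (10, 140)

# Solve basis.T @ new_G1.T ~ targets.T for new_G1.T.
solution, *_ = np.linalg.lstsq(basis.T, targets.T, rcond=None)
new_G1 = solution.T

print(new_G1.shape)
```

The factors of Type 2 and Type 3 stay fixed throughout; only the representation of the new Type 1 objects is estimated.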

scikit-fusion is distributed with a few working data fusion scenarios:

>>> from skfusion import datasets
>>> dicty = datasets.load_dicty()
>>> print(dicty)
FusionGraph(Object types: 3, Relations: 3)
>>> print(dicty.object_types)
{ObjectType(GO term), ObjectType(Experimental condition), ObjectType(Gene)}
>>> print(dicty.relations)
{Relation(ObjectType(Gene), ObjectType(GO term)),
 Relation(ObjectType(Gene), ObjectType(Gene)),
 Relation(ObjectType(Gene), ObjectType(Experimental condition))}

Download files

scikit-fusion-0.2.1.tar.gz (6.8 MB), source distribution, uploaded Aug 20, 2015
