Version 0.2

Authors

  • JK Baillie
  • A Law
  • B Wang
  • D Farr

Meta-analysis by information content (MAIC)

Data-driven aggregation of ranked and unranked lists

https://baillielab.net/maic

Code repositories

Original implementation: https://github.com/baillielab/maic

Refactored package code: https://github.com/baillielab/maic/tree/packaging

basic usage

installation:

pip install pymaic

from the command line:

pymaic installs a shell script that allows you to run a MAIC analysis directly from the command line as simply as:

maic -f <inputfilename>

Options

-f FILENAME, --filename FILENAME path to the file containing data to be analysed

-t TYPE, --type TYPE format of the file specified with -f (see below)

-o FOLDER, --output-folder FOLDER path to the folder in which to write the results files

-v, --verbose increase the detail of logging messages

-q, --quiet decrease the detail of logging messages (overrides the -v/--verbose flag)
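
If the analysis needs to be launched from another program, the options above can be assembled into an argument list. The following is a minimal sketch: the helper name `build_maic_command` and the example file and folder names are hypothetical and not part of pymaic; only the `-f`, `-t`, `-o` and `-v` flags come from the option list above.

```python
import shlex
from typing import List, Optional


def build_maic_command(filename: str,
                       file_type: Optional[str] = None,
                       output_folder: Optional[str] = None,
                       verbose: bool = False) -> List[str]:
    """Assemble an argument list for the `maic` script (hypothetical helper)."""
    command = ["maic", "-f", filename]
    if file_type is not None:
        # -t takes the input format, e.g. MAIC, JSON or YAML
        command += ["-t", file_type]
    if output_folder is not None:
        # -o names the folder in which results files are written
        command += ["-o", output_folder]
    if verbose:
        command.append("-v")
    return command


# Example file and folder names are placeholders.
command = build_maic_command("lists.json", file_type="JSON", output_folder="results")
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where pymaic is installed.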

Input file format

Input is a series of lists of named entities, which may belong to categories. pymaic supports three input formats:

MAIC - a tab-separated format (-t MAIC)

Each line of the input file describes a list of entities. The first four columns in each line specify features of the list in this line, and the fifth is a space-separated list of entity names, e.g.

<category> <list_label> RANKED <unused> entity1 entity2 entity3 ...

<category> <list_label> RANKED <unused> entity1 entity2 entity3 ...

<category> <list_label> UNRANKED <unused> entity1 entity2 entity3 ...

<category> <list_label> UNRANKED <unused> entity1 entity2 entity3 ...
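
Lines in this layout can be generated from Python. The sketch below follows the description above literally (four tab-separated columns, then a space-separated list of entity names in the fifth). The helper name `format_maic_line`, the `-` placeholder used for the unused column, and the example categories and labels are assumptions made for illustration; check them against a known-good input file before relying on them.

```python
from typing import Iterable


def format_maic_line(category: str,
                     list_label: str,
                     ranked: bool,
                     entities: Iterable[str],
                     unused: str = "-") -> str:
    """Build one input line as described above (hypothetical helper).

    The value written to the unused fourth column is an assumption.
    """
    rank_flag = "RANKED" if ranked else "UNRANKED"
    # First four columns are tab-separated; entity names are joined by spaces.
    return "\t".join([category, list_label, rank_flag, unused, " ".join(entities)])


# Placeholder data: one ranked and one unranked list.
lines = [
    format_maic_line("screen", "list_a", True, ["entity1", "entity2", "entity3"]),
    format_maic_line("proteomics", "list_b", False, ["entity2", "entity4"]),
]
print("\n".join(lines))
```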

JSON/YAML (-t JSON, -t YAML)

Files can also be provided as semi-structured data in either JSON or YAML format:

[
  {
    "name": <list_label>,
    "category": <category>,
    "ranked": true|false,
    "entities": ["entity1", "entity2", "entity3", ...]
  },
  ...
]

-
  name: <list_label>
  category: <category>
  ranked: true|false
  entities:
    - entity1
    - entity2
    - entity3
    - ...
-
  ...
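
The JSON layout above can be produced with the standard library. This is a minimal sketch: the helper name `to_maic_json`, the tuple-based input and the example values are hypothetical; only the keys `name`, `category`, `ranked` and `entities` come from the format shown above.

```python
import json
from typing import Iterable, List, Tuple

# (list label, category, ranked?, entity names): an assumed in-memory shape
ListSpec = Tuple[str, str, bool, List[str]]


def to_maic_json(lists: Iterable[ListSpec]) -> str:
    """Serialise entity lists using the JSON keys shown above (hypothetical helper)."""
    records = [
        {
            "name": name,
            "category": category,
            "ranked": bool(ranked),
            "entities": list(entities),
        }
        for name, category, ranked, entities in lists
    ]
    return json.dumps(records, indent=2)


# Placeholder data: one ranked and one unranked list.
document = to_maic_json([
    ("list_a", "screen", True, ["entity1", "entity2", "entity3"]),
    ("list_b", "proteomics", False, ["entity2", "entity4"]),
])
print(document)
```

Writing `document` to a file and passing that file with `-t JSON` should then match the format described above.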

from a Python script:

You can instantiate a MAIC analysis in Python if you want greater control over the output of results, would like to do some additional processing after analysis, or need to use data in a format not supported by the command-line script.

Constructing a MAIC analysis object from a file to give programmatic access to the results:

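from maic import MAIC

# load the lists from an input file and run the analysis
app = MAIC.fromfile("/path/to/inputfile")
app.run()

# iterate over the results in sorted order
for result in app.sorted_results():
    ...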

Constructing a MAIC analysis from sources other than a file:

from maic import MAIC
from maic.models import EntityListModel

models = []

# prepare the data: build one EntityListModel per input list
# (avoid naming the loop variable `list`, which shadows the builtin)
for entity_list in mydata:
    models.append(
        EntityListModel(
            name=entity_list.name,
            category=entity_list.category,
            ranked=(entity_list.type == "RANKED"),
            entities=entity_list.entities,
        )
    )

app = MAIC(modellist=models)
...

Dataset analysis for methods selection

pymaic examines features of the dataset, namely the ranking information available, the number of sources included and the heterogeneity of quality, to estimate which rank aggregation method is likely to perform best for that dataset. See Wang et al [https://doi.org/10.1093/bioinformatics/btac621] for an explanation of how we evaluated this.

For MixLarge data with high heterogeneity (see Wang et al [https://doi.org/10.1093/bioinformatics/btac621]), the algorithm will output:

"Based on the characteristics of your dataset, we have estimated that MAIC is the best algorithm for this analysis! See Wang et al [https://doi.org/10.1093/bioinformatics/btac621] for an explanation of how we evaluated this."

For RankLarge data with high heterogeneity (see Wang et al [https://doi.org/10.1093/bioinformatics/btac621]), the algorithm will output:

"Warning! Your dataset has the unusual combination of ranked-only data, high heterogeneity and a relatively large number of sources (11) included. Based on these features we think you'd get better results from running BIRRA [http://www.pitt.edu/~mchikina/BIRRA/]. See Wang et al [https://doi.org/10.1093/bioinformatics/btac621] for an explanation of how we evaluated this."

