Python Package with quality covers C++ extension

Quality covers

Quality covers is a pattern mining algorithm.

Install

pip3 install --upgrade quality_covers

Transactional file

If your file looks like this

chess.dat:

1 3 5 7 10 
1 3 5 7 10 
1 3 5 8 9 
1 3 6 7 9 
1 3 6 8 9 

or

P30968
P48551 P17181
P05121 Q03405 P00747 P02671
Q02643
P48551 P17181

use

import quality_covers

quality_covers.run_classic_size("chess.dat", False)

Binary file

If your file looks like this

chess.data:

1 0 1 0 1 0 1 0 0 1
1 0 1 0 1 0 1 0 0 1
1 0 1 0 1 0 0 1 1 0
1 0 1 0 0 1 1 0 1 0
1 0 1 0 0 1 0 1 1 0

use

import quality_covers

quality_covers.run_classic_size("chess.data", True)
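The two sample files above appear to encode the same data: each item number in a row of chess.dat corresponds to a column set to 1 in chess.data. The sketch below illustrates that relationship in plain Python. The helper name transactions_to_binary is hypothetical and not part of quality_covers, and it assumes items are integers numbered from 1.

```python
def transactions_to_binary(transactions):
    """Convert rows of 1-based integer item ids into rows of 0/1 flags.

    Hypothetical helper for illustration only; not part of quality_covers.
    """
    rows = [[int(tok) for tok in line.split()] for line in transactions if line.strip()]
    width = max(max(row) for row in rows)  # highest item id gives the column count
    return [[1 if col in row else 0 for col in range(1, width + 1)] for row in rows]


chess = ["1 3 5 7 10", "1 3 5 7 10", "1 3 5 8 9", "1 3 6 7 9", "1 3 6 8 9"]
binary = transactions_to_binary(chess)
for row in binary:
    print(" ".join(str(flag) for flag in row))
```

Files with non-numeric items (such as the P30968 example) would need an explicit item-to-column mapping instead.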

Output of the functions

The functions create two files in the current directory:

  • chess.data.out: the result file
  • chess.data.clock: information about execution time
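Judging from the file names listed above, the output names seem to be the input file name with .out and .clock appended. A minimal sketch of that naming rule follows; expected_outputs is a hypothetical helper, and the rule itself is an assumption inferred from the examples in this document.

```python
from pathlib import Path


def expected_outputs(input_path):
    """Return the (result, clock) paths assumed for a given input file.

    Assumption: the suffixes are appended to the full input file name and
    the files are written to the current directory.
    """
    name = Path(input_path).name
    return Path(name + ".out"), Path(name + ".clock")


result_file, clock_file = expected_outputs("chess.data")
print(result_file, clock_file)
```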

Extract binary matrices

You can obtain binary matrices by calling extract_binary_matrices on the output file:

quality_covers.extract_binary_matrices('chess.data.out')

Optional arguments

Threshold coverage

You can pass a coverage threshold as the third argument, expressed as a fraction between 0 and 1.

# 60% of coverage
quality_covers.run_classic_size("chess.data", True, 0.6)

Measures

Pass True as the fourth argument to also report the following measures for each result:

  • frequency
  • monocle
  • separation
  • object uniformity

quality_covers.run_classic_size("chess.data", True, 0.6, True)

Example of a result file with measures:

3,4,9 ; 4,5,6,7,8#Object Uniformity=0.81944; Monocole=91.00000; Frequency=0.33333; Separation=0.48387
2,9 ; 1,3,7#Object Uniformity=0.68750; Monocole=28.00000; Frequency=0.22222; Separation=0.27273
1,6,9 ; 2,7#Object Uniformity=0.63889; Monocole=28.00000; Frequency=0.33333; Separation=0.31579
# Mandatory: 0
# Non-mandatory: 3
# Total: 3
# Coverage: 25/38(65.78947%)
# Mean frequency: 0.29630
# Mean monocole: 49.00000
# Mean object uniformity: 0.71528
# Mean separation: 0.35746
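Each result line above seems to follow the pattern "objects ; items#Measure=value; ...", while lines starting with # hold summary statistics. The sketch below parses one result line under that assumption. parse_result_line is a hypothetical helper and not part of quality_covers, and the format is inferred only from the sample output shown here.

```python
def parse_result_line(line):
    """Split one result line into (extent, intent, measures).

    Identifiers are kept as strings because items may be non-numeric.
    The line format is an assumption based on the sample output.
    """
    concept_part, _, measures_part = line.partition("#")
    extent_part, _, intent_part = concept_part.partition(";")
    extent = [tok.strip() for tok in extent_part.split(",") if tok.strip()]
    intent = [tok.strip() for tok in intent_part.split(",") if tok.strip()]
    measures = {}
    for field in measures_part.split(";"):
        if "=" in field:
            name, value = field.split("=", 1)
            measures[name.strip()] = float(value)
    return extent, intent, measures


sample = "2,9 ; 1,3,7#Object Uniformity=0.68750; Monocole=28.00000; Frequency=0.22222; Separation=0.27273"
extent, intent, measures = parse_result_line(sample)
print(extent, intent, measures["Frequency"])
```

Lines that begin with # should be filtered out before calling this function.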

Different algorithms

There are currently four different algorithms:

  • run_classic_size
  • run_approximate_size
  • run_fca_cemb_with_mandatory
  • run_fca_cemb_without_mandatory
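If the algorithm should be chosen at run time, one option is to look the function up by name. This is only a sketch: get_algorithm is a hypothetical helper, and it assumes all four functions accept the same arguments as run_classic_size, which this document only shows for run_classic_size and run_approximate_size.

```python
# Function names taken from the list above.
ALGORITHMS = (
    "run_classic_size",
    "run_approximate_size",
    "run_fca_cemb_with_mandatory",
    "run_fca_cemb_without_mandatory",
)


def get_algorithm(module, name):
    """Return the function called `name` from `module`, rejecting unknown names."""
    if name not in ALGORITHMS:
        raise ValueError(f"unknown algorithm: {name!r}")
    return getattr(module, name)
```

After import quality_covers, the call get_algorithm(quality_covers, "run_approximate_size") would return the function to run.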

Examples

Transactional file with 80% coverage and measures, using the approximate size algorithm

Data file

1 3 5 7 10 
1 3 5 7 10 
1 3 5 8 9 
1 3 6 7 9 
1 3 6 8 9 
1 4 5 7 10 
1 4 5 7 10 
1 4 5 8 9 
1 4 6 7 9 
1 4 6 8 9 
2 3 5 7 10 
2 3 5 7 10 
2 3 5 8 9 
2 3 6 7 9 
2 3 6 8 9 
2 4 5 7 10 
2 4 5 7 10 
2 4 5 8 9 
2 4 6 7 9 
2 4 6 8 9 

Python commands

import quality_covers

# The second argument is False because the data file is transactional
quality_covers.run_approximate_size('file.data', False, 0.8, True)

Results file.data.out

1,2,6,7,11,12,16,17 ; 5,7,10#Object Uniformity=0.60000; Monocle=648.00000; Frequency=0.40000; Separation=0.50000
4,5,9,10,14,15,19,20 ; 9,6#Object Uniformity=0.40000; Monocle=352.00000; Frequency=0.40000; Separation=0.36364
3,5,8,10,13,15,18,20 ; 8,9#Object Uniformity=0.40000; Monocle=352.00000; Frequency=0.40000; Separation=0.36364
11,12,13,14,15,16,17,18,19,20 ; 2#Object Uniformity=0.20000; Monocle=228.00000; Frequency=0.50000; Separation=0.20000
6,7,8,9,10,16,17,18,19,20 ; 4#Object Uniformity=0.20000; Monocle=258.00000; Frequency=0.50000; Separation=0.20000
1,2,3,4,5,11,12,13,14,15 ; 3#Object Uniformity=0.20000; Monocle=258.00000; Frequency=0.50000; Separation=0.20000
# Mandatory: 0
# Non-mandatory: 6
# Total: 6
# Coverage: 82/100(82.00000%)
# Mean frequency: 0.45000
# Mean monocle: 349.33334
# Mean object uniformity: 0.33333
# Mean separation: 0.30455
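To confirm that a run reached the requested threshold, the "# Coverage" summary line can be read back from the results file. The sketch below assumes the summary format shown above; read_coverage is a hypothetical helper and not part of quality_covers.

```python
import re

# Matches summary lines such as "# Coverage: 82/100(82.00000%)".
COVERAGE_RE = re.compile(r"#\s*Coverage:\s*(\d+)/(\d+)")


def read_coverage(lines):
    """Return covered/total as a fraction, or None if no coverage line is found."""
    for line in lines:
        match = COVERAGE_RE.match(line)
        if match:
            covered, total = (int(group) for group in match.groups())
            return covered / total
    return None


summary = ["# Mandatory: 0", "# Non-mandatory: 6", "# Total: 6", "# Coverage: 82/100(82.00000%)"]
coverage = read_coverage(summary)
print(coverage >= 0.8)
```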

Extract binary matrices

import quality_covers

quality_covers.extract_binary_matrices('file.data.out')

Result binary matrices extent

1 0 0 0 0 1
1 0 0 0 0 1
0 0 1 0 0 1
0 1 0 0 0 1
0 1 1 0 0 1
1 0 0 0 1 0
1 0 0 0 1 0
0 0 1 0 1 0
0 1 0 0 1 0
0 1 1 0 1 0
1 0 0 1 0 1
1 0 0 1 0 1
0 0 1 1 0 1
0 1 0 1 0 1
0 1 1 1 0 1
1 0 0 1 1 0
1 0 0 1 1 0
0 0 1 1 1 0
0 1 0 1 1 0
0 1 1 1 1 0

Result binary matrices intent

The first line gives the column names (the item identifiers).

5 7 10 9 6 8 2 4 3
1 1 1 0 0 0 0 0 0
0 0 0 1 1 0 0 0 0
0 0 0 1 0 1 0 0 0
0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 1
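The two matrices above can be related to the results file as follows: the first has one row per object and one column per result line, and the second has one row per result line and one column per item, in order of first appearance. The sketch below rebuilds both from the six results of this example. It illustrates the apparent layout only and is not the library's implementation; build_matrices is a hypothetical name, and objects are assumed to be numbered from 1.

```python
def build_matrices(concepts):
    """Build (object matrix, item column names, item matrix) from (objects, items) pairs."""
    n_objects = max(obj for objects, _ in concepts for obj in objects)
    object_sets = [set(objects) for objects, _ in concepts]
    # One row per object, one column per result line.
    object_matrix = [
        [1 if obj in members else 0 for members in object_sets]
        for obj in range(1, n_objects + 1)
    ]
    # Item columns in order of first appearance across the result lines.
    columns = []
    for _, items in concepts:
        for item in items:
            if item not in columns:
                columns.append(item)
    # One row per result line, one column per item.
    item_matrix = [[1 if item in items else 0 for item in columns] for _, items in concepts]
    return object_matrix, columns, item_matrix


concepts = [
    ([1, 2, 6, 7, 11, 12, 16, 17], [5, 7, 10]),
    ([4, 5, 9, 10, 14, 15, 19, 20], [9, 6]),
    ([3, 5, 8, 10, 13, 15, 18, 20], [8, 9]),
    ([11, 12, 13, 14, 15, 16, 17, 18, 19, 20], [2]),
    ([6, 7, 8, 9, 10, 16, 17, 18, 19, 20], [4]),
    ([1, 2, 3, 4, 5, 11, 12, 13, 14, 15], [3]),
]
object_matrix, columns, item_matrix = build_matrices(concepts)
print(columns)
```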

More info

Paper associated

To come

Research lab

More tools about association rules

Authors

  • Amira Mouakher (amira.mouakher@u-bourgogne.fr)
  • Nicolas Gros (nicolas.gros01@u-bourgogne.fr)
  • Sebastien Gerin (sebastien.gerin@sayens.fr)

Built Distributions

  • quality_covers-3.1.0-cp38-cp38-manylinux1_x86_64.whl (322.8 kB), CPython 3.8
  • quality_covers-3.1.0-cp37-cp37m-manylinux1_x86_64.whl (322.6 kB), CPython 3.7m
  • quality_covers-3.1.0-cp36-cp36m-manylinux1_x86_64.whl (322.6 kB), CPython 3.6m
  • quality_covers-3.1.0-cp35-cp35m-manylinux1_x86_64.whl (322.5 kB), CPython 3.5m
