Python Package with quality covers C++ extension
Quality covers
Quality covers is a pattern mining algorithm.
Install
pip3 install --upgrade quality_covers
Transactional file
If each line of your file lists the items of one transaction, separated by spaces, like this
chess.dat:
1 3 5 7 10
1 3 5 7 10
1 3 5 8 9
1 3 6 7 9
1 3 6 8 9
or
P30968
P48551 P17181
P05121 Q03405 P00747 P02671
Q02643
P48551 P17181
use
import quality_covers
quality_covers.run_classic_size("chess.dat", False)
Binary file
If each line of your file is one row of a 0/1 matrix, like this
chess.data:
1 0 1 0 1 0 1 0 0 1
1 0 1 0 1 0 1 0 0 1
1 0 1 0 1 0 0 1 1 0
1 0 1 0 0 1 1 0 1 0
1 0 1 0 0 1 0 1 1 0
use
import quality_covers
quality_covers.run_classic_size("chess.data", True)
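The two input formats describe the same data: the binary rows of chess.data are the transactional rows of chess.dat written as a 0/1 matrix. As a minimal sketch of that relationship (plain Python, not part of quality_covers; `transactions_to_binary` is a hypothetical helper, and it assumes integer item ids starting at 1):

```python
def transactions_to_binary(lines):
    """Convert transactional rows (integer item ids) to 0/1 matrix rows."""
    rows = [[int(tok) for tok in line.split()] for line in lines if line.strip()]
    # The largest item id fixes the number of columns.
    n_items = max(item for row in rows for item in row)
    matrix = []
    for row in rows:
        present = set(row)
        matrix.append([1 if i in present else 0 for i in range(1, n_items + 1)])
    return matrix


# The five rows of chess.dat shown above.
chess_dat = [
    "1 3 5 7 10",
    "1 3 5 7 10",
    "1 3 5 8 9",
    "1 3 6 7 9",
    "1 3 6 8 9",
]

binary_rows = [" ".join(map(str, row)) for row in transactions_to_binary(chess_dat)]
for line in binary_rows:
    print(line)
```

The printed rows match the chess.data listing above. Files with non-numeric items (such as the protein identifiers) would need an explicit item-to-column mapping instead.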
Output of the functions
The functions create two files in the current directory, named after the input file. For chess.data:
- chess.data.out: the result file
- chess.data.clock: information about execution time
Extract binary matrices
You can obtain binary matrices by calling extract_binary_matrices on the output file:
quality_covers.extract_binary_matrices('chess.data.out')
Optional arguments
Threshold coverage
You can pass a coverage threshold as the third argument, as a fraction between 0 and 1.
# 60% coverage
quality_covers.run_classic_size("chess.data", True, 0.6)
Measures
You can also request quality measures for each pattern by passing True as the fourth argument:
- frequency
- monocle
- separation
- object uniformity
quality_covers.run_classic_size("chess.data", True, 0.6, True)
3,4,9 ; 4,5,6,7,8#Object Uniformity=0.81944; Monocole=91.00000; Frequency=0.33333; Separation=0.48387
2,9 ; 1,3,7#Object Uniformity=0.68750; Monocole=28.00000; Frequency=0.22222; Separation=0.27273
1,6,9 ; 2,7#Object Uniformity=0.63889; Monocole=28.00000; Frequency=0.33333; Separation=0.31579
# Mandatory: 0
# Non-mandatory: 3
# Total: 3
# Coverage: 25/38(65.78947%)
# Mean frequency: 0.29630
# Mean monocole: 49.00000
# Mean object uniformity: 0.71528
# Mean separation: 0.35746
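The result lines above can be read programmatically. The following is a rough sketch that assumes the layout seen in this sample: two comma-separated lists separated by " ; ", then "#", then "name=value" measures separated by ";". The names `parse_result_line` and `parse_results` are hypothetical helpers, not part of quality_covers, and reading the first list as objects and the second as items is inferred from the worked example further down.

```python
def parse_result_line(line):
    """Split one pattern line into (objects, items, measures)."""
    pattern, _, measures_part = line.partition("#")
    objects_part, items_part = pattern.split(";")
    # Tokens stay strings because items may be non-numeric identifiers.
    objects = [tok.strip() for tok in objects_part.split(",")]
    items = [tok.strip() for tok in items_part.split(",")]
    measures = {}
    for field in measures_part.split(";"):
        if "=" in field:
            name, value = field.split("=", 1)
            measures[name.strip()] = float(value)
    return objects, items, measures


def parse_results(text):
    """Parse every pattern line, skipping blank lines and '#' summary lines."""
    return [
        parse_result_line(line)
        for line in text.splitlines()
        if line.strip() and not line.startswith("#")
    ]


sample = (
    "2,9 ; 1,3,7#Object Uniformity=0.68750; Monocole=28.00000; "
    "Frequency=0.22222; Separation=0.27273\n"
    "# Total: 3\n"
)
parsed = parse_results(sample)
objects, items, measures = parsed[0]
print(objects, items, measures["Frequency"])
```

Measure names are kept exactly as they appear in the file, so lookups must use the same spelling as the output.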
Different algorithms
There are currently four different algorithms:
- run_classic_size
- run_approximate_size
- run_fca_cemb_with_mandatory
- run_fca_cemb_without_mandatory
Examples
Transactional file with 80% coverage and measures, using the approximate size algorithm
Data file
1 3 5 7 10
1 3 5 7 10
1 3 5 8 9
1 3 6 7 9
1 3 6 8 9
1 4 5 7 10
1 4 5 7 10
1 4 5 8 9
1 4 6 7 9
1 4 6 8 9
2 3 5 7 10
2 3 5 7 10
2 3 5 8 9
2 3 6 7 9
2 3 6 8 9
2 4 5 7 10
2 4 5 7 10
2 4 5 8 9
2 4 6 7 9
2 4 6 8 9
Python commands
import quality_covers
quality_covers.run_approximate_size('file.data', False, 0.8, True)
Results (file.data.out)
1,2,6,7,11,12,16,17 ; 5,7,10#Object Uniformity=0.60000; Monocle=648.00000; Frequency=0.40000; Separation=0.50000
4,5,9,10,14,15,19,20 ; 9,6#Object Uniformity=0.40000; Monocle=352.00000; Frequency=0.40000; Separation=0.36364
3,5,8,10,13,15,18,20 ; 8,9#Object Uniformity=0.40000; Monocle=352.00000; Frequency=0.40000; Separation=0.36364
11,12,13,14,15,16,17,18,19,20 ; 2#Object Uniformity=0.20000; Monocle=228.00000; Frequency=0.50000; Separation=0.20000
6,7,8,9,10,16,17,18,19,20 ; 4#Object Uniformity=0.20000; Monocle=258.00000; Frequency=0.50000; Separation=0.20000
1,2,3,4,5,11,12,13,14,15 ; 3#Object Uniformity=0.20000; Monocle=258.00000; Frequency=0.50000; Separation=0.20000
# Mandatory: 0
# Non-mandatory: 6
# Total: 6
# Coverage: 82/100(82.00000%)
# Mean frequency: 0.45000
# Mean monocle: 349.33334
# Mean object uniformity: 0.33333
# Mean separation: 0.30455
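The "Coverage: 82/100" line can be checked by hand. One reading that reproduces the figure, inferred from the numbers here rather than from the package documentation, is: the number of 1-cells of the data that fall in at least one pattern, divided by the total number of 1-cells. A small sketch in plain Python, with the measures after "#" omitted from the pattern lines:

```python
data = """\
1 3 5 7 10
1 3 5 7 10
1 3 5 8 9
1 3 6 7 9
1 3 6 8 9
1 4 5 7 10
1 4 5 7 10
1 4 5 8 9
1 4 6 7 9
1 4 6 8 9
2 3 5 7 10
2 3 5 7 10
2 3 5 8 9
2 3 6 7 9
2 3 6 8 9
2 4 5 7 10
2 4 5 7 10
2 4 5 8 9
2 4 6 7 9
2 4 6 8 9
"""

results = """\
1,2,6,7,11,12,16,17 ; 5,7,10
4,5,9,10,14,15,19,20 ; 9,6
3,5,8,10,13,15,18,20 ; 8,9
11,12,13,14,15,16,17,18,19,20 ; 2
6,7,8,9,10,16,17,18,19,20 ; 4
1,2,3,4,5,11,12,13,14,15 ; 3
"""

# Every (object, item) pair present in the data; objects are 1-based line numbers.
ones = {
    (obj, item)
    for obj, line in enumerate(data.splitlines(), start=1)
    for item in line.split()
}

# Every (object, item) pair that belongs to at least one pattern.
covered = set()
for line in results.splitlines():
    pattern = line.split("#", 1)[0]
    objects_part, items_part = pattern.split(";")
    objects = [int(tok) for tok in objects_part.split(",")]
    items = [tok.strip() for tok in items_part.split(",")]
    covered.update((obj, item) for obj in objects for item in items)

covered_ones = covered & ones
print(f"{len(covered_ones)}/{len(ones)}")
```

The six patterns span 86 cells in total, but the second and third patterns share four cells (objects 5, 10, 15, 20 with item 9), which leaves 82 distinct cells out of 100.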
Extract binary matrices
import quality_covers
quality_covers.extract_binary_matrices('file.data.out')
Result binary matrices extent
1 0 0 0 0 1
1 0 0 0 0 1
0 0 1 0 0 1
0 1 0 0 0 1
0 1 1 0 0 1
1 0 0 0 1 0
1 0 0 0 1 0
0 0 1 0 1 0
0 1 0 0 1 0
0 1 1 0 1 0
1 0 0 1 0 1
1 0 0 1 0 1
0 0 1 1 0 1
0 1 0 1 0 1
0 1 1 1 0 1
1 0 0 1 1 0
1 0 0 1 1 0
0 0 1 1 1 0
0 1 0 1 1 0
0 1 1 1 1 0
Result binary matrices intent
The first line gives the column names (the item identifiers).
5 7 10 9 6 8 2 4 3
1 1 1 0 0 0 0 0 0
0 0 0 1 1 0 0 0 0
0 0 0 1 0 1 0 0 0
0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 1
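To make the layout of the two matrices concrete, here is a sketch that rebuilds them from the six patterns of the result file. It illustrates the layout only and is not the implementation of extract_binary_matrices; the column ordering (items in order of first appearance) and the object count of 20 are assumptions read off this example.

```python
# (objects, items) for each pattern, in the order of file.data.out.
patterns = [
    ([1, 2, 6, 7, 11, 12, 16, 17], [5, 7, 10]),
    ([4, 5, 9, 10, 14, 15, 19, 20], [9, 6]),
    ([3, 5, 8, 10, 13, 15, 18, 20], [8, 9]),
    ([11, 12, 13, 14, 15, 16, 17, 18, 19, 20], [2]),
    ([6, 7, 8, 9, 10, 16, 17, 18, 19, 20], [4]),
    ([1, 2, 3, 4, 5, 11, 12, 13, 14, 15], [3]),
]
n_objects = 20  # number of rows in the data file

# First matrix: one row per object, one column per pattern.
object_sets = [set(objs) for objs, _ in patterns]
extent_matrix = [
    [int(obj in s) for s in object_sets] for obj in range(1, n_objects + 1)
]

# Column names of the second matrix: items in order of first appearance.
columns = []
for _, items in patterns:
    for item in items:
        if item not in columns:
            columns.append(item)

# Second matrix: one row per pattern, one column per item.
intent_matrix = [[int(col in items) for col in columns] for _, items in patterns]

print(" ".join(map(str, columns)))
for row in intent_matrix:
    print(" ".join(map(str, row)))
```

Each 1 in the first matrix says that an object belongs to a pattern, and each 1 in the second says that a pattern contains an item.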
More info
Paper associated
To come
Research lab
More tools about association rules
- https://marm.checksem.fr/api/ui/
- https://app.marm.checksem.fr/
- https://quality-cover.checksem.fr/api/ui
Authors
- Amira Mouakher (amira.mouakher@u-bourgogne.fr)
- Nicolas Gros (nicolas.gros01@u-bourgogne.fr)
- Sebastien Gerin (sebastien.gerin@sayens.fr)