Python Package with quality covers C++ extension
Quality covers
Quality covers is a pattern mining algorithm.
Install
pip3 install --upgrade quality_covers
Transactional file
If your file looks like this
chess.dat:
1 3 5 7 10
1 3 5 7 10
1 3 5 8 9
1 3 6 7 9
1 3 6 8 9
or
P30968
P48551 P17181
P05121 Q03405 P00747 P02671
Q02643
P48551 P17181
use
import quality_covers
quality_covers.run_classic_size("chess.dat", False)
Binary file
If your file looks like this
chess.data:
1 0 1 0 1 0 1 0 0 1
1 0 1 0 1 0 1 0 0 1
1 0 1 0 1 0 0 1 1 0
1 0 1 0 0 1 1 0 1 0
1 0 1 0 0 1 0 1 1 0
use
import quality_covers
quality_covers.run_classic_size("chess.data", True)
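The two formats describe the same relation: each transactional row lists the column positions that hold a 1 in the matching binary row. The sketch below shows that correspondence. It assumes item identifiers are positive integers numbered from 1, so it does not apply to label-style items such as P30968. `transactional_to_binary` is a hypothetical helper written for illustration, not part of quality_covers.

```python
def transactional_to_binary(lines, n_items=None):
    """Convert transactional rows (lists of item ids) to 0/1 rows.

    Assumption: item ids are positive integers starting at 1.
    """
    rows = [[int(tok) for tok in line.split()] for line in lines if line.strip()]
    if n_items is None:
        # Use the largest item id as the number of columns.
        n_items = max(item for row in rows for item in row)
    binary = []
    for row in rows:
        present = set(row)
        binary.append(
            " ".join("1" if i in present else "0" for i in range(1, n_items + 1))
        )
    return binary


chess_dat = [
    "1 3 5 7 10",
    "1 3 5 7 10",
    "1 3 5 8 9",
    "1 3 6 7 9",
    "1 3 6 8 9",
]

for line in transactional_to_binary(chess_dat):
    print(line)
```

Run on the chess.dat rows, this prints the five rows shown for chess.data.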
Output of the functions
The functions create two files in the current directory:
- chess.data.out: the result file
- chess.data.clock: execution time information
Extract binary matrices
You can obtain binary matrices by calling extract_binary_matrices
on the output file
quality_covers.extract_binary_matrices('chess.data.out')
Optional arguments
Threshold coverage
You can pass a coverage threshold as the third argument.
# 60% of coverage
quality_covers.run_classic_size("chess.data", True, 0.6)
Measures
Pass True as the fourth argument to add the following measures to each result line:
- frequency
- monocle
- separation
- object uniformity
quality_covers.run_classic_size("chess.data", True, 0.6, True)
3,4,9 ; 4,5,6,7,8#Object Uniformity=0.81944; Monocole=91.00000; Frequency=0.33333; Separation=0.48387
2,9 ; 1,3,7#Object Uniformity=0.68750; Monocole=28.00000; Frequency=0.22222; Separation=0.27273
1,6,9 ; 2,7#Object Uniformity=0.63889; Monocole=28.00000; Frequency=0.33333; Separation=0.31579
# Mandatory: 0
# Non-mandatory: 3
# Total: 3
# Coverage: 25/38(65.78947%)
# Mean frequency: 0.29630
# Mean monocole: 49.00000
# Mean object uniformity: 0.71528
# Mean separation: 0.35746
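A result line of this shape can be split into its parts with plain string handling. The sketch below assumes the layout seen in the sample output: object numbers, then ` ; `, then items, then `#` followed by `name=value` pairs separated by `;`. Lines starting with `#` are treated as summary lines. `parse_result_line` and `parse_results` are hypothetical helpers, not part of quality_covers.

```python
def parse_result_line(line):
    """Split one result line into (extent, intent, measures)."""
    concept, _, measures_part = line.partition("#")
    extent_part, intent_part = concept.split(";")
    extent = [int(x) for x in extent_part.split(",")]
    # Items are kept as strings because they may be labels, not numbers.
    intent = [x.strip() for x in intent_part.split(",")]
    measures = {}
    for field in measures_part.split(";"):
        if "=" in field:
            name, value = field.split("=", 1)
            measures[name.strip()] = float(value)
    return extent, intent, measures


def parse_results(text):
    """Parse every concept line, skipping blank and '#' summary lines."""
    return [
        parse_result_line(line)
        for line in text.splitlines()
        if line.strip() and not line.startswith("#")
    ]


sample = (
    "3,4,9 ; 4,5,6,7,8#Object Uniformity=0.81944; Monocole=91.00000; "
    "Frequency=0.33333; Separation=0.48387\n"
    "# Total: 3\n"
)
for extent, intent, measures in parse_results(sample):
    print(extent, intent, measures["Frequency"])
```

The measure names are used as dictionary keys exactly as they appear in the file, so the key spelling follows whatever the installed version writes.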
Different algorithms
There are currently four different algorithms:
- run_classic_size
- run_approximate_size
- run_fca_cemb_with_mandatory
- run_fca_cemb_without_mandatory
Examples
Transactional file with 80% coverage and measure information, using the approximate size algorithm
Data file
1 3 5 7 10
1 3 5 7 10
1 3 5 8 9
1 3 6 7 9
1 3 6 8 9
1 4 5 7 10
1 4 5 7 10
1 4 5 8 9
1 4 6 7 9
1 4 6 8 9
2 3 5 7 10
2 3 5 7 10
2 3 5 8 9
2 3 6 7 9
2 3 6 8 9
2 4 5 7 10
2 4 5 7 10
2 4 5 8 9
2 4 6 7 9
2 4 6 8 9
Python commands
import quality_covers
quality_covers.run_approximate_size('file.data', False, 0.8, True)
Results file.data.out
1,2,6,7,11,12,16,17 ; 5,7,10#Object Uniformity=0.60000; Monocle=648.00000; Frequency=0.40000; Separation=0.50000
4,5,9,10,14,15,19,20 ; 9,6#Object Uniformity=0.40000; Monocle=352.00000; Frequency=0.40000; Separation=0.36364
3,5,8,10,13,15,18,20 ; 8,9#Object Uniformity=0.40000; Monocle=352.00000; Frequency=0.40000; Separation=0.36364
11,12,13,14,15,16,17,18,19,20 ; 2#Object Uniformity=0.20000; Monocle=228.00000; Frequency=0.50000; Separation=0.20000
6,7,8,9,10,16,17,18,19,20 ; 4#Object Uniformity=0.20000; Monocle=258.00000; Frequency=0.50000; Separation=0.20000
1,2,3,4,5,11,12,13,14,15 ; 3#Object Uniformity=0.20000; Monocle=258.00000; Frequency=0.50000; Separation=0.20000
# Mandatory: 0
# Non-mandatory: 6
# Total: 6
# Coverage: 82/100(82.00000%)
# Mean frequency: 0.45000
# Mean monocle: 349.33334
# Mean object uniformity: 0.33333
# Mean separation: 0.30455
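The summary figures can be checked by hand from the six result lines. The sketch below assumes that coverage counts the distinct (object, item) cells covered by at least one result, divided by the number of entries in the data file (20 rows of 5 items), and that frequency is the number of objects in a result divided by the number of rows. These definitions are inferred from the numbers in this example, not taken from the library.

```python
# (objects, items) pairs copied from file.data.out above.
concepts = [
    ([1, 2, 6, 7, 11, 12, 16, 17], [5, 7, 10]),
    ([4, 5, 9, 10, 14, 15, 19, 20], [9, 6]),
    ([3, 5, 8, 10, 13, 15, 18, 20], [8, 9]),
    ([11, 12, 13, 14, 15, 16, 17, 18, 19, 20], [2]),
    ([6, 7, 8, 9, 10, 16, 17, 18, 19, 20], [4]),
    ([1, 2, 3, 4, 5, 11, 12, 13, 14, 15], [3]),
]
n_objects = 20          # rows in the data file
n_entries = 20 * 5      # every row lists 5 items

# A set removes cells covered by more than one result.
covered = {(o, i) for extent, intent in concepts for o in extent for i in intent}
coverage = len(covered) / n_entries

frequencies = [len(extent) / n_objects for extent, _ in concepts]
mean_frequency = sum(frequencies) / len(frequencies)

print(f"Coverage: {len(covered)}/{n_entries} ({coverage:.5%})")
print(f"Mean frequency: {mean_frequency:.5f}")
```

The second and third results share item 9 on objects 5, 10, 15 and 20, which is why the 86 cells listed collapse to 82 distinct ones.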
Extract binary matrices
import quality_covers
quality_covers.extract_binary_matrices('file.data.out')
Result binary matrices extent
1 0 0 0 0 1
1 0 0 0 0 1
0 0 1 0 0 1
0 1 0 0 0 1
0 1 1 0 0 1
1 0 0 0 1 0
1 0 0 0 1 0
0 0 1 0 1 0
0 1 0 0 1 0
0 1 1 0 1 0
1 0 0 1 0 1
1 0 0 1 0 1
0 0 1 1 0 1
0 1 0 1 0 1
0 1 1 1 0 1
1 0 0 1 1 0
1 0 0 1 1 0
0 0 1 1 1 0
0 1 0 1 1 0
0 1 1 1 1 0
Result binary matrices intent
The first line gives the column names (the item identifiers)
5 7 10 9 6 8 2 4 3
1 1 1 0 0 0 0 0 0
0 0 0 1 1 0 0 0 0
0 0 0 1 0 1 0 0 0
0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 1
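The layout of the two matrices can be reproduced from the result lines. The sketch below illustrates that layout only and is not how `extract_binary_matrices` is implemented. It assumes one object-by-result matrix and one result-by-item matrix, with item columns ordered by first appearance in the result file, as the example suggests. `build_matrices` is a hypothetical helper.

```python
def build_matrices(concepts, n_objects):
    """Return (items, object-by-result matrix, result-by-item matrix)."""
    # Item columns in order of first appearance across the results.
    items = []
    for _, intent in concepts:
        for item in intent:
            if item not in items:
                items.append(item)
    extents = [set(extent) for extent, _ in concepts]
    object_matrix = [
        [1 if obj in extent else 0 for extent in extents]
        for obj in range(1, n_objects + 1)
    ]
    item_matrix = [
        [1 if item in intent else 0 for item in items]
        for _, intent in concepts
    ]
    return items, object_matrix, item_matrix


concepts = [
    ([1, 2, 6, 7, 11, 12, 16, 17], [5, 7, 10]),
    ([4, 5, 9, 10, 14, 15, 19, 20], [9, 6]),
    ([3, 5, 8, 10, 13, 15, 18, 20], [8, 9]),
    ([11, 12, 13, 14, 15, 16, 17, 18, 19, 20], [2]),
    ([6, 7, 8, 9, 10, 16, 17, 18, 19, 20], [4]),
    ([1, 2, 3, 4, 5, 11, 12, 13, 14, 15], [3]),
]
items, object_matrix, item_matrix = build_matrices(concepts, n_objects=20)
print(" ".join(str(i) for i in items))
for row in item_matrix:
    print(" ".join(str(v) for v in row))
```

Each row of the first matrix marks the results an object belongs to, and each row of the second marks the items a result contains.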
More info
Paper associated
To come
Research lab
More tools about association rules
- https://marm.checksem.fr/api/ui/
- https://app.marm.checksem.fr/
- https://quality-cover.checksem.fr/api/ui
Authors
- Amira Mouakher (amira.mouakher@u-bourgogne.fr)
- Nicolas Gros (nicolas.gros01@u-bourgogne.fr)
- Sebastien Gerin (sebastien.gerin@sayens.fr)