a new and more powerful QCA algorithm

scpQCA

scpQCA is a new and more powerful QCA algorithm. QCA (Qualitative Comparative Analysis) is a configurational comparative method in the tradition of Ragin.

The source code is available at https://github.com/Kim-Q/scpQCA.git. Please comply with the Apache-2.0 license.

The tutorial below walks through scpQCA:

Common usage of scpQCA

`scpQCA`(data: dataframe, decision_name:str, caseid: str)

import random

import pandas as pd
import scpQCA

# Build 30 random cases: four conditions (A-D), the outcome F, and a case id column.
data=[[random.randint(0,100) for _ in range(6)] for _ in range(30)]
data=pd.DataFrame(data)
data.columns=['A','B','C','D','F','cases']
obj=scpQCA.scpQCA(data,decision_name='F',caseid='cases')

To avoid problems caused by an uneven sample distribution, scpQCA works better on deduplicated data than on a dataset with many repeated cases. Call drop_duplicates before building a scpQCA model.

The data should also be free of missing values: call dropna first, or the program will raise errors.
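
The preprocessing described above could look like the following minimal sketch. It uses only standard pandas calls; the toy data and the choice to leave the case id column out of the duplicate comparison are assumptions made for illustration, not something scpQCA prescribes.

```python
import pandas as pd

# Toy data (hypothetical): case 3 repeats the configuration of case 1,
# and case 4 has a missing value in column B.
data = pd.DataFrame(
    {
        "A": [1, 0, 1, 0],
        "B": [0, 1, 0, None],
        "F": [1, 0, 1, 1],
        "cases": [1, 2, 3, 4],
    }
)

# Compare rows on the conditions and the outcome only; the case id is
# unique per row, so including it would make every row look distinct.
compared_columns = ["A", "B", "F"]

clean = (
    data.dropna()                                # drop rows with missing values
    .drop_duplicates(subset=compared_columns)    # keep the first of each repeated row
    .reset_index(drop=True)
)
print(clean)
```

The cleaned frame can then be passed to the scpQCA constructor in the same way as `data` above.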

`indirect_calibration` (feature_list: list of column names, class_num: int, full_membership: float, full_nonmembership:float)

If calibration is needed, scpQCA provides two calibration functions: direct_calibration and indirect_calibration.

feature_list=['A','B','C','D','F','cases']
obj.indirect_calibration(feature_list,2,100,0)

`direct_calibration` (feature_list: list of column names, full_membership: float, cross_over: float, full_nonmembership: float)
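
To show what the three anchors in the signature mean, here is a minimal sketch of the direct calibration method as it is usually described in the QCA literature: the raw value is converted to log odds so that full membership maps to +3, the crossover to 0, and full nonmembership to -3, and the log odds are then turned into a membership score. Whether scpQCA computes it in exactly this way is an assumption; the function name `direct_calibrate` is hypothetical and not part of the package.

```python
import math


def direct_calibrate(x, full_membership, cross_over, full_nonmembership):
    """Map a raw value to a fuzzy membership score in (0, 1).

    Sketch of the log-odds based direct method; not taken from scpQCA's source.
    """
    deviation = x - cross_over
    if deviation >= 0:
        # Scale so that full_membership lands on log odds +3.
        log_odds = 3.0 * deviation / (full_membership - cross_over)
    else:
        # Scale so that full_nonmembership lands on log odds -3.
        log_odds = -3.0 * deviation / (full_nonmembership - cross_over)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Anchors chosen to match the 0-100 toy data used in this tutorial.
for raw in (0, 25, 50, 75, 100):
    print(raw, round(direct_calibrate(raw, 100, 50, 0), 4))
```

With these anchors the crossover value 50 receives a membership of exactly 0.5, and the two outer anchors land close to 0.95 and 0.05.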

`raw_truth_table` (decision_label: unique, feature_list: list of column names, cutoff: int, consistency_threshold: float, sortedby: bool)

To inspect intermediate results, use raw_truth_table or scp_truth_table to print the key results.

obj.raw_truth_table(decision_label=1, feature_list=feature_list, cutoff=1,consistency_threshold=0.6,sortedby=False)

###
      A    B    C    D  number            caseid  consistency  coverage
0  0.0  0.0  1.0  1.0       4  [69, 47, 27, 58]     1.000000  0.210526
1  1.0  0.0  0.0  0.0       2          [13, 89]     1.000000  0.105263
2  1.0  0.0  1.0  1.0       2          [41, 10]     1.000000  0.105263
3  1.0  0.0  0.0  1.0       1              [31]     1.000000  0.052632
4  0.0  1.0  1.0  0.0       1             [100]     1.000000  0.052632
5  1.0  1.0  1.0  0.0       4  [96, 69, 75, 33]     0.750000  0.157895
6  0.0  0.0  0.0  1.0       3      [84, 73, 14]     0.666667  0.105263
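
The number, consistency and coverage columns can be read with the usual crisp-set definitions, sketched below. The printed table is consistent with this reading (for example, row 5 has 4 cases, consistency 0.75 = 3/4 and coverage 0.157895 = 3/19), but the helper `row_metrics` and the toy cases are hypothetical and are not part of scpQCA.

```python
def row_metrics(cases, configuration, outcome, decision_label=1):
    """Return (number, consistency, coverage) for one truth table row.

    number      = cases that show the configuration
    consistency = share of those cases that also show the outcome
    coverage    = share of all outcome cases that show the configuration
    """
    matching = [c for c in cases if all(c[k] == v for k, v in configuration.items())]
    positives = [c for c in cases if c[outcome] == decision_label]
    hits = [c for c in matching if c[outcome] == decision_label]
    number = len(matching)
    consistency = len(hits) / number if number else 0.0
    coverage = len(hits) / len(positives) if positives else 0.0
    return number, consistency, coverage


# Hypothetical calibrated cases with two conditions and the outcome F.
cases = [
    {"A": 1, "B": 1, "F": 1},
    {"A": 1, "B": 1, "F": 1},
    {"A": 1, "B": 1, "F": 1},
    {"A": 1, "B": 1, "F": 0},
    {"A": 0, "B": 0, "F": 1},
]
print(row_metrics(cases, {"A": 1, "B": 1}, "F"))  # (4, 0.75, 0.75)
```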

`scp_truth_table` (rules: list of candidate rules, feature_list: list of column names, decision_label: unique)

Note that scp_truth_table takes a list of candidate rules, so run the sufficiency analysis (candidate_rules) first:

obj.scp_truth_table(rules, feature_list=feature_list,decision_label=1)

###
Running...please wait. There are 16 factor combinations.
There are 13 candidate rules in total.
      A    B    C    D  number consistency coverage
0     -    -  1.0    -      14      0.6429   0.5294
1     -  0.0    -    -      15      0.6000   0.5294
2     -  0.0    -  1.0       9      0.7778   0.4118
3     -  1.0    -  0.0      10      0.7000   0.4118
4     -  0.0  1.0    -       7      0.7143   0.2941
5     -    -  1.0  0.0       8      0.6250   0.2941
6     -  0.0  1.0  1.0       4      1.0000   0.2353
7     -  1.0  1.0  0.0       5      0.8000   0.2353
8     -    -  1.0  1.0       6      0.6667   0.2353
9     -  0.0  0.0  1.0       5      0.6000   0.1765
10    -  1.0  0.0  0.0       5      0.6000   0.1765
11  0.0  0.0  1.0  1.0       1      1.0000   0.0588
12  0.0    -  1.0  1.0       1      1.0000   0.0588

`search_necessity` (decision_label: unique, feature_list: list of column names, consistency_threshold: float)

Column names in feature_list should not contain symbols or blank spaces, although '_' in the middle of a name is allowed. feature_list may contain decision_name, caseid, or neither.

Pay attention to the parameter consistency_threshold; it is usually set to about 0.9.

obj.search_necessity(decision_label=1, feature_list=feature_list,consistency_threshold=0.8)

###
B==1.0 is a necessity condition
C==1.0 is a necessity condition
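
As a rough illustration of what a necessity check measures, the sketch below uses the standard crisp-set definition: a condition value counts as necessary when it is present in at least consistency_threshold of the cases that show the outcome. The helper names and the toy cases are hypothetical, and the exact test that scpQCA applies is assumed rather than taken from its source.

```python
def necessity_consistency(cases, condition, value, outcome, decision_label=1):
    """Share of outcome cases in which `condition` takes `value`."""
    positives = [c for c in cases if c[outcome] == decision_label]
    if not positives:
        return 0.0
    return sum(1 for c in positives if c[condition] == value) / len(positives)


def search_necessary(cases, conditions, outcome, threshold, decision_label=1):
    """List every 'condition==value' whose necessity consistency meets the threshold."""
    found = []
    for condition in conditions:
        for value in sorted({c[condition] for c in cases}):
            score = necessity_consistency(cases, condition, value, outcome, decision_label)
            if score >= threshold:
                found.append(f"{condition}=={value}")
    return found


# Hypothetical calibrated cases: B is present in all five outcome cases, C in four.
cases = [
    {"B": 1, "C": 1, "F": 1},
    {"B": 1, "C": 0, "F": 1},
    {"B": 1, "C": 1, "F": 1},
    {"B": 1, "C": 1, "F": 1},
    {"B": 1, "C": 1, "F": 1},
    {"B": 0, "C": 0, "F": 0},
]
print(search_necessary(cases, ["B", "C"], "F", threshold=0.75))
```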

`candidate_rules` (decision_label: unique, feature_list: list of column names, consistency: float, cutoff: int)

Column names in feature_list should not contain symbols or blank spaces, although '_' in the middle of a name is allowed. feature_list may contain decision_name, caseid, or neither.

Pay attention to the parameter consistency, which usually has a lower limit of 0.75, and the parameter cutoff, which usually has a lower limit of 2.

rules=obj.candidate_rules(decision_label=1, feature_list=feature_list, consistency=0.8,cutoff=1)
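
To make the roles of consistency and cutoff concrete, here is a brute-force sketch of a sufficiency search: it enumerates every combination of condition values, keeps those observed in at least cutoff cases, and then keeps those whose consistency reaches the threshold. This is only an assumed reading of the two parameters; scpQCA's actual search strategy may differ, and the function and data below are hypothetical.

```python
from itertools import combinations, product


def candidate_rules_sketch(cases, conditions, outcome, consistency, cutoff, decision_label=1):
    """Enumerate condition-value combinations that pass the cutoff and consistency filters."""
    rules = []
    for size in range(1, len(conditions) + 1):
        for subset in combinations(conditions, size):
            # All observed values for each condition in this subset.
            value_sets = [sorted({c[name] for c in cases}) for name in subset]
            for values in product(*value_sets):
                rule = dict(zip(subset, values))
                matching = [c for c in cases if all(c[k] == v for k, v in rule.items())]
                # Frequency filter: skip combinations seen in too few cases.
                if not matching or len(matching) < cutoff:
                    continue
                hits = sum(1 for c in matching if c[outcome] == decision_label)
                # Consistency filter: share of matching cases that show the outcome.
                if hits / len(matching) >= consistency:
                    rules.append(rule)
    return rules


# Hypothetical calibrated cases.
cases = [
    {"A": 1, "B": 0, "F": 1},
    {"A": 1, "B": 0, "F": 1},
    {"A": 1, "B": 1, "F": 1},
    {"A": 0, "B": 1, "F": 0},
    {"A": 0, "B": 0, "F": 0},
]
print(candidate_rules_sketch(cases, ["A", "B"], "F", consistency=0.8, cutoff=2))
```

Raising cutoff removes rules that rest on very few cases, while raising consistency removes rules with contradicting cases.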

`greedy` (rules: list of candidate rules, decision_label: unique, unique_cover: int)

The rules argument is the output of candidate_rules.

Pay attention to the parameter unique_cover: it should be set smaller than cutoff in candidate_rules, and it has a large impact on the final solution.

configuration,issue_set=obj.greedy(rules=rules,decision_label=1,unique_cover=2)
print(configuration)
print(issue_set)

###
A==0.0 is a necessity condition
Running...please wait. There are 16 factor combinations.
There are 27 candidate rules in total.
['B==0.0 & A==0.0', 'D==1.0 & A==0.0', 'D==0.0 & C==0.0 & A==0.0']
{5, 8, 10, 12, 13, 17, 20, 22, 23, 24, 26, 28}
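
One way to picture the effect of unique_cover is a greedy set cover: repeatedly pick the rule that covers the most cases not yet covered, and stop once the best remaining rule adds fewer than unique_cover new cases. The sketch below implements that idea under the assumption that this is roughly what greedy does; the function name, the rule strings and the case ids are hypothetical.

```python
def greedy_cover(rule_cases, unique_cover):
    """Pick rules greedily by the number of newly covered cases.

    rule_cases maps a rule string to the set of case ids it covers.
    Returns the chosen rules and the set of covered case ids.
    """
    chosen, covered = [], set()
    remaining = dict(rule_cases)
    while remaining:
        # Rule with the largest number of not-yet-covered cases (first one wins ties).
        best = max(remaining, key=lambda rule: len(remaining[rule] - covered))
        gain = remaining[best] - covered
        if len(gain) < unique_cover:
            break  # no remaining rule contributes enough unique cases
        chosen.append(best)
        covered |= gain
        del remaining[best]
    return chosen, covered


# Hypothetical candidate rules and the cases each one covers.
rule_cases = {
    "A==1.0": {1, 2, 3, 4},
    "B==0.0": {3, 4, 5, 6},
    "C==1.0 & D==0.0": {1, 5},
}
print(greedy_cover(rule_cases, unique_cover=2))
```

With unique_cover=2 the third rule is dropped because it adds no new cases; a larger unique_cover would produce a shorter solution that covers fewer cases.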

`cov_n_con` (decision_label: unique, configuration: list of candidate rules, issue_sets: set of caseid)

configuration and issue_sets are the values returned by greedy.

obj.cov_n_con(decision_label=1, configuration=configuration,issue_sets=issue_set)

OUTPUT:

###
consistency = 0.6 and coverage = 0.7058823529411765
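
The solution-level figures can be understood as the same two ratios applied to the union of all rules in the configuration. The sketch below parses rule strings in the printed format ('B==0.0 & A==0.0') and computes them on toy data; the helpers `parse_rule` and `solution_metrics` and the example cases are hypothetical, and the assumption that issue_sets holds the covered outcome cases is inferred from the printed output rather than confirmed by the package.

```python
def parse_rule(rule):
    """Turn 'B==0.0 & A==0.0' into {'B': 0.0, 'A': 0.0}."""
    conditions = {}
    for term in rule.split("&"):
        name, value = term.strip().split("==")
        conditions[name.strip()] = float(value)
    return conditions


def solution_metrics(cases, configuration, outcome, caseid, decision_label=1):
    """Return (consistency, coverage, covered outcome case ids) for a set of rules."""
    parsed = [parse_rule(rule) for rule in configuration]
    # A case is covered when it satisfies at least one rule.
    covered = [
        c for c in cases
        if any(all(c[k] == v for k, v in rule.items()) for rule in parsed)
    ]
    positives = [c for c in cases if c[outcome] == decision_label]
    issue_set = {c[caseid] for c in covered if c[outcome] == decision_label}
    consistency = len(issue_set) / len(covered) if covered else 0.0
    coverage = len(issue_set) / len(positives) if positives else 0.0
    return consistency, coverage, issue_set


# Hypothetical calibrated cases with unique case ids.
cases = [
    {"cases": 1, "A": 0.0, "B": 0.0, "D": 0.0, "F": 1},
    {"cases": 2, "A": 0.0, "B": 1.0, "D": 1.0, "F": 1},
    {"cases": 3, "A": 0.0, "B": 0.0, "D": 1.0, "F": 0},
    {"cases": 4, "A": 1.0, "B": 0.0, "D": 1.0, "F": 1},
    {"cases": 5, "A": 1.0, "B": 1.0, "D": 0.0, "F": 0},
]
configuration = ["B==0.0 & A==0.0", "D==1.0 & A==0.0"]
print(solution_metrics(cases, configuration, "F", "cases"))
```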

`runQCA` (decision_label: unique, feature_list: list of column names, necessary_consistency: list, sufficiency_consistency: list, cutoff: list, rule_length: int, unique_cover: list)

Alternatively, we recommend the more convenient runQCA function, which searches for the best parameters.

data=[[random.randint(0,100) for _ in range(6)] for _ in range(30)]
data=pd.DataFrame(data)
data.columns=['A','B','C','D','F','cases']
obj=scpQCA.scpQCA(data,decision_name='F',caseid='cases')

feature_list=['A','B','C','D','F','cases']
obj.indirect_calibration(feature_list,2,100,0)

configuration,issue_set=obj.runQCA(decision_label=1, feature_list=feature_list, necessary_consistency=[0.8,0.9],sufficiency_consistency=[0.75,0.8],cutoff=[1,2],rule_length=5,unique_cover=[1])

print(configuration)
print(issue_set)
print(obj.cov_n_con(decision_label=1, configuration=configuration,issue_sets=issue_set))

OUTPUT:

###
Running...please wait. There are 16 factor combinations.
There are 20 candidate rules in total.
processing the simplification with para: necessary consistency=0.8, sufficiency consistency=0.75, cutoff=1, unique cover=1
consistency = 0.7894736842105263 and coverage = 0.9375
processing the simplification with para: necessary consistency=0.8, sufficiency consistency=0.75, cutoff=2, unique cover=1
consistency = 0.7894736842105263 and coverage = 0.9375
processing the simplification with para: necessary consistency=0.8, sufficiency consistency=0.8, cutoff=1, unique cover=1
consistency = 0.8666666666666667 and coverage = 0.8125
processing the simplification with para: necessary consistency=0.8, sufficiency consistency=0.8, cutoff=2, unique cover=1
consistency = 0.8666666666666667 and coverage = 0.8125
processing the simplification with para: necessary consistency=0.9, sufficiency consistency=0.75, cutoff=1, unique cover=1
consistency = 0.7894736842105263 and coverage = 0.9375
processing the simplification with para: necessary consistency=0.9, sufficiency consistency=0.75, cutoff=2, unique cover=1
consistency = 0.7894736842105263 and coverage = 0.9375
processing the simplification with para: necessary consistency=0.9, sufficiency consistency=0.8, cutoff=1, unique cover=1
consistency = 0.8666666666666667 and coverage = 0.8125
processing the simplification with para: necessary consistency=0.9, sufficiency consistency=0.8, cutoff=2, unique cover=1
consistency = 0.8666666666666667 and coverage = 0.8125
The best opt parameter of scpQCA is: necessary consistency=0.8, sufficiency consistency=0.75, cutoff=1, unique cover=1
['C==0.0 & B==0.0', 'D==0.0 & A==1.0', 'C==1.0 & B==1.0 & A==1.0', 'D==0.0 & C==1.0 & B==1.0', 'D==1.0 & C==0.0 & A==0.0']
{1, 4, 7, 8, 9, 10, 11, 14, 15, 17, 20, 25, 26, 28, 29}

The arguments necessary_consistency, sufficiency_consistency, cutoff and unique_cover are lists. The function evaluates the parameter combinations and returns the best one.
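
The parameter search can be pictured as a plain grid search over the four lists, in the same order as the log above. In the sketch below the scoring rule (consistency multiplied by coverage, first best combination kept on ties) is an assumption, since the package does not document how it ranks combinations, and `evaluate` is a stand-in that simply replays the consistency and coverage values printed in the log instead of running a real analysis.

```python
from itertools import product


def grid_search(necessary, sufficiency, cutoffs, unique_covers, evaluate):
    """Try every parameter combination and keep the best-scoring one."""
    best_params, best_score = None, float("-inf")
    for params in product(necessary, sufficiency, cutoffs, unique_covers):
        consistency, coverage = evaluate(*params)
        score = consistency * coverage  # assumed ranking criterion
        if score > best_score:          # strict comparison keeps the first best
            best_params, best_score = params, score
    return best_params, best_score


def evaluate(necessary_consistency, sufficiency_consistency, cutoff, unique_cover):
    """Stand-in for a full run: replays the values shown in the log above."""
    logged = {
        0.75: (0.7894736842105263, 0.9375),
        0.8: (0.8666666666666667, 0.8125),
    }
    return logged[sufficiency_consistency]


best_params, best_score = grid_search([0.8, 0.9], [0.75, 0.8], [1, 2], [1], evaluate)
print(best_params)
```

Under this assumed criterion the search settles on the same combination that the log reports as best (necessary consistency 0.8, sufficiency consistency 0.75, cutoff 1, unique cover 1).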

Release history

0.1.10 (yanked): Mar 24, 2023
0.1.8: Feb 2, 2023
0.1.7 (yanked): Jan 10, 2023
0.1.6 (yanked, this version): Jan 10, 2023
0.1.5 (yanked): Jan 10, 2023
0.1.4 (yanked): Jan 9, 2023
0.1.3 (yanked): Jan 9, 2023
0.1.2 (yanked): Oct 26, 2022
0.1.1 (yanked): Oct 14, 2022
0.1.0 (yanked): Oct 14, 2022
0.0.4 (yanked): Jul 5, 2022
0.0.3 (yanked): Jul 4, 2022
0.0.2 (yanked): Jul 1, 2022
0.0.1 (yanked): Feb 8, 2022

Download files

Source Distribution

scpQCA-0.1.6.tar.gz (11.5 kB view details)

Uploaded Jan 10, 2023 Source

Built Distribution

scpQCA-0.1.6-py3-none-any.whl (9.7 kB view details)

Uploaded Jan 10, 2023 Python 3

File details

Details for the file scpQCA-0.1.6.tar.gz.

File metadata

Download URL: scpQCA-0.1.6.tar.gz
Upload date: Jan 10, 2023
Size: 11.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.1 CPython/3.10.6

File hashes

Hashes for scpQCA-0.1.6.tar.gz
Algorithm	Hash digest
SHA256	`f9152773807e798e7e17d1c1ad614fa27a2fa1f675df1ae2f27dda02a05a5587`
MD5	`51a8a8375b906ab37ab273466d2e0c61`
BLAKE2b-256	`83660528d63bb9e432cb72223705f472e26d1c52a73a2178fcbc8079a91be23d`

File details

Details for the file scpQCA-0.1.6-py3-none-any.whl.

File metadata

Download URL: scpQCA-0.1.6-py3-none-any.whl
Upload date: Jan 10, 2023
Size: 9.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.1 CPython/3.10.6

File hashes

Hashes for scpQCA-0.1.6-py3-none-any.whl
Algorithm	Hash digest
SHA256	`407860b4a241b3ab502bae41c453c6961dbbc2d9cc87621c00c56c469f05adcc`
MD5	`ae7dce861cd2024052833c5c296f7b7d`
BLAKE2b-256	`e3aaea36b490b12695de671cfd4814ad7bec0e2d04445cbb7e7dce969b18a466`
