Self-supervised learning and graph neural network based Parkinson's disease diagnosis clinical tool
Project description
ChasPeePro Project
Installation
Usage
python chaspeepro/chaspeepro/speechsegment.py --segment 1 --dataset "PCGITA"
--segment 1: Segments the audio files into 500 ms chunks.
--dataset "PCGITA": Specifies the dataset to use ("PCGITA" or "MoSpeeDI").
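The 500 ms segmentation step can be sketched as follows. This is an illustrative numpy-only sketch, not the repo's speechsegment.py (which may handle the trailing partial chunk or resampling differently):

```python
import numpy as np

def segment_signal(signal, sr, chunk_ms=500):
    """Split a 1-D audio signal into fixed-length chunks.

    Illustrative sketch only; the trailing partial chunk is dropped here,
    which may not match the repo's behavior.
    """
    chunk_len = int(sr * chunk_ms / 1000)   # samples per chunk (8000 at 16 kHz)
    n_chunks = len(signal) // chunk_len     # number of complete chunks
    return [signal[i * chunk_len:(i + 1) * chunk_len] for i in range(n_chunks)]

# A 1.2 s signal at 16 kHz yields two complete 500 ms chunks of 8000 samples.
sig = np.zeros(int(1.2 * 16000))
chunks = segment_signal(sig, sr=16000)
print(len(chunks), len(chunks[0]))  # 2 8000
```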
Feature Extraction
python chaspeepro/chaspeepro/wav2features.py --feature_type "specg" --segment 1 --dataset "PCGITA"
Noise can also be added at different SNRs via the --snr and --noise parameters, e.g., --noise "STRAFFIC".
--feature_type: "mfccs", "specg" (spectrogram), "melspecg", "w2v2", "w2v2_large", "vad_xlrs53", "w2v2_xlrs53", or "opensmile"
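For the "specg" option, feature extraction amounts to computing a log-magnitude spectrogram per segment. A minimal STFT sketch in plain numpy, assuming 16 kHz audio and a 512-sample FFT; wav2features.py may use librosa/torchaudio with different windowing and defaults:

```python
import numpy as np

def specgram(signal, n_fft=512, hop=256):
    """Log-magnitude spectrogram via a plain windowed STFT (illustrative)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1))   # (frames, n_fft//2 + 1)
    return np.log(mag + 1e-8).T                 # (freq bins, frames)

# A 500 ms segment at 16 kHz (8000 samples) gives 257 bins x 30 frames.
sig = np.sin(2 * np.pi * 440 * np.arange(8000) / 16000)
S = specgram(sig)
print(S.shape)  # (257, 30)
```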
Create split folds with 5 seeds and 10 fold IDs based on Mahdi's split "split_fold.pkl"
python chaspeepro/chaspeepro/createKfold.py --K 10 --feature_type "sparsity" --dataset 'PCGITA' --segment 1 --smode "mono"
--smode: "mono" or "readspeech"
--feature_type: same as above ones
The output will be saved under the base_directory + "PC-GITA-New-SegmentedSpeech" directory.
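Fold creation with multiple seeds can be sketched as below. This is a hypothetical speaker-level round-robin split for illustration only; createKfold.py instead loads the precomputed "split_fold.pkl", and the seed values here are placeholders:

```python
import numpy as np

def make_kfolds(speaker_ids, K=10, seeds=(0, 1, 2, 3, 4)):
    """Build one set of K disjoint speaker folds per random seed.

    Illustrative sketch; the repo uses a precomputed split file instead.
    """
    splits = {}
    for seed in seeds:
        rng = np.random.default_rng(seed)
        order = rng.permutation(speaker_ids)
        # Round-robin assignment: fold i takes every K-th speaker.
        splits[seed] = [order[i::K].tolist() for i in range(K)]
    return splits

splits = make_kfolds([f"spk{i:02d}" for i in range(50)], K=10)
# 5 seeds, each with 10 disjoint folds that together cover all 50 speakers.
```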
EUSIPCO paper:
Classical machine learning models
python chaspeepro/chaspeepro/classicalmodels.py --dataset "PCGITA" --segment 0 --smode "mono" --feature_type "opensmile"
FC layer + w2v2 embeddings command
Requires installing gridtk first for submitting jobs on GPU.
gridtk submit --partition gpu --gpus 1 --time 24:00:00 --mem 25G --job-name "w2v2xlrs53_pcg_mono_seg_l_1" --- /idiap/temp/ssheikh/miniconda3/envs/hf/bin/python chaspeepro/chaspeepro/main.py --dataset "PCGITA" --modelname "w2v2_xlrs53" --feature_type "w2v2_xlrs53" --segment 1 --smode "monologue" --bs 32 --layer_num 0
--bs: batch size
--modelname: "w2v2_xlrs53" (essentially an FC layer over the embeddings), "CNNBaseLine", or "AutoEncoder"
--segment: segmented (1) or full (0); use 0 only with the FC layer, and 1 with the AutoEncoder and CNN models.
--smode: "monologue" or "SRT" (non-spontaneous speech)
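The FC-layer model above maps pooled wav2vec 2.0 embeddings directly to class probabilities. A minimal numpy sketch, assuming 1024-dimensional XLSR-53 embeddings and binary (PD vs. healthy control) output; the actual model in main.py may add pooling choices, dropout, etc.:

```python
import numpy as np

rng = np.random.default_rng(0)

class FCHead:
    """Single fully connected layer over pooled embeddings (illustrative;
    the 1024-dim input is an assumption for XLSR-53 hidden states)."""
    def __init__(self, in_dim=1024, n_classes=2):
        self.W = rng.normal(0, 0.01, (in_dim, n_classes))
        self.b = np.zeros(n_classes)

    def __call__(self, x):
        logits = x @ self.W + self.b
        # Softmax over classes, stabilized by subtracting the row max.
        e = np.exp(logits - logits.max(axis=1, keepdims=True))
        return e / e.sum(axis=1, keepdims=True)

batch = rng.normal(size=(32, 1024))   # batch size 32, as in the --bs 32 command
probs = FCHead()(batch)
print(probs.shape)  # (32, 2)
```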
ICASSP: GNN for Parkinson's Disease (Best performing models)
Baseline KNN:
gridtk submit --partition gpu --gpus 1 --time 24:00:00 --mem 25G --job-name srt_knn_cosine_seg_neigh_5 --- /idiap/temp/ssheikh/miniconda3/envs/pytorch/bin/python chaspeepro/chaspeepro/classicalmodels.py --dataset "PCGITA" --segment 1 --smode "SRT" --feature_type "w2v2_xlrs53" --voting "major" --model "knn" --k 5 --dtype "cosine"
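The baseline combines segment-level k-NN (cosine distance) with majority voting, as selected by --voting "major". A minimal numpy sketch of that pipeline, not the exact classicalmodels.py implementation:

```python
import numpy as np

def cosine_dist(a, b):
    """Pairwise cosine distance between rows of a and rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return 1.0 - a @ b.T

def knn_majority(train_x, train_y, seg_x, k=5):
    """Classify each segment with k-NN, then majority-vote the segment
    predictions into one speaker-level label (illustrative sketch)."""
    d = cosine_dist(seg_x, train_x)                  # (segments, train)
    nn = np.argsort(d, axis=1)[:, :k]                # k nearest per segment
    seg_pred = [np.bincount(train_y[row]).argmax() for row in nn]
    return np.bincount(seg_pred).argmax()            # majority vote

# Toy example: two of three segments fall near the class-1 cluster.
train_x = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
train_y = np.array([0, 0, 1, 1])
segments = np.array([[0.0, 1.0], [0.2, 0.8], [1.0, 0.0]])
print(knn_majority(train_x, train_y, segments, k=1))  # 1
```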
GCN
SRT (non-spontaneous speech)
gridtk submit --partition gpu --gpus 1 --time 24:00:00 --mem 25G --job-name t_nodegcn_pcg_srt_seg_el0_ew_0_oGnExp_manhattan_knn_3_xlrs53_nl3_lr1e-4 --- /idiap/temp/ssheikh/miniconda3/envs/pytorch/bin/python chaspeepro/chaspeepro/dist_graph.py --feature_type "xlrs53" --adj_type "manhattan" --graph_type "temporal" --smode "SRT" --dataset "PCGITA" --segment 1 --neighs 3 --conv_type "kipfconv" --num_layers 3 --w2v2_layernum 0 --ew 0 --seed 48 --lr 1e-4
Monologue (spontaneous speech)
gridtk submit --partition gpu --gpus 1 --time 24:00:00 --mem 25G --job-name t_nodegcn_pcg_mono_seg_el0_ew_0_oGnExp_cosine_knn_3_xlrs_nl3_lr1e-4 --- /idiap/temp/ssheikh/miniconda3/envs/pytorch/bin/python chaspeepro/chaspeepro/dist_graph.py --feature_type "xlrs53" --adj_type "cosine" --graph_type "temporal" --smode "monologue" --dataset "PCGITA" --segment 1 --neighs 3 --conv_type "kipfconv" --num_layers 3 --w2v2_layernum 0 --ew 0 --seed 48 --lr 1e-4
--feature_type "xlrs53": Specifies the feature type as xlrs53.
--adj_type "manhattan": Uses Manhattan distance to build the adjacency matrix ("cosine" is also supported, as in the monologue command).
--graph_type "temporal": Sets the graph type to temporal (not required; this was used for graph classification rather than node classification).
--smode "SRT": Specifies the mode as "SRT" or "monologue".
--dataset "PCGITA": Uses the PCGITA dataset.
--segment 1: Indicates the segmented speech samples.
--neighs 3: Sets the number of neighbors.
--conv_type "kipfconv": Specifies Kipf-style graph convolution.
--num_layers 3: Uses 3 layers in the GCN model.
--w2v2_layernum 0: Selects layer 0 of Wav2Vec 2.0.
--ew 0: Disables edge weights.
--seed 48: Sets the random seed to 48.
--lr 1e-4: Configures the learning rate to 1e-4.
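The flags above describe a k-NN graph built from distances between segment embeddings, followed by Kipf-style graph convolutions. A minimal numpy sketch of both steps, assuming binary (unweighted) edges as with --ew 0; dist_graph.py's actual implementation (e.g., in PyTorch Geometric) will differ:

```python
import numpy as np

def knn_adjacency(x, k=3, metric="manhattan"):
    """Symmetrized k-NN adjacency over node features, binary edges."""
    if metric == "manhattan":
        d = np.abs(x[:, None, :] - x[None, :, :]).sum(-1)
    else:  # cosine distance
        xn = x / np.linalg.norm(x, axis=1, keepdims=True)
        d = 1.0 - xn @ xn.T
    np.fill_diagonal(d, np.inf)                      # no self-neighbors
    A = np.zeros_like(d)
    rows = np.arange(len(x))[:, None]
    A[rows, np.argsort(d, axis=1)[:, :k]] = 1.0      # connect k nearest
    return np.maximum(A, A.T)                        # symmetrize

def kipf_gcn_layer(A, H, W):
    """One Kipf & Welling GCN layer: ReLU(D^-1/2 (A+I) D^-1/2 H W)."""
    A_hat = A + np.eye(len(A))                       # add self-loops
    d = 1.0 / np.sqrt(A_hat.sum(1))
    A_norm = A_hat * d[:, None] * d[None, :]         # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)

rng = np.random.default_rng(48)
X = rng.normal(size=(10, 16))                        # 10 segment nodes
A = knn_adjacency(X, k=3)                            # as with --neighs 3
H = kipf_gcn_layer(A, X, rng.normal(size=(16, 8)))
print(A.shape, H.shape)  # (10, 10) (10, 8)
```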