Parameters and Attributes

Parameter tuning

For a more detailed explanation of the impact of tuning key parameters, please see the Supplementary Analysis in our paper: PARC Supplementary Analysis.

| Input parameter | Description |
| --- | --- |
| data | (numpy.ndarray) n_samples x n_features |
| true_label | (numpy.ndarray) (optional) |
| dist_std_local | (optional, default = 2) Local pruning threshold: the higher the value, the more edges are retained |
| jac_std_global | (optional, default = 'median') Global graph pruning threshold. It can also be set to a number of standard deviations below the mean of the network's Jaccard-weighted edges. Values of 0.1-1 provide reasonable pruning, and a higher value means less pruning. For example, a value of 0.15 retains all edges whose weight is above `mean(edge_weights) - 0.15 * std(edge_weights)`. We find that both 0.15 and 'median' yield good results, pruning away ~50-60% of edges |
| random_seed | (optional, default = 42) The random seed to pass to Leiden |
| resolution_parameter | (optional, default = 1) Uses ModularityVP and RBConfigurationVertexPartition |
| jac_weighted_edges | (optional, default = True) Uses the Jaccard-weighted pruned graph as input to community detection. For very large datasets, set this to False to observe a speed-up with little impact on accuracy |
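
The global pruning rule described for jac_std_global can be illustrated with a small NumPy sketch. This is not PARC's internal code: the helper name `global_prune_mask` and the toy edge weights are assumptions made for illustration, and the sketch only applies the threshold as stated above (keep edges above the median, or above `mean - jac_std_global * std`).

```python
import numpy as np

def global_prune_mask(edge_weights, jac_std_global="median"):
    """Return a boolean mask of edges to keep.

    Hypothetical helper for illustration only, not part of the PARC API.
    """
    w = np.asarray(edge_weights, dtype=float)
    if jac_std_global == "median":
        # Keep edges whose weight is above the median edge weight.
        threshold = np.median(w)
    else:
        # Keep edges above mean - k * std, where k = jac_std_global.
        threshold = w.mean() - jac_std_global * w.std()
    return w > threshold

# Toy Jaccard edge weights (made-up values).
weights = np.array([0.05, 0.10, 0.20, 0.40, 0.80])

keep_median = global_prune_mask(weights, "median")
keep_tight = global_prune_mask(weights, 0.15)
keep_loose = global_prune_mask(weights, 1.0)

print(keep_median.sum(), keep_tight.sum(), keep_loose.sum())
```

On this toy input, raising jac_std_global from 0.15 to 1.0 lowers the threshold and so retains more edges, which matches the "higher value means less pruning" behaviour described in the table.
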
| Attributes | Description |
| --- | --- |
| labels | (list) length n_samples of corresponding cluster labels |
| f1_mean | (list) F1 score (not weighted by population). For details see the supplementary section of the paper |
| stats_df | (DataFrame) stores parameter values and performance metrics |
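
To make the phrase "not weighted by population" concrete, the sketch below contrasts an unweighted mean of per-population F1 scores with a population-weighted one. It is only an illustration under stated assumptions: the helper `per_class_f1` and the toy labels are hypothetical, and PARC's actual procedure (including how clusters are matched to true labels) is described in the paper's supplementary section, not reproduced here.

```python
import numpy as np

def per_class_f1(true_labels, pred_labels):
    """F1 score for each class in true_labels (hypothetical helper)."""
    true_labels = np.asarray(true_labels)
    pred_labels = np.asarray(pred_labels)
    scores = {}
    for cls in np.unique(true_labels):
        tp = np.sum((pred_labels == cls) & (true_labels == cls))
        fp = np.sum((pred_labels == cls) & (true_labels != cls))
        fn = np.sum((pred_labels != cls) & (true_labels == cls))
        denom = 2 * tp + fp + fn
        scores[cls] = 2 * tp / denom if denom else 0.0
    return scores

# Toy example: a large population (8 samples) and a rare one (2 samples).
true = np.array([0] * 8 + [1] * 2)
pred = np.array([0] * 8 + [0, 1])

scores = per_class_f1(true, pred)
values = np.array(list(scores.values()))
counts = np.array([np.sum(true == cls) for cls in scores])

# Unweighted mean: every population counts equally, however small.
unweighted_mean = values.mean()
# Population-weighted mean: large populations dominate the score.
weighted_mean = np.average(values, weights=counts)

print(unweighted_mean, weighted_mean)
```

In this toy case the rare population is half misclassified, so the unweighted mean is lower than the weighted one; an unweighted score is therefore more sensitive to errors on small populations.
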