Parameters and Attributes

Parameter tuning

For a more detailed explanation of the impact of tuning key parameters, please see the Supplementary Analysis in our paper (PARC Supplementary Analysis).

Parameters
Input parameter	Description
data	(numpy.ndarray) num samples x num features
true_label	(numpy.ndarray) (optional)
dist_std_local	(optional, default = 2) local pruning threshold: the higher the parameter, the more edges are retained
jac_std_global	(optional, default = 'median') global graph-pruning threshold. It can also be set to a number of standard deviations below the mean of the network's Jaccard-weighted edge weights. Values of 0.1-1 provide reasonable pruning, and a higher value means less pruning. For example, a value of 0.15 retains all edges whose weight is above mean(edge weights) - 0.15*std(edge weights). We find that both 0.15 and 'median' yield good results, pruning away ~50-60% of edges
random_seed	(optional, default = 42) The random seed to pass to Leiden
resolution_parameter	(optional, default = 1) Uses ModularityVertexPartition and RBConfigurationVertexPartition
jac_weighted_edges	(optional, default = True) Uses the Jaccard-weighted pruned graph as input to community detection. For very large datasets, set this to False for a speed-up with little impact on accuracy
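
The global pruning rule described for jac_std_global can be illustrated with a small sketch. This is not PARC's internal code: the function name `global_prune` and the plain list of edge weights are hypothetical, the population standard deviation is assumed, and treating 'median' as "keep edges above the median weight" is an assumption based on the parameter description.

```python
from statistics import mean, median, pstdev


def global_prune(edge_weights, jac_std_global="median"):
    """Illustrative global pruning: return (threshold, retained weights).

    Hypothetical helper, not part of the PARC API.
    """
    if jac_std_global == "median":
        # Assumption: 'median' uses the median edge weight as the cutoff.
        threshold = median(edge_weights)
    else:
        # Cutoff lies jac_std_global standard deviations below the mean.
        threshold = mean(edge_weights) - jac_std_global * pstdev(edge_weights)
    # Edges strictly above the cutoff are retained.
    kept = [w for w in edge_weights if w > threshold]
    return threshold, kept


# Toy Jaccard weights, for illustration only.
weights = [0.1, 0.2, 0.3, 0.4, 0.5]
_, kept_tight = global_prune(weights, jac_std_global=0.15)
_, kept_loose = global_prune(weights, jac_std_global=1.0)
_, kept_median = global_prune(weights, jac_std_global="median")
```

On the toy weights, raising jac_std_global lowers the cutoff, so more edges survive, which matches "higher value means less pruning".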

Attributes
Attributes	Description
labels	(list) length n_samples of corresponding cluster labels
f1_mean	(list) F1 score (not weighted by population). For details, see the supplementary section of the paper
stats_df	(DataFrame) stores parameter values and performance metrics
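
To make "not weighted by population" concrete, the sketch below computes a per-class F1 score and then takes a plain average, so that small and large populations count equally. It is only an illustration: the helper names are hypothetical, the predicted labels are assumed to be already expressed in the same label space as the true labels, and it does not reproduce how PARC derives f1_mean internally.

```python
def f1_per_class(true_labels, pred_labels):
    """Return {class: F1} using one-vs-rest counts for each true class."""
    pairs = list(zip(true_labels, pred_labels))
    scores = {}
    for c in sorted(set(true_labels)):
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def unweighted_f1_mean(true_labels, pred_labels):
    """Average per-class F1 without weighting by class population."""
    scores = f1_per_class(true_labels, pred_labels)
    return sum(scores.values()) / len(scores)


# Toy labels: class 0 has four samples, class 1 has two.
true = [0, 0, 0, 0, 1, 1]
pred = [0, 0, 0, 1, 1, 1]
per_class = f1_per_class(true, pred)
f1_unweighted = unweighted_f1_mean(true, pred)
```

With an unweighted mean, a rare population that is clustered poorly lowers the score as much as a common one would, which is why this metric is useful for datasets with rare cell types or classes.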