Package: metasnf 2.3.1

Prashanth S Velayudhan

metasnf: Meta Clustering with Similarity Network Fusion

Framework to facilitate patient subtyping with similarity network fusion and meta clustering. The similarity network fusion (SNF) algorithm was introduced by Wang et al. (2014) in <doi:10.1038/nmeth.2810>. SNF is a data integration approach that can transform high-dimensional and diverse data types into a single similarity network suitable for clustering with minimal loss of information from each initial data source. The meta clustering approach was introduced by Caruana et al. (2006) in <doi:10.1109/ICDM.2006.103>. Meta clustering involves generating a wide range of cluster solutions by adjusting clustering hyperparameters, then clustering the solutions themselves into a manageable number of qualitatively similar solutions, and finally characterizing representative solutions to find ones that are best for the user's specific context. This package provides a framework to easily transform multi-modal data into a wide range of similarity network fusion-derived cluster solutions as well as to visualize, characterize, and validate those solutions. Core package functionality includes easy customization of distance metrics, clustering algorithms, and SNF hyperparameters to generate diverse clustering solutions; calculation and plotting of associations between features, between patients, and between cluster solutions; and standard cluster validation approaches including resampled measures of cluster stability, standard metrics of cluster quality, and label propagation to evaluate generalizability in unseen data. Associated vignettes guide the user through using the package to identify patient subtypes while adhering to best practices for unsupervised learning.
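
The meta clustering workflow described above can be illustrated with a minimal base R sketch. It is not the metasnf implementation: plain hierarchical clustering stands in for SNF, the toy matrix `x` is simulated, and the helper `adjusted_rand()` and all variable names are hypothetical, introduced here only for illustration. The sketch generates nine cluster solutions by varying two hyperparameters, measures agreement between every pair of solutions with the adjusted Rand index, and then clusters the solutions themselves into three meta clusters.

```r
# Toy sketch of meta clustering in base R (assumption: NOT metasnf's own code).
set.seed(1)

# Simulated data: two groups of 10 observations with 2 features each.
x <- rbind(
    matrix(rnorm(20, mean = 0), ncol = 2),
    matrix(rnorm(20, mean = 5), ncol = 2)
)

# Hypothetical helper: adjusted Rand index between two label vectors.
adjusted_rand <- function(a, b) {
    tab <- table(a, b)
    n_pairs <- function(counts) sum(choose(counts, 2))
    index <- n_pairs(tab)
    row_pairs <- n_pairs(rowSums(tab))
    col_pairs <- n_pairs(colSums(tab))
    expected <- row_pairs * col_pairs / choose(length(a), 2)
    maximum <- (row_pairs + col_pairs) / 2
    # Degenerate case: both partitions are trivial, so agreement is perfect.
    if (maximum == expected) return(1)
    (index - expected) / (maximum - expected)
}

# Step 1: generate a range of cluster solutions by varying hyperparameters
# (here the linkage method and the number of clusters).
grid <- expand.grid(
    method = c("complete", "average", "single"),
    k = 2:4,
    stringsAsFactors = FALSE
)
d <- dist(x)
solutions <- lapply(seq_len(nrow(grid)), function(i) {
    cutree(hclust(d, method = grid$method[i]), k = grid$k[i])
})

# Step 2: pairwise similarity between solutions.
n <- length(solutions)
ari <- matrix(1, nrow = n, ncol = n)
for (i in seq_len(n)) {
    for (j in seq_len(n)) {
        ari[i, j] <- adjusted_rand(solutions[[i]], solutions[[j]])
    }
}

# Step 3: cluster the solutions themselves, using 1 - ARI as a distance.
meta_cluster <- cutree(hclust(as.dist(1 - ari), method = "average"), k = 3)

# Each row is one solution: its settings and the meta cluster it falls in.
print(cbind(grid, meta_cluster))
```

In metasnf itself, the exported functions `batch_snf`, `calc_aris`, and `get_representative_solutions` appear to cover the corresponding steps; the package vignettes document their actual usage.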

Authors: Prashanth S Velayudhan [aut, cre], Xiaoqiao Xu [aut], Prajkta Kallurkar [aut], Ana Patricia Balbon [aut], Maria T Secara [aut], Adam Taback [aut], Denise Sabac [aut], Nicholas Chan [aut], Shihao Ma [aut], Bo Wang [aut], Daniel Felsky [aut], Stephanie H Ameis [aut], Brian Cox [aut], Colin Hawco [aut], Lauren Erdman [aut], Anne L Wheeler [aut, ths]

metasnf_2.3.1.tar.gz
metasnf_2.3.1.zip (r-4.7) | metasnf_2.3.1.zip (r-4.6) | metasnf_2.3.1.zip (r-4.5)
metasnf_2.3.1.tgz (r-4.6-any) | metasnf_2.3.1.tgz (r-4.5-any)
metasnf_2.3.1.tar.gz (r-4.7-any) | metasnf_2.3.1.tar.gz (r-4.6-any)
metasnf_2.3.1.tgz (r-4.6-emscripten)
manual.pdf | manual.html
DESCRIPTION | NEWS
card.svg | card.png
metasnf/json (API)

# Install 'metasnf' in R:
install.packages('metasnf', repos = c('https://branchlab.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/branchlab/metasnf/issues

Pkgdown/docs site: https://branchlab.github.io

bioinformatics, clustering, metaclustering, snf

7.27 score | 9 stars | 31 scripts | 261 downloads | 108 exports | 39 dependencies

Last updated from: 07a0e486d3. Checks: 9 OK. Indexed: yes.

Target | Result | Time
linux-devel-x86_64 | OK | 188
source / vignettes | OK | 223
linux-release-x86_64 | OK | 173
macos-release-arm64 | OK | 121
macos-oldrel-arm64 | OK | 110
windows-devel | OK | 161
windows-release | OK | 139
windows-oldrel | OK | 136
wasm-release | OK | 130

Exports: add_settings_df_rows, adjusted_rand_index_heatmap, alluvial_cluster_plot, as_ari_matrix, as_data_list, as_settings_df, as_sim_mats_list, as_snf_config, as_weights_matrix, assemble_data, assoc_pval_heatmap, auto_plot, bar_plot, batch_snf, batch_snf_subsamples, calc_aris, calc_assoc_pval_matrix, calc_nmis, calculate_coclustering, calculate_db_indices, calculate_dunn_indices, calculate_silhouettes, cell_significance_fn, check_dataless_annotations, check_hm_dependencies, check_similarity_matrices, clust_fns_list, cocluster_density, cocluster_heatmap, collapse_dl, colour_scale, config_heatmap, data_list, dist_fns_list, dl_variable_summary, dlapply, dplyr_row_slice.ext_solutions_df, dplyr_row_slice.solutions_df, esm_manhattan_plot, estimate_nclust_given_graph, euclidean_distance, extend_solutions, features, generate_distance_metrics_list, generate_settings_matrix, get_cluster_df, get_cluster_solutions, get_clusters, get_complete_uids, get_dl_uids, get_heatmap_order, get_matrix_order, get_pvals, get_representative_solutions, gower_distance, hamming_distance, is_data_list, jitter_plot, label_meta_clusters, label_propagate, linear_adjust, linear_model_pval, mc_manhattan_plot, merge_df_list, meta_cluster_heatmap, n_features, n_observations, new_solutions_df, ord_reg_pval, pl, pval_heatmap, random_removal, rename_dl, resample, save_heatmap, settings_df, sew_euclidean_distance, shiny_annotator, sim_mats_list, similarity_matrix_heatmap, siw_euclidean_distance, sn_euclidean_distance, snf_config, spectral_eigen, spectral_eigen_classic, spectral_eight, spectral_five, spectral_four, spectral_nine, spectral_rot, spectral_rot_classic, spectral_seven, spectral_six, spectral_ten, spectral_three, spectral_two, split_parser, subsample_dl, subsample_pairwise_aris, summarize_clust_fns_list, summarize_dfl, summarize_dl, summary_features, train_test_assign, uids, validate_solutions_df, var_manhattan_plot, weights_matrix

Dependencies: alluvial, cli, cluster, cpp11, data.table, digest, dplyr, ExPosition, farver, generics, ggplot2, glue, gtable, isoband, labeling, lifecycle, magrittr, MASS, mclust, pillar, pkgconfig, prettyGraphs, progressr, purrr, R6, RColorBrewer, rlang, S7, scales, SNFtool, stringi, stringr, tibble, tidyr, tidyselect, utf8, vctrs, viridisLite, withr

A Complete Example
Data Set-up | Pre-processing | Generating the data list | Defining sets of hyperparameters to use for SNF and clustering | The settings data frame | Other parts of the SNF config | Running SNF and clustering | Identifying and visualizing meta clusters | Characterizing cluster solutions | Calculating associations between cluster solutions and initial data | Visualizing feature associations with meta clustering results | Characterizing individual solutions representative of each meta cluster | Relating results to metasnf hyperparameters | Quality measures | Stability measures | Evaluating separation across "target features" of importance | Validating results with label propagation | References

Last update: 2026-06-17
Started: 2024-05-17

Quality Measures

Last update: 2026-06-17
Started: 2024-05-14

Clustering Algorithms
Default clustering | Other built-in clustering options | Structure of a clustering algorithm function | Non-automated clustering | Example of non-automated clustering: DBSCAN

Last update: 2025-04-10
Started: 2023-11-17

Label Propagation

Last update: 2025-04-10
Started: 2024-02-02

NMI Scores

Last update: 2025-04-10
Started: 2024-05-28

Stability Measures
Data set-up

Last update: 2025-03-11
Started: 2023-11-14

The SNF Config
Creating a default SNF config | The settings data frame | The distance functions list | The clustering functions list | The weights matrix | Customizing an SNF config | Alpha, k, and t | Inclusion columns and data frame dropout | Grid searching | Assembling an SNF config in pieces | "settings_df building failed to converge"

Last update: 2025-03-05
Started: 2025-02-04

A Simple Example
The original SNF example | 1. Load the package | 2. Set SNF hyperparameters | 3. Load the data | 4. Generate similarity matrices for each data source | 5. Integrate similarity matrices with SNF | 6. Find clusters in the integrated matrix | The same example using metasnf | 2. Store the data in a data list | 3. Store all the settings of the desired SNF runs in an SNF config | 4. Run SNF | References

Last update: 2025-02-04
Started: 2023-10-23

Alluvial Plots

Last update: 2025-02-04
Started: 2023-10-29

Confounders
Accounting for confounding features | Unwanted signal | Procedure using the metasnf package | Limitations and important considerations | 1. Excessive loss of signal | 2. Lack of accounting for non-linearities | 3. Inability to adjust ordinal, discrete, or categorical data

Last update: 2025-02-04
Started: 2023-11-22

Correlation Plots
Data set-up | Heatmaps

Last update: 2025-02-04
Started: 2023-11-14

Distance Metrics
Distance functions | How the dist_fns_list is used | Removing the default distance_metrics | Supplying weights to distance metrics | Custom distance metrics | Requesting metrics | List of prewritten distance metrics functions | References

Last update: 2025-02-04
Started: 2023-10-23

Feature Plots

Last update: 2025-02-04
Started: 2024-05-29

Feature Weighting
Generating and Using the Weights Matrix | The Nitty Gritty of How Weights are Used

Last update: 2025-02-04
Started: 2023-11-14

Getting Started
Introduction | Installation

Last update: 2025-02-04
Started: 2023-10-23

Imputations

Last update: 2025-02-04
Started: 2024-05-26

Manhattan Plots
Data set-up | Associations with Multiple Cluster Solutions (esm_manhattan_plot) | Associations with Meta Clusters (mc_manhattan_plot) | Associations with a Key Feature

Last update: 2025-02-04
Started: 2023-10-31

Parallel Processing
Basic usage | Including a progress bar | Number of processes

Last update: 2025-02-04
Started: 2023-12-06

Similarity Matrices
Data set-up | Visualize similarity matrices sorted by cluster label | Annotations | More on sorting

Last update: 2025-02-04
Started: 2023-10-29

SNF Schemes
(1) "Individual" | (2) "Two-step" | (3) "Domain" | Custom SNF schemes

Last update: 2025-02-04
Started: 2023-11-29

The Data List
The data_list

Last update: 2025-02-04
Started: 2023-11-14

Troubleshooting

Last update: 2025-02-04
Started: 2023-11-14

Readme and manuals

Help Manual

Help page | Topics
Mock ABCD anxiety data | abcd_anxiety
Mock ABCD "colour" data | abcd_colour
Mock ABCD cortical surface area data | abcd_cort_sa
Mock ABCD cortical thickness data | abcd_cort_t
Mock ABCD depression data | abcd_depress
Mock ABCD income data | abcd_h_income
Mock ABCD income data | abcd_income
Mock ABCD pubertal status data | abcd_pubertal
Mock ABCD subcortical volumes data | abcd_subc_v
Add rows to a settings_df | add_settings_df_rows
Mock age data | age_df
Alluvial plot of patients across cluster counts and important features | alluvial_cluster_plot
Mock ABCD anxiety data | anxiety
Convert an object to an ARI matrix | as_ari_matrix
Convert an object to a data list | as_data_list
Convert an object to a settings data frame | as_settings_df
Convert an object to a similarity matrix list | as_sim_mats_list
Convert an object to a snf config | as_snf_config
Convert an object to a weights matrix | as_weights_matrix
Coerce a 'data_list' class object into a 'data.frame' class object | as.data.frame.data_list
Coerce a 'ext_solutions_df' class object into a 'data.frame' class object | as.data.frame.ext_solutions_df
Coerce a 'settings_df' class object into a 'data.frame' class object | as.data.frame.settings_df
Coerce a 'settings_df' class object into a 'data.frame' class object | as.data.frame.snf_config
Coerce a 'solutions_df' class object into a 'data.frame' class object | as.data.frame.solutions_df
Coerce a 't_ext_solutions_df' class object into a 'data.frame' class object | as.data.frame.t_ext_solutions_df
Coerce a 't_solutions_df' class object into a 'data.frame' class object | as.data.frame.t_solutions_df
Coerce a 'weights_matrix' class object into a 'data.frame' class object | as.data.frame.weights_matrix
Coerce a 'clust_fns_list' class object into a 'list' class object | as.list.clust_fns_list
Coerce a 'data_list' class object into a 'list' class object | as.list.data_list
Coerce a 'dist_fns_list' class object into a 'list' class object | as.list.dist_fns_list
Coerce a 'sim_mats_list' class object into a 'list' class object | as.list.sim_mats_list
Coerce a 'snf_config' class object into a 'list' class object | as.list.snf_config
Coerce a 'ari_matrix' class object into a 'matrix' class object | as.matrix.ari_matrix
Coerce a 'weights_matrix' class object into a 'matrix' class object | as.matrix.weights_matrix
Collapse a data frame and/or a data list into a single data frame | assemble_data
Heatmap of pairwise associations between features | assoc_pval_heatmap
Automatically plot features across clusters | auto_plot
Bar plot separating a feature by cluster | bar_plot
Run variations of SNF | batch_snf
Run SNF clustering pipeline on a list of subsampled data lists | batch_snf_subsamples
Cached example extended solutions data frame | cache_a_complete_example_ext_sol_df
Cached example extended solutions data frame | cache_a_complete_example_lp_ext_sol_df
Cached example solutions data frame | cache_a_complete_example_sol_df
Construct an ARI matrix storing inter-solution similarities | calc_aris
Calculate p-values for all pairwise associations of features in a data list | calc_assoc_pval_matrix
Calculate feature NMIs for a data list and a solutions data frame | calc_nmis
Calculate co-clustering data | calculate_coclustering
Mock diagnosis data | cancer_diagnosis_df
Place significance stars on ComplexHeatmap cells | cell_significance_fn
Helper function to stop annotation building when no data was provided | check_dataless_annotations
Check for ComplexHeatmap and circlize dependencies | check_hm_dependencies
Check validity of similarity matrices | check_similarity_matrices
Built-in clustering algorithms | clust_fns spectral_eigen spectral_eigen_classic spectral_eight spectral_five spectral_four spectral_nine spectral_rot spectral_rot_classic spectral_seven spectral_six spectral_ten spectral_three spectral_two
Build a clustering algorithms list | clust_fns_list
Density plot of co-clustering stability across subsampled data | cocluster_density
Heatmap of observation co-clustering across resampled data | cocluster_heatmap
Return a colour ramp for a given vector | colour_scale
Mock ABCD cortical surface area data | cort_sa
Mock ABCD cortical thickness data | cort_t
Build a 'data_list' class object | data_list
Mock ABCD depression data | depress
Mock diagnosis data | diagnosis_df
Built-in distance functions | dist_fns euclidean_distance gower_distance hamming_distance sew_euclidean_distance sn_euclidean_distance
Build a distance metrics list | dist_fns_list
Apply-like function for data list objects | dlapply
Function to extend dplyr to extended solutions data frame objects | dplyr_row_slice.ext_solutions_df
Function to extend dplyr to solutions data frame objects | dplyr_row_slice.solutions_df
Manhattan plot of feature-cluster association p-values | esm_manhattan_plot
Estimate number of clusters for a similarity matrix | estimate_nclust_given_graph
Modification of SNFtool mock data frame "Data1" | expression_df
Extend a solutions data frame to include outcome evaluations | extend_solutions
Mock ABCD "colour" data | fav_colour
Mock gender data | gender_df
Pull complete-data UIDs from a list of data frames | get_complete_uids
Return the row or column ordering present in a heatmap | get_heatmap_order
Return the hierarchical clustering order of a matrix | get_matrix_order
Get p-values from an extended solutions data frame | get_pvals
Extract representative solutions from a matrix of ARIs | get_representative_solutions
Mock ABCD income data | income
Test if the object is a data list | is_data_list
Jitter plot separating a feature by cluster | jitter_plot
Assign meta cluster labels to rows of a solutions data frame or extended solutions data frame | label_meta_clusters
Label propagate cluster solutions to non-clustered observations | label_propagate
Linearly correct data list by features with unwanted signal | linear_adjust
Manhattan plot of feature-meta cluster association p-values | mc_manhattan_plot
Merge list of data frames into a single data frame | merge_df_list
Merge 'clust_fns_list' objects | merge.clust_fns_list
Merge observations between two compatible data lists | merge.data_list
Merge 'dist_fns_list' objects | merge.dist_fns_list
Merge 'ext_solutions_df' objects | merge.ext_solutions_df
Merge 'settings_df' objects | merge.settings_df
Merge 'sim_mats_list' objects | merge.sim_mats_list
Merge method for SNF config objects | merge.snf_config
Merge 'solutions_df' objects | merge.solutions_df
Merge 't_ext_solutions_df' objects | merge.t_ext_solutions_df
Merge 't_solutions_df' objects | merge.t_solutions_df
Merge 'weights_matrix' objects | merge.weights_matrix
Modification of SNFtool mock data frame "Data2" | methylation_df
Mock example of an 'ari_matrix' metasnf object | mock_ari_matrix
Mock example of a 'clust_fns_list' metasnf object | mock_clust_fns_list
Mock example of a 'data_list' metasnf object | mock_data_list
Mock example of a 'dist_fns_list' metasnf object | mock_dist_fns_list
Mock example of a 'ext_solutions_df' metasnf object | mock_ext_solutions_df
Mock example of a 'mc_solutions_df' metasnf object | mock_mc_solutions_df
Mock example of a 'rep_solutions_df' metasnf object | mock_rep_solutions_df
Mock example of a 'settings_df' metasnf object | mock_settings_df
Mock example of a 'snf_config' metasnf object | mock_snf_config
Mock example of a 'solutions_df' metasnf object | mock_solutions_df
Mock example of a 't_solutions_df' metasnf object | mock_t_solutions_df
Mock example of a 'weights_matrix' metasnf object | mock_weights_matrix
Constructor for 'solutions_df' class object | new_solutions_df
Heatmap of pairwise adjusted rand indices between solutions | meta_cluster_heatmap plot.ari_matrix
Plot of feature values in a data list | plot.data_list
Plot of cluster assignments in an extended solutions data frame | plot.ext_solutions_df plot.t_ext_solutions_df
Heatmap for visualizing an SNF config | config_heatmap plot.settings_df plot.snf_config plot.weights_matrix
Plot of cluster assignments in a solutions data frame | plot.solutions_df plot.t_solutions_df
Print method for class 'ari_matrix' | print.ari_matrix
Print method for class 'clust_fns_list' | print.clust_fns_list
Print method for class 'data_list' | print.data_list
Print method for class 'dist_fns_list' | print.dist_fns_list
Print method for class 'ext_solutions_df' | print.ext_solutions_df
Print method for class 'settings_df' | print.settings_df
Print method for class 'sim_mats_list' | print.sim_mats_list
Print method for class 'snf_config' | print.snf_config
Print method for class 'solutions_df' | print.solutions_df
Print method for class 't_ext_solutions_df' | print.t_ext_solutions_df
Print method for class 't_solutions_df' | print.t_solutions_df
Print method for class 'weights_matrix' | print.weights_matrix
Mock ABCD pubertal status data | pubertal
Heatmap of p-values | pval_heatmap
Quality metrics | calculate_db_indices calculate_dunn_indices calculate_silhouettes quality_measures
Row-binding of solutions data frame class objects | rbind.ext_solutions_df
Row-binding of solutions data frame class objects | rbind.solutions_df
Row-binding of t_solutions_df class objects | rbind.t_solutions_df
Row-bind weights matrices | rbind.weights_matrix
Rename features in a data list | rename_dl
Helper resampling function found in ?sample | resample
Save a heatmap object to a file | save_heatmap
Build a settings data frame | settings_df
Launch a shiny app to identify meta cluster boundaries | shiny_annotator
Create or extract a 'sim_mats_list' class object | sim_mats_list
Plot heatmap of similarity matrix | similarity_matrix_heatmap
Squared (including weights) Euclidean distance | siw_euclidean_distance
Define configuration for generating a set of SNF-based cluster solutions | snf_config
Helper function to determine which row and columns to split on | split_parser
Structure of a 'ari_matrix' object | str.ari_matrix
Structure of a 'clust_fns_list' object | str.clust_fns_list
Structure of a 'data_list' object | str.data_list
Structure of a 'dist_fns_list' object | str.dist_fns_list
Structure of a 'ext_solutions_df' object | str.ext_solutions_df
Structure of a 'settings_df' object | str.settings_df
Structure of a 'sim_mats_list' object | str.sim_mats_list
Structure of a 'snf_config' object | str.snf_config
Structure of a 'solutions_df' object | str.solutions_df
Structure of a 't_ext_solutions_df' object | str.t_ext_solutions_df
Structure of a 't_solutions_df' object | str.t_solutions_df
Structure of a 'weights_matrix' object | str.weights_matrix
Mock ABCD subcortical volumes data | subc_v
Create subsamples of a data list | subsample_dl
Calculate pairwise adjusted Rand indices across subsamples of data | subsample_pairwise_aris
Summary method for class 'ari_matrix' | summary.ari_matrix
Summary method for class 'clust_fns_list' | summary.clust_fns_list
Summary method for class 'data_list' | summary.data_list
Summary method for class 'dist_fns_list' | summary.dist_fns_list
Summary method for class 'ext_solutions_df' | summary.ext_solutions_df
Summary method for class 'settings_df' | summary.settings_df
Summary method for class 'sim_mats_list' | summary.sim_mats_list
Summary method for class 'snf_config' | summary.snf_config
Summary method for class 'solutions_df' | summary.solutions_df
Summary method for class 't_ext_solutions_df' | summary.t_ext_solutions_df
Summary method for class 't_solutions_df' | summary.t_solutions_df
Summary method for class 'weights_matrix' | summary.weights_matrix
Training and testing split | train_test_assign
Pull UIDs from an object | uids
Validator for 'solutions_df' class object | validate_solutions_df
Manhattan plot of feature-feature association p-values | var_manhattan_plot
Generate a matrix to store feature weights | weights_matrix