Package: metasnf 1.1.2

Prashanth S Velayudhan

metasnf: Meta Clustering with Similarity Network Fusion

Framework to facilitate patient subtyping with similarity network fusion and meta clustering. The similarity network fusion (SNF) algorithm was introduced by Wang et al. (2014) in <doi:10.1038/nmeth.2810>. SNF is a data integration approach that can transform high-dimensional and diverse data types into a single similarity network suitable for clustering with minimal loss of information from each initial data source.

The meta clustering approach was introduced by Caruana et al. (2006) in <doi:10.1109/ICDM.2006.103>. Meta clustering involves generating a wide range of cluster solutions by adjusting clustering hyperparameters, then clustering the solutions themselves into a manageable number of qualitatively similar solutions, and finally characterizing representative solutions to find ones that are best for the user's specific context.

This package provides a framework to easily transform multi-modal data into a wide range of similarity network fusion-derived cluster solutions as well as to visualize, characterize, and validate those solutions. Core package functionality includes easy customization of distance metrics, clustering algorithms, and SNF hyperparameters to generate diverse clustering solutions; calculation and plotting of associations between features, between patients, and between cluster solutions; and standard cluster validation approaches including resampled measures of cluster stability, standard metrics of cluster quality, and label propagation to evaluate generalizability in unseen data. Associated vignettes guide the user through using the package to identify patient subtypes while adhering to best practices for unsupervised learning.
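The meta clustering idea described above (generate many cluster solutions, then cluster the solutions themselves) can be illustrated with a minimal base R sketch. This is not the package's implementation and does not call any metasnf function: the helper `adjusted_rand_index()` and the toy `solutions` matrix are hypothetical names introduced here for illustration, and average-linkage hierarchical clustering on 1 - ARI is an assumed choice of how to group solutions.

```r
# Adjusted Rand index (ARI) between two label vectors, base R only.
# Hypothetical helper for illustration; not a metasnf function.
adjusted_rand_index <- function(a, b) {
  stopifnot(length(a) == length(b))
  tab <- table(a, b)                        # contingency table of the two partitions
  n <- sum(tab)
  sum_cells <- sum(choose(tab, 2))          # pairs grouped together in both partitions
  sum_rows <- sum(choose(rowSums(tab), 2))  # pairs grouped together in `a`
  sum_cols <- sum(choose(colSums(tab), 2))  # pairs grouped together in `b`
  expected <- sum_rows * sum_cols / choose(n, 2)
  max_index <- (sum_rows + sum_cols) / 2
  if (max_index == expected) return(1)      # degenerate case: trivial partitions
  (sum_cells - expected) / (max_index - expected)
}

# Toy cluster solutions (made-up data): one row per solution, one column per patient.
solutions <- rbind(
  sol_1 = c(1, 1, 1, 2, 2, 2),
  sol_2 = c(2, 2, 2, 1, 1, 1),  # same partition as sol_1 with the labels swapped
  sol_3 = c(1, 1, 2, 2, 3, 3),
  sol_4 = c(1, 2, 1, 2, 1, 2)
)

# Pairwise ARI matrix between all solutions.
n_sol <- nrow(solutions)
ari <- matrix(
  1, n_sol, n_sol,
  dimnames = list(rownames(solutions), rownames(solutions))
)
for (i in seq_len(n_sol)) {
  for (j in seq_len(n_sol)) {
    ari[i, j] <- adjusted_rand_index(solutions[i, ], solutions[j, ])
  }
}

# Cluster the solutions themselves, treating 1 - ARI as a dissimilarity.
meta_clusters <- cutree(hclust(as.dist(1 - ari), method = "average"), k = 2)
print(round(ari, 2))
print(meta_clusters)
```

Because the ARI ignores label names, `sol_1` and `sol_2` are treated as identical and fall into the same meta cluster. A representative solution from each meta cluster could then be characterized in more detail.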

Authors: Prashanth S Velayudhan [aut, cre], Xiaoqiao Xu [aut], Prajkta Kallurkar [aut], Ana Patricia Balbon [aut], Maria T Secara [aut], Adam Taback [aut], Denise Sabac [aut], Nicholas Chan [aut], Shihao Ma [aut], Bo Wang [aut], Daniel Felsky [aut], Stephanie H Ameis [aut], Brian Cox [aut], Colin Hawco [aut], Lauren Erdman [aut], Anne L Wheeler [aut, ths]

metasnf_1.1.2.tar.gz
metasnf_1.1.2.zip (r-4.5) | metasnf_1.1.2.zip (r-4.4) | metasnf_1.1.2.zip (r-4.3)
metasnf_1.1.2.tgz (r-4.4-any) | metasnf_1.1.2.tgz (r-4.3-any)
metasnf_1.1.2.tar.gz (r-4.5-noble) | metasnf_1.1.2.tar.gz (r-4.4-noble)
metasnf_1.1.2.tgz (r-4.4-emscripten) | metasnf_1.1.2.tgz (r-4.3-emscripten)
metasnf.pdf | metasnf.html
metasnf/json (API)
NEWS

# Install 'metasnf' in R:
install.packages('metasnf', repos = c('https://branchlab.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/branchlab/metasnf/issues

bioinformatics, clustering, metaclustering, snf

8.22 score 7 stars 31 scripts 122 exports 44 dependencies

Last updated 15 days ago from: fc11443ec2. Checks: OK: 7. Indexed: yes.

Target | Result | Date
Doc / Vignettes | OK | Nov 09 2024
R-4.5-win | OK | Nov 10 2024
R-4.5-linux | OK | Nov 09 2024
R-4.4-win | OK | Nov 10 2024
R-4.4-mac | OK | Nov 10 2024
R-4.3-win | OK | Nov 10 2024
R-4.3-mac | OK | Nov 10 2024

Exports: add_columns, add_settings_matrix_rows, adjusted_rand_index_heatmap, alluvial_cluster_plot, arrange_dl, assemble_data, assoc_pval_heatmap, auto_plot, bar_plot, batch_nmi, batch_row_closure, batch_snf, batch_snf_subsamples, calc_aris, calc_assoc_pval, calc_assoc_pval_matrix, calculate_coclustering, calculate_db_indices, calculate_dunn_indices, calculate_silhouettes, cell_significance_fn, char_to_fac, check_dataless_annotations, check_hm_dependencies, check_similarity_matrices, chi_squared_pval, cocluster_density, cocluster_heatmap, coclustering_coverage_check, collapse_dl, colour_scale, convert_uids, discretisation, discretisation_evec_data, dl_has_duplicates, dl_uid_first_col, dl_variable_summary, domain_merge, domains, drop_inputs, esm_manhattan_plot, estimate_nclust_given_graph, euclidean_distance, extend_solutions, fisher_exact_pval, generate_annotations_list, generate_clust_algs_list, generate_data_list, generate_distance_metrics_list, generate_settings_matrix, generate_weights_matrix, get_cluster_df, get_cluster_solutions, get_clusters, get_complete_uids, get_dist_matrix, get_dl_subjects, get_heatmap_order, get_matrix_order, get_mean_pval, get_min_pval, get_pvals, get_representative_solutions, gower_distance, hamming_distance, individual, jitter_plot, label_prop, label_splits, linear_adjust, linear_model_pval, list_remove, lp_solutions_matrix, mc_manhattan_plot, merge_data_lists, merge_df_list, no_subs, numcol_to_numeric, ord_reg_pval, parallel_batch_snf, prefix_dl_sk, pval_heatmap, random_removal, reduce_dl_to_common, remove_dl_na, rename_dl, reorder_dl_subs, resample, save_heatmap, scale_diagonals, settings_matrix_heatmap, sew_euclidean_distance, shiny_annotator, similarity_matrix_heatmap, similarity_matrix_path, siw_euclidean_distance, sn_euclidean_distance, snf_step, spectral_eigen, spectral_eigen_classic, spectral_eight, spectral_five, spectral_four, spectral_nine, spectral_rot, spectral_rot_classic, spectral_seven, spectral_six, spectral_ten, spectral_three, spectral_two, split_parser, subs, subsample_data_list, subsample_pairwise_aris, summarize_clust_algs_list, summarize_dl, summarize_dml, summarize_pvals, train_test_assign, two_step_merge, var_manhattan_plot

Dependencies: alluvial, cli, cluster, colorspace, cpp11, digest, dplyr, ExPosition, fansi, farver, generics, ggplot2, glue, gtable, isoband, labeling, lattice, lifecycle, magrittr, MASS, Matrix, mclust, mgcv, munsell, nlme, pillar, pkgconfig, prettyGraphs, progressr, purrr, R6, RColorBrewer, rlang, scales, SNFtool, stringi, stringr, tibble, tidyr, tidyselect, utf8, vctrs, viridisLite, withr

A Complete Example

Rendered from a_complete_example.Rmd using knitr::rmarkdown on Nov 09 2024.

Last update: 2024-10-21
Started: 2024-05-17

A Simple Example

Rendered from a_simple_example.Rmd using knitr::rmarkdown on Nov 09 2024.

Last update: 2024-10-21
Started: 2023-10-23

Alluvial Plots

Rendered from alluvial_plots.Rmd using knitr::rmarkdown on Nov 09 2024.

Last update: 2024-10-21
Started: 2023-10-29

Clustering Algorithms

Rendered from clustering_algorithms.Rmd using knitr::rmarkdown on Nov 09 2024.

Last update: 2024-10-21
Started: 2023-11-17

Confounders

Rendered from confounders.Rmd using knitr::rmarkdown on Nov 09 2024.

Last update: 2024-10-21
Started: 2023-11-22

Correlation Plots

Rendered from correlation_plots.Rmd using knitr::rmarkdown on Nov 09 2024.

Last update: 2024-10-21
Started: 2023-11-14

Distance Metrics

Rendered from distance_metrics.Rmd using knitr::rmarkdown on Nov 09 2024.

Last update: 2024-10-21
Started: 2023-10-23

Feature Plots

Rendered from feature_plots.Rmd using knitr::rmarkdown on Nov 09 2024.

Last update: 2024-10-21
Started: 2024-05-29

Feature Weighting

Rendered from feature_weights.Rmd using knitr::rmarkdown on Nov 09 2024.

Last update: 2024-10-21
Started: 2023-11-14

Getting Started

Rendered from getting_started.Rmd using knitr::rmarkdown on Nov 09 2024.

Last update: 2024-10-21
Started: 2023-10-23

Imputations

Rendered from imputations.Rmd using knitr::rmarkdown on Nov 09 2024.

Last update: 2024-10-21
Started: 2024-05-26

Label Propagation

Rendered from label_propagation.Rmd using knitr::rmarkdown on Nov 09 2024.

Last update: 2024-10-21
Started: 2024-02-02

Manhattan Plots

Rendered from manhattan_plots.Rmd using knitr::rmarkdown on Nov 09 2024.

Last update: 2024-10-21
Started: 2023-10-31

NMI Scores

Rendered from nmi_scores.Rmd using knitr::rmarkdown on Nov 09 2024.

Last update: 2024-10-21
Started: 2024-05-28

Parallel Processing

Rendered from parallel_processing.Rmd using knitr::rmarkdown on Nov 09 2024.

Last update: 2024-10-21
Started: 2023-12-06

Quality Measures

Rendered from quality_measures.Rmd using knitr::rmarkdown on Nov 09 2024.

Last update: 2024-10-21
Started: 2024-05-14

Similarity Matrices

Rendered from similarity_matrix_heatmap.Rmd using knitr::rmarkdown on Nov 09 2024.

Last update: 2024-10-21
Started: 2023-10-29

SNF Schemes

Rendered from snf_schemes.Rmd using knitr::rmarkdown on Nov 09 2024.

Last update: 2024-10-21
Started: 2023-11-29

Stability Measures

Rendered from stability_measures.Rmd using knitr::rmarkdown on Nov 09 2024.

Last update: 2024-10-30
Started: 2023-11-14

The Data List

Rendered from data_list.Rmd using knitr::rmarkdown on Nov 09 2024.

Last update: 2024-07-11
Started: 2023-11-14

The Settings Matrix

Rendered from settings_matrix.Rmd using knitr::rmarkdown on Nov 09 2024.

Last update: 2024-11-08
Started: 2023-10-31

Troubleshooting

Rendered from troubleshooting.Rmd using knitr::rmarkdown on Nov 09 2024.

Last update: 2024-05-28
Started: 2023-11-14

Readme and manuals

Help Manual

Help page | Topics
Mock ABCD anxiety data | abcd_anxiety
Mock ABCD "colour" data | abcd_colour
Mock ABCD cortical surface area data | abcd_cort_sa
Mock ABCD cortical thickness data | abcd_cort_t
Mock ABCD depression data | abcd_depress
Mock ABCD income data | abcd_h_income
Mock ABCD income data | abcd_income
Mock ABCD pubertal status data | abcd_pubertal
Mock ABCD subcortical volumes data | abcd_subc_v
Add columns to a dataframe | add_columns
Add settings matrix rows | add_settings_matrix_rows
Heatmap of pairwise adjusted Rand indices between solutions | adjusted_rand_index_heatmap
Mock age data | age_df
Alluvial plot of patients across cluster counts and important features | alluvial_cluster_plot
Mock ABCD anxiety data | anxiety
Given a data_list object, sort data elements by subjectkey | arrange_dl
Collapse a dataframe and/or a data_list into a single dataframe | assemble_data
Heatmap of pairwise associations between features | assoc_pval_heatmap
Automatically plot features across clusters | auto_plot
Bar plot separating a feature by cluster | bar_plot
Calculate feature NMIs for a data_list and a derived solutions_matrix | batch_nmi
Generate closure function to run batch_snf in an apply-friendly format | batch_row_closure
Run variations of SNF. | batch_snf
Run SNF clustering pipeline on a list of subsampled data lists. | batch_snf_subsamples
Meta-cluster calculations | calc_aris
Calculate p-values based on feature vectors and their types | calc_assoc_pval
Calculate p-values for all pairwise associations of features in a data_list | calc_assoc_pval_matrix
Calculate coclustering data. | calculate_coclustering
Calculate Davies-Bouldin indices | calculate_db_indices
Calculate Dunn indices | calculate_dunn_indices
Calculate silhouette scores | calculate_silhouettes
Mock diagnosis data | cancer_diagnosis_df
Place significance stars on ComplexHeatmap cells. | cell_significance_fn
Convert character-type columns of a dataframe to factor-type | char_to_fac
Helper function to stop annotation building when no data was provided | check_dataless_annotations
Check for ComplexHeatmap and circlize dependencies | check_hm_dependencies
Check validity of similarity matrices | check_similarity_matrices
Chi-squared test p-value (generic) | chi_squared_pval
Density plot of coclustering stability across subsampled data. | cocluster_density
Heatmap of observation co-clustering across resampled data. | cocluster_heatmap
Coclustering coverage check | coclustering_coverage_check
Collapse a data_list into a single dataframe | collapse_dl
Return a colour ramp for a given vector | colour_scale
Convert unique identifiers of data_list to 'subjectkey' | convert_uids
Mock ABCD cortical surface area data | cort_sa
Mock ABCD cortical thickness data | cort_t
Mock ABCD depression data | depress
Mock diagnosis data | diagnosis_df
Internal function for 'estimate_nclust_given_graph' | discretisation
Internal function for 'estimate_nclust_given_graph' | discretisation_evec_data
Check if data list contains any duplicate features | dl_has_duplicates
Make the subjectkey UID columns of a data_list first | dl_uid_first_col
Variable-level summary of a data_list | dl_variable_summary
SNF scheme: Domain merge | domain_merge
Domains | domains
Execute inclusion | drop_inputs
Manhattan plot of feature-cluster association p-values | esm_manhattan_plot
Estimate number of clusters for a similarity matrix | estimate_nclust_given_graph
Distance metric: Euclidean distance | euclidean_distance
Modification of SNFtool mock dataframe "Data1" | expression_df
Extend a solutions matrix to include outcome evaluations | extend_solutions
Mock ABCD "colour" data | fav_colour
Fisher exact test p-value | fisher_exact_pval
Mock gender data | gender_df
Generate annotations list | generate_annotations_list
Generate a list of custom clustering algorithms | generate_clust_algs_list
Generate a data_list | generate_data_list
Generate a list of distance metrics | generate_distance_metrics_list
Build a settings matrix | generate_settings_matrix
Generate a matrix to store feature weights | generate_weights_matrix
Extract cluster membership information from one solutions matrix row | get_cluster_df
Extract cluster membership information from a solutions_matrix | get_cluster_solutions
Extract cluster membership vector from one solutions matrix row | get_clusters
Pull complete-data UIDs from a list of dataframes | get_complete_uids
Calculate distance matrices | get_dist_matrix
Extract subjects from a data_list | get_dl_subjects
Return the row or column ordering present in a heatmap | get_heatmap_order
Return the hierarchical clustering order of a matrix | get_matrix_order
Get mean p-value | get_mean_pval
Get minimum p-value | get_min_pval
Get p-values from an extended solutions matrix | get_pvals
Extract representative solutions from a matrix of ARIs | get_representative_solutions
Distance metric: Gower distance | gower_distance
Distance metric: Hamming distance | hamming_distance
Mock ABCD income data | income
SNF Scheme: Individual | individual
Jitter plot separating a feature by cluster | jitter_plot
Label propagation | label_prop
Convert a vector of partition indices into meta cluster labels | label_splits
Linearly correct data_list by features with unwanted signal | linear_adjust
Linear model p-value (generic) | linear_model_pval
Remove items from a data_list | list_remove
Label propagate cluster solutions to unclustered subjects | lp_solutions_matrix
Manhattan plot of feature-meta cluster association p-values | mc_manhattan_plot
Horizontally merge compatible data lists | merge_data_lists
Merge list of dataframes | merge_df_list
Modification of SNFtool mock dataframe "Data2" | methylation_df
Select all columns of a dataframe not starting with the 'subject_' prefix. | no_subs
Convert dataframe columns to numeric type | numcol_to_numeric
Ordinal regression p-value | ord_reg_pval
Parallel processing form of batch_snf | parallel_batch_snf
Add "subject_" prefix to all UID values in subjectkey column | prefix_dl_sk
Mock ABCD pubertal status data | pubertal
Heatmap of p-values | pval_heatmap
Generate random removal sequence | random_removal
Reduce data_list to common subjects | reduce_dl_to_common
Remove NAs from a data_list object | remove_dl_na
Rename features in a data_list | rename_dl
Reorder the subjects in a data_list | reorder_dl_subs
Helper resample function found in ?sample | resample
Save a heatmap object to a file | save_heatmap
Adjust the diagonals of a matrix | scale_diagonals
Heatmap for visualizing a settings matrix | settings_matrix_heatmap
Squared (excluding weights) Euclidean distance | sew_euclidean_distance
Launch shiny app to identify meta cluster boundaries | shiny_annotator
Plot heatmap of similarity matrix | similarity_matrix_heatmap
Generate a complete path and filename to store a similarity matrix | similarity_matrix_path
Squared (including weights) Euclidean distance | siw_euclidean_distance
Distance metric: Standard normalization then Euclidean | sn_euclidean_distance
Convert a data list to a similarity matrix through a variety of SNF schemes | snf_step
Clustering algorithm: Spectral clustering with eigen-gap heuristic | spectral_eigen
Clustering algorithm: Spectral clustering with eigen-gap heuristic | spectral_eigen_classic
Clustering algorithm: Spectral clustering for an eight cluster solution | spectral_eight
Clustering algorithm: Spectral clustering for a five cluster solution | spectral_five
Clustering algorithm: Spectral clustering for a four cluster solution | spectral_four
Clustering algorithm: Spectral clustering for a nine cluster solution | spectral_nine
Clustering algorithm: Spectral clustering with rotation cost heuristic | spectral_rot
Clustering algorithm: Spectral clustering with rotation cost heuristic | spectral_rot_classic
Clustering algorithm: Spectral clustering for a seven cluster solution | spectral_seven
Clustering algorithm: Spectral clustering for a six cluster solution | spectral_six
Clustering algorithm: Spectral clustering for a ten cluster solution | spectral_ten
Clustering algorithm: Spectral clustering for a three cluster solution | spectral_three
Clustering algorithm: Spectral clustering for a two cluster solution | spectral_two
Helper function to determine which row and columns to split on | split_parser
Mock ABCD subcortical volumes data | subc_v
Select all columns of a dataframe starting with a given string prefix. | subs
Create subsamples of a data_list | subsample_data_list
Calculate pairwise adjusted Rand indices across subsamples of data | subsample_pairwise_aris
Summarize a clust_algs_list object | summarize_clust_algs_list
Summarize a data list | summarize_dl
Summarize metrics contained in a distance_metrics_list | summarize_dml
Summarize p-value columns of an extended solutions matrix | summarize_pvals
Training and testing split | train_test_assign
Two step SNF | two_step_merge
Manhattan plot of feature-feature association p-values | var_manhattan_plot