Here we evaluate statistical methods for detecting differences between two sample correlation matrices. Let C1 be the correlation matrix between a set of features in dataset Y1 with N1 samples, and let C2 be the correlation matrix in dataset Y2 with N2 samples. Alternatively, let Y be the combined dataset, with the two subsets indicated by a categorical variable.
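To make the notation concrete, the following sketch builds the inputs used by the calls in this document from simulated data. The data, the dimensions, and the convention that rows are samples and columns are features are assumptions for illustration; individual test functions may expect a different orientation.

```r
# Simulated stand-in data (assumption): rows are samples, columns are features
set.seed(1)
p  <- 5
Y1 <- matrix(rnorm(40 * p), ncol = p)   # dataset 1
Y2 <- matrix(rnorm(60 * p), ncol = p)   # dataset 2

# Sample sizes and sample correlation matrices
N1 <- nrow(Y1)
N2 <- nrow(Y2)
C1 <- cor(Y1)
C2 <- cor(Y2)

# Equivalent combined representation: one dataset plus a categorical variable
Y        <- rbind(Y1, Y2)
variable <- factor(rep(c("group1", "group2"), times = c(N1, N2)))

# The per-group correlation matrices can be recovered from (Y, variable)
C_by_group <- lapply(split(seq_len(nrow(Y)), variable),
                     function(idx) cor(Y[idx, , drop = FALSE]))
```

Here `C_by_group$group1` and `C_by_group$group2` match `C1` and `C2`, so the two ways of describing the data are interchangeable.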
cortest(C1, C2,fisher=TRUE)
cortest(C1, C2, N1, N2, fisher=FALSE)
cortest.jennrich(C1, C2, N1, N2)
cortest.mat(C1, C2, N1, N2)
wilcox.test( C1[lower.tri(C1)], C2[lower.tri(C2)], paired=TRUE)
Cai.max.test( Y1, Y2 )
Schott.Frob.test( Y1, Y2 )
Chang.maxBoot.test( Y1, Y2 )
WL.randProj.test( Y1, Y2 )
LC.U.test( Y1, Y2 )
boxM(Y, variable)
boxM_permute(Y, variable)
Each sample i is scored by comparing the correlations based on the full dataset to the correlations after dropping sample i. This yields one score per sample. A test of association between this sample-level score and the variable of interest is then evaluated. If the variable has two categories, a Wilcoxon test is used; for more than two categories, a Kruskal-Wallis test is used. If the variable is continuous, a Spearman correlation test is used.
delaneau.test( Y, variable)
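The leave-one-out procedure described above can be sketched in base R as follows. This is a minimal illustration, not the implementation behind `delaneau.test`: the helper names `loo_cor_score` and `loo_assoc_test` are hypothetical, and summarizing the change in correlations as a sum of squared differences over the lower triangle is an assumption, since the text does not specify the summary.

```r
# Hypothetical helper: one score per sample, from the change in the
# correlation matrix when that sample is dropped (rows of Y are samples).
loo_cor_score <- function(Y) {
  C_full <- cor(Y)
  lt <- lower.tri(C_full)
  vapply(seq_len(nrow(Y)), function(i) {
    C_drop <- cor(Y[-i, , drop = FALSE])
    # Assumption: squared differences summed over unique feature pairs
    sum((C_full[lt] - C_drop[lt])^2)
  }, numeric(1))
}

# Hypothetical helper: choose the association test by variable type
loo_assoc_test <- function(score, variable) {
  if (is.numeric(variable)) {
    # Continuous variable: Spearman correlation test
    return(cor.test(score, variable, method = "spearman", exact = FALSE))
  }
  variable <- droplevels(factor(variable))
  lev <- levels(variable)
  if (length(lev) == 2) {
    # Two categories: Wilcoxon rank sum test
    wilcox.test(score[variable == lev[1]], score[variable == lev[2]])
  } else {
    # More than two categories: Kruskal-Wallis test
    kruskal.test(score, variable)
  }
}

# Usage on simulated data (an assumption, for illustration only)
set.seed(1)
Y <- matrix(rnorm(60 * 4), ncol = 4)
variable <- factor(rep(c("A", "B"), each = 30))
score <- loo_cor_score(Y)
res <- loo_assoc_test(score, variable)
res$p.value
```

Because the score only requires computing `cor()` on subsets of the data, this style of test does not require the correlation matrix to be positive definite.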
Each sample i is scored by comparing the sparse leading eigenvalue of the correlation matrix based on the full dataset to the sparse leading eigenvalue of the correlation matrix after dropping sample i. This yields one score per sample. A test of association between this sample-level score and the variable of interest is then evaluated. If the variable has two categories, a Wilcoxon test is used; for more than two categories, a Kruskal-Wallis test is used. If the variable is continuous, a Spearman correlation test is used.
sle.test( Y, variable)
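The eigenvalue-based score can be illustrated in the same way. Note the substitution: this sketch uses the ordinary leading eigenvalue from base R `eigen()` as a stand-in for the sparse leading eigenvalue, so it is not what `sle.test` computes. The helper name `loo_eigen_score` and the simulated data are hypothetical.

```r
# Hypothetical helper: change in the leading eigenvalue of the correlation
# matrix when each sample is dropped. Assumption: the ordinary (non-sparse)
# leading eigenvalue is used here in place of the sparse one.
loo_eigen_score <- function(Y) {
  lead <- function(M) {
    eigen(cor(M), symmetric = TRUE, only.values = TRUE)$values[1]
  }
  full <- lead(Y)
  vapply(seq_len(nrow(Y)),
         function(i) full - lead(Y[-i, , drop = FALSE]),
         numeric(1))
}

# Usage with fewer samples than features (N = 20, p = 30)
set.seed(2)
Y <- matrix(rnorm(20 * 30), nrow = 20)
variable <- factor(rep(c("A", "B"), each = 10))
score <- loo_eigen_score(Y)
res <- wilcox.test(score[variable == "A"], score[variable == "B"])
res$p.value
```

The leading eigenvalue is defined even when the sample correlation matrix is singular, which is why a score of this kind can still be computed when there are fewer samples than features.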
Simulation results are shown comparing correlation matrices of p features estimated from N samples. Most methods are only applicable to positive definite matrices, corresponding to N > p. Only Mann-Whitney, sLED, Delaneau and deltaSLE are applicable to datasets with N < p, so the remaining methods do not give results for simulations in this case (i.e. top right of figures).
To assess control of the false positive rate, 5000 simulations were performed under the null model of no difference in correlation structure between the two datasets (i.e. C1 == C2).
Note that the x-axis stops at 0.2, but the false positive rate of the Factor and Jennrich methods often exceeds this value.
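A null simulation of this kind can be sketched as follows, using the paired Wilcoxon test on the lower triangles as the example method since it needs only base R. The number of simulations is reduced to 200 to keep the run short, and the values of `N`, `p`, the common correlation `rho`, and the 0.05 threshold are assumptions for illustration rather than the settings used for the figures.

```r
set.seed(42)
n_sim <- 200    # the text uses 5000; reduced here for speed
N     <- 50     # samples per dataset (assumption)
p     <- 10     # number of features (assumption)
rho   <- 0.8    # common pairwise correlation under the null (assumption)
alpha <- 0.05   # nominal significance level (assumption)

# Shared correlation structure for both datasets, so the null is true
Sigma <- matrix(rho, p, p)
diag(Sigma) <- 1
R <- chol(Sigma)    # upper triangular factor with t(R) %*% R == Sigma

# Draw N multivariate normal samples with correlation matrix Sigma
sim_data <- function(N) matrix(rnorm(N * p), N, p) %*% R

lt <- lower.tri(Sigma)
pvals <- replicate(n_sim, {
  C1 <- cor(sim_data(N))
  C2 <- cor(sim_data(N))
  wilcox.test(C1[lt], C2[lt], paired = TRUE)$p.value
})

# Empirical false positive rate: fraction of null p-values below alpha
fpr <- mean(pvals < alpha)
fpr
```

A well-calibrated method gives `fpr` close to `alpha`. Substituting another test inside `replicate()` gives the corresponding estimate for that method.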
In group 1, all pairwise correlations are 0.80 and in group 2 all pairwise correlations are 0.75.
To test the power of each method, 1000 null simulations were performed in addition to 1000 simulations with different correlation structure (i.e. C1 != C2).
In group 1, all pairwise correlations are 0.80; in group 2, half of the pairwise correlations are set to 0.75 and the rest remain at 0.80. This is followed by a small correction to make the matrix positive definite.
To test the power of each method, 1000 null simulations were performed in addition to 1000 simulations with different correlation structure (i.e. C1 != C2).
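The construction of the two correlation matrices in this second scenario can be sketched as below. The text does not say which half of the correlations is changed or which correction is applied, so both choices here are assumptions: the changed pairs are picked at random, and the hypothetical helper `make_pd` clips small eigenvalues and rescales to a unit diagonal.

```r
set.seed(1)
p <- 10    # number of features (assumption)

# Group 1: all pairwise correlations equal to 0.80
C1 <- matrix(0.80, p, p)
diag(C1) <- 1

# Group 2: half of the pairwise correlations set to 0.75
C2 <- C1
lt_idx  <- which(lower.tri(C2))
changed <- sample(lt_idx, length(lt_idx) %/% 2)   # random half (assumption)
C2[changed] <- 0.75
C2[upper.tri(C2)] <- t(C2)[upper.tri(C2)]         # mirror to keep symmetry

# Hypothetical correction: raise eigenvalues below eps, then rescale so the
# result is again a correlation matrix. Returns C unchanged if already PD.
make_pd <- function(C, eps = 1e-6) {
  e <- eigen(C, symmetric = TRUE)
  if (min(e$values) > eps) return(C)
  vals <- pmax(e$values, eps)
  C_pd <- e$vectors %*% diag(vals, nrow = length(vals)) %*% t(e$vectors)
  cov2cor((C_pd + t(C_pd)) / 2)
}

C2_pd <- make_pd(C2)
min(eigen(C2_pd, symmetric = TRUE, only.values = TRUE)$values)
```

`nearPD()` from the Matrix package with `corr = TRUE` is a more careful alternative when the matrix is far from positive definite.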