Title: | Testing and Plotting Procedures for Biostatistics |
---|---|
Description: | Contains miscellaneous functions useful in biostatistics, mostly univariate and multivariate testing procedures with a special emphasis on permutation tests. Many functions are intended to simplify the user's life by shortening existing procedures or by implementing plotting functions that can be used with as many methods from different packages as possible. |
Authors: | Maxime HERVE |
Maintainer: | Maxime HERVE <[email protected]> |
License: | GPL-2 |
Version: | 0.9-83-7 |
Built: | 2025-02-28 03:36:47 UTC |
Source: | https://github.com/cran/RVAideMemoire |
Contains miscellaneous functions useful in biostatistics, mostly univariate and multivariate testing procedures with a special emphasis on permutation tests. Many functions are intended to simplify the user's life by shortening existing procedures or by implementing plotting functions that can be used with as many methods from different packages as possible.
Package: | RVAideMemoire |
Type: | Package |
Version: | 0.9-83-7 |
Date: | 2023-11-06 |
License: | GPL-2 |
LazyLoad: | yes |
Maxime HERVE
Maintainer: Maxime HERVE <[email protected]>
Document : "Aide-memoire de statistique appliquee a la biologie - Construire son etude et analyser les resutats a l'aide du logiciel R" (available on CRAN)
This function is a wrapper to adonis but performs type II tests (whereas adonis performs type I).
adonis.II(formula, data, ...)
formula |
a typical model formula such as |
data |
the data frame from which |
... |
additional arguments to |
See adonis for a detailed explanation of what is done. The only difference with adonis is that adonis.II performs type II tests instead of type I.
a data frame of class "anova".
Maxime HERVE <[email protected]>
require(vegan)
data(dune)
data(dune.env)
# Compare:
adonis(dune~Management*A1,data=dune.env,permutations=99)
adonis(dune~A1*Management,data=dune.env,permutations=99)
# With:
adonis.II(dune~Management*A1,data=dune.env,permutations=99)
adonis.II(dune~A1*Management,data=dune.env,permutations=99)
These functions are methods for Anova to calculate type-II or type-III analysis-of-deviance tables for model objects produced by clm and clmm. Likelihood-ratio tests are calculated in both cases.
## S3 method for class 'clm'
Anova(mod, type = c("II", "III", 2, 3), ...)
## S3 method for class 'clmm'
Anova(mod, type = c("II", "III", 2, 3), ...)
mod |
|
type |
type of test, |
... |
additional arguments to |
See the help of Anova for a detailed explanation of what "type II" and "type III" mean.
See Anova.
Maxime HERVE <[email protected]>
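A minimal sketch of how these methods can be used (assuming packages ordinal and car are installed; the wine data set used here ships with ordinal):
require(ordinal)
require(car)
data(wine, package = "ordinal")
# Cumulative link model for bitterness ratings
mod <- clm(rating ~ temp * contact, data = wine)
# Type-II likelihood-ratio tests via the Anova method provided here
Anova(mod, type = "II")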
Back-transforms EMMeans (produced by emmeans) when the model was built on a transformed response variable. This is typically the case when a LM(M) with log(x+1) as response variable gives a better fit than a GLM(M) for count data, or when a beta regression takes as response a variable on the [0;1] interval that has been rescaled to the (0;1) interval using p.beta.
back.emmeans(emm, transform = c("log", "logit", "sqrt", "4rt", "inverse", "p.beta"),
  base = exp(1), add = 0, n = NULL, C = 2, ord = FALSE, decreasing = TRUE)
emm |
object returned by |
transform |
transformation applied to the response variable before building the model on which |
base |
the base with respect to which the logarithm transformation was computed (if |
add |
value to be added to x before computing the transformation, if needed (e.g. |
n |
total number of observations in the initial data set. Only used with |
C |
number of categories from which initial continuous proportions were computed. Only used with |
ord |
logical indicating if back-transformed EMMeans should be ordered. |
decreasing |
logical indicating in which order back-transformed EMMeans should be ordered, if |
Maxime HERVE <[email protected]>
require(emmeans)
set.seed(1149)
tab <- data.frame(
  response = c(rpois(30,0),rpois(30,2),rpois(30,4)),
  fact = gl(3,30,labels=LETTERS[1:3])
)
model <- lm(log(response+1)~fact,data=tab)
EMM <- emmeans(model,~fact)
back.emmeans(EMM,transform="log",add=1)
Simplified version of the boot function.
bootstrap(x, fun, nrep = 1000, conf.level = 0.95, ...)
x |
numeric vector. |
fun |
function to be used for computation ( |
nrep |
number of replicates. |
conf.level |
confidence level for confidence interval. |
... |
additional arguments to |
See the help of the boot function for more explanations.
method |
the character string |
data.name |
a character string giving the name of the data. |
estimate |
the estimated original value |
conf.level |
confidence level for confidence interval. |
rep |
number of replicates. |
conf.int |
limits of the confidence interval. |
Maxime HERVE <[email protected]>
# Confidence interval of a mean
samp <- sample(1:50,10,replace=TRUE)
bootstrap(samp,function(x,i) mean(x[i]))
# Confidence interval of the standard error of the mean
bootstrap(samp,function(x,i) sd(x[i])/sqrt(length(x[i])))
Draws a histogram of a numeric variable per level of a factor.
byf.hist(formula, data, sep = FALSE, density = TRUE, xlab = NULL, ylab = NULL, col = NULL)
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
sep |
logical. If |
density |
logical. If |
xlab |
label for x-axis (name of the response variable as default). |
ylab |
label for y-axis ("Density" or "Frequency" as default, depending on the type of histogram). |
col |
color(s) used for density curves or bars. |
Maxime HERVE <[email protected]>
data(iris)
byf.hist(Sepal.Length~Species,data=iris)
Draws a multivariate QQ-plot of numeric variables per level of a factor.
byf.mqqnorm(formula, data)
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
Maxime HERVE <[email protected]>
data(iris)
byf.mqqnorm(as.matrix(iris[,1:4])~Species,data=iris)
Performs a multivariate Shapiro-Wilk test on numeric variables per level of a factor.
byf.mshapiro(formula, data)
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
method |
name of the test. |
data.name |
a character string giving the names of the data. |
tab |
table of results. |
Maxime HERVE <[email protected]>
byf.mqqnorm, mshapiro.test, qqPlot
data(iris)
byf.mshapiro(as.matrix(iris[,1:4])~Species,data=iris)
Draws a QQ-plot of a numeric variable per level of a factor.
byf.qqnorm(formula, data, ...)
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
... |
other arguments to pass to |
Maxime HERVE <[email protected]>
byf.shapiro, qqPlot
data(iris)
byf.qqnorm(Sepal.Length~Species,data=iris)
Performs a Shapiro-Wilk test on a numeric variable per level of a factor.
byf.shapiro(formula, data)
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
method |
name of the test. |
data.name |
a character string giving the names of the data. |
tab |
table of results. |
Maxime HERVE <[email protected]>
data(iris)
byf.shapiro(Sepal.Length~Species,data=iris)
Performs cross validation with correspondence discriminant analyses.
CDA.cv(X, Y, repet = 10, k = 7, ncomp = NULL, method = c("mahalanobis", "euclidian"))
X |
a data frame of dependent variables (typically contingency or presence-absence table). |
Y |
factor giving the groups. |
repet |
an integer giving the number of times the whole procedure has to be repeated. |
k |
an integer giving the number of folds (can be re-set internally if needed). |
ncomp |
an integer giving the number of components to be used for prediction. If |
method |
criterion used to predict class membership. See |
The training sets are generated with respect to the relative proportions of the levels of Y in the original data set (see splitf).
repet |
number of times the whole procedure was repeated. |
k |
number of folds. |
ncomp |
number of components used. |
method |
criterion used to classify individuals of the test sets. |
groups |
levels of |
models.list |
list of models generated ( |
NMC |
Classification error rates ( |
Maxime HERVE <[email protected]>
require(ade4)
data(perthi02)
## Not run: CDA.cv(perthi02$tab,perthi02$cla)
Performs a significance test for correspondence discriminant analysis. See Details.
CDA.test(X, fact, ncomp = NULL, ...)
X |
a data frame of dependent variables (typically contingency or presence-absence table). |
fact |
factor giving the groups. |
ncomp |
an integer giving the number of components to be used for the test. If |
... |
other arguments to pass to |
CDA consists of two steps: building a correspondence analysis (CA) on X, then using the row coordinates on all CA components as input variables for a linear discriminant analysis. CDA.test builds the intermediate CA, then uses the first ncomp components to test for an effect of fact. If 1 component is used the test is an ANOVA; if more than 1 component is used the test is a MANOVA.
An ANOVA or MANOVA table.
Maxime HERVE <[email protected]>
require(ade4)
data(perthi02)
CDA.test(perthi02$tab,perthi02$cla)
Returns an object similar to what is produced by ecdf, but based on a known discrete distribution.
cdf.discrete(x, dist = c("binom", "geom", "hyper", "nbinom", "pois"), ...)
x |
numeric vector of the observations. |
dist |
character string naming a discrete distribution ( |
... |
parameters of the distribution specified by |
The function is intended to be used in goodness-of-fit tests for discrete distributions, such as those proposed in the dgof package.
Maxime HERVE <[email protected]>
if(require(dgof)) {
  set.seed(1124)
  resp <- rpois(20,2)
  cvm.test(resp,cdf.discrete(resp,"pois",2))
}
Returns expected counts before comparing response probabilities (i.e. when the response variable is a binary variable) to given values by a chi-squared test. The function is in fact a wrapper to the chi-squared test for comparison of proportions to given values on a contingency table.
chisq.bin.exp(formula, data, p, graph = FALSE)
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
p |
theoretical probabilities. |
graph |
logical. If |
The function returns how many counts can be < 5 to respect Cochran's rule (80% of counts must be >= 5).
p.theo |
theoretical probabilities. |
mat |
contingency table of expected counts. |
cochran |
number of counts which can be < 5. |
Maxime HERVE <[email protected]>
prop.test, chisq.theo.bintest, mosaicplot
response <- c(rep(0:1,c(40,60)),rep(0:1,c(55,45)),rep(0:1,c(65,35)))
fact <- gl(3,100,labels=LETTERS[1:3])
p.theo <- c(0.5,0.45,0.2)
chisq.bin.exp(response~fact,p=p.theo)
Performs a Pearson's Chi-squared test for comparing response probabilities (i.e. when the response variable is a binary variable). The function is in fact a wrapper to the chi-squared test for comparison of proportions on a contingency table. If the p-value of the test is significant, the function performs pairwise comparisons by using Pearson's Chi-squared tests.
chisq.bintest(formula, data, correct = TRUE, alpha = 0.05, p.method = "fdr")
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
correct |
a logical indicating whether to apply continuity correction when computing the test statistic for 2 by 2 tables. See help of |
alpha |
significance level to compute pairwise comparisons. |
p.method |
method for p-values correction. See help of |
If the response is a 0/1 variable, the probability of the '1' group is tested. In any other case, the response is transformed into a factor and the probability of the second level is tested.
Since a chi-squared test is an approximate test, an exact test is preferable when the number of individuals is small (200 is a reasonable minimum). See fisher.bintest in that case.
method.test |
a character string giving the name of the global test computed. |
data.name |
a character string giving the name(s) of the data. |
alternative |
a character string describing the alternative hypothesis. |
estimate |
the estimated probabilities. |
null.value |
the value of the difference in probabilities under the null hypothesis, always 0. |
statistic |
test statistics. |
parameter |
test degrees of freedom. |
p.value |
p-value of the global test. |
alpha |
significance level. |
p.adjust.method |
method for p-values correction. |
p.value.multcomp |
data frame of pairwise comparisons result. |
method.multcomp |
a character string giving the name of the test computed for pairwise comparisons. |
Maxime HERVE <[email protected]>
response <- c(rep(0:1,c(40,60)),rep(0:1,c(55,45)),rep(0:1,c(65,35)))
fact <- gl(3,100,labels=LETTERS[1:3])
chisq.bintest(response~fact)
Returns expected counts before comparing proportions to given values by a chi-squared test.
chisq.exp(data, p, graph = FALSE)
data |
contingency table. |
p |
theoretical proportions. |
graph |
logical. If |
The function returns how many counts can be < 5 to respect Cochran's rule (80% of counts must be >= 5).
p.theo |
theoretical proportions. |
mat |
contingency table of expected counts. |
cochran |
number of counts which can be < 5. |
Maxime HERVE <[email protected]>
prop.test, chisq.test, mosaicplot
proportions <- sample(c(0,1),200,replace=TRUE)
populations <- sample(LETTERS[1:3],200,replace=TRUE)
tab.cont <- table(populations,proportions)
p.theo <- c(0.4,0.5,0.7)
chisq.exp(tab.cont,p=p.theo)
Performs pairwise comparisons after a global chi-squared goodness-of-fit test.
chisq.multcomp(x, p.method = "fdr")
x |
numeric vector (counts). |
p.method |
method for p-values correction. See help of |
Since a chi-squared test is an approximate test, an exact test is preferable when the number of individuals is small (200 is a reasonable minimum). See multinomial.multcomp in that case.
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.adjust.method |
method for p-values correction. |
p.value |
table of results. |
Maxime HERVE <[email protected]>
chisq.test, multinomial.test, multinomial.multcomp
counts <- c(49,30,63,59)
chisq.test(counts)
chisq.multcomp(counts)
Performs a Pearson's Chi-squared test for comparing response probabilities (i.e. when the response variable is a binary variable) to given values. The function is in fact a wrapper to the chi-squared test for comparison of proportions to given values on a contingency table.
chisq.theo.bintest(formula, data, p)
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
p |
theoretical probabilities. |
If the response is a 0/1 variable, the probability of the '1' group is tested. In any other case, the response is transformed into a factor and the probability of the second level is tested.
method.test |
a character string giving the name of the test. |
data.name |
a character string giving the name(s) of the data. |
alternative |
a character string describing the alternative hypothesis, always two-sided. |
estimate |
the estimated probabilities. |
null.value |
the theoretical probabilities. |
statistic |
test statistics. |
parameter |
test degrees of freedom. |
p.value |
p-value of the test. |
Maxime HERVE <[email protected]>
prop.test, chisq.bin.exp, prop.bin.multcomp
response <- c(rep(0:1,c(40,60)),rep(0:1,c(55,45)),rep(0:1,c(65,35)))
fact <- gl(3,100,labels=LETTERS[1:3])
p.theo <- c(0.5,0.45,0.2)
chisq.theo.bintest(response~fact,p=p.theo)
Performs pairwise comparisons after a global chi-squared test for given probabilities.
chisq.theo.multcomp(x, p = rep(1/length(x), length(x)), p.method = "fdr")
x |
numeric vector (counts). |
p |
theoretical proportions. |
p.method |
method for p-values correction. See help of |
Since a chi-squared test is an approximate test, an exact test is preferable when the number of individuals is small (200 is a reasonable minimum). See multinomial.theo.multcomp in that case.
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
observed |
observed counts. |
expected |
expected counts. |
p.adjust.method |
method for p-values correction. |
statistic |
statistics of each test. |
p.value2 |
corrected p-values. |
p.value |
data frame of results. |
Maxime HERVE <[email protected]>
chisq.test, multinomial.test, multinomial.theo.multcomp
counts <- c(49,30,63,59)
p.theo <- c(0.2,0.1,0.45,0.25)
chisq.test(counts,p=p.theo)
chisq.theo.multcomp(counts,p=p.theo)
Performs the Cochran's Q test for unreplicated randomized block design experiments with a binary response variable and paired data. If the p-value of the test is significant, the function performs pairwise comparisons by using the Wilcoxon sign test.
cochran.qtest(formula, data, alpha = 0.05, p.method = "fdr")
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
alpha |
significance level to compute pairwise comparisons. |
p.method |
method for p-values correction. See help of |
If the response is a 0/1 variable, the probability of the '1' group is tested. In any other case, the response is transformed into a factor and the probability of the second level is tested.
method.test |
a character string giving the name of the global test computed. |
data.name |
a character string giving the name(s) of the data. |
alternative |
a character string describing the alternative hypothesis. |
estimate |
the estimated probabilities. |
null.value |
the value of the difference in probabilities under the null hypothesis, always 0. |
statistic |
test statistics (Pearson's Chi-squared test only). |
parameter |
test degrees of freedom (Pearson's Chi-squared test only). |
p.value |
p-value of the global test. |
alpha |
significance level. |
p.adjust.method |
method for p-values correction. |
p.value.multcomp |
data frame of pairwise comparisons result. |
method.multcomp |
a character string giving the name of the test computed for pairwise comparisons. |
Maxime HERVE <[email protected]>
response <- c(0,1,1,0,0,1,0,1,1,1,1,1,0,0,1,1,0,1,0,1,1,0,0,1,0,1,1,0,0,1)
fact <- gl(3,1,30,labels=LETTERS[1:3])
block <- gl(10,3,labels=letters[1:10])
cochran.qtest(response~fact|block)
Computes the condition number of the Hessian matrix of a model fitted with multinom.
cond.multinom(model)
model |
object of class |
Maxime HERVE <[email protected]>
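A minimal sketch (assuming package nnet is installed; the model and predictors are arbitrary):
require(nnet)
data(iris)
# Multinomial model (Hess = TRUE stores the Hessian)
model <- multinom(Species ~ Sepal.Length + Sepal.Width, data = iris, Hess = TRUE, trace = FALSE)
cond.multinom(model)
# A very large condition number suggests an ill-conditioned model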
Returns the coordinates of a set of points when orthogonally projected on a new axis.
coord.proj(coord,slp)
coord |
2-column data frame or matrix giving the original coordinates (left column: x, right column: y). |
slp |
slope of the new axis. |
Maxime HERVE <[email protected]>
data(iris)
# Original coordinates
plot(Petal.Length~Sepal.Length,pch=16,col=as.numeric(iris$Species),data=iris)
# New axis
abline(-6,1.6)
# Coordinates on new axis
new.coord <- coord.proj(iris[,c("Sepal.Length","Petal.Length")],1.6)
stripchart(new.coord~Species,data=iris,col=1:3)
Performs the test for equality of 2 Pearson's correlation coefficients. If the difference is not significant, the function returns the common coefficient, its confidence interval and performs the test for equality to a given value.
cor.2comp(var1, var2, var3, var4, alpha = 0.05, conf.level = 0.95, theo = 0)
var1 |
numeric vector (first variable of the first correlation). |
var2 |
numeric vector (second variable of the first correlation). |
var3 |
numeric vector (first variable of the second correlation). |
var4 |
numeric vector (second variable of the second correlation). |
alpha |
significance level. |
conf.level |
confidence level. |
theo |
theoretical coefficient. |
method.test |
a character string giving the name of the global test computed. |
data.name |
a character string giving the name(s) of the data. |
statistic |
test statistics. |
p.value |
p-value for comparison of the 2 coefficients. |
null.value |
the value of the difference in coefficients under the null hypothesis, always 0. |
alternative |
a character string describing the alternative hypothesis. |
estimate |
the estimated correlation coefficients. |
alpha |
significance level. |
conf.level |
confidence level. |
common.name |
a character string explaining the elements of the table below. |
common |
data frame of results if the coefficients are not significantly different (common coefficient). |
Maxime HERVE <[email protected]>
cor1.var1 <- 1:30+rnorm(30,0,2)
cor1.var2 <- 1:30+rnorm(30,0,3)
cor2.var1 <- (-1):-30+rnorm(30,0,2)
cor2.var2 <- (-1):-30+rnorm(30,0,3)
cor.2comp(cor1.var1,cor1.var2,cor2.var1,cor2.var2)
Performs a test for equality of a Pearson's linear correlation coefficient to a given value.
cor.conf(var1, var2, theo)
var1 |
numeric vector (first variable). |
var2 |
numeric vector (second variable). |
theo |
theoretical value. |
method |
a character string giving the name of the test. |
data.name |
a character string giving the name(s) of the data. |
statistic |
test statistics. |
p.value |
p-value of the test. |
null.value |
the value of the theoretical coefficient. |
alternative |
a character string describing the alternative hypothesis. |
estimate |
the estimated correlation coefficient. |
Maxime HERVE <[email protected]>
var1 <- 1:30+rnorm(30,0,4)
var2 <- 1:30+rnorm(30,0,4)
cor.conf(var1,var2,theo=0.5)
Performs comparisons of several Pearson's linear correlation coefficients. If there is no difference, the function returns the common correlation coefficient, its confidence interval and tests for its equality to a given value. If the difference is significant, the function performs pairwise comparisons between coefficients.
cor.multcomp(var1, var2, fact, alpha = 0.05, conf.level = 0.95, theo = 0, p.method = "fdr")
var1 |
numeric vector (first variable). |
var2 |
numeric vector (second variable). |
fact |
factor (groups). |
alpha |
significance level. |
conf.level |
confidence level. |
theo |
theoretical coefficient. |
p.method |
method for p-values correction. See help of |
method.test |
a character string giving the name of the global test computed. |
data.name |
a character string giving the name(s) of the data. |
statistic |
test statistics. |
parameter |
test degrees of freedom. |
p.value |
p-value for comparison of the coefficients. |
null.value |
the value of the difference in coefficients under the null hypothesis, always 0. |
alternative |
a character string describing the alternative hypothesis. |
estimate |
the estimated correlation coefficients. |
alpha |
significance level. |
conf.level |
confidence level. |
p.adjust.method |
method for p-values correction. |
p.value.multcomp |
data frame of pairwise comparisons result. |
common.name |
a character string explaining the elements of the table below. |
common |
data frame of results if the coefficients are not significantly different (common coefficient). |
Maxime HERVE <[email protected]>
var1 <- c(1:15+rnorm(15,0,4),1:15+rnorm(15,0,1),1:15+rnorm(15,0,8))
var2 <- c(-1:-15+rnorm(15,0,4),1:15+rnorm(15,0,1),1:15+rnorm(15,0,8))
fact <- gl(3,15,labels=LETTERS[1:3])
cor.multcomp(var1,var2,fact)
var3 <- c(1:15+rnorm(15,0,1),1:15+rnorm(15,0,3),1:15+rnorm(15,0,2))
cor.multcomp(var1,var3,fact)
Performs a permutation test based on the sum of squared covariances between the variables of two data sets, to test whether the (squared) covariance is higher than expected under random association between the two data sets. The test is relevant in parallel with a 2B-PLS.
cov.test(X, Y, scale.X = TRUE, scale.Y = TRUE, nperm = 999, progress = TRUE)
X |
a numeric vector, matrix or data frame. |
Y |
a numeric vector, matrix or data frame. |
scale.X |
logical, if |
scale.Y |
logical, if |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
The function deals with the limited floating point precision, which can bias the calculation of p-values based on a discrete test statistic distribution.
method |
a character string indicating the name of the test. |
data.name |
a character string giving the name(s) of the data, plus additional information. |
statistic |
the value of the test statistics. |
permutations |
the number of permutations. |
p.value |
the p-value of the test. |
Maxime HERVE <[email protected]>
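A minimal sketch on simulated data (the data sets and number of permutations are arbitrary):
set.seed(38)
X <- matrix(rnorm(100), ncol = 5)
# Y built to be partly associated with X
Y <- X[, 1:3] + matrix(rnorm(60, 0, 0.5), ncol = 3)
cov.test(X, Y, nperm = 499)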
Plots martingale residuals of a Cox model against fitted values, to check for log-linearity of covariates.
cox.resid(model)
model |
a |
Maxime HERVE <[email protected]>, based on an idea of John Fox.
Fox, J. 2002 Cox Proportional-Hazards Regression for Survival Data.
# 'kidney' dataset of package 'survival'
require(survival)
data(kidney)
model <- coxph(Surv(time,status)~age+factor(sex),data=kidney)
cox.resid(model)
Computes the Cramer's association coefficient between 2 nominal variables.
cramer(x, y)
x |
a contingency table ('matrix' or 'table' object). |
y |
ignored if |
Maxime HERVE <[email protected]>
var1 <- sample(LETTERS[1:3],30,replace=TRUE)
var2 <- sample(letters[1:3],30,replace=TRUE)
cramer(var1,var2)
# or
cramer(table(var1,var2))
Computes the Cramer's association coefficient between 2 nominal variables, its confidence interval (by bootstrapping) and tests for its significance.
cramer.test(x, y, nrep = 1000, conf.level = 0.95)
x |
a contingency table ('matrix' or 'table' object). |
y |
ignored if |
nrep |
number of replicates for bootstrapping. |
conf.level |
confidence level. |
method |
name of the test. |
statistic |
test statistics. |
parameter |
test degrees of freedom. |
p.value |
test p-value. |
data.name |
a character string giving the names of the data. |
estimate |
Cramer's coefficient. |
conf.level |
confidence level. |
rep |
number of replicates. |
conf.int |
confidence interval. |
alternative |
a character string giving the alternative hypothesis, always |
null.value |
the value of the association measure under the null hypothesis, always 0. |
Maxime HERVE <[email protected]>
var1 <- sample(LETTERS[1:3],30,replace=TRUE)
var2 <- sample(letters[1:3],30,replace=TRUE)
cramer.test(var1,var2)
# or
cramer.test(table(var1,var2))
Computes the coefficient of variation of a vector.
cv(x, abs = TRUE, pc = TRUE)
x |
numeric vector. |
abs |
logical. If |
pc |
logical. If |
The function deals with missing values.
Maxime HERVE <[email protected]>
cv(rnorm(30))
Draws a dendrogram and an additional bar plot helping to choose the number of groups to be retained (based on the dendrogram).
dendro.gp(dend)
dend |
a dendrogram obtained from |
Maxime HERVE <[email protected]>
data(iris)
distances <- dist(iris[,1:4],method="euclidean")
dendro <- hclust(distances,method="ward.D2")
dendro.gp(dendro)
Functions that are not usable anymore, and will be entirely removed from the package in future versions.
back.lsmeans(...)
byf.normhist(...)
cor.sparse(...)
CvM.test(...)
DA.confusion(...)
DA.valid(...)
DA.var(...)
dunn.test(...)
fc.multcomp(...)
friedman.rating.test(...)
kruskal.rating.test(...)
pairwise.manova(...)
pairwise.to.groups(...)
pairwise.wilcox.rating.test(...)
plot1comp.ind(...)
plot1comp.var(...)
PLSDA.ncomp(...)
PLSDA.test(...)
rating.lsmeans(...)
s.corcircle2(...)
scat.mix.categorical(...)
scat.mix.numeric(...)
scatter.coa2(...)
wilcox.paired.rating.multcomp(...)
wilcox.rating.signtest(...)
wilcox.rating.test(...)
... |
previous arguments. |
back.lsmeans and rating.lsmeans are replaced by back.emmeans and rating.emmeans. More generally, stop using package lsmeans and change to package emmeans, its new version.
byf.normhist was not very useful and byf.hist does nearly the same job.
cor.sparse is replaced by the more generic MVA.plot.
CvM.test did not actually perform a Cramer-von Mises test but an alternative Cramer test. Use cramer.test from package cramer directly, on which CvM.test was based.
DA.confusion and DA.valid are replaced by the more generic MVA.cmv and MVA.cv.
DA.var is replaced by the more generic MVA.synt.
dunn.test is not very useful, prefer dunnTest from package FSA.
fc.multcomp is not useful anymore since emtrends (package emmeans) does the same job in a much more powerful manner (see argument var of emtrends).
friedman.rating.test, kruskal.rating.test, wilcox.rating.test, wilcox.rating.signtest, pairwise.wilcox.rating.test and wilcox.paired.rating.multcomp can be problematic with ratings (in which ties and zeroes are very frequent). The use of CLM(M)s (via clm and clmm) is recommended.
pairwise.manova is not useful anymore since emmeans (package emmeans) does the same job in a much more powerful manner (on "mlm" objects, created by lm and not manova).
pairwise.to.groups was not very useful.
plot1comp.ind, plot1comp.var, s.corcircle2, scat.mix.categorical, scat.mix.numeric and scatter.coa2 are replaced by the more generic MVA.plot.
PLSDA.ncomp was not really useful and mvr does nearly the same job.
PLSDA.test is replaced by the more generic MVA.test.
Performs cross validation with DIABLO (block.plsda or block.splsda).
DIABLO.cv(x, method = c("mahalanobis.dist", "max.dist", "centroids.dist"),
  validation = c("Mfold", "loo"), k = 7, repet = 10, ...)
x |
an object of class |
method |
criterion used to predict class membership. See |
validation |
a character giving the kind of (internal) validation to use. See |
k |
an integer giving the number of folds (can be re-set internally if needed). |
repet |
an integer giving the number of times the whole procedure has to be repeated. |
... |
other arguments to pass to |
The function uses the weighted predicted classification error rate (see perf).
repet |
number of times the whole procedure was repeated. |
k |
number of folds. |
validation |
kind of validation used. |
ncomp |
number of components used. |
method |
criterion used to classify individuals of the test sets. |
NMC.mean |
mean classification error rate (based on |
NMC.se |
standard error of the classification error rate (based on |
Maxime HERVE <[email protected]>
block.plsda, block.splsda, perf
## Not run: 
require(mixOmics)
data(nutrimouse)
data <- list(gene=nutrimouse$gene,lipid=nutrimouse$lipid,Y=nutrimouse$diet)
DIABLO <- block.plsda(X=data,indY=3)
DIABLO.cv(DIABLO)
## End(Not run)
Performs a permutation significance test based on cross-validation with DIABLO (block.plsda or block.splsda).
DIABLO.test(x, method = c("mahalanobis.dist", "max.dist", "centroids.dist"),
  validation = c("Mfold", "loo"), k = 7, nperm = 999, progress = TRUE, ...)
x |
an object of class |
method |
criterion used to predict class membership. See |
validation |
a character giving the kind of (internal) validation to use. See |
k |
an integer giving the number of folds (can be re-set internally if needed). |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
... |
other arguments to pass to |
The function uses the weighted predicted classification error rate (see perf).
method |
a character string indicating the name of the test. |
data.name |
a character string giving the name of the data, plus additional information. |
statistic |
the value of the test statistics (classification error rate). |
permutations |
the number of permutations. |
p.value |
the p-value of the test. |
Maxime HERVE <[email protected]>
block.plsda, block.splsda, perf
## Not run: 
require(mixOmics)
data(nutrimouse)
data <- list(gene=nutrimouse$gene,lipid=nutrimouse$lipid,Y=nutrimouse$diet)
DIABLO <- block.plsda(X=data,indY=3)
DIABLO.test(DIABLO)
## End(Not run)
Creates a matrix of dummy responses from a factor. Needed in some multivariate analyses.
dummy(f, simplify = TRUE)
f |
vector (internally transformed into factor). |
simplify |
logical indicating if the last column of the response matrix should be removed (to avoid model overfitting). |
Maxime HERVE <[email protected]>
fac <- gl(3,5,labels=LETTERS[1:3])
dummy(fac)
Empirical logistic transformation for logistic models with data showing (quasi-)complete separation. The function is intended to be used as a link function in GLM(M)s.
elogis()
Formula from McCullagh & Nelder in their seminal book 'Generalized Linear Models'. R code from Eric Wajnberg & Jean-Sebastien Pierre.
# An example with 3 groups and complete separation (from E. Wajnberg)
tab <- data.frame(case=letters[1:3],yes=c(25,30,0),no=c(1,0,20))
tab
## Not run: 
mod <- glm(cbind(yes,no)~case,family=binomial(link=elogis()),data=tab)
# Warnings are normal
summary(mod)
## End(Not run)
Performs a Fisher's exact test for comparing response probabilities (i.e. when the response variable is a binary variable). The function is in fact a wrapper to the Fisher's exact test for count data. If the p-value of the test is significant, the function performs pairwise comparisons by using Fisher's exact tests.
fisher.bintest(formula, data, alpha = 0.05, p.method = "fdr")
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
alpha |
significance level to compute pairwise comparisons. |
p.method |
method for p-values correction. See help of |
If the response is a 0/1 variable, the probability of the '1' group is tested. In any other case, the response is transformed into a factor and the probability of the second level is tested.
Since chi-squared and G tests are approximate tests, exact tests are preferable when the number of individuals is small (200 is a reasonable minimum).
method.test |
a character string giving the name of the global test computed. |
data.name |
a character string giving the name(s) of the data. |
alternative |
a character string describing the alternative hypothesis. |
estimate |
the estimated probabilities. |
null.value |
the value of the difference in probabilities under the null hypothesis, always 0. |
p.value |
p-value of the global test. |
alpha |
significance level. |
p.adjust.method |
method for p-values correction. |
p.value.multcomp |
data frame of pairwise comparisons result. |
method.multcomp |
a character string giving the name of the test computed for pairwise comparisons. |
Maxime HERVE <[email protected]>
response <- c(0,0,0,0,0,0,1,0,0,0,0,0,1,0,1,1,1,0,0,1,1,1,1,1,1,0,0,1,1,1)
fact <- gl(3,10,labels=LETTERS[1:3])
fisher.bintest(response~fact)
Performs pairwise comparisons after a comparison of proportions or after a test for independence of 2 categorical variables, by using a Fisher's exact test.
fisher.multcomp(tab.cont, p.method = "fdr")
tab.cont |
contingency table. |
p.method |
method for p-values correction. See help of |
Since chi-squared and G tests are approximate tests, exact tests are preferable when the number of individuals is small (200 is a reasonable minimum).
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.adjust.method |
method for p-values correction. |
p.value |
table of results of pairwise comparisons. |
Maxime HERVE <[email protected]>
chisq.test, prop.test, fisher.test
# 2-column contingency table: comparison of proportions
tab.cont1 <- matrix(c(17,23,12,24,20,10),ncol=2,dimnames=list(c("Control",
  "Treatment1","Treatment2"),c("Alive","Dead")),byrow=TRUE)
fisher.test(tab.cont1)
fisher.multcomp(tab.cont1)
# 3-column contingency table: independence test
tab.cont2 <- as.table(matrix(c(25,10,12,6,15,14,9,16,9),ncol=3,dimnames=list(c("fair",
  "dark","russet"),c("blue","brown","green"))))
fisher.test(tab.cont2)
fisher.multcomp(tab.cont2)
Performs a Fligner-Policello test of the null that the medians in the two groups (samples) are the same.
fp.test(x, ...)
## Default S3 method:
fp.test(x, y, delta = 0, alternative = "two.sided", ...)
## S3 method for class 'formula'
fp.test(formula, data, subset, ...)
x |
a numeric vector of data values. |
y |
a numeric vector of data values. |
delta |
null difference in medians tested. |
alternative |
a character string specifying the alternative hypothesis, must be one of |
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
subset |
an optional vector specifying a subset of observations to be used. |
... |
further arguments to be passed to or from other methods. |
The Fligner-Policello test does not assume that the shape of the distribution is similar in the two groups, contrary to the Mann-Whitney-Wilcoxon test. However, it assumes that the distributions are symmetric.
statistic |
test statistics. |
p.value |
p-value of the test. |
alternative |
a character string describing the alternative hypothesis. |
method |
a character string indicating the name of the test. |
data.name |
a character string giving the names of the data. |
null.value |
the specified hypothesized value of the median difference. |
Maxime HERVE <[email protected]>
x <- rpois(20,3)
y <- rpois(20,5)
fp.test(x,y)
Performs a G-test for comparing response probabilities (i.e. when the response variable is a binary variable). The function is in fact a wrapper to the G-test for comparison of proportions on a contingency table. If the p-value of the test is significant, the function performs pairwise comparisons by using G-tests.
G.bintest(formula, data, alpha = 0.05, p.method = "fdr")
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
alpha |
significance level to compute pairwise comparisons. |
p.method |
method for p-values correction. See help of |
If the response is a 0/1 variable, the probability of the '1' group is tested. In any other case, the response is transformed into a factor and the probability of the second level is tested.
Since a G-test is an approximate test, an exact test is preferable when the number of individuals is small (200 is a reasonable minimum). See fisher.bintest in that case.
method.test |
a character string giving the name of the global test computed. |
data.name |
a character string giving the name(s) of the data. |
alternative |
a character string describing the alternative hypothesis. |
estimate |
the estimated probabilities. |
null.value |
the value of the difference in probabilities under the null hypothesis, always 0. |
statistic |
test statistics. |
parameter |
test degrees of freedom. |
p.value |
p-value of the global test. |
alpha |
significance level. |
p.adjust.method |
method for p-values correction. |
p.value.multcomp |
data frame of pairwise comparisons result. |
method.multcomp |
a character string giving the name of the test computed for pairwise comparisons. |
Maxime HERVE <[email protected]>
response <- c(rep(0:1,c(40,60)),rep(0:1,c(55,45)),rep(0:1,c(65,35)))
fact <- gl(3,100,labels=LETTERS[1:3])
G.bintest(response~fact)
Performs pairwise comparisons after a global G-test.
G.multcomp(x, p.method = "fdr")
x |
numeric vector (counts). |
p.method |
method for p-values correction. See help of |
Since a G-test is an approximate test, an exact test is preferable when the number of individuals is small (200 is a reasonable minimum). See multinomial.multcomp in that case.
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.adjust.method |
method for p-values correction. |
p.value |
table of results. |
Maxime HERVE <[email protected]>
G.test, multinomial.test, multinomial.multcomp
counts <- c(49,30,63,59)
G.test(counts)
G.multcomp(counts)
Performs a G-test on a contingency table or a vector of counts.
G.test(x, p = rep(1/length(x), length(x)))
x |
a numeric vector or matrix (see Details). |
p |
theoretical proportions (optional). |
If x is a matrix, it must be constructed like this:
- 2 columns giving the number of successes (left) and failures (right)
- 1 row per population.
The function works as chisq.test:
- if x is a vector and theoretical proportions are not given, equality of counts is tested
- if x is a vector and theoretical proportions are given, equality of counts to theoretical counts (given by theoretical proportions) is tested
- if x is a matrix with two columns, equality of the proportion of successes between populations is tested
- if x is a matrix with more than two columns, independence of rows and columns is tested.
Since a G-test is an approximate test, an exact test is preferable when the number of individuals is small (200 is a reasonable minimum). See multinomial.test in that case with a vector, fisher.test with a matrix.
method |
name of the test. |
statistic |
test statistics. |
parameter |
test degrees of freedom. |
p.value |
p-value. |
data.name |
a character string giving the name(s) of the data. |
observed |
the observed counts. |
expected |
the expected counts under the null hypothesis. |
Maxime HERVE <[email protected]>
chisq.test, multinomial.test, fisher.test
G.multcomp, G.theo.multcomp, pairwise.G.test
counts <- c(49,30,63,59)
G.test(counts)
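A sketch of the two-column (successes/failures) case described above, with made-up counts:
mat <- matrix(c(40,60,55,45,65,35), ncol = 2, byrow = TRUE,
  dimnames = list(LETTERS[1:3], c("success", "failure")))
# Tests equality of the proportion of successes between the 3 populations
G.test(mat)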
Performs pairwise comparisons after a global G-test for given probabilities.
G.theo.multcomp(x, p = rep(1/length(x), length(x)), p.method = "fdr")
x |
numeric vector (counts). |
p |
theoretical proportions. |
p.method |
method for p-values correction. See help of |
Since a G-test is an approximate test, an exact test is preferable when the number of individuals is small (200 is a reasonable minimum). See multinomial.theo.multcomp in that case.
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
observed |
observed counts. |
expected |
expected counts. |
p.adjust.method |
method for p-values correction. |
statistic |
statistics of each test. |
p.value2 |
corrected p-values. |
p.value |
data frame of results. |
Maxime HERVE <[email protected]>
G.test, multinomial.test, multinomial.theo.multcomp
counts <- c(49,30,63,59)
p.theo <- c(0.2,0.1,0.45,0.25)
G.test(counts,p=p.theo)
G.theo.multcomp(counts,p=p.theo)
Performs a permutation significance test based on the total variance explained in a Generalized Procrustes Analysis. The function uses GPA.
GPA.test(df, group, tolerance = 10^-10, nbiteration = 200, scale = TRUE,
  nperm = 999, progress = TRUE)
df |
a data frame with n rows (individuals) and p columns (quantitative variables), in which all data frames are combined. |
group |
a vector indicating the number of variables in each group (i.e. data frame). |
tolerance |
a threshold with respect to which the algorithm stops, i.e. when the difference between the GPA loss function at step n and n+1 is less than |
nbiteration |
the maximum number of iterations until the algorithm stops. |
scale |
logical, if |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
Rows of each data frame are randomly and independently permuted.
The function deals with the limited floating point precision, which can bias the calculation of p-values based on a discrete test statistic distribution.
method |
a character string indicating the name of the test. |
data.name |
a character string giving the name(s) of the data, plus additional information. |
statistic |
the value of the test statistics. |
permutations |
the number of permutations. |
p.value |
the p-value of the test. |
Maxime HERVE <[email protected]>
Wakeling IN, Raats MM and MacFie HJH (1992) A new significance test for consensus in Generalized Procrustes Analysis. Journal of Sensory Studies 7:91-96.
require(FactoMineR)
data(wine)
## Not run: GPA.test(wine[,-(1:2)],group=c(5,3,10,9,2))
Computes the difference in regression parameters when each individual is dropped, expressed as a proportion of the whole-regression coefficients. The function deals with lm (including glm) and least.rect models.
ind.contrib(model, print.diff = FALSE, graph = TRUE, warning=25)
model |
model (of class |
print.diff |
logical. If |
graph |
logical. If |
warning |
level of graphical warning. |
coefficients |
coefficients of each computed regression. |
coefficients.diff |
difference in coefficients between each computed regression and the whole regression. |
coefficients.prop |
difference in coefficients expressed in proportion of the whole regression coefficients. |
Maxime HERVE <[email protected]>
x <- 1:30
y <- 1:30+rnorm(30,0,4)
model1 <- lm(y~x)
model2 <- least.rect(y~x)
ind.contrib(model1)
ind.contrib(model2)
Fits a least rectangle linear regression, possibly for each level of a factor.
least.rect(formula, data, conf.level = 0.95, theo = 1, adj = TRUE)
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
conf.level |
confidence level. |
theo |
theoretical value of the slope. If several regressions are fitted, the same value is used for all comparisons of slope vs. theoretical value. |
adj |
logical indicating if, in case of several regressions fitted, confidence intervals and p-values should be Bonferroni-corrected for multiple testing. |
coefficients |
regression parameters. |
residuals |
residuals. |
fitted.values |
fitted values. |
call |
the matched call. |
model |
the model frame used. |
conf.level |
confidence level. |
conf.int |
confidence interval of regression parameters. |
theo |
theoretical value of the slope. |
comp |
data frame of results for equality of the slope(s) to the theoretical value. |
corr |
data frame of results for the significance of the correlation coefficient(s). |
multiple |
logical, |
adj |
logical, |
Maxime HERVE <[email protected]>
x <- 1:30+rnorm(30,0,3)
y <- 1:30+rnorm(30,0,3)
regression1 <- least.rect(y~x)
summary(regression1)
x2 <- c(1:30,1:30)
y2 <- c(1:30+rnorm(30,0,3),seq(10,22,12/29)+rnorm(30,0,3))
fact <- gl(2,30,labels=LETTERS[1:2])
regression2 <- least.rect(y2~x2|fact)
summary(regression2)
Returns the slope of a line defined by selecting two points on a graph.
loc.slp()
Maxime HERVE <[email protected]>
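A minimal interactive sketch (to be run on an open graphics device; the data are arbitrary):
## Not run: 
plot(1:10, (1:10)^2, type = "l")
# Click two points on the line; the slope between them is returned
slp <- loc.slp()
slp
## End(Not run)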
Cuts the data into intervals, computes the response probability and its standard error for each interval, and adds the results to the regression curve. No test is performed but this gives a graphical idea of how well the model fits the data.
logis.fit(model, int = 5, ...)
model |
|
int |
number of intervals. |
... |
Maxime HERVE <[email protected]>
x <- 1:50
y <- c(rep(0,18),sample(0:1,14,replace=TRUE),rep(1,18))
model <- glm(y~x,family=binomial)
plot(x,y)
lines(x,model$fitted)
logis.fit(model)
Adds some noise to the fitted values of a glm model to create a nls model for logistic regression (creating a nls model from exact fitted values cannot be done, see the help of nls).
logis.noise(model, intensity = 25)
model |
|
intensity |
intensity of the noise: the lower the value, the bigger the noise. |
Maxime HERVE <[email protected]>
x <- 1:50
y <- c(rep(0,18),sample(0:1,14,replace=TRUE),rep(1,18))
model <- glm(y~x,family=binomial)
y2 <- logis.noise(model)
# Then model2 <- nls(y2~SSlogis(...))
Computes the mode of a vector. The function distinguishes between continuous and discontinuous variables (the latter made up of integers only). By extension, it also gives the most frequent value in a character vector or a factor.
mod(x)
x |
vector (numeric, character or factor). |
Maxime HERVE <[email protected]>
# Continuous variable
x <- rnorm(100)
mod(x)
# Discontinuous variable
y <- rpois(100,2)
mod(y)
# Character vector
z <- sample(LETTERS[1:3],20,replace=TRUE)
mod(z)
Performs a Mood's median test to compare medians of independent samples.
mood.medtest(x, ...)
## Default S3 method:
mood.medtest(x, g, exact = NULL, ...)
## S3 method for class 'formula'
mood.medtest(formula, data, subset, ...)
x |
a numeric vector of data values. |
g |
a vector or factor object giving the group for the corresponding elements of |
exact |
a logical indicating whether an exact p-value should be computed. |
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
subset |
an optional vector specifying a subset of observations to be used. |
... |
further arguments to be passed to or from other methods. |
If exact=NULL
, a Fisher's exact test is used if the number of data values is < 200; otherwise a chi-square test is used, with Yates continuity correction if necessary.
method |
a character string indicating the name of the test. |
data.name |
a character string giving the name(s) of the data. |
statistic |
the value of the chi-squared test statistic (in the case of a chi-squared test). |
parameter |
the degrees of freedom of the approximate chi-squared distribution of the test statistic (in the case of a chi-squared test). |
p.value |
the p-value of the test. |
Maxime HERVE <[email protected]>
set.seed(1716)
response <- c(rnorm(10,3,1.5),rnorm(10,5.5,2))
fact <- gl(2,10,labels=LETTERS[1:2])
mood.medtest(response~fact)
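As a variant of the example above, the chi-squared approximation can be requested explicitly; this is only a sketch, exact=FALSE being passed through the formula interface to the default method:

mood.medtest(response~fact,exact=FALSE)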
Draws a QQ-plot to assess multivariate normality.
mqqnorm(x, main = "Multi-normal Q-Q Plot")
x |
a data frame or a matrix of numeric variables (each column giving a variable). |
main |
title of the graph. |
Maxime HERVE <[email protected]>
x <- 1:30+rnorm(30)
y <- 1:30+rnorm(30,1,3)
mqqnorm(cbind(x,y))
Performs a Shapiro-Wilk test to assess multivariate normality. This is a slightly modified copy of the mshapiro.test
function of the package mvnormtest, for internal convenience.
mshapiro.test(x)
x |
a data frame or a matrix of numeric variables (each column giving a variable). |
method |
name of the test. |
data.name |
a character string giving the names of the data. |
statistic |
test statistics of the test. |
p.value |
p-value of the test. |
Maxime HERVE <[email protected]> from the work of Slawomir Jarek
x <- 1:30+rnorm(30)
y <- 1:30+rnorm(30,1,3)
mshapiro.test(cbind(x,y))
Performs pairwise comparisons after a global exact multinomial test. These comparisons are performed using exact binomial tests.
multinomial.multcomp(x, p.method = "fdr")
x |
numeric vector (counts). Can also be a factor; in that case |
p.method |
method for p-values correction. See help of |
Since chi-squared and G tests are approximate tests, exact tests are preferable when the number of individuals is small (approximate tests typically require at least 200 individuals).
An exact multinomial test with two groups is strictly the same as an exact binomial test.
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.adjust.method |
method for p-values correction. |
p.value |
table of results. |
Maxime HERVE <[email protected]>
counts <- c(5,15,23)
multinomial.test(counts)
multinomial.multcomp(counts)
Performs an exact multinomial test on a vector of counts.
multinomial.test(x, p = rep(1/length(x), length(x)))
x |
numeric vector (counts). Can also be a factor; in that case |
p |
theoretical proportions (optional). |
The function works as chisq.test
or G.test
:
- if theoretical proportions are not given, equality of counts is tested
- if theoretical proportions are given, equality of counts to theoretical counts (given by theoretical proportions) is tested.
Since chi-squared and G tests are approximate tests, exact tests are preferable when the number of individuals is small (approximate tests typically require at least 200 individuals).
Be aware that the calculation time increases with the number of individuals (i.e. the sum of x
) and the number of groups (i.e. the length of x
).
An exact multinomial test with two groups is strictly the same as an exact binomial test.
method |
name of the test. |
p.value |
p-value. |
data.name |
a character string giving the name(s) of the data. |
observed |
the observed counts. |
expected |
the expected counts under the null hypothesis. |
Maxime HERVE <[email protected]> based on multinomial.test
chisq.test
, G.test
, binom.test
, multinomial.multcomp
, multinomial.theo.multcomp
counts <- c(5,15,23)
multinomial.test(counts)
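To illustrate the two-group equivalence mentioned in the Details, a minimal sketch with made-up counts (both calls should return the same p-value):

counts2 <- c(10,15)
multinomial.test(counts2)
binom.test(10,25)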
Performs pairwise comparisons after a global exact multinomial test for given probabilities. These comparisons are performed using exact binomial tests.
multinomial.theo.multcomp(x, p = rep(1/length(x), length(x)), prop = FALSE, p.method = "fdr")
x |
numeric vector (counts). Can also be a factor; in that case |
p |
theoretical proportions. |
prop |
logical indicating if results should be printed as counts ( |
p.method |
method for p-values correction. See help of |
Since chi-squared and G tests are approximate tests, exact tests are preferable when the number of individuals is small (approximate tests typically require at least 200 individuals).
An exact multinomial test with two groups is strictly the same as an exact binomial test.
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
observed |
observed counts. |
expected |
expected counts. |
p.adjust.method |
method for p-values correction. |
p.value2 |
corrected p-values. |
p.value |
data frame of results. |
Maxime HERVE <[email protected]>
counts <- c(5,15,23)
p.theo <- c(0.2,0.5,0.3)
multinomial.test(counts,p=p.theo)
multinomial.theo.multcomp(counts,p=p.theo)
Performs correlation tests between one variable and a series of other variables, and corrects p-values.
multtest.cor(mult.var, uni.var, method = "pearson", p.method = "fdr", ordered = TRUE)
## S3 method for class 'multtest.cor'
plot(x, arrows = TRUE, main = NULL, pch = 16, cex = 1, col = c("red", "orange", "black"), labels = NULL, ...)
mult.var |
data frame containing a series of numeric variables. |
uni.var |
numeric variable (vector). |
method |
a character string indicating which correlation coefficient is to be computed. See help of |
p.method |
method for p-values correction. See help of |
ordered |
logical indicating if variables should be ordered based on correlation values. |
x |
object returned from |
arrows |
logical indicating if arrows should be plotted. If |
main |
optional title of the graph. |
pch |
symbol(s) used for points, when points are displayed (see |
cex |
size of points and labels (see help of |
col |
vector of three colors: first for variables with P < 0.05, second for variables with 0.05 < P < 0.1, third for variables with P > 0.1. Recycled if only one value. |
labels |
names of the variables. If |
... |
not used. |
Maxime HERVE <[email protected]>
data(iris)
# Original coordinates
plot(Petal.Length~Sepal.Length,pch=16,col=as.numeric(iris$Species),data=iris)
# New axis
abline(-6,1.6)
# Coordinates on new axis
new.coord <- coord.proj(iris[,c("Sepal.Length","Petal.Length")],1.6)
# Correlation between the whole dataset and new coordinates
mult.cor <- multtest.cor(iris[,1:4],new.coord)
plot(mult.cor)
Performs group comparisons for multiple variables using parametric, permutational or rank tests, and corrects p-values. It also gives group means and standard errors for each variable.
multtest.gp(tab, fac, test = c("param", "perm", "rank"), transform = c("none", "sqrt", "4rt", "log"), add = 0, p.method = "fdr", ordered = TRUE, ...)
## S3 method for class 'multtest.gp'
plot(x, signif = FALSE, alpha = 0.05, vars = NULL, xlab = "Group", ylab = "Mean (+/- SE) value", titles = NULL, groups = NULL, ...)
tab |
data frame containing response variables. |
fac |
factor defining groups to compare. |
test |
type of test to use: parametric (default), permutational (non parametric) or rank-based (non parametric). See Details. |
transform |
transformation to apply to response variables before testing (none by default; |
add |
value to add to response variables before a log-transformation. |
p.method |
method for p-values correction. See help of |
ordered |
logical indicating if variables should be ordered based on p-values. |
x |
object returned from |
signif |
logical indicating if only variables with significant P-value should be plotted. |
alpha |
significance threshold. |
vars |
numeric vector giving variables to plot (rows of |
xlab |
legend of the x axis. |
ylab |
legend of the y axis. |
titles |
titles of the graphs (name of the variables by default). |
groups |
names of the bars (levels of |
... |
additional arguments to testing functions in |
For parametric tests, t-tests or ANOVAs are used depending on the number of groups (two or more than two, respectively). For permutational tests, permutational t-tests or permutational ANOVAs are used; for rank-based tests, Mann-Whitney-Wilcoxon or Kruskal-Wallis tests.
Maxime HERVE <[email protected]>
data(iris)
mult <- multtest.gp(iris[,1:4],iris$Species)
plot(mult)
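As a variant of the example above, using the rank-based tests described in the Details (Kruskal-Wallis tests here, since iris$Species has three levels); a minimal sketch only:

mult.rank <- multtest.gp(iris[,1:4],iris$Species,test="rank")
plot(mult.rank)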
Performs group comparisons for multiple binary variables using a parametric or an exact test, and corrects p-values. It also gives group proportions and standard errors for each variable.
multtest.gp.bin(tab, fac, test = c("LRT", "Fisher"), p.method = "fdr", ordered = TRUE, ...)
## S3 method for class 'multtest.gp.bin'
plot(x, signif = FALSE, alpha = 0.05, vars = NULL, xlab = "Group", ylab = "Mean (+/- SE) proportion", titles = NULL, groups = NULL, ...)
tab |
data frame containing response variables. |
fac |
factor defining groups to compare. |
test |
type of test to use: likelihood ratio test based on binomial GLM ( |
p.method |
method for p-values correction. See help of |
ordered |
logical indicating if variables should be ordered based on p-values. |
x |
object returned from |
signif |
logical indicating if only variables with significant P-value should be plotted. |
alpha |
significance threshold. |
vars |
numeric vector giving variables to plot (rows of |
xlab |
legend of the x axis. |
ylab |
legend of the y axis. |
titles |
titles of the graphs (name of the variables by default). |
groups |
names of the bars (levels of |
... |
additional arguments to testing functions in |
Maxime HERVE <[email protected]>
multtest.gp
, Anova
, fisher.test
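No example is given above; the following minimal sketch uses simulated binary responses (the data, variable names and probabilities are purely illustrative):

set.seed(42)
fac <- gl(2,20,labels=LETTERS[1:2])
tab <- data.frame(resp1=rbinom(40,1,ifelse(fac=="A",0.2,0.7)),
                  resp2=rbinom(40,1,0.5),
                  resp3=rbinom(40,1,ifelse(fac=="A",0.8,0.4)))
res <- multtest.gp.bin(tab,fac,test="LRT")
res
plot(res)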
This function is a wrapper to anova.cca(...,by="terms")
but performs type II tests (whereas anova.cca
performs type I).
MVA.anova(object, ...)
object |
|
... |
additional arguments to |
See anova.cca
for detailed explanation of what is done. The only difference with anova.cca
is that MVA.anova
performs type II tests instead of type I.
See example of adonis.II
for the difference between type I (sequential) and type II tests.
a data frame of class "anova"
.
Maxime HERVE <[email protected]>
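No example is given above; a minimal sketch with the vegan dune data (the model formula is only illustrative):

require(vegan)
data(dune)
data(dune.env)
RDA <- rda(dune~Management+A1,data=dune.env)
MVA.anova(RDA)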
Displays a biplot of a multivariate analysis. This simply consists of superimposing a score plot and a correlation circle (plus centroids of factor levels in constrained analyses, RDA or CCA). The correlation circle is adjusted to fit the size of the score plot.
MVA.biplot(x, xax = 1, yax = 2, scaling = 2, sco.set = c(12, 1, 2), cor.set = c(12, 1, 2), space = 1, ratio = 0.9, weights = 1, constraints = c("nf", "n", "f", NULL), sco.args = list(), cor.args = list(), f.col = 1, f.cex = 1)
x |
a multivariate analysis (see Details). |
xax |
the horizontal axis. |
yax |
the vertical axis. |
scaling |
type of scaling (see |
sco.set |
scores to be displayed, when several sets are available (see |
cor.set |
correlations to be displayed, when several sets are available (see |
space |
space to use, when several are available (see |
ratio |
constant for adjustment of correlations to the size of the score plot ( |
weights |
only used with constrained analyses (RDA or CCA) where some constraints are factors. Individual weights, used to calculate barycenter positions. |
constraints |
only used with constrained analyses (RDA or CCA). Type of constraints to display: quantitative ( |
sco.args |
list containing optional arguments to pass to |
cor.args |
list containing optional arguments to pass to |
f.col |
color(s) used for barycenters in case of a constraint analysis (RDA or CCA) containing factor constraint(s). Can be a unique value, a vector giving one color per constraint or a vector giving one color per barycenter (all factors confounded). |
f.cex |
size(s) used for barycenters in case of a constraint analysis (RDA or CCA) containing factor constraint(s). Can be a unique value, a vector giving one size per constraint or a vector giving one size per barycenter (all factors confounded). |
This function should not be used directly. Prefer the general MVA.plot
, to which all arguments can be passed.
All multivariate analyses covered by MVA.corplot
can be used for biplots.
Maxime HERVE <[email protected]>
require(vegan)
data(iris)
RDA <- rda(iris[,1:4]~Species,data=iris)
MVA.plot(RDA,"biplot",cor.args=list(col="purple"),ratio=0.8,f.col=c("red","green","blue"))
Performs cross model validation (2CV) with different PLS analyses.
MVA.cmv(X, Y, repet = 10, kout = 7, kinn = 6, ncomp = 8, scale = TRUE, model = c("PLSR", "CPPLS", "PLS-DA", "PPLS-DA", "PLS-DA/LDA", "PLS-DA/QDA", "PPLS-DA/LDA", "PPLS-DA/QDA"), crit.inn = c("RMSEP", "Q2", "NMC"), Q2diff = 0.05, lower = 0.5, upper = 0.5, Y.add = NULL, weights = rep(1, nrow(X)), set.prior = FALSE, crit.DA = c("plug-in", "predictive", "debiased"), ...)
X |
a data frame of independent variables. |
Y |
the dependent variable(s): numeric vector, data frame of quantitative variables or factor. |
repet |
an integer giving the number of times the whole 2CV procedure has to be repeated. |
kout |
an integer giving the number of folds in the outer loop (can be re-set internally if needed). |
kinn |
an integer giving the number of folds in the inner loop (can be re-set internally if needed). Cannot be |
ncomp |
an integer giving the maximal number of components to be tested in the inner loop (can be re-set depending on the size of the train sets). |
scale |
logical indicating if data should be scaled (see Details). |
model |
the model to be fitted (see Details). |
crit.inn |
the criterion to be used to choose the number of components in the inner loop. Root Mean Square Error of Prediction ( |
Q2diff |
the threshold to be used if the number of components is chosen according to Q2. The next component is added only if it makes the Q2 increase more than |
lower |
a vector of lower limits for power optimisation in CPPLS or PPLS-DA (see |
upper |
a vector of upper limits for power optimisation in CPPLS or PPLS-DA (see |
Y.add |
a vector or matrix of additional responses containing relevant information about the observations, in CPPLS or PPLS-DA (see |
weights |
a vector of individual weights for the observations, in CPPLS or PPLS-DA (see |
set.prior |
only used when a second analysis (LDA or QDA) is performed. If |
crit.DA |
criterion used to predict class membership when a second analysis (LDA or QDA) is used. See |
... |
other arguments to pass to |
Cross model validation is detailed in Szymanska et al. (2012). Some more details about how this function works:
- when a discriminant analysis is used ("PLS-DA"
, "PPLS-DA"
, "PLS-DA/LDA"
, "PLS-DA/QDA"
, "PPLS-DA/LDA"
or "PPLS-DA/QDA"
), the training sets (test set itself in the inner loop, test+validation sets in the outer loop) are generated with respect to the relative proportions of the levels of Y
in the original data set (see splitf
).
- "PLS-DA"
is considered as PLS2 on a dummy-coded response. For a PLS-DA based on the CPPLS algorithm, use "PPLS-DA"
with lower
and upper
limits of the power parameters set to 0.5
.
- if a second analysis is used ("PLS-DA/LDA"
, "PLS-DA/QDA"
, "PPLS-DA/LDA"
or "PPLS-DA/QDA"
), a LDA or QDA is built on scores of the first analysis (PLS-DA or PPLS-DA) also in the inner loop. The classification error rate, based on this second analysis, is used to choose the number of components.
If scale = TRUE
, the scaling is done as follows:
- for each step of the outer loop (i.e. kout
steps), the rest set is pre-processed by centering and unit-variance scaling. Means and standard deviations of variables in the rest set are then used to scale the test set.
- for each step of the inner loop (i.e. kinn
steps), the training set is pre-processed by centering and unit-variance scaling. Means and standard deviations of variables in the training set are then used to scale the validation set.
model |
model used. |
type |
type of model used. |
repet |
number of times the whole 2CV procedure was repeated. |
kout |
number of folds in the outer loop. |
kinn |
number of folds in the inner loop. |
crit.inn |
criterion used to choose the number of components in the inner loop. |
crit.DA |
criterion used to classify individuals of the test and validation sets. |
Q2diff |
threshold used if the number of components is chosen according to Q2. |
groups |
levels of |
models.list |
list of models generated ( |
models1.list |
list of (P)PLS-DA models generated ( |
models2.list |
list of LDA/QDA models generated ( |
RMSEP |
RMSEP computed from the models used in the outer loops ( |
Q2 |
Q2 computed from the models used in the outer loops ( |
NMC |
Classification error rate computed from the models used in the outer loops ( |
confusion |
Confusion matrices computed from the models used in the outer loops ( |
pred.prob |
Probability of each individual of being of each level of |
Maxime HERVE <[email protected]>
Szymanska E, Saccenti E, Smilde AK and Westerhuis J (2012) Double-check: validation of diagnostic statistics for PLS-DA models in metabolomics studies. Metabolomics 8:S3-S16.
predict.MVA.cmv
, mvr
, lda
, qda
require(pls)
require(MASS)
# PLSR
data(yarn)
## Not run: MVA.cmv(yarn$NIR,yarn$density,model="PLSR")
# PPLS-DA coupled to LDA
data(mayonnaise)
## Not run: MVA.cmv(mayonnaise$NIR,factor(mayonnaise$oil.type),model="PPLS-DA/LDA",crit.inn="NMC")
Returns correlations of a multivariate analysis.
MVA.cor(x, xax = 1, yax = 2, set = c(12, 1, 2), space = 1, ...)
x |
a multivariate analysis (see Details). |
xax |
axis or axes for which to extract correlations. |
yax |
axis for which to extract correlations (ignored if |
set |
variables to be displayed, when several sets are available (see Details). |
space |
variables to be displayed, when several spaces are available (see Details). |
... |
not used. |
Many multivariate analyses are supported, from various packages:
- sPCA: spca
.
- IPCA: ipca
.
- sIPCA: sipca
.
- PLS-DA (PLS2 on a dummy-coded factor): plsda
. X space only.
- sPLS-DA (sPLS2 on a dummy-coded factor): splsda
. X space only.
- CPPLS: mvr
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set. X space only.
- PLSR: mvr
, pls
, plsR
(plsRglm package). Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set. X space only.
- sPLSR: pls
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set. X space only.
- PLS-GLR: plsRglm
(plsRglm package). Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set. Correlations are computed with Y on the link scale.
- PCR: mvr
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- CDA: discrimin
, discrimin.coa
.
- NSCOA: dudi.nsc
. For NSCOA there is no real correlation, but columns are classically represented as arrows; this is why MVA.corplot can handle this analysis.
- CCA: cca
, pcaiv
. Constraints (only quantitative constraints are extracted) in constrained space only.
- Mix analysis: dudi.mix
, dudi.hillsmith
. Only quantitative variables are displayed.
- RDA (or PCAIV): pcaiv
, pcaivortho
, rda
. With rda
, space 1 is constrained space, space 2 is unconstrained space. Only constrained space is available with pcaiv
, the opposite for pcaivortho
. Set 1 is constraints (only quantitative constraints are extracted), set 2 is dependent variables (only set 2 is available for pcaivortho
). If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- CCorA: CCorA
, rcc
. Space 1 is X, space 2 is Y. With rcc
a third space is available, in which coordinates are means of X and Y coordinates. In this third space, set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- rCCorA: rcc
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates. In space 3, set 1 is X and set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- CIA: coinertia
. Space 1 is X, space 2 is Y, space 3 is a "common" space where X and Y scores are normed. In space 3, set 1 is X and set 2 is Y. If set=12
in space 3 (default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- GPA: GPA
. Only the consensus ordination can be displayed.
- 2B-PLS: pls
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates. In space 3, set 1 is X and set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- 2B-sPLS: pls
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates. In space 3, set 1 is X and set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- rGCCA: wrapper.rgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- sGCCA: wrapper.sgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- DIABLO: block.plsda
, block.splsda
. Space can be 1 to n, the number of blocks (i.e. datasets).
Maxime HERVE <[email protected]>
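No example is given above; a minimal sketch with a vegan RDA, one of the supported analyses listed above (the mite data and the two quantitative constraints are only illustrative):

require(vegan)
data(mite)
data(mite.env)
RDA <- rda(mite~SubsDens+WatrCont,data=mite.env)
MVA.cor(RDA)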
Displays a correlation circle of a multivariate analysis.
MVA.corplot(x, xax = 1, yax = 2, thresh = 0, fac = NULL, set = c(12, 1, 2), space = 1, xlab = NULL, ylab = NULL, main = NULL, circle = TRUE, intcircle = 0.5, points = TRUE, ident = TRUE, arrows = TRUE, labels = NULL, main.pos = c("bottomleft", "topleft", "bottomright", "topright"), main.cex = 1.3, legend = FALSE, legend.pos = c("topleft", "topright", "bottomleft", "bottomright"), legend.title = NULL, legend.lab = NULL, pch = 16, cex = 1, col = 1, lwd = 1, drawintaxes = TRUE, add = FALSE, add.const = 1, keepmar = FALSE)
x |
a multivariate analysis (see Details). |
xax |
the horizontal axis. |
yax |
the vertical axis. This can be set to |
thresh |
threshold (in absolute value of the correlation coefficient) of variables to be plotted. |
fac |
an optional factor defining groups of variables. |
set |
variables to be displayed, when several sets are available (see Details). |
space |
variables to be displayed, when several spaces are available (see Details). |
xlab |
legend of the horizontal axis. If |
ylab |
only used for two-dimensional graphs. Legend of the vertical axis. If |
main |
optional title of the graph. |
circle |
only used for two-dimensional graphs. Logical indicating if the circle of radius 1 should be plotted. |
intcircle |
only used for two-dimensional graphs. Vector of one or several values indicating radii of circles to be plotted inside the main circle. Can be set to |
points |
only used for two-dimensional graphs. If |
ident |
only used for two-dimensional graphs when |
arrows |
only used if |
labels |
names of the variables. If |
main.pos |
position of the title, if |
main.cex |
size of the title, if |
legend |
only used for two-dimensional graphs. Logical indicating if a legend should be added to the graph. |
legend.pos |
position of the legend, if |
legend.title |
optional title of the legend, if |
legend.lab |
legend labels, if |
pch |
symbol(s) used for points, when points are displayed (see |
cex |
size of the points and/or of the variable names. For two-dimensional graphs: if |
col |
color(s) used for points and/or variable names. If |
lwd |
only used if arrows are displayed. Width of arrows. If |
drawintaxes |
logical indicating if internal axes should be drawn. |
add |
only used for two-dimensional graphs. Logical indicating if the correlation circle should be added to an existing graph. |
add.const |
only used for two-dimensional graphs and if |
keepmar |
only used for two-dimensional graphs. Logical indicating if margins defined by MVA.corplot should be kept after plotting (necessary in some cases when |
This function should not be used directly. Prefer the general MVA.plot
, to which all arguments can be passed.
Many multivariate analyses are supported, from various packages:
- sPCA: spca
.
- IPCA: ipca
.
- sIPCA: sipca
.
- PLS-DA (PLS2 on a dummy-coded factor): plsda
. X space only.
- sPLS-DA (sPLS2 on a dummy-coded factor): splsda
. X space only.
- CPPLS: mvr
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set. X space only.
- PLSR: mvr
, pls
, plsR
(plsRglm package). Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set. X space only.
- sPLSR: pls
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set. X space only.
- PLS-GLR: plsRglm
(plsRglm package). Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set. Correlations are computed with Y on the link scale.
- PCR: mvr
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- CDA: discrimin
, discrimin.coa
.
- NSCOA: dudi.nsc
. For NSCOA there is no real correlation, but columns are classically represented as arrows; this is why MVA.corplot can handle this analysis.
- CCA: cca
, pcaiv
. Constraints (only quantitative constraints are extracted) in constrained space only.
- Mix analysis: dudi.mix
, dudi.hillsmith
. Only quantitative variables are displayed.
- RDA (or PCAIV): pcaiv
, pcaivortho
, rda
. With rda
, space 1 is constrained space, space 2 is unconstrained space. Only constrained space is available with pcaiv
, the opposite for pcaivortho
. Set 1 is constraints (only quantitative constraints are extracted), set 2 is dependent variables (only set 2 is available for pcaivortho
). If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- db-RDA: capscale
, dbrda
. Constraints (only quantitative constraints are extracted) in constrained space only.
- CCorA: CCorA
, rcc
. Space 1 is X, space 2 is Y. With rcc
a third space is available, in which coordinates are means of X and Y coordinates. In this third space, set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- rCCorA: rcc
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates. In space 3, set 1 is X and set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- CIA: coinertia
. Space 1 is X, space 2 is Y, space 3 is a "common" space where X and Y scores are normed. In space 3, set 1 is X and set 2 is Y. If set=12
in space 3 (default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- PCIA: procuste
. Set 1 is X, set 2 is Y.
- 2B-PLS: pls
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates. In space 3, set 1 is X and set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- 2B-sPLS: pls
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates. In space 3, set 1 is X and set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- rGCCA: wrapper.rgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- sGCCA: wrapper.sgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- DIABLO: block.plsda
, block.splsda
. Space can be 1 to n, the number of blocks (i.e. datasets).
Maxime HERVE <[email protected]>
require(ade4)
data(olympic)
PCA <- dudi.pca(olympic$tab,scannf=FALSE)
MVA.plot(PCA,"corr")
Performs cross validation with different PLS and/or discriminant analyses.
MVA.cv(X, Y, repet = 10, k = 7, ncomp = 8, scale = TRUE, model = c("PLSR", "CPPLS", "PLS-DA", "PPLS-DA", "LDA", "QDA", "PLS-DA/LDA", "PLS-DA/QDA", "PPLS-DA/LDA", "PPLS-DA/QDA"), lower = 0.5, upper = 0.5, Y.add = NULL, weights = rep(1, nrow(X)), set.prior = FALSE, crit.DA = c("plug-in", "predictive", "debiased"), ...)
X |
a data frame of independent variables. |
Y |
the dependent variable(s): numeric vector, data frame of quantitative variables or factor. |
repet |
an integer giving the number of times the whole procedure has to be repeated. |
k |
an integer giving the number of folds (can be re-set internally if needed). |
ncomp |
an integer giving the number of components to be used for all models except LDA and QDA (can be re-set depending on the size of the train sets). |
scale |
logical indicating if data should be scaled (see Details). |
model |
the model to be fitted (see Details). |
lower |
a vector of lower limits for power optimisation in CPPLS or PPLS-DA (see |
upper |
a vector of upper limits for power optimisation in CPPLS or PPLS-DA (see |
Y.add |
a vector or matrix of additional responses containing relevant information about the observations, in CPPLS or PPLS-DA (see |
weights |
a vector of individual weights for the observations, in CPPLS or PPLS-DA (see |
set.prior |
only used when a LDA or QDA is performed (coupled or not with a PLS model). If |
crit.DA |
criterion used to predict class membership when a LDA or QDA is used. See |
... |
other arguments to pass to |
When a discriminant analysis is used ("PLS-DA"
, "PPLS-DA"
, "LDA"
, "QDA"
, "PLS-DA/LDA"
, "PLS-DA/QDA"
, "PPLS-DA/LDA"
or "PPLS-DA/QDA"
), the training sets are generated with respect to the relative proportions of the levels of Y
in the original data set (see splitf
).
"PLS-DA"
is considered as PLS2 on a dummy-coded response. For a PLS-DA based on the CPPLS algorithm, use "PPLS-DA"
with lower
and upper
limits of the power parameters set to 0.5
.
If scale = TRUE
, the scaling is done as follows: for each step of the validation loop (i.e. k
steps), the training set is pre-processed by centering and unit-variance scaling. Means and standard deviations of variables in the training set are then used to scale the test set.
model |
model used. |
type |
type of model used. |
repet |
number of times the whole procedure was repeated. |
k |
number of folds. |
ncomp |
number of components used. |
crit.DA |
criterion used to classify individuals of the test sets. |
groups |
levels of |
models.list |
list of models generated ( |
models1.list |
list of (P)PLS-DA models generated ( |
models2.list |
list of LDA/QDA models generated ( |
RMSEP |
RMSEP values ( |
Q2 |
Q2 values ( |
NMC |
Classification error rates ( |
confusion |
Confusion matrices ( |
pred.prob |
Probability of each individual of being of each level of |
Maxime HERVE <[email protected]>
predict.MVA.cmv
, mvr
, lda
, qda
require(pls)
require(MASS)
# PLSR
data(yarn)
## Not run: MVA.cv(yarn$NIR,yarn$density,model="PLSR")
# PPLS-DA coupled to LDA
data(mayonnaise)
## Not run: MVA.cv(mayonnaise$NIR,factor(mayonnaise$oil.type),model="PPLS-DA/LDA")
Returns loadings of a multivariate analysis.
MVA.load(x, xax = 1, yax = 2, set = c(12, 1, 2), space = 1, ...)
x |
a multivariate analysis (see Details). |
xax |
axis or axes for which to extract loadings. |
yax |
axis for which to extract loadings (ignored if |
set |
variables to be displayed, when several sets are available (see Details). |
space |
variables to be displayed, when several spaces are available (see Details). |
... |
not used. |
Many multivariate analyses are supported, from various packages:
- PCA: prcomp
, princomp
, dudi.pca
, rda
, pca
, pca
.
- sPCA: spca
.
- IPCA: ipca
.
- sIPCA: sipca
.
- PLS-DA (PLS2 on a dummy-coded factor): plsda
. X space only.
- sPLS-DA (sPLS2 on a dummy-coded factor): splsda
. X space only.
- CPPLS: mvr
. X space only.
- PLSR: mvr
, pls
, plsR
(plsRglm package). X space only.
- sPLSR: pls
. X space only.
- PLS-GLR: plsRglm
(plsRglm package).
- PCR: mvr
.
- CDA: discrimin
, discrimin.coa
.
- NSCOA: dudi.nsc
.
- MCA: dudi.acm
.
- Mix analysis: dudi.mix
, dudi.hillsmith
.
- PCIA: procuste
. Set 1 is X, set 2 is Y.
- RDA (or PCAIV): pcaiv
, pcaivortho
, rda
. With rda
, space 1 is constrained space, space 2 is unconstrained space. Only constrained space is available with pcaiv
, the opposite for pcaivortho
.
- CCorA: rcc
. Space 1 is X, space 2 is Y.
- rCCorA: rcc
. Space 1 is X, space 2 is Y.
- CIA: coinertia
. Space 1 is X, space 2 is Y.
- 2B-PLS: pls
. Space 1 is X, space 2 is Y.
- 2B-sPLS: pls
. Space 1 is X, space 2 is Y.
- rGCCA: wrapper.rgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- sGCCA: wrapper.sgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- DIABLO: block.plsda
, block.splsda
. Space can be 1 to n, the number of blocks (i.e. datasets).
Maxime HERVE <[email protected]>
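No example is given above; a minimal sketch extracting the loadings of a PCA, one of the supported analyses listed above:

require(ade4)
data(olympic)
PCA <- dudi.pca(olympic$tab,scannf=FALSE)
MVA.load(PCA)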
Displays a loading plot of a multivariate analysis.
MVA.loadplot(x, xax = 1, yax = 2, fac = NULL, set = c(12, 1, 2), space = 1, map = TRUE, xlab = NULL, ylab = NULL, main = NULL, points = TRUE, ident = TRUE, links = TRUE, line = TRUE, labels = NULL, main.pos = c("bottomleft", "topleft","bottomright", "topright"), main.cex = 1.3, legend = FALSE, legend.pos = c("topleft", "topright", "bottomleft", "bottomright"), legend.title = NULL, legend.lab = NULL, pch = 16, cex = 1, col = 1, lwd = 1, lty = 1, drawextaxes = TRUE, drawintaxes = TRUE, xlim = NULL, ylim = NULL)
x |
a multivariate analysis (see Details). |
xax |
the horizontal axis. |
yax |
the vertical axis. This can be set to |
fac |
only used for one-dimensional graphs. An optional factor defining groups of variables. |
set |
variables to be displayed, when several sets are available (see Details). |
space |
variables to be displayed, when several spaces are available (see Details). |
map |
logical indicating if a two-dimensional ( |
xlab |
only used for two-dimensional graphs. Legend of the horizontal axis. If |
ylab |
legend of the vertical axis. If |
main |
optional title of the graph. |
points |
only used for two-dimensional graphs. If |
ident |
logical indicating if variable names should be displayed. Only used when |
links |
only used for two-dimensional graphs when |
line |
only used for one-dimensional graphs when |
labels |
only used if |
main.pos |
only used for one-dimensional graphs. Position of the title, if |
main.cex |
size of the title, if |
legend |
logical indicating if a legend should be added to the graph. |
legend.pos |
position of the legend, if |
legend.title |
optional title of the legend, if |
legend.lab |
legend labels, if |
pch |
only used for two-dimensional graphs. Symbol(s) used for points, when points are displayed (see |
cex |
size of the points and/or of the variable names. For two-dimensional graphs: if |
col |
color(s) used for points, variable names and/or lines/sticks. For one-dimensional graphs, can be a vector of length one or a vector giving one value per line. For two-dimensional graphs: if |
lwd |
width of lines. For one-dimensional graphs, can be a vector of length one or a vector giving one value per line. For two-dimensional graphs: if |
lty |
only used for one-dimensional graphs. Can be a vector of length one or a vector giving one value per line. |
drawextaxes |
logical indicating if external axes should be drawn. |
drawintaxes |
only used for two-dimensional graphs. Logical indicating if internal axes should be drawn. |
xlim |
only used in two-dimensional graphs. Limits of the horizontal axis. If |
ylim |
limits of the vertical axis. If |
This function should not be used directly. Prefer the general MVA.plot
, to which all arguments can be passed.
Many multivariate analyses are supported, from various packages:
- PCA: prcomp
, princomp
, dudi.pca
, rda
, pca
, pca
.
- sPCA: spca
.
- IPCA: ipca
.
- sIPCA: sipca
.
- PLS-DA (PLS2 on a dummy-coded factor): plsda
. X space only.
- sPLS-DA (sPLS2 on a dummy-coded factor): splsda
. X space only.
- CPPLS: mvr
. X space only.
- PLSR: mvr
, pls
, plsR
(plsRglm package). X space only.
- sPLSR: pls
. X space only.
- PLS-GLR: plsRglm
(plsRglm package).
- PCR: mvr
.
- CDA: discrimin
, discrimin.coa
.
- NSCOA: dudi.nsc
.
- MCA: dudi.acm
.
- Mix analysis: dudi.mix
, dudi.hillsmith
.
- PCIA: procuste
. Set 1 is X, set 2 is Y.
- RDA (or PCAIV): pcaiv
, pcaivortho
, rda
. With rda
, space 1 is constrained space, space 2 is unconstrained space. Only constrained space is available with pcaiv
, the opposite for pcaivortho
.
- CCorA: rcc
. Space 1 is X, space 2 is Y.
- rCCorA: rcc
. Space 1 is X, space 2 is Y.
- CIA: coinertia
. Space 1 is X, space 2 is Y.
- 2B-PLS: pls
. Space 1 is X, space 2 is Y.
- 2B-sPLS: pls
. Space 1 is X, space 2 is Y.
- rGCCA: wrapper.rgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- sGCCA: wrapper.sgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- DIABLO: block.plsda
, block.splsda
. Space can be 1 to n, the number of blocks (i.e. datasets).
Maxime HERVE <[email protected]>
require(ade4)
data(olympic)
PCA <- dudi.pca(olympic$tab,scannf=FALSE)
MVA.plot(PCA,"load")
Displays a paired plot (i.e. a score plot of paired points) of a multivariate analysis.
MVA.pairplot(x, xax = 1, yax = 2, pairs = NULL, scaling = 2, space = 1, fac = NULL, xlab = NULL, ylab = NULL, main = NULL, ident = TRUE, labels = NULL, cex = 0.7, col = 1, lwd = 1, main.pos = c("bottomleft", "topleft", "bottomright", "topright"), main.cex = 1.3, legend = FALSE, legend.pos = c("topleft", "topright", "bottomleft", "bottomright"), legend.title = NULL, legend.lab = NULL, drawextaxes = TRUE, drawintaxes = TRUE, xlim = NULL, ylim = NULL)
x |
a multivariate analysis (see Details). |
xax |
the horizontal axis. |
yax |
the vertical axis. Cannot be |
pairs |
two-level factor identifying paired individuals (in the same order in both sets of points). Can be omitted with multivariate analyses where two sets of points are available in the same space (see |
scaling |
type of scaling. Only available with some analyses performed with the |
space |
scores to be displayed, when several spaces are available (see Details of |
fac |
an optional factor defining groups of pairs. |
xlab |
legend of the horizontal axis. If |
ylab |
legend of the vertical axis. If |
main |
optional title of the graph. |
ident |
logical indicating if variable names should be displayed. |
labels |
names of the individuals. If |
cex |
size of the labels. If |
col |
color(s) used for arrows and labels. If |
lwd |
width of arrows. If |
main.pos |
position of the title, if |
main.cex |
size of the title, if |
legend |
logical indicating if a legend should be added to the graph. |
legend.pos |
position of the legend, if |
legend.title |
optional title of the legend, if |
legend.lab |
legend labels, if |
drawextaxes |
logical indicating if external axes should be drawn. |
drawintaxes |
logical indicating if internal axes should be drawn. |
xlim |
limits of the horizontal axis. If |
ylim |
limits of the vertical axis. If |
This function should not be used directly. Prefer the general MVA.plot
, to which all arguments can be passed.
All multivariate analyses supported by MVA.scoreplot
can be used for a paired plot.
Maxime HERVE <[email protected]>
require(ade4)
data(macaca)
PCIA <- procuste(macaca$xy1,macaca$xy2)
MVA.plot(PCIA,"pairs")
Displays several kinds of plots for multivariate analyses.
MVA.plot(x, type = c("scores", "loadings", "correlations", "biplot", "pairs", "trajectories"), ...)
x |
a multivariate analysis (see Details). |
type |
the type of plot to be displayed: score plot (default), loading plot, correlation circle, biplot, score plot showing paired samples or score plot showing trajectories, respectively. |
... |
arguments to be passed to subfunctions. See Details. |
Different subfunctions are used depending on the type of plot to be displayed: MVA.scoreplot
, MVA.loadplot
, MVA.corplot
, MVA.biplot
, MVA.pairplot
or MVA.trajplot
. These functions should not be used directly (everything can be done with the general MVA.plot
) but for convenience, arguments and analyses supported are detailed in separate help pages.
Warning: the use of attach
before running a multivariate analysis can prevent MVA.plot
from getting the values it needs, and make it fail.
Maxime HERVE <[email protected]>
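No example is given on this page itself; a minimal sketch showing how the type argument dispatches to the subfunctions (both plot types below are supported for a prcomp PCA according to the corresponding pages):

data(iris)
PCA <- prcomp(iris[,1:4])
MVA.plot(PCA)               # score plot (default type)
MVA.plot(PCA,"loadings")    # loading plot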
Displays a score plot of a multivariate analysis.
MVA.scoreplot(x, xax = 1, yax = 2, scaling = 2, set = c(12, 1, 2), space = 1, byfac = TRUE, fac = NULL, barycenters = TRUE, stars = TRUE, contours = FALSE, dhist = TRUE, weights = 1, xlab = NULL, ylab = NULL, main = NULL, pch = 16, cex = 1, col = 1, points = TRUE, labels = NULL, main.pos = c("bottomleft", "topleft", "bottomright", "topright"), main.cex = 1.3, fac.lab = NULL, fac.cex = 1, legend = FALSE, legend.pos = c("topleft", "topright", "bottomleft", "bottomright"), legend.title = NULL, legend.lab = NULL, legend.cex = 1, drawextaxes = TRUE, drawintaxes = TRUE, xlim = NULL, ylim = NULL, keepmar = FALSE)
x |
a multivariate analysis (see Details). |
xax |
the horizontal axis. |
yax |
the vertical axis. This can be set to |
scaling |
type of scaling. Only available with some analyses performed with the |
set |
scores to be displayed, when several sets are available (see Details). |
space |
scores to be displayed, when several spaces are available (see Details). |
byfac |
only used with MCA and mix analyses (see Details). If |
fac |
an optional factor defining groups of individuals. |
barycenters |
only used if |
stars |
only used if |
contours |
only used if |
dhist |
only used in the one-dimensional case. If |
weights |
individual weights, used to calculate barycenter positions (see |
xlab |
legend of the horizontal axis. If |
ylab |
legend of the vertical axis. If |
main |
optional title of the graph. Can be a vector of several values for MCA and mix analyses when |
pch |
symbol(s) used for points, when points are displayed (see |
cex |
size of the points or of the labels (see |
col |
color(s) used for points or labels (see |
points |
only used for two-dimensional graphs. If |
labels |
used in two-dimensional graphs when |
main.pos |
position of the title, if |
main.cex |
size of the title, if |
fac.lab |
only used if |
fac.cex |
only used if |
legend |
logical indicating if a legend should be added to the graph. Available for two-dimensional graphs and for density histograms in the one-dimensional case (see |
legend.pos |
position of the legend, if |
legend.title |
optional title of the legend, if |
legend.lab |
legend labels, if |
legend.cex |
size of legend labels, if |
drawextaxes |
logical indicating if external axes should be drawn. Available for two-dimensional graphs and for density histograms in the one-dimensional case (see |
drawintaxes |
logical indicating if internal axes should be drawn. |
xlim |
limits of the horizontal axis. If |
ylim |
only used in two-dimensional graphs. Limits of the vertical axis. If |
keepmar |
only used in two-dimensional graphs. Logical indicating if margins defined by MVA.scoreplot should be kept after plotting (necessary for biplots). |
This function should not be used directly. Prefer the general MVA.plot
, to which all arguments can be passed.
Many multivariate analyses are supported, from various packages:
- PCA: prcomp
, princomp
(if scores=TRUE
), dudi.pca
, rda
, pca
, pca
. scaling
can be defined for rda
(see scores.rda
).
- sPCA: spca
.
- IPCA: ipca
.
- sIPCA: sipca
.
- PCoA: cmdscale
(with at least one non-default argument), dudi.pco
, wcmdscale
(with at least one non-default argument), capscale
, pco
, pcoa
.
- nMDS: monoMDS
, metaMDS
, nmds
, isoMDS
.
- PLS-DA (PLS2 on a dummy-coded factor): plsda
. X space only.
- sPLS-DA (sPLS2 on a dummy-coded factor): splsda
. X space only.
- CPPLS: mvr
. X space only.
- PLSR: mvr
, pls
, plsR
(plsRglm package). X space only.
- sPLSR: pls
. X space only.
- PLS-GLR: plsRglm
(plsRglm package).
- PCR: mvr
.
- CDA: discrimin
, discrimin.coa
.
- NSCOA: dudi.nsc
.
- MCA: dudi.acm
.
- Mix analysis: dudi.mix
, dudi.hillsmith
.
- COA: dudi.coa
, cca
. Set 1 is rows, set 2 is columns. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set. scaling
can be defined for cca
(see scores.cca
).
- DCOA: dudi.dec
. Set 1 is rows, set 2 is columns. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- PCIA: procuste
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- Procrustean superimposition: procrustes
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- GPA: GPA
. Only the consensus ordination can be displayed.
- DPCoA: dpcoa
. Set 1 is categories, set 2 is collections. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- RDA (or PCAIV): pcaiv
, pcaivortho
, rda
. With rda
, space 1 is constrained space, space 2 is unconstrained space. Only constrained space is available with pcaiv
, the opposite for pcaivortho
. scaling
can be defined for rda
(see scores.rda
).
- db-RDA (or CAP): capscale
, dbrda
. Space 1 is constrained space, space 2 is unconstrained space.
- CCA: pcaiv
, cca
. With rda
, space 1 is constrained space, space 2 is unconstrained space. Only constrained space is available with pcaiv
. Set 1 is rows, set 2 is columns. scaling
can be defined for cca
(see scores.cca
).
- CCorA: CCorA
, rcc
. Space 1 is X, space 2 is Y. With rcc
a third space is available, in which coordinates are means of X and Y coordinates.
- rCCorA: rcc
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates.
- CIA: coinertia
. Space 1 is X, space 2 is Y, space 3 is a "common" space where X and Y scores are normed. In space 3, set 1 is X and set 2 is Y. If set=12
in space 3 (default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- 2B-PLS: pls
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates.
- 2B-sPLS: pls
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates.
- rGCCA: rgcca
, wrapper.rgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- sGCCA: rgcca
, wrapper.sgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- DIABLO: block.plsda
, block.splsda
. Space can be 1 to n, the number of blocks (i.e. datasets).
Maxime HERVE <[email protected]>
data(iris)
PCA <- prcomp(iris[,1:4])
MVA.plot(PCA,"scores")
MVA.plot(PCA,"scores",fac=iris$Species,col=1:3,pch=15:17)
Returns scores of a multivariate analysis.
MVA.scores(x, xax = 1, yax = 2, scaling = 2, set = c(12, 1, 2), space = 1, ...)
x |
a multivariate analysis (see Details). |
xax |
axis or axes for which to extract scores. |
yax |
axis for which to extract scores (ignored if |
scaling |
type of scaling. Only available with some analyses performed with the |
set |
scores to be displayed, when several sets are available (see Details). |
space |
scores to be displayed, when several spaces are available (see Details). |
... |
not used. |
Many multivariate analyses are supported, from various packages:
- PCA: prcomp
, princomp
(if scores=TRUE
), dudi.pca
, rda
, pca
, pca
. scaling
can be defined for rda
(see scores.rda
).
- sPCA: spca
.
- IPCA: ipca
.
- sIPCA: sipca
.
- PCoA: cmdscale
(with at least one non-default argument), dudi.pco
, wcmdscale
(with at least one non-default argument), capscale
, pco
, pcoa
.
- nMDS: monoMDS
, metaMDS
, nmds
, isoMDS
.
- PLS-DA (PLS2 on a dummy-coded factor): plsda
. X space only.
- sPLS-DA (sPLS2 on a dummy-coded factor): splsda
. X space only.
- CPPLS: mvr
. X space only.
- PLSR: mvr
, pls
, plsR
(plsRglm package). X space only.
- sPLSR: pls
. X space only.
- PLS-GLR: plsRglm
(plsRglm package).
- PCR: mvr
.
- CDA: discrimin
, discrimin.coa
.
- NSCOA: dudi.nsc
.
- MCA: dudi.acm
.
- Mix analysis: dudi.mix
, dudi.hillsmith
.
- COA: dudi.coa
, cca
. Set 1 is rows, set 2 is columns. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set. scaling
can be defined for cca
(see scores.cca
).
- DCOA: dudi.dec
. Set 1 is rows, set 2 is columns. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- PCIA: procuste
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- Procrustean superimposition: procrustes
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- GPA: GPA
. Only the consensus ordination can be displayed.
- DPCoA: dpcoa
. Set 1 is categories, set 2 is collections. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- RDA (or PCAIV): pcaiv
, pcaivortho
, rda
. With rda
, space 1 is constrained space, space 2 is unconstrained space. Only constrained space is available with pcaiv
, the opposite for pcaivortho
. scaling
can be defined for rda
(see scores.rda
).
- db-RDA (or CAP): capscale
, dbrda
. Space 1 is constrained space, space 2 is unconstrained space.
- CCA: pcaiv
, cca
. With rda
, space 1 is constrained space, space 2 is unconstrained space. Only constrained space is available with pcaiv
. Set 1 is rows, set 2 is columns. scaling
can be defined for cca
(see scores.cca
).
- CCorA: CCorA
, rcc
. Space 1 is X, space 2 is Y. With rcc
a third space is available, in which coordinates are means of X and Y coordinates.
- rCCorA: rcc
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates.
- CIA: coinertia
. Space 1 is X, space 2 is Y, space 3 is a "common" space where X and Y scores are normed. In space 3, set 1 is X and set 2 is Y. If set=12
in space 3 (default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- 2B-PLS: pls
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates.
- 2B-sPLS: pls
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates.
- rGCCA: rgcca
, wrapper.rgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- sGCCA: rgcca
, wrapper.sgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- DIABLO: block.plsda
, block.splsda
. Space can be 1 to n, the number of blocks (i.e. datasets).
Maxime HERVE <[email protected]>
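A minimal usage sketch (not part of the original manual; the structure of the returned object is best inspected with str()), extracting the scores of a PCA:

data(iris)
PCA <- prcomp(iris[,1:4])
sc <- MVA.scores(PCA, xax = 1, yax = 2)   # row scores on axes 1 and 2
str(sc)                                    # inspect the returned structure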
Gives a simple estimator of the quality of the (descriptive) synthesis performed by a wide range of multivariate analyses.
MVA.synt(x, rows = 5)
MVA.synt(x, rows = 5)
x |
a multivariate analysis (see Details). |
rows |
maximum number of axes to print in the output. |
Many multivariate analyses are supported, from various packages.
- PCA: prcomp
, princomp
, dudi.pca
, rda
, pca
, pca
: % of total variance explained by each axis.
- sPCA: spca
: % of total variance explained by each axis.
- IPCA: ipca
: kurtosis of each axis.
- sIPCA: sipca
: kurtosis of each axis.
- PCoA: cmdscale
(with eig=TRUE
), dudi.pco
, wcmdscale
(with eig=TRUE
), capscale
, pco
, pcoa
: % of total variance explained by each axis.
- nMDS: monoMDS
, metaMDS
, nmds
, isoMDS
: stress.
- RDA: pcaiv
, pcaivortho
, rda
: % of constrained and unconstrained total variance, % of constrained variance explained by constrained axes (pcaiv
and rda
), % of unconstrained variance explained by unconstrained axes (pcaivortho
and rda
).
- db-RDA (or CAP): capscale
, dbrda
: % of constrained and unconstrained total variance, % of constrained variance explained by constrained axes, % of unconstrained variance explained by unconstrained axes.
- COA: dudi.coa
, cca
: % of total inertia explained by each axis.
- CCA: pcaiv
, cca
: % of constrained and unconstrained total inertia, % of constrained inertia explained by constrained axes, % of unconstrained inertia explained by unconstrained axes (cca
only).
- CPPLS: mvr
: % of X and Y variances explained by each axis.
- PLSR: mvr
, plsR
(plsRglm package): % of X and Y variances explained by each axis (only Y for the moment with plsR
).
- 2B-PLS: pls
: % of X/Y square covariance explained by each pair of axes, correlation between each pair of axes (canonical correlations).
- CCorA: CCorA
, rcc
: correlation between each pair of axes (canonical correlations).
- rCCorA: rcc
: correlation between each pair of axes (canonical correlations).
- PCR: mvr
: % of X and Y variances explained by each axis.
- MCA: dudi.acm
: % of total inertia explained by each axis.
- Mix analysis: dudi.mix
, dudi.hillsmith
: % of total inertia explained by each axis.
- GPA: GPA
: % of consensus and residual variance, % of total variance explained by each axis, % of consensus variance explained by each axis, % of residual variance coming from each group of variables.
- RGCCA: rgcca
, wrapper.rgcca
: % of total intra-block variance explained by each axis, correlation between each pair of axes (canonical correlations).
- DIABLO: block.plsda
, block.splsda
: % of total intra-block variance explained by each axis, correlation between each pair of axes (canonical correlations).
- CIA: coinertia
: RV coefficient, % of co-inertia explained by each pair of axes, correlation between each pair of axes (canonical correlations).
- PCIA: procuste
: m2.
Maxime HERVE <[email protected]>
data(iris) PCA <- prcomp(iris[,1:4]) MVA.synt(PCA)
data(iris) PCA <- prcomp(iris[,1:4]) MVA.synt(PCA)
Performs a permutation significance test based on cross (model) validation with different PLS and/or discriminant analyses. See MVA.cv
and MVA.cmv
for more details about how cross (model) validation is performed.
MVA.test(X, Y, cmv = FALSE, ncomp = 8, kout = 7, kinn = 6, scale = TRUE, model = c("PLSR", "CPPLS", "PLS-DA", "PPLS-DA", "LDA", "QDA", "PLS-DA/LDA", "PLS-DA/QDA", "PPLS-DA/LDA","PPLS-DA/QDA"), Q2diff = 0.05, lower = 0.5, upper = 0.5, Y.add = NULL, weights = rep(1, nrow(X)), set.prior = FALSE, crit.DA = c("plug-in", "predictive", "debiased"), p.method = "fdr", nperm = 999, progress = TRUE, ...)
MVA.test(X, Y, cmv = FALSE, ncomp = 8, kout = 7, kinn = 6, scale = TRUE, model = c("PLSR", "CPPLS", "PLS-DA", "PPLS-DA", "LDA", "QDA", "PLS-DA/LDA", "PLS-DA/QDA", "PPLS-DA/LDA","PPLS-DA/QDA"), Q2diff = 0.05, lower = 0.5, upper = 0.5, Y.add = NULL, weights = rep(1, nrow(X)), set.prior = FALSE, crit.DA = c("plug-in", "predictive", "debiased"), p.method = "fdr", nperm = 999, progress = TRUE, ...)
X |
a data frame of independent variables. |
Y |
the dependent variable(s): numeric vector, data frame of quantitative variables or factor. |
cmv |
a logical indicating if the values (Q2 or NMC) should be generated through cross-validation (classical K-fold process) or cross model validation (inner + outer loops). |
ncomp |
an integer giving the number of components to be used to generate all submodels (cross-validation) or the maximal number of components to be tested in the inner loop (cross model validation). Can be re-set internally if needed. Does not concern LDA and QDA. |
kout |
an integer giving the number of folds (cross-validation) or the number of folds in the outer loop (cross-model validation). Can be re-set internally if needed. |
kinn |
an integer giving the number of folds in the inner loop (cross model validation only). Can be re-set internally if needed. Cannot be |
scale |
logical indicating if data should be scaled. See help of |
model |
the model to be fitted. |
Q2diff |
the threshold to be used if the number of components is chosen according to Q2 (cross model validation only). |
lower |
a vector of lower limits for power optimisation in CPPLS or PPLS-DA (see |
upper |
a vector of upper limits for power optimisation in CPPLS or PPLS-DA (see |
Y.add |
a vector or matrix of additional responses containing relevant information about the observations, in CPPLS or PPLS-DA (see |
weights |
a vector of individual weights for the observations, in CPPLS or PPLS-DA (see |
set.prior |
only used when a LDA or QDA is performed (coupled or not with a PLS model). If |
crit.DA |
criterion used to predict class membership when a LDA or QDA is used. See |
p.method |
method for p-values correction. See help of |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
... |
other arguments to pass to |
When Y
consists of quantitative response(s), the null hypothesis is that each response is not predicted better than would be expected by chance. In this case, Q2 is used as the test statistic. When Y
contains several responses, a p-value is computed for each response and p-values are corrected for multiple testing.
When Y
is a factor, the null hypothesis is that the factor has no discriminant ability. In this case, the classification error rate (NMC) is used as the test statistic.
Whatever the response, the reference value of the test statistic is obtained by averaging 20 values coming from independently performed cross (model) validations on the original data.
The function deals with the limited floating-point precision, which can bias the calculation of p-values based on a discrete test statistic distribution.
method |
a character string indicating the name of the test. |
data.name |
a character string giving the name(s) of the data, plus additional information. |
statistic |
the value of the test statistic. |
permutations |
the number of permutations. |
p.value |
the p-value of the test. |
p.adjust.method |
a character string giving the method for p-values correction. |
Maxime HERVE <[email protected]>
Westerhuis J, Hoefsloot HCJ, Smit S, Vis DJ, Smilde AK, van Velzen EJJ, van Duijnhoven JPM and van Dorsten FA (2008) Assessment of PLSDA cross validation. Metabolomics 4:81-89.
require(pls) require(MASS) # PLSR data(yarn) ## Not run: MVA.test(yarn$NIR,yarn$density,cmv=TRUE,model="PLSR") # PPLS-DA coupled to LDA data(mayonnaise) ## Not run: MVA.test(mayonnaise$NIR,factor(mayonnaise$oil.type),model="PPLS-DA/LDA")
require(pls) require(MASS) # PLSR data(yarn) ## Not run: MVA.test(yarn$NIR,yarn$density,cmv=TRUE,model="PLSR") # PPLS-DA coupled to LDA data(mayonnaise) ## Not run: MVA.test(mayonnaise$NIR,factor(mayonnaise$oil.type),model="PPLS-DA/LDA")
Displays a trajectory plot (i.e. a score plot with trajectories linking defined points) of a multivariate analysis.
MVA.trajplot(x, xax = 1, yax = 2, trajects, trajlab = NULL, scaling = 2, set = c(12, 1, 2), space = 1, xlab = NULL, ylab = NULL, main = NULL, pch = 16, cex = 1, trajlab.cex = 1, col = 1, lwd = 1, lty = 1, points = TRUE, allpoints = TRUE, arrows = TRUE, labels = NULL, main.pos = c("bottomleft", "topleft", "bottomright", "topright"), main.cex = 1.3, legend = FALSE, legend.pos = c("topleft", "topright", "bottomleft", "bottomright"), legend.title = NULL, legend.lab = NULL, legend.cex = 1, drawextaxes = TRUE, drawintaxes = TRUE, xlim = NULL, ylim = NULL)
MVA.trajplot(x, xax = 1, yax = 2, trajects, trajlab = NULL, scaling = 2, set = c(12, 1, 2), space = 1, xlab = NULL, ylab = NULL, main = NULL, pch = 16, cex = 1, trajlab.cex = 1, col = 1, lwd = 1, lty = 1, points = TRUE, allpoints = TRUE, arrows = TRUE, labels = NULL, main.pos = c("bottomleft", "topleft", "bottomright", "topright"), main.cex = 1.3, legend = FALSE, legend.pos = c("topleft", "topright", "bottomleft", "bottomright"), legend.title = NULL, legend.lab = NULL, legend.cex = 1, drawextaxes = TRUE, drawintaxes = TRUE, xlim = NULL, ylim = NULL)
x |
a multivariate analysis (see Details). |
xax |
the horizontal axis. |
yax |
the vertical axis. Cannot be |
trajects |
vector or list of vectors identifying trajectories. Each vector should give the number of the individuals to be linked, ordered from the first to the last one. |
trajlab |
optional traject labels. |
scaling |
type of scaling. Only available with some analyses performed with the |
set |
scores to be displayed, when several sets are available (see Details of |
space |
scores to be displayed, when several spaces are available (see Details of |
xlab |
legend of the horizontal axis. If |
ylab |
legend of the vertical axis. If |
main |
optional title of the graph. |
pch |
symbols used for points. Can be a vector giving one value per trajectory (and a last one for non-linked points if |
cex |
size of the labels. Can be a vector giving one value per trajectory (and a last one for non-linked points if |
trajlab.cex |
size of trajectory labels. Can be a vector giving one value per trajectory. |
col |
color(s) used for arrows and labels. If |
lwd |
width of trajectory segments. Can be a vector giving one value per trajectory. |
lty |
type of trajectory segments. Can be a vector giving one value per trajectory. |
points |
logical indicating if points should be displayed. If |
allpoints |
logical indicating if points which do not belong to any trajectory should be drawn. |
arrows |
logical indicating if trajectories should be oriented with arrows. |
labels |
names of the individuals. If |
main.pos |
position of the title, if |
main.cex |
size of the title, if |
legend |
logical indicating if a legend should be added to the graph. |
legend.pos |
position of the legend, if |
legend.title |
optional title of the legend, if |
legend.lab |
legend labels, if |
legend.cex |
size of legend labels, if |
drawextaxes |
logical indicating if external axes should be drawn. |
drawintaxes |
logical indicating if internal axes should be drawn. |
xlim |
limits of the horizontal axis. If |
ylim |
limits of the vertical axis. If |
This function should not be used directly. Prefer the general MVA.plot
, to which all arguments can be passed.
All multivariate analyses supported by MVA.scoreplot
can be used for a trajectory plot.
Maxime HERVE <[email protected]>
require(ade4) data(olympic) PCA <- dudi.pca(olympic$tab,scannf=FALSE) MVA.plot(PCA,"traject",trajects=list(1:10,25:30),col=c(2,3,1),trajlab=c("T1","T2"))
require(ade4) data(olympic) PCA <- dudi.pca(olympic$tab,scannf=FALSE) MVA.plot(PCA,"traject",trajects=list(1:10,25:30),col=c(2,3,1),trajlab=c("T1","T2"))
Computes the odds ratios and their confidence interval for a predictor of a model fitted with multinom
.
OR.multinom(model, variable, conf.level = 0.95)
OR.multinom(model, variable, conf.level = 0.95)
model |
object of class |
variable |
any predictor present in |
conf.level |
confidence level. |
Maxime HERVE <[email protected]>
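A minimal usage sketch (assumptions not stated above: the model is fitted with nnet::multinom() and the predictor is passed by name as a character string):

require(nnet)
data(iris)
model <- multinom(Species ~ Sepal.Length + Sepal.Width, data = iris)
OR.multinom(model, "Sepal.Length")   # odds ratios per unit of Sepal.Length, with 95% CI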
Re-computes an ordination using given row weights (possibly extracted from a correspondence analysis). The function is intended to be used prior to coinertia
when row weights have to be equalized.
ord.rw(ord, CA = NULL, rw = NULL)
ord.rw(ord, CA = NULL, rw = NULL)
ord |
an ordination to re-compute. Must come from the |
CA |
an optional correspondence analysis from which row weights should be extracted. Must come from |
rw |
an optional vector of row weights. Used only if |
Maxime HERVE <[email protected]>
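A minimal usage sketch (the dataset and workflow are illustrative only; it is assumed here that ord.rw() accepts an ade4 ordination and an ade4 correspondence analysis, as the argument descriptions above suggest): re-computing a PCA with the row weights of a COA before a co-inertia analysis.

require(ade4)
data(doubs)
COA <- dudi.coa(doubs$fish, scannf = FALSE, nf = 2)   # species table
PCA <- dudi.pca(doubs$env, scannf = FALSE, nf = 2)    # environmental table
PCA2 <- ord.rw(PCA, CA = COA)                         # PCA re-computed with the COA row weights
CIA <- coinertia(COA, PCA2, scannf = FALSE, nf = 2)   # co-inertia analysis on matched row weights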
glmer
models
Estimates residual deviance and residual degrees of freedom to check for overdispersion with glmer
models. This function comes directly from http://glmm.wikidot.com/faq
.
overdisp.glmer(model)
overdisp.glmer(model)
model |
a model fitted by |
Ben Bolker
require(lme4) # Example from the 'glmer' function gm1 <- glmer(cbind(incidence,size-incidence)~period+(1|herd), family="binomial",data=cbpp) overdisp.glmer(gm1)
require(lme4) # Example from the 'glmer' function gm1 <- glmer(cbind(incidence,size-incidence)~period+(1|herd), family="binomial",data=cbpp) overdisp.glmer(gm1)
The function uses the formula presented in Douma & Weedon (2019). It is primarily intended to be used in beta regression (regression for continuous proportions) when data contain zeroes and/or ones, but can be applied to any variable initially bounded in the [0,1] interval when rescaling is necessary. The function can also perform back-transformation.
p.beta(p, n = length(p), C = 2, back = FALSE)
p.beta(p, n = length(p), C = 2, back = FALSE)
p |
numeric vector of values in the [0,1] interval. |
n |
total number of observations in the initial data set. Not very useful when the transformation is applied to the initial data set, but needed when back-transformation is applied from predicted values. |
C |
number of categories from which |
back |
logical. If |
Maxime HERVE <[email protected]> from the following paper: Douma JC & Weedon JT (2019) Analysing continuous proportions in ecology and evolution: A practical introduction to beta and Dirichlet regression. Methods in Ecology and Evolution, 10: 1412-1430
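A minimal sketch of the rescaling that p.beta() is assumed to apply (the compression formula of Douma & Weedon 2019; treat the exact expression as an assumption, not a quote of the function's code):

# rescale proportions from [0,1] to (0,1): p' = (p*(n-1) + 1/C) / n  (assumed formula)
rescale01 <- function(p, n = length(p), C = 2) (p * (n - 1) + 1/C) / n
rescale01(c(0, 0.4, 1), C = 3)   # all values now lie strictly inside (0,1)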
# A fictive example with four animals performing a behavioral choice-test where time # can be spent in three branches (total time 20 min) (tab <- data.frame(Individual=c("Ind1","Ind2","Ind3","Ind4"),Branch1=c(0,12,20,4), Branch2=c(8,4,0,6),Branch3=c(12,4,0,10))) # Raw proportions of time spent in branch 1: (p1 <- tab$Branch1/rowSums(tab[,-1])) # Scaled proportions: p.beta(p1,C=3)
# A fictive example with four animals performing a behavioral choice-test where time # can be spent in three branches (total time 20 min) (tab <- data.frame(Individual=c("Ind1","Ind2","Ind3","Ind4"),Branch1=c(0,12,20,4), Branch2=c(8,4,0,6),Branch3=c(12,4,0,10))) # Raw proportions of time spent in branch 1: (p1 <- tab$Branch1/rowSums(tab[,-1])) # Scaled proportions: p.beta(p1,C=3)
Performs pairwise comparisons between group levels with corrections for multiple testing, using CDA.test
.
pairwise.CDA.test(X, fact, ncomp = NULL, p.method = "fdr", ...)
pairwise.CDA.test(X, fact, ncomp = NULL, p.method = "fdr", ...)
X |
a data frame of dependent variables (typically contingency or presence-absence table). |
fact |
factor giving the groups. |
ncomp |
an integer giving the number of components to be used for the test. If |
p.method |
method for p-values correction. See help of |
... |
other arguments to pass to |
See CDA.test
.
method |
a character string indicating what type of tests were performed. |
data.name |
a character string giving the name(s) of the data. |
p.value |
table of results. |
p.adjust.method |
method for p-values correction. |
Maxime HERVE <[email protected]>
require(ade4) data(perthi02) CDA.test(perthi02$tab,perthi02$cla) pairwise.CDA.test(perthi02$tab,perthi02$cla)
require(ade4) data(perthi02) CDA.test(perthi02$tab,perthi02$cla) pairwise.CDA.test(perthi02$tab,perthi02$cla)
Performs pairwise comparisons between group levels with corrections for multiple testing. Tests are computed using factorfit
.
pairwise.factorfit(ord, fact, xax = 1, yax = 2, nperm = 999, p.method = "fdr", ...)
pairwise.factorfit(ord, fact, xax = 1, yax = 2, nperm = 999, p.method = "fdr", ...)
ord |
any multivariate analysis handled by |
fact |
grouping factor. |
xax |
first axis of the factorial map. |
yax |
second axis of the factorial map. |
nperm |
number of permutations. |
p.method |
method for p-values correction. See help of |
... |
optional further arguments to |
method |
a character string giving the name of the test. |
data.name |
a character string giving the name(s) of the data and the number of permutations. |
p.value |
table of results. |
p.adjust.method |
method for p-values correction. |
Maxime HERVE <[email protected]>
require(vegan) data(iris) PCA <- rda(iris[,1:4]) MVA.plot(PCA,fac=iris$Species,col=1:3) # Global test envfit(PCA~Species,data=iris) # Pairwise comparisons # (not enough permutations here but faster to run) pairwise.factorfit(PCA,iris$Species,nperm=49)
require(vegan) data(iris) PCA <- rda(iris[,1:4]) MVA.plot(PCA,fac=iris$Species,col=1:3) # Global test envfit(PCA~Species,data=iris) # Pairwise comparisons # (not enough permutations here but faster to run) pairwise.factorfit(PCA,iris$Species,nperm=49)
Performs pairwise comparisons between pairs of proportions with correction for multiple testing.
pairwise.G.test(x, p.method = "fdr")
pairwise.G.test(x, p.method = "fdr")
x |
matrix with 2 columns giving the counts of successes and failures, respectively. |
p.method |
method for p-values correction. See help of |
Since a G-test is an approximate test, an exact test is preferable when the number of individuals is small (200 is a reasonable minimum). See fisher.multcomp
in that case.
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.adjust.method |
method for p-values correction. |
p.value |
table of results. |
x <- matrix(c(44,56,36,64,64,40),ncol=2,dimnames=list(c("Control","Treatment1","Treatment2"), c("Alive","Dead")),byrow=TRUE) G.test(x) pairwise.G.test(x)
x <- matrix(c(44,56,36,64,64,40),ncol=2,dimnames=list(c("Control","Treatment1","Treatment2"), c("Alive","Dead")),byrow=TRUE) G.test(x) pairwise.G.test(x)
Performs pairwise comparisons between group levels with corrections for multiple testing.
pairwise.mood.medtest(resp, fact, exact = NULL, p.method = "fdr")
pairwise.mood.medtest(resp, fact, exact = NULL, p.method = "fdr")
resp |
response vector. |
fact |
grouping factor. |
exact |
a logical indicating whether exact p-values should be computed. |
p.method |
method for p-values correction. See help of |
If exact=NULL
, Fisher's exact tests are used if the number of data values is < 200; otherwise chi-square tests are used (with Yates continuity correction).
method |
a character string indicating the name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.value |
table of results. |
p.adjust.method |
method for p-values correction. |
Maxime HERVE <[email protected]>
set.seed(0904) response <- c(rnorm(10),rnorm(10,0.8),rnorm(10,2)) fact <- gl(3,10,labels=LETTERS[1:3]) mood.medtest(response~fact) pairwise.mood.medtest(response,fact)
set.seed(0904) response <- c(rnorm(10),rnorm(10,0.8),rnorm(10,2)) fact <- gl(3,10,labels=LETTERS[1:3]) mood.medtest(response~fact) pairwise.mood.medtest(response,fact)
Performs pairwise comparisons between group levels with corrections for multiple testing, using MVA.test
.
pairwise.MVA.test(X, fact, p.method = "fdr", cmv = FALSE, ncomp = 8, kout = 7, kinn = 6, model = c("PLS-DA", "PPLS-DA", "LDA", "QDA", "PLS-DA/LDA", "PLS-DA/QDA", "PPLS-DA/LDA","PPLS-DA/QDA"), nperm = 999, progress = TRUE, ...)
pairwise.MVA.test(X, fact, p.method = "fdr", cmv = FALSE, ncomp = 8, kout = 7, kinn = 6, model = c("PLS-DA", "PPLS-DA", "LDA", "QDA", "PLS-DA/LDA", "PLS-DA/QDA", "PPLS-DA/LDA","PPLS-DA/QDA"), nperm = 999, progress = TRUE, ...)
X |
a data frame of independent variables. |
fact |
grouping factor. |
p.method |
method for p-values correction. See help of |
cmv |
a logical indicating if the test statistic (NMC) should be generated through cross-validation (classical K-fold process) or cross model validation (inner + outer loops). |
ncomp |
an integer giving the number of components to be used to generate all submodels (cross-validation) or the maximal number of components to be tested in the inner loop (cross model validation). Can be re-set internally if needed. Does not concern LDA and QDA. |
kout |
an integer giving the number of folds (cross-validation) or the number of folds in the outer loop (cross-model validation). Can be re-set internally if needed. |
kinn |
an integer giving the number of folds in the inner loop (cross model validation only). Can be re-set internally if needed. Cannot be |
model |
the model to be fitted. |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
... |
other arguments to pass to |
The function deals with the limited floating-point precision, which can bias the calculation of p-values based on a discrete test statistic distribution.
method |
a character string indicating what type of tests were performed. |
data.name |
a character string giving the name(s) of the data. |
p.value |
table of results. |
p.adjust.method |
method for p-values correction. |
permutations |
number of permutations. |
Maxime HERVE <[email protected]>
require(pls) data(mayonnaise) # PPLS-DA ## Not run: pairwise.MVA.test(mayonnaise$NIR,factor(mayonnaise$oil.type),model="PPLS-DA") # The function needs a long calculation time!
require(pls) data(mayonnaise) # PPLS-DA ## Not run: pairwise.MVA.test(mayonnaise$NIR,factor(mayonnaise$oil.type),model="PPLS-DA") # The function needs a long calculation time!
Performs pairwise comparisons between group levels with corrections for multiple testing. These pairwise comparisons are relevant after a permutation MANOVA, such as performed by adonis
.
pairwise.perm.manova(resp, fact, test = c("Pillai", "Wilks", "Hotelling-Lawley", "Roy", "Spherical"), nperm = 999, progress = TRUE, p.method = "fdr", F = FALSE, R2 = FALSE)
pairwise.perm.manova(resp, fact, test = c("Pillai", "Wilks", "Hotelling-Lawley", "Roy", "Spherical"), nperm = 999, progress = TRUE, p.method = "fdr", F = FALSE, R2 = FALSE)
resp |
response. Either a matrix (one column per variable; objects of class |
fact |
grouping factor. |
test |
choice of test statistic when |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
p.method |
method for p-values correction. See help of |
F |
should the table of F values be returned? |
R2 |
should the table of R2 values be returned? For tests based on distance matrices only. |
If resp
is a matrix, a classical MANOVA is performed and the distribution of the (pseudo-)F is computed through permutations. The function deals with the limited floating-point precision, which can bias the calculation of p-values based on a discrete test statistic distribution.
If resp
is a distance matrix, adonis
is used to perform each comparison.
method |
a character string giving the name of the test. |
data.name |
a character string giving the name(s) of the data and the number of permutations. |
p.value |
table of results. |
p.adjust.method |
method for p-values correction. |
F.value |
table of F values (if required). |
R2.value |
table of R2 values (if required). |
Maxime HERVE <[email protected]>
require(vegan) data(iris) # permutation MANOVA adonis(iris[,1:4]~Species,data=iris,method="euclidean") # Pairwise comparisons # (not enough permutations here but faster to run) pairwise.perm.manova(iris[,1:4],iris$Species,nperm=49) # or pairwise.perm.manova(dist(iris[,1:4],"euclidean"),iris$Species,nperm=49)
require(vegan) data(iris) # permutation MANOVA adonis(iris[,1:4]~Species,data=iris,method="euclidean") # Pairwise comparisons # (not enough permutations here but faster to run) pairwise.perm.manova(iris[,1:4],iris$Species,nperm=49) # or pairwise.perm.manova(dist(iris[,1:4],"euclidean"),iris$Species,nperm=49)
Performs pairwise comparisons between group levels with corrections for multiple testing.
pairwise.perm.t.test(resp, fact, p.method = "fdr", paired = FALSE, alternative = c("two.sided","less", "greater"), nperm = 999, progress = TRUE)
pairwise.perm.t.test(resp, fact, p.method = "fdr", paired = FALSE, alternative = c("two.sided","less", "greater"), nperm = 999, progress = TRUE)
resp |
response vector. |
fact |
grouping factor. |
p.method |
method for p-values correction. See help of |
paired |
a logical indicating whether you want paired (permutation) t-tests. |
alternative |
a character string specifying the alternative hypothesis, must be one of |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
The function deals with the limited floating-point precision, which can bias the calculation of p-values based on a discrete test statistic distribution.
method |
a character string indicating what type of t-tests were performed. |
data.name |
a character string giving the name(s) of the data. |
p.value |
table of results. |
p.adjust.method |
method for p-values correction. |
permutations |
number of permutations. |
Maxime HERVE <[email protected]>
set.seed(1203) response <- c(rnorm(5),rpois(5,0.5),rnorm(5,2,1)) fact <- gl(3,5,labels=LETTERS[1:3]) # Not enough permutations here but it runs faster # permutation ANOVA perm.anova(response~fact,nperm=49) # Pairwise comparisons pairwise.perm.t.test(response,fact,nperm=49)
set.seed(1203) response <- c(rnorm(5),rpois(5,0.5),rnorm(5,2,1)) fact <- gl(3,5,labels=LETTERS[1:3]) # Not enough permutations here but it runs faster # permutation ANOVA perm.anova(response~fact,nperm=49) # Pairwise comparisons pairwise.perm.t.test(response,fact,nperm=49)
Performs pairwise comparisons between group levels with corrections for multiple testing.
pairwise.perm.var.test(resp, fact, p.method = "fdr", alternative = c("two.sided","less", "greater"), nperm = 999, progress = TRUE)
pairwise.perm.var.test(resp, fact, p.method = "fdr", alternative = c("two.sided","less", "greater"), nperm = 999, progress = TRUE)
resp |
response vector. |
fact |
grouping factor. |
p.method |
method for p-values correction. See help of |
alternative |
a character string specifying the alternative hypothesis, must be one of |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
The function deals with the limited floating-point precision, which can bias the calculation of p-values based on a discrete test statistic distribution.
method |
a character string giving the name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.value |
table of results. |
p.adjust.method |
method for p-values correction. |
permutations |
number of permutations. |
Maxime HERVE <[email protected]>
set.seed(0921) response <- c(rnorm(10),rpois(10,0.2),rnorm(10,,2)) fact <- gl(3,10,labels=LETTERS[1:3]) # Not enough permutations here but it runs faster # permutation Bartlett test perm.bartlett.test(response~fact,nperm=49) # Pairwise comparisons pairwise.perm.var.test(response,fact,nperm=49)
set.seed(0921) response <- c(rnorm(10),rpois(10,0.2),rnorm(10,,2)) fact <- gl(3,10,labels=LETTERS[1:3]) # Not enough permutations here but it runs faster # permutation Bartlett test perm.bartlett.test(response~fact,nperm=49) # Pairwise comparisons pairwise.perm.var.test(response,fact,nperm=49)
Performs pairwise comparisons between group levels with corrections for multiple testing.
pairwise.var.test(resp, fact, p.method = "fdr", alternative = c("two.sided","less", "greater"))
pairwise.var.test(resp, fact, p.method = "fdr", alternative = c("two.sided","less", "greater"))
resp |
response vector. |
fact |
grouping factor. |
p.method |
method for p-values correction. See help of |
alternative |
a character string specifying the alternative hypothesis, must be one of |
method |
a character string giving the name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.value |
table of results. |
p.adjust.method |
method for p-values correction. |
Maxime HERVE <[email protected]>
require(graphics) # Bartlett test bartlett.test(count~spray,data=InsectSprays) # Pairwise comparisons pairwise.var.test(InsectSprays$count,InsectSprays$spray)
require(graphics) # Bartlett test bartlett.test(count~spray,data=InsectSprays) # Pairwise comparisons pairwise.var.test(InsectSprays$count,InsectSprays$spray)
Computes the (semi-)partial correlation of x
and y
, controlling for z
.
pcor(x, y, z, semi = FALSE, use = "complete.obs", method = c("pearson", "spearman"))
pcor(x, y, z, semi = FALSE, use = "complete.obs", method = c("pearson", "spearman"))
x |
a numeric vector. |
y |
a numeric vector. |
z |
a numeric vector, matrix, data frame or list giving the controlling variables. For matrices, variables must be placed in columns. |
semi |
logical. If |
use |
same as |
method |
same as |
Maxime HERVE <[email protected]>
pcor.test
for confidence intervals (and tests).
set.seed(1444) x <- 1:30 y <- 1:30+rnorm(30,0,2) z1 <- runif(30,0,4) z2 <- 30:1+rnorm(30,0,3) pcor(x,y,z1) pcor(x,y,list(z1,z2))
set.seed(1444) x <- 1:30 y <- 1:30+rnorm(30,0,2) z1 <- runif(30,0,4) z2 <- 30:1+rnorm(30,0,3) pcor(x,y,z1) pcor(x,y,list(z1,z2))
Tests for (semi-)partial association between paired samples while controlling for other variables, using one of Pearson's product moment correlation coefficient or Spearman's rho.
pcor.test(x, y, z, semi = FALSE, conf.level = 0.95, nrep = 1000, method = c("pearson", "spearman"))
pcor.test(x, y, z, semi = FALSE, conf.level = 0.95, nrep = 1000, method = c("pearson", "spearman"))
x |
a numeric vector. |
y |
a numeric vector. |
z |
a numeric vector, matrix, data frame or list giving the controlling variables. For matrices, variables must be placed in columns. |
semi |
logical. If |
conf.level |
confidence level for the confidence interval. |
nrep |
number of replicates for computation of the confidence interval of a Spearman's rank correlation coefficient (by bootstrapping). |
method |
a character string indicating which correlation coefficient is to be used for the test. One of "pearson" or "spearman". |
If method
is "pearson"
and if there are at least 4+k complete series of observations (where k is the number of controlling variables), an asymptotic confidence interval of the correlation coefficient is given based on Fisher's Z transform.
If method
is "spearman"
, the p-value is computed through the AS89 algorithm if the number of complete series of observations is less than 10, otherwise via the asymptotic t approximation (in both cases the pspearman
function is used). A confidence interval of the correlation coefficient, computed by bootstrapping, is given.
data.name |
a character string giving the name(s) of the data. |
alternative |
a character string describing the alternative hypothesis, always two-sided. |
method |
a character string indicating how the association was measured. |
conf.int |
a confidence interval for the measure of association. |
statistic |
the value of the test statistic. |
parameter |
the degrees of freedom of the test (only for a Pearson's correlation coefficient). |
p.value |
the p-value of the test. |
estimate |
the estimated measure of association, with name |
null.value |
the value of the association measure under the null hypothesis, always 0. |
Maxime HERVE <[email protected]>
set.seed(1444) x <- 1:30 y <- 1:30+rnorm(30,0,2) z1 <- runif(30,0,4) z2 <- 30:1+rnorm(30,0,3) pcor.test(x,y,z1) pcor.test(x,y,list(z1,z2))
set.seed(1444) x <- 1:30 y <- 1:30+rnorm(30,0,2) z1 <- runif(30,0,4) z2 <- 30:1+rnorm(30,0,3) pcor.test(x,y,z1) pcor.test(x,y,list(z1,z2))
Performs a permutation analysis of variance for 1 to 3 factors. For 2 and 3 factors, experiment design must be balanced. For 2 factors, the factors can be crossed with or without interaction, or nested. The second factor can be a blocking (random) factor. For 3 factors, design is restricted to 2 fixed factors crossed (with or without interaction) inside blocks (third factor).
perm.anova(formula, nest.f2 = c("fixed", "random"), data, nperm = 999, progress = TRUE)
perm.anova(formula, nest.f2 = c("fixed", "random"), data, nperm = 999, progress = TRUE)
formula |
a formula of the form |
nest.f2 |
in case of 2 nested factors, precision is needed if the nested factor (factor2) is |
data |
an optional data frame containing the variables in the formula |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
For 2 factors, the formula can be:
response ~ factor1 + factor2
for 2 fixed factors without interaction
response ~ factor1 * factor2
for 2 fixed factors with interaction
response ~ factor1 / factor2
for 2 fixed factors with factor2 nested into factor1 (if factor2 is a random factor, argument nest.f2
must be changed from "fixed"
(default) to "random"
)
response ~ factor1 | factor2
for 1 fixed factor (factor1) and 1 blocking (random) factor (factor2).
For 3 factors, the formula can only be:
response ~ factor1 + factor2 | factor3
or
response ~ factor1 * factor2 | factor3
. The 2 factors are here fixed and crossed inside each level of the third, blocking (random), factor.
The function deals with the limited floating-point precision, which can bias the calculation of p-values based on a discrete test statistic distribution.
a data frame of class "anova"
.
Maxime HERVE <[email protected]>
set.seed(1203) response <- c(rnorm(12),rpois(12,0.5),rnorm(12,2,1)) fact1 <- gl(3,12,labels=LETTERS[1:3]) fact2 <- gl(3,1,36,labels=letters[1:3]) fact3 <- gl(6,6,labels=letters[1:6]) block <- gl(2,6,36,labels=letters[1:2]) # Not enough permutations here but faster to run # 2 crossed fixed factors with interaction perm.anova(response~fact1*fact2,nperm=49) # 2 nested fixed factors perm.anova(response~fact1/fact2,nperm=49) # 2 nested factors, fact2 being random perm.anova(response~fact1/fact3,nest.f2="random",nperm=49) # 1 fixed factor and 1 blocking (random) factor perm.anova(response~fact1|block,nperm=49)
set.seed(1203) response <- c(rnorm(12),rpois(12,0.5),rnorm(12,2,1)) fact1 <- gl(3,12,labels=LETTERS[1:3]) fact2 <- gl(3,1,36,labels=letters[1:3]) fact3 <- gl(6,6,labels=letters[1:6]) block <- gl(2,6,36,labels=letters[1:2]) # Not enough permutations here but faster to run # 2 crossed fixed factors with interaction perm.anova(response~fact1*fact2,nperm=49) # 2 nested fixed factors perm.anova(response~fact1/fact2,nperm=49) # 2 nested factors, fact2 being random perm.anova(response~fact1/fact3,nest.f2="random",nperm=49) # 1 fixed factor and 1 blocking (random) factor perm.anova(response~fact1|block,nperm=49)
Performs a permutation Bartlett's test of homogeneity of k variances.
perm.bartlett.test(formula, data, nperm = 999, progress = TRUE)
perm.bartlett.test(formula, data, nperm = 999, progress = TRUE)
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
The function deals with the limited floating-point precision, which can bias the calculation of p-values based on a discrete test statistic distribution.
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
statistic |
test statistic of the parametric test. |
permutations |
number of permutations. |
p.value |
p-value of the permutation test. |
Maxime HERVE <[email protected]>
response <- c(rnorm(12),rpois(12,1),rnorm(12,2,1)) fact <- gl(3,12,labels=LETTERS[1:3]) perm.bartlett.test(response~fact)
response <- c(rnorm(12),rpois(12,1),rnorm(12,2,1)) fact <- gl(3,12,labels=LETTERS[1:3]) perm.bartlett.test(response~fact)
Performs a permutation Pearson's product-moment correlation test.
perm.cor.test(x, y, alternative = c("two.sided", "less", "greater"), nperm = 999, progress = TRUE)
perm.cor.test(x, y, alternative = c("two.sided", "less", "greater"), nperm = 999, progress = TRUE)
x , y
|
numeric vectors of data values. |
alternative |
a character string specifying the alternative hypothesis, must be one of |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
The function deals with the limited floating-point precision, which can bias the calculation of p-values based on a discrete test statistic distribution.
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
statistic |
test statistic of the parametric test. |
permutations |
number of permutations. |
p.value |
p-value of the permutation test. |
estimate |
the estimated correlation coefficient. |
alternative |
a character string describing the alternative hypothesis. |
null.value |
the value of the association measure under the null hypothesis, always 0. |
Maxime HERVE <[email protected]>
x <- rnorm(50) y <- runif(50) perm.cor.test(x,y)
x <- rnorm(50) y <- runif(50) perm.cor.test(x,y)
Performs a permutation Student's t-test.
perm.t.test(x, ...) ## Default S3 method: perm.t.test(x, y, paired = FALSE, ...) ## S3 method for class 'formula' perm.t.test(formula, data, alternative = c("two.sided", "less", "greater"), paired = FALSE, nperm = 999, progress = TRUE, ...)
perm.t.test(x, ...) ## Default S3 method: perm.t.test(x, y, paired = FALSE, ...) ## S3 method for class 'formula' perm.t.test(formula, data, alternative = c("two.sided", "less", "greater"), paired = FALSE, nperm = 999, progress = TRUE, ...)
x |
a numeric vector of data values. |
y |
a numeric vector of data values. |
paired |
a logical indicating whether you want a paired t-test. |
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
alternative |
a character string specifying the alternative hypothesis, must be one of |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
... |
further arguments to be passed to or from other methods. |
The function deals with the limited floating-point precision, which can bias the calculation of p-values based on a discrete test statistic distribution.
statistic |
test statistic of the parametric test. |
permutations |
number of permutations. |
p.value |
p-value of the permutation test. |
estimate |
the estimated mean or difference in means depending on whether it was a paired or not paired test. |
alternative |
a character string describing the alternative hypothesis. |
method |
a character string indicating what type of t-test was performed. |
data.name |
a character string giving the name(s) of the data. |
null.value |
the specified hypothesized value of the mean difference, always 0. |
Maxime HERVE <[email protected]>
response <- c(rnorm(5),rnorm(5,2,1)) fact <- gl(2,5,labels=LETTERS[1:2]) # Not enough permutations here but faster to run # Unpaired test perm.t.test(response~fact,nperm=49) # Paired test perm.t.test(response~fact,paired=TRUE,nperm=49)
response <- c(rnorm(5),rnorm(5,2,1)) fact <- gl(2,5,labels=LETTERS[1:2]) # Not enough permutations here but faster to run # Unpaired test perm.t.test(response~fact,nperm=49) # Paired test perm.t.test(response~fact,paired=TRUE,nperm=49)
Performs a permutation F test to compare two variances.
perm.var.test(x, ...) ## Default S3 method: perm.var.test(x, y, ...) ## S3 method for class 'formula' perm.var.test(formula, data, alternative = c("two.sided", "less", "greater"), nperm = 999, progress = TRUE, ...)
perm.var.test(x, ...) ## Default S3 method: perm.var.test(x, y, ...) ## S3 method for class 'formula' perm.var.test(formula, data, alternative = c("two.sided", "less", "greater"), nperm = 999, progress = TRUE, ...)
x |
a numeric vector of data values. |
y |
a numeric vector of data values. |
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
alternative |
a character string specifying the alternative hypothesis, must be one of |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
... |
further arguments to be passed to or from other methods. |
The function deals with the limited floating-point precision, which can bias the calculation of p-values based on a discrete test statistic distribution.
method |
name of the test. |
statistic |
test statistic of the parametric test. |
permutations |
number of permutations. |
p.value |
p-value of the permutation test. |
estimate |
the ratio of the two variances. |
alternative |
a character string describing the alternative hypothesis. |
data.name |
a character string giving the name(s) of the data. |
null.value |
the ratio of population variances under the null hypothesis, always 1. |
Maxime HERVE <[email protected]>
response <- c(rpois(8,1),rpois(8,3)) fact <- gl(2,8,labels=LETTERS[1:2]) perm.var.test(response~fact)
response <- c(rpois(8,1),rpois(8,3)) fact <- gl(2,8,labels=LETTERS[1:2]) perm.var.test(response~fact)
Plots residuals of a model against fitted values and for some models a QQ-plot of these residuals. Optionally, a Shapiro-Wilk test can be performed on residuals. The function deals with lm
(including glm
, lmList
, lmList
, glm.nb
, mlm
and manova
), lmer
, glmer
, glmmPQL
, glmmadmb
, lme
, gls
, nls
, nlsList
, survreg
, least.rect
, betareg
or glmmTMB
models.
plotresid(model, shapiro = FALSE)
plotresid(model, shapiro = FALSE)
model |
an object of class |
shapiro |
logical. If |
Response residuals are used for linear models, nonlinear models and generalized linear models based on an identity link (except with "quasi"
distributions where response residuals are used only if variance="constant"
). Pearson or studentized residuals are used whenever there is a link function which is not identity (and with "quasi"
distributions when variance
is not "constant"
), except for betareg
models where standardized weighted residuals 2 are used (see residuals.betareg
).
QQ-plots and Shapiro-Wilk tests are available whenever the model is based on a Gaussian distribution (and with "quasi"
distributions when variance="constant"
).
With a mlm
or manova
model, only a multivariate QQ-plot is drawn. The test performed when shapiro=TRUE
is a Shapiro-Wilk test for multivariate normality.
Maxime HERVE <[email protected]>
lm
, lmList
, lmList
, glm
, glm.nb
, manova
, lmer
, glmer
, lmer
, glmer.nb
, lme
, glmmPQL
, glmmadmb
, glmmTMB
, gls
, nls
, nlsList
, survreg
, least.rect
, betareg
, qresiduals
, qqPlot
, shapiro.test
, mqqnorm
, mshapiro.test
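A minimal usage sketch (not part of the original manual): residual diagnostics for a simple linear model, with a Shapiro-Wilk test on the residuals.

model <- lm(Sepal.Length ~ Petal.Length + Species, data = iris)
plotresid(model, shapiro = TRUE)   # residuals vs fitted values, QQ-plot and Shapiro-Wilk test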
Plots the survivor curve (log(survivors) against time) of a dataset to check for constancy of hazard.
plotsurvivors(x, status = rep(1, length(x)))
plotsurvivors(x, status = rep(1, length(x)))
x |
time to event. |
status |
status (1: event observed, 0: event not observed). |
n |
initial number of individuals. |
time |
time of events. |
alive |
number of survivors at each time. |
Maxime HERVE <[email protected]>
# 'kidney' dataset of package 'survival' require(survival) data(kidney) plotsurvivors(kidney$time,kidney$status)
# 'kidney' dataset of package 'survival' require(survival) data(kidney) plotsurvivors(kidney$time,kidney$status)
Returns VIP score of each X-variable in a PLS-DA (obtained from plsda
).
PLSDA.VIP(model, graph = FALSE)
PLSDA.VIP(model, graph = FALSE)
model |
object of class |
graph |
logical: should VIP scores be displayed? |
tab |
table of results. |
sup1 |
name of X-variables having a VIP score > 1. |
Maxime HERVE <[email protected]>
if (require(mixOmics)) { data(yeast) model.PLSDA <- plsda(t(yeast$data),yeast$strain.cond) PLSDA.VIP(model.PLSDA) }
if (require(mixOmics)) { data(yeast) model.PLSDA <- plsda(t(yeast$data),yeast$strain.cond) PLSDA.VIP(model.PLSDA) }
Predicts response based on CDA (correspondence discriminant analysis) submodels generated by cross validation. The predicted class is given with its probability (computed from the values predicted by all submodels).
## S3 method for class 'CDA.cv' predict(object, newdata, type = c("max", "all"), method = c("mahalanobis", "euclidian"), ...)
## S3 method for class 'CDA.cv' predict(object, newdata, type = c("max", "all"), method = c("mahalanobis", "euclidian"), ...)
object |
object of class inheriting from |
newdata |
vector, matrix or data frame giving new individuals (one row per individual). |
type |
should the probability of the most probable class be given ( |
method |
criterion used to predict class membership. See |
... |
further arguments to be passed to or from other methods. |
Maxime HERVE <[email protected]>
Predicts class of the grouping factor based on a Correspondence Discriminant Analysis (performed using discrimin.coa
).
## S3 method for class 'coadisc' predict(object, newdata, method = c("mahalanobis", "euclidian"), ...)
## S3 method for class 'coadisc' predict(object, newdata, method = c("mahalanobis", "euclidian"), ...)
object |
object of class inheriting from |
newdata |
contingency table (either a |
method |
distance metric to be used for prediction. In all cases the predicted class corresponds to the minimum distance between the new individual and the centroid of each class. Default is Mahalanobis distance. |
... |
further arguments to be passed to or from other methods. |
Maxime HERVE <[email protected]>
require(ade4) data(perthi02) CDA <- discrimin.coa(perthi02$tab,perthi02$cla,scan=FALSE) new <- matrix(c(17,45,32,17,17,52,28,29,6,10,7,7,7,5,10,4,37,34,23,9),ncol=20) predict(CDA,new)
require(ade4) data(perthi02) CDA <- discrimin.coa(perthi02$tab,perthi02$cla,scan=FALSE) new <- matrix(c(17,45,32,17,17,52,28,29,6,10,7,7,7,5,10,4,37,34,23,9),ncol=20) predict(CDA,new)
Predicts response based on submodels generated by cross (model) validation. For regression models (PLSR and CPPLS), the predicted value is given with its confidence interval. For discriminant analyses, the predicted class is given with its probability (computed from the values predicted by all submodels).
## S3 method for class 'MVA.cv' predict(object, newdata, conf.level = 0.95, type.DA = c("max", "all"), crit.DA = c("plug-in", "predictive", "debiased"), ...) ## S3 method for class 'MVA.cmv' predict(object, newdata, conf.level = 0.95, type.DA = c("max", "all"), crit.DA = c("plug-in", "predictive", "debiased"), ...)
## S3 method for class 'MVA.cv' predict(object, newdata, conf.level = 0.95, type.DA = c("max", "all"), crit.DA = c("plug-in", "predictive", "debiased"), ...) ## S3 method for class 'MVA.cmv' predict(object, newdata, conf.level = 0.95, type.DA = c("max", "all"), crit.DA = c("plug-in", "predictive", "debiased"), ...)
object |
object of class inheriting from |
newdata |
vector, matrix or data frame giving new individuals (one row per individual). |
conf.level |
confidence level for prediction of a quantitative response. |
type.DA |
for discriminant analyses, should the probability of the most probable class be given ( |
crit.DA |
criterion used to predict class membership when a LDA or QDA is used. See |
... |
further arguments to be passed to or from other methods. |
Maxime HERVE <[email protected]>
Performs pairwise comparisons after a global test for given response probabilities (i.e. when the response variable is a binary variable), by using exact binomial tests. The function is in fact a wrapper around pairwise comparisons of proportions against given values, performed on a contingency table.
prop.bin.multcomp(formula, data, p, p.method = "fdr")
prop.bin.multcomp(formula, data, p, p.method = "fdr")
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
p |
theoretical probabilities. |
p.method |
method for p-values correction. See help of |
If the response is a 0/1 variable, the probability of the '1' group is tested. In any other case, the response is transformed into a factor and the probability of the second level is tested.
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
observed |
observed probabilities. |
expected |
expected probabilities. |
p.adjust.method |
method for p-values correction. |
p.value2 |
corrected p-values. |
p.value |
table of results of pairwise comparisons. |
Maxime HERVE <[email protected]>
prop.multcomp
, chisq.theo.bintest
response <- c(rep(0:1,c(40,60)),rep(0:1,c(55,45)),rep(0:1,c(65,35))) fact <- gl(3,100,labels=LETTERS[1:3]) p.theo <- c(0.5,0.45,0.2) chisq.theo.bintest(response~fact,p=p.theo) prop.bin.multcomp(response~fact,p=p.theo)
response <- c(rep(0:1,c(40,60)),rep(0:1,c(55,45)),rep(0:1,c(65,35))) fact <- gl(3,100,labels=LETTERS[1:3]) p.theo <- c(0.5,0.45,0.2) chisq.theo.bintest(response~fact,p=p.theo) prop.bin.multcomp(response~fact,p=p.theo)
Performs pairwise comparisons after a global test for given proportions, by using exact binomial tests.
prop.multcomp(x, p, p.method = "fdr")
prop.multcomp(x, p, p.method = "fdr")
x |
contingency table. |
p |
theoretical proportions. |
p.method |
method for p-values correction. See help of |
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
observed |
observed proportions. |
expected |
expected proportions. |
p.adjust.method |
method for p-values correction. |
p.value2 |
corrected p-values. |
p.value |
table of results of pairwise comparisons. |
Maxime HERVE <[email protected]>
proportions <- sample(c(0,1),200,replace=TRUE) populations <- sample(LETTERS[1:3],200,replace=TRUE) tab.cont <- table(populations,proportions) p.theo <- c(0.4,0.5,0.7) prop.test(tab.cont,p=p.theo) prop.multcomp(tab.cont,p=p.theo)
proportions <- sample(c(0,1),200,replace=TRUE) populations <- sample(LETTERS[1:3],200,replace=TRUE) tab.cont <- table(populations,proportions) p.theo <- c(0.4,0.5,0.7) prop.test(tab.cont,p=p.theo) prop.multcomp(tab.cont,p=p.theo)
Computes proportions (and their standard errors) when the number of classes is >= 2, based on predicted values of a model. The function is intended to be used in parallel with a multinomial log-linear model.
prop.multinom(x)
prop.multinom(x)
x |
either a factor or a matrix with K columns giving the counts for each of the K classes. |
The proportions can be computed through the predict
function applied on a multinomial log-linear model (see multinom
). However, standard errors (or confidence intervals) cannot be obtained this way. The present function uses different GLMs (in each case considering one category vs. the sum of all others) to obtain proportions and standard errors. Overdispersion is taken into account by default, using a quasibinomial family in all the GLMs built.
probs |
the calculated proportions. |
se |
the calculated standard errors. |
Maxime HERVE <[email protected]>
response <- data.frame(A=c(2,2,4,0,2,14,6,16,0,0), B=c(2,0,0,0,6,2,10,6,0,0), C=c(12,6,0,6,2,0,0,0,0,0), D=c(0,0,0,14,0,0,0,0,2,0), E=c(0,0,0,0,0,0,0,0,16,15)) prop.multinom(response)
response <- data.frame(A=c(2,2,4,0,2,14,6,16,0,0), B=c(2,0,0,0,6,2,10,6,0,0), C=c(12,6,0,6,2,0,0,0,0,0), D=c(0,0,0,14,0,0,0,0,2,0), E=c(0,0,0,0,0,0,0,0,16,15)) prop.multinom(response)
Performs pairwise comparisons of proportions when the number of classes is >= 2 with corrections for multiple testing.
prop.multinom.test(x, p.method = "fdr")
prop.multinom.test(x, p.method = "fdr")
x |
either a factor or a matrix with K columns giving the counts for each of the K classes. |
p.method |
method for p-values correction. See help of |
The function builds multinomial log-linear models (using multinom
) and applies Wald tests to compare the intercepts to 0. All necessary models (each time using a different reference level/class) are built to get p-values of all possible comparisons among levels/classes.
method |
a character string indicating the name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.adjust.method |
method for p-values correction. |
p.value |
table of results. |
z.tab |
table of z values. |
Maxime HERVE <[email protected]>
response <- factor(rep(LETTERS[1:4],c(20,40,42,13))) table(response)/length(response) prop.multinom.test(response)
response <- factor(rep(LETTERS[1:4],c(20,40,42,13))) table(response)/length(response) prop.multinom.test(response)
Allows for the calculation of QAIC(c) with glm
models built with a quasibinomial
or quasipoisson
distribution. The function comes directly from QAIC.
quasibinomial.QAIC(link = "logit") quasipoisson.QAIC(link = "log")
quasibinomial.QAIC(link = "logit") quasipoisson.QAIC(link = "log")
link |
This function is only intended to make the suggestion of Kamil Barton in QAIC available.
# See ?QAIC from the MuMIn package
# See ?QAIC from the MuMIn package
Extracts EMMeans (produced by emmeans
) from Cumulative Link (Mixed) Models (produced by clm
or clmm
), with different possible formats.
rating.emmeans(emm, type = c("prob", "cumprob", "class1", "class2"), level = 0.9)
rating.emmeans(emm, type = c("prob", "cumprob", "class1", "class2"), level = 0.9)
emm |
|
type |
type of output to be returned: |
level |
used only for type |
A factor named cut
must have been called in emmeans
, to compute EMMeans per cut point (i.e. rating). Additionally, the argument mode
of emmeans
must have been set to "linear.predictor". Finally, the call to emmeans
is typically like emmeans(model,~factor|cut,mode="linear.predictor")
where factor
is the factor (or interaction) giving levels for which EMMeans have to be computed.
Maxime HERVE <[email protected]>
require(ordinal) require(emmeans) model <- clm(rating~contact*temp,data=wine) EMM <- emmeans(model,~contact:temp|cut,mode="linear.predictor") # Probabilities rating.emmeans(EMM) # Cumulative probabilities rating.emmeans(EMM,type="cumprob") # Most probable rating rating.emmeans(EMM,type="class1")
require(ordinal) require(emmeans) model <- clm(rating~contact*temp,data=wine) EMM <- emmeans(model,~contact:temp|cut,mode="linear.predictor") # Probabilities rating.emmeans(EMM) # Cumulative probabilities rating.emmeans(EMM,type="cumprob") # Most probable rating rating.emmeans(EMM,type="class1")
Computes observed rating frequencies per level of a factor, in various formats.
rating.prob(x, g, type = c("prob", "cumprob", "class"))
rating.prob(x, g, type = c("prob", "cumprob", "class"))
x |
ordered factor (ratings). |
g |
factor giving groups to be compared. |
type |
type of output to be returned: |
Maxime HERVE <[email protected]>
require(ordinal) data(wine) # Frequencies rating.prob(wine$rating,wine$contact:wine$temp) # Cumulative frequencies rating.prob(wine$rating,wine$contact:wine$temp,type="cumprob") # Most frequent rating rating.prob(wine$rating,wine$contact:wine$temp,type="class")
require(ordinal) data(wine) # Frequencies rating.prob(wine$rating,wine$contact:wine$temp) # Cumulative frequencies rating.prob(wine$rating,wine$contact:wine$temp,type="cumprob") # Most frequent rating rating.prob(wine$rating,wine$contact:wine$temp,type="class")
Computes and add to a graph the confidence interval of a simple regression line or of individual values.
reg.ci(model, conf.level = 0.95, type = c("mean", "ind"), ...)
reg.ci(model, conf.level = 0.95, type = c("mean", "ind"), ...)
model |
|
conf.level |
confidence level. |
type |
interval type: |
... |
other arguments. See help of |
Maxime HERVE <[email protected]>
x <- 1:50 y <- 1:50+rnorm(50,0,4) regression <- lm(y~x) plot(x,y) abline(regression) reg.ci(regression,type="mean",col="red") reg.ci(regression,type="ind",col="blue")
x <- 1:50 y <- 1:50+rnorm(50,0,4) regression <- lm(y~x) plot(x,y) abline(regression) reg.ci(regression,type="mean",col="red") reg.ci(regression,type="ind",col="blue")
Represents the "correlation" of variables to axes in a MCA (from dudi.acm
) or a mix analysis (from dudi.hillsmith
or dudi.mix
).
scat.cr(dudi.obj, axis = 1)
scat.cr(dudi.obj, axis = 1)
dudi.obj |
object obtained from |
axis |
axis to be represented (the first by default). |
For quantitative variables, the squared correlation coefficient is displayed. For ordered factors, the squared multiple correlation coefficient is displayed. For unordered factors, the correlation ratio is displayed.
Maxime HERVE <[email protected]>, based on an idea of Stephane Champely.
dudi.acm
, dudi.hillsmith
, dudi.mix
require(ade4) # Fictive dataset age <- sample(15:60,50,replace=TRUE) sex <- sample(c("M","F"),50,replace=TRUE) size <- sample(155:190,50,replace=TRUE) hair <- sample(c("Fair","Dark","Russet"),50,replace=TRUE) eyes <- sample(c("Blue","Green","Brown"),50,replace=TRUE) weight <- sample(50:85,50,replace=TRUE) hand <- sample(c("Left.handed","Right.handed"),50,replace=TRUE) tab <- data.frame(age,sex,size,weight,hand,eyes,hair,stringsAsFactors=TRUE) amix <- dudi.hillsmith(tab,scannf=FALSE,nf=2) scat.cr(amix)
Computes the standard error of a mean or of a proportion.
se(x, y = NULL)
x |
numeric vector or number of successes. |
y |
number of trials. If |
The function deals with missing values.
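As a minimal sketch of this behaviour (it assumes NAs are simply discarded before computation):

x <- c(rnorm(30),NA,NA)
se(x)   # assumed to give the same result as se(x[!is.na(x)])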
Maxime HERVE <[email protected]>
# Standard error of a mean
se(rnorm(30))
# Standard error of a proportion
se(9,25)
Generates a regular sequence from the minimum to the maximum of a vector.
seq2(x, int = 999)
x |
numeric vector. |
int |
number of values to be generated ( |
Maxime HERVE <[email protected]>
seq2(rnorm(30))
Computes the confidence interval of a Spearman's rank correlation coefficient by bootstrapping.
spearman.ci(var1, var2, nrep = 1000, conf.level = 0.95)
var1 |
numeric vector (first variable). |
var2 |
numeric vector (second variable). |
nrep |
number of replicates for bootstrapping. |
conf.level |
confidence level of the interval. |
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
conf.level |
confidence level. |
rep |
number of replicates. |
estimate |
Spearman's rank correlation coefficient. |
conf.int |
confidence interval. |
Maxime HERVE <[email protected]>
var1 <- sample(1:50,15,replace=TRUE)
var2 <- sample(1:50,15,replace=TRUE)
spearman.ci(var1,var2)
Computes Bonferroni-adjusted confidence intervals of a series of Spearman's rank correlation coefficients, for multiple comparisons. Confidence intervals are computed by bootstrapping.
spearman.cor.multcomp(var1, var2, fact, alpha = 0.05, nrep = 1000)
var1 |
numeric vector (first variable). |
var2 |
numeric vector (second variable). |
fact |
factor (groups). |
alpha |
significance level. |
nrep |
number of replicates for bootstrapping. |
Confidence intervals that do not overlap indicate correlation coefficients significantly different at the alpha level.
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
tab |
data frame of correlation coefficients with confidence intervals |
alpha |
significance level. |
nrep |
number of replicates for bootstrapping. |
Maxime HERVE <[email protected]>
set.seed(1510)
var1 <- c(1:15+rnorm(15,0,2),1:15+rnorm(15,0,2),1:15+rnorm(15,0,2))
var2 <- c(-1:-15+rnorm(15,0,2),1:15+rnorm(15,0,2),1:15+rnorm(15,0,2))
fact <- gl(3,15,labels=LETTERS[1:3])
spearman.cor.multcomp(var1,var2,fact)
# B and C similar but different from A
Divides a data frame randomly, while respecting the relative proportions of the levels of a factor in the original data frame. Each subset has roughly the same number of individuals and roughly the same relative proportions with respect to the levels of the given factor.
splitf(set, fac, k)
set |
a data frame containing values to be divided into groups. |
fac |
a reference factor giving the relative proportions to be respected in each subset of |
k |
an integer giving the number of subsets to be generated. |
A list of subsets of set.
Maxime HERVE <[email protected]>
data(iris)
iris2 <- iris[c(1:50,51:80,101:120),]
# Proportions to be respected
table(iris2$Species)/nrow(iris2)
# Splitting
result <- splitf(iris2,iris2$Species,3)
# All subsets have the same size
lapply(result,nrow)
# And respect the initial proportions
lapply(result,function(x) table(x$Species)/nrow(x))
Centers and scales a data frame. See Details.
stand(tab, ref.tab=NULL, center=NULL, scale=NULL)
tab |
data frame to scale. |
ref.tab |
optional reference data frame, from which centering and scaling parameters are obtained (see Details). |
center |
optional vector of centering parameters (one per column of |
scale |
optional vector of scaling parameters (one per column of |
If ref.tab is not NULL, centering and scaling parameters are taken from this data frame. If it has a "scaled:center" attribute, this attribute is used to center tab; otherwise the means of ref.tab's columns are used. The same applies to the scaling parameters (with the "scaled:scale" attribute and standard deviations).

If ref.tab is NULL, the values of center and scale are used to standardize tab.

If ref.tab and center are NULL, the means of tab's columns are used for centering. If ref.tab and scale are NULL, the standard deviations of tab's columns are used for scaling.
Maxime HERVE <[email protected]>
data(iris)
set.seed(1131)
iris.samp <- iris[sample(1:150,10),1:4]
# Centering parameters of the complete dataset
attr(scale(iris[,1:4]),"scaled:center")
# Centering parameters of the reduced dataset
attr(scale(iris.samp),"scaled:center")
# Standardization based on the reduced dataset only
attr(stand(iris.samp),"scaled:center")
# Standardization based on the complete dataset
attr(stand(iris.samp,iris[,1:4]),"scaled:center")
Tests the significance of the coefficients associated with a given predictor of a model fitted with multinom. Wald tests are used. All coefficients are generated and tested by building models with different reference classes (for the response, but also for qualitative predictors with more than 2 levels).
test.multinom(model, variable)
model |
object of class |
variable |
any predictor present in |
Maxime HERVE <[email protected]>
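This entry has no example block; the following sketch, built on fictitious data, illustrates a typical call (it assumes the predictor is passed unquoted; check the function's help if a character name is expected instead):

require(nnet)
set.seed(42)
tab <- data.frame(
  resp = factor(sample(LETTERS[1:3],90,replace=TRUE)),
  fact = gl(3,30,labels=letters[1:3]),
  covar = rnorm(90)
)
model <- multinom(resp~fact+covar,data=tab,trace=FALSE)
# Wald tests of the coefficients associated with each predictor
test.multinom(model,fact)
test.multinom(model,covar)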
Converts some ordinations performed with the vegan package to objects compatible with coinertia.
to.dudi(ord)
ord |
an ordination (see Details). |
The function supports:
- PCA computed from rda. If the data were scaled (prior to the analysis or using the scale argument of rda), it is assumed that it was with the standard deviation using n-1; as in dudi.pca, to.dudi rescales the data with the standard deviation using n.
- PCoA computed from wcmdscale, capscale or dbrda.
- CA computed from cca.
Maxime HERVE <[email protected]>
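This entry has no example block; the sketch below illustrates the intended workflow with two PCAs computed in vegan (the co-inertia step is shown only to indicate where the converted objects are meant to be used):

require(vegan)
require(ade4)
data(varespec)
data(varechem)
# Two PCAs computed with vegan...
pca.chem <- rda(varechem,scale=TRUE)
pca.spe <- rda(varespec)
# ...converted to dudi-compatible objects
pca.chem.dudi <- to.dudi(pca.chem)
pca.spe.dudi <- to.dudi(pca.spe)
# The converted objects can then be passed to coinertia()
coinertia(pca.chem.dudi,pca.spe.dudi,scannf=FALSE,nf=2)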
Returns a function usable by emmeans for user-defined contrasts.
user.cont(cont)
cont |
any matrix of contrasts (see 'Details'). |
In these matrices, each row is a comparison (= contrast) and each column is a level of the factor. Rules for writing contrasts are:
- levels not involved in the comparison must have a value of zero
- levels to be compared must have opposite signs
- levels can be grouped (for example, 2 -1 -1 compares the first level against the group formed by the two others)
- the sum of all values of a contrast must be zero.
user.cont.emmc |
the function to be called by |
Maxime HERVE <[email protected]>
require(car)
require(emmeans)
tab <- data.frame(
  response = c(rpois(30,1),rpois(30,3),rpois(30,10)),
  fact = gl(3,30,labels=LETTERS[1:3])
)
model <- glm(response~fact,family="poisson",data=tab)
Anova(model)
mat <- matrix(c(1,-1,0,0,1,-1,2,-1,-1),nrow=3,byrow=TRUE,dimnames=list(1:3,levels(tab$fact)))
mat
cont.emmc <- user.cont(mat)
EMM <- emmeans(model,~fact)
contrast(EMM,"cont")
Performs pairwise comparisons of proportions to theoretical values.
wald.ptheo.multinom.test(x, p, p.method = "fdr")
x |
either a factor or a matrix with K columns giving the counts for each of the K classes. |
p |
theoretical proportions. |
p.method |
method for p-values correction. See help of |
The function builds K logistic regressions (in each case considering one class vs. the sum of all the others) and uses wald.ptheo.test to test the hypothesis that the proportion of this class is equal to the corresponding theoretical proportion in p.
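As a minimal sketch of this logic (reusing the response defined in the example below), the test for a single class should match an uncorrected wald.ptheo.test on the corresponding binarized response:

response <- factor(rep(LETTERS[1:4],c(20,40,42,13)))
# Class "A" vs. all the other classes pooled, compared to its theoretical proportion
wald.ptheo.test(as.numeric(response=="A"),p=0.15)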
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
observed |
observed proportions. |
expected |
theoretical proportions. |
p.adjust.method |
method for p-values correction. |
statistic |
statistics of each test. |
p.value2 |
corrected p-values. |
p.value |
data frame of results. |
Maxime HERVE <[email protected]>
wald.ptheo.test, prop.multinom
response <- factor(rep(LETTERS[1:4],c(20,40,42,13)))
wald.ptheo.multinom.test(response,p=c(0.15,0.25,0.3,0.3))
Performs a Wald test for comparison of a proportion to a theoretical value.
wald.ptheo.test(y, blocks = NULL, p = 0.5)
y |
either a binary response (numeric vector or factor, with only two possible values except NA) or a two-column matrix with the columns giving the numbers of successes (left) and failures (right). |
blocks |
optional blocking (random) factor. |
p |
hypothesized probability of success. |
The function builds a logistic (mixed) regression and applies a Wald test to compare the estimated value of the intercept to its theoretical value under H0. Possible overdispersion is taken into account, by using a quasi-binomial distribution when there are no blocks or by introducing an individual-level random factor when blocks are present.
If the response is a 0/1 vector, the probability of the '1' group is tested. With other vectors, the response is transformed into a factor and the probability of the second level is tested.
If the response is a two-column matrix, the probability of the left column is tested.
If the response is a vector and no blocking factor is present, the exact binomial test performed by binom.test should be preferred, since it is an exact test whereas the Wald test is an approximate test.
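As a minimal sketch of the two-column response form described above (the counts are hypothetical), with successes in the left column and failures in the right:

counts <- cbind(success=c(8,12,10),failure=c(12,8,10))
wald.ptheo.test(counts,p=0.4)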
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
statistic |
test statistics of the test. |
p.value |
p-value of the test. |
estimate |
the estimated proportion (calculated without taking into account the blocking factor, if present). |
alternative |
a character string describing the alternative hypothesis, always |
null.value |
the value of the proportion under the null hypothesis. |
parameter |
the degrees of freedom for the t-statistic, only with overdispersion and no blocks. |
Maxime HERVE <[email protected]>
set.seed(2006)
response <- sample(0:1,60,replace=TRUE)
# Comparison to p=0.5
wald.ptheo.test(response)
# Comparison to p=0.8
wald.ptheo.test(response,p=0.8)
# With a blocking factor
require(lme4)
blocks <- gl(3,20)
wald.ptheo.test(response,blocks)
Performs nonparametric pairwise comparisons of paired samples using Wilcoxon signed rank tests.
wilcox.paired.multcomp(formula, data, p.method = "fdr")
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
p.method |
method for p-values correction. See help of |
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
method |
a character string indicating the name of the test. |
p.adjust.method |
method for p-values correction. |
p.value |
table of results of pairwise comparisons. |
Maxime HERVE <[email protected]>
pairwise.wilcox.test, wilcox.test
response <- c(rnorm(10,0,3),rnorm(10,5,3),rnorm(10,8,2))
fact <- gl(3,10,labels=LETTERS[1:3])
block <- gl(10,1,30,labels=letters[1:10])
friedman.test(response~fact|block)
wilcox.paired.multcomp(response~fact|block)
Performs a Wilcoxon sign test to compare medians of two paired samples or one median to a given value.
wilcox.signtest(x, ...)

## Default S3 method:
wilcox.signtest(x, y = NULL, mu = 0, conf.level = 0.95, ...)

## S3 method for class 'formula'
wilcox.signtest(formula, data, subset, ...)
x |
a numeric vector of data values. |
y |
an optional numeric vector of data values (for paired two-sample test). |
mu |
theoretical median (one-sample test) or theoretical median of |
conf.level |
confidence level of the interval. |
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
subset |
an optional vector specifying a subset of observations to be used. |
... |
further arguments to be passed to or from other methods. |
If zeroes (i.e. null differences with mu) are present, the median of the data different from mu is tested in the one-sample situation; the median of the x-y differences different from mu in the two-sample situation.
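A minimal sketch of this zero handling (it assumes values equal to mu are simply discarded and the median of the remaining data is tested):

x <- c(2,2,3,5,7,8)
# the two values equal to mu = 2 are assumed not to be counted
wilcox.signtest(x,mu=2)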
method |
a character string indicating the name of the test. |
data.name |
a character string giving the name(s) of the data. |
null.value |
the specified hypothesized value of the median or median difference depending on the test performed. |
p.value |
the p-value of the test. |
alternative |
a character string giving the alternative hypothesis, always |
estimate |
the estimated median or median of |
conf.int |
a confidence interval for the median tested. |
Maxime HERVE <[email protected]>
set.seed(1706)
response <- c(rnorm(7,3,1.5),rnorm(7,5.5,2))
# Comparison of 2 samples
fact <- gl(2,7,labels=LETTERS[1:2])
wilcox.signtest(response~fact)
# Comparison to a given value
theo <- 4
wilcox.signtest(response,mu=theo)
Computes the weighted arithmetic mean of a vector.
wmean(x, w = rep(1, length(x)), na.rm = TRUE)
x |
numeric vector. |
w |
numeric vector of weights. |
na.rm |
a logical value indicating whether |
Maxime HERVE <[email protected]>
mean(1:10)
wmean(1:10,w=10:1)