Title: | Testing and Plotting Procedures for Biostatistics |
---|---|
Description: | Contains miscellaneous functions useful in biostatistics, mostly univariate and multivariate testing procedures with a special emphasis on permutation tests. Many functions are intended to simplify the user's life by shortening existing procedures or by implementing plotting functions that can be used with as many methods from different packages as possible. |
Authors: | Maxime HERVE |
Maintainer: | Maxime HERVE <[email protected]> |
License: | GPL-2 |
Version: | 0.9-83-7 |
Built: | 2025-02-28 03:36:47 UTC |
Source: | https://github.com/cran/RVAideMemoire |
Contains miscellaneous functions useful in biostatistics, mostly univariate and multivariate testing procedures with a special emphasis on permutation tests. Many functions are intended to simplify the user's life by shortening existing procedures or by implementing plotting functions that can be used with as many methods from different packages as possible.
Package: | RVAideMemoire |
Type: | Package |
Version: | 0.9-83-7 |
Date: | 2023-11-06 |
License: | GPL-2 |
LazyLoad: | yes |
Maxime HERVE
Maintainer: Maxime HERVE <[email protected]>
Document : "Aide-memoire de statistique appliquee a la biologie - Construire son etude et analyser les resutats a l'aide du logiciel R" (available on CRAN)
This function is a wrapper to adonis but performs type II tests (whereas adonis performs type I).
adonis.II(formula, data, ...)
formula |
a typical model formula such as |
data |
the data frame from which |
... |
additional arguments to |
See adonis for a detailed explanation of what is done. The only difference with adonis is that adonis.II performs type II tests instead of type I.
a data frame of class "anova".
Maxime HERVE <[email protected]>
require(vegan)
data(dune)
data(dune.env)
# Compare:
adonis(dune~Management*A1,data=dune.env,permutations=99)
adonis(dune~A1*Management,data=dune.env,permutations=99)
# With:
adonis.II(dune~Management*A1,data=dune.env,permutations=99)
adonis.II(dune~A1*Management,data=dune.env,permutations=99)
These functions are methods for Anova to calculate type-II or type-III analysis-of-deviance tables for model objects produced by clm and clmm. Likelihood-ratio tests are calculated in both cases.
## S3 method for class 'clm'
Anova(mod, type = c("II", "III", 2, 3), ...)
## S3 method for class 'clmm'
Anova(mod, type = c("II", "III", 2, 3), ...)
mod |
|
type |
type of test, |
... |
additional arguments to |
See the help of Anova for a detailed explanation of what "type II" and "type III" mean.
See Anova.
Maxime HERVE <[email protected]>
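A minimal sketch of how these methods can be used (assuming packages ordinal and car are installed; the wine data set used here ships with ordinal):
require(ordinal)
require(car)
data(wine, package = "ordinal")
# Cumulative link model for bitterness ratings
mod <- clm(rating ~ temp * contact, data = wine)
# Type-II likelihood-ratio tests via the Anova method provided here
Anova(mod, type = "II")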
Back-transforms EMMeans (produced by emmeans) when the model was built on a transformed response variable. This is typically the case when a LM(M) with log(x+1) as response variable gives a better fit than a GLM(M) for count data, or when a beta regression takes as response a variable on the [0;1] interval that has been rescaled to the (0;1) interval using p.beta.
back.emmeans(emm, transform = c("log", "logit", "sqrt", "4rt", "inverse", "p.beta"),
  base = exp(1), add = 0, n = NULL, C = 2, ord = FALSE, decreasing = TRUE)
emm |
object returned by |
transform |
transformation applied to the response variable before building the model on which |
base |
the base with respect to which the logarithm transformation was computed (if |
add |
value to be added to x before computing the transformation, if needed (e.g. |
n |
total number of observations in the initial data set. Only used with |
C |
number of categories from which initial continuous proportions were computed. Only used with |
ord |
logical indicating if back-transformed EMMeans should be ordered. |
decreasing |
logical indicating in which order back-transformed EMMeans should be ordered, if |
Maxime HERVE <[email protected]>
require(emmeans)
set.seed(1149)
tab <- data.frame(
  response = c(rpois(30,0),rpois(30,2),rpois(30,4)),
  fact = gl(3,30,labels=LETTERS[1:3])
)
model <- lm(log(response+1)~fact,data=tab)
EMM <- emmeans(model,~fact)
back.emmeans(EMM,transform="log",add=1)
Simplified version of the boot function.
bootstrap(x, fun, nrep = 1000, conf.level = 0.95, ...)
x |
numeric vector. |
fun |
function to be used for computation ( |
nrep |
number of replicates. |
conf.level |
confidence level for confidence interval. |
... |
additional arguments to |
See the help of the boot function for more explanations.
method |
the character string |
data.name |
a character string giving the name of the data. |
estimate |
the estimated original value |
conf.level |
confidence level for confidence interval. |
rep |
number of replicates. |
conf.int |
limits of the confidence interval. |
Maxime HERVE <[email protected]>
# Confidence interval of a mean
samp <- sample(1:50,10,replace=TRUE)
bootstrap(samp,function(x,i) mean(x[i]))
# Confidence interval of the standard error of the mean
bootstrap(samp,function(x,i) sd(x[i])/sqrt(length(x[i])))
Draws a histogram of a numeric variable per level of a factor.
byf.hist(formula, data, sep = FALSE, density = TRUE, xlab = NULL, ylab = NULL, col = NULL)
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
sep |
logical. If |
density |
logical. If |
xlab |
label for x-axis (name of the response variable as default). |
ylab |
label for y-axis ("Density" or "Frequency" as default, depending on the type of histogram). |
col |
color(s) used for density curves or bars. |
Maxime HERVE <[email protected]>
data(iris)
byf.hist(Sepal.Length~Species,data=iris)
Draws a multivariate QQ-plot of numeric variables per level of a factor.
byf.mqqnorm(formula, data)
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
Maxime HERVE <[email protected]>
data(iris)
byf.mqqnorm(as.matrix(iris[,1:4])~Species,data=iris)
Performs a multivariate Shapiro-Wilk test on numeric variables per level of a factor.
byf.mshapiro(formula, data)
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
method |
name of the test. |
data.name |
a character string giving the names of the data. |
tab |
table of results. |
Maxime HERVE <[email protected]>
byf.mqqnorm, mshapiro.test, qqPlot
data(iris)
byf.mshapiro(as.matrix(iris[,1:4])~Species,data=iris)
Draws a QQ-plot of a numeric variable per level of a factor.
byf.qqnorm(formula, data, ...)
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
... |
other arguments to pass to |
Maxime HERVE <[email protected]>
byf.shapiro, qqPlot
data(iris)
byf.qqnorm(Sepal.Length~Species,data=iris)
Performs a Shapiro-Wilk test on a numeric variable per level of a factor.
byf.shapiro(formula, data)
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
method |
name of the test. |
data.name |
a character string giving the names of the data. |
tab |
table of results. |
Maxime HERVE <[email protected]>
data(iris)
byf.shapiro(Sepal.Length~Species,data=iris)
Performs cross validation with correspondence discriminant analyses.
CDA.cv(X, Y, repet = 10, k = 7, ncomp = NULL, method = c("mahalanobis", "euclidian"))
X |
a data frame of dependent variables (typically contingency or presence-absence table). |
Y |
factor giving the groups. |
repet |
an integer giving the number of times the whole procedure has to be repeated. |
k |
an integer giving the number of folds (can be re-set internally if needed). |
ncomp |
an integer giving the number of components to be used for prediction. If |
method |
criterion used to predict class membership. See |
The training sets are generated with respect to the relative proportions of the levels of Y in the original data set (see splitf).
repet |
number of times the whole procedure was repeated. |
k |
number of folds. |
ncomp |
number of components used. |
method |
criterion used to classify individuals of the test sets. |
groups |
levels of |
models.list |
list of models generated ( |
NMC |
Classification error rates ( |
Maxime HERVE <[email protected]>
require(ade4)
data(perthi02)
## Not run: CDA.cv(perthi02$tab,perthi02$cla)
Performs a significance test for correspondence discriminant analysis. See Details.
CDA.test(X, fact, ncomp = NULL, ...)
X |
a data frame of dependent variables (typically contingency or presence-absence table). |
fact |
factor giving the groups. |
ncomp |
an integer giving the number of components to be used for the test. If |
... |
other arguments to pass to |
CDA consists of two steps: building a correspondence analysis (CA) on X, then using the row coordinates on all CA components as input variables for a linear discriminant analysis. CDA.test builds the intermediate CA, then uses the first ncomp components to test for an effect of fact. If 1 component is used the test is an ANOVA; if more than 1 component is used the test is a MANOVA.
An ANOVA or MANOVA table.
Maxime HERVE <[email protected]>
require(ade4)
data(perthi02)
CDA.test(perthi02$tab,perthi02$cla)
Returns an object similar to what is produced by ecdf, but based on a known discrete distribution.
cdf.discrete(x, dist = c("binom", "geom", "hyper", "nbinom", "pois"), ...)
x |
numeric vector of the observations. |
dist |
character string naming a discrete distribution ( |
... |
parameters of the distribution specified by |
The function is intended to be used in goodness-of-fit tests for discrete distributions, such as those proposed in the dgof package.
Maxime HERVE <[email protected]>
if(require(dgof)) {
  set.seed(1124)
  resp <- rpois(20,2)
  cvm.test(resp,cdf.discrete(resp,"pois",2))
}
Returns expected counts before comparing response probabilities (i.e. when the response variable is a binary variable) to given values by a chi-squared test. The function is in fact a wrapper to the chi-squared test for comparison of proportions to given values on a contingency table.
chisq.bin.exp(formula, data, p, graph = FALSE)
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
p |
theoretical probabilities. |
graph |
logical. If |
The function returns how many counts can be < 5 to respect Cochran's rule (80% of counts must be >= 5).
p.theo |
theoretical probabilities. |
mat |
contingency table of expected counts. |
cochran |
number of counts which can be < 5. |
Maxime HERVE <[email protected]>
prop.test, chisq.theo.bintest, mosaicplot
response <- c(rep(0:1,c(40,60)),rep(0:1,c(55,45)),rep(0:1,c(65,35)))
fact <- gl(3,100,labels=LETTERS[1:3])
p.theo <- c(0.5,0.45,0.2)
chisq.bin.exp(response~fact,p=p.theo)
Performs a Pearson's Chi-squared test for comparing response probabilities (i.e. when the response variable is a binary variable). The function is in fact a wrapper to the chi-squared test for comparison of proportions on a contingency table. If the p-value of the test is significant, the function performs pairwise comparisons by using Pearson's Chi-squared tests.
chisq.bintest(formula, data, correct = TRUE, alpha = 0.05, p.method = "fdr")
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
correct |
a logical indicating whether to apply continuity correction when computing the test statistic for 2 by 2 tables. See help of |
alpha |
significance level to compute pairwise comparisons. |
p.method |
method for p-values correction. See help of |
If the response is a 0/1 variable, the probability of the '1' group is tested. In any other case, the response is transformed into a factor and the probability of the second level is tested.
Since a chi-squared test is an approximate test, an exact test is preferable when the number of individuals is small (200 is a reasonable minimum). See fisher.bintest in that case.
method.test |
a character string giving the name of the global test computed. |
data.name |
a character string giving the name(s) of the data. |
alternative |
a character string describing the alternative hypothesis. |
estimate |
the estimated probabilities. |
null.value |
the value of the difference in probabilities under the null hypothesis, always 0. |
statistic |
test statistics. |
parameter |
test degrees of freedom. |
p.value |
p-value of the global test. |
alpha |
significance level. |
p.adjust.method |
method for p-values correction. |
p.value.multcomp |
data frame of pairwise comparisons result. |
method.multcomp |
a character string giving the name of the test computed for pairwise comparisons. |
Maxime HERVE <[email protected]>
response <- c(rep(0:1,c(40,60)),rep(0:1,c(55,45)),rep(0:1,c(65,35)))
fact <- gl(3,100,labels=LETTERS[1:3])
chisq.bintest(response~fact)
Returns expected counts before comparing proportions to given values by a chi-squared test.
chisq.exp(data, p, graph = FALSE)
data |
contingency table. |
p |
theoretical proportions. |
graph |
logical. If |
The function returns how many counts can be < 5 to respect Cochran's rule (80% of counts must be >= 5).
p.theo |
theoretical proportions. |
mat |
contingency table of expected counts. |
cochran |
number of counts which can be < 5. |
Maxime HERVE <[email protected]>
prop.test, chisq.test, mosaicplot
proportions <- sample(c(0,1),200,replace=TRUE)
populations <- sample(LETTERS[1:3],200,replace=TRUE)
tab.cont <- table(populations,proportions)
p.theo <- c(0.4,0.5,0.7)
chisq.exp(tab.cont,p=p.theo)
Performs pairwise comparisons after a global chi-squared goodness-of-fit test.
chisq.multcomp(x, p.method = "fdr")
x |
numeric vector (counts). |
p.method |
method for p-values correction. See help of |
Since a chi-squared test is an approximate test, an exact test is preferable when the number of individuals is small (200 is a reasonable minimum). See multinomial.multcomp in that case.
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.adjust.method |
method for p-values correction. |
p.value |
table of results. |
Maxime HERVE <[email protected]>
chisq.test, multinomial.test, multinomial.multcomp
counts <- c(49,30,63,59)
chisq.test(counts)
chisq.multcomp(counts)
Performs a Pearson's Chi-squared test for comparing response probabilities (i.e. when the response variable is a binary variable) to given values. The function is in fact a wrapper to the chi-squared test for comparison of proportions to given values on a contingency table.
chisq.theo.bintest(formula, data, p)
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
p |
theoretical probabilities. |
If the response is a 0/1 variable, the probability of the '1' group is tested. In any other case, the response is transformed into a factor and the probability of the second level is tested.
method.test |
a character string giving the name of the test. |
data.name |
a character string giving the name(s) of the data. |
alternative |
a character string describing the alternative hypothesis, always two-sided. |
estimate |
the estimated probabilities. |
null.value |
the theoretical probabilities. |
statistic |
test statistics. |
parameter |
test degrees of freedom. |
p.value |
p-value of the test. |
Maxime HERVE <[email protected]>
prop.test, chisq.bin.exp, prop.bin.multcomp
response <- c(rep(0:1,c(40,60)),rep(0:1,c(55,45)),rep(0:1,c(65,35)))
fact <- gl(3,100,labels=LETTERS[1:3])
p.theo <- c(0.5,0.45,0.2)
chisq.theo.bintest(response~fact,p=p.theo)
Performs pairwise comparisons after a global chi-squared test for given probabilities.
chisq.theo.multcomp(x, p = rep(1/length(x), length(x)), p.method = "fdr")
x |
numeric vector (counts). |
p |
theoretical proportions. |
p.method |
method for p-values correction. See help of |
Since a chi-squared test is an approximate test, an exact test is preferable when the number of individuals is small (200 is a reasonable minimum). See multinomial.theo.multcomp in that case.
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
observed |
observed counts. |
expected |
expected counts. |
p.adjust.method |
method for p-values correction. |
statistic |
statistics of each test. |
p.value2 |
corrected p-values. |
p.value |
data frame of results. |
Maxime HERVE <[email protected]>
chisq.test, multinomial.test, multinomial.theo.multcomp
counts <- c(49,30,63,59)
p.theo <- c(0.2,0.1,0.45,0.25)
chisq.test(counts,p=p.theo)
chisq.theo.multcomp(counts,p=p.theo)
Performs the Cochran's Q test for unreplicated randomized block design experiments with a binary response variable and paired data. If the p-value of the test is significant, the function performs pairwise comparisons by using the Wilcoxon sign test.
cochran.qtest(formula, data, alpha = 0.05, p.method = "fdr")
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
alpha |
significance level to compute pairwise comparisons. |
p.method |
method for p-values correction. See help of |
If the response is a 0/1 variable, the probability of the '1' group is tested. In any other case, the response is transformed into a factor and the probability of the second level is tested.
method.test |
a character string giving the name of the global test computed. |
data.name |
a character string giving the name(s) of the data. |
alternative |
a character string describing the alternative hypothesis. |
estimate |
the estimated probabilities. |
null.value |
the value of the difference in probabilities under the null hypothesis, always 0. |
statistic |
test statistics (Pearson's Chi-squared test only). |
parameter |
test degrees of freedom (Pearson's Chi-squared test only). |
p.value |
p-value of the global test. |
alpha |
significance level. |
p.adjust.method |
method for p-values correction. |
p.value.multcomp |
data frame of pairwise comparisons result. |
method.multcomp |
a character string giving the name of the test computed for pairwise comparisons. |
Maxime HERVE <[email protected]>
response <- c(0,1,1,0,0,1,0,1,1,1,1,1,0,0,1,1,0,1,0,1,1,0,0,1,0,1,1,0,0,1)
fact <- gl(3,1,30,labels=LETTERS[1:3])
block <- gl(10,3,labels=letters[1:10])
cochran.qtest(response~fact|block)
Computes the condition number of the Hessian matrix of a model fitted with multinom.
cond.multinom(model)
model |
object of class |
Maxime HERVE <[email protected]>
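A minimal sketch (assuming package nnet is installed; the model and predictors are arbitrary):
require(nnet)
data(iris)
# Multinomial model (Hess = TRUE stores the Hessian)
model <- multinom(Species ~ Sepal.Length + Sepal.Width, data = iris, Hess = TRUE, trace = FALSE)
cond.multinom(model)
# A very large condition number suggests an ill-conditioned model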
Returns the coordinates of a set of points when orthogonally projected on a new axis.
coord.proj(coord,slp)
coord |
2-column data frame or matrix giving the original coordinates (left column: x, right column: y). |
slp |
slope of the new axis. |
Maxime HERVE <[email protected]>
data(iris)
# Original coordinates
plot(Petal.Length~Sepal.Length,pch=16,col=as.numeric(iris$Species),data=iris)
# New axis
abline(-6,1.6)
# Coordinates on new axis
new.coord <- coord.proj(iris[,c("Sepal.Length","Petal.Length")],1.6)
stripchart(new.coord~Species,data=iris,col=1:3)
Performs the test for equality of 2 Pearson's correlation coefficients. If the difference is not significant, the function returns the common coefficient, its confidence interval and performs the test for equality to a given value.
cor.2comp(var1, var2, var3, var4, alpha = 0.05, conf.level = 0.95, theo = 0)
var1 |
numeric vector (first variable of the first correlation). |
var2 |
numeric vector (second variable of the first correlation). |
var3 |
numeric vector (first variable of the second correlation). |
var4 |
numeric vector (second variable of the second correlation). |
alpha |
significance level. |
conf.level |
confidence level. |
theo |
theoretical coefficient. |
method.test |
a character string giving the name of the global test computed. |
data.name |
a character string giving the name(s) of the data. |
statistic |
test statistics. |
p.value |
p-value for comparison of the 2 coefficients. |
null.value |
the value of the difference in coefficients under the null hypothesis, always 0. |
alternative |
a character string describing the alternative hypothesis. |
estimate |
the estimated correlation coefficients. |
alpha |
significance level. |
conf.level |
confidence level. |
common.name |
a character string explaining the elements of the table below. |
common |
data frame of results if the coefficients are not significantly different (common coefficient). |
Maxime HERVE <[email protected]>
cor1.var1 <- 1:30+rnorm(30,0,2)
cor1.var2 <- 1:30+rnorm(30,0,3)
cor2.var1 <- (-1):-30+rnorm(30,0,2)
cor2.var2 <- (-1):-30+rnorm(30,0,3)
cor.2comp(cor1.var1,cor1.var2,cor2.var1,cor2.var2)
Performs a test for equality of a Pearson's linear correlation coefficient to a given value.
cor.conf(var1, var2, theo)
var1 |
numeric vector (first variable). |
var2 |
numeric vector (second variable). |
theo |
theoretical value. |
method |
a character string giving the name of the test. |
data.name |
a character string giving the name(s) of the data. |
statistic |
test statistics. |
p.value |
p-value of the test. |
null.value |
the value of the theoretical coefficient. |
alternative |
a character string describing the alternative hypothesis. |
estimate |
the estimated correlation coefficient. |
Maxime HERVE <[email protected]>
var1 <- 1:30+rnorm(30,0,4)
var2 <- 1:30+rnorm(30,0,4)
cor.conf(var1,var2,theo=0.5)
Performs comparisons of several Pearson's linear correlation coefficients. If there is no difference, the function returns the common correlation coefficient, its confidence interval and tests for its equality to a given value. If the difference is significant, the function performs pairwise comparisons between coefficients.
cor.multcomp(var1, var2, fact, alpha = 0.05, conf.level = 0.95, theo = 0, p.method = "fdr")
var1 |
numeric vector (first variable). |
var2 |
numeric vector (second variable). |
fact |
factor (groups). |
alpha |
significance level. |
conf.level |
confidence level. |
theo |
theoretical coefficient. |
p.method |
method for p-values correction. See help of |
method.test |
a character string giving the name of the global test computed. |
data.name |
a character string giving the name(s) of the data. |
statistic |
test statistics. |
parameter |
test degrees of freedom. |
p.value |
p-value for comparison of the coefficients. |
null.value |
the value of the difference in coefficients under the null hypothesis, always 0. |
alternative |
a character string describing the alternative hypothesis. |
estimate |
the estimated correlation coefficients. |
alpha |
significance level. |
conf.level |
confidence level. |
p.adjust.method |
method for p-values correction. |
p.value.multcomp |
data frame of pairwise comparisons result. |
common.name |
a character string explaining the elements of the table below. |
common |
data frame of results if the coefficients are not significantly different (common coefficient). |
Maxime HERVE <[email protected]>
var1 <- c(1:15+rnorm(15,0,4),1:15+rnorm(15,0,1),1:15+rnorm(15,0,8))
var2 <- c(-1:-15+rnorm(15,0,4),1:15+rnorm(15,0,1),1:15+rnorm(15,0,8))
fact <- gl(3,15,labels=LETTERS[1:3])
cor.multcomp(var1,var2,fact)
var3 <- c(1:15+rnorm(15,0,1),1:15+rnorm(15,0,3),1:15+rnorm(15,0,2))
cor.multcomp(var1,var3,fact)
Performs a permutation test based on the sum of squared covariances between the variables of two data sets, to test whether the (squared) covariance is higher than expected under random association between the two data sets. The test is relevant in parallel with a 2B-PLS.
cov.test(X, Y, scale.X = TRUE, scale.Y = TRUE, nperm = 999, progress = TRUE)
X |
a numeric vector, matrix or data frame. |
Y |
a numeric vector, matrix or data frame. |
scale.X |
logical, if |
scale.Y |
logical, if |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
The function deals with the limited floating point precision, which can bias the calculation of p-values based on a discrete test statistic distribution.
method |
a character string indicating the name of the test. |
data.name |
a character string giving the name(s) of the data, plus additional information. |
statistic |
the value of the test statistics. |
permutations |
the number of permutations. |
p.value |
the p-value of the test. |
Maxime HERVE <[email protected]>
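A minimal sketch on simulated data (the data sets and number of permutations are arbitrary):
set.seed(38)
X <- matrix(rnorm(100), ncol = 5)
# Y built to be partly associated with X
Y <- X[, 1:3] + matrix(rnorm(60, 0, 0.5), ncol = 3)
cov.test(X, Y, nperm = 499)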
Plots martingale residuals of a Cox model against fitted values, to check for log-linearity of covariates.
cox.resid(model)
model |
a |
Maxime HERVE <[email protected]>, based on an idea of John Fox.
Fox, J. 2002 Cox Proportional-Hazards Regression for Survival Data.
# 'kidney' dataset of package 'survival'
require(survival)
data(kidney)
model <- coxph(Surv(time,status)~age+factor(sex),data=kidney)
cox.resid(model)
Computes the Cramer's association coefficient between 2 nominal variables.
cramer(x, y)
x |
a contingency table ('matrix' or 'table' object). |
y |
ignored if |
Maxime HERVE <[email protected]>
var1 <- sample(LETTERS[1:3],30,replace=TRUE)
var2 <- sample(letters[1:3],30,replace=TRUE)
cramer(var1,var2)
# or
cramer(table(var1,var2))
Computes the Cramer's association coefficient between 2 nominal variables, its confidence interval (by bootstrapping) and tests for its significance.
cramer.test(x, y, nrep = 1000, conf.level = 0.95)
x |
a contingency table ('matrix' or 'table' object). |
y |
ignored if |
nrep |
number of replicates for bootstrapping. |
conf.level |
confidence level. |
method |
name of the test. |
statistic |
test statistics. |
parameter |
test degrees of freedom. |
p.value |
test p-value. |
data.name |
a character string giving the names of the data. |
estimate |
Cramer's coefficient. |
conf.level |
confidence level. |
rep |
number of replicates. |
conf.int |
confidence interval. |
alternative |
a character string giving the alternative hypothesis, always |
null.value |
the value of the association measure under the null hypothesis, always 0. |
Maxime HERVE <[email protected]>
var1 <- sample(LETTERS[1:3],30,replace=TRUE)
var2 <- sample(letters[1:3],30,replace=TRUE)
cramer.test(var1,var2)
# or
cramer.test(table(var1,var2))
Computes the coefficient of variation of a vector.
cv(x, abs = TRUE, pc = TRUE)
x |
numeric vector. |
abs |
logical. If |
pc |
logical. If |
The function deals with missing values.
Maxime HERVE <[email protected]>
cv(rnorm(30))
Draws a dendrogram and an additional bar plot helping to choose the number of groups to be retained (based on the dendrogram).
dendro.gp(dend)
dend |
a dendrogram obtained from |
Maxime HERVE <[email protected]>
data(iris)
distances <- dist(iris[,1:4],method="euclidean")
dendro <- hclust(distances,method="ward.D2")
dendro.gp(dendro)
Functions that are not usable anymore, and will be entirely removed from the package in future versions.
back.lsmeans(...)
byf.normhist(...)
cor.sparse(...)
CvM.test(...)
DA.confusion(...)
DA.valid(...)
DA.var(...)
dunn.test(...)
fc.multcomp(...)
friedman.rating.test(...)
kruskal.rating.test(...)
pairwise.manova(...)
pairwise.to.groups(...)
pairwise.wilcox.rating.test(...)
plot1comp.ind(...)
plot1comp.var(...)
PLSDA.ncomp(...)
PLSDA.test(...)
rating.lsmeans(...)
s.corcircle2(...)
scat.mix.categorical(...)
scat.mix.numeric(...)
scatter.coa2(...)
wilcox.paired.rating.multcomp(...)
wilcox.rating.signtest(...)
wilcox.rating.test(...)
... |
previous arguments. |
back.lsmeans and rating.lsmeans are replaced by back.emmeans and rating.emmeans. More generally, stop using package lsmeans and change to package emmeans, its new version.
byf.normhist was not very useful and byf.hist does nearly the same job.
cor.sparse is replaced by the more generic MVA.plot.
CvM.test did not actually perform a Cramer-von Mises test but an alternative Cramer test. Use cramer.test from package cramer directly, on which CvM.test was based.
DA.confusion and DA.valid are replaced by the more generic MVA.cmv and MVA.cv.
DA.var is replaced by the more generic MVA.synt.
dunn.test is not very useful, prefer dunnTest from package FSA.
fc.multcomp is not useful anymore since emtrends (package emmeans) does the same job in a much more powerful manner (see argument var of emtrends).
friedman.rating.test, kruskal.rating.test, wilcox.rating.test, wilcox.rating.signtest, pairwise.wilcox.rating.test and wilcox.paired.rating.multcomp can be problematic with ratings (in which ties and zeroes are very frequent). The use of CLM(M)s (via clm and clmm) is recommended.
pairwise.manova is not useful anymore since emmeans (package emmeans) does the same job in a much more powerful manner (on "mlm" objects, created by lm and not manova).
pairwise.to.groups was not very useful.
plot1comp.ind, plot1comp.var, s.corcircle2, scat.mix.categorical, scat.mix.numeric and scatter.coa2 are replaced by the more generic MVA.plot.
PLSDA.ncomp was not really useful and mvr does nearly the same job.
PLSDA.test is replaced by the more generic MVA.test.
Performs cross validation with DIABLO (block.plsda or block.splsda).
DIABLO.cv(x, method = c("mahalanobis.dist", "max.dist", "centroids.dist"),
  validation = c("Mfold", "loo"), k = 7, repet = 10, ...)
x |
an object of class |
method |
criterion used to predict class membership. See |
validation |
a character giving the kind of (internal) validation to use. See |
k |
an integer giving the number of folds (can be re-set internally if needed). |
repet |
an integer giving the number of times the whole procedure has to be repeated. |
... |
other arguments to pass to |
The function uses the weighted predicted classification error rate (see perf).
repet |
number of times the whole procedure was repeated. |
k |
number of folds. |
validation |
kind of validation used. |
ncomp |
number of components used. |
method |
criterion used to classify individuals of the test sets. |
NMC.mean |
mean classification error rate (based on |
NMC.se |
standard error of the classification error rate (based on |
Maxime HERVE <[email protected]>
block.plsda, block.splsda, perf
## Not run: 
require(mixOmics)
data(nutrimouse)
data <- list(gene=nutrimouse$gene,lipid=nutrimouse$lipid,Y=nutrimouse$diet)
DIABLO <- block.plsda(X=data,indY=3)
DIABLO.cv(DIABLO)
## End(Not run)
Performs a permutation significance test based on cross-validation with DIABLO (block.plsda or block.splsda).
DIABLO.test(x, method = c("mahalanobis.dist", "max.dist", "centroids.dist"),
  validation = c("Mfold", "loo"), k = 7, nperm = 999, progress = TRUE, ...)
x |
an object of class |
method |
criterion used to predict class membership. See |
validation |
a character giving the kind of (internal) validation to use. See |
k |
an integer giving the number of folds (can be re-set internally if needed). |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
... |
other arguments to pass to |
The function uses the weighted predicted classification error rate (see perf).
method |
a character string indicating the name of the test. |
data.name |
a character string giving the name of the data, plus additional information. |
statistic |
the value of the test statistics (classification error rate). |
permutations |
the number of permutations. |
p.value |
the p-value of the test. |
Maxime HERVE <[email protected]>
block.plsda, block.splsda, perf
## Not run: 
require(mixOmics)
data(nutrimouse)
data <- list(gene=nutrimouse$gene,lipid=nutrimouse$lipid,Y=nutrimouse$diet)
DIABLO <- block.plsda(X=data,indY=3)
DIABLO.test(DIABLO)
## End(Not run)
Creates a matrix of dummy responses from a factor. Needed in some multivariate analyses.
dummy(f, simplify = TRUE)
f |
vector (internally transformed into factor). |
simplify |
logical indicating if the last column of the response matrix should be removed (to avoid model overfitting). |
Maxime HERVE <[email protected]>
fac <- gl(3,5,labels=LETTERS[1:3])
dummy(fac)
Empirical logistic transformation for logistic models with data showing (quasi-)complete separation. The function is intended to be used as a link function in GLM(M)s.
elogis()
Formula from McCullagh & Nelder in their seminal book 'Generalized Linear Models'. R code from Eric Wajnberg & Jean-Sebastien Pierre.
# An example with 3 groups and complete separation (from E. Wajnberg)
tab <- data.frame(case=letters[1:3],yes=c(25,30,0),no=c(1,0,20))
tab
## Not run: 
mod <- glm(cbind(yes,no)~case,family=binomial(link=elogis()),data=tab)
# Warnings are normal
summary(mod)
## End(Not run)
Performs a Fisher's exact test for comparing response probabilities (i.e. when the response variable is a binary variable). The function is in fact a wrapper to the Fisher's exact test for count data. If the p-value of the test is significant, the function performs pairwise comparisons by using Fisher's exact tests.
fisher.bintest(formula, data, alpha = 0.05, p.method = "fdr")
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
alpha |
significance level to compute pairwise comparisons. |
p.method |
method for p-values correction. See help of |
If the response is a 0/1 variable, the probability of the '1' group is tested. In any other case, the response is transformed into a factor and the probability of the second level is tested.
Since chi-squared and G tests are approximate tests, exact tests are preferable when the number of individuals is small (200 is a reasonable minimum).
method.test |
a character string giving the name of the global test computed. |
data.name |
a character string giving the name(s) of the data. |
alternative |
a character string describing the alternative hypothesis. |
estimate |
the estimated probabilities. |
null.value |
the value of the difference in probabilities under the null hypothesis, always 0. |
p.value |
p-value of the global test. |
alpha |
significance level. |
p.adjust.method |
method for p-values correction. |
p.value.multcomp |
data frame of pairwise comparisons result. |
method.multcomp |
a character string giving the name of the test computed for pairwise comparisons. |
Maxime HERVE <[email protected]>
response <- c(0,0,0,0,0,0,1,0,0,0,0,0,1,0,1,1,1,0,0,1,1,1,1,1,1,0,0,1,1,1)
fact <- gl(3,10,labels=LETTERS[1:3])
fisher.bintest(response~fact)
Performs pairwise comparisons after a comparison of proportions or after a test for independence of 2 categorical variables, by using a Fisher's exact test.
fisher.multcomp(tab.cont, p.method = "fdr")
tab.cont |
contingency table. |
p.method |
method for p-values correction. See help of |
Since chi-squared and G tests are approximate tests, exact tests are preferable when the number of individuals is small (200 is a reasonable minimum).
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.adjust.method |
method for p-values correction. |
p.value |
table of results of pairwise comparisons. |
Maxime HERVE <[email protected]>
chisq.test, prop.test, fisher.test
# 2-column contingency table: comparison of proportions
tab.cont1 <- matrix(c(17,23,12,24,20,10),ncol=2,dimnames=list(c("Control",
  "Treatment1","Treatment2"),c("Alive","Dead")),byrow=TRUE)
fisher.test(tab.cont1)
fisher.multcomp(tab.cont1)
# 3-column contingency table: independence test
tab.cont2 <- as.table(matrix(c(25,10,12,6,15,14,9,16,9),ncol=3,dimnames=list(c("fair",
  "dark","russet"),c("blue","brown","green"))))
fisher.test(tab.cont2)
fisher.multcomp(tab.cont2)
Performs a Fligner-Policello test of the null that the medians in the two groups (samples) are the same.
fp.test(x, ...)
## Default S3 method:
fp.test(x, y, delta = 0, alternative = "two.sided", ...)
## S3 method for class 'formula'
fp.test(formula, data, subset, ...)
x |
a numeric vector of data values. |
y |
a numeric vector of data values. |
delta |
null difference in medians tested. |
alternative |
a character string specifying the alternative hypothesis, must be one of |
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
subset |
an optional vector specifying a subset of observations to be used. |
... |
further arguments to be passed to or from other methods. |
The Fligner-Policello test does not assume that the shape of the distribution is similar in the two groups, contrary to the Mann-Whitney-Wilcoxon test. However, it assumes that the distributions are symmetric.
statistic |
test statistics. |
p.value |
p-value of the test. |
alternative |
a character string describing the alternative hypothesis. |
method |
a character string indicating the name of the test. |
data.name |
a character string giving the names of the data. |
null.value |
the specified hypothesized value of the median difference. |
Maxime HERVE <[email protected]>
x <- rpois(20,3)
y <- rpois(20,5)
fp.test(x,y)
Performs a G-test for comparing response probabilities (i.e. when the response variable is a binary variable). The function is in fact a wrapper to the G-test for comparison of proportions on a contingency table. If the p-value of the test is significant, the function performs pairwise comparisons by using G-tests.
G.bintest(formula, data, alpha = 0.05, p.method = "fdr")
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
alpha |
significance level to compute pairwise comparisons. |
p.method |
method for p-values correction. See help of |
If the response is a 0/1 variable, the probability of the '1' group is tested. In any other case, the response is transformed into a factor and the probability of the second level is tested.
Since a G-test is an approximate test, an exact test is preferable when the number of individuals is small (200 is a reasonable minimum). See fisher.bintest in that case.
method.test |
a character string giving the name of the global test computed. |
data.name |
a character string giving the name(s) of the data. |
alternative |
a character string describing the alternative hypothesis. |
estimate |
the estimated probabilities. |
null.value |
the value of the difference in probabilities under the null hypothesis, always 0. |
statistic |
test statistics. |
parameter |
test degrees of freedom. |
p.value |
p-value of the global test. |
alpha |
significance level. |
p.adjust.method |
method for p-values correction. |
p.value.multcomp |
data frame of pairwise comparisons result. |
method.multcomp |
a character string giving the name of the test computed for pairwise comparisons. |
Maxime HERVE <[email protected]>
response <- c(rep(0:1,c(40,60)),rep(0:1,c(55,45)),rep(0:1,c(65,35)))
fact <- gl(3,100,labels=LETTERS[1:3])
G.bintest(response~fact)
Performs pairwise comparisons after a global G-test.
G.multcomp(x, p.method = "fdr")
x |
numeric vector (counts). |
p.method |
method for p-values correction. See help of |
Since a G-test is an approximate test, an exact test is preferable when the number of individuals is small (200 is a reasonable minimum). See multinomial.multcomp in that case.
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.adjust.method |
method for p-values correction. |
p.value |
table of results. |
Maxime HERVE <[email protected]>
G.test, multinomial.test, multinomial.multcomp
counts <- c(49,30,63,59)
G.test(counts)
G.multcomp(counts)
Performs a G-test on a contingency table or a vector of counts.
G.test(x, p = rep(1/length(x), length(x)))
x |
a numeric vector or matrix (see Details). |
p |
theoretical proportions (optional). |
If x is a matrix, it must be constructed like this:
- 2 columns giving the number of successes (left) and failures (right)
- 1 row per population.
The function works as chisq.test:
- if x is a vector and theoretical proportions are not given, equality of counts is tested
- if x is a vector and theoretical proportions are given, equality of counts to theoretical counts (given by theoretical proportions) is tested
- if x is a matrix with two columns, equality of the proportion of successes between populations is tested
- if x is a matrix with more than two columns, independence of rows and columns is tested.
Since a G-test is an approximate test, an exact test is preferable when the number of individuals is small (200 is a reasonable minimum). See multinomial.test in that case with a vector, fisher.test with a matrix.
method |
name of the test. |
statistic |
test statistics. |
parameter |
test degrees of freedom. |
p.value |
p-value. |
data.name |
a character string giving the name(s) of the data. |
observed |
the observed counts. |
expected |
the expected counts under the null hypothesis. |
Maxime HERVE <[email protected]>
chisq.test, multinomial.test, fisher.test
G.multcomp, G.theo.multcomp, pairwise.G.test
counts <- c(49,30,63,59)
G.test(counts)
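A sketch of the two-column (successes/failures) case described above, with made-up counts:
mat <- matrix(c(40,60,55,45,65,35), ncol = 2, byrow = TRUE,
  dimnames = list(LETTERS[1:3], c("success", "failure")))
# Tests equality of the proportion of successes between the 3 populations
G.test(mat)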
Performs pairwise comparisons after a global G-test for given probabilities.
G.theo.multcomp(x, p = rep(1/length(x), length(x)), p.method = "fdr")
x |
numeric vector (counts). |
p |
theoretical proportions. |
p.method |
method for p-values correction. See help of |
Since a G-test is an approximate test, an exact test is preferable when the number of individuals is small (200 is a reasonable minimum). See multinomial.theo.multcomp in that case.
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
observed |
observed counts. |
expected |
expected counts. |
p.adjust.method |
method for p-values correction. |
statistic |
statistics of each test. |
p.value2 |
corrected p-values. |
p.value |
data frame of results. |
Maxime HERVE <[email protected]>
G.test, multinomial.test, multinomial.theo.multcomp
counts <- c(49,30,63,59)
p.theo <- c(0.2,0.1,0.45,0.25)
G.test(counts,p=p.theo)
G.theo.multcomp(counts,p=p.theo)
Performs a permutation significance test based on the total variance explained in a Generalized Procrustes Analysis. The function uses GPA.
GPA.test(df, group, tolerance = 10^-10, nbiteration = 200, scale = TRUE,
  nperm = 999, progress = TRUE)
df |
a data frame with n rows (individuals) and p columns (quantitative variables), in which all data frames are combined. |
group |
a vector indicating the number of variables in each group (i.e. data frame). |
tolerance |
a threshold with respect to which the algorithm stops, i.e. when the difference between the GPA loss function at step n and n+1 is less than |
nbiteration |
the maximum number of iterations until the algorithm stops. |
scale |
logical, if |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
Rows of each data frame are randomly and independently permuted.
The function deals with the limited floating point precision, which can bias the calculation of p-values based on a discrete test statistic distribution.
method |
a character string indicating the name of the test. |
data.name |
a character string giving the name(s) of the data, plus additional information. |
statistic |
the value of the test statistics. |
permutations |
the number of permutations. |
p.value |
the p-value of the test. |
Maxime HERVE <[email protected]>
Wakeling IN, Raats MM and MacFie HJH (1992) A new significance test for consensus in Generalized Procrustes Analysis. Journal of Sensory Studies 7:91-96.
require(FactoMineR)
data(wine)
## Not run: GPA.test(wine[,-(1:2)],group=c(5,3,10,9,2))
Computes the difference in regression parameters when each individual is dropped, expressed as a proportion of the whole-regression coefficients. The function deals with lm (including glm) and least.rect models.
ind.contrib(model, print.diff = FALSE, graph = TRUE, warning=25)
model |
model (of class |
print.diff |
logical. If |
graph |
logical. If |
warning |
level of graphical warning. |
coefficients |
coefficients of each computed regression. |
coefficients.diff |
difference in coefficients between each computed regression and the whole regression. |
coefficients.prop |
difference in coefficients expressed in proportion of the whole regression coefficients. |
Maxime HERVE <[email protected]>
x <- 1:30
y <- 1:30+rnorm(30,0,4)
model1 <- lm(y~x)
model2 <- least.rect(y~x)
ind.contrib(model1)
ind.contrib(model2)
Fits a least rectangle linear regression, possibly for each level of a factor.
least.rect(formula, data, conf.level = 0.95, theo = 1, adj = TRUE)
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
conf.level |
confidence level. |
theo |
theoretical value of the slope. If several regressions are fitted, the same value is used for all comparisons of slope vs. theoretical value. |
adj |
logical indicating if, in case of several regressions fitted, confidence intervals and p-values should be Bonferroni-corrected for multiple testing. |
coefficients |
regression parameters. |
residuals |
residuals. |
fitted.values |
fitted values. |
call |
the matched call. |
model |
the model frame used. |
conf.level |
confidence level. |
conf.int |
confidence interval of regression parameters. |
theo |
theoretical value of the slope. |
comp |
data frame of results for equality of the slope(s) to the theoretical value. |
corr |
data frame of results for the significance of the correlation coefficient(s). |
multiple |
logical, |
adj |
logical, |
Maxime HERVE <[email protected]>
x <- 1:30+rnorm(30,0,3)
y <- 1:30+rnorm(30,0,3)
regression1 <- least.rect(y~x)
summary(regression1)
x2 <- c(1:30,1:30)
y2 <- c(1:30+rnorm(30,0,3),seq(10,22,12/29)+rnorm(30,0,3))
fact <- gl(2,30,labels=LETTERS[1:2])
regression2 <- least.rect(y2~x2|fact)
summary(regression2)
Returns the slope of a line defined by selecting two points on a graph.
loc.slp()
Maxime HERVE <[email protected]>
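A minimal interactive sketch (to be run on an open graphics device; the data are arbitrary):
## Not run: 
plot(1:10, (1:10)^2, type = "l")
# Click two points on the line; the slope between them is returned
slp <- loc.slp()
slp
## End(Not run)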
Cuts the data into intervals, computes the response probability and its standard error for each interval, and adds the results to the regression curve. No test is performed but this gives a graphical idea of how well the model fits the data.
logis.fit(model, int = 5, ...)
model |
|
int |
number of intervals. |
... |
Maxime HERVE <[email protected]>
x <- 1:50
y <- c(rep(0,18),sample(0:1,14,replace=TRUE),rep(1,18))
model <- glm(y~x,family=binomial)
plot(x,y)
lines(x,model$fitted)
logis.fit(model)
Adds some noise to the fitted values of a glm model to create a nls model for logistic regression (creating a nls model from exact fitted values cannot be done, see the help of nls).
logis.noise(model, intensity = 25)
model |
|
intensity |
intensity of the noise: the lower the value, the bigger the noise. |
Maxime HERVE <[email protected]>
x <- 1:50
y <- c(rep(0,18),sample(0:1,14,replace=TRUE),rep(1,18))
model <- glm(y~x,family=binomial)
y2 <- logis.noise(model)
# Then model2 <- nls(y2~SSlogis(...))
Computes the mode of a vector. The function distinguishes between continuous and discontinuous variables (the latter made up of integers only). By extension, it also gives the most frequent value in a character vector or a factor.
mod(x)
x |
vector (numeric, character or factor). |
Maxime HERVE <[email protected]>
# Continuous variable
x <- rnorm(100)
mod(x)
# Discontinuous variable
y <- rpois(100,2)
mod(y)
# Character vector
z <- sample(LETTERS[1:3],20,replace=TRUE)
mod(z)
Performs a Mood's median test to compare medians of independent samples.
mood.medtest(x, ...)
## Default S3 method:
mood.medtest(x, g, exact = NULL, ...)
## S3 method for class 'formula'
mood.medtest(formula, data, subset, ...)
x |
a numeric vector of data values. |
g |
a vector or factor object giving the group for the corresponding elements of |
exact |
a logical indicating whether an exact p-value should be computed. |
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
subset |
an optional vector specifying a subset of observations to be used. |
... |
further arguments to be passed to or from other methods. |
If exact=NULL
, a Fisher's exact test is used if the number of data values is < 200; otherwise a chi-square test is used, with Yates continuity correction if necessary.
method |
a character string indicating the name of the test. |
data.name |
a character string giving the name(s) of the data. |
statistic |
the value of the chi-squared test statistic (in the case of a chi-squared test). |
parameter |
the degrees of freedom of the approximate chi-squared distribution of the test statistic (in the case of a chi-squared test). |
p.value |
the p-value of the test. |
Maxime HERVE <[email protected]>
set.seed(1716)
response <- c(rnorm(10,3,1.5),rnorm(10,5.5,2))
fact <- gl(2,10,labels=LETTERS[1:2])
mood.medtest(response~fact)
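As a variant of the example above, the chi-squared approximation can be requested explicitly; this is only a sketch, exact=FALSE being passed through the formula interface to the default method:

mood.medtest(response~fact,exact=FALSE)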
Draws a QQ-plot to assess multivariate normality.
mqqnorm(x, main = "Multi-normal Q-Q Plot")
x |
a data frame or a matrix of numeric variables (each column giving a variable). |
main |
title of the graph. |
Maxime HERVE <[email protected]>
x <- 1:30+rnorm(30)
y <- 1:30+rnorm(30,1,3)
mqqnorm(cbind(x,y))
Performs a Shapiro-Wilk test to assess multivariate normality. This is a slightly modified copy of the mshapiro.test
function of the package mvnormtest, for internal convenience.
mshapiro.test(x)
x |
a data frame or a matrix of numeric variables (each column giving a variable). |
method |
name of the test. |
data.name |
a character string giving the names of the data. |
statistic |
test statistics of the test. |
p.value |
p-value of the test. |
Maxime HERVE <[email protected]> from the work of Slawomir Jarek
x <- 1:30+rnorm(30)
y <- 1:30+rnorm(30,1,3)
mshapiro.test(cbind(x,y))
Performs pairwise comparisons after a global exact multinomial test. These comparisons are performed using exact binomial tests.
multinomial.multcomp(x, p.method = "fdr")
x |
numeric vector (counts). Can also be a factor; in that case |
p.method |
method for p-values correction. See help of |
Since chi-squared and G tests are approximate tests, exact tests are preferable when the number of individuals is small (approximate tests typically require at least 200 individuals).
An exact multinomial test with two groups is strictly the same as an exact binomial test.
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.adjust.method |
method for p-values correction. |
p.value |
table of results. |
Maxime HERVE <[email protected]>
counts <- c(5,15,23)
multinomial.test(counts)
multinomial.multcomp(counts)
Performs an exact multinomial test on a vector of counts.
multinomial.test(x, p = rep(1/length(x), length(x)))
x |
numeric vector (counts). Can also be a factor; in that case |
p |
theoretical proportions (optional). |
The function works as chisq.test
or G.test
:
- if theoretical proportions are not given, equality of counts is tested
- if theoretical proportions are given, equality of counts to theoretical counts (given by theoretical proportions) is tested.
Since chi-squared and G tests are approximate tests, exact tests are preferable when the number of individuals is small (approximate tests typically require at least 200 individuals).
Be aware that the calculation time increases with the number of individuals (i.e. the sum of x
) and the number of groups (i.e. the length of x
).
An exact multinomial test with two groups is strictly the same as an exact binomial test.
method |
name of the test. |
p.value |
p-value. |
data.name |
a character string giving the name(s) of the data. |
observed |
the observed counts. |
expected |
the expected counts under the null hypothesis. |
Maxime HERVE <[email protected]> based on multinomial.test
chisq.test
, G.test
, binom.test
, multinomial.multcomp
, multinomial.theo.multcomp
counts <- c(5,15,23)
multinomial.test(counts)
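To illustrate the two-group equivalence mentioned in the Details, a minimal sketch with made-up counts (both calls should return the same p-value):

counts2 <- c(10,15)
multinomial.test(counts2)
binom.test(10,25)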
Performs pairwise comparisons after a global exact multinomial test for given probabilities. These comparisons are performed using exact binomial tests.
multinomial.theo.multcomp(x, p = rep(1/length(x), length(x)), prop = FALSE, p.method = "fdr")
x |
numeric vector (counts). Can also be a factor; in that case |
p |
theoretical proportions. |
prop |
logical indicating if results should be printed as counts ( |
p.method |
method for p-values correction. See help of |
Since chi-squared and G tests are approximate tests, exact tests are preferable when the number of individuals is small (approximate tests typically require at least 200 individuals).
An exact multinomial test with two groups is strictly the same as an exact binomial test.
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
observed |
observed counts. |
expected |
expected counts. |
p.adjust.method |
method for p-values correction. |
p.value2 |
corrected p-values. |
p.value |
data frame of results. |
Maxime HERVE <[email protected]>
counts <- c(5,15,23)
p.theo <- c(0.2,0.5,0.3)
multinomial.test(counts,p=p.theo)
multinomial.theo.multcomp(counts,p=p.theo)
Performs correlation tests between one variable and a series of other variables, and corrects p-values.
multtest.cor(mult.var, uni.var, method = "pearson", p.method = "fdr", ordered = TRUE)
## S3 method for class 'multtest.cor'
plot(x, arrows = TRUE, main = NULL, pch = 16, cex = 1, col = c("red", "orange", "black"), labels = NULL, ...)
mult.var |
data frame containing a series of numeric variables. |
uni.var |
numeric variable (vector). |
method |
a character string indicating which correlation coefficient is to be computed. See help of |
p.method |
method for p-values correction. See help of |
ordered |
logical indicating if variables should be ordered based on correlation values. |
x |
object returned from |
arrows |
logical indicating if arrows should be plotted. If |
main |
optional title of the graph. |
pch |
symbol(s) used for points, when points are displayed (see |
cex |
size of points and labels (see help of |
col |
vector of three colors: first for variables with P < 0.05, second for variables with 0.05 < P < 0.1, third for variables with P > 0.1. Recycled if only one value. |
labels |
names of the variables. If |
... |
not used. |
Maxime HERVE <[email protected]>
data(iris)
# Original coordinates
plot(Petal.Length~Sepal.Length,pch=16,col=as.numeric(iris$Species),data=iris)
# New axis
abline(-6,1.6)
# Coordinates on new axis
new.coord <- coord.proj(iris[,c("Sepal.Length","Petal.Length")],1.6)
# Correlation between the whole dataset and new coordinates
mult.cor <- multtest.cor(iris[,1:4],new.coord)
plot(mult.cor)
Performs group comparisons for multiple variables using parametric, permutational or rank tests, and corrects p-values. It also gives group means and standard errors for each variable.
multtest.gp(tab, fac, test = c("param", "perm", "rank"), transform = c("none", "sqrt", "4rt", "log"), add = 0, p.method = "fdr", ordered = TRUE, ...)
## S3 method for class 'multtest.gp'
plot(x, signif = FALSE, alpha = 0.05, vars = NULL, xlab = "Group", ylab = "Mean (+/- SE) value", titles = NULL, groups = NULL, ...)
tab |
data frame containing response variables. |
fac |
factor defining groups to compare. |
test |
type of test to use: parametric (default), permutational (non parametric) or rank-based (non parametric). See Details. |
transform |
transformation to apply to response variables before testing (none by default; |
add |
value to add to response variables before a log-transformation. |
p.method |
method for p-values correction. See help of |
ordered |
logical indicating if variables should be ordered based on p-values. |
x |
object returned from |
signif |
logical indicating if only variables with significant P-value should be plotted. |
alpha |
significance threshold. |
vars |
numeric vector giving variables to plot (rows of |
xlab |
legend of the x axis. |
ylab |
legend of the y axis. |
titles |
titles of the graphs (name of the variables by default). |
groups |
names of the bars (levels of |
... |
additional arguments to testing functions in |
For parametric tests, t-tests or ANOVAs are used depending on the number of groups (two or more than two, respectively). For permutational tests, permutational t-tests or permutational ANOVAs are used; for rank-based tests, Mann-Whitney-Wilcoxon or Kruskal-Wallis tests.
Maxime HERVE <[email protected]>
data(iris)
mult <- multtest.gp(iris[,1:4],iris$Species)
plot(mult)
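As a variant of the example above, using the rank-based tests described in the Details (Kruskal-Wallis tests here, since iris$Species has three levels); a minimal sketch only:

mult.rank <- multtest.gp(iris[,1:4],iris$Species,test="rank")
plot(mult.rank)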
Performs group comparisons for multiple binary variables using a parametric or an exact test, and corrects p-values. It also gives group proportions and standard errors for each variable.
multtest.gp.bin(tab, fac, test = c("LRT", "Fisher"), p.method = "fdr", ordered = TRUE, ...)
## S3 method for class 'multtest.gp.bin'
plot(x, signif = FALSE, alpha = 0.05, vars = NULL, xlab = "Group", ylab = "Mean (+/- SE) proportion", titles = NULL, groups = NULL, ...)
tab |
data frame containing response variables. |
fac |
factor defining groups to compare. |
test |
type of test to use: likelihood ratio test based on binomial GLM ( |
p.method |
method for p-values correction. See help of |
ordered |
logical indicating if variables should be ordered based on p-values. |
x |
object returned from |
signif |
logical indicating if only variables with significant P-value should be plotted. |
alpha |
significance threshold. |
vars |
numeric vector giving variables to plot (rows of |
xlab |
legend of the x axis. |
ylab |
legend of the y axis. |
titles |
titles of the graphs (name of the variables by default). |
groups |
names of the bars (levels of |
... |
additional arguments to testing functions in |
Maxime HERVE <[email protected]>
multtest.gp
, Anova
, fisher.test
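No example is given above; the following minimal sketch uses simulated binary responses (the data, variable names and probabilities are purely illustrative):

set.seed(42)
fac <- gl(2,20,labels=LETTERS[1:2])
tab <- data.frame(resp1=rbinom(40,1,ifelse(fac=="A",0.2,0.7)),
                  resp2=rbinom(40,1,0.5),
                  resp3=rbinom(40,1,ifelse(fac=="A",0.8,0.4)))
res <- multtest.gp.bin(tab,fac,test="LRT")
res
plot(res)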
This function is a wrapper to anova.cca(...,by="terms")
but performs type II tests (whereas anova.cca
performs type I).
MVA.anova(object, ...)
object |
|
... |
additional arguments to |
See anova.cca
for detailed explanation of what is done. The only difference with anova.cca
is that MVA.anova
performs type II tests instead of type I.
See example of adonis.II
for the difference between type I (sequential) and type II tests.
a data frame of class "anova"
.
Maxime HERVE <[email protected]>
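No example is given above; a minimal sketch with the vegan dune data (the model formula is only illustrative):

require(vegan)
data(dune)
data(dune.env)
RDA <- rda(dune~Management+A1,data=dune.env)
MVA.anova(RDA)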
Displays a biplot of a multivariate analysis. This simply consists of superimposing a score plot and a correlation circle (plus centroids of factor levels in constrained analyses, RDA or CCA). The correlation circle is adjusted to fit the size of the score plot.
MVA.biplot(x, xax = 1, yax = 2, scaling = 2, sco.set = c(12, 1, 2), cor.set = c(12, 1, 2), space = 1, ratio = 0.9, weights = 1, constraints = c("nf", "n", "f", NULL), sco.args = list(), cor.args = list(), f.col = 1, f.cex = 1)
x |
a multivariate analysis (see Details). |
xax |
the horizontal axis. |
yax |
the vertical axis. |
scaling |
type of scaling (see |
sco.set |
scores to be displayed, when several sets are available (see |
cor.set |
correlations to be displayed, when several sets are available (see |
space |
space to use, when several are available (see |
ratio |
constant for adjustment of correlations to the size of the score plot ( |
weights |
only used with constrained analyses (RDA or CCA) where some constraints are factors. Individual weights, used to calculate barycenter positions. |
constraints |
only used with constrained analyses (RDA or CCA). Type of constraints to display: quantitative ( |
sco.args |
list containing optional arguments to pass to |
cor.args |
list containing optional arguments to pass to |
f.col |
color(s) used for barycenters in case of a constraint analysis (RDA or CCA) containing factor constraint(s). Can be a unique value, a vector giving one color per constraint or a vector giving one color per barycenter (all factors confounded). |
f.cex |
size(s) used for barycenters in case of a constraint analysis (RDA or CCA) containing factor constraint(s). Can be a unique value, a vector giving one size per constraint or a vector giving one size per barycenter (all factors confounded). |
This function should not be used directly. Prefer the general MVA.plot
, to which all arguments can be passed.
All multivariate analyses covered by MVA.corplot
can be used for biplots.
Maxime HERVE <[email protected]>
require(vegan)
data(iris)
RDA <- rda(iris[,1:4]~Species,data=iris)
MVA.plot(RDA,"biplot",cor.args=list(col="purple"),ratio=0.8,f.col=c("red","green","blue"))
Performs cross model validation (2CV) with different PLS analyses.
MVA.cmv(X, Y, repet = 10, kout = 7, kinn = 6, ncomp = 8, scale = TRUE, model = c("PLSR", "CPPLS", "PLS-DA", "PPLS-DA", "PLS-DA/LDA", "PLS-DA/QDA", "PPLS-DA/LDA", "PPLS-DA/QDA"), crit.inn = c("RMSEP", "Q2", "NMC"), Q2diff = 0.05, lower = 0.5, upper = 0.5, Y.add = NULL, weights = rep(1, nrow(X)), set.prior = FALSE, crit.DA = c("plug-in", "predictive", "debiased"), ...)
X |
a data frame of independent variables. |
Y |
the dependent variable(s): numeric vector, data frame of quantitative variables or factor. |
repet |
an integer giving the number of times the whole 2CV procedure has to be repeated. |
kout |
an integer giving the number of folds in the outer loop (can be re-set internally if needed). |
kinn |
an integer giving the number of folds in the inner loop (can be re-set internally if needed). Cannot be |
ncomp |
an integer giving the maximal number of components to be tested in the inner loop (can be re-set depending on the size of the train sets). |
scale |
logical indicating if data should be scaled (see Details). |
model |
the model to be fitted (see Details). |
crit.inn |
the criterion to be used to choose the number of components in the inner loop. Root Mean Square Error of Prediction ( |
Q2diff |
the threshold to be used if the number of components is chosen according to Q2. The next component is added only if it makes the Q2 increase more than |
lower |
a vector of lower limits for power optimisation in CPPLS or PPLS-DA (see |
upper |
a vector of upper limits for power optimisation in CPPLS or PPLS-DA (see |
Y.add |
a vector or matrix of additional responses containing relevant information about the observations, in CPPLS or PPLS-DA (see |
weights |
a vector of individual weights for the observations, in CPPLS or PPLS-DA (see |
set.prior |
only used when a second analysis (LDA or QDA) is performed. If |
crit.DA |
criterion used to predict class membership when a second analysis (LDA or QDA) is used. See |
... |
other arguments to pass to |
Cross model validation is detailed in Szymanska et al. (2012). Some more details about how this function works:
- when a discriminant analysis is used ("PLS-DA"
, "PPLS-DA"
, "PLS-DA/LDA"
, "PLS-DA/QDA"
, "PPLS-DA/LDA"
or "PPLS-DA/QDA"
), the training sets (test set itself in the inner loop, test+validation sets in the outer loop) are generated with respect to the relative proportions of the levels of Y
in the original data set (see splitf
).
- "PLS-DA"
is considered as PLS2 on a dummy-coded response. For a PLS-DA based on the CPPLS algorithm, use "PPLS-DA"
with lower
and upper
limits of the power parameters set to 0.5
.
- if a second analysis is used ("PLS-DA/LDA"
, "PLS-DA/QDA"
, "PPLS-DA/LDA"
or "PPLS-DA/QDA"
), a LDA or QDA is built on scores of the first analysis (PLS-DA or PPLS-DA) also in the inner loop. The classification error rate, based on this second analysis, is used to choose the number of components.
If scale = TRUE
, the scaling is done as follows:
- for each step of the outer loop (i.e. kout
steps), the rest set is pre-processed by centering and unit-variance scaling. Means and standard deviations of variables in the rest set are then used to scale the test set.
- for each step of the inner loop (i.e. kinn
steps), the training set is pre-processed by centering and unit-variance scaling. Means and standard deviations of variables in the training set are then used to scale the validation set.
model |
model used. |
type |
type of model used. |
repet |
number of times the whole 2CV procedure was repeated. |
kout |
number of folds in the outer loop. |
kinn |
number of folds in the inner loop. |
crit.inn |
criterion used to choose the number of components in the inner loop. |
crit.DA |
criterion used to classify individuals of the test and validation sets. |
Q2diff |
threshold used if the number of components is chosen according to Q2. |
groups |
levels of |
models.list |
list of models generated ( |
models1.list |
list of (P)PLS-DA models generated ( |
models2.list |
list of LDA/QDA models generated ( |
RMSEP |
RMSEP computed from the models used in the outer loops ( |
Q2 |
Q2 computed from the models used in the outer loops ( |
NMC |
Classification error rate computed from the models used in the outer loops ( |
confusion |
Confusion matrices computed from the models used in the outer loops ( |
pred.prob |
Probability of each individual of being of each level of |
Maxime HERVE <[email protected]>
Szymanska E, Saccenti E, Smilde AK and Westerhuis J (2012) Double-check: validation of diagnostic statistics for PLS-DA models in metabolomics studies. Metabolomics 8:S3-S16.
predict.MVA.cmv
, mvr
, lda
, qda
require(pls)
require(MASS)
# PLSR
data(yarn)
## Not run: MVA.cmv(yarn$NIR,yarn$density,model="PLSR")
# PPLS-DA coupled to LDA
data(mayonnaise)
## Not run: MVA.cmv(mayonnaise$NIR,factor(mayonnaise$oil.type),model="PPLS-DA/LDA",crit.inn="NMC")
Returns correlations of a multivariate analysis.
MVA.cor(x, xax = 1, yax = 2, set = c(12, 1, 2), space = 1, ...)
x |
a multivariate analysis (see Details). |
xax |
axis or axes for which to extract correlations. |
yax |
axis for which to extract correlations (ignored if |
set |
variables to be displayed, when several sets are available (see Details). |
space |
variables to be displayed, when several spaces are available (see Details). |
... |
not used. |
Many multivariate analyses are supported, from various packages:
- sPCA: spca
.
- IPCA: ipca
.
- sIPCA: sipca
.
- PLS-DA (PLS2 on a dummy-coded factor): plsda
. X space only.
- sPLS-DA (sPLS2 on a dummy-coded factor): splsda
. X space only.
- CPPLS: mvr
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set. X space only.
- PLSR: mvr
, pls
, plsR
(plsRglm package). Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set. X space only.
- sPLSR: pls
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set. X space only.
- PLS-GLR: plsRglm
(plsRglm package). Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set. Correlations are computed with Y on the link scale.
- PCR: mvr
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- CDA: discrimin
, discrimin.coa
.
- NSCOA: dudi.nsc
. For NSCOA there is no real correlation, but columns are classically represented as arrows; this is why MVA.corplot can handle this analysis.
- CCA: cca
, pcaiv
. Constraints (only quantitative constraints are extracted) in constrained space only.
- Mix analysis: dudi.mix
, dudi.hillsmith
. Only quantitative variables are displayed.
- RDA (or PCAIV): pcaiv
, pcaivortho
, rda
. With rda
, space 1 is constrained space, space 2 is unconstrained space. Only constrained space is available with pcaiv
, the opposite for pcaivortho
. Set 1 is constraints (only quantitative constraints are extracted), set 2 is dependent variables (only set 2 is available for pcaivortho
). If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- CCorA: CCorA
, rcc
. Space 1 is X, space 2 is Y. With rcc
a third space is available, in which coordinates are means of X and Y coordinates. In this third space, set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- rCCorA: rcc
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates. In space 3, set 1 is X and set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- CIA: coinertia
. Space 1 is X, space 2 is Y, space 3 is a "common" space where X and Y scores are normed. In space 3, set 1 is X and set 2 is Y. If set=12
in space 3 (default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- GPA: GPA
. Only the consensus ordination can be displayed.
- 2B-PLS: pls
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates. In space 3, set 1 is X and set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- 2B-sPLS: pls
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates. In space 3, set 1 is X and set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- rGCCA: wrapper.rgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- sGCCA: wrapper.sgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- DIABLO: block.plsda
, block.splsda
. Space can be 1 to n, the number of blocks (i.e. datasets).
Maxime HERVE <[email protected]>
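No example is given above; a minimal sketch with a vegan RDA, one of the supported analyses listed above (the mite data and the two quantitative constraints are only illustrative):

require(vegan)
data(mite)
data(mite.env)
RDA <- rda(mite~SubsDens+WatrCont,data=mite.env)
MVA.cor(RDA)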
Displays a correlation circle of a multivariate analysis.
MVA.corplot(x, xax = 1, yax = 2, thresh = 0, fac = NULL, set = c(12, 1, 2), space = 1, xlab = NULL, ylab = NULL, main = NULL, circle = TRUE, intcircle = 0.5, points = TRUE, ident = TRUE, arrows = TRUE, labels = NULL, main.pos = c("bottomleft", "topleft", "bottomright", "topright"), main.cex = 1.3, legend = FALSE, legend.pos = c("topleft", "topright", "bottomleft", "bottomright"), legend.title = NULL, legend.lab = NULL, pch = 16, cex = 1, col = 1, lwd = 1, drawintaxes = TRUE, add = FALSE, add.const = 1, keepmar = FALSE)
x |
a multivariate analysis (see Details). |
xax |
the horizontal axis. |
yax |
the vertical axis. This can be set to |
thresh |
threshold (in absolute value of the correlation coefficient) of variables to be plotted. |
fac |
an optional factor defining groups of variables. |
set |
variables to be displayed, when several sets are available (see Details). |
space |
variables to be displayed, when several spaces are available (see Details). |
xlab |
legend of the horizontal axis. If |
ylab |
only used for two-dimensional graphs. Legend of the vertical axis. If |
main |
optional title of the graph. |
circle |
only used for two-dimensional graphs. Logical indicating if the circle of radius 1 should be plotted. |
intcircle |
only used for two-dimensional graphs. Vector of one or several values indicating radii of circles to be plotted inside the main circle. Can be set to |
points |
only used for two-dimensional graphs. If |
ident |
only used for two-dimensional graphs when |
arrows |
only used if |
labels |
names of the variables. If |
main.pos |
position of the title, if |
main.cex |
size of the title, if |
legend |
only used for two-dimensional graphs. Logical indicating if a legend should be added to the graph. |
legend.pos |
position of the legend, if |
legend.title |
optional title of the legend, if |
legend.lab |
legend labels, if |
pch |
symbol(s) used for points, when points are displayed (see |
cex |
size of the points and/or of the variable names. For two-dimensional graphs: if |
col |
color(s) used for points and/or variable names. If |
lwd |
only used if arrows are displayed. Width of arrows. If |
drawintaxes |
logical indicating if internal axes should be drawn. |
add |
only used for two-dimensional graphs. Logical indicating if the correlation circle should be added to an existing graph. |
add.const |
only used for two-dimensional graphs and if |
keepmar |
only used for two-dimensional graphs. Logical indicating if margins defined by MVA.corplot should be kept after plotting (necessary in some cases when |
This function should not be used directly. Prefer the general MVA.plot
, to which all arguments can be passed.
Many multivariate analyses are supported, from various packages:
- sPCA: spca
.
- IPCA: ipca
.
- sIPCA: sipca
.
- PLS-DA (PLS2 on a dummy-coded factor): plsda
. X space only.
- sPLS-DA (sPLS2 on a dummy-coded factor): splsda
. X space only.
- CPPLS: mvr
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set. X space only.
- PLSR: mvr
, pls
, plsR
(plsRglm package). Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set. X space only.
- sPLSR: pls
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set. X space only.
- PLS-GLR: plsRglm
(plsRglm package). Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set. Correlations are computed with Y on the link scale.
- PCR: mvr
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- CDA: discrimin
, discrimin.coa
.
- NSCOA: dudi.nsc
. For NSCOA there is no real correlation, but columns are classically represented as arrows; this is why MVA.corplot can handle this analysis.
- CCA: cca
, pcaiv
. Constraints (only quantitative constraints are extracted) in constrained space only.
- Mix analysis: dudi.mix
, dudi.hillsmith
. Only quantitative variables are displayed.
- RDA (or PCAIV): pcaiv
, pcaivortho
, rda
. With rda
, space 1 is constrained space, space 2 is unconstrained space. Only constrained space is available with pcaiv
, the opposite for pcaivortho
. Set 1 is constraints (only quantitative constraints are extracted), set 2 is dependent variables (only set 2 is available for pcaivortho
). If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- db-RDA: capscale
, dbrda
. Constraints (only quantitative constraints are extracted) in constrained space only.
- CCorA: CCorA
, rcc
. Space 1 is X, space 2 is Y. With rcc
a third space is available, in which coordinates are means of X and Y coordinates. In this third space, set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- rCCorA: rcc
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates. In space 3, set 1 is X and set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- CIA: coinertia
. Space 1 is X, space 2 is Y, space 3 is a "common" space where X and Y scores are normed. In space 3, set 1 is X and set 2 is Y. If set=12
in space 3 (default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- PCIA: procuste
. Set 1 is X, set 2 is Y.
- 2B-PLS: pls
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates. In space 3, set 1 is X and set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- 2B-sPLS: pls
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates. In space 3, set 1 is X and set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
, lwd
can be defined differently for each set.
- rGCCA: wrapper.rgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- sGCCA: wrapper.sgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- DIABLO: block.plsda
, block.splsda
. Space can be 1 to n, the number of blocks (i.e. datasets).
Maxime HERVE <[email protected]>
require(ade4)
data(olympic)
PCA <- dudi.pca(olympic$tab,scannf=FALSE)
MVA.plot(PCA,"corr")
Performs cross validation with different PLS and/or discriminant analyses.
MVA.cv(X, Y, repet = 10, k = 7, ncomp = 8, scale = TRUE, model = c("PLSR", "CPPLS", "PLS-DA", "PPLS-DA", "LDA", "QDA", "PLS-DA/LDA", "PLS-DA/QDA", "PPLS-DA/LDA", "PPLS-DA/QDA"), lower = 0.5, upper = 0.5, Y.add = NULL, weights = rep(1, nrow(X)), set.prior = FALSE, crit.DA = c("plug-in", "predictive", "debiased"), ...)
X |
a data frame of independent variables. |
Y |
the dependent variable(s): numeric vector, data frame of quantitative variables or factor. |
repet |
an integer giving the number of times the whole procedure has to be repeated. |
k |
an integer giving the number of folds (can be re-set internally if needed). |
ncomp |
an integer giving the number of components to be used for all models except LDA and QDA (can be re-set depending on the size of the train sets). |
scale |
logical indicating if data should be scaled (see Details). |
model |
the model to be fitted (see Details). |
lower |
a vector of lower limits for power optimisation in CPPLS or PPLS-DA (see |
upper |
a vector of upper limits for power optimisation in CPPLS or PPLS-DA (see |
Y.add |
a vector or matrix of additional responses containing relevant information about the observations, in CPPLS or PPLS-DA (see |
weights |
a vector of individual weights for the observations, in CPPLS or PPLS-DA (see |
set.prior |
only used when a LDA or QDA is performed (coupled or not with a PLS model). If |
crit.DA |
criterion used to predict class membership when a LDA or QDA is used. See |
... |
other arguments to pass to |
When a discriminant analysis is used ("PLS-DA"
, "PPLS-DA"
, "LDA"
, "QDA"
, "PLS-DA/LDA"
, "PLS-DA/QDA"
, "PPLS-DA/LDA"
or "PPLS-DA/QDA"
), the training sets are generated with respect to the relative proportions of the levels of Y
in the original data set (see splitf
).
"PLS-DA"
is considered as PLS2 on a dummy-coded response. For a PLS-DA based on the CPPLS algorithm, use "PPLS-DA"
with lower
and upper
limits of the power parameters set to 0.5
.
If scale = TRUE
, the scaling is done as follows: for each step of the validation loop (i.e. k
steps), the training set is pre-processed by centering and unit-variance scaling. Means and standard deviations of variables in the training set are then used to scale the test set.
model |
model used. |
type |
type of model used. |
repet |
number of times the whole procedure was repeated. |
k |
number of folds. |
ncomp |
number of components used. |
crit.DA |
criterion used to classify individuals of the test sets. |
groups |
levels of |
models.list |
list of models generated ( |
models1.list |
list of (P)PLS-DA models generated ( |
models2.list |
list of LDA/QDA models generated ( |
RMSEP |
RMSEP values ( |
Q2 |
Q2 values ( |
NMC |
Classification error rates ( |
confusion |
Confusion matrices ( |
pred.prob |
Probability of each individual of being of each level of |
Maxime HERVE <[email protected]>
predict.MVA.cmv
, mvr
, lda
, qda
require(pls)
require(MASS)
# PLSR
data(yarn)
## Not run: MVA.cv(yarn$NIR,yarn$density,model="PLSR")
# PPLS-DA coupled to LDA
data(mayonnaise)
## Not run: MVA.cv(mayonnaise$NIR,factor(mayonnaise$oil.type),model="PPLS-DA/LDA")
Returns loadings of a multivariate analysis.
MVA.load(x, xax = 1, yax = 2, set = c(12, 1, 2), space = 1, ...)
x |
a multivariate analysis (see Details). |
xax |
axis or axes for which to extract loadings. |
yax |
axis for which to extract loadings (ignored if |
set |
variables to be displayed, when several sets are available (see Details). |
space |
variables to be displayed, when several spaces are available (see Details). |
... |
not used. |
Many multivariate analyses are supported, from various packages:
- PCA: prcomp
, princomp
, dudi.pca
, rda
, pca
, pca
.
- sPCA: spca
.
- IPCA: ipca
.
- sIPCA: sipca
.
- PLS-DA (PLS2 on a dummy-coded factor): plsda
. X space only.
- sPLS-DA (sPLS2 on a dummy-coded factor): splsda
. X space only.
- CPPLS: mvr
. X space only.
- PLSR: mvr
, pls
, plsR
(plsRglm package). X space only.
- sPLSR: pls
. X space only.
- PLS-GLR: plsRglm
(plsRglm package).
- PCR: mvr
.
- CDA: discrimin
, discrimin.coa
.
- NSCOA: dudi.nsc
.
- MCA: dudi.acm
.
- Mix analysis: dudi.mix
, dudi.hillsmith
.
- PCIA: procuste
. Set 1 is X, set 2 is Y.
- RDA (or PCAIV): pcaiv
, pcaivortho
, rda
. With rda
, space 1 is constrained space, space 2 is unconstrained space. Only constrained space is available with pcaiv
, the opposite for pcaivortho
.
- CCorA: rcc
. Space 1 is X, space 2 is Y.
- rCCorA: rcc
. Space 1 is X, space 2 is Y.
- CIA: coinertia
. Space 1 is X, space 2 is Y.
- 2B-PLS: pls
. Space 1 is X, space 2 is Y.
- 2B-sPLS: pls
. Space 1 is X, space 2 is Y.
- rGCCA: wrapper.rgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- sGCCA: wrapper.sgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- DIABLO: block.plsda
, block.splsda
. Space can be 1 to n, the number of blocks (i.e. datasets).
Maxime HERVE <[email protected]>
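No example is given above; a minimal sketch extracting the loadings of a PCA, one of the supported analyses listed above:

require(ade4)
data(olympic)
PCA <- dudi.pca(olympic$tab,scannf=FALSE)
MVA.load(PCA)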
Displays a loading plot of a multivariate analysis.
MVA.loadplot(x, xax = 1, yax = 2, fac = NULL, set = c(12, 1, 2), space = 1, map = TRUE, xlab = NULL, ylab = NULL, main = NULL, points = TRUE, ident = TRUE, links = TRUE, line = TRUE, labels = NULL, main.pos = c("bottomleft", "topleft","bottomright", "topright"), main.cex = 1.3, legend = FALSE, legend.pos = c("topleft", "topright", "bottomleft", "bottomright"), legend.title = NULL, legend.lab = NULL, pch = 16, cex = 1, col = 1, lwd = 1, lty = 1, drawextaxes = TRUE, drawintaxes = TRUE, xlim = NULL, ylim = NULL)
x |
a multivariate analysis (see Details). |
xax |
the horizontal axis. |
yax |
the vertical axis. This can be set to |
fac |
only used for one-dimensional graphs. An optional factor defining groups of variables. |
set |
variables to be displayed, when several sets are available (see Details). |
space |
variables to be displayed, when several spaces are available (see Details). |
map |
logical indicating if a two-dimensional ( |
xlab |
only used for two-dimensional graphs. Legend of the horizontal axis. If |
ylab |
legend of the vertical axis. If |
main |
optional title of the graph. |
points |
only used for two-dimensional graphs. If |
ident |
logical indicating if variable names should be displayed. Only used when |
links |
only used for two-dimensional graphs when |
line |
only used for one-dimensional graphs when |
labels |
only used if |
main.pos |
only used for one-dimensional graphs. Position of the title, if |
main.cex |
size of the title, if |
legend |
logical indicating if a legend should be added to the graph. |
legend.pos |
position of the legend, if |
legend.title |
optional title of the legend, if |
legend.lab |
legend labels, if |
pch |
only used for two-dimensional graphs. Symbol(s) used for points, when points are displayed (see |
cex |
size of the points and/or of the variable names. For two-dimensional graphs: if |
col |
color(s) used for points, variable names and/or lines/sticks. For one-dimensional graphs, can be a vector of length one or a vector giving one value per line. For two-dimensional graphs: if |
lwd |
width of lines. For one-dimensional graphs, can be a vector of length one or a vector giving one value per line. For two-dimensional graphs: if |
lty |
only used for one-dimensional graphs. Can be a vector of length one or a vector giving one value per line. |
drawextaxes |
logical indicating if external axes should be drawn. |
drawintaxes |
only used for two-dimensional graphs. Logical indicating if internal axes should be drawn. |
xlim |
only used in two-dimensional graphs. Limits of the horizontal axis. If |
ylim |
limits of the vertical axis. If |
This function should not be used directly. Prefer the general MVA.plot
, to which all arguments can be passed.
Many multivariate analyses are supported, from various packages:
- PCA: prcomp
, princomp
, dudi.pca
, rda
, pca
, pca
.
- sPCA: spca
.
- IPCA: ipca
.
- sIPCA: sipca
.
- PLS-DA (PLS2 on a dummy-coded factor): plsda
. X space only.
- sPLS-DA (sPLS2 on a dummy-coded factor): splsda
. X space only.
- CPPLS: mvr
. X space only.
- PLSR: mvr
, pls
, plsR
(plsRglm package). X space only.
- sPLSR: pls
. X space only.
- PLS-GLR: plsRglm
(plsRglm package).
- PCR: mvr
.
- CDA: discrimin
, discrimin.coa
.
- NSCOA: dudi.nsc
.
- MCA: dudi.acm
.
- Mix analysis: dudi.mix
, dudi.hillsmith
.
- PCIA: procuste
. Set 1 is X, set 2 is Y.
- RDA (or PCAIV): pcaiv
, pcaivortho
, rda
. With rda
, space 1 is constrained space, space 2 is unconstrained space. Only constrained space is available with pcaiv
, the opposite for pcaivortho
.
- CCorA: rcc
. Space 1 is X, space 2 is Y.
- rCCorA: rcc
. Space 1 is X, space 2 is Y.
- CIA: coinertia
. Space 1 is X, space 2 is Y.
- 2B-PLS: pls
. Space 1 is X, space 2 is Y.
- 2B-sPLS: pls
. Space 1 is X, space 2 is Y.
- rGCCA: wrapper.rgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- sGCCA: wrapper.sgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- DIABLO: block.plsda
, block.splsda
. Space can be 1 to n, the number of blocks (i.e. datasets).
Maxime HERVE <[email protected]>
require(ade4)
data(olympic)
PCA <- dudi.pca(olympic$tab,scannf=FALSE)
MVA.plot(PCA,"load")
Displays a paired plot (i.e. a score plot of paired points) of a multivariate analysis.
MVA.pairplot(x, xax = 1, yax = 2, pairs = NULL, scaling = 2, space = 1, fac = NULL, xlab = NULL, ylab = NULL, main = NULL, ident = TRUE, labels = NULL, cex = 0.7, col = 1, lwd = 1, main.pos = c("bottomleft", "topleft", "bottomright", "topright"), main.cex = 1.3, legend = FALSE, legend.pos = c("topleft", "topright", "bottomleft", "bottomright"), legend.title = NULL, legend.lab = NULL, drawextaxes = TRUE, drawintaxes = TRUE, xlim = NULL, ylim = NULL)
x |
a multivariate analysis (see Details). |
xax |
the horizontal axis. |
yax |
the vertical axis. Cannot be |
pairs |
two-level factor identifying paired individuals (in the same order in both sets of points). Can be omitted with multivariate analyses where two sets of points are available in the same space (see |
scaling |
type of scaling. Only available with some analyses performed with the |
space |
scores to be displayed, when several spaces are available (see Details of |
fac |
an optional factor defining groups of pairs. |
xlab |
legend of the horizontal axis. If |
ylab |
legend of the vertical axis. If |
main |
optional title of the graph. |
ident |
logical indicating if variable names should be displayed. |
labels |
names of the individuals. If |
cex |
size of the labels. If |
col |
color(s) used for arrows and labels. If |
lwd |
width of arrows. If |
main.pos |
position of the title, if |
main.cex |
size of the title, if |
legend |
logical indicating if a legend should be added to the graph. |
legend.pos |
position of the legend, if |
legend.title |
optional title of the legend, if |
legend.lab |
legend labels, if |
drawextaxes |
logical indicating if external axes should be drawn. |
drawintaxes |
logical indicating if internal axes should be drawn. |
xlim |
limits of the horizontal axis. If |
ylim |
limits of the vertical axis. If |
This function should not be used directly. Prefer the general MVA.plot
, to which all arguments can be passed.
All multivariate analyses supported by MVA.scoreplot
can be used for a paired plot.
Maxime HERVE <[email protected]>
require(ade4)
data(macaca)
PCIA <- procuste(macaca$xy1,macaca$xy2)
MVA.plot(PCIA,"pairs")
Displays several kinds of plots for multivariate analyses.
MVA.plot(x, type = c("scores", "loadings", "correlations", "biplot", "pairs", "trajectories"), ...)
x |
a multivariate analysis (see Details). |
type |
the type of plot to be displayed: score plot (default), loading plot, correlation circle, biplot, score plot showing paired samples or score plot showing trajectories, respectively. |
... |
arguments to be passed to subfunctions. See Details. |
Different subfunctions are used depending on the type of plot to be displayed: MVA.scoreplot
, MVA.loadplot
, MVA.corplot
, MVA.biplot
, MVA.pairplot
or MVA.trajplot
. These functions should not be used directly (everything can be done with the general MVA.plot
) but for convenience, arguments and analyses supported are detailed in separate help pages.
Warning: the use of attach
before running a multivariate analysis can prevent MVA.plot
from getting the values it needs, and make it fail.
Maxime HERVE <[email protected]>
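No example is given on this page itself; a minimal sketch showing how the type argument dispatches to the subfunctions (both plot types below are supported for a prcomp PCA according to the corresponding pages):

data(iris)
PCA <- prcomp(iris[,1:4])
MVA.plot(PCA)               # score plot (default type)
MVA.plot(PCA,"loadings")    # loading plot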
Displays a score plot of a multivariate analysis.
MVA.scoreplot(x, xax = 1, yax = 2, scaling = 2, set = c(12, 1, 2), space = 1, byfac = TRUE, fac = NULL, barycenters = TRUE, stars = TRUE, contours = FALSE, dhist = TRUE, weights = 1, xlab = NULL, ylab = NULL, main = NULL, pch = 16, cex = 1, col = 1, points = TRUE, labels = NULL, main.pos = c("bottomleft", "topleft", "bottomright", "topright"), main.cex = 1.3, fac.lab = NULL, fac.cex = 1, legend = FALSE, legend.pos = c("topleft", "topright", "bottomleft", "bottomright"), legend.title = NULL, legend.lab = NULL, legend.cex = 1, drawextaxes = TRUE, drawintaxes = TRUE, xlim = NULL, ylim = NULL, keepmar = FALSE)
x |
a multivariate analysis (see Details). |
xax |
the horizontal axis. |
yax |
the vertical axis. This can be set to |
scaling |
type of scaling. Only available with some analyses performed with the |
set |
scores to be displayed, when several sets are available (see Details). |
space |
scores to be displayed, when several spaces are available (see Details). |
byfac |
only used with MCA and mix analyses (see Details). If |
fac |
an optional factor defining groups of individuals. |
barycenters |
only used if |
stars |
only used if |
contours |
only used if |
dhist |
only used in the one-dimensional case. If |
weights |
individual weights, used to calculate barycenter positions (see |
xlab |
legend of the horizontal axis. If |
ylab |
legend of the vertical axis. If |
main |
optional title of the graph. Can be a vector of several values for MCA and mix analyses when |
pch |
symbol(s) used for points, when points are displayed (see |
cex |
size of the points or of the labels (see |
col |
color(s) used for points or labels (see |
points |
only used for two-dimensional graphs. If |
labels |
used in two-dimensional graphs when |
main.pos |
position of the title, if |
main.cex |
size of the title, if |
fac.lab |
only used if |
fac.cex |
only used if |
legend |
logical indicating if a legend should be added to the graph. Available for two-dimensional graphs and for density histograms in the one-dimensional case (see |
legend.pos |
position of the legend, if |
legend.title |
optional title of the legend, if |
legend.lab |
legend labels, if |
legend.cex |
size of legend labels, if |
drawextaxes |
logical indicating if external axes should be drawn. Available for two-dimensional graphs and for density histograms in the one-dimensional case (see |
drawintaxes |
logical indicating if internal axes should be drawn. |
xlim |
limits of the horizontal axis. If |
ylim |
only used in two-dimensional graphs. Limits of the vertical axis. If |
keepmar |
only used in two-dimensional graphs. Logical indicating if margins defined by MVA.scoreplot should be kept after plotting (necessary for biplots). |
This function should not be used directly. Prefer the general MVA.plot
, to which all arguments can be passed.
Many multivariate analyses are supported, from various packages:
- PCA: prcomp
, princomp
(if scores=TRUE
), dudi.pca
, rda
, pca
, pca
. scaling
can be defined for rda
(see scores.rda
).
- sPCA: spca
.
- IPCA: ipca
.
- sIPCA: sipca
.
- PCoA: cmdscale
(with at least one non-default argument), dudi.pco
, wcmdscale
(with at least one non-default argument), capscale
, pco
, pcoa
.
- nMDS: monoMDS
, metaMDS
, nmds
, isoMDS
.
- PLS-DA (PLS2 on a dummy-coded factor): plsda
. X space only.
- sPLS-DA (sPLS2 on a dummy-coded factor): splsda
. X space only.
- CPPLS: mvr
. X space only.
- PLSR: mvr
, pls
, plsR
(plsRglm package). X space only.
- sPLSR: pls
. X space only.
- PLS-GLR: plsRglm
(plsRglm package).
- PCR: mvr
.
- CDA: discrimin
, discrimin.coa
.
- NSCOA: dudi.nsc
.
- MCA: dudi.acm
.
- Mix analysis: dudi.mix
, dudi.hillsmith
.
- COA: dudi.coa
, cca
. Set 1 is rows, set 2 is columns. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set. scaling
can be defined for cca
(see scores.cca
).
- DCOA: dudi.dec
. Set 1 is rows, set 2 is columns. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- PCIA: procuste
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- Procrustean superimposition: procrustes
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- GPA: GPA
. Only the consensus ordination can be displayed.
- DPCoA: dpcoa
. Set 1 is categories, set 2 is collections. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- RDA (or PCAIV): pcaiv
, pcaivortho
, rda
. With rda
, space 1 is constrained space, space 2 is unconstrained space. Only constrained space is available with pcaiv
, the opposite for pcaivortho
. scaling
can be defined for rda
(see scores.rda
).
- db-RDA (or CAP): capscale
, dbrda
. Space 1 is constrained space, space 2 is unconstrained space.
- CCA: pcaiv
, cca
. With rda
, space 1 is constrained space, space 2 is unconstrained space. Only constrained space is available with pcaiv
. Set 1 is rows, set 2 is columns. scaling
can be defined for cca
(see scores.cca
).
- CCorA: CCorA
, rcc
. Space 1 is X, space 2 is Y. With rcc
a third space is available, in which coordinates are means of X and Y coordinates.
- rCCorA: rcc
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates.
- CIA: coinertia
. Space 1 is X, space 2 is Y, space 3 is a "common" space where X and Y scores are normed. In space 3, set 1 is X and set 2 is Y. If set=12
in space 3 (default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- 2B-PLS: pls
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates.
- 2B-sPLS: pls
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates.
- rGCCA: rgcca
, wrapper.rgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- sGCCA: rgcca
, wrapper.sgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- DIABLO: block.plsda
, block.splsda
. Space can be 1 to n, the number of blocks (i.e. datasets).
Maxime HERVE <[email protected]>
data(iris)
PCA <- prcomp(iris[,1:4])
MVA.plot(PCA,"scores")
MVA.plot(PCA,"scores",fac=iris$Species,col=1:3,pch=15:17)
Returns scores of a multivariate analysis.
MVA.scores(x, xax = 1, yax = 2, scaling = 2, set = c(12, 1, 2), space = 1, ...)
x |
a multivariate analysis (see Details). |
xax |
axis or axes for which to extract scores. |
yax |
axis for which to extract scores (ignored if |
scaling |
type of scaling. Only available with some analyses performed with the |
set |
scores to be displayed, when several sets are available (see Details). |
space |
scores to be displayed, when several spaces are available (see Details). |
... |
not used. |
Many multivariate analyses are supported, from various packages:
- PCA: prcomp
, princomp
(if scores=TRUE
), dudi.pca
, rda
, pca
, pca
. scaling
can be defined for rda
(see scores.rda
).
- sPCA: spca
.
- IPCA: ipca
.
- sIPCA: sipca
.
- PCoA: cmdscale
(with at least one non-default argument), dudi.pco
, wcmdscale
(with at least one non-default argument), capscale
, pco
, pcoa
.
- nMDS: monoMDS
, metaMDS
, nmds
, isoMDS
.
- PLS-DA (PLS2 on a dummy-coded factor): plsda
. X space only.
- sPLS-DA (sPLS2 on a dummy-coded factor): splsda
. X space only.
- CPPLS: mvr
. X space only.
- PLSR: mvr
, pls
, plsR
(plsRglm package). X space only.
- sPLSR: pls
. X space only.
- PLS-GLR: plsRglm
(plsRglm package).
- PCR: mvr
.
- CDA: discrimin
, discrimin.coa
.
- NSCOA: dudi.nsc
.
- MCA: dudi.acm
.
- Mix analysis: dudi.mix
, dudi.hillsmith
.
- COA: dudi.coa
, cca
. Set 1 is rows, set 2 is columns. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set. scaling
can be defined for cca
(see scores.cca
).
- DCOA: dudi.dec
. Set 1 is rows, set 2 is columns. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- PCIA: procuste
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- Procrustean superimposition: procrustes
. Set 1 is X, set 2 is Y. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- GPA: GPA
. Only the consensus ordination can be displayed.
- DPCoA: dpcoa
. Set 1 is categories, set 2 is collections. If set=12
(default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- RDA (or PCAIV): pcaiv
, pcaivortho
, rda
. With rda
, space 1 is constrained space, space 2 is unconstrained space. Only constrained space is available with pcaiv
, the opposite for pcaivortho
. scaling
can be defined for rda
(see scores.rda
).
- db-RDA (or CAP): capscale
, dbrda
. Space 1 is constrained space, space 2 is unconstrained space.
- CCA: pcaiv
, cca
. With rda
, space 1 is constrained space, space 2 is unconstrained space. Only constrained space is available with pcaiv
. Set 1 is rows, set 2 is columns. scaling
can be defined for cca
(see scores.cca
).
- CCorA: CCorA
, rcc
. Space 1 is X, space 2 is Y. With rcc
a third space is available, in which coordinates are means of X and Y coordinates.
- rCCorA: rcc
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates.
- CIA: coinertia
. Space 1 is X, space 2 is Y, space 3 is a "common" space where X and Y scores are normed. In space 3, set 1 is X and set 2 is Y. If set=12
in space 3 (default), fac
is not available and pch
,cex
, col
can be defined differently for each set.
- 2B-PLS: pls
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates.
- 2B-sPLS: pls
. Space 1 is X, space 2 is Y, space 3 is a "common" space in which coordinates are means of X and Y coordinates.
- rGCCA: rgcca
, wrapper.rgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- sGCCA: rgcca
, wrapper.sgcca
. Space can be 1 to n, the number of blocks (i.e. datasets).
- DIABLO: block.plsda
, block.splsda
. Space can be 1 to n, the number of blocks (i.e. datasets).
Maxime HERVE <[email protected]>
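A minimal usage sketch (not part of the original manual; the structure of the returned object is best inspected with str()), extracting the scores of a PCA:

data(iris)
PCA <- prcomp(iris[,1:4])
sc <- MVA.scores(PCA, xax = 1, yax = 2)   # row scores on axes 1 and 2
str(sc)                                    # inspect the returned structure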
Gives a simple estimator of the quality of the (descriptive) synthesis performed by a wide range of multivariate analyses.
MVA.synt(x, rows = 5)
MVA.synt(x, rows = 5)
x |
a multivariate analysis (see Details). |
rows |
maximum number of axes to print in the output. |
Many multivariate analyses are supported, from various packages.
- PCA: prcomp
, princomp
, dudi.pca
, rda
, pca
, pca
: % of total variance explained by each axis.
- sPCA: spca
: % of total variance explained by each axis.
- IPCA: ipca
: kurtosis of each axis.
- sIPCA: sipca
: kurtosis of each axis.
- PCoA: cmdscale
(with eig=TRUE
), dudi.pco
, wcmdscale
(with eig=TRUE
), capscale
, pco
, pcoa
: % of total variance explained by each axis.
- nMDS: monoMDS
, metaMDS
, nmds
, isoMDS
: stress.
- RDA: pcaiv
, pcaivortho
, rda
: % of constrained and unconstrained total variance, % of constrained variance explained by constrained axes (pcaiv
and rda
), % of unconstrained variance explained by unconstrained axes (pcaivortho
and rda
).
- db-RDA (or CAP): capscale
, dbrda
: % of constrained and unconstrained total variance, % of constrained variance explained by constrained axes, % of unconstrained variance explained by unconstrained axes.
- COA: dudi.coa
, cca
: % of total inertia explained by each axis.
- CCA: pcaiv
, cca
: % of constrained and unconstrained total inertia, % of constrained inertia explained by constrained axes, % of unconstrained inertia explained by unconstrained axes (cca
only).
- CPPLS: mvr
: % of X and Y variances explained by each axis.
- PLSR: mvr
, plsR
(plsRglm package): % of X and Y variances explained by each axis (only Y for the moment with plsR
).
- 2B-PLS: pls
: % of X/Y square covariance explained by each pair of axes, correlation between each pair of axes (canonical correlations).
- CCorA: CCorA
, rcc
: correlation between each pair of axes (canonical correlations).
- rCCorA: rcc
: correlation between each pair of axes (canonical correlations).
- PCR: mvr
: % of X and Y variances explained by each axis.
- MCA: dudi.acm
: % of total inertia explained by each axis.
- Mix analysis: dudi.mix
, dudi.hillsmith
: % of total inertia explained by each axis.
- GPA: GPA
: % of consensus and residual variance, % of total variance explained by each axis, % of consensus variance explained by each axis, % of residual variance coming from each group of variables.
- RGCCA: rgcca
, wrapper.rgcca
: % of total intra-block variance explained by each axis, correlation between each pair of axes (canonical correlations).
- DIABLO: block.plsda
, block.splsda
: % of total intra-block variance explained by each axis, correlation between each pair of axes (canonical correlations).
- CIA: coinertia
: RV coefficient, % of co-inertia explained by each pair of axes, correlation between each pair of axes (canonical correlations).
- PCIA: procuste
: m2.
Maxime HERVE <[email protected]>
data(iris) PCA <- prcomp(iris[,1:4]) MVA.synt(PCA)
data(iris) PCA <- prcomp(iris[,1:4]) MVA.synt(PCA)
Performs a permutation significance test based on cross (model) validation with different PLS and/or discriminant analyses. See MVA.cv
and MVA.cmv
for more details about how cross (model) validation is performed.
MVA.test(X, Y, cmv = FALSE, ncomp = 8, kout = 7, kinn = 6, scale = TRUE, model = c("PLSR", "CPPLS", "PLS-DA", "PPLS-DA", "LDA", "QDA", "PLS-DA/LDA", "PLS-DA/QDA", "PPLS-DA/LDA","PPLS-DA/QDA"), Q2diff = 0.05, lower = 0.5, upper = 0.5, Y.add = NULL, weights = rep(1, nrow(X)), set.prior = FALSE, crit.DA = c("plug-in", "predictive", "debiased"), p.method = "fdr", nperm = 999, progress = TRUE, ...)
MVA.test(X, Y, cmv = FALSE, ncomp = 8, kout = 7, kinn = 6, scale = TRUE, model = c("PLSR", "CPPLS", "PLS-DA", "PPLS-DA", "LDA", "QDA", "PLS-DA/LDA", "PLS-DA/QDA", "PPLS-DA/LDA","PPLS-DA/QDA"), Q2diff = 0.05, lower = 0.5, upper = 0.5, Y.add = NULL, weights = rep(1, nrow(X)), set.prior = FALSE, crit.DA = c("plug-in", "predictive", "debiased"), p.method = "fdr", nperm = 999, progress = TRUE, ...)
X |
a data frame of independent variables. |
Y |
the dependent variable(s): numeric vector, data frame of quantitative variables or factor. |
cmv |
a logical indicating if the values (Q2 or NMC) should be generated through cross-validation (classical K-fold process) or cross model validation (inner + outer loops). |
ncomp |
an integer giving the number of components to be used to generate all submodels (cross-validation) or the maximal number of components to be tested in the inner loop (cross model validation). Can be re-set internally if needed. Does not concern LDA and QDA. |
kout |
an integer giving the number of folds (cross-validation) or the number of folds in the outer loop (cross-model validation). Can be re-set internally if needed. |
kinn |
an integer giving the number of folds in the inner loop (cross model validation only). Can be re-set internally if needed. Cannot be |
scale |
logical indicating if data should be scaled. See help of |
model |
the model to be fitted. |
Q2diff |
the threshold to be used if the number of components is chosen according to Q2 (cross model validation only). |
lower |
a vector of lower limits for power optimisation in CPPLS or PPLS-DA (see |
upper |
a vector of upper limits for power optimisation in CPPLS or PPLS-DA (see |
Y.add |
a vector or matrix of additional responses containing relevant information about the observations, in CPPLS or PPLS-DA (see |
weights |
a vector of individual weights for the observations, in CPPLS or PPLS-DA (see |
set.prior |
only used when a LDA or QDA is performed (coupled or not with a PLS model). If |
crit.DA |
criterion used to predict class membership when a LDA or QDA is used. See |
p.method |
method for p-values correction. See help of |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
... |
other arguments to pass to |
When Y
consists of quantitative response(s), the null hypothesis is that each response is not predicted better than would be expected by chance. In this case, Q2 is used as the test statistic. When Y
contains several responses, a p-value is computed for each response and p-values are corrected for multiple testing.
When Y
is a factor, the null hypothesis is that the factor has no discriminant ability. In this case, the classification error rate (NMC) is used as the test statistic.
Whatever the response, the reference value of the test statistic is obtained by averaging 20 values coming from independently performed cross (model) validations on the original data.
The function deals with the limited floating-point precision, which can bias the calculation of p-values based on a discrete test statistic distribution.
method |
a character string indicating the name of the test. |
data.name |
a character string giving the name(s) of the data, plus additional information. |
statistic |
the value of the test statistic. |
permutations |
the number of permutations. |
p.value |
the p-value of the test. |
p.adjust.method |
a character string giving the method for p-values correction. |
Maxime HERVE <[email protected]>
Westerhuis J, Hoefsloot HCJ, Smit S, Vis DJ, Smilde AK, van Velzen EJJ, van Duijnhoven JPM and van Dorsten FA (2008) Assessment of PLSDA cross validation. Metabolomics 4:81-89.
require(pls) require(MASS) # PLSR data(yarn) ## Not run: MVA.test(yarn$NIR,yarn$density,cmv=TRUE,model="PLSR") # PPLS-DA coupled to LDA data(mayonnaise) ## Not run: MVA.test(mayonnaise$NIR,factor(mayonnaise$oil.type),model="PPLS-DA/LDA")
require(pls) require(MASS) # PLSR data(yarn) ## Not run: MVA.test(yarn$NIR,yarn$density,cmv=TRUE,model="PLSR") # PPLS-DA coupled to LDA data(mayonnaise) ## Not run: MVA.test(mayonnaise$NIR,factor(mayonnaise$oil.type),model="PPLS-DA/LDA")
Displays a trajectory plot (i.e. a score plot with trajectories linking defined points) of a multivariate analysis.
MVA.trajplot(x, xax = 1, yax = 2, trajects, trajlab = NULL, scaling = 2, set = c(12, 1, 2), space = 1, xlab = NULL, ylab = NULL, main = NULL, pch = 16, cex = 1, trajlab.cex = 1, col = 1, lwd = 1, lty = 1, points = TRUE, allpoints = TRUE, arrows = TRUE, labels = NULL, main.pos = c("bottomleft", "topleft", "bottomright", "topright"), main.cex = 1.3, legend = FALSE, legend.pos = c("topleft", "topright", "bottomleft", "bottomright"), legend.title = NULL, legend.lab = NULL, legend.cex = 1, drawextaxes = TRUE, drawintaxes = TRUE, xlim = NULL, ylim = NULL)
MVA.trajplot(x, xax = 1, yax = 2, trajects, trajlab = NULL, scaling = 2, set = c(12, 1, 2), space = 1, xlab = NULL, ylab = NULL, main = NULL, pch = 16, cex = 1, trajlab.cex = 1, col = 1, lwd = 1, lty = 1, points = TRUE, allpoints = TRUE, arrows = TRUE, labels = NULL, main.pos = c("bottomleft", "topleft", "bottomright", "topright"), main.cex = 1.3, legend = FALSE, legend.pos = c("topleft", "topright", "bottomleft", "bottomright"), legend.title = NULL, legend.lab = NULL, legend.cex = 1, drawextaxes = TRUE, drawintaxes = TRUE, xlim = NULL, ylim = NULL)
x |
a multivariate analysis (see Details). |
xax |
the horizontal axis. |
yax |
the vertical axis. Cannot be |
trajects |
vector or list of vectors identifying trajectories. Each vector should give the number of the individuals to be linked, ordered from the first to the last one. |
trajlab |
optional traject labels. |
scaling |
type of scaling. Only available with some analyses performed with the |
set |
scores to be displayed, when several sets are available (see Details of |
space |
scores to be displayed, when several spaces are available (see Details of |
xlab |
legend of the horizontal axis. If |
ylab |
legend of the vertical axis. If |
main |
optional title of the graph. |
pch |
symbols used for points. Can be a vector giving one value per trajectory (and a last one for non-linked points if |
cex |
size of the labels. Can be a vector giving one value per trajectory (and a last one for non-linked points if |
trajlab.cex |
size of trajectory labels. Can be a vector giving one value per trajectory. |
col |
color(s) used for arrows and labels. If |
lwd |
width of trajectory segments. Can be a vector giving one value per trajectory. |
lty |
type of trajectory segments. Can be a vector giving one value per trajectory. |
points |
logical indicating if points should be displayed. If |
allpoints |
logical indicating if points which do not belong to any trajectory should be drawn. |
arrows |
logical indicating if trajectories should be oriented with arrows. |
labels |
names of the individuals. If |
main.pos |
position of the title, if |
main.cex |
size of the title, if |
legend |
logical indicating if a legend should be added to the graph. |
legend.pos |
position of the legend, if |
legend.title |
optional title of the legend, if |
legend.lab |
legend labels, if |
legend.cex |
size of legend labels, if |
drawextaxes |
logical indicating if external axes should be drawn. |
drawintaxes |
logical indicating if internal axes should be drawn. |
xlim |
limits of the horizontal axis. If |
ylim |
limits of the vertical axis. If |
This function should not be used directly. Prefer the general MVA.plot
, to which all arguments can be passed.
All multivariate analyses supported by MVA.scoreplot
can be used for a trajectory plot.
Maxime HERVE <[email protected]>
require(ade4) data(olympic) PCA <- dudi.pca(olympic$tab,scannf=FALSE) MVA.plot(PCA,"traject",trajects=list(1:10,25:30),col=c(2,3,1),trajlab=c("T1","T2"))
require(ade4) data(olympic) PCA <- dudi.pca(olympic$tab,scannf=FALSE) MVA.plot(PCA,"traject",trajects=list(1:10,25:30),col=c(2,3,1),trajlab=c("T1","T2"))
Computes the odds ratios and their confidence interval for a predictor of a model fitted with multinom
.
OR.multinom(model, variable, conf.level = 0.95)
OR.multinom(model, variable, conf.level = 0.95)
model |
object of class |
variable |
any predictor present in |
conf.level |
confidence level. |
Maxime HERVE <[email protected]>
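A minimal usage sketch (assumptions not stated above: the model is fitted with nnet::multinom() and the predictor is passed by name as a character string):

require(nnet)
data(iris)
model <- multinom(Species ~ Sepal.Length + Sepal.Width, data = iris)
OR.multinom(model, "Sepal.Length")   # odds ratios per unit of Sepal.Length, with 95% CI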
Re-computes an ordination using given row weights (possibly extracted from a correspondence analysis). The function is intended to be used prior to coinertia
when row weights have to be equalized.
ord.rw(ord, CA = NULL, rw = NULL)
ord.rw(ord, CA = NULL, rw = NULL)
ord |
an ordination to re-compute. Must come from the |
CA |
an optional correspondence analysis from which row weights should be extracted. Must come from |
rw |
an optional vector of row weights. Used only if |
Maxime HERVE <[email protected]>
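A minimal usage sketch (the dataset and workflow are illustrative only; it is assumed here that ord.rw() accepts an ade4 ordination and an ade4 correspondence analysis, as the argument descriptions above suggest): re-computing a PCA with the row weights of a COA before a co-inertia analysis.

require(ade4)
data(doubs)
COA <- dudi.coa(doubs$fish, scannf = FALSE, nf = 2)   # species table
PCA <- dudi.pca(doubs$env, scannf = FALSE, nf = 2)    # environmental table
PCA2 <- ord.rw(PCA, CA = COA)                         # PCA re-computed with the COA row weights
CIA <- coinertia(COA, PCA2, scannf = FALSE, nf = 2)   # co-inertia analysis on matched row weights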
glmer
models
Estimates residual deviance and residual degrees of freedom to check for overdispersion with glmer
models. This function comes directly from http://glmm.wikidot.com/faq
.
overdisp.glmer(model)
overdisp.glmer(model)
model |
a model fitted by |
Ben Bolker
require(lme4) # Example from the 'glmer' function gm1 <- glmer(cbind(incidence,size-incidence)~period+(1|herd), family="binomial",data=cbpp) overdisp.glmer(gm1)
require(lme4) # Example from the 'glmer' function gm1 <- glmer(cbind(incidence,size-incidence)~period+(1|herd), family="binomial",data=cbpp) overdisp.glmer(gm1)
The function uses the formula presented in Douma & Weedon (2019). It is primarily intended to be used in beta regression (regression for continuous proportions) when data contain zeroes and/or ones, but can be applied to any variable initially bounded in the [0,1] interval when rescaling is necessary. The function can also perform back-transformation.
p.beta(p, n = length(p), C = 2, back = FALSE)
p.beta(p, n = length(p), C = 2, back = FALSE)
p |
numeric vector of values in the [0,1] interval. |
n |
total number of observations in the initial data set. Not very useful when the transformation is applied to the initial data set, but needed when back-transformation is applied from predicted values. |
C |
number of categories from which |
back |
logical. If |
Maxime HERVE <[email protected]> from the following paper: Douma JC & Weedon JT (2019) Analysing continuous proportions in ecology and evolution: A practical introduction to beta and Dirichlet regression. Methods in Ecology and Evolution, 10: 1412-1430
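A minimal sketch of the rescaling that p.beta() is assumed to apply (the compression formula of Douma & Weedon 2019; treat the exact expression as an assumption, not a quote of the function's code):

# rescale proportions from [0,1] to (0,1): p' = (p*(n-1) + 1/C) / n  (assumed formula)
rescale01 <- function(p, n = length(p), C = 2) (p * (n - 1) + 1/C) / n
rescale01(c(0, 0.4, 1), C = 3)   # all values now lie strictly inside (0,1)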
# A fictive example with four animals performing a behavioral choice-test where time # can be spent in three branches (total time 20 min) (tab <- data.frame(Individual=c("Ind1","Ind2","Ind3","Ind4"),Branch1=c(0,12,20,4), Branch2=c(8,4,0,6),Branch3=c(12,4,0,10))) # Raw proportions of time spent in branch 1: (p1 <- tab$Branch1/rowSums(tab[,-1])) # Scaled proportions: p.beta(p1,C=3)
# A fictive example with four animals performing a behavioral choice-test where time # can be spent in three branches (total time 20 min) (tab <- data.frame(Individual=c("Ind1","Ind2","Ind3","Ind4"),Branch1=c(0,12,20,4), Branch2=c(8,4,0,6),Branch3=c(12,4,0,10))) # Raw proportions of time spent in branch 1: (p1 <- tab$Branch1/rowSums(tab[,-1])) # Scaled proportions: p.beta(p1,C=3)
Performs pairwise comparisons between group levels with corrections for multiple testing, using CDA.test
.
pairwise.CDA.test(X, fact, ncomp = NULL, p.method = "fdr", ...)
pairwise.CDA.test(X, fact, ncomp = NULL, p.method = "fdr", ...)
X |
a data frame of dependent variables (typically contingency or presence-absence table). |
fact |
factor giving the groups. |
ncomp |
an integer giving the number of components to be used for the test. If |
p.method |
method for p-values correction. See help of |
... |
other arguments to pass to |
See CDA.test
.
method |
a character string indicating what type of tests were performed. |
data.name |
a character string giving the name(s) of the data. |
p.value |
table of results. |
p.adjust.method |
method for p-values correction. |
Maxime HERVE <[email protected]>
require(ade4) data(perthi02) CDA.test(perthi02$tab,perthi02$cla) pairwise.CDA.test(perthi02$tab,perthi02$cla)
require(ade4) data(perthi02) CDA.test(perthi02$tab,perthi02$cla) pairwise.CDA.test(perthi02$tab,perthi02$cla)
Performs pairwise comparisons between group levels with corrections for multiple testing. Tests are computed using factorfit
.
pairwise.factorfit(ord, fact, xax = 1, yax = 2, nperm = 999, p.method = "fdr", ...)
pairwise.factorfit(ord, fact, xax = 1, yax = 2, nperm = 999, p.method = "fdr", ...)
ord |
any multivariate analysis handled by |
fact |
grouping factor. |
xax |
first axis of the factorial map. |
yax |
second axis of the factorial map. |
nperm |
number of permutations. |
p.method |
method for p-values correction. See help of |
... |
optional further arguments to |
method |
a character string giving the name of the test. |
data.name |
a character string giving the name(s) of the data and the number of permutations. |
p.value |
table of results. |
p.adjust.method |
method for p-values correction. |
Maxime HERVE <[email protected]>
require(vegan) data(iris) PCA <- rda(iris[,1:4]) MVA.plot(PCA,fac=iris$Species,col=1:3) # Global test envfit(PCA~Species,data=iris) # Pairwise comparisons # (not enough permutations here but faster to run) pairwise.factorfit(PCA,iris$Species,nperm=49)
require(vegan) data(iris) PCA <- rda(iris[,1:4]) MVA.plot(PCA,fac=iris$Species,col=1:3) # Global test envfit(PCA~Species,data=iris) # Pairwise comparisons # (not enough permutations here but faster to run) pairwise.factorfit(PCA,iris$Species,nperm=49)
Performs pairwise comparisons between pairs of proportions with correction for multiple testing.
pairwise.G.test(x, p.method = "fdr")
pairwise.G.test(x, p.method = "fdr")
x |
matrix with 2 columns giving the counts of successes and failures, respectively. |
p.method |
method for p-values correction. See help of |
Since a G-test is an approximate test, an exact test is preferable when the number of individuals is small (200 is a reasonable minimum). See fisher.multcomp
in that case.
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.adjust.method |
method for p-values correction. |
p.value |
table of results. |
x <- matrix(c(44,56,36,64,64,40),ncol=2,dimnames=list(c("Control","Treatment1","Treatment2"), c("Alive","Dead")),byrow=TRUE) G.test(x) pairwise.G.test(x)
x <- matrix(c(44,56,36,64,64,40),ncol=2,dimnames=list(c("Control","Treatment1","Treatment2"), c("Alive","Dead")),byrow=TRUE) G.test(x) pairwise.G.test(x)
Performs pairwise comparisons between group levels with corrections for multiple testing.
pairwise.mood.medtest(resp, fact, exact = NULL, p.method = "fdr")
pairwise.mood.medtest(resp, fact, exact = NULL, p.method = "fdr")
resp |
response vector. |
fact |
grouping factor. |
exact |
a logical indicating whether exact p-values should be computed. |
p.method |
method for p-values correction. See help of |
If exact=NULL
, Fisher's exact tests are used if the number of data values is < 200; otherwise chi-square tests are used (with Yates continuity correction).
method |
a character string indicating the name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.value |
table of results. |
p.adjust.method |
method for p-values correction. |
Maxime HERVE <[email protected]>
set.seed(0904) response <- c(rnorm(10),rnorm(10,0.8),rnorm(10,2)) fact <- gl(3,10,labels=LETTERS[1:3]) mood.medtest(response~fact) pairwise.mood.medtest(response,fact)
set.seed(0904) response <- c(rnorm(10),rnorm(10,0.8),rnorm(10,2)) fact <- gl(3,10,labels=LETTERS[1:3]) mood.medtest(response~fact) pairwise.mood.medtest(response,fact)
Performs pairwise comparisons between group levels with corrections for multiple testing, using MVA.test
.
pairwise.MVA.test(X, fact, p.method = "fdr", cmv = FALSE, ncomp = 8, kout = 7, kinn = 6, model = c("PLS-DA", "PPLS-DA", "LDA", "QDA", "PLS-DA/LDA", "PLS-DA/QDA", "PPLS-DA/LDA","PPLS-DA/QDA"), nperm = 999, progress = TRUE, ...)
pairwise.MVA.test(X, fact, p.method = "fdr", cmv = FALSE, ncomp = 8, kout = 7, kinn = 6, model = c("PLS-DA", "PPLS-DA", "LDA", "QDA", "PLS-DA/LDA", "PLS-DA/QDA", "PPLS-DA/LDA","PPLS-DA/QDA"), nperm = 999, progress = TRUE, ...)
X |
a data frame of independent variables. |
fact |
grouping factor. |
p.method |
method for p-values correction. See help of |
cmv |
a logical indicating if the test statistic (NMC) should be generated through cross-validation (classical K-fold process) or cross model validation (inner + outer loops). |
ncomp |
an integer giving the number of components to be used to generate all submodels (cross-validation) or the maximal number of components to be tested in the inner loop (cross model validation). Can be re-set internally if needed. Does not concern LDA and QDA. |
kout |
an integer giving the number of folds (cross-validation) or the number of folds in the outer loop (cross-model validation). Can be re-set internally if needed. |
kinn |
an integer giving the number of folds in the inner loop (cross model validation only). Can be re-set internally if needed. Cannot be |
model |
the model to be fitted. |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
... |
other arguments to pass to |
The function deals with the limited floating-point precision, which can bias the calculation of p-values based on a discrete test statistic distribution.
method |
a character string indicating what type of tests were performed. |
data.name |
a character string giving the name(s) of the data. |
p.value |
table of results. |
p.adjust.method |
method for p-values correction. |
permutations |
number of permutations. |
Maxime HERVE <[email protected]>
require(pls) data(mayonnaise) # PPLS-DA ## Not run: pairwise.MVA.test(mayonnaise$NIR,factor(mayonnaise$oil.type),model="PPLS-DA") # The function needs a long calculation time!
require(pls) data(mayonnaise) # PPLS-DA ## Not run: pairwise.MVA.test(mayonnaise$NIR,factor(mayonnaise$oil.type),model="PPLS-DA") # The function needs a long calculation time!
Performs pairwise comparisons between group levels with corrections for multiple testing. These pairwise comparisons are relevant after a permutation MANOVA, such as performed by adonis
.
pairwise.perm.manova(resp, fact, test = c("Pillai", "Wilks", "Hotelling-Lawley", "Roy", "Spherical"), nperm = 999, progress = TRUE, p.method = "fdr", F = FALSE, R2 = FALSE)
pairwise.perm.manova(resp, fact, test = c("Pillai", "Wilks", "Hotelling-Lawley", "Roy", "Spherical"), nperm = 999, progress = TRUE, p.method = "fdr", F = FALSE, R2 = FALSE)
resp |
response. Either a matrix (one column per variable; objects of class |
fact |
grouping factor. |
test |
choice of test statistic when |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
p.method |
method for p-values correction. See help of |
F |
should the table of F values be returned? |
R2 |
should the table of R2 values be returned? For tests based on distance matrices only. |
If resp
is a matrix, a classical MANOVA is performed and the distribution of the (pseudo-)F is computed through permutations. The function deals with the limited floating-point precision, which can bias the calculation of p-values based on a discrete test statistic distribution.
If resp
is a distance matrix, adonis
is used to perform each comparison.
method |
a character string giving the name of the test. |
data.name |
a character string giving the name(s) of the data and the number of permutations. |
p.value |
table of results. |
p.adjust.method |
method for p-values correction. |
F.value |
table of F values (if required). |
R2.value |
table of R2 values (if required). |
Maxime HERVE <[email protected]>
require(vegan) data(iris) # permutation MANOVA adonis(iris[,1:4]~Species,data=iris,method="euclidean") # Pairwise comparisons # (not enough permutations here but faster to run) pairwise.perm.manova(iris[,1:4],iris$Species,nperm=49) # or pairwise.perm.manova(dist(iris[,1:4],"euclidean"),iris$Species,nperm=49)
require(vegan) data(iris) # permutation MANOVA adonis(iris[,1:4]~Species,data=iris,method="euclidean") # Pairwise comparisons # (not enough permutations here but faster to run) pairwise.perm.manova(iris[,1:4],iris$Species,nperm=49) # or pairwise.perm.manova(dist(iris[,1:4],"euclidean"),iris$Species,nperm=49)
Performs pairwise comparisons between group levels with corrections for multiple testing.
pairwise.perm.t.test(resp, fact, p.method = "fdr", paired = FALSE, alternative = c("two.sided","less", "greater"), nperm = 999, progress = TRUE)
pairwise.perm.t.test(resp, fact, p.method = "fdr", paired = FALSE, alternative = c("two.sided","less", "greater"), nperm = 999, progress = TRUE)
resp |
response vector. |
fact |
grouping factor. |
p.method |
method for p-values correction. See help of |
paired |
a logical indicating whether you want paired (permutation) t-tests. |
alternative |
a character string specifying the alternative hypothesis, must be one of |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
The function deals with the limited floating-point precision, which can bias the calculation of p-values based on a discrete test statistic distribution.
method |
a character string indicating what type of t-tests were performed. |
data.name |
a character string giving the name(s) of the data. |
p.value |
table of results. |
p.adjust.method |
method for p-values correction. |
permutations |
number of permutations. |
Maxime HERVE <[email protected]>
set.seed(1203) response <- c(rnorm(5),rpois(5,0.5),rnorm(5,2,1)) fact <- gl(3,5,labels=LETTERS[1:3]) # Not enough permutations here but it runs faster # permutation ANOVA perm.anova(response~fact,nperm=49) # Pairwise comparisons pairwise.perm.t.test(response,fact,nperm=49)
set.seed(1203) response <- c(rnorm(5),rpois(5,0.5),rnorm(5,2,1)) fact <- gl(3,5,labels=LETTERS[1:3]) # Not enough permutations here but it runs faster # permutation ANOVA perm.anova(response~fact,nperm=49) # Pairwise comparisons pairwise.perm.t.test(response,fact,nperm=49)
Performs pairwise comparisons between group levels with corrections for multiple testing.
pairwise.perm.var.test(resp, fact, p.method = "fdr", alternative = c("two.sided","less", "greater"), nperm = 999, progress = TRUE)
pairwise.perm.var.test(resp, fact, p.method = "fdr", alternative = c("two.sided","less", "greater"), nperm = 999, progress = TRUE)
resp |
response vector. |
fact |
grouping factor. |
p.method |
method for p-values correction. See help of |
alternative |
a character string specifying the alternative hypothesis, must be one of |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
The function deals with the limited floating-point precision, which can bias the calculation of p-values based on a discrete test statistic distribution.
method |
a character string giving the name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.value |
table of results. |
p.adjust.method |
method for p-values correction. |
permutations |
number of permutations. |
Maxime HERVE <[email protected]>
set.seed(0921) response <- c(rnorm(10),rpois(10,0.2),rnorm(10,,2)) fact <- gl(3,10,labels=LETTERS[1:3]) # Not enough permutations here but it runs faster # permutation Bartlett test perm.bartlett.test(response~fact,nperm=49) # Pairwise comparisons pairwise.perm.var.test(response,fact,nperm=49)
set.seed(0921) response <- c(rnorm(10),rpois(10,0.2),rnorm(10,,2)) fact <- gl(3,10,labels=LETTERS[1:3]) # Not enough permutations here but it runs faster # permutation Bartlett test perm.bartlett.test(response~fact,nperm=49) # Pairwise comparisons pairwise.perm.var.test(response,fact,nperm=49)
Performs pairwise comparisons between group levels with corrections for multiple testing.
pairwise.var.test(resp, fact, p.method = "fdr", alternative = c("two.sided","less", "greater"))
pairwise.var.test(resp, fact, p.method = "fdr", alternative = c("two.sided","less", "greater"))
resp |
response vector. |
fact |
grouping factor. |
p.method |
method for p-values correction. See help of |
alternative |
a character string specifying the alternative hypothesis, must be one of |
method |
a character string giving the name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.value |
table of results. |
p.adjust.method |
method for p-values correction. |
Maxime HERVE <[email protected]>
require(graphics) # Bartlett test bartlett.test(count~spray,data=InsectSprays) # Pairwise comparisons pairwise.var.test(InsectSprays$count,InsectSprays$spray)
require(graphics) # Bartlett test bartlett.test(count~spray,data=InsectSprays) # Pairwise comparisons pairwise.var.test(InsectSprays$count,InsectSprays$spray)
Computes the (semi-)partial correlation of x
and y
, controlling for z
.
pcor(x, y, z, semi = FALSE, use = "complete.obs", method = c("pearson", "spearman"))
pcor(x, y, z, semi = FALSE, use = "complete.obs", method = c("pearson", "spearman"))
x |
a numeric vector. |
y |
a numeric vector. |
z |
a numeric vector, matrix, data frame or list giving the controlling variables. For matrices, variables must be placed in columns. |
semi |
logical. If |
use |
same as |
method |
same as |
Maxime HERVE <[email protected]>
pcor.test
for confidence intervals (and tests).
set.seed(1444) x <- 1:30 y <- 1:30+rnorm(30,0,2) z1 <- runif(30,0,4) z2 <- 30:1+rnorm(30,0,3) pcor(x,y,z1) pcor(x,y,list(z1,z2))
set.seed(1444) x <- 1:30 y <- 1:30+rnorm(30,0,2) z1 <- runif(30,0,4) z2 <- 30:1+rnorm(30,0,3) pcor(x,y,z1) pcor(x,y,list(z1,z2))
Tests for (semi-)partial association between paired samples while controlling for other variables, using one of Pearson's product moment correlation coefficient or Spearman's rho.
pcor.test(x, y, z, semi = FALSE, conf.level = 0.95, nrep = 1000, method = c("pearson", "spearman"))
pcor.test(x, y, z, semi = FALSE, conf.level = 0.95, nrep = 1000, method = c("pearson", "spearman"))
x |
a numeric vector. |
y |
a numeric vector. |
z |
a numeric vector, matrix, data frame or list giving the controlling variables. For matrices, variables must be placed in columns. |
semi |
logical. If |
conf.level |
confidence level for the confidence interval. |
nrep |
number of replicates for computation of the confidence interval of a Spearman's rank correlation coefficient (by bootstrapping). |
method |
a character string indicating which correlation coefficient is to be used for the test. One of "pearson" or "spearman". |
If method
is "pearson"
and if there are at least 4+k complete series of observations (where k is the number of controlling variables), an asymptotic confidence interval of the correlation coefficient is given based on Fisher's Z transform.
If method
is "spearman"
, the p-value is computed through the AS89 algorithm if the number of complete series of observations is less than 10, otherwise via the asymptotic t approximation (in both cases the pspearman
function is used). A confidence interval of the correlation coefficient, computed by bootstrapping, is given.
data.name |
a character string giving the name(s) of the data. |
alternative |
a character string describing the alternative hypothesis, always two-sided. |
method |
a character string indicating how the association was measured. |
conf.int |
a confidence interval for the measure of association. |
statistic |
the value of the test statistic. |
parameter |
the degrees of freedom of the test (only for a Pearson's correlation coefficient). |
p.value |
the p-value of the test. |
estimate |
the estimated measure of association, with name |
null.value |
the value of the association measure under the null hypothesis, always 0. |
Maxime HERVE <[email protected]>
set.seed(1444) x <- 1:30 y <- 1:30+rnorm(30,0,2) z1 <- runif(30,0,4) z2 <- 30:1+rnorm(30,0,3) pcor.test(x,y,z1) pcor.test(x,y,list(z1,z2))
set.seed(1444) x <- 1:30 y <- 1:30+rnorm(30,0,2) z1 <- runif(30,0,4) z2 <- 30:1+rnorm(30,0,3) pcor.test(x,y,z1) pcor.test(x,y,list(z1,z2))
Performs a permutation analysis of variance for 1 to 3 factors. For 2 and 3 factors, experiment design must be balanced. For 2 factors, the factors can be crossed with or without interaction, or nested. The second factor can be a blocking (random) factor. For 3 factors, design is restricted to 2 fixed factors crossed (with or without interaction) inside blocks (third factor).
perm.anova(formula, nest.f2 = c("fixed", "random"), data, nperm = 999, progress = TRUE)
perm.anova(formula, nest.f2 = c("fixed", "random"), data, nperm = 999, progress = TRUE)
formula |
a formula of the form |
nest.f2 |
in case of 2 nested factors, precision is needed if the nested factor (factor2) is |
data |
an optional data frame containing the variables in the formula |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
For 2 factors, the formula can be:
response ~ factor1 + factor2
for 2 fixed factors without interaction
response ~ factor1 * factor2
for 2 fixed factors with interaction
response ~ factor1 / factor2
for 2 fixed factors with factor2 nested into factor1 (if factor2 is a random factor, argument nest.f2
must be changed from "fixed"
(default) to "random"
)
response ~ factor1 | factor2
for 1 fixed factor (factor1) and 1 blocking (random) factor (factor2).
For 3 factors, the formula can only be:
response ~ factor1 + factor2 | factor3
or
response ~ factor1 * factor2 | factor3
. The 2 factors are here fixed and crossed inside each level of the third, blocking (random), factor.
The function deals with the limited floating-point precision, which can bias the calculation of p-values based on a discrete test statistic distribution.
a data frame of class "anova"
.
Maxime HERVE <[email protected]>
set.seed(1203) response <- c(rnorm(12),rpois(12,0.5),rnorm(12,2,1)) fact1 <- gl(3,12,labels=LETTERS[1:3]) fact2 <- gl(3,1,36,labels=letters[1:3]) fact3 <- gl(6,6,labels=letters[1:6]) block <- gl(2,6,36,labels=letters[1:2]) # Not enough permutations here but faster to run # 2 crossed fixed factors with interaction perm.anova(response~fact1*fact2,nperm=49) # 2 nested fixed factors perm.anova(response~fact1/fact2,nperm=49) # 2 nested factors, fact2 being random perm.anova(response~fact1/fact3,nest.f2="random",nperm=49) # 1 fixed factor and 1 blocking (random) factor perm.anova(response~fact1|block,nperm=49)
set.seed(1203) response <- c(rnorm(12),rpois(12,0.5),rnorm(12,2,1)) fact1 <- gl(3,12,labels=LETTERS[1:3]) fact2 <- gl(3,1,36,labels=letters[1:3]) fact3 <- gl(6,6,labels=letters[1:6]) block <- gl(2,6,36,labels=letters[1:2]) # Not enough permutations here but faster to run # 2 crossed fixed factors with interaction perm.anova(response~fact1*fact2,nperm=49) # 2 nested fixed factors perm.anova(response~fact1/fact2,nperm=49) # 2 nested factors, fact2 being random perm.anova(response~fact1/fact3,nest.f2="random",nperm=49) # 1 fixed factor and 1 blocking (random) factor perm.anova(response~fact1|block,nperm=49)
Performs a permutation Bartlett's test of homogeneity of k variances.
perm.bartlett.test(formula, data, nperm = 999, progress = TRUE)
perm.bartlett.test(formula, data, nperm = 999, progress = TRUE)
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
The function deals with the limited floating-point precision, which can bias the calculation of p-values based on a discrete test statistic distribution.
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
statistic |
test statistic of the parametric test. |
permutations |
number of permutations. |
p.value |
p-value of the permutation test. |
Maxime HERVE <[email protected]>
response <- c(rnorm(12),rpois(12,1),rnorm(12,2,1)) fact <- gl(3,12,labels=LETTERS[1:3]) perm.bartlett.test(response~fact)
response <- c(rnorm(12),rpois(12,1),rnorm(12,2,1)) fact <- gl(3,12,labels=LETTERS[1:3]) perm.bartlett.test(response~fact)
Performs a permutation Pearson's product-moment correlation test.
perm.cor.test(x, y, alternative = c("two.sided", "less", "greater"), nperm = 999, progress = TRUE)
perm.cor.test(x, y, alternative = c("two.sided", "less", "greater"), nperm = 999, progress = TRUE)
x , y
|
numeric vectors of data values. |
alternative |
a character string specifying the alternative hypothesis, must be one of |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
The function deals with the limited floating-point precision, which can bias the calculation of p-values based on a discrete test statistic distribution.
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
statistic |
test statistic of the parametric test. |
permutations |
number of permutations. |
p.value |
p-value of the permutation test. |
estimate |
the estimated correlation coefficient. |
alternative |
a character string describing the alternative hypothesis. |
null.value |
the value of the association measure under the null hypothesis, always 0. |
Maxime HERVE <[email protected]>
x <- rnorm(50) y <- runif(50) perm.cor.test(x,y)
x <- rnorm(50) y <- runif(50) perm.cor.test(x,y)
Performs a permutation Student's t-test.
perm.t.test(x, ...) ## Default S3 method: perm.t.test(x, y, paired = FALSE, ...) ## S3 method for class 'formula' perm.t.test(formula, data, alternative = c("two.sided", "less", "greater"), paired = FALSE, nperm = 999, progress = TRUE, ...)
perm.t.test(x, ...) ## Default S3 method: perm.t.test(x, y, paired = FALSE, ...) ## S3 method for class 'formula' perm.t.test(formula, data, alternative = c("two.sided", "less", "greater"), paired = FALSE, nperm = 999, progress = TRUE, ...)
x |
a numeric vector of data values. |
y |
a numeric vector of data values. |
paired |
a logical indicating whether you want a paired t-test. |
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
alternative |
a character string specifying the alternative hypothesis, must be one of |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
... |
further arguments to be passed to or from other methods. |
The function deals with the limited floating-point precision, which can bias the calculation of p-values based on a discrete test statistic distribution.
statistic |
test statistic of the parametric test. |
permutations |
number of permutations. |
p.value |
p-value of the permutation test. |
estimate |
the estimated mean or difference in means depending on whether it was a paired or not paired test. |
alternative |
a character string describing the alternative hypothesis. |
method |
a character string indicating what type of t-test was performed. |
data.name |
a character string giving the name(s) of the data. |
null.value |
the specified hypothesized value of the mean difference, always 0. |
Maxime HERVE <[email protected]>
response <- c(rnorm(5),rnorm(5,2,1)) fact <- gl(2,5,labels=LETTERS[1:2]) # Not enough permutations here but faster to run # Unpaired test perm.t.test(response~fact,nperm=49) # Paired test perm.t.test(response~fact,paired=TRUE,nperm=49)
response <- c(rnorm(5),rnorm(5,2,1)) fact <- gl(2,5,labels=LETTERS[1:2]) # Not enough permutations here but faster to run # Unpaired test perm.t.test(response~fact,nperm=49) # Paired test perm.t.test(response~fact,paired=TRUE,nperm=49)
Performs a permutation F test to compare two variances.
perm.var.test(x, ...) ## Default S3 method: perm.var.test(x, y, ...) ## S3 method for class 'formula' perm.var.test(formula, data, alternative = c("two.sided", "less", "greater"), nperm = 999, progress = TRUE, ...)
perm.var.test(x, ...) ## Default S3 method: perm.var.test(x, y, ...) ## S3 method for class 'formula' perm.var.test(formula, data, alternative = c("two.sided", "less", "greater"), nperm = 999, progress = TRUE, ...)
x |
a numeric vector of data values. |
y |
a numeric vector of data values. |
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
alternative |
a character string specifying the alternative hypothesis, must be one of |
nperm |
number of permutations. |
progress |
logical indicating if the progress bar should be displayed. |
... |
further arguments to be passed to or from other methods. |
The function deals with the limited floating-point precision, which can bias the calculation of p-values based on a discrete test statistic distribution.
method |
name of the test. |
statistic |
test statistic of the parametric test. |
permutations |
number of permutations. |
p.value |
p-value of the permutation test. |
estimate |
the ratio of the two variances. |
alternative |
a character string describing the alternative hypothesis. |
data.name |
a character string giving the name(s) of the data. |
null.value |
the ratio of population variances under the null hypothesis, always 1. |
Maxime HERVE <[email protected]>
response <- c(rpois(8,1),rpois(8,3)) fact <- gl(2,8,labels=LETTERS[1:2]) perm.var.test(response~fact)
response <- c(rpois(8,1),rpois(8,3)) fact <- gl(2,8,labels=LETTERS[1:2]) perm.var.test(response~fact)
Plots residuals of a model against fitted values and for some models a QQ-plot of these residuals. Optionally, a Shapiro-Wilk test can be performed on residuals. The function deals with lm
(including glm
, lmList
, lmList
, glm.nb
, mlm
and manova
), lmer
, glmer
, glmmPQL
, glmmadmb
, lme
, gls
, nls
, nlsList
, survreg
, least.rect
, betareg
or glmmTMB
models.
plotresid(model, shapiro = FALSE)
plotresid(model, shapiro = FALSE)
model |
an object of class |
shapiro |
logical. If |
Response residuals are used for linear models, nonlinear models and generalized linear models based on an identity link (except with "quasi"
distributions where response residuals are used only if variance="constant"
). Pearson or studentized residuals are used whenever there is a link function which is not identity (and with "quasi"
distributions when variance
is not "constant"
), except for betareg
models where standardized weighted residuals 2 are used (see residuals.betareg
).
QQ-plots and Shapiro-Wilk tests are available whenever the model is based on a Gaussian distribution (and with "quasi"
distributions when variance="constant"
).
With a mlm
or manova
model, only a multivariate QQ-plot is drawn. The test performed when shapiro=TRUE
is a Shapiro-Wilk test for multivariate normality.
Maxime HERVE <[email protected]>
lm
, lmList
, lmList
, glm
, glm.nb
, manova
, lmer
, glmer
, lmer
, glmer.nb
, lme
, glmmPQL
, glmmadmb
, glmmTMB
, gls
, nls
, nlsList
, survreg
, least.rect
, betareg
, qresiduals
, qqPlot
, shapiro.test
, mqqnorm
, mshapiro.test
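A minimal usage sketch (not part of the original manual): residual diagnostics for a simple linear model, with a Shapiro-Wilk test on the residuals.

model <- lm(Sepal.Length ~ Petal.Length + Species, data = iris)
plotresid(model, shapiro = TRUE)   # residuals vs fitted values, QQ-plot and Shapiro-Wilk test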
Plots the survivor curve (log(survivors) against time) of a dataset to check for constancy of hazard.
plotsurvivors(x, status = rep(1, length(x)))
plotsurvivors(x, status = rep(1, length(x)))
x |
time to event. |
status |
status (1: event observed, 0: event not observed). |
n |
initial number of individuals. |
time |
time of events. |
alive |
number of survivors at each time. |
Maxime HERVE <[email protected]>
# 'kidney' dataset of package 'survival' require(survival) data(kidney) plotsurvivors(kidney$time,kidney$status)
# 'kidney' dataset of package 'survival' require(survival) data(kidney) plotsurvivors(kidney$time,kidney$status)
Returns VIP score of each X-variable in a PLS-DA (obtained from plsda
).
PLSDA.VIP(model, graph = FALSE)
PLSDA.VIP(model, graph = FALSE)
model |
object of class |
graph |
logical: should VIP scores be displayed? |
tab |
table of results. |
sup1 |
name of X-variables having a VIP score > 1. |
Maxime HERVE <[email protected]>
if (require(mixOmics)) { data(yeast) model.PLSDA <- plsda(t(yeast$data),yeast$strain.cond) PLSDA.VIP(model.PLSDA) }
if (require(mixOmics)) { data(yeast) model.PLSDA <- plsda(t(yeast$data),yeast$strain.cond) PLSDA.VIP(model.PLSDA) }
Predicts response based on CDA (correspondence discriminant analysis) submodels generated by cross validation. The predicted class is given with its probability (computed from the values predicted by all submodels).
## S3 method for class 'CDA.cv' predict(object, newdata, type = c("max", "all"), method = c("mahalanobis", "euclidian"), ...)
## S3 method for class 'CDA.cv' predict(object, newdata, type = c("max", "all"), method = c("mahalanobis", "euclidian"), ...)
object |
object of class inheriting from |
newdata |
vector, matrix or data frame giving new individuals (one row per individual). |
type |
should the probability of the most probable class be given ( |
method |
criterion used to predict class membership. See |
... |
further arguments to be passed to or from other methods. |
Maxime HERVE <[email protected]>
Predicts class of the grouping factor based on a Correspondence Discriminant Analysis (performed using discrimin.coa
).
## S3 method for class 'coadisc' predict(object, newdata, method = c("mahalanobis", "euclidian"), ...)
## S3 method for class 'coadisc' predict(object, newdata, method = c("mahalanobis", "euclidian"), ...)
object |
object of class inheriting from |
newdata |
contingency table (either a |
method |
distance metric to be used for prediction. In all cases the predicted class corresponds to the minimum distance between the new individual and the centroid of each class. Default is Mahalanobis distance. |
... |
further arguments to be passed to or from other methods. |
Maxime HERVE <[email protected]>
require(ade4) data(perthi02) CDA <- discrimin.coa(perthi02$tab,perthi02$cla,scan=FALSE) new <- matrix(c(17,45,32,17,17,52,28,29,6,10,7,7,7,5,10,4,37,34,23,9),ncol=20) predict(CDA,new)
require(ade4) data(perthi02) CDA <- discrimin.coa(perthi02$tab,perthi02$cla,scan=FALSE) new <- matrix(c(17,45,32,17,17,52,28,29,6,10,7,7,7,5,10,4,37,34,23,9),ncol=20) predict(CDA,new)
Predicts response based on submodels generated by cross (model) validation. For regression models (PLSR and CPPLS), the predicted value is given with its confidence interval. For discriminant analyses, the predicted class is given with its probability (computed from the values predicted by all submodels).
## S3 method for class 'MVA.cv' predict(object, newdata, conf.level = 0.95, type.DA = c("max", "all"), crit.DA = c("plug-in", "predictive", "debiased"), ...) ## S3 method for class 'MVA.cmv' predict(object, newdata, conf.level = 0.95, type.DA = c("max", "all"), crit.DA = c("plug-in", "predictive", "debiased"), ...)
## S3 method for class 'MVA.cv' predict(object, newdata, conf.level = 0.95, type.DA = c("max", "all"), crit.DA = c("plug-in", "predictive", "debiased"), ...) ## S3 method for class 'MVA.cmv' predict(object, newdata, conf.level = 0.95, type.DA = c("max", "all"), crit.DA = c("plug-in", "predictive", "debiased"), ...)
object |
object of class inheriting from |
newdata |
vector, matrix or data frame giving new individuals (one row per individual). |
conf.level |
confidence level for prediction of a quantitative response. |
type.DA |
for discriminant analyses, should the probability of the most probable class be given ( |
crit.DA |
criterion used to predict class membership when a LDA or QDA is used. See |
... |
further arguments to be passed to or from other methods. |
Maxime HERVE <[email protected]>
Performs pairwise comparisons after a global test for given response probabilities (i.e. when the response variable is a binary variable), by using exact binomial tests. The function is in fact a wrapper around pairwise comparisons of proportions against given values, performed on a contingency table.
prop.bin.multcomp(formula, data, p, p.method = "fdr")
prop.bin.multcomp(formula, data, p, p.method = "fdr")
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
p |
theoretical probabilities. |
p.method |
method for p-values correction. See help of |
If the response is a 0/1 variable, the probability of the '1' group is tested. In any other case, the response is transformed into a factor and the probability of the second level is tested.
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
observed |
observed probabilities. |
expected |
expected probabilities. |
p.adjust.method |
method for p-values correction. |
p.value2 |
corrected p-values. |
p.value |
table of results of pairwise comparisons. |
Maxime HERVE <[email protected]>
prop.multcomp
, chisq.theo.bintest
response <- c(rep(0:1,c(40,60)),rep(0:1,c(55,45)),rep(0:1,c(65,35))) fact <- gl(3,100,labels=LETTERS[1:3]) p.theo <- c(0.5,0.45,0.2) chisq.theo.bintest(response~fact,p=p.theo) prop.bin.multcomp(response~fact,p=p.theo)
response <- c(rep(0:1,c(40,60)),rep(0:1,c(55,45)),rep(0:1,c(65,35))) fact <- gl(3,100,labels=LETTERS[1:3]) p.theo <- c(0.5,0.45,0.2) chisq.theo.bintest(response~fact,p=p.theo) prop.bin.multcomp(response~fact,p=p.theo)
Performs pairwise comparisons after a global test for given proportions, by using exact binomial tests.
prop.multcomp(x, p, p.method = "fdr")
prop.multcomp(x, p, p.method = "fdr")
x |
contingency table. |
p |
theoretical proportions. |
p.method |
method for p-values correction. See help of |
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
observed |
observed proportions. |
expected |
expected proportions. |
p.adjust.method |
method for p-values correction. |
p.value2 |
corrected p-values. |
p.value |
table of results of pairwise comparisons. |
Maxime HERVE <[email protected]>
proportions <- sample(c(0,1),200,replace=TRUE) populations <- sample(LETTERS[1:3],200,replace=TRUE) tab.cont <- table(populations,proportions) p.theo <- c(0.4,0.5,0.7) prop.test(tab.cont,p=p.theo) prop.multcomp(tab.cont,p=p.theo)
proportions <- sample(c(0,1),200,replace=TRUE) populations <- sample(LETTERS[1:3],200,replace=TRUE) tab.cont <- table(populations,proportions) p.theo <- c(0.4,0.5,0.7) prop.test(tab.cont,p=p.theo) prop.multcomp(tab.cont,p=p.theo)
Computes proportions (and their standard errors) when the number of classes is >= 2, based on predicted values of a model. The function is intended to be used in parallel with a multinomial log-linear model.
prop.multinom(x)
prop.multinom(x)
x |
either a factor or a matrix with K columns giving the counts for each of the K classes. |
The proportions can be computed through the predict
function applied on a multinomial log-linear model (see multinom
). However, standard errors (or confidence intervals) cannot be obtained this way. The present function uses different GLMs (in each case considering one category vs. the sum of all others) to obtain proportions and standard errors. Overdispersion is taken into account by default, using a quasibinomial family in all the GLMs built.
probs |
the calculated proportions. |
se |
the calculated standard errors. |
Maxime HERVE <[email protected]>
response <- data.frame(A=c(2,2,4,0,2,14,6,16,0,0), B=c(2,0,0,0,6,2,10,6,0,0), C=c(12,6,0,6,2,0,0,0,0,0), D=c(0,0,0,14,0,0,0,0,2,0), E=c(0,0,0,0,0,0,0,0,16,15)) prop.multinom(response)
response <- data.frame(A=c(2,2,4,0,2,14,6,16,0,0), B=c(2,0,0,0,6,2,10,6,0,0), C=c(12,6,0,6,2,0,0,0,0,0), D=c(0,0,0,14,0,0,0,0,2,0), E=c(0,0,0,0,0,0,0,0,16,15)) prop.multinom(response)
Performs pairwise comparisons of proportions when the number of classes is >= 2 with corrections for multiple testing.
prop.multinom.test(x, p.method = "fdr")
prop.multinom.test(x, p.method = "fdr")
x |
either a factor or a matrix with K columns giving the counts for each of the K classes. |
p.method |
method for p-values correction. See help of |
The function builds multinomial log-linear models (using multinom
) and applies Wald tests to compare the intercepts to 0. All necessary models (each time using a different reference level/class) are built to get p-values of all possible comparisons among levels/classes.
method |
a character string indicating the name of the test. |
data.name |
a character string giving the name(s) of the data. |
p.adjust.method |
method for p-values correction. |
p.value |
table of results. |
z.tab |
table of z values. |
Maxime HERVE <[email protected]>
response <- factor(rep(LETTERS[1:4],c(20,40,42,13))) table(response)/length(response) prop.multinom.test(response)
response <- factor(rep(LETTERS[1:4],c(20,40,42,13))) table(response)/length(response) prop.multinom.test(response)
Allows for the calculation of QAIC(c) with glm
models built with a quasibinomial
or quasipoisson
distribution. The function comes directly from QAIC.
quasibinomial.QAIC(link = "logit") quasipoisson.QAIC(link = "log")
quasibinomial.QAIC(link = "logit") quasipoisson.QAIC(link = "log")
link |
This function is only intended to make the suggestion of Kamil Barton in QAIC available.
# See ?QAIC from the MuMIn package
# See ?QAIC from the MuMIn package
Extracts EMMeans (produced by emmeans
) from Cumulative Link (Mixed) Models (produced by clm
or clmm
), with different possible formats.
rating.emmeans(emm, type = c("prob", "cumprob", "class1", "class2"), level = 0.9)
rating.emmeans(emm, type = c("prob", "cumprob", "class1", "class2"), level = 0.9)
emm |
|
type |
type of output to be returned: |
level |
used only for type |
A factor named cut
must have been called in emmeans
, to compute EMMeans per cut point (i.e. rating). Additionally, the argument mode
of emmeans
must have been set to "linear.predictor". Finally, the call to emmeans
is typically like emmeans(model,~factor|cut,mode="linear.predictor")
where factor
is the factor (or interaction) giving levels for which EMMeans have to be computed.
Maxime HERVE <[email protected]>
require(ordinal) require(emmeans) model <- clm(rating~contact*temp,data=wine) EMM <- emmeans(model,~contact:temp|cut,mode="linear.predictor") # Probabilities rating.emmeans(EMM) # Cumulative probabilities rating.emmeans(EMM,type="cumprob") # Most probable rating rating.emmeans(EMM,type="class1")
require(ordinal) require(emmeans) model <- clm(rating~contact*temp,data=wine) EMM <- emmeans(model,~contact:temp|cut,mode="linear.predictor") # Probabilities rating.emmeans(EMM) # Cumulative probabilities rating.emmeans(EMM,type="cumprob") # Most probable rating rating.emmeans(EMM,type="class1")
Computes observed rating frequencies per level of a factor, in various formats.
rating.prob(x, g, type = c("prob", "cumprob", "class"))
rating.prob(x, g, type = c("prob", "cumprob", "class"))
x |
ordered factor (ratings). |
g |
factor giving groups to be compared. |
type |
type of output to be returned: |
Maxime HERVE <[email protected]>
require(ordinal) data(wine) # Frequencies rating.prob(wine$rating,wine$contact:wine$temp) # Cumulative frequencies rating.prob(wine$rating,wine$contact:wine$temp,type="cumprob") # Most frequent rating rating.prob(wine$rating,wine$contact:wine$temp,type="class")
require(ordinal) data(wine) # Frequencies rating.prob(wine$rating,wine$contact:wine$temp) # Cumulative frequencies rating.prob(wine$rating,wine$contact:wine$temp,type="cumprob") # Most frequent rating rating.prob(wine$rating,wine$contact:wine$temp,type="class")
Computes and add to a graph the confidence interval of a simple regression line or of individual values.
reg.ci(model, conf.level = 0.95, type = c("mean", "ind"), ...)
reg.ci(model, conf.level = 0.95, type = c("mean", "ind"), ...)
model |
|
conf.level |
confidence level. |
type |
interval type: |
... |
other arguments. See help of |
Maxime HERVE <[email protected]>
x <- 1:50 y <- 1:50+rnorm(50,0,4) regression <- lm(y~x) plot(x,y) abline(regression) reg.ci(regression,type="mean",col="red") reg.ci(regression,type="ind",col="blue")
x <- 1:50 y <- 1:50+rnorm(50,0,4) regression <- lm(y~x) plot(x,y) abline(regression) reg.ci(regression,type="mean",col="red") reg.ci(regression,type="ind",col="blue")
Represents the "correlation" of variables to axes in a MCA (from dudi.acm
) or a mix analysis (from dudi.hillsmith
or dudi.mix
).
scat.cr(dudi.obj, axis = 1)
scat.cr(dudi.obj, axis = 1)
dudi.obj |
object obtained from |
axis |
axis to be represented (the first by default). |
For quantitative variables, the squared correlation coefficient is displayed. For ordered factors, the squared multiple correlation coefficient is displayed. For unordered factors, the correlation ratio is displayed.
Maxime HERVE <[email protected]>, based on an idea of Stephane Champely.
dudi.acm
, dudi.hillsmith
, dudi.mix
require(ade4) # Fictive dataset age <- sample(15:60,50,replace=TRUE) sex <- sample(c("M","F"),50,replace=TRUE) size <- sample(155:190,50,replace=TRUE) hair <- sample(c("Fair","Dark","Russet"),50,replace=TRUE) eyes <- sample(c("Blue","Green","Brown"),50,replace=TRUE) weight <- sample(50:85,50,replace=TRUE) hand <- sample(c("Left.handed","Right.handed"),50,replace=TRUE) tab <- data.frame(age,sex,size,weight,hand,eyes,hair,stringsAsFactors=TRUE) amix <- dudi.hillsmith(tab,scannf=FALSE,nf=2) scat.cr(amix)
Computes the standard error of a mean or of a proportion.
se(x, y = NULL)
x |
numeric vector or number of successes. |
y |
number of trials. If |
The function deals with missing values.
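As a minimal sketch of this behaviour (it assumes NAs are simply discarded before computation):

x <- c(rnorm(30),NA,NA)
se(x)   # assumed to give the same result as se(x[!is.na(x)])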
Maxime HERVE <[email protected]>
# Standard error of a mean
se(rnorm(30))
# Standard error of a proportion
se(9,25)
Generates a regular sequence from the minimum to the maximum of a vector.
seq2(x, int = 999)
x |
numeric vector. |
int |
number of values to be generated ( |
Maxime HERVE <[email protected]>
seq2(rnorm(30))
Computes the confidence interval of a Spearman's rank correlation coefficient by bootstrapping.
spearman.ci(var1, var2, nrep = 1000, conf.level = 0.95)
var1 |
numeric vector (first variable). |
var2 |
numeric vector (second variable). |
nrep |
number of replicates for bootstrapping. |
conf.level |
confidence level of the interval. |
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
conf.level |
confidence level. |
rep |
number of replicates. |
estimate |
Spearman's rank correlation coefficient. |
conf.int |
confidence interval. |
Maxime HERVE <[email protected]>
var1 <- sample(1:50,15,replace=TRUE)
var2 <- sample(1:50,15,replace=TRUE)
spearman.ci(var1,var2)
Computes Bonferroni-adjusted confidence intervals of a series of Spearman's rank correlation coefficients, for multiple comparisons. Confidence intervals are computed by bootstrapping.
spearman.cor.multcomp(var1, var2, fact, alpha = 0.05, nrep = 1000)
var1 |
numeric vector (first variable). |
var2 |
numeric vector (second variable). |
fact |
factor (groups). |
alpha |
significance level. |
nrep |
number of replicates for bootstrapping. |
Confidence intervals that do not overlap indicate correlation coefficients significantly different at the alpha level.
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
tab |
data frame of correlation coefficients with confidence intervals |
alpha |
significance level. |
nrep |
number of replicates for bootstrapping. |
Maxime HERVE <[email protected]>
set.seed(1510)
var1 <- c(1:15+rnorm(15,0,2),1:15+rnorm(15,0,2),1:15+rnorm(15,0,2))
var2 <- c(-1:-15+rnorm(15,0,2),1:15+rnorm(15,0,2),1:15+rnorm(15,0,2))
fact <- gl(3,15,labels=LETTERS[1:3])
spearman.cor.multcomp(var1,var2,fact)
# B and C similar but different from A
Divides a data frame randomly, while respecting the relative proportions of the levels of a factor in the original data frame. Each subset has roughly the same number of individuals and roughly the same relative proportions with respect to the levels of the given factor.
splitf(set, fac, k)
set |
a data frame containing values to be divided into groups. |
fac |
a reference factor giving the relative proportions to be respected in each subset of |
k |
an integer giving the number of subsets to be generated. |
A list of subsets of set.
Maxime HERVE <[email protected]>
data(iris)
iris2 <- iris[c(1:50,51:80,101:120),]
# Proportions to be respected
table(iris2$Species)/nrow(iris2)
# Splitting
result <- splitf(iris2,iris2$Species,3)
# All subsets have the same size
lapply(result,nrow)
# And respect the initial proportions
lapply(result,function(x) table(x$Species)/nrow(x))
Centers and scales a data frame. See Details.
stand(tab, ref.tab=NULL, center=NULL, scale=NULL)
tab |
data frame to scale. |
ref.tab |
optional reference data frame, from which centering and scaling parameters are obtained (see Details). |
center |
optional vector of centering parameters (one per column of |
scale |
optional vector of scaling parameters (one per column of |
If ref.tab is not NULL, centering and scaling parameters are taken from this data frame. If it has a "scaled:center" attribute, this attribute is used to center tab; otherwise the means of ref.tab's columns are used. The same applies to the scaling parameters (with the "scaled:scale" attribute and standard deviations).

If ref.tab is NULL, the values of center and scale are used to standardize tab.

If ref.tab and center are NULL, the means of tab's columns are used for centering. If ref.tab and scale are NULL, the standard deviations of tab's columns are used for scaling.
Maxime HERVE <[email protected]>
data(iris)
set.seed(1131)
iris.samp <- iris[sample(1:150,10),1:4]
# Centering parameters of the complete dataset
attr(scale(iris[,1:4]),"scaled:center")
# Centering parameters of the reduced dataset
attr(scale(iris.samp),"scaled:center")
# Standardization based on the reduced dataset only
attr(stand(iris.samp),"scaled:center")
# Standardization based on the complete dataset
attr(stand(iris.samp,iris[,1:4]),"scaled:center")
Tests the significance of the coefficients associated with a given predictor of a model fitted with multinom. Wald tests are used. All coefficients are generated and tested by building models with different reference classes (for the response, but also for qualitative predictors with more than 2 levels).
test.multinom(model, variable)
model |
object of class |
variable |
any predictor present in |
Maxime HERVE <[email protected]>
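This entry has no example block; the following sketch, built on fictitious data, illustrates a typical call (it assumes the predictor is passed unquoted; check the function's help if a character name is expected instead):

require(nnet)
set.seed(42)
tab <- data.frame(
  resp = factor(sample(LETTERS[1:3],90,replace=TRUE)),
  fact = gl(3,30,labels=letters[1:3]),
  covar = rnorm(90)
)
model <- multinom(resp~fact+covar,data=tab,trace=FALSE)
# Wald tests of the coefficients associated with each predictor
test.multinom(model,fact)
test.multinom(model,covar)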
Converts some ordinations performed with the vegan package to objects compatible with coinertia.
to.dudi(ord)
ord |
an ordination (see Details). |
The function supports:
- PCA computed from rda. If the data were scaled (prior to the analysis or using the scale argument of rda), it is assumed that it was with the standard deviation using n-1; as in dudi.pca, to.dudi rescales the data with the standard deviation using n.
- PCoA computed from wcmdscale, capscale or dbrda.
- CA computed from cca.
Maxime HERVE <[email protected]>
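This entry has no example block; the sketch below illustrates the intended workflow with two PCAs computed in vegan (the co-inertia step is shown only to indicate where the converted objects are meant to be used):

require(vegan)
require(ade4)
data(varespec)
data(varechem)
# Two PCAs computed with vegan...
pca.chem <- rda(varechem,scale=TRUE)
pca.spe <- rda(varespec)
# ...converted to dudi-compatible objects
pca.chem.dudi <- to.dudi(pca.chem)
pca.spe.dudi <- to.dudi(pca.spe)
# The converted objects can then be passed to coinertia()
coinertia(pca.chem.dudi,pca.spe.dudi,scannf=FALSE,nf=2)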
Returns a function usable by emmeans for user-defined contrasts.
user.cont(cont)
cont |
any matrix of contrasts (see 'Details'). |
In these matrices, each row is a comparison (= contrast) and each column is a level of the factor. Rules for writing contrasts are:
- levels not involved in the comparison must have a value of zero
- levels to be compared must have opposite signs
- levels can be grouped (for example, 2 -1 -1 compares the first level against the group formed by the two others)
- the sum of all values of a contrast must be zero.
user.cont.emmc |
the function to be called by |
Maxime HERVE <[email protected]>
require(car)
require(emmeans)
tab <- data.frame(
  response = c(rpois(30,1),rpois(30,3),rpois(30,10)),
  fact = gl(3,30,labels=LETTERS[1:3])
)
model <- glm(response~fact,family="poisson",data=tab)
Anova(model)
mat <- matrix(c(1,-1,0,0,1,-1,2,-1,-1),nrow=3,byrow=TRUE,dimnames=list(1:3,levels(tab$fact)))
mat
cont.emmc <- user.cont(mat)
EMM <- emmeans(model,~fact)
contrast(EMM,"cont")
Performs pairwise comparisons of proportions to theoretical values.
wald.ptheo.multinom.test(x, p, p.method = "fdr")
x |
either a factor or a matrix with K columns giving the counts for each of the K classes. |
p |
theoretical proportions. |
p.method |
method for p-values correction. See help of |
The function builds K logistic regressions (in each case considering one class vs. the sum of all the others) and uses wald.ptheo.test to test the hypothesis that the proportion of this class is equal to the corresponding theoretical proportion in p.
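As a minimal sketch of this logic (reusing the response defined in the example below), the test for a single class should match an uncorrected wald.ptheo.test on the corresponding binarized response:

response <- factor(rep(LETTERS[1:4],c(20,40,42,13)))
# Class "A" vs. all the other classes pooled, compared to its theoretical proportion
wald.ptheo.test(as.numeric(response=="A"),p=0.15)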
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
observed |
observed proportions. |
expected |
theoretical proportions. |
p.adjust.method |
method for p-values correction. |
statistic |
statistics of each test. |
p.value2 |
corrected p-values. |
p.value |
data frame of results. |
Maxime HERVE <[email protected]>
wald.ptheo.test, prop.multinom
response <- factor(rep(LETTERS[1:4],c(20,40,42,13)))
wald.ptheo.multinom.test(response,p=c(0.15,0.25,0.3,0.3))
Performs a Wald test for comparison of a proportion to a theoretical value.
wald.ptheo.test(y, blocks = NULL, p = 0.5)
y |
either a binary response (numeric vector or factor, with only two possible values except NA) or a two-column matrix with the columns giving the numbers of successes (left) and failures (right). |
blocks |
optional blocking (random) factor. |
p |
hypothesized probability of success. |
The function builds a logistic (mixed) regression and applies a Wald test to compare the estimated value of the intercept to its theoretical value under H0. Possible overdispersion is taken into account, by using a quasi-binomial distribution when there are no blocks or by introducing an individual-level random factor when blocks are present.
If the response is a 0/1 vector, the probability of the '1' group is tested. With other vectors, the response is transformed into a factor and the probability of the second level is tested.
If the response is a two-column matrix, the probability of the left column is tested.
If the response is a vector and no blocking factor is present, the exact binomial test performed by binom.test should be preferred, since it is an exact test whereas the Wald test is an approximate test.
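As a minimal sketch of the two-column response form described above (the counts are hypothetical), with successes in the left column and failures in the right:

counts <- cbind(success=c(8,12,10),failure=c(12,8,10))
wald.ptheo.test(counts,p=0.4)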
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
statistic |
test statistics of the test. |
p.value |
p-value of the test. |
estimate |
the estimated proportion (calculated without taking into account the blocking factor, if present). |
alternative |
a character string describing the alternative hypothesis, always |
null.value |
the value of the proportion under the null hypothesis. |
parameter |
the degrees of freedom for the t-statistic, only with overdispersion and no blocks. |
Maxime HERVE <[email protected]>
set.seed(2006)
response <- sample(0:1,60,replace=TRUE)
# Comparison to p=0.5
wald.ptheo.test(response)
# Comparison to p=0.8
wald.ptheo.test(response,p=0.8)
# With a blocking factor
require(lme4)
blocks <- gl(3,20)
wald.ptheo.test(response,blocks)
Performs nonparametric pairwise comparisons of paired samples using Wilcoxon signed rank tests.
wilcox.paired.multcomp(formula, data, p.method = "fdr")
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
p.method |
method for p-values correction. See help of |
method |
name of the test. |
data.name |
a character string giving the name(s) of the data. |
method |
a character string indicating the name of the test. |
p.adjust.method |
method for p-values correction. |
p.value |
table of results of pairwise comparisons. |
Maxime HERVE <[email protected]>
pairwise.wilcox.test, wilcox.test
response <- c(rnorm(10,0,3),rnorm(10,5,3),rnorm(10,8,2))
fact <- gl(3,10,labels=LETTERS[1:3])
block <- gl(10,1,30,labels=letters[1:10])
friedman.test(response~fact|block)
wilcox.paired.multcomp(response~fact|block)
Performs a Wilcoxon sign test to compare medians of two paired samples or one median to a given value.
wilcox.signtest(x, ...)

## Default S3 method:
wilcox.signtest(x, y = NULL, mu = 0, conf.level = 0.95, ...)

## S3 method for class 'formula'
wilcox.signtest(formula, data, subset, ...)
x |
a numeric vector of data values. |
y |
an optional numeric vector of data values (for paired two-sample test). |
mu |
theoretical median (one-sample test) or theoretical median of |
conf.level |
confidence level of the interval. |
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula |
subset |
an optional vector specifying a subset of observations to be used. |
... |
further arguments to be passed to or from other methods. |
If zeroes (i.e. null differences with mu) are present, the median of the data different from mu is tested in the one-sample situation; the median of the x-y differences different from mu in the two-sample situation.
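A minimal sketch of this zero handling (it assumes values equal to mu are simply discarded and the median of the remaining data is tested):

x <- c(2,2,3,5,7,8)
# the two values equal to mu = 2 are assumed not to be counted
wilcox.signtest(x,mu=2)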
method |
a character string indicating the name of the test. |
data.name |
a character string giving the name(s) of the data. |
null.value |
the specified hypothesized value of the median or median difference depending on the test performed. |
p.value |
the p-value of the test. |
alternative |
a character string giving the alternative hypothesis, always |
estimate |
the estimated median or median of |
conf.int |
a confidence interval for the median tested. |
Maxime HERVE <[email protected]>
set.seed(1706)
response <- c(rnorm(7,3,1.5),rnorm(7,5.5,2))
# Comparison of 2 samples
fact <- gl(2,7,labels=LETTERS[1:2])
wilcox.signtest(response~fact)
# Comparison to a given value
theo <- 4
wilcox.signtest(response,mu=theo)
Computes the weighted arithmetic mean of a vector.
wmean(x, w = rep(1, length(x)), na.rm = TRUE)
x |
numeric vector. |
w |
numeric vector of weights. |
na.rm |
a logical value indicating whether |
Maxime HERVE <[email protected]>
mean(1:10)
wmean(1:10,w=10:1)