Package 'PhylogeneticEM' reference manual

Title:	Automatic Shift Detection using a Phylogenetic EM
Description:	Implementation of the automatic shift detection method for Brownian Motion (BM) or Ornstein–Uhlenbeck (OU) models of trait evolution on phylogenies. Some tools to handle equivalent shifts configurations are also available. See Bastide et al. (2017) <doi:10.1111/rssb.12206> and Bastide et al. (2018) <doi:10.1093/sysbio/syy005>.
Authors:	Paul Bastide [aut, cre], Mahendra Mariadassou [ctb]
Maintainer:	Paul Bastide <[email protected]>
License:	GPL (>= 2) \| file LICENSE
Version:	1.8.0
Built:	2025-02-18 14:12:25 UTC
Source:	https://github.com/pbastide/phylogeneticem

Allocation of regimes to nodes.

Description

allocate_regimes_from_shifts allocate a number (from 0 to the number of shifts) to each node, corresponding to its regime : all nodes below shift i are numbered by i.

Usage

allocate_regimes_from_shifts(phylo, shifts_edges)
allocate_regimes_from_shifts(phylo, shifts_edges)

Arguments

`phylo`	a phylogenetic tree, class `phylo`.
`shifts_edges`	edges were the shifts are.

Value

Vector of size (ntaxa + Nnode) of the regimes of each node and tip.

Allocation of shifts to edges

Description

allocate_shifts_from_regimes returns the position of the shifts induced by the allocation of the regimes. Only works in an "infinite site" model.

Usage

allocate_shifts_from_regimes(phylo, regimes)
allocate_shifts_from_regimes(phylo, regimes)

Arguments

`phylo`	a phylogenetic tree, class `phylo`.
`regimes`	: vector of size (ntaxa + Nnode) of the regimes of each node and tip.

Value

Vector of edges numbers where the shifts are.

Check Parsimony, assuming no homoplasy

Description

check_parsimony take a vector of shifts edges, and check whether the number of groups of the tips induced by this allocation is exactly the number of shifts plus one. This is equivalent to parsimony when there is no homoplasy (i.e. no convergent regimes).

Usage

check_parsimony(tree, edges, ...)
check_parsimony(tree, edges, ...)

Arguments

`tree`	phylogenetic tree
`edges`	a vector of edges of the tree, where the shifts are
`...`	possibly, a list giving the descendant tips of each edge

Details

This function computes explicitly the clustering of the tips, using function clusters_from_shifts. By default, this function uses enumerate_tips_under_edges to compute the list of tips under each edge, but a list can be provided (to avoid extra computation, if many tests on the same tree are done).

Value

boolean : TRUE if the allocation is parsimonious.

Clustering associated to a shift allocation, assuming no homoplasy.

Description

clusters_from_shifts take a vector of shifts edges, and gives the clustering of the tips induced by them, in a "no homoplasy" model (i.e. no convergence is allowed).

Usage

clusters_from_shifts(tree, edges, part.list = enumerate_tips_under_edges(tree))
clusters_from_shifts(tree, edges, part.list = enumerate_tips_under_edges(tree))

Arguments

`tree`	phylogenetic tree
`edges`	a vector of edges of the tree, where the shifts are
`part.list`	a list giving the descendant tips of each edge

Details

By default, this function uses enumerate_tips_under_edges to compute the list of tips under each edge.

Value

list of size n+m-1, entry i is the vector of tips bellow edge i.

Computation of the optimal values at nodes and tips.

Description

compute_betas_from_shifts computes the optimal values at the nodes and tips of the tree, given the value at the root and the list of shifts occurring in the tree. It assumes an OU model.

Usage

compute_betas_from_shifts(phylo, optimal.value, shifts)
compute_betas_from_shifts(phylo, optimal.value, shifts)

Arguments

`phylo`	a phylogenetic tree, class `phylo`.
`optimal.value`	the optimal value at the root of the tree.
`shifts`	position and values of the shifts .

Details

Note that this is intended to be an internal function, and should not be used. In general, use node_optimal_values to get optimal values from a set of parameters.

Value

Vector of size (ntaxa + Nnode) of the optimal values at the tips of the tree.

Phylogenetic Distances

Description

compute_dist_phy computes the phylogenetic distances d_ij between all the tips i, j.

Usage

compute_dist_phy(phy)
compute_dist_phy(phy)

Arguments

phy

a phylogenetic tree of class phylo.

Details

This function relies on ape function dist.nodes.

Value

a matrix of phylogenetic distances, ordered as the tips of the tree. The matrix is of type symmetricMatrix-class.

Computation of shifts from the vector of optimal values

Description

compute_shifts_from_betas computes the list of shifts corresponding to the vector of optimal values on nodes.

Usage

compute_shifts_from_betas(phylo, betas)
compute_shifts_from_betas(phylo, betas)

Arguments

`phylo`	a phylogenetic tree, class `phylo`.
`betas`	vector of size (ntaxa + Nnode) of the optimal values at each node and tip.

Details

This function uses function fun on each row of matrix of edges.

Value

vector of shifts.

Compute the stationary variance matrix

Description

compute_stationary_variance computes the stationary variance matrix of an OU process.

Usage

compute_stationary_variance(variance, selection.strength)
compute_stationary_variance(variance, selection.strength)

Arguments

`variance`	the variance (rate matrix) of the process.
`selection.strength`	the selection strength (alpha) matrix of the process.

Value

A positive definite Matrix of class dpoMatrix-class.

Common Ancestors Times

Description

compute_times_ca computes the times t_ij between the root and the common ancestor of two tips i, j.

Usage

compute_times_ca(phy)
compute_times_ca(phy)

Arguments

phy

a phylogenetic tree of class phylo.

Details

This function relies on ape functions node.depth.edgelength and mrca.

Value

a matrix of times of shared evolution, ordered as the tips of the tree. The matrix is of type symmetricMatrix-class.

Correspondence between edges numbers

Description

correspondenceEdges takes edges numbers on an input tree, and gives back their corresponding numbers on the output tree.

Usage

correspondenceEdges(edges, from, to)
correspondenceEdges(edges, from, to)

Arguments

`edges`	vector of index of edges in the tree "from"
`from`	initial input tree (format "`phylo`")
`to`	aimed output tree (format "`phylo`")

Value

vector of index of edges in the tree "to"

Make the result lighter

Description

enlight.PhyloEM takes an object of class PhyloEM, and returns the same object, without saving the quantities that can be easily re-computed using function imputed_traits.PhyloEM.

Usage

enlight(x)

## S3 method for class 'PhyloEM'
enlight(x)
enlight(x)

## S3 method for class 'PhyloEM'
enlight(x)

Arguments

`x`	an object of class `PhyloEM`.

Details

The resulting object can be much lighter, saving a lot of memory space, but each call to the function imputed_traits.PhyloEM will be longer. As function plot.PhyloEM relies on this function, this makes the plotting also longer. This has the same effect as setting the option "light_result=TRUE" in the call of PhyloEM.

Value

Same as entry, lighter.

Methods (by class)

enlight(PhyloEM): PhyloEM object

Enumerate all the possible regime allocations, given a clustering of the tips.

Description

enumerate_parsimony enumerate all the equivalent allocation of the regimes in the tree, a clustering of the tips being given. The number of such equivalent regimes is given by parsimonyNumber (which is faster).

Usage

enumerate_parsimony(phylo, clusters = rep(1, length(phylo$tip.label)))
enumerate_parsimony(phylo, clusters = rep(1, length(phylo$tip.label)))

Arguments

`phylo`	a phylogenetic tree, class `phylo`.
`clusters`	a vector representing the group of each tip. (Default to only one group with all the tips.)

Details

Function extract.enumerate_parsimony furnishes the result in a human readable form (for any subtree). Function plot.enumerate_parsimony plots all the solutions found on the tree.

Value

an S3 object of class "enumerate_parsimony", with:

nbrReconstructions: an object of class "parsimonyCost", result of function parsimonyCost.
allocations: a list of size Nnode + ntaxa. Each entry i of the list represents the solutions for the subtree starting at node i. It is a list with nclus entries, each entry being a matrix. A line of the kth matrix for the ith node is one possible allocation of the shifts, starting with regime k for node i.
phylo: the entry phylogenetic tree

Examples

tree <- read.tree(text="(((A,B),C),D);")
plot(tree)
clusters <- c(0, 1, 2, 2)
sols <- enumerate_parsimony(tree, clusters)
plot(sols)

## Extract the parsimonious solutions from the root
extract(sols) # each line is a solution, with states of each node

## Extract the number of solutions from the root
extract(sols, what = "number")
extract(parsimonyNumber(tree, clusters)) # same result, more efficient

## Extract the cost of the solutions from the root
extract(sols, what = "cost")
extract(parsimonyCost(tree, clusters)) # same result, more efficient:

## Extract for the sub-tree below node 7
extract(sols, 7) # NAs: non-existing nodes in the sub-tree

tree <- read.tree(text="(((A,B),C),D);")
plot(tree)
clusters <- c(0, 1, 2, 2)
sols <- enumerate_parsimony(tree, clusters)
plot(sols)

## Extract the parsimonious solutions from the root
extract(sols) # each line is a solution, with states of each node

## Extract the number of solutions from the root
extract(sols, what = "number")
extract(parsimonyNumber(tree, clusters)) # same result, more efficient

## Extract the cost of the solutions from the root
extract(sols, what = "cost")
extract(parsimonyCost(tree, clusters)) # same result, more efficient:

## Extract for the sub-tree below node 7
extract(sols, 7) # NAs: non-existing nodes in the sub-tree

Tips descendants of nodes.

Description

enumerate_tips_under_edges gives, for each edge of the tree, the labels of the tips that have this edge as an ancestor.

Usage

enumerate_tips_under_edges(tree)
enumerate_tips_under_edges(tree)

Arguments

tree

phylogenetic tree, class phylo.

Details

This function uses function prop.part from package ape.

Value

list of size Nedge, entry i is the vector of tips bellow edge i.

Find all equivalent shifts allocations and values.

Description

equivalent_shifts computes the equivalent shifts positions and their corresponding values, assuming an ultrametric tree.

Usage

equivalent_shifts(
  phylo,
  params,
  T_tree = incidence.matrix(phylo),
  part.list = enumerate_tips_under_edges(phylo),
  times_shared = NULL
)
equivalent_shifts(
  phylo,
  params,
  T_tree = incidence.matrix(phylo),
  part.list = enumerate_tips_under_edges(phylo),
  times_shared = NULL
)

Arguments

`phylo`	a phylogenetic tree, of class `phylo`.
`params`	an object of class `params_process`, result inference by function `PhyloEM`, or constructed through function `params_process`
`T_tree`	(optional) matrix of incidence of the tree, result of function `incidence.matrix`
`part.list`	(optional) list of partition of the tree, result of function `enumerate_tips_under_edges`.
`times_shared`	(optional) a matrix, result of function `compute_times_ca`.

Details

This function is only valid for ultrametric trees, and for models: BM, OU with fixed root or stationary root. It assumes that there are no homoplasies.

Value

object of class equivalent_shifts, with entries:

eq_shifts_edges: matrix of equivalent shifts
shifts_and_betas: matrix of corresponding shifts values
phylo: the entry phylogenetic tree
p: the dimension

Examples

if (requireNamespace("TreeSim", quietly = TRUE)) {
  ## Simualte a tree
  set.seed(17920902)
  ntaxa = 20
  phylo <- TreeSim::sim.bd.taxa.age(n = ntaxa, numbsim = 1, lambda = 0.1,
                                   mu = 0, age = 1, mrca = TRUE)[[1]]
  
  ## Define parameters (BM, fixed root)
  params <- params_BM(p = 4, edges = c(6, 17, 31),
                     values = cbind(1:4, -(1:4), rep(1, 4)))
  ## Find equivalent solutions and plot them
  eq_shifts <- equivalent_shifts(phylo, params)
  eq_shifts
  plot(eq_shifts)
  ## Extract the values
  # Shifts values for trait 2, for the three shifts (rows), and three solutions (columns)
  extract(eq_shifts, trait = 2, what = "shifts_values")
  # Root values for trait 4, for the tree solutions (columns)
  extract(eq_shifts, trait = 4, what = "root_values")
  ## Define parameters (OU, stationary root
  params <- params_OU(p = 4, edges = c(6, 17, 31),
                     selection.strength = 0.1,
                     values = cbind(1:4, -(1:4), rep(1, 4)),
                     random = TRUE)
  ## Find equivalent solutions and plot them
  eq_shifts <- equivalent_shifts(phylo, params)
  eq_shifts
  plot(eq_shifts)
  ## Extract the values
  # Shifts values for trait 2, for the three shifts (rows), and three solutions (columns)
  extract(eq_shifts, trait = 2, what = "shifts_values")
  # Root values for trait 4, for the three solutions (columns)
  extract(eq_shifts, trait = 4, what = "root_values")
}

if (requireNamespace("TreeSim", quietly = TRUE)) {
  ## Simualte a tree
  set.seed(17920902)
  ntaxa = 20
  phylo <- TreeSim::sim.bd.taxa.age(n = ntaxa, numbsim = 1, lambda = 0.1,
                                   mu = 0, age = 1, mrca = TRUE)[[1]]
  
  ## Define parameters (BM, fixed root)
  params <- params_BM(p = 4, edges = c(6, 17, 31),
                     values = cbind(1:4, -(1:4), rep(1, 4)))
  ## Find equivalent solutions and plot them
  eq_shifts <- equivalent_shifts(phylo, params)
  eq_shifts
  plot(eq_shifts)
  ## Extract the values
  # Shifts values for trait 2, for the three shifts (rows), and three solutions (columns)
  extract(eq_shifts, trait = 2, what = "shifts_values")
  # Root values for trait 4, for the tree solutions (columns)
  extract(eq_shifts, trait = 4, what = "root_values")
  ## Define parameters (OU, stationary root
  params <- params_OU(p = 4, edges = c(6, 17, 31),
                     selection.strength = 0.1,
                     values = cbind(1:4, -(1:4), rep(1, 4)),
                     random = TRUE)
  ## Find equivalent solutions and plot them
  eq_shifts <- equivalent_shifts(phylo, params)
  eq_shifts
  plot(eq_shifts)
  ## Extract the values
  # Shifts values for trait 2, for the three shifts (rows), and three solutions (columns)
  extract(eq_shifts, trait = 2, what = "shifts_values")
  # Root values for trait 4, for the three solutions (columns)
  extract(eq_shifts, trait = 4, what = "root_values")
}

Perform One EM

Description

EstimateEM performs one EM for one given number of shifts. It is called from function PhyloEM. Its use is mostly internal, and most user should not need it.

Usage

estimateEM(
  phylo,
  Y_data,
  Y_data_imp = Y_data,
  process = c("BM", "OU", "scOU", "rBM"),
  independent = FALSE,
  tol_EM = list(variance = 10^(-2), value.root = 10^(-2), exp.root = 10^(-2), var.root =
    10^(-2), selection.strength = 10^(-2), normalized_half_life = 10^(-2), log_likelihood
    = 10^(-2)),
  Nbr_It_Max = 500,
  method.variance = c("simple", "upward_downward"),
  method.init = c("default", "lasso"),
  method.init.alpha = c("default", "estimation"),
  method.init.alpha.estimation = c("regression", "regression.MM", "median"),
  nbr_of_shifts = 0,
  random.root = TRUE,
  stationary.root = TRUE,
  alpha_known = FALSE,
  eps = 10^(-3),
  known.selection.strength = 1,
  init.selection.strength = 1,
  max_selection.strength = 100,
  use_sigma_for_lasso = TRUE,
  max_triplet_number = 10000,
  min_params = list(variance = 0, value.root = -10^(5), exp.root = -10^(5), var.root = 0,
    selection.strength = 0),
  max_params = list(variance = 10^(5), value.root = 10^(5), exp.root = 10^(5), var.root =
    10^(5), selection.strength = 10^(5)),
  var.init.root = diag(1, nrow(Y_data)),
  variance.init = diag(1, nrow(Y_data), nrow(Y_data)),
  methods.segmentation = c("lasso", "same_shifts", "best_single_move"),
  check.tips.names = FALSE,
  times_shared = NULL,
  distances_phylo = NULL,
  subtree.list = NULL,
  T_tree = NULL,
  U_tree = NULL,
  h_tree = NULL,
  F_moments = NULL,
  tol_half_life = TRUE,
  warning_several_solutions = TRUE,
  convergence_mode = c("relative", "absolute"),
  check_convergence_likelihood = TRUE,
  sBM_variance = FALSE,
  method.OUsun = c("rescale", "raw"),
  K_lag_init = 0,
  allow_negative = FALSE,
  trait_correlation_threshold = 0.9,
  ...
)
estimateEM(
  phylo,
  Y_data,
  Y_data_imp = Y_data,
  process = c("BM", "OU", "scOU", "rBM"),
  independent = FALSE,
  tol_EM = list(variance = 10^(-2), value.root = 10^(-2), exp.root = 10^(-2), var.root =
    10^(-2), selection.strength = 10^(-2), normalized_half_life = 10^(-2), log_likelihood
    = 10^(-2)),
  Nbr_It_Max = 500,
  method.variance = c("simple", "upward_downward"),
  method.init = c("default", "lasso"),
  method.init.alpha = c("default", "estimation"),
  method.init.alpha.estimation = c("regression", "regression.MM", "median"),
  nbr_of_shifts = 0,
  random.root = TRUE,
  stationary.root = TRUE,
  alpha_known = FALSE,
  eps = 10^(-3),
  known.selection.strength = 1,
  init.selection.strength = 1,
  max_selection.strength = 100,
  use_sigma_for_lasso = TRUE,
  max_triplet_number = 10000,
  min_params = list(variance = 0, value.root = -10^(5), exp.root = -10^(5), var.root = 0,
    selection.strength = 0),
  max_params = list(variance = 10^(5), value.root = 10^(5), exp.root = 10^(5), var.root =
    10^(5), selection.strength = 10^(5)),
  var.init.root = diag(1, nrow(Y_data)),
  variance.init = diag(1, nrow(Y_data), nrow(Y_data)),
  methods.segmentation = c("lasso", "same_shifts", "best_single_move"),
  check.tips.names = FALSE,
  times_shared = NULL,
  distances_phylo = NULL,
  subtree.list = NULL,
  T_tree = NULL,
  U_tree = NULL,
  h_tree = NULL,
  F_moments = NULL,
  tol_half_life = TRUE,
  warning_several_solutions = TRUE,
  convergence_mode = c("relative", "absolute"),
  check_convergence_likelihood = TRUE,
  sBM_variance = FALSE,
  method.OUsun = c("rescale", "raw"),
  K_lag_init = 0,
  allow_negative = FALSE,
  trait_correlation_threshold = 0.9,
  ...
)

Arguments

`phylo`	A phylogenetic tree of class `phylo` (from package `ape`).
`Y_data`	Matrix of data at the tips, size p x ntaxa. Each line is a trait, and each column is a tip. The column names are checked against the tip names of the tree.
`Y_data_imp`	(optional) imputed data if previously computed, same format as `Y_data`. Mostly here for internal calls.
`process`	The model used for the fit. One of "BM" (for a full BM model, univariate or multivariate); "OU" (for an OU with independent traits, univariate or multivariate); or "scOU" (for a "scalar OU" model, see details).
`independent`	Are the trait assumed to be independent from one another? Default to FALSE. OU in a multivariate setting only works if TRUE.
`tol_EM`	the tolerance for the convergence of the parameters. A named list, with items: variance default to 10^(-2) value.root default to 10^(-2) exp.root default to 10^(-2) var.root default to 10^(-2) selection.strength default to 10^(-2) normalized_half_life default to 10^(-2) log_likelihood default to 10^(-2)
`Nbr_It_Max`	the maximal number of iterations of the EM allowed. Default to 500 iterations.
`method.variance`	Algorithm to be used for the moments computations at the E step. One of "simple" for the naive method; of "upward_downward" for the Upward Downward method (usually faster). Default to "upward_downward".
`method.init`	The initialization method. One of "lasso" for the LASSO base initialization method; or "default" for user-specified initialization values. Default to "lasso".
`method.init.alpha`	For OU model, initialization method for the selection strength alpha. One of "estimation" for a cherry-based initialization, using `nlrob`; or "default" for user-specified initialization values. Default to "estimation".
`method.init.alpha.estimation`	If method.init.alpha="estimation", choice of the estimation(s) methods to be used. Choices among "regression", (method="M" is passed to `nlrob`); "regression.MM" (method="MM" is passed to `nlrob`) or "median" (`nlrob` is not used, a simple median is taken). Default to all of them.
`nbr_of_shifts`	the number of shifts allowed.
`random.root`	whether the root is assumed to be random (TRUE) of fixed (FALSE). Default to TRUE
`stationary.root`	whether the root is assumed to be in the stationary state. Default to TRUE.
`alpha_known`	is the selection strength assumed to be known ? Default to FALSE.
`eps`	tolerance on the selection strength value before switching to a BM. Default to 10^(-3).
`known.selection.strength`	if `alpha_known=TRUE`, the value of the known selection strength.
`init.selection.strength`	(optional) a starting point for the selection strength value.
`max_selection.strength`	the maximal value allowed of the selection strength. Default to 100.
`use_sigma_for_lasso`	whether to use the first estimation of the variance matrix in the lasso regression. Default to TRUE.
`max_triplet_number`	for the initialization of the selection strength value (when estimated), the maximal number of triplets of tips to be considered.
`min_params`	a named list containing the minimum allowed values for the parameters. If the estimation is smaller, then the EM stops, and is considered to be divergent. Default values: variance default to 0 value.root default to -10^(5) exp.root default to -10^(5) var.root default to 0 selection.strength default to 0
`max_params`	a named list containing the maximum allowed values for the parameters. If the estimation is larger, then the EM stops, and is considered to be divergent. Default values: variance default to 10^(5) value.root default to 10^(5) exp.root default to 10^(5) var.root default to 10^(5) selection.strength default to 10^(5)
`var.init.root`	optional initialization value for the variance of the root.
`variance.init`	optional initialization value for the variance.
`methods.segmentation`	For OU, method(s) used at the M step to find new candidate shifts positions. Choices among "lasso" for a LASSO-based algorithm; and "best_single_move" for a one-move at a time based heuristic. Default to both of them. Using only "lasso" might speed up the function a lot.
`check.tips.names`	whether to check the tips names of the tree against the column names of the data. Default to TRUE.
`times_shared`	(optional) times of shared ancestry of all nodes and tips, result of function `compute_times_ca`
`distances_phylo`	(optional) phylogenetic distances, result of function `compute_dist_phy`.
`subtree.list`	(optional) tips descendants of all the edges, result of function `enumerate_tips_under_edges`.
`T_tree`	(optional) matrix of incidence of the tree, result of function `incidence.matrix`.
`U_tree`	(optional) full matrix of incidence of the tree, result of function `incidence.matrix.full`.
`h_tree`	(optional) total height of the tree.
`F_moments`	(optional, internal)
`tol_half_life`	should the tolerance criterion be applied to the phylogenetic half life (TRUE, default) or to the raw selection strength ?
`warning_several_solutions`	whether to issue a warning if several equivalent solutions are found (default to TRUE).
`convergence_mode`	one of "relative" (the default) or "absolute". Should the tolerance be applied to the raw parameters, or to the renormalized ones ?
`check_convergence_likelihood`	should the likelihood be taken into consideration for convergence assessment ? (default to TRUE).
`sBM_variance`	Is the root of the BM supposed to be random and "stationary"? Used for BM equivalent computations. Default to FALSE.
`method.OUsun`	Method to be used in univariate OU. One of "rescale" (rescale the tree to fit a BM) or "raw" (directly use an OU, only available for univariate processes).
`K_lag_init`	Number of extra shifts to be considered at the initialization step. Increases the accuracy, but can make computations quite slow of taken too high. Default to 5.
`allow_negative`	whether to allow negative values for alpha (Early Burst). See documentation of `PhyloEM` for more details. Default to FALSE.
`trait_correlation_threshold`	the trait correlation threshold to stop the analysis. Default to 0.9.
`...`	Further arguments to be passed to `estimateEM`, including tolerance parameters for stopping criteria, maximal number of iterations, etc.

Details

See documentation of PhyloEM for further details. All the parameters monitoring the EM (like tol_EM, Nbr_It_Max, etc.) can be called from PhyloEM.

Value

An object of class EstimateEM.

Extraction function

Description

extract the needed quantities out of an S3 object.

Usage

extract(x, ...)
extract(x, ...)

Arguments

`x`	an S3 object.
`...`	further arguments to be passed to the specific method.

Value

An integer giving the number of equivalent parsimonious solutions.

Extract the result of `enumerate_parsimony` at a node.

Description

extract.enumerate_parsimony returns a matrix containing all the possible regime allocations for the nodes of a given subtree.

Usage

## S3 method for class 'enumerate_parsimony'
extract(
  x,
  node = attr(x$allocations, "ntaxa") + 1,
  what = c("solutions", "number", "cost"),
  ...
)
## S3 method for class 'enumerate_parsimony'
extract(
  x,
  node = attr(x$allocations, "ntaxa") + 1,
  what = c("solutions", "number", "cost"),
  ...
)

Arguments

`x`	an object of class "`enumerate_parsimony`", result of function `enumerate_parsimony`.
`node`	the node where to retrieve the parsimony number. Default to the root of the tree.
`what`	the quantity to retrieve. Either "solutions" for the full solutions, "number" for the number of solutions, or "cost" for the minimal cost of a solution. Default to "solutions"
`...`	unused

Value

A matrix with ntaxa + Nnode columns, and as many rows as the number of possible parsimonious reconstructions.

Extract the shifts values for one trait.

Description

extract.equivalent_shifts takes an object of class equivalent_shifts, result of function equivalent_shifts, and returns the shifts of root values for a given trait.

Usage

## S3 method for class 'equivalent_shifts'
extract(x, trait = 1, what = c("shifts_values", "root_values"), ...)
## S3 method for class 'equivalent_shifts'
extract(x, trait = 1, what = c("shifts_values", "root_values"), ...)

Arguments

`x`	an object of class `equivalent_shifts`, result of function `equivalent_shifts`
`trait`	the number of the trait to be extracted. Default to 1.
`what`	one of "shifts_values" or "root_values".
`...`	unused.

Value

A matrix with the values of the shifts (what = "shifts_values") or the root (what = "root_values") for the trait for each equivalent configuration. Each column is one configuration.

Extraction of the actual number of solutions.

Description

extract.parsimonyCost takes an object of class "parsimonyCost", result of function parsimonyCost, and computes the minimum cost at the given node.

Usage

## S3 method for class 'parsimonyCost'
extract(x, node = attr(x, "ntaxa") + 1, ...)
## S3 method for class 'parsimonyCost'
extract(x, node = attr(x, "ntaxa") + 1, ...)

Arguments

`x`	an object of class "`parsimonyCost`", result of function `parsimonyCost`.
`node`	the root node of the subtree. By default, the root of the tree.
`...`	unused

Value

An integer giving the minimum cost of the subtree.

Extraction of the actual number of solutions.

Description

extract.parsimonyNumber takes the two matrices computed by parsimonyNumber, and compute the actual number of parsimonious solution for any subtree starting from a given node.

Usage

## S3 method for class 'parsimonyNumber'
extract(
  x,
  node = attr(x$nbrReconstructions, "ntaxa") + 1,
  what = c("number", "cost"),
  ...
)
## S3 method for class 'parsimonyNumber'
extract(
  x,
  node = attr(x$nbrReconstructions, "ntaxa") + 1,
  what = c("number", "cost"),
  ...
)

Arguments

`x`	an object of class "`parsimonyNumber`", result of function `parsimonyNumber`.
`node`	the root node of the subtree. By default, the root of the tree.
`what`	the quantity to retrieve. Either "number" for the number of solutions, or "cost" for the minimal cost of a solution. Default to "number".
`...`	unused

Details

The parsimonious solutions are the one with the minimum number of shifts (that are given by matrix costReconstructions). This function sums the number of solutions (given in matrix nbrReconstructions) that have the minimum number of shifts.

Value

An integer giving the number of equivalent parsimonious solutions.

Extract from object `partitionsNumber`

Description

extract.partitionsNumber extracts the number of partitions for a given sub-tree, either marked or non-marked.

Usage

## S3 method for class 'partitionsNumber'
extract(
  x,
  node = attr(x, "ntaxa") + 1,
  npart = attr(x, "npart"),
  marked = FALSE,
  ...
)
## S3 method for class 'partitionsNumber'
extract(
  x,
  node = attr(x, "ntaxa") + 1,
  npart = attr(x, "npart"),
  marked = FALSE,
  ...
)

Arguments

`x`	an object of class `partitionsNumber`, result of function `partitionsNumber`.
`node`	the root node of the subtree where to get the result. Default to the root of the tree.
`npart`	the number of partitions (colors) allowed at the tips. Default to the value used in the call of function `partitionsNumber` (the maximum).
`marked`	whether to extract the marked (TRUE) or un-marked (FALSE) partitions. The number of models is the number of un-marked partitions. Default to FALSE.
`...`	unused.

Value

the number of partitions with npart colors, on the sub-tree starting at node, marked or not.

Extraction of simulated traits

Description

extract.simul_process takes an object of class "simul_process", result of function simul_process, and extracts the traits values, expectations or optimal values at the tips or the internal nodes.

Usage

## S3 method for class 'simul_process'
extract(
  x,
  where = c("tips", "nodes"),
  what = c("states", "expectations", "optimal.values"),
  ...
)
## S3 method for class 'simul_process'
extract(
  x,
  where = c("tips", "nodes"),
  what = c("states", "expectations", "optimal.values"),
  ...
)

Arguments

`x`	an object of class "`simul_process`", result of function `simul_process`.
`where`	one of "tips" (the default) or "nodes". Where to extract the results.
`what`	one of "states" (the default), "expectation", or "optimal.values".
`...`	unused

Details

Value

A matrix giving the selected quantities at the selected nodes or tips. If the tips or nods are labeled, then the colnames of the matrix are set accordingly.

Find a reasonable grid for alpha

Description

Grid so that 2*ln(2)*quantile(d_ij)/factor_up_alpha < t_1/2 < factor_down_alpha * ln(2) * h_tree, with t_1/2 the phylogenetic half life: t_1/2 = log(2)/alpha. Ensures that for alpha_min, it is almost a BM, and for alpha_max, almost all the tips are decorrelated.

Usage

find_grid_alpha(
  phy,
  alpha = NULL,
  nbr_alpha = 10,
  factor_up_alpha = 2,
  factor_down_alpha = 3,
  quantile_low_distance = 1e-04,
  log_transform = TRUE,
  allow_negative = FALSE,
  ...
)
find_grid_alpha(
  phy,
  alpha = NULL,
  nbr_alpha = 10,
  factor_up_alpha = 2,
  factor_down_alpha = 3,
  quantile_low_distance = 1e-04,
  log_transform = TRUE,
  allow_negative = FALSE,
  ...
)

Arguments

`phy`	phylogenetic tree of class "`phylo`"
`alpha`	fixed vector of alpha values if already known. Default to NULL.
`nbr_alpha`	the number of elements in the grid
`factor_up_alpha`	factor for up scalability
`factor_down_alpha`	factor for down scalability
`quantile_low_distance`	quantile for min distance
`log_transform`	whether to take a log scale for the spacing of alpha values. Default to TRUE.
`allow_negative`	whether to allow negative values for alpha (Early Burst). See documentation of `PhyloEM` for more details. Default to FALSE.
`...`	not used.

Details

If quantile_low_distance=0, then quantile(d_ij)=min(d_ij), and, for any two tips i,j, the correlation between i and j is bounded by exp(-factor_up_alpha/2). Those values of alpha will be used for the re-scaling of the tree, which has an exponential term in exp(2*alpha*h). The function makes sure that this number is below the maximal float allowed (equals to .Machine$double.xmax).

Value

A grid of alpha values

Test for rotation invariant datasets

Description

find_rotation takes two fits from from PhyloEM, and test if their datasets are equal up to a rotation.

Usage

find_rotation(res1, res2, tol = NULL)
find_rotation(res1, res2, tol = NULL)

Arguments

`res1`	an object of class `PhyloEM`.
`res2`	an object of class `PhyloEM`.
`tol`	relative numerical tolerance. Default to `.Machine$double.eps^(0.5)`.

Value

If appropriate, the rotation matrix rot such that dat1 = rot

Get Model Selection Criterion

Description

This function takes an object of class PhyloEM, result of function PhyloEM, and return the values of the model selection criterion for each value of K.

Usage

get_criterion(res, method.selection = NULL)
get_criterion(res, method.selection = NULL)

Arguments

`res`	an object of class `PhyloEM`, result of function `PhyloEM`.
`method.selection`	select the parameters to plot. One of "LINselect", "DDSE", "Djump" or "likelihood" (for un-penalized likelihood). Default to "LINselect". See `params_process.PhyloEM`.

Value

A named vector with the values of the criterion for each number of shift K.

Ancestral State Reconstruction

Description

imputed_traits.PhyloEM takes an object of class PhyloEM, and returns the imputed traits values, either at the internal nodes (ancestral state reconstruction) or at the tips (data imputation)

Usage

imputed_traits(x, ...)

## S3 method for class 'PhyloEM'
imputed_traits(
  x,
  trait = 1,
  save_all = FALSE,
  where = c("nodes", "tips"),
  what = c("imputed", "variances", "expectations"),
  params = NULL,
  method.selection = NULL,
  reconstructed_states = NULL,
  ...
)
imputed_traits(x, ...)

## S3 method for class 'PhyloEM'
imputed_traits(
  x,
  trait = 1,
  save_all = FALSE,
  where = c("nodes", "tips"),
  what = c("imputed", "variances", "expectations"),
  params = NULL,
  method.selection = NULL,
  reconstructed_states = NULL,
  ...
)

Arguments

`x`	an object of class `PhyloEM`.
`...`	further arguments to be passed on to `params_process.PhyloEM`
`trait`	an integer giving the trait to extract. Default to 1.
`save_all`	if TRUE, arguments `where` and `what` are ignored, and all the moments are kept for further extraction with the same function, specifying the argument `reconstructed_states`. Default to FALSE.
`where`	either "nodes" for ancestral state reconstruction, or "tips" for data imputation.
`what`	the quantity to retrieve. Either the imputed traits (default), their conditional variances, or the simple expectations under the selected process.
`params`	(optional) some user-specified parameters. Must be of class `params_process`. If left blank, they are extracted using the `method.selection` argument (see below).
`method.selection`	(optional) the method selection to be used. One of "LINselect", "DDSE", "Djump". Default to "LINselect".
`reconstructed_states`	if the reconstructed states have already been computed (by a previous call of the function, with `save_all=TRUE`), they can be passed on here (avoids multiple computations of the E step).

Value

A matrix or array with the computed quantities.

Methods (by class)

imputed_traits(PhyloEM): PhyloEM object

Incidence matrix of a tree.

Description

incidence.matrix computes the incidence matrix T of a tree : for a lineage i and a branch b, T[i,b]=1 if b is in the lineage i, and 0 otherwise.

Usage

incidence.matrix(phylo)
incidence.matrix(phylo)

Arguments

phylo

a phylogenetic tree, class phylo.

Value

Matrix of incidence, size Nedge x ntaxa.

Incidence matrix of a tree.

Description

incidence.matrix.full computes the incidence matrix U of a tree : for a node i and a branch b, U[i,b]=1 if b is in the lineage i, and 0 otherwise.

Usage

incidence.matrix.full(phylo)
incidence.matrix.full(phylo)

Arguments

phylo

a phylogenetic tree, class phylo.

Value

Matrix of incidence, size ntaxa + Nnode.

Log Likelihood of a fitted object

Description

log_likelihood computes the log likelihood of some parameters.

Usage

log_likelihood(x, ...)

## S3 method for class 'params_process'
log_likelihood(x, Y_data, phylo, ...)

## S3 method for class 'PhyloEM'
log_likelihood(x, ...)
log_likelihood(x, ...)

## S3 method for class 'params_process'
log_likelihood(x, Y_data, phylo, ...)

## S3 method for class 'PhyloEM'
log_likelihood(x, ...)

Arguments

`x`	an object of class `params_process` or `PhyloEM`.
`...`	for a `PhyloEM` object, further arguments to be passed on to `params_process.PhyloEM` (to choose which parameters to extract from the results, see documentation of this function).
`Y_data`	matrix of data at the tips, size p x ntaxa. Each line is a trait, and each column is a tip. The column names are checked against the tip names of the tree.
`phylo`	a phylogenetic tree, class `phylo`. `incidence.matrix.full`. Can be specified to avoid extra computations.

Value

The log likelihood of the data with the provided parameters on the tree.

Methods (by class)

log_likelihood(params_process): params_process object
log_likelihood(PhyloEM): PhyloEM object

Merge PhyloEM fits on various grids of alpha values

Description

merge_alpha_grids takes several fits from PhyloEM, and merge them so as to take into account all alpha values. This can be used to break down computations into smaller chunks to be run independently.

Usage

merge_alpha_grids(...)
merge_alpha_grids(...)

Arguments

...

objects of class PhyloEM fitted on the same dataset with the same parameters, but different grids of alpha values.

Value

An object of class PhyloEM, result of the merge.

Examples

## Not run: 
## Load Data
data(monkeys)
## First fit with coarse grid
res1 <- PhyloEM(Y_data = monkeys$dat,
                phylo = monkeys$phy,
                process = "scOU",
                random.root = TRUE,
                stationary.root = TRUE,
                K_max = 10,
                alpha = c(0.2, 0.3),
                parallel_alpha = TRUE,
                Ncores = 2)
## Second fit with finer grid
res2 <- PhyloEM(Y_data = monkeys$dat,
                phylo = monkeys$phy,
                process = "scOU",
                random.root = TRUE,
                stationary.root = TRUE,
                K_max = 10,
                alpha = c(0.22, 0.24),
                parallel_alpha = TRUE,
                Ncores = 2)
## Thrid fit with redundancies
res3 <- PhyloEM(Y_data = monkeys$dat,
                phylo = monkeys$phy,
                process = "scOU",
                random.root = TRUE,
                stationary.root = TRUE,
                K_max = 10,
                alpha = c(0.26, 0.3),
                parallel_alpha = TRUE,
                Ncores = 2)

## Merge the three
res_merge <- merge_alpha_grids(res1, res2, res3)
## Plot the selected result
plot(res_merge)
## Plot the model selection criterion
plot_criterion(res_merge)

## End(Not run)

## Not run: 
## Load Data
data(monkeys)
## First fit with coarse grid
res1 <- PhyloEM(Y_data = monkeys$dat,
                phylo = monkeys$phy,
                process = "scOU",
                random.root = TRUE,
                stationary.root = TRUE,
                K_max = 10,
                alpha = c(0.2, 0.3),
                parallel_alpha = TRUE,
                Ncores = 2)
## Second fit with finer grid
res2 <- PhyloEM(Y_data = monkeys$dat,
                phylo = monkeys$phy,
                process = "scOU",
                random.root = TRUE,
                stationary.root = TRUE,
                K_max = 10,
                alpha = c(0.22, 0.24),
                parallel_alpha = TRUE,
                Ncores = 2)
## Thrid fit with redundancies
res3 <- PhyloEM(Y_data = monkeys$dat,
                phylo = monkeys$phy,
                process = "scOU",
                random.root = TRUE,
                stationary.root = TRUE,
                K_max = 10,
                alpha = c(0.26, 0.3),
                parallel_alpha = TRUE,
                Ncores = 2)

## Merge the three
res_merge <- merge_alpha_grids(res1, res2, res3)
## Plot the selected result
plot(res_merge)
## Plot the model selection criterion
plot_criterion(res_merge)

## End(Not run)

Merge fits from independent runs of PhyloEM.

Description

merge_rotations takes several fits from PhyloEM, and merge them according to the best score (maximum likelihood or least squares). For each number of shifts, The datasets needs to be equal up to a rotation. This is tested thanks to a QR decomposition, see function find_rotation.

Usage

merge_rotations(..., method.selection = NULL, tol = NULL)
merge_rotations(..., method.selection = NULL, tol = NULL)

Arguments

`...`	objects of class `PhyloEM` fitted on datasets that are equal up to a rotation.
`method.selection`	(optional) selection method to be applied to the merged fit. See `params_process.PhyloEM`.
`tol`	(optional) relative numerical tolerance. See `find_rotation`.

Value

An object of class PhyloEM, result of the merge.

Examples

## Not run: 
## Load Data
data(monkeys)
## Run method
# Note: use more alpha values for better results.
res <- PhyloEM(Y_data = monkeys$dat,        ## data
               phylo = monkeys$phy,         ## phylogeny
               process = "scOU",            ## scalar OU
               random.root = TRUE,          ## root is stationary
               stationary.root = TRUE,
               K_max = 10,                  ## maximal number of shifts
               nbr_alpha = 4,               ## number of alpha values
               parallel_alpha = TRUE,       ## parallelize on alpha values
               Ncores = 2)
## Rotate dataset
rot <- matrix(c(cos(pi/4), -sin(pi/4), sin(pi/4), cos(pi/4)), nrow= 2, ncol = 2)
Yrot <- t(rot) %*% monkeys$dat
rownames(Yrot) <- rownames(monkeys$dat)
## Fit rotated dataset
# Note: use more alpha values for better results.
res_rot <- PhyloEM(Y_data = Yrot,               ## rotated data
                   phylo = monkeys$phy,         
                   process = "scOU",            
                   random.root = TRUE,          
                   stationary.root = TRUE,
                   K_max = 10,                  
                   nbr_alpha = 4,               
                   parallel_alpha = TRUE,       
                   Ncores = 2)
## Merge the two
res_merge <- merge_rotations(res, res_rot)
## Plot the selected result
plot(res_merge)
## Plot the model selection criterion
plot_criterion(res_merge)

## End(Not run)

## Not run: 
## Load Data
data(monkeys)
## Run method
# Note: use more alpha values for better results.
res <- PhyloEM(Y_data = monkeys$dat,        ## data
               phylo = monkeys$phy,         ## phylogeny
               process = "scOU",            ## scalar OU
               random.root = TRUE,          ## root is stationary
               stationary.root = TRUE,
               K_max = 10,                  ## maximal number of shifts
               nbr_alpha = 4,               ## number of alpha values
               parallel_alpha = TRUE,       ## parallelize on alpha values
               Ncores = 2)
## Rotate dataset
rot <- matrix(c(cos(pi/4), -sin(pi/4), sin(pi/4), cos(pi/4)), nrow= 2, ncol = 2)
Yrot <- t(rot) %*% monkeys$dat
rownames(Yrot) <- rownames(monkeys$dat)
## Fit rotated dataset
# Note: use more alpha values for better results.
res_rot <- PhyloEM(Y_data = Yrot,               ## rotated data
                   phylo = monkeys$phy,         
                   process = "scOU",            
                   random.root = TRUE,          
                   stationary.root = TRUE,
                   K_max = 10,                  
                   nbr_alpha = 4,               
                   parallel_alpha = TRUE,       
                   Ncores = 2)
## Merge the two
res_merge <- merge_rotations(res, res_rot)
## Plot the selected result
plot(res_merge)
## Plot the model selection criterion
plot_criterion(res_merge)

## End(Not run)

Model Selection of a fitted object

Description

model_selection does the model selection on a fitted PhyloEM object, and returns the same fitted object.

Usage

model_selection(x, ...)

## S3 method for class 'PhyloEM'
model_selection(
  x,
  method.selection = c("LINselect", "DDSE", "Djump"),
  C.BM1 = 0.1,
  C.BM2 = 2.5,
  C.LINselect = 1.1,
  independent = FALSE,
  ...
)
model_selection(x, ...)

## S3 method for class 'PhyloEM'
model_selection(
  x,
  method.selection = c("LINselect", "DDSE", "Djump"),
  C.BM1 = 0.1,
  C.BM2 = 2.5,
  C.LINselect = 1.1,
  independent = FALSE,
  ...
)

Arguments

`x`	a fitted `PhyloEM` object
`...`	Further arguments to be passed to `estimateEM`, including tolerance parameters for stopping criteria, maximal number of iterations, etc.
`method.selection`	Method selection to be used. Several ones can be used at the same time. One of "LINselect" for the Baraud Giraud Huet LINselect method; "DDSE" for the Slope Heuristic or "Djump" for the Jump Heuristic, last two based the Birgé Massart method.
`C.BM1`	Multiplying constant to be used for the BigeMassart1 method. Need to be positive. Default to 0.1.
`C.BM2`	Multiplying constant to be used for the BigeMassart2 method. Default to 2.5.
`C.LINselect`	Multiplying constant to be used for the LINselect method. Need to be greater than 1. Default to 1.1.
`independent`	Are the trait assumed to be independent from one another? Default to FALSE. OU in a multivariate setting only works if TRUE.

Value

The same object, but with a slot corresponding to the model selection used. See function params_process.PhyloEM to retrieve the selected parameters.

Methods (by class)

model_selection(PhyloEM): PhyloEM object

New World Monkeys dataset

Description

Morphometric dataset and phylogeny for brain shape variation of 50 species of New World monkeys (platyrrhine).

Usage

monkeys
monkeys

Format

A list containing two objects:

phy: The Phylogenetic tree for the platyrrhine species, pruned to match the species in the morphometric dataset
dat: First two PC scores from a PCA of the species-averaged Procrustes coordinates

References

Aristide, L., dos Reis, S. F., Machado, A. C., Lima, I., Lopes, R. T. & Perez, S. I. (2016). Brain shape convergence in the adaptive radiation of New World monkeys. Proceedings of the National Academy of Sciences, 113(8), 2158–2163. http://doi.org/10.1073/pnas.1514473113

Computation of the optimal values at nodes and tips.

Description

compute_betas_from_shifts computes the optimal values at the nodes and tips of the tree, given the value at the root and the list of shifts occurring in the tree. It assumes an OU model.

Usage

node_optimal_values(param, phylo)
node_optimal_values(param, phylo)

Arguments

`param`	an object of class `params_process`.
`phylo`	a phylogenetic tree, class `phylo`.

Value

Matrix of size ntraits x (ntaxa + Nnode) of the optimal values at the node and tips of the tree. Column names correspond to the number of the node in the phylo object.

Examples

set.seed(1792)
ntaxa = 10
tree <- rphylo(ntaxa, 1, 0.1)
# parameters of the process
par <- params_process("BM",                             ## Process
                      p = 2,                            ## Dimension
                      variance = diag(0.5, 2, 2) + 0.5, ## Rate matrix
                      edges = c(4, 10, 15),             ## Positions of the shifts
                      values = cbind(c(5, 4),           ## Values of the shifts
                                     c(-4, -5),
                                     c(5, -3)))
plot(par, phylo = tree, traits = 1, value_in_box = TRUE,
     shifts_bg = "white", root_bg = "white", ancestral_as_shift = TRUE, root_adj = 5)
nodelabels()
node_optimal_values(par, tree)

set.seed(1792)
ntaxa = 10
tree <- rphylo(ntaxa, 1, 0.1)
# parameters of the process
par <- params_process("BM",                             ## Process
                      p = 2,                            ## Dimension
                      variance = diag(0.5, 2, 2) + 0.5, ## Rate matrix
                      edges = c(4, 10, 15),             ## Positions of the shifts
                      values = cbind(c(5, 4),           ## Values of the shifts
                                     c(-4, -5),
                                     c(5, -3)))
plot(par, phylo = tree, traits = 1, value_in_box = TRUE,
     shifts_bg = "white", root_bg = "white", ancestral_as_shift = TRUE, root_adj = 5)
nodelabels()
node_optimal_values(par, tree)

Create an object `params_process` for a BM

Description

params_BM creates a coherent object params_process from user provided values of the parameters. Non specified parameters are set to default values.

Usage

params_BM(
  p = 1,
  variance = diag(1, p, p),
  random = FALSE,
  value.root = rep(0, p),
  exp.root = rep(0, p),
  var.root = diag(1, p, p),
  edges = NULL,
  values = matrix(0, p, length(edges)),
  relativeTimes = NULL,
  nbr_of_shifts = length(edges),
  phylo = NULL,
  sBM_variance = FALSE,
  trait_names = NULL,
  ...
)
params_BM(
  p = 1,
  variance = diag(1, p, p),
  random = FALSE,
  value.root = rep(0, p),
  exp.root = rep(0, p),
  var.root = diag(1, p, p),
  edges = NULL,
  values = matrix(0, p, length(edges)),
  relativeTimes = NULL,
  nbr_of_shifts = length(edges),
  phylo = NULL,
  sBM_variance = FALSE,
  trait_names = NULL,
  ...
)

Arguments

`p`	the dimension (number of traits) of the parameters. Default to 1.
`variance`	the variance (rate matrix) of the BM. Default to `diag(1, p, p)`.
`random`	whether the root of the BM is random (TRUE) or fixed (FALSE). Default to FALSE.
`value.root`	if random=FALSE, the root value. Default to 0.
`exp.root`	if random=TRUE, the root expectation. Default to 0.
`var.root`	if random=TRUE, the root variance. Default to `diag(1, p, p)`.
`edges`	a vector of edges where the shifts occur. Default to NULL (no shift).
`values`	a matrix of shift values, with p lines and as many columns as the number of shifts. Each column is the p values for one shift. Default to `matrix(0, p, length(edges))`.
`relativeTimes`	(unused) the relative position of the shift on the branch, between 0 (beginning of the branch) and 1 (end of the branch). Default to 0.
`nbr_of_shifts`	the number of shifts to use (randomly drawn). Use only if `edges` is not specified. In that case, a phylogenetic tree must be provided (to allow a random sampling of its edges).
`phylo`	a phylogenetic tree of class `phylo`. Needed only if the shifts edges are not specified, or if sBM_variance=TRUE. Default to NULL. If sBM_variance=TRUE, it must have a specified value for the root branch length (slot root.edge).
`sBM_variance`	if the root is random, does it depend on the length of the root edge ? (For equivalent purposes with a rescaled OU). Default to FALSE. If TRUE, a phylogenetic tree with root edge length must be provided.
`trait_names`	vector of trait names values. Must be of length p.
`...`	unused.

Value

an object of class params_process.

Create an object `params_process` for an OU

Description

params_OU creates a coherent object params_process from user provided values of the parameters. Non specified parameters are set to default values.

Usage

params_OU(
  p = 1,
  variance = diag(1, p, p),
  selection.strength = diag(1, p, p),
  optimal.value = rep(0, p),
  random = TRUE,
  stationary.root = TRUE,
  value.root = rep(0, p),
  exp.root = rep(0, p),
  var.root = diag(1, p, p),
  edges = NULL,
  values = matrix(0, p, length(edges)),
  relativeTimes = NULL,
  nbr_of_shifts = length(edges),
  phylo = NULL,
  trait_names = NULL,
  ...
)
params_OU(
  p = 1,
  variance = diag(1, p, p),
  selection.strength = diag(1, p, p),
  optimal.value = rep(0, p),
  random = TRUE,
  stationary.root = TRUE,
  value.root = rep(0, p),
  exp.root = rep(0, p),
  var.root = diag(1, p, p),
  edges = NULL,
  values = matrix(0, p, length(edges)),
  relativeTimes = NULL,
  nbr_of_shifts = length(edges),
  phylo = NULL,
  trait_names = NULL,
  ...
)

Arguments

`p`	the dimension (number of traits) of the parameters. Default to 1.
`variance`	the variance (rate matrix) of the BM. Default to `diag(1, p, p)`.
`selection.strength`	the selection strength matrix. Default to `diag(1, p, p)`.
`optimal.value`	the vector of the optimal values at the root. Default to `rep(0, p)`.
`random`	whether the root of the OU is random (TRUE) or fixed (FALSE). Default to TRUE.
`stationary.root`	whether the root of the OU is stationary (TRUE) or not. Default to TRUE.
`value.root`	if random=FALSE, the root value. Default to 0.
`exp.root`	if random=TRUE, the root expectation. Default to 0. If stationary.root=TRUE, default to `optimal.value`.
`var.root`	if random=TRUE, the root variance. Default to `diag(1, p, p)`. If stationary.root=TRUE, default to the stationary variance computed from `variance` and `selection.strength`, see function `compute_stationary_variance`.
`edges`	a vector of edges where the shifts occur. Default to NULL (no shift).
`values`	a matrix of shift values, with p lines and as many columns as the number of shifts. Each column is the p values for one shift. Default to `matrix(0, p, length(edges))`.
`relativeTimes`	(unused) the relative position of the shift on the branch, between 0 (beginning of the branch) and 1 (end of the branch). Default to 0.
`nbr_of_shifts`	the number of shifts to use (randomly drawn). Use only if `edges` is not specified. In that case, a phylogenetic tree must be provided (to allow a random sampling of its edges).
`phylo`	a phylogenetic tree of class `phylo`. Needed only if the shifts edges are not specified.
`trait_names`	vector of trait names values. Must be of length p.
`...`	unused.

Value

an object of class params_process.

Create an object params_process

Description

params_process creates or extracts a set of parameters of class params_process.

Usage

params_process(x, ...)
params_process(x, ...)

Arguments

`x`	an S3 object.
`...`	further arguments to be passed to the specific method.

Value

An S3 object of class params_process. This is essentially a list containing the following entries:

process

The model used. One of "BM" (for a full BM model, univariate or multivariate); "OU" (for a full OU model, univariate or multivariate); or "scOU" (for a "scalar OU" model).

p

Dimension of the trait.

root.state

List describing the state of the root, with:

random: random state (TRUE) or deterministic state (FALSE)
value.root: if deterministic, value of the character at the root
exp.root: if random, expectation of the character at the root
var.root: if random, variance of the character at the root (pxp matrix)

shifts

List with position and values of the shifts:

edges: vector of the K id of edges where the shifts are
values: matrix p x K of values of the shifts on the edges (one column = one shift)
relativeTimes: vector of dimension K of relative time of the shift from the parent node of edges

variance

Variance-covariance matrix size p x p.

selection.strength

Matrix of selection strength size p x p (OU).

optimal.value

Vector of p optimal values at the root (OU).

Create an object `params_process`

Description

params_process creates a coherent object params_process from user provided values of the parameters.

Usage

## S3 method for class 'character'
params_process(x, ...)
## S3 method for class 'character'
params_process(x, ...)

Arguments

`x`	one of "BM" or "OU"
`...`	specified parameters, see functions `params_BM` and `params_OU` for details.

Value

an object of class params_process.

Parameter estimates

Description

params takes an object of class PhyloEM, and returns the inferred parameters of the process.

Usage

## S3 method for class 'PhyloEM'
params_process(
  x,
  method.selection = NULL,
  K = NULL,
  alpha = NULL,
  rBM = FALSE,
  init = FALSE,
  ...
)
## S3 method for class 'PhyloEM'
params_process(
  x,
  method.selection = NULL,
  K = NULL,
  alpha = NULL,
  rBM = FALSE,
  init = FALSE,
  ...
)

Arguments

`x`	an object of class `PhyloEM`
`method.selection`	(optional) the method selection to be used. One of "LINselect", "DDSE", "Djump". Default to "LINselect".
`K`	(optional) an integer giving the number of shifts for which to retrieve the parameters. Default to NULL (automatically selected number of shifts, see `method.selection` argument).
`alpha`	(optional) a value of alpha for which to retrieve the parameters. Can be an (un-ambiguous) estimation of the true value. If specified, then `K` must be precised too. Default to NULL (automatically selected value, see `method.selection` argument).
`rBM`	(optional) if TRUE, and if the process is "scOU", returns the raw parameters of the BM on the re-scaled tree. Default to FALSE, except if the selection strength is negative (see doc of `PhyloEM` for an explanation of this particular case).
`init`	(optional) if TRUE, gives the parameters from the initialization of the EM. Default to FALSE. This has no effect if `K` is not specified.
`...`	unused.

Value

An object of class params_process.

Minimal number of shifts needed to get a clustering.

Description

parsimonyCost is an implementation of the Sankoff algorithm, when the cost of transition between two state is always one. It is used in functions parsimonyNumber and enumerate_parsimony to count or enumerate all the parsimonious solutions given one clustering of the tips.

Usage

parsimonyCost(phylo, clusters = rep(1, length(phylo$tip.label)))
parsimonyCost(phylo, clusters = rep(1, length(phylo$tip.label)))

Arguments

`phylo`	a phylogenetic tree, class `phylo`.
`clusters`	the vector of the clusters of the tips. (Default to all the tips in a single group).

Value

An S3 class "parsimonyCost" containing a (ntaxa + Nnode) x (nclus) matrix of the total number of shifts needed to get the clustering, if starting from a node in state k. The cost can be extract from any subtree with function extract.parsimonyCost.

Examples

tree <- read.tree(text="(((1,1),2),2);")
plot(tree); nodelabels()
clusters <- c(1, 1, 2, 2)
costs <- parsimonyCost(tree, clusters)
costs

## Extract the parsimony cost at the root
extract(costs)

## Extract the cost for the sub-tree below node 7
extract(costs, 7)

tree <- read.tree(text="(((1,1),2),2);")
plot(tree); nodelabels()
clusters <- c(1, 1, 2, 2)
costs <- parsimonyCost(tree, clusters)
costs

## Extract the parsimony cost at the root
extract(costs)

## Extract the cost for the sub-tree below node 7
extract(costs, 7)

Number of equivalent parsimonious allocations.

Description

parsimonyNumber aims at finding the number of equivalent allocations of the shifts on the tree, i.e allocations that are parsimonious and compatible with a given clustering of the tips.

Usage

parsimonyNumber(phylo, clusters = rep(1, length(phylo$tip.label)))
parsimonyNumber(phylo, clusters = rep(1, length(phylo$tip.label)))

Arguments

`phylo`	phylogenetic tree, class `phylo`.
`clusters`	the vector of the clusters of the tips. Default to all the tips in one single cluster.

Details

This function does a recursion up the tree. The function extract.parsimonyNumber gives the result sought for any subtree. The matrix of costs of the states (number of shifts) is also required, it is computed by function parsimonyCost.

Value

an object of S3 class "parsimonyNumber" with:

nbrReconstructions: a (ntaxa + Nnode) x (nclus) matrix of locally parsimonious solutions starting from a cluster k at a given node
costReconstructions: an object of class "parsimonyCost", result of function parsimonyCost.

Examples

tree <- read.tree(text="(((0,1),2),2);")
plot(tree); nodelabels()
clusters <- c(0, 1, 2, 2)
n_sols <- parsimonyNumber(tree, clusters)
n_sols

## Extract the number of parsimonious solutions at the root
extract(n_sols)

## Extract the cost of the solutions from the root
extract(n_sols, what = "cost")
extract(parsimonyCost(tree, clusters)) # same, more efficient

## Extract for the sub-tree below node 7
extract(n_sols, 7) # Result: 2 (the ancestral state is either "0" or "1"). 

tree <- read.tree(text="(((0,1),2),2);")
plot(tree); nodelabels()
clusters <- c(0, 1, 2, 2)
n_sols <- parsimonyNumber(tree, clusters)
n_sols

## Extract the number of parsimonious solutions at the root
extract(n_sols)

## Extract the cost of the solutions from the root
extract(n_sols, what = "cost")
extract(parsimonyCost(tree, clusters)) # same, more efficient

## Extract for the sub-tree below node 7
extract(n_sols, 7) # Result: 2 (the ancestral state is either "0" or "1").

Number of different models

Description

partitionsNumber computes the number of different models with a given number of shifts K. It is also the number of colorings of the tips to the tree in npart = K + 1 colors.

Usage

partitionsNumber(phylo, npart)
partitionsNumber(phylo, npart)

Arguments

`phylo`	a phylogenetic tree of class `phylo`.
`npart`	the numbers of partitions (colors) allowed at the tips. This is the number of shifts plus one (npart = K + 1).

Value

an object of class partitionsNumber. This is made of a matrix with (Nnode + ntaxa) rows and (2*npart) columns. Each column contains two vectors: for k=1:npart it contains the number of partitions with k groups compatible with the tree and the shift process; and for k=(npart+1):2*npart, it contains the number of "marked" partitions with (k-npart) groups compatible with the tree and the shift process. The actual number can be extracted with function extract.partitionsNumber (see examples below).

Examples

if (requireNamespace("combinat", quietly = TRUE)) {
  npart <- 8 # number of colors at the tips allowed
  tree <- read.tree(text="(A,(A,(A,A,A),A,A));") # a tree with polytomies
  plot(tree)
  parts_num <- partitionsNumber(tree, npart)
  parts_num
  
  ## Number of possible colorings of the tips in npart colors
  extract(parts_num)
  
  ## Get all the solutions for colorings with 1 to nparts colors
  extract(parts_num, npart = 1:npart)
  
  ## Number of possible colorings of the tips in npart colors
  ## For the sub-tree starting at node 17
  extract(parts_num, node = 10)
  
  ## Number of possible colorings of the tips in npart colors
  ## with one marked color
  extract(parts_num, marked = TRUE)
}


if (requireNamespace("combinat", quietly = TRUE)) {
  npart <- 8 # number of colors at the tips allowed
  tree <- read.tree(text="(A,(A,(A,A,A),A,A));") # a tree with polytomies
  plot(tree)
  parts_num <- partitionsNumber(tree, npart)
  parts_num
  
  ## Number of possible colorings of the tips in npart colors
  extract(parts_num)
  
  ## Get all the solutions for colorings with 1 to nparts colors
  extract(parts_num, npart = 1:npart)
  
  ## Number of possible colorings of the tips in npart colors
  ## For the sub-tree starting at node 17
  extract(parts_num, node = 10)
  
  ## Number of possible colorings of the tips in npart colors
  ## with one marked color
  extract(parts_num, marked = TRUE)
}

Model Estimation with Detection of Shifts

Description

PhyloEM is the main function of the package. It uses maximum likelihood methods to fit a BM or an OU process for several traits evolving along a phylogenetic tree, with automatic shift detection on the branches of the tree. This function can handle missing data.

Usage

PhyloEM(
  phylo,
  Y_data,
  process = c("BM", "OU", "scOU", "rBM"),
  check_postorder = TRUE,
  independent = FALSE,
  K_max = max(floor(sqrt(length(phylo$tip.label))), 10),
  use_previous = FALSE,
  order = TRUE,
  method.selection = c("LINselect", "DDSE", "Djump"),
  C.BM1 = 0.1,
  C.BM2 = 2.5,
  C.LINselect = 1.1,
  method.variance = c("upward_downward", "simple"),
  method.init = "lasso",
  method.init.alpha = "estimation",
  method.init.alpha.estimation = c("regression", "regression.MM", "median"),
  methods.segmentation = c("lasso", "best_single_move"),
  alpha_grid = TRUE,
  nbr_alpha = 10,
  random.root = TRUE,
  stationary.root = random.root,
  alpha = NULL,
  check.tips.names = TRUE,
  progress.bar = TRUE,
  estimates = NULL,
  save_step = FALSE,
  rescale_OU = TRUE,
  parallel_alpha = FALSE,
  Ncores = 3,
  K_lag_init = 5,
  light_result = TRUE,
  tol_tree = .Machine$double.eps^0.5,
  allow_negative = FALSE,
  option_is.ultrametric = 1,
  trait_correlation_threshold = 0.9,
  ...
)
PhyloEM(
  phylo,
  Y_data,
  process = c("BM", "OU", "scOU", "rBM"),
  check_postorder = TRUE,
  independent = FALSE,
  K_max = max(floor(sqrt(length(phylo$tip.label))), 10),
  use_previous = FALSE,
  order = TRUE,
  method.selection = c("LINselect", "DDSE", "Djump"),
  C.BM1 = 0.1,
  C.BM2 = 2.5,
  C.LINselect = 1.1,
  method.variance = c("upward_downward", "simple"),
  method.init = "lasso",
  method.init.alpha = "estimation",
  method.init.alpha.estimation = c("regression", "regression.MM", "median"),
  methods.segmentation = c("lasso", "best_single_move"),
  alpha_grid = TRUE,
  nbr_alpha = 10,
  random.root = TRUE,
  stationary.root = random.root,
  alpha = NULL,
  check.tips.names = TRUE,
  progress.bar = TRUE,
  estimates = NULL,
  save_step = FALSE,
  rescale_OU = TRUE,
  parallel_alpha = FALSE,
  Ncores = 3,
  K_lag_init = 5,
  light_result = TRUE,
  tol_tree = .Machine$double.eps^0.5,
  allow_negative = FALSE,
  option_is.ultrametric = 1,
  trait_correlation_threshold = 0.9,
  ...
)

Arguments

`phylo`	A phylogenetic tree of class `phylo` (from package `ape`).
`Y_data`	Matrix of data at the tips, size p x ntaxa. Each line is a trait, and each column is a tip. The column names are checked against the tip names of the tree.
`process`	The model used for the fit. One of "BM" (for a full BM model, univariate or multivariate); "OU" (for an OU with independent traits, univariate or multivariate); or "scOU" (for a "scalar OU" model, see details).
`check_postorder`	Re-order the tree in post-order. If the Upward-Downward algorithm is used, the tree need to be in post-order. Default to TRUE if the upward-downward is used, otherwise automatically set to FALSE.
`independent`	Are the trait assumed to be independent from one another? Default to FALSE. OU in a multivariate setting only works if TRUE.
`K_max`	The maximum number of shifts to be considered. Default to $max(\|\sqrt ntaxa\|, 10)$ .
`use_previous`	Should the initialization for K+1 shifts use the estimation for $K$ shifts already obtained? Default to FALSE.
`order`	Should the estimations be done for K increasing (TRUE) or K decreasing (FALSE)? If use_previous=FALSE, this has no influence, except if one initialization fails. Default to TRUE.
`method.selection`	Method selection to be used. Several ones can be used at the same time. One of "LINselect" for the Baraud Giraud Huet LINselect method; "DDSE" for the Slope Heuristic or "Djump" for the Jump Heuristic, last two based the Birgé Massart method.
`C.BM1`	Multiplying constant to be used for the BigeMassart1 method. Need to be positive. Default to 0.1.
`C.BM2`	Multiplying constant to be used for the BigeMassart2 method. Default to 2.5.
`C.LINselect`	Multiplying constant to be used for the LINselect method. Need to be greater than 1. Default to 1.1.
`method.variance`	Algorithm to be used for the moments computations at the E step. One of "simple" for the naive method; of "upward_downward" for the Upward Downward method (usually faster). Default to "upward_downward".
`method.init`	The initialization method. One of "lasso" for the LASSO base initialization method; or "default" for user-specified initialization values. Default to "lasso".
`method.init.alpha`	For OU model, initialization method for the selection strength alpha. One of "estimation" for a cherry-based initialization, using `nlrob`; or "default" for user-specified initialization values. Default to "estimation".
`method.init.alpha.estimation`	If method.init.alpha="estimation", choice of the estimation(s) methods to be used. Choices among "regression", (method="M" is passed to `nlrob`); "regression.MM" (method="MM" is passed to `nlrob`) or "median" (`nlrob` is not used, a simple median is taken). Default to all of them.
`methods.segmentation`	For OU, method(s) used at the M step to find new candidate shifts positions. Choices among "lasso" for a LASSO-based algorithm; and "best_single_move" for a one-move at a time based heuristic. Default to both of them. Using only "lasso" might speed up the function a lot.
`alpha_grid`	whether to use a grid for alpha values. Default to TRUE. This is the only available method for scOU. This method is not available for OU with multivariate traits. OU with univariate traits can take both TRUE or FALSE. If TRUE, a grid based on the branch length of the tree is automatically computed, using function `find_grid_alpha`.
`nbr_alpha`	If `alpha_grid=TRUE`, the number of alpha values on the grid. Default to 10.
`random.root`	whether the root is assumed to be random (TRUE) of fixed (FALSE). Default to TRUE
`stationary.root`	whether the root is assumed to be in the stationary state. Default to TRUE.
`alpha`	If the estimation is done with a fixed alpha (either known, or on a grid), the possible value for alpha. Default to NULL.
`check.tips.names`	whether to check the tips names of the tree against the column names of the data. Default to TRUE.
`progress.bar`	whether to display a progress bar of the computations. Default to TRUE.
`estimates`	The result of a previous run of this same function. This function can be re-run for other model election method. Default to NULL.
`save_step`	If alpha_grid=TRUE, whether to save the intermediate results for each value of alpha (in a temporary file). Useful for long computations. Default to FALSE.
`rescale_OU`	For the Univariate OU, should the tree be re-scaled to use a BM ? This can speed up the computations a lot. However, it can make it harder for the EM to explore the space of parameters, and hence lead to a sub-optimal solution. Default to TRUE.
`parallel_alpha`	If alpha_grid=TRUE, whether to run the estimations with different values of alpha on separate cores. Default to FALSE. If TRUE, the log is written as a temporary file.
`Ncores`	If parallel_alpha=TRUE, number of cores to be used.
`K_lag_init`	Number of extra shifts to be considered at the initialization step. Increases the accuracy, but can make computations quite slow of taken too high. Default to 5.
`light_result`	if TRUE (the default), the object returned is made light, without easily computable quantities. If FALSE, the object can be very heavy, but its subsequent manipulations can be faster (especially for plotting).
`tol_tree`	tolerance to consider a branch length significantly greater than zero, or two lineages lengths to be different, when checking for ultrametry. (Default to .Machine$double.eps^0.5). See `is.ultrametric` and `di2multi`.
`allow_negative`	whether to allow negative values for alpha (Early Burst). See details. Default to FALSE.
`option_is.ultrametric`	option for `is.ultrametric` check. Default to 1.
`trait_correlation_threshold`	the trait correlation threshold to stop the analysis. Default to 0.9.
`...`	Further arguments to be passed to `estimateEM`, including tolerance parameters for stopping criteria, maximal number of iterations, etc.

Details

Several models can be used:

BM with fixed root, univariate or multivariate.
OU with fixed or stationary root, univariate or multivariate.

For the OU in the multivariate setting, two assumptions can be made:

Independent traits. This amounts to diagonal rate and selection matrices.
"Scalar OU" (scOU): the rate matrix can be full, but the selection strength matrix is assumed to be scalar, i.e. all the traits are supposed to go to their optimum values with the same speed.

Note that the "scalar OU" model can also be seen as a re-scaling of the tree. The selection strength parameter alpha can then be interpreted as a measure of the "phylogenetic signal":

If alpha is close to 0, then the process is similar to a BM on the original tree, and the signal is strong.
If alpha is large, then the re-scaled tree is similar to a star-tree, and the signal is weak.

When there are no shifts, and the root is taken to be constant, this model is actually equivalent to an AC model (Uyeda et al. 2015). With this interpretation in mind, one might want to explore negative values of alpha, in order to fit a DC (or Early Burst) model. With no shift and a fixed root, the same proof shows that the scOU with alpha negative is equivalent to the DC model. There are two strong caveats in doing that.

The interpretation of the OU as modeling the dynamic of a trait undergoing stabilizing selection is lost. In this case, the scOU can only be seen as a re-scaling of the tree, similar to Pagel's delta.
The values of the "optimal values", and of the shifts on them, cannot be interpreted as such (the process is actually going away from this values, instead of being attracted). When looking at these values, one should only use the un-normalized values happening of the underlying BM. You can extract those using the params_process function with rBM = TRUE.

Value

An object of class PhyloEM. Relevant quantities can be extracted from it using helper functions params_process.PhyloEM, imputed_traits.PhyloEM

Examples

## Not run: 
## Load Data
data(monkeys)
## Run method
# Note: use more alpha values for better results.
res <- PhyloEM(Y_data = monkeys$dat,        ## data
               phylo = monkeys$phy,         ## phylogeny
               process = "scOU",            ## scalar OU
               random.root = TRUE,          ## root is stationary
               stationary.root = TRUE,
               K_max = 10,                  ## maximal number of shifts
               nbr_alpha = 4,               ## number of alpha values
               parallel_alpha = TRUE,       ## parallelize on alpha values
               Ncores = 2)
## Plot selected solution (LINselect)
plot(res) # three shifts
## Plot selected solution (DDSE)
plot(res, method.selection = "DDSE") # no shift
## Extract and solution with 5 shifts
params_5 <- params_process(res, K = 5)
plot(res, params = params_5)
## Show all equivalent solutions
eq_sol <- equivalent_shifts(monkeys$phy, params_5)
plot(eq_sol)

## End(Not run)

## Not run: 
## Load Data
data(monkeys)
## Run method
# Note: use more alpha values for better results.
res <- PhyloEM(Y_data = monkeys$dat,        ## data
               phylo = monkeys$phy,         ## phylogeny
               process = "scOU",            ## scalar OU
               random.root = TRUE,          ## root is stationary
               stationary.root = TRUE,
               K_max = 10,                  ## maximal number of shifts
               nbr_alpha = 4,               ## number of alpha values
               parallel_alpha = TRUE,       ## parallelize on alpha values
               Ncores = 2)
## Plot selected solution (LINselect)
plot(res) # three shifts
## Plot selected solution (DDSE)
plot(res, method.selection = "DDSE") # no shift
## Extract and solution with 5 shifts
params_5 <- params_process(res, K = 5)
plot(res, params = params_5)
## Show all equivalent solutions
eq_sol <- equivalent_shifts(monkeys$phy, params_5)
plot(eq_sol)

## End(Not run)

Plot Model Selection Criterion

Description

This function takes an object of class PhyloEM, result of function PhyloEM, and plots a model selection criterion.

Usage

plot_criterion(
  res,
  method.selection = NULL,
  add = FALSE,
  select.col = "red",
  ...
)
plot_criterion(
  res,
  method.selection = NULL,
  add = FALSE,
  select.col = "red",
  ...
)

Arguments

`res`	an object of class `PhyloEM`, result of function `PhyloEM`.
`method.selection`	select the parameters to plot. One of "LINselect", "DDSE", "Djump" or "likelihood" (for un-penalized likelihood). Default to "LINselect". See `params_process.PhyloEM`.
`add`	boolean: should the points be added to a current plot (default to FALSE).
`select.col`	the color of the point selected by the criterion. Default to "red".
`...`	further argument to be passed to base `plot`.

Plot all the equivalent solutions.

Description

plot.enumerate_parsimony plots a representation of all the equivalent solutions.

Usage

## S3 method for class 'enumerate_parsimony'
plot(x, numbering = FALSE, nbr_col = 3, ...)
## S3 method for class 'enumerate_parsimony'
plot(x, numbering = FALSE, nbr_col = 3, ...)

Arguments

`x`	an object of class `enumerate_parsimony`, result of function `enumerate_parsimony`
`numbering`	whether to number the solutions. Default to FALSE.
`nbr_col`	the number of columns on which to display the plot. Default to 3.
`...`	further arguments to be passed to `plot.phylo` or `nodelabels`.

Details

This function uses functions plot.phylo and nodelabels for the actual plotting of the trees.

Value

A plot of the equivalent shifts allocations.

Plot all the equivalent solutions.

Description

plot.equivalent_shifts plots a representation of all the equivalent shifts allocations, with a representation of the shifts and their values, and a coloration of the branches in term of regimes.

Usage

## S3 method for class 'equivalent_shifts'
plot(
  x,
  trait = 1,
  show_shifts_values = TRUE,
  numbering = FALSE,
  colors_tips = NULL,
  nbr_col = 3,
  gray_scale = FALSE,
  edge.width = 2,
  shifts_cex = 1.2,
  ...
)
## S3 method for class 'equivalent_shifts'
plot(
  x,
  trait = 1,
  show_shifts_values = TRUE,
  numbering = FALSE,
  colors_tips = NULL,
  nbr_col = 3,
  gray_scale = FALSE,
  edge.width = 2,
  shifts_cex = 1.2,
  ...
)

Arguments

`x`	an object of class `equivalent_shifts`, result of function `equivalent_shifts`
`trait`	(integer) the trait to be plotted, if multivariate. Default to 1.
`show_shifts_values`	whether to show the equivalent shifts values or not. Default to FALSE.
`numbering`	whether to number the solutions. Default to FALSE.
`colors_tips`	user-provided colors for the tips of the tree. A vector vector with as many colors as there are tips. Will be automatically computed if not provided.
`nbr_col`	the number of columns on which to display the plot. Default to 3.
`gray_scale`	if TRUE, the colors are replaced by a gray scale. Default to FALSE.
`edge.width`	width of the edge. Default to 1.
`shifts_cex`	if `value_in_box=TRUE`, the size of the text in the boxes. Default to 0.8.
`...`	further arguments to be passed to `plot.phylo`.

Details

This function uses function plot.phylo for the actual plotting of the trees.

Value

A plot of the equivalent shifts allocations.

Plot for class `simul_process`

Description

This function takes an object of class params_process, and plots them along with some data at the tips of the tree.

Usage

## S3 method for class 'params_process'
plot(
  x,
  phylo,
  data = NULL,
  traits,
  automatic_colors = TRUE,
  color_characters = "black",
  color_edges = "black",
  plot_ancestral_states = FALSE,
  ancestral_states = NULL,
  imposed_scale,
  ancestral_cex = 2,
  ancestral_pch = 19,
  value_in_box = FALSE,
  ancestral_as_shift = FALSE,
  shifts_cex = 0.6,
  shifts_bg = "chocolate4",
  root_bg = "chocolate4",
  shifts_adj = 0,
  root_adj = 1,
  color_shifts_regimes = FALSE,
  regime_boxes = FALSE,
  alpha_border = 70,
  show.tip.label = FALSE,
  label_cex = 0.5,
  label_offset = 0,
  axis_cex = 0.7,
  edge.width = 1,
  margin_plot = NULL,
  gray_scale = FALSE,
  ...
)
## S3 method for class 'params_process'
plot(
  x,
  phylo,
  data = NULL,
  traits,
  automatic_colors = TRUE,
  color_characters = "black",
  color_edges = "black",
  plot_ancestral_states = FALSE,
  ancestral_states = NULL,
  imposed_scale,
  ancestral_cex = 2,
  ancestral_pch = 19,
  value_in_box = FALSE,
  ancestral_as_shift = FALSE,
  shifts_cex = 0.6,
  shifts_bg = "chocolate4",
  root_bg = "chocolate4",
  shifts_adj = 0,
  root_adj = 1,
  color_shifts_regimes = FALSE,
  regime_boxes = FALSE,
  alpha_border = 70,
  show.tip.label = FALSE,
  label_cex = 0.5,
  label_offset = 0,
  axis_cex = 0.7,
  edge.width = 1,
  margin_plot = NULL,
  gray_scale = FALSE,
  ...
)

Arguments

`x`	an object of class `params_process`.
`phylo`	a phylogenetic tree.
`data`	a matrix of data at the tips of the tree. Must have p rows and ntaxa columns. If these are simulated, use the `extract.simul_process` function.
`traits`	a vector of integers giving the numbers of the trait to be plotted. Default to 1:p (all the traits).
`automatic_colors`	whether to color the edges automatically according to their regimes. Default to TRUE. If FALSE, colors can be manually specified through arguments `color_characters` and `colro_edges` (see below).
`color_characters`	if `automatic_colors=FALSE`, a vector of colors for the tips of the tree.
`color_edges`	if `automatic_colors=FALSE`, a vector of colors for the edges of the tree.
`plot_ancestral_states`	whether to plot the ancestral traits inferred at the internal nodes of the tree. Only available if only one trait is plotted. Default to FALSE.
`ancestral_states`	if `plot_ancestral_states=TRUE`, the ancestral states must be specified. If these are simulated, use the `extract.simul_process` function.
`imposed_scale`	if `plot_ancestral_states=TRUE`, a vector specifying the imposed scale for the ancestral states plotting. Useful to make comparisons. Default to the plotted trait.
`ancestral_cex`	if `plot_ancestral_states=TRUE`, the size of the ancestral states on the tree. Default to 2.
`ancestral_pch`	if `plot_ancestral_states=TRUE`, the symbol used of the ancestral states. Default to circles (`pch=19`).
`value_in_box`	whether to plot the value of the shift in a box on the edges. Only available when only one trait is plotted. Can be difficult to read on big trees. The size of the text in the boxes is controlled by parameter. Default to FALSE.
`ancestral_as_shift`	whether to represent the ancestral value at the root as an ancestral shift on the root edge. Default to FALSE. `shifts_cex` (see below).
`shifts_cex`	if `value_in_box=TRUE`, the size of the text in the boxes. Default to 0.8.
`shifts_bg`	if `value_in_box=TRUE`, the background color of the boxes.
`root_bg`	if `value_in_box=TRUE` and `ancestral_as_shift=TRUE`, the background color of the ancestral box.
`shifts_adj`	the adj parameter for the shifts position on the edges. Default to 0 (beginning of the edge).
`root_adj`	if `ancestral_as_shift=TRUE`, the adj parameter for the ancestral value position on the root edge. Default to 1.
`color_shifts_regimes`	whether to color each shift according to its regime (default to the same color of the edge it's on). Default to FALSE.
`regime_boxes`	whether to draw a box showing all the tips below a given. The transparency of the border of the box is controlled by parameter `alpha_border` (see below).
`alpha_border`	if `regime_boxes=TRUE`, the alpha parameter of the border of the box. Default to 70.
`show.tip.label`	whether to show the tip labels. Default to FALSE.
`label_cex`	if `show.tip.label=TRUE`, the size of the labels. Default to 0.5.
`label_offset`	if `show.tip.label=TRUE`, the size of the offset between the tree and the labels. Default to 0.
`axis_cex`	cex for the label values of the plot. Default to 0.7.
`edge.width`	width of the edge. Default to 1.
`margin_plot`	vector giving the margin to around the plot. Default to `c(0, 0, 0, 0)`.
`gray_scale`	if TRUE, the colors are replaced by a gray scale. Default to FALSE.
`...`	further arguments to be passed to `plot.phylo`.

Plot for class `PhyloEM`

Description

This function takes an object of class PhyloEM, result of function PhyloEM, and plots the result of the inference.

Usage

## S3 method for class 'PhyloEM'
plot(
  x,
  traits = 1:(x$p),
  params = NULL,
  method.selection = NULL,
  automatic_colors = TRUE,
  color_characters = "black",
  color_edges = "black",
  plot_ancestral_states = FALSE,
  name_trait = "Trait Value",
  imposed_scale,
  ancestral_cex = 2,
  ancestral_pch = 19,
  value_in_box = FALSE,
  ancestral_as_shift = FALSE,
  shifts_cex = 0.6,
  shifts_bg = "chocolate4",
  root_bg = "chocolate4",
  shifts_adj = 0,
  root_adj = 1,
  color_shifts_regimes = FALSE,
  regime_boxes = FALSE,
  alpha_border = 70,
  show.tip.label = FALSE,
  label_cex = 0.5,
  label_font = 1,
  label_offset = 0,
  axis_cex = 0.7,
  axis_las = 0,
  show_axis_traits = TRUE,
  edge.width = 1,
  margin_plot = NULL,
  gray_scale = FALSE,
  root.edge = TRUE,
  ...
)
## S3 method for class 'PhyloEM'
plot(
  x,
  traits = 1:(x$p),
  params = NULL,
  method.selection = NULL,
  automatic_colors = TRUE,
  color_characters = "black",
  color_edges = "black",
  plot_ancestral_states = FALSE,
  name_trait = "Trait Value",
  imposed_scale,
  ancestral_cex = 2,
  ancestral_pch = 19,
  value_in_box = FALSE,
  ancestral_as_shift = FALSE,
  shifts_cex = 0.6,
  shifts_bg = "chocolate4",
  root_bg = "chocolate4",
  shifts_adj = 0,
  root_adj = 1,
  color_shifts_regimes = FALSE,
  regime_boxes = FALSE,
  alpha_border = 70,
  show.tip.label = FALSE,
  label_cex = 0.5,
  label_font = 1,
  label_offset = 0,
  axis_cex = 0.7,
  axis_las = 0,
  show_axis_traits = TRUE,
  edge.width = 1,
  margin_plot = NULL,
  gray_scale = FALSE,
  root.edge = TRUE,
  ...
)

Arguments

`x`	an object of class `PhyloEM`, result of function `PhyloEM`.
`traits`	a vector of integers giving the numbers of the trait to be plotted. Default to 1:p (all the traits).
`params`	(optional) some user-specified parameters. Must be of class `params_process`. If left blank, they are extracted using the `method.selection` argument (see below).
`method.selection`	select the parameters to plot. One of "LINselect", "DDSE", "Djump". Default to "LINselect". See `params_process.PhyloEM`.
`automatic_colors`	whether to color the edges automatically according to their regimes. Default to TRUE. If FALSE, colors can be manually specified through arguments `color_characters` and `colro_edges` (see below).
`color_characters`	if `automatic_colors=FALSE`, a vector of colors for the tips of the tree.
`color_edges`	if `automatic_colors=FALSE`, a vector of colors for the edges of the tree.
`plot_ancestral_states`	whether to plot the ancestral traits inferred at the internal nodes of the tree. Only available if only one trait is plotted. Default to FALSE.
`name_trait`	name of the trait scale bar for the ancestral states plotting. Default to "Trait Value".
`imposed_scale`	if `plot_ancestral_states=TRUE`, a vector specifying the imposed scale for the ancestral states plotting. Useful to make comparisons. Default to the plotted trait.
`ancestral_cex`	if `plot_ancestral_states=TRUE`, the size of the ancestral states on the tree. Default to 2.
`ancestral_pch`	if `plot_ancestral_states=TRUE`, the symbol used of the ancestral states. Default to circles (`pch=19`).
`value_in_box`	whether to plot the value of the shift in a box on the edges. Only available when only one trait is plotted. Can be difficult to read on big trees. The size of the text in the boxes is controlled by parameter. Default to FALSE.
`ancestral_as_shift`	whether to represent the ancestral value at the root as an ancestral shift on the root edge. Default to FALSE. `shifts_cex` (see below).
`shifts_cex`	if `value_in_box=TRUE`, the size of the text in the boxes. Default to 0.8.
`shifts_bg`	if `value_in_box=TRUE`, the background color of the boxes.
`root_bg`	if `value_in_box=TRUE` and `ancestral_as_shift=TRUE`, the background color of the ancestral box.
`shifts_adj`	the adj parameter for the shifts position on the edges. Default to 0 (beginning of the edge).
`root_adj`	if `ancestral_as_shift=TRUE`, the adj parameter for the ancestral value position on the root edge. Default to 1.
`color_shifts_regimes`	whether to color each shift according to its regime (default to the same color of the edge it's on). Default to FALSE.
`regime_boxes`	whether to draw a box showing all the tips below a given. The transparency of the border of the box is controlled by parameter `alpha_border` (see below).
`alpha_border`	if `regime_boxes=TRUE`, the alpha parameter of the border of the box. Default to 70.
`show.tip.label`	whether to show the tip labels. Default to FALSE.
`label_cex`	if `show.tip.label=TRUE`, the size of the labels. Default to 0.5.
`label_font`	if `show.tip.label=TRUE`, the font of the labels (see par).
`label_offset`	if `show.tip.label=TRUE`, the size of the offset between the tree and the labels. Default to 0.
`axis_cex`	cex for the label values of the plot. Default to 0.7.
`axis_las`	las for the label values of the plot. Default to 0 (see par).
`show_axis_traits`	control whether the trait values axis is plotted (default to TRUE).
`edge.width`	width of the edge. Default to 1.
`margin_plot`	vector giving the margin to around the plot. Default to `c(0, 0, 0, 0)`.
`gray_scale`	if TRUE, the colors are replaced by a gray scale. Default to FALSE.
`root.edge`	a logical indicating whether to draw the root edge (defaults to TRUE)
`...`	further arguments to be passed to `plot.phylo`.

Residuals of a fitted object

Description

residuals computes the residuals of some parameters.

Usage

## S3 method for class 'PhyloEM'
residuals(object, ...)
## S3 method for class 'PhyloEM'
residuals(object, ...)

Arguments

`object`	an object of class `params_process` or `PhyloEM`. trait, and each column is a tip. The column names are checked against the tip names of the tree. `incidence.matrix.full`. Can be specified to avoid extra computations.
`...`	for a `PhyloEM` object, further arguments to be passed on to `params_process.PhyloEM` (to choose which parameters to extract from the results, see documentation of this function).

Value

The log likelihood of the data with the provided parameters on the tree.

Simmap format mapping from list of edges

Description

shifts_to_simmap takes a vector of edges where the shifts occur, and return a simmap formatted tree, mapped with corresponding regimes.

Usage

shifts_to_simmap(tree, shifts_edges)
shifts_to_simmap(tree, shifts_edges)

Arguments

`tree`	input tree in `phylo` format
`shifts_edges`	shifts positions on the edges

Details

Ancestral state is always 0, and other states are consecutive integers.

Value

tree a simmap object

Compute the matrix of shifts.

Description

shifts.list_to_matrix takes the list description of the shifts to give the matrix representation of the shifts : the b th element of the lth line has the value of the shift on character l occurring on that branch b

Usage

shifts.list_to_matrix(phy, shifts, p = nrow(shifts$values))
shifts.list_to_matrix(phy, shifts, p = nrow(shifts$values))

Arguments

`phy`	Input tree.
`shifts`	list description of the shifts : shifts$edges, shifts$values.
`p`	number of traits (optional, needed when shifts = NULL).

Value

Matrix p x Nedge of length nbranch.

Compute the list of shifts.

Description

shifts.matrix_to_list takes the vectorial description of the shifts to create the list description of the shifts.

Usage

shifts.matrix_to_list(delta)
shifts.matrix_to_list(delta)

Arguments

delta

matrix description of the shift.

Value

List describing shifts.

Simulate a Stochastic Process on a tree

Description

simulate simulate a stochastic process on a tree.

Usage

simul_process(x, ...)

## S3 method for class 'params_process'
simul_process(
  x,
  phylo,
  simulate_random = TRUE,
  checks = TRUE,
  U_tree = NULL,
  times_shared = NULL,
  ...
)

## S3 method for class 'PhyloEM'
simul_process(
  x,
  simulate_random = TRUE,
  checks = TRUE,
  U_tree = NULL,
  times_shared = NULL,
  ...
)
simul_process(x, ...)

## S3 method for class 'params_process'
simul_process(
  x,
  phylo,
  simulate_random = TRUE,
  checks = TRUE,
  U_tree = NULL,
  times_shared = NULL,
  ...
)

## S3 method for class 'PhyloEM'
simul_process(
  x,
  simulate_random = TRUE,
  checks = TRUE,
  U_tree = NULL,
  times_shared = NULL,
  ...
)

Arguments

`x`	an object of class `params_process` or `PhyloEM`.
`...`	for a `PhyloEM` object, further arguments to be passed on to `params_process.PhyloEM` (to choose which parameters to extract from the results, see documentation of this function).
`phylo`	a phylogenetic tree, class `phylo`.
`simulate_random`	set to FALSE if only the expected values are needed (and not the random sample). Default to TRUE.
`checks`	whether to check the entry parameters for consistency. Default to TRUE.
`U_tree`	optional, full incidence matrix of the tree, result of function `incidence.matrix.full`. Can be specified to avoid extra computations.
`times_shared`	optional, times of shared ancestry of all nodes and tips, result of function `compute_times_ca`. Can be specified to avoid extra computations.

Value

An S3 object of class simul_process. This contains:

sim_traits: an array with dimensions p x Nnode x 2 (BM) or p x Nnode x 3 (OU). For each trait t, 1 <= t <= p, sim_traits[t, , ] has tree columns, containing respectively the simulated state, expected value and optimal value for all the nodes.
phylo: the phylogenetic tree used for the simulations (class phylo).
params: the parameters used for the simulations (class params_proces).

Methods (by class)

simul_process(params_process): params_process object
simul_process(PhyloEM): PhyloEM object

Transform branch length for a re-scaled BM

Description

Re-scale the branch length of the tree so that a BM running on the new tree produces the same observations at the tips than an OU with parameter alpha.

Usage

transform_branch_length(phylo, alp)
transform_branch_length(phylo, alp)

Arguments

`phylo`	A phylogenetic tree of class `phylo`, with branch lengths.
`alp`	Value of the selection strength.

Value

phylo The same phylogenetic tree, with transformed branch lengths.

Package 'PhylogeneticEM'

Help Index

Allocation of regimes to nodes.

Description

Usage

Arguments

Value

Allocation of shifts to edges

Description

Usage

Arguments

Value

Check Parsimony, assuming no homoplasy

Description

Usage

Arguments

Details

Value

Clustering associated to a shift allocation, assuming no homoplasy.

Description

Usage

Arguments

Details

Value

Computation of the optimal values at nodes and tips.

Description

Usage

Arguments

Details

Value

Phylogenetic Distances

Description

Usage

Arguments

Details

Value

Computation of shifts from the vector of optimal values

Description

Usage

Arguments

Details

Value

Compute the stationary variance matrix

Description

Usage

Arguments

Value

Common Ancestors Times

Description

Usage

Arguments

Details

Value

Correspondence between edges numbers

Description

Usage

Arguments

Value

Make the result lighter

Description

Usage

Arguments

Details

Value

Methods (by class)

See Also

Enumerate all the possible regime allocations, given a clustering of the tips.

Description

Usage

Arguments

Details

Value

See Also

Examples

Tips descendants of nodes.

Description

Usage

Arguments

Details

Value

Extract the result of `enumerate_parsimony` at a node.

Extract from object `partitionsNumber`