This function is a structural Monte Carlo Markov Chain Model Choice (MC)^3 sampler.

The classical (MC)^3 is unbiased but mixes inefficiently; the two radical MCMC alternative moves (REV and MBR) are known to massively accelerate mixing without introducing bias. However, those moves are computationally expensive, so low frequencies are advised. The REV move is not necessarily ergodic, so it should not be used alone.
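As a rough illustration of the classical (MC)^3 scheme (an illustrative Python sketch, not the package's implementation; all function names are hypothetical), a single step proposes flipping one arc and accepts the proposal with the usual Metropolis rule, rejecting outright any proposal that creates a cycle:

```python
import math
import random

def is_acyclic(adj):
    """Kahn's algorithm: check that the 0/1 adjacency matrix encodes a DAG."""
    n = len(adj)
    indeg = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    queue = [j for j in range(n) if indeg[j] == 0]
    seen = 0
    while queue:
        u = queue.pop()
        seen += 1
        for v in range(n):
            if adj[u][v]:
                indeg[v] -= 1
                if indeg[v] == 0:
                    queue.append(v)
    return seen == n

def mc3_step(adj, log_score, rng=random):
    """One classical (MC)^3 move: flip a random arc, accept via Metropolis rule.

    `log_score` is a hypothetical callable returning the log network score.
    """
    n = len(adj)
    i, j = rng.randrange(n), rng.randrange(n)
    if i == j:
        return adj
    proposal = [row[:] for row in adj]
    proposal[i][j] = 1 - proposal[i][j]        # add or delete the arc i -> j
    if not is_acyclic(proposal):
        return adj                              # invalid DAG: reject outright
    log_ratio = log_score(proposal) - log_score(adj)
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return proposal                         # accept
    return adj                                  # reject
```

These single-arc moves are cheap but explore the DAG space slowly, which is exactly why the larger REV and MBR moves pay off despite their cost.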

The parameter \code{start.dag} can be \code{"random"}, \code{"hc"}, or user defined. If the user selects \code{"random"}, a random valid DAG is chosen; the routine favours low-density structures. If \code{"hc"} (for hill-climber; see \link[abn:search_heuristic]{searchHeuristic}), a DAG is selected using 100 different searches with 500 optimization steps each. A user-defined DAG should be provided as a named square matrix containing only zeros and ones, and must be valid (i.e. acyclic).

The parameter \code{prior.choice} determines the prior used within each individual node for a given choice of parent combination. In Koivisto and Sood (2004), p. 554, a form of prior is used which assumes that the prior probabilities for parent combinations comprising the same number of parents are all equal. Specifically, the prior probability for a parent set G with cardinality |G| is proportional to 1/[n-1 choose |G|], where there are n total nodes. Note that this favours parent combinations with either very low or very high cardinality, which may not be appropriate. This prior is used when \code{prior.choice=2}. When \code{prior.choice=1}, an uninformative prior is used where parent combinations of all cardinalities are equally likely. When \code{prior.choice=3}, a user-defined prior is used, defined by \code{prior.dag}: an adjacency matrix (square, with dimension equal to the number of nodes) whose entries, ranging from zero to one, give the user's prior belief. A hyperparameter defining the global user belief in the prior is given by \code{prior.lambda}.
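For intuition about why this prior favours extreme cardinalities, here is a small illustrative Python sketch (not package code) evaluating the unnormalised weight 1/[n-1 choose |G|] across all possible parent-set sizes:

```python
from math import comb

def koivisto_weight(n_nodes, cardinality):
    """Unnormalised Koivisto and Sood (2004) prior weight for a parent set
    of the given cardinality: proportional to 1 / C(n-1, |G|)."""
    return 1.0 / comb(n_nodes - 1, cardinality)

# With n = 6 nodes, a node can have 0 to 5 parents:
weights = [koivisto_weight(6, k) for k in range(6)]
# -> [1.0, 0.2, 0.1, 0.1, 0.2, 1.0]
# Empty and near-complete parent sets receive the largest prior weight,
# while middling parent-set sizes are penalised.
```

This makes the caveat in the text concrete: relative to a uniform prior over parent sets (\code{prior.choice=1}), weight is pushed toward very sparse and very dense parent combinations.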

MCMC samplers come with asymptotic statistical guarantees. Therefore, it is highly advised to run multiple, sufficiently long chains. The length of the burn-in phase (i.e. the number of initial MCMC iterations that are discarded) should be chosen adequately.

The argument \code{data.dists} must be a list with named arguments, one for each of the variables in \code{data.df}, where each entry is either \code{"poisson"}, \code{"binomial"}, or \code{"gaussian"}.

The parameter \code{heating} could improve convergence. It should be a positive real number. If smaller than one, it is a tuning parameter that transforms the score by raising it to this power; one is neutral, and the smaller the value, the more likely any move is to be accepted. If larger than one, it indicates the number of initial steps during which an exponentially decreasing heating scheme is applied; after this number of steps, the \code{heating} parameter is set to one.
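One plausible reading of this behaviour (a hypothetical Python sketch with an illustrative decay formula, not the package's exact schedule) is a factor applied as an exponent to the acceptance ratio, decaying toward the neutral value of one:

```python
import math

def heating_factor(step, heating):
    """Exponent applied to the score (acceptance) ratio at a given MCMC step.

    Illustrative schedule only:
    - heating < 1 : a constant tuning power that flattens the score, so
      any move is more likely to be accepted;
    - heating >= 1: exponential decay toward 1 over roughly the first
      `heating` steps, after which the factor is exactly 1 (neutral).
    """
    if heating <= 1:
        return heating
    if step >= heating:
        return 1.0
    return 1.0 - (1.0 - 1.0 / heating) * math.exp(-5.0 * step / heating)
```

Early in the run the small exponent flattens the score landscape and lets the chain escape local modes; by the end of the schedule the sampler targets the unmodified score again.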

}

...

...

\references{

For the implementation of the function:

Kratzer, G., Furrer, R. "Is a single unique Bayesian network enough to accurately represent your data?". arXiv preprint arXiv:1902.06641.

For the new edge reversal:

Grzegorczyk, M., Husmeier, D. (2008). "Improving the structure MCMC sampler for Bayesian networks by introducing a new edge reversal move", Machine Learning, vol. 71(2-3), 265.

For the Markov Blanket resampling move:

Su, C., Borsuk, M. E. (2016). "Improving structure MCMC for Bayesian networks through Markov blanket resampling", The Journal of Machine Learning Research, vol. 17(1), 4042-4061.

For the Koivisto prior:

...

...

Koivisto, M. V. (2004). Exact Structure Discovery in Bayesian Networks, Journal of Machine Learning Research, 5, 549-573.

For the user defined prior:

Werhli, A. V., Husmeier, D. (2007). "Reconstructing gene regulatory networks with Bayesian networks by combining expression data with multiple sources of prior knowledge". Statistical Applications in Genetics and Molecular Biology, 6 (Article 15).

Imoto, S., Higuchi, T., Goto, T., Tashiro, K., Kuhara, S., Miyano, S. (2003). Using Bayesian networks for estimating gene networks from microarrays and biological knowledge. In Proceedings of the European Conference on Computational Biology.

For the asia dataset:

Scutari, M. (2010). Learning Bayesian Networks with the bnlearn R Package. Journal of Statistical Software, 35(3), 1-22. doi:http://dx.doi.org/10.18637/jss.v035.i03.

}

\examples{

## Example from the asia dataset from Lauritzen and Spiegelhalter (1988)

## provided by Scutari (2010)

# The number of MCMC runs is deliberately chosen too small (to keep computing time low)

# No thinning (usually not recommended)

# No burn-in (usually not recommended,

# even if this choice is not supported by any theoretical arguments)

data("mcmc_run_asia")

# Let us run: 0.03 REV, 0.03 MBR, 0.94 MC3 MCMC jumps

Alboukadel Kassambara (2018). ggpubr: 'ggplot2' Based Publication Ready Plots. R package. https://CRAN.R-project.org/package=ggpubr

Claus O. Wilke (2019). cowplot: Streamlined Plot Theme and Plot Annotations for 'ggplot2'. R package version 0.9.4. https://CRAN.R-project.org/package=cowplot

Data:

Scutari, M. (2010). Learning Bayesian Networks with the bnlearn R Package. Journal of Statistical Software, 35(3), 1-22. doi:http://dx.doi.org/10.18637/jss.v035.i03.

}

\examples{

## Example from the asia dataset from Lauritzen and Spiegelhalter (1988)

## provided by Scutari (2010)

\item{formula}{formula statement or adjacency matrix to query the MCMC samples, see details. If this argument is \code{NULL}, then the average arc-wise frequencies are reported.}

}

\description{The function allows users to perform structural queries over MCMC samples produced by \code{mcmcabn}.

...

...

query(mcmcabn = NULL,

\details{The query can be formulated using an adjacency matrix or a formula statement.

The adjacency matrix should be square, with dimension equal to the number of nodes in the network. Its entries should be either 1, 0, or -1: a 1 indicates a requested arc, a -1 an excluded arc, and a 0 an entry that is not subject to the query. The rows indicate the set of parents of the index nodes. The order of rows and columns should be the same as the one used in the \code{data.dists} argument of the \code{mcmcabn()} function.
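As an illustrative sketch of these matching rules (plain Python, not the package's R implementation; function names are hypothetical), a sampled DAG satisfies the query when every 1-entry is present and every -1-entry is absent, and the reported support is simply the fraction of matching MCMC samples:

```python
def matches_query(dag, query):
    """True if the sampled 0/1 adjacency matrix satisfies the query matrix.

    query[i][j] ==  1 : arc must be present
    query[i][j] == -1 : arc must be absent
    query[i][j] ==  0 : unconstrained
    """
    n = len(query)
    for i in range(n):
        for j in range(n):
            if query[i][j] == 1 and dag[i][j] != 1:
                return False
            if query[i][j] == -1 and dag[i][j] != 0:
                return False
    return True

def query_support(dags, query):
    """Fraction of MCMC samples whose structure matches the query."""
    return sum(matches_query(d, query) for d in dags) / len(dags)
```

For example, a query requiring the arc node1 -> node2 while excluding node2 -> node1 counts only those sampled DAGs containing the first arc and not the second.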

The formula statement has been designed to ease querying over the MCMC sample. It allows users to make complex queries without explicitly writing an adjacency matrix (which can be painful when the number of variables is large). The formula argument is typically provided using a formula like:

\code{~ node1|parent1:parent2 + node2:node3|parent3}. The formula statement has to start with \code{~}. In this example, node1 has two parents (parent1 and parent2), while node2 and node3 share the same parent, parent3. The parent names have to exactly match those given in name. \code{:} is the separator between either children or parents, \code{|} separates children (left side) from parents (right side), \code{+} separates terms, and \code{.} replaces all the variables in name. Additionally, to exclude an arc, simply put \code{-} in front of that statement: a formula like \code{~ -node1|parent1} excludes all DAGs that have an arc between parent1 and node1.

If the formula argument is not provided, the function returns the average support of all individual arcs as a named matrix.

...

...

\author{Gilles Kratzer}

\references{Kratzer, G., Furrer, R. "Is a single unique Bayesian network enough to accurately represent your data?". arXiv preprint arXiv:1902.06641.

Lauritzen, S., Spiegelhalter, D. (1988). "Local Computation with Probabilities on Graphical Structures and their Application to Expert Systems (with discussion)". Journal of the Royal Statistical Society: Series B, 50(2):157–224.

Scutari, M. (2010). Learning Bayesian Networks with the bnlearn R Package. Journal of Statistical Software, 35(3), 1–22. doi:http://dx.doi.org/10.18637/jss.v035.i03.

}

\examples{

## Example from the asia dataset from Lauritzen and Spiegelhalter (1988)

## provided by Scutari (2010)

data("mcmc_run_asia")

## Return a named matrix with individual arc support

query(mcmcabn = mcmc.out.asia)

## What is the probability of the LungCancer node being a child of the Smoking node?