% mcmcabn.Rd ---
% Author: Gilles Kratzer
% Created on: 18.02.2019
% Last modification:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\name{mcmcabn}
\alias{mcmcabn}
\title{Structural MCMC sampler for DAGs}

\usage{
mcmcabn(score.cache = NULL,
        score = "mlik",
        data.dists = NULL,
        max.parents = 1,
        mcmc.scheme = c(100, 1000, 1000),
        seed = 42,
        verbose = FALSE,
        start.dag = NULL,
        prior.dag = NULL,
        prior.lambda = NULL,
        prob.rev = 0.05,
        prob.mbr = 0.05,
        heating = 1,
        prior.choice = 2)
}

\arguments{
\item{score.cache}{output from \link[abn:build_score_cache]{buildscorecache} from the \code{abn} R package.}
\item{score}{a character string giving the network score used to sample the DAG landscape.}
\item{data.dists}{a named list giving the distribution for each node in the network, see details.}
\item{max.parents}{a constant giving the maximum number of parents allowed.}
\item{mcmc.scheme}{a sampling scheme given as a vector containing, in this order: the number of returned DAGs, the number of thinned steps between returned DAGs, and the length of the burn-in phase.}
\item{seed}{a non-negative integer which sets the seed.}
\item{verbose}{extra output, see output for details.}
\item{start.dag}{a DAG given as a matrix (see details for the format) which explicitly provides the starting point of the structural search. Alternatively, the character string "random" selects a random DAG as the starting point, and "hc" calls a hill climber to select one.}
\item{prior.dag}{a user-defined prior, given as a matrix with entries ranging from zero to one; 0.5 is non-informative for the given arc.}
\item{prior.lambda}{a hyperparameter representing the strength of belief in the user-defined prior.}
\item{prob.rev}{probability of selecting a new edge reversal move.}
\item{prob.mbr}{probability of selecting a Markov blanket resampling move.}
\item{heating}{a positive real number. Values between zero and one heat up the chain; values larger than one apply an exponentially decreasing heating scheme. The default is one. See details.}
\item{prior.choice}{an integer, 1, 2 or 3, where 1 is a uniform structural prior, 2 a weighted prior and 3 a user-defined prior, see details.}
}

\description{
This function is a structural Markov chain Monte Carlo Model Choice ((MC)^3) sampler equipped with two large-scale MCMC moves designed to accelerate chain mixing.
}

\details{
The procedure runs a structural Markov chain Monte Carlo Model Choice ((MC)^3) sampler to find the most probable posterior network (DAG). The default algorithm is based on three MCMC moves: edge addition, edge deletion and edge reversal. This algorithm, known as (MC)^3, mixes slowly and tends to get stuck in low-probability regions; indeed, changing Markov equivalence class often requires multiple MCMC moves. Two large-scale MCMC moves are therefore implemented, and their relative frequencies can be set by the user: the new edge reversal move (REV) from Grzegorczyk and Husmeier (2008) and the Markov blanket resampling move (MBR) from Su and Borsuk (2016). The classical reversal move depends on the global configuration of the parents and children and fails to propose MCMC jumps that produce valid but very different DAGs in a single move. The REV move globally samples a new set of parents; the MBR move applies the same idea to the entire Markov blanket of a randomly chosen node.

The classical (MC)^3 is unbiased but mixes inefficiently, while the two alternative MCMC moves are known to massively accelerate mixing without introducing bias. However, these moves are computationally expensive, so low frequencies are advised. The REV move is not necessarily ergodic, so it should not be used alone.

The parameter \code{start.dag} can be \code{"random"}, \code{"hc"} or user defined. If the user selects \code{"random"}, a random valid DAG is selected; the routine used favours low-density structures. If \code{"hc"} (for hill climber, see \link[abn:search_heuristic]{searchHeuristic}), a DAG is selected using 100 different searches with 500 optimization steps each. Alternatively, a user-defined DAG can be provided.
It should be a named square matrix containing only zeros and ones, and the DAG should be valid (i.e. acyclic).

The parameter \code{prior.choice} determines the prior used within each individual node for a given choice of parent combination. Koivisto and Sood (2004), p. 554, use a form of prior which assumes that the prior probabilities for parent combinations comprising the same number of parents are all equal. Specifically, the prior probability for a parent set G with cardinality |G| is proportional to 1/[n-1 choose |G|], where there are n nodes in total. Note that this favours parent combinations with either very low or very high cardinality, which may not be appropriate. This prior is used when \code{prior.choice = 2}. When \code{prior.choice = 1}, an uninformative prior is used where parent combinations of all cardinalities are equally likely. When \code{prior.choice = 3}, a user-defined prior is used, defined by \code{prior.dag}. It is given as an adjacency matrix (square, with dimension equal to the number of nodes) where entries ranging from zero to one encode the user's prior belief. A hyperparameter defining the global user belief in the prior is given by \code{prior.lambda}.

MCMC samplers come with asymptotic statistical guarantees only. It is therefore highly advisable to run multiple, sufficiently long chains, and to choose the length of the burn-in phase (i.e. the number of initial MCMC iterations that are discarded) adequately.

The argument \code{data.dists} must be a named list with one entry for each of the variables in \code{data.df}, where each entry is either \code{"poisson"}, \code{"binomial"} or \code{"gaussian"}.

The parameter \code{heating} can improve convergence.
It should be a positive real number. If smaller than one, it acts as a tuning parameter that transforms the score by raising it to this power; one is neutral, and the smaller the value, the more likely any move is to be accepted. If larger than one, it indicates the number of returned steps over which an exponentially decreasing heating scheme is applied; after this number of steps, the \code{heating} parameter is set to one.
}

\value{A list, of class \code{mcmcabn}, with entries for: the sampled DAGs, the scores, the acceptance probability, the method used for each MCMC jump, the rejection status of each MCMC jump, the total number of iterations, the thinning, the length of the burn-in phase, the named list of distributions per node, and the heating parameter.}

\author{Gilles Kratzer}

\references{
For the implementation of the function:

Kratzer, G., Furrer, R. "Is a single unique Bayesian network enough to accurately represent your data?". arXiv preprint arXiv:1902.06641.

For the new edge reversal move:

Grzegorczyk, M., Husmeier, D. (2008). "Improving the structure MCMC sampler for Bayesian networks by introducing a new edge reversal move". Machine Learning, 71(2-3), 265.

For the Markov blanket resampling move:

Su, C., Borsuk, M. E. (2016). "Improving structure MCMC for Bayesian networks through Markov blanket resampling". The Journal of Machine Learning Research, 17(1), 4042-4061.

For the Koivisto prior:

Koivisto, M., Sood, K. (2004). "Exact Bayesian structure discovery in Bayesian networks". Journal of Machine Learning Research, 5, 549-573.

For the user-defined prior:

Werhli, A. V., Husmeier, D. (2007). "Reconstructing gene regulatory networks with Bayesian networks by combining expression data with multiple sources of prior knowledge". Statistical Applications in Genetics and Molecular Biology, 6 (Article 15).

Imoto, S., Higuchi, T., Goto, T., Tashiro, K., Kuhara, S., Miyano, S. (2003). "Using Bayesian networks for estimating gene networks from microarrays and biological knowledge". In Proceedings of the European Conference on Computational Biology.

For the asia dataset:

Scutari, M. (2010). "Learning Bayesian Networks with the bnlearn R Package". Journal of Statistical Software, 35(3), 1-22. doi:10.18637/jss.v035.i03.
}

\examples{
## Example from the asia dataset from Lauritzen and Spiegelhalter (1988)
## provided by Scutari (2010)

# The number of MCMC runs is deliberately chosen too small (computing time)
# no thinning (usually not recommended)
# no burn-in (usually not recommended,
# even if not supported by any theoretical arguments)

data("mcmc_run_asia")

# Let us run: 0.03 REV, 0.03 MBR, 0.94 MC3 MCMC jumps
# with a random DAG as starting point

mcmc.out.asia.small <- mcmcabn(score.cache = bsc.compute.asia,
                               score = "mlik",
                               data.dists = dist.asia,
                               max.parents = 2,
                               mcmc.scheme = c(100, 0, 0),
                               seed = 321,
                               verbose = FALSE,
                               start.dag = "random",
                               prob.rev = 0.03,
                               prob.mbr = 0.03,
                               prior.choice = 2)

summary(mcmc.out.asia.small)

# Solely with MC3 moves

mcmc.out.asia.small <- mcmcabn(score.cache = bsc.compute.asia,
                               score = "mlik",
                               data.dists = dist.asia,
                               max.parents = 2,
                               mcmc.scheme = c(100, 0, 0),
                               seed = 42,
                               verbose = FALSE,
                               start.dag = "random",
                               prob.rev = 0,
                               prob.mbr = 0,
                               prior.choice = 2)

summary(mcmc.out.asia.small)

# Defining a starting DAG
startDag <- matrix(data = c(0, 0, 0, 1, 0, 0, 0, 0,
                            0, 0, 1, 0, 0, 0, 0, 0,
                            1, 0, 0, 0, 0, 0, 0, 0,
                            0, 0, 0, 0, 0, 1, 0, 0,
                            0, 0, 0, 0, 0, 0, 0, 0,
                            0, 0, 0, 0, 0, 0, 0, 0,
                            0, 0, 0, 0, 0, 0, 0, 0,
                            0, 0, 0, 0, 0, 0, 0, 0),
                   nrow = 8, ncol = 8, byrow = TRUE)

colnames(startDag) <- rownames(startDag) <- names(dist.asia)

# Additionally, let us use the non-informative prior

mcmc.out.asia.small <- mcmcabn(score.cache = bsc.compute.asia,
                               score = "mlik",
                               data.dists = dist.asia,
                               max.parents = 2,
                               mcmc.scheme = c(100, 0, 0),
                               seed = 42,
                               verbose = FALSE,
                               start.dag = startDag,
                               prob.rev = 0,
                               prob.mbr = 0,
                               prior.choice = 1)

summary(mcmc.out.asia.small)

# Let us define our very own prior:
# we know that there should be a link between the Smoking and LungCancer nodes

# uninformative prior matrix
priorDag <- matrix(data = 0.5, nrow = 8, ncol = 8)

# name it
colnames(priorDag) <- rownames(priorDag) <- names(dist.asia)

# parent = Smoking; child = LungCancer
priorDag["LungCancer", "Smoking"] <- 1

mcmc.out.asia.small <- mcmcabn(score.cache = bsc.compute.asia,
                               score = "mlik",
                               data.dists = dist.asia,
                               max.parents = 2,
                               mcmc.scheme = c(100, 0, 0),
                               seed = 42,
                               verbose = FALSE,
                               start.dag = startDag,
                               prob.rev = 0,
                               prob.mbr = 0,
                               prior.choice = 3,
                               prior.dag = priorDag)

summary(mcmc.out.asia.small)

# Let us improve the convergence rate.
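# Aside (illustrative sketch, not part of the original examples): when
# prior.choice = 2, the Koivisto and Sood (2004) prior gives a parent set
# of cardinality k among n nodes a weight proportional to 1/choose(n-1, k).
# For the 8-node asia network with max.parents = 2 this yields:
n.nodes <- 8
prior.weights <- sapply(0:2, function(k) 1 / choose(n.nodes - 1, k))
prior.weights
# 1.00000000 0.14285714 0.04761905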
# The first 20 MCMC moves are performed with a heating parameter different
# from one; afterwards, heating is set to one.

mcmc.out.asia.small <- mcmcabn(score.cache = bsc.compute.asia,
                               score = "mlik",
                               data.dists = dist.asia,
                               max.parents = 2,
                               mcmc.scheme = c(100, 0, 0),
                               seed = 321,
                               verbose = FALSE,
                               start.dag = "random",
                               prob.rev = 0.03,
                               prob.mbr = 0.03,
                               prior.choice = 2,
                               heating = 20)

summary(mcmc.out.asia.small)
}