mcmc.Rd 10.9 KB
Newer Older
Gilles Kratzer's avatar
Gilles Kratzer committed
1
% mcmcabn.Rd ---
Gilles Kratzer's avatar
Gilles Kratzer committed
2
% Author           : Gilles Kratzer
Gilles Kratzer's avatar
Gilles Kratzer committed
3
% Created on :       18.02.2019
Gilles Kratzer's avatar
Gilles Kratzer committed
4
5
6
% Last modification :
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Gilles Kratzer's avatar
Gilles Kratzer committed
7
8
\name{mcmcabn}
\alias{mcmcabn}
Gilles Kratzer's avatar
Gilles Kratzer committed
9
\title{Structural MCMC sampler for DAGs}
Gilles Kratzer's avatar
Gilles Kratzer committed
10
11

\usage{
Gilles Kratzer's avatar
Gilles Kratzer committed
12
13
14
15
mcmcabn(score.cache = NULL,
                 score = "mlik",
                 data.dists = NULL,
                 max.parents = 1,
Gilles Kratzer's avatar
Gilles Kratzer committed
16
                 mcmc.scheme = c(100,1000,1000),
Gilles Kratzer's avatar
Gilles Kratzer committed
17
18
19
                 seed = 42,
                 verbose = FALSE,
                 start.dag = NULL,
Gilles Kratzer's avatar
Gilles Kratzer committed
20
21
                 prior.dag = NULL,
                 prior.lambda = NULL,
Gilles Kratzer's avatar
Gilles Kratzer committed
22
23
                 prob.rev = 0.05,
                 prob.mbr = 0.05,
24
                 scaling = 0,
Gilles Kratzer's avatar
Gilles Kratzer committed
25
                 prior.choice = 2)
Gilles Kratzer's avatar
Gilles Kratzer committed
26
27
28
                 }

\arguments{
29
  \item{score.cache}{output from \link[abn:build_score_cache]{buildscorecache} from the \code{abn} R package.}
Gilles Kratzer's avatar
Gilles Kratzer committed
30
  \item{score}{character giving which network score should be used to sample the DAGs landscape.}
Gilles Kratzer's avatar
typos    
Gilles Kratzer committed
31
\item{data.dists}{a named list giving the distribution for each node in the network, see details.}
Gilles Kratzer's avatar
Gilles Kratzer committed
32
33
34
35
  \item{max.parents}{a constant giving the maximum number of parents allowed.}
  \item{mcmc.scheme}{a sampling scheme. It is vector giving in that order: the number of returned DAGS, the number of thinned steps and length of the burn in phase.}
\item{seed}{a non-negative integer which sets the seed.}
\item{verbose}{extra output, see output for details.}
Gilles Kratzer's avatar
website    
Gilles Kratzer committed
36
  \item{start.dag}{a DAG given as a matrix, see details for format, which can be used to explicitly provide a starting point for the structural search. Alternatively character "random" will select a random DAG as starting point. Character "hc" will call a hill-climber to select a DAG as starting point.}
Gilles Kratzer's avatar
Gilles Kratzer committed
37
38
  \item{prior.dag}{user defined prior. It should be given as a matrix where entries range from zero to one. 0.5 is non-informative for the given arc.}
\item{prior.lambda}{hyper parameter representing the strength of belief in the user defined prior.}
Gilles Kratzer's avatar
Gilles Kratzer committed
39
\item{prob.rev}{probability of selecting a new edge reversal.}
Gilles Kratzer's avatar
Gilles Kratzer committed
40
\item{prob.mbr}{probability of selecting a Markov blanket resampling move.}
41
\item{scaling}{a number between zero and one that heat up the mcmc process.}
Gilles Kratzer's avatar
typos    
Gilles Kratzer committed
42
  \item{prior.choice}{an integer, 1 or 2, where 1 is a uniform structural prior and 2 uses a weighted prior, see details.}
Gilles Kratzer's avatar
Gilles Kratzer committed
43
44
45
  }

\description{
Gilles Kratzer's avatar
typos    
Gilles Kratzer committed
46
This function is a structural Monte Carlo Markov Chain Model Choice (MC)^3 sampler that is equipped with two large scale MCMC moves that are purposed to accelerate chain mixing.
Gilles Kratzer's avatar
Gilles Kratzer committed
47
48
}

Gilles Kratzer's avatar
typos    
Gilles Kratzer committed
49
\details{The procedure runs a structural Monte Carlo Markov Chain Model Choice (MC)^3 to find the most probable posterior network (DAG). The default algorithm is based on three MCMC move: edge addition, edge deletion and edge reversal. This algorithm is known as the (MC)^3. It is known to mix slowly and getting stuck in low probability regions. Indeed, changing of Markov equivalence region often requires multiple MCMC moves. Then large scale MCMC moves are implemented. Their relative frequency can be set by the user. The new edge reversal move (REV) from Grzegorczyk and Husmeier (2008) and the Markov blanket resampling (MBR) from Su and Borsuk (2016). The classical reversal move depends on the global configuration of the parents and children and fails to propose MCMC jumps that produce valid but very different DAGs in a unique move. The REV move sample globally a new set of parent. The MBR workaround applies the same idea but to the entire Markov blanket of a randomly chosen node.
Gilles Kratzer's avatar
Gilles Kratzer committed
50

Gilles Kratzer's avatar
typos    
Gilles Kratzer committed
51
The classical (MC)^3 is unbiased but inefficient in mixing, the two radical MCMC alternative move are known to massively accelerate mixing without introducing biases. But those move are computationally expensive. Then low frequencies are advised. The REV move is not necessarily ergotic , then it should not be used alone.
Gilles Kratzer's avatar
Gilles Kratzer committed
52

Gilles Kratzer's avatar
Gilles Kratzer committed
53
The parameter \code{start.dag} can be: "random", "hc" or user defined. If user select "random" then a random valid DAG is selected. The routine used favourise low density structure. If "hc" (for Hill-climber: \link[abn:search_heuristic]{searchHeuristic} then a DAG is selected using 100 different searches with 500 optimization steps. A user defined DAG can be provided. It should be a named square matrix  containing only zeros and ones. The DAG should be valid (i.e. acyclic).
54

Gilles Kratzer's avatar
typos    
Gilles Kratzer committed
55
The parameter \code{prior.choice} determines the prior used within each individual node for a given choice of parent combination. In Koivisto and Sood (2004) p.554 a form of prior is used which assumes that the prior probability for parent combinations comprising of the same number of parents are all equal. Specifically, that the prior probability for parent set G with cardinality |G| is proportional to 1/[n-1 choose |G|] where there are n total nodes. Note that this favours parent combinations with either very low or very high cardinality which may not be appropriate. This prior is used when \code{prior.choice=2}. When prior.choice=1 an uninformative prior is used where parent combinations of all cardinalities are equally likely. When \code{prior.choice=3} a user defined prior is used, defined by \code{prior.dag}. It is given by an adjacency matrix (squared and same size as number of nodes) where entries ranging from zero to one give the user prior belief. An hyper parameter defining the global user belief in the prior is given by \code{prior.lambda}.
Gilles Kratzer's avatar
Gilles Kratzer committed
56

Gilles Kratzer's avatar
Gilles Kratzer committed
57
58
MCMC sampler came with asymptotic statistical guarantees. Therefore it is highly advised to run multiple long enough chains. The burn in phase length (i.e throwing away first MCMC iterations) should be adequately chosen.

Gilles Kratzer's avatar
typos    
Gilles Kratzer committed
59
The argument \code{data.dists} must be a list with named arguments, one for each of the variables in \code{data.df}, where each entry is either "poisson","binomial", or "gaussian"
Gilles Kratzer's avatar
Gilles Kratzer committed
60
61
}

Gilles Kratzer's avatar
Gilles Kratzer committed
62
\value{A list with an entry for the list of sampled DAGs, the list of scores, the acceptance probability, the method used for each MCMC jump, the rejection status for each MCMC jump, the total number of iterations the thinning, the length of burn in phase and the named list of distribution per node. The returned object is of class mcmcabn.}
Gilles Kratzer's avatar
Gilles Kratzer committed
63
64
65

\author{Gilles Kratzer}

Gilles Kratzer's avatar
Gilles Kratzer committed
66
67
68
69
70
71
\references{
For the implementation of the function:

Kratzer, G. Furrer, R.  "Is a single unique Bayesian network enough to accurately represent your data?". arXiv preprint arXiv:1902.06641.

For the new edge reversal:
Gilles Kratzer's avatar
Gilles Kratzer committed
72

Gilles Kratzer's avatar
Gilles Kratzer committed
73
74
Grzegorczyk, M.Husmeier,  D.  "Improving the structure MCMC sampler for Bayesian networks by introducing a new edge reversal move", Machine Learning, vol. 71(2-3), pp. 265, 2008.

Gilles Kratzer's avatar
Gilles Kratzer committed
75
76
For the Markov Blanket resampling move:

Gilles Kratzer's avatar
Gilles Kratzer committed
77
Su, C.Borsuk, M. E.  "Improving structure MCMC for Bayesian networks through Markov blanket resampling", The Journal of Machine Learning Research, vol. 17(1), pp. 4042-4061, 2016.
Gilles Kratzer's avatar
Gilles Kratzer committed
78

Gilles Kratzer's avatar
Gilles Kratzer committed
79
80
81
82
83
84
For the Koivisto prior:

Koivisto, M. V. (2004). Exact Structure Discovery in Bayesian Networks, Journal of Machine Learning Research, vol 5, 549-573.

For the user defined prior:

Gilles Kratzer's avatar
Gilles Kratzer committed
85
86
Werhli, A. V., & Husmeier, D. (2007). "Reconstructing gene regulatory networks with Bayesian networks by combining expression data with multiple sources of prior knowledge". Statistical Applications in Genetics and Molecular Biology, 6 (Article 15).

Gilles Kratzer's avatar
Gilles Kratzer committed
87
88
Imoto, S., Higuchi, T., Goto, T., Tashiro, K., Kuhara, S., & Miyano, S. (2003). Using Bayesian networks for estimating gene networks from microarrays and biological knowledge. In Proceedings of the European Conference on Computational Biology.

Gilles Kratzer's avatar
Gilles Kratzer committed
89
For the asia dataset:
Gilles Kratzer's avatar
Gilles Kratzer committed
90

Gilles Kratzer's avatar
Gilles Kratzer committed
91
Scutari, M. (2010). Learning Bayesian Networks with the bnlearn R Package. Journal of Statistical Software, 35(3), 1 - 22. doi:http://dx.doi.org/10.18637/jss.v035.i03.
Gilles Kratzer's avatar
Gilles Kratzer committed
92
}
Gilles Kratzer's avatar
Gilles Kratzer committed
93

Gilles Kratzer's avatar
Gilles Kratzer committed
94

Gilles Kratzer's avatar
Gilles Kratzer committed
95
96
97
\examples{
## Example from the asia dataset from Lauritzen and Spiegelhalter (1988) provided by Scutari (2010)

98
99
100
101
102
# the number of MCMC run is delibaretelly chosen too small (computing time)
# no thinning (usually not recommended)
# no burn-in (usually not recommended,
# even if not supported by any theoretical arguments)

Gilles Kratzer's avatar
Gilles Kratzer committed
103
104
data("mcmc_run_asia")

105
106
107
108
# let us run: 0.03 REV, 0.03 MBR, 0.94 MC3 MCMC jumps
# with a random DAG as starting point

mcmc.out.asia.small <- mcmcabn(score.cache = bsc.compute.asia,
Gilles Kratzer's avatar
Gilles Kratzer committed
109
                  score = "mlik",
Gilles Kratzer's avatar
Gilles Kratzer committed
110
                  data.dists = dist.asia,
Gilles Kratzer's avatar
Gilles Kratzer committed
111
                  max.parents = 2,
112
                  mcmc.scheme = c(100,0,0),
Gilles Kratzer's avatar
Gilles Kratzer committed
113
                  seed = 321,
Gilles Kratzer's avatar
Gilles Kratzer committed
114
115
                  verbose = FALSE,
                  start.dag = "random",
Gilles Kratzer's avatar
Gilles Kratzer committed
116
117
118
                  prob.rev = 0.03,
                  prob.mbr = 0.03,
                  prior.choice = 2)
Gilles Kratzer's avatar
Gilles Kratzer committed
119

120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
summary(mcmc.out.asia.small)

# Uniquelly with MC3 moves

mcmc.out.asia.small <- mcmcabn(score.cache = bsc.compute.asia,
                  score = "mlik",
                  data.dists = dist.asia,
                  max.parents = 2,
                  mcmc.scheme = c(100,0,0),
                  seed = 42,
                  verbose = FALSE,
                  start.dag = "random",
                  prob.rev = 0,
                  prob.mbr = 0,
                  prior.choice = 2)

summary(mcmc.out.asia.small)

Gilles Kratzer's avatar
Gilles Kratzer committed
138
139
140
141
142
143
144
145
146
147
#let us define a starting DAG
startDag <- matrix(data = c(0, 0, 0, 1, 0, 0, 0, 0,
                            0, 0, 1, 0, 0, 0, 0, 0,
                            1, 0, 0, 0, 0, 0, 0, 0,
                            0, 0, 0, 0, 0, 1, 0, 0,
                            0, 0, 0, 0, 0, 0, 0, 0,
                            0, 0, 0, 0, 0, 0, 0, 0,
                            0, 0, 0, 0, 0, 0, 0, 0,
                            0, 0, 0, 0, 0, 0, 0, 0),nrow = 8,ncol = 8, byrow = TRUE)

148
149
colnames(startDag) <- rownames(startDag) <- names(dist.asia)

Gilles Kratzer's avatar
Gilles Kratzer committed
150
# Additionally, let us use the non informative prior
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
mcmc.out.asia.small <- mcmcabn(score.cache = bsc.compute.asia,
                  score = "mlik",
                  data.dists = dist.asia,
                  max.parents = 2,
                  mcmc.scheme = c(100,0,0),
                  seed = 42,
                  verbose = FALSE,
                  start.dag = startDag,
                  prob.rev = 0,
                  prob.mbr = 0,
                  prior.choice = 1)

summary(mcmc.out.asia.small)

# let us define our very own prior
# we know that there should be a link between Smoking and LungCancer nodes

Gilles Kratzer's avatar
Gilles Kratzer committed
168
169
# uninformative prior matrix
priorDag <- matrix(data = 0.5,nrow = 8,ncol = 8)
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
# name it
colnames(priorDag) <- rownames(priorDag) <- names(dist.asia)
# parent = smoking; child = LungCancer
priorDag["LungCancer","Smoking"] <- 1

mcmc.out.asia.small <- mcmcabn(score.cache = bsc.compute.asia,
                               score = "mlik",
                               data.dists = dist.asia,
                               max.parents = 2,
                               mcmc.scheme = c(100,0,0),
                               seed = 42,
                               verbose = FALSE,
                               start.dag = startDag,
                               prob.rev = 0,
                               prob.mbr = 0,
                               prior.choice = 3,
                               prior.dag = priorDag)

summary(mcmc.out.asia.small)
}