% query.Rd ---
% Author           : Gilles Kratzer
% Created on :       18.02.2019
% Last modification :
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\name{query}
\alias{query}
\title{Function to query MCMC samples generated by mcmcabn}

\usage{
query(mcmcabn = NULL,
      formula = NULL)
}

\arguments{
  \item{mcmcabn}{object of class \code{mcmcabn}.}
  \item{formula}{formula statement or adjacency matrix to query the MCMC samples, see details. If this argument is \code{NULL}, then the average arc-wise frequencies are reported.}
  }

\description{The function allows users to perform structural queries over MCMC samples produced by \code{mcmcabn}.
}

\details{The query can be formulated using either an adjacency matrix or a formula statement.

The adjacency matrix should be square, with dimension equal to the number of nodes in the network. Its entries should be 1, 0 or -1: a 1 marks a requested arc, a -1 marks an excluded arc, and a 0 marks an entry that is not subject to the query. Each row indicates the set of parents of the index node. The order of rows and columns should be the same as the one used in the \code{data.dist} argument of the \code{mcmcabn()} function.

The formula statement has been designed to ease querying over the MCMC sample. It allows users to make complex queries without explicitly writing an adjacency matrix (which can be painful when the number of variables is large). The formula argument is typically provided as a formula like:
\code{~ node1|parent1:parent2 + node2:node3|parent3}. The formula statement has to start with `~`. In this example, node1 has two parents (parent1 and parent2), while node2 and node3 share the same parent (parent3). The parent names have to exactly match those given in name. `:` is the separator between either children or parents, `|` separates children (left side) and parents (right side), `+` separates terms, and `.` replaces all the variables in name. Additionally, to exclude an arc, put `-` in front of the corresponding statement. For example, the formula \code{~ -node1|parent1} excludes all DAGs that have an arc from parent1 to node1.

If the \code{formula} argument is not provided, the function returns the average support of all individual arcs as a named matrix.
}

\value{The frequency of the requested query among the MCMC samples. If no query is given, a named matrix with arc-wise frequencies.}

\author{Gilles Kratzer}

\references{Kratzer, G., Furrer, R. (2019). "Is a single unique Bayesian network enough to accurately represent your data?". arXiv preprint arXiv:1902.06641.

Lauritzen, S., Spiegelhalter, D. (1988). "Local Computation with Probabilities on Graphical Structures and their Application to Expert Systems (with discussion)". Journal of the Royal Statistical Society: Series B, 50(2):157–224.

Scutari, M. (2010). "Learning Bayesian Networks with the bnlearn R Package". Journal of Statistical Software, 35(3):1–22. doi:http://dx.doi.org/10.18637/jss.v035.i03.
}

\examples{
## Example from the asia dataset from Lauritzen and Spiegelhalter (1988)
## provided by Scutari (2010)
data("mcmc_run_asia")

## Return a named matrix with individual arc support
query(mcmcabn = mcmc.out.asia)

## What is the probability of the LungCancer node being a child of the Smoking node?
query(mcmcabn = mcmc.out.asia, formula = ~LungCancer|Smoking)

## What is the probability of the Smoking node being a parent of
## both the LungCancer and Bronchitis nodes?
query(mcmcabn = mcmc.out.asia,
      formula = ~ LungCancer|Smoking+Bronchitis|Smoking)
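
## The same query should also be expressible with the `:` separator described
## in Details, which lists several children sharing one parent. This is a
## sketch based on that description, not a separately verified equivalence.
query(mcmcabn = mcmc.out.asia,
      formula = ~ LungCancer:Bronchitis|Smoking)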

## What is the probability of the previous statement when there
## is no arc from Smoking to Tuberculosis and none from Bronchitis to XRay?
query(mcmcabn = mcmc.out.asia,
      formula = ~LungCancer|Smoking + Bronchitis|Smoking -
                  Tuberculosis|Smoking - XRay|Bronchitis)
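
## A minimal sketch of a query written as an adjacency matrix (see Details).
## Assumption: the named matrix returned by query() carries the node names in
## the same order as the data.dist argument, so it can serve as a template.
## The object name query.mat is only illustrative.
query.mat <- query(mcmcabn = mcmc.out.asia)
query.mat[, ] <- 0                          # 0: entry not subject to the query
query.mat["LungCancer", "Smoking"] <- 1     # row = child, column = parent
query.mat["Tuberculosis", "Smoking"] <- -1  # exclude the arc Smoking -> Tuberculosis
query(mcmcabn = mcmc.out.asia, formula = query.mat)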
}