% query.Rd ---
% Author           : Gilles Kratzer
% Created on       : 18.02.2019
% Last modification :
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\name{query}
\alias{query}
\title{Function to query MCMC samples generated by mcmcabn}

\usage{
query(mcmcabn = NULL, formula = NULL)
}

\arguments{
  \item{mcmcabn}{object of class \code{mcmcabn}.}
  \item{formula}{formula statement or adjacency matrix to query the MCMC samples, see details. If this argument is \code{NULL}, then the average arc-wise frequencies are reported.}
}

\description{The function allows users to perform structural queries over MCMC samples produced by \code{mcmcabn}.
}

\details{The query can be formulated using an adjacency matrix or a formula-wise expression.

The adjacency matrix should be square, with dimension equal to the number of nodes in the network. Its entries should be either 1, 0 or -1. A 1 indicates a requested arc, a -1 an excluded arc, and a 0 an entry that is not subject to the query. The rows indicate the sets of parents of the index nodes. The order of rows and columns should be the same as the one used in the \code{data.dist} argument of the \code{mcmcabn()} function.

The formula statement has been designed to ease querying over the MCMC sample. It allows users to make complex queries without explicitly writing an adjacency matrix (which can be painful when the number of variables is large).
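
The adjacency-matrix form of a query described above can be sketched as follows. This is a minimal illustration, not a definitive recipe: it assumes that the matrix returned by \code{query()} without a \code{formula} carries the node names as row names in the order expected by the function, and that rows index children while columns index parents. The object name \code{q} is introduced here for illustration only.

\preformatted{
library(mcmcabn)
data("mcmc_run_asia")

## Assumption: the arc-wise support matrix is named with the node names,
## in the same order as used by mcmcabn()
arc.support <- query(mcmcabn = mcmc.out.asia)
nodes <- rownames(arc.support)

## Start from a matrix of zeros: no entry is subject to the query
q <- matrix(0, nrow = length(nodes), ncol = length(nodes),
            dimnames = list(nodes, nodes))

## Request the arc Smoking -> LungCancer (row = child, column = parent)
q["LungCancer", "Smoking"] <- 1

## Exclude the arc Smoking -> Tuberculosis
q["Tuberculosis", "Smoking"] <- -1

## Frequency of sampled DAGs matching the query
query(mcmcabn = mcmc.out.asia, formula = q)
}

Under these assumptions, the call should correspond to the formula query \code{~ LungCancer|Smoking - Tuberculosis|Smoking}.
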
The formula argument can typically be provided as a formula like:
\code{~ node1|parent1:parent2 + node2:node3|parent3}. The formula statement has to start with \code{~}. In this example, node1 has two parents (parent1 and parent2). node2 and node3 have the same parent3. The parents' names have to exactly match those given in name. \code{:} is the separator between either children or parents, \code{|} separates children (left side) and parents (right side), \code{+} separates terms, and \code{.} replaces all the variables in name. Additionally, when one wants to exclude an arc, simply put \code{-} in front of that statement. A formula like \code{~ -node1|parent1} then excludes all DAGs that have an arc between parent1 and node1. If the formula argument is not provided, the function returns the average support of all individual arcs using a named matrix.
}

\value{A frequency for the requested query. Alternatively, a matrix with arc-wise frequencies.}

\author{Gilles Kratzer}

\references{Kratzer, G., Furrer, R. "Is a single unique Bayesian network enough to accurately represent your data?". arXiv preprint arXiv:1902.06641.

Lauritzen, S., Spiegelhalter, D. (1988). "Local Computation with Probabilities on Graphical Structures and their Application to Expert Systems (with discussion)". Journal of the Royal Statistical Society: Series B, 50(2):157–224.

Scutari, M. (2010). Learning Bayesian Networks with the bnlearn R Package. Journal of Statistical Software, 35(3), 1–22. doi:http://dx.doi.org/10.18637/jss.v035.i03.
}

\examples{
## Example from the asia dataset from Lauritzen and Spiegelhalter (1988)
## provided by Scutari (2010)
data("mcmc_run_asia")

## Return a named matrix with individual arc support
query(mcmcabn = mcmc.out.asia)

## What is the probability of the LungCancer node being a child of the Smoking node?
query(mcmcabn = mcmc.out.asia, formula = ~ LungCancer|Smoking)

## What is the probability of the Smoking node being a parent of
## both the LungCancer and Bronchitis nodes?
query(mcmcabn = mcmc.out.asia,
      formula = ~ LungCancer|Smoking + Bronchitis|Smoking)

## What is the probability of the previous statement, when there
## is no arc from Smoking to Tuberculosis and from Bronchitis to XRay?
query(mcmcabn = mcmc.out.asia,
      formula = ~ LungCancer|Smoking + Bronchitis|Smoking -
                  Tuberculosis|Smoking - XRay|Bronchitis)
}