dRWRpipeline
estimates sample relationships (i.e. contact strength between samples)
from an input gene-sample matrix and an input graph. The pipeline
includes: 1) random walk with restart (RWR) on the input graph using the
input matrix as seeds; 2) calculation of contact strength (inner
products of RWR-smoothed columns of the input matrix); 3) estimation of
the contact significance by a randomisation procedure. It supports two
methods of applying RWR: 'direct' for directly applying RWR to the given
seeds; 'indirect' for first pre-computing the affinity matrix of the
input graph, and then deriving the affinity score. Parallel computing is
also supported on Linux or Mac operating systems.
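Steps 1 and 2 of the pipeline can be illustrated with a minimal base-R sketch. It is not the package's implementation: the toy path graph, the column normalisation, the rescaling of seeds to sum to 1, and the update rule p <- (1 - r) * W p + r * p0 are all assumptions chosen for illustration (the function itself lists 'laplacian' first among its normalisation options).

```r
# Toy undirected graph on 5 nodes (a path 1-2-3-4-5) as an adjacency matrix.
A <- matrix(0, 5, 5)
for (i in 1:4) { A[i, i + 1] <- 1; A[i + 1, i] <- 1 }

# Assumption: column-normalise so that each column of W sums to 1.
W <- sweep(A, 2, colSums(A), "/")

# Two seed sets as columns, each rescaled to sum to 1 (assumption).
seeds <- cbind(aSeeds = c(1, 0, 1, 0, 1), bSeeds = c(0, 0, 1, 0, 1))
P0 <- sweep(seeds, 2, colSums(seeds), "/")

# Step 1: iterate the assumed RWR update until the change is negligible.
r <- 0.75  # restart probability
P <- P0
for (iter in 1:1000) {
  P_new <- (1 - r) * W %*% P + r * P0
  delta <- max(abs(P_new - P))
  P <- P_new
  if (delta < 1e-12) break
}

# Step 2: contact strength as inner products of the RWR-smoothed columns.
contact <- t(P) %*% P
print(contact)
```

The result is a 2 x 2 symmetric matrix with one row and one column per seed set; the off-diagonal entry plays the role of the contact strength between the two samples.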
dRWRpipeline(data, g, method = c("direct", "indirect"),
    normalise = c("laplacian", "row", "column", "none"),
    restart = 0.75,
    normalise.affinity.matrix = c("none", "quantile"),
    permutation = c("random", "degree"), num.permutation = 10,
    p.adjust.method = c("BH", "BY", "bonferroni", "holm", "hochberg", "hommel"),
    adjp.cutoff = 0.05, parallel = TRUE, multicores = NULL,
    verbose = T)
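The 'parallel' option depends on the packages "foreach" and
"doParallel", which can be installed via:
source("http://bioconductor.org/biocLite.R");
biocLite(c("foreach","doParallel"))
If not yet installed, this option will be disabled.

an object of class "dContact", a list with the following components: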
ratio
: a symmetric matrix storing the ratio (the observed against the
expected) between pairwise samples
zscore
: a symmetric matrix storing the z-score between pairwise samples
pval
: a symmetric matrix storing the p-value between pairwise samples
adjpval
: a symmetric matrix storing the adjusted p-value between pairwise
samples
cgraph
: the constructed contact graph (as an 'igraph' object) under the cutoff
of the adjusted p-value
Amatrix
: a pre-computed affinity matrix when using the 'indirect' method; NULL
otherwise
call
: the call that produced this result
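How ratio, zscore, pval and adjpval relate to a set of permuted contact strengths can be sketched as follows. This is an illustration under assumptions, not the package's code: the observed and permuted values are made-up toy numbers, and the formulas (observed over the permutation mean, a standard z-score, and the fraction of permutations at least as large as the observed value) are one common way to define these quantities.

```r
# Hypothetical observed contact strength for one pair of samples.
obs <- 0.42

# Hypothetical contact strengths from 10 permutations (toy values).
perm <- c(0.30, 0.33, 0.35, 0.36, 0.34, 0.38, 0.32, 0.37, 0.31, 0.43)

# Assumed definitions of the per-pair statistics.
ratio  <- obs / mean(perm)               # observed against expected
zscore <- (obs - mean(perm)) / sd(perm)  # standardised deviation
pval   <- mean(perm >= obs)              # empirical p-value

# Adjust a vector of p-values (one per sample pair) for multiple testing;
# the other two p-values are hypothetical as well.
pvals <- c(pval, 0.10, 0.04)
adjp  <- p.adjust(pvals, method = "BH")

print(c(ratio = ratio, zscore = zscore, pval = pval))
print(adjp)
```

With only 10 permutations the empirical p-value can only take values in steps of 0.1, which is why a larger num.permutation gives finer resolution.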
The choice of RWR method depends on the number of seed sets and the
number of permutations used for the statistical test. If the product of
both numbers is huge, it is better to use the 'indirect' method (for a
single run). However, if the user wants to re-use a pre-computed
affinity matrix (i.e. re-use the input graph a lot), then it is highly
recommended to use dRWR and dRWRcontact sequentially instead.
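The trade-off between the two methods can be seen in a small base-R sketch. It assumes a column-normalised toy graph and the steady-state form p = r (I - (1 - r) W)^-1 p0 of the RWR update; neither is confirmed to match the package internals. The affinity matrix is computed once and then applied to any number of seed sets by a single matrix product, whereas the direct route repeats the iteration for each new set of seeds.

```r
# Toy path graph 1-2-3-4-5, column-normalised (assumption).
A <- matrix(0, 5, 5)
for (i in 1:4) { A[i, i + 1] <- 1; A[i + 1, i] <- 1 }
W <- sweep(A, 2, colSums(A), "/")
r <- 0.75  # restart probability

# Seed sets as columns, rescaled to sum to 1 (assumption).
seeds <- cbind(aSeeds = c(1, 0, 1, 0, 1), bSeeds = c(0, 0, 1, 0, 1))
P0 <- sweep(seeds, 2, colSums(seeds), "/")

# 'indirect' idea: pre-compute the affinity matrix once, then reuse it.
Amat <- r * solve(diag(5) - (1 - r) * W)
P_indirect <- Amat %*% P0

# 'direct' idea: iterate the update for these particular seeds.
P_direct <- P0
for (iter in 1:200) {
  P_direct <- (1 - r) * W %*% P_direct + r * P0
}

# Both routes should agree up to numerical error.
print(max(abs(P_direct - P_indirect)))
```

Once Amat exists, each permutation of the seeds costs only one matrix product, which is why pre-computation pays off when seed sets multiplied by permutations is large.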
# 1) generate a random graph according to the ER model
g <- erdos.renyi.game(100, 1/100)

# 2) produce the induced subgraph only based on the nodes in query
subg <- dNetInduce(g, V(g), knn=0)
V(subg)$name <- 1:vcount(subg)

# 3) estimate RWR-based sample relationships
# define sets of seeds as data
# each seed with equal weight (i.e. all non-zero entries are '1')
aSeeds <- c(1,0,1,0,1)
bSeeds <- c(0,0,1,0,1)
data <- data.frame(aSeeds,bSeeds)
rownames(data) <- 1:5
# calculate their contact graph
dContact <- dRWRpipeline(data=data, g=subg, parallel=FALSE)

Start at 2018-01-19 12:36:56
First, RWR on input graph (15 nodes and 14 edges) using input matrix (5 rows and 2 columns) as seeds (2018-01-19 12:36:56)...
Second, calculate contact strength (2018-01-19 12:36:56)...
Third, generate the distribution of contact strength based on 10 permutations on nodes respecting random (2018-01-19 12:36:56)...
    1 out of 10 (2018-01-19 12:36:56)
    2 out of 10 (2018-01-19 12:36:56)
    3 out of 10 (2018-01-19 12:36:56)
    4 out of 10 (2018-01-19 12:36:57)
    5 out of 10 (2018-01-19 12:36:57)
    6 out of 10 (2018-01-19 12:36:57)
    7 out of 10 (2018-01-19 12:36:57)
    8 out of 10 (2018-01-19 12:36:57)
    9 out of 10 (2018-01-19 12:36:57)
    10 out of 10 (2018-01-19 12:36:57)
Last, estimate the significance of contact strength: zscore, pvalue, and BH adjusted-pvalue (2018-01-19 12:36:58)...
Also, construct the contact graph under the cutoff 5.0e-02 of adjusted-pvalue (2018-01-19 12:36:58)...
Finish at 2018-01-19 12:36:58
Runtime in total is: 2 secs

dContact

$ratio
         [,1]     [,2]
[1,] 1.181814 1.191843
[2,] 1.191843 1.141409

$zscore
         [,1]     [,2]
[1,] 2.170178 2.121734
[2,] 2.121734 1.185599

$pval
     [,1] [,2]
[1,]    0  0.0
[2,]    0  0.1

$adjpval
     [,1] [,2]
[1,]    0  0.0
[2,]    0  0.1

$cgraph
IGRAPH 85ace61 U-W- 2 1 --
+ attr: weight (e/n)
+ edge from 85ace61:
[1] 1--2

$Amatrix
NULL

$call
dRWRpipeline(data = data, g = subg, parallel = FALSE)

$method
[1] "dnet"

attr(,"class")
[1] "dContact"
dRWRpipeline.r
dRWRpipeline.Rd
dRWRpipeline.pdf
dRWR, dRWRcontact, dCheckParallel