dNetPipeline
is designed to perform ab initio identification of the maximum-scoring
subgraph in the input graph, given node-level information
on significance (p-value or FDR). It returns an object of class
"igraph" or "graphNEL".
dNetPipeline(g, pval, method = c("pdf", "cdf", "customised"), significance.threshold = NULL, nsize = NULL, plot = F, verbose = T)
a subgraph with a maximum score, an object of class "igraph" or "graphNEL"
The pipeline consists of the following steps, in order:
dBUMfit, used to fit the p-value distribution under the beta-uniform mixture (BUM) model, and dBUMscore, used to calculate the scores according to the fitted BUM and the significance threshold. See also dFDRscore.
dNetFind, used to find the maximum-scoring subgraph from the input graph and the scores imposed on its nodes.
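To make the scoring step more concrete, here is a minimal base-R sketch of how a fitted BUM and an FDR threshold can be turned into node scores. It does not call dBUMfit or dBUMscore. The helper names (bum_density, bum_threshold, bum_score) are hypothetical, and the score formula is the commonly used log-ratio form, which is an assumption here and may differ from what dBUMscore computes.

```r
# BUM density: f(x) = lambda + (1 - lambda) * a * x^(a - 1)
bum_density <- function(x, lambda, a) {
  lambda + (1 - lambda) * a * x^(a - 1)
}

# P-value cutoff tau that controls the FDR under the BUM model
# (hypothetical helper, not part of dnet)
bum_threshold <- function(fdr, lambda, a) {
  pi_hat <- lambda + (1 - lambda) * a   # upper bound on the null proportion
  ((pi_hat - fdr * lambda) / (fdr * (1 - lambda)))^(1 / (a - 1))
}

# Node score (assumed form): positive when p < tau, negative when p > tau
bum_score <- function(p, fdr, lambda, a) {
  tau <- bum_threshold(fdr, lambda, a)
  (a - 1) * (log(p) - log(tau))
}

# Parameters as reported in the example log: lambda = 0.000, a = 0.530
tau <- bum_threshold(fdr = 0.1, lambda = 0, a = 0.53)
scores <- bum_score(c(0.001, 0.5), fdr = 0.1, lambda = 0, a = 0.53)
```

Under this convention, nodes with p-values below the cutoff get positive scores, so a maximum-scoring subgraph search is drawn towards them and penalised for every non-significant node it has to include.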
# 1) generate a vector consisting of random values from beta distribution
x <- rbeta(1000, shape1=0.5, shape2=1)
names(x) <- as.character(1:length(x))

# 2) generate a random graph according to the ER model
g <- erdos.renyi.game(1000, 1/100)

# 3) produce the induced subgraph only based on the nodes in query
subg <- dNetInduce(g, V(g), knn=0)

# 4) find maximum-scoring subgraph based on the given significance threshold
# 4a) assume the input is a list of p-values (controlling fdr=0.1)
subgraph <- dNetPipeline(g=subg, pval=x, significance.threshold=0.1)

Start at 2018-01-19 12:36:34
First, fit the input p-value distribution under beta-uniform mixture model (2018-01-19 12:36:34)...
    A total of p-values: 1000
    Maximum Log-Likelihood: 252.6
    Mixture parameter (lambda): 0.000
    Shape parameter (a): 0.530
Second, determine the significance threshold (2018-01-19 12:36:34)...
    significance threshold: 1.00e-01
Third, calculate the scores according to the fitted BUM and FDR threshold (if any) (2018-01-19 12:36:34)...
    Amongst 1000 scores, there are 154 positives.
Finally, find the subgraph from the input graph with 1000 nodes and 5039 edges (2018-01-19 12:36:34)...
    Size of the subgraph: 107 nodes and 113 edges
Finish at 2018-01-19 12:36:36
Runtime in total is: 2 secs

# 4b) assume the input is a list of customised significance (eg FDR directly)
subgraph <- dNetPipeline(g=subg, pval=x, method="customised", significance.threshold=0.1)

Start at 2018-01-19 12:36:36
First, consider the input fdr (or p-value) distribution (2018-01-19 12:36:36)...
Second, determine the significance threshold (2018-01-19 12:36:36)...
    significance threshold: 1.00e-01
Third, calculate the scores according to the input fdr (or p-value) and the threshold (if any) (2018-01-19 12:36:36)...
    Amongst 1000 scores, there are 300 positives.
Finally, find the subgraph from the input graph with 1000 nodes and 5039 edges (2018-01-19 12:36:36)...
    Size of the subgraph: 287 nodes and 433 edges
Finish at 2018-01-19 12:36:38
Runtime in total is: 2 secs

# 5) find maximum-scoring subgraph with the desired node number nsize=20
subgraph <- dNetPipeline(g=subg, pval=x, nsize=20)

Start at 2018-01-19 12:36:38
First, fit the input p-value distribution under beta-uniform mixture model (2018-01-19 12:36:38)...
    A total of p-values: 1000
    Maximum Log-Likelihood: 252.6
    Mixture parameter (lambda): 0.000
    Shape parameter (a): 0.530
Second, determine the significance threshold (2018-01-19 12:36:38)...
    Via constraint on the size of subnetwork to be identified (20 nodes)
    Scanning significance threshold at rough stage (2018-01-19 12:36:38)...
        significance threshold: 1.00e-05, corresponding to the network size (0 nodes) (2018-01-19 12:36:38)
        significance threshold: 1.00e-04, corresponding to the network size (0 nodes) (2018-01-19 12:36:38)
        significance threshold: 1.00e-03, corresponding to the network size (0 nodes) (2018-01-19 12:36:38)
        significance threshold: 1.00e-02, corresponding to the network size (2 nodes) (2018-01-19 12:36:38)
        significance threshold: 1.00e-01, corresponding to the network size (107 nodes) (2018-01-19 12:36:39)
    Scanning significance threshold at finetuning stage (2018-01-19 12:36:39)...
        significance threshold: 1.50e-02, corresponding to the network size (2 nodes) (2018-01-19 12:36:40)
        significance threshold: 2.00e-02, corresponding to the network size (6 nodes) (2018-01-19 12:36:40)
        significance threshold: 2.50e-02, corresponding to the network size (9 nodes) (2018-01-19 12:36:41)
        significance threshold: 3.00e-02, corresponding to the network size (9 nodes) (2018-01-19 12:36:41)
        significance threshold: 3.50e-02, corresponding to the network size (11 nodes) (2018-01-19 12:36:42)
        significance threshold: 4.00e-02, corresponding to the network size (11 nodes) (2018-01-19 12:36:43)
        significance threshold: 4.50e-02, corresponding to the network size (10 nodes) (2018-01-19 12:36:43)
        significance threshold: 5.00e-02, corresponding to the network size (29 nodes) (2018-01-19 12:36:44)
    significance threshold: 5.00e-02
Third, calculate the scores according to the fitted BUM and FDR threshold (if any) (2018-01-19 12:36:44)...
    Amongst 1000 scores, there are 72 positives.
Finally, find the subgraph from the input graph with 1000 nodes and 5039 edges (2018-01-19 12:36:44)...
    Size of the subgraph: 29 nodes and 29 edges
Finish at 2018-01-19 12:36:45
Runtime in total is: 7 secs
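
The log for nsize=20 suggests a two-stage search for the significance threshold: a rough scan over powers of ten, followed by a finer scan inside the bracketing interval until the subgraph reaches the requested size. A minimal base-R sketch of that idea follows. The function scan_threshold, its arguments, and the toy size_at function are all hypothetical. The grid spacing is chosen to mimic the log and is not taken from the dnet source.

```r
# Two-stage threshold scan (illustrative sketch, not dnet code).
# size_at: function mapping a threshold to the resulting subgraph size
# nsize:   desired number of nodes
scan_threshold <- function(size_at, nsize, rough = 10^(-5:-1), n_fine = 18) {
  # Rough stage: first power of ten whose subgraph is large enough
  sizes <- vapply(rough, size_at, numeric(1))
  hit <- which(sizes >= nsize)[1]
  if (is.na(hit)) return(NA_real_)      # no rough threshold is large enough
  if (hit == 1) return(rough[1])        # smallest threshold already suffices
  # Fine stage: evenly spaced thresholds between the bracketing values
  lo <- rough[hit - 1]
  hi <- rough[hit]
  fine <- seq(lo, hi, length.out = n_fine + 1)[-1]
  for (t in fine) {
    if (size_at(t) >= nsize) return(t)
  }
  hi
}

# Toy stand-in for "run the pipeline at threshold t and count the nodes"
size_at <- function(t) floor(1000 * t^1.5)

best <- scan_threshold(size_at, nsize = 20)
```

In the real pipeline each evaluation of the size involves a full subgraph search, which is why the run with nsize takes longer than the runs with a fixed threshold. The log also shows that the size is not strictly monotone in the threshold (11 nodes, then 10), so the result is the first threshold that meets the size, not necessarily an exact match.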
dNetPipeline.r
dNetPipeline.Rd
dNetPipeline.pdf
dBUMfit, dBUMscore, dFDRscore, dNetFind