SciPost Submission Page
"Spectrally gapped" random walks on networks: a Mean First Passage Time formula
by Silvia Bartolucci, Fabio Caccioli, Francesco Caravelli, Pierpaolo Vivo
This is not the latest submitted version.
Submission summary
| Authors (as registered SciPost users): | Francesco Caravelli · Pierpaolo Vivo |
| Submission information | |
|---|---|
| Preprint Link: | https://arxiv.org/abs/2106.02730v2 (pdf) |
| Date submitted: | June 11, 2021, 6:28 p.m. |
| Submitted by: | Pierpaolo Vivo |
| Submitted to: | SciPost Physics |
| Ontological classification | |
|---|---|
| Academic field: | Physics |
| Specialties: |
|
| Approach: | Theoretical |
Abstract
We derive an approximate but explicit formula for the Mean First Passage Time of a random walker between a source and a target node of a directed and weighted network. The formula does not require any matrix inversion, and takes as only input the transition probabilities into the target node. It is derived from the calculation of the average resolvent of a deformed ensemble of random sub-stochastic matrices $A=\langle A\rangle +\delta A$, with $\langle A\rangle$ rank-$1$ and non-negative. The accuracy of the formula depends on the spectral gap of the reduced transition matrix, and it is tested numerically on several instances of (weighted) networks away from the high sparsity regime, with an excellent agreement.
Current status:
Reports on this Submission
Report #2 by Anonymous (Referee 2) on 2021-7-22 (Invited Report)
- Cite as: Anonymous, Report on arXiv:2106.02730v2, delivered 2021-07-22, doi: 10.21468/SciPost.Report.3279
Report
Dear Editor
The manuscript deals with the interesting and important problem of the mean first passage time on random walks on networks. The main result, summarized in Eq. (5), is based on a previous work or the same authors (refs 36-37) and provides an elegant formula for the mean first passage time in terms of the transition matrix without any need to invert matrices, iterate them etc. In that sense, this work does provide a step forward in the field and a practical result. Furthermore, this formula is applicable to weighted and directed networks, which is an obvious strength. The main limitation of this work is the fact that it requires a significant spectral gap of the reduced transition matrix in order to perform well. In practical terms, this limits the applicability to very sparse networks. My main criticism about this manuscript in its current form is that it is not 100% well written, and could benefit from further clarifications and examples. Overall, I think that this work represents an important contribution and should be published once the authors consider the following comments:
-
Right at the beginning (around eqs (1)-(2)) the authors define the adjacency matrix A and the transition matrix T. traditionally the adjacency matrix is a binary matrix, with 0/1 entries. I believe that the authors intend to use here a weighted version of A, and this should be stated clearly. Furthermore, it would be useful to stress that A does not have to be symmetric so that it encodes directed graphs too.
-
In chapter 2 the authors use a random matrix A which, if I understand correctly, is not the adjacency matrix, but a different object - namely the reduced transition matrix. I find this notation confusing, and it would be best to replace the notation. It would also be useful to explain in section 2 that A is not the adjacency matrix.
-
In the context of Eq. (13) the authors say that "it is rather straightforward to show that...| It would be useful to add a reference here, or provide an explanation.
-
In Eq. (21) the authors formulate the cumulant decay condition. While this is a sound technical definition, it would be useful to try and provide some insight into scenarios where this is obeyed vs. scenarios where this is not obeyed.
6.In Sec. 3.2 the authors consider Erdos-Renyi networks. It is not completely clear if they use here a directed or undirected version of the network, and this should be explained. Also, I guess that the weights are incorporated into the adjacency matrix directly, and this should be stated more clearly.
-
A second issue with ER networks is regarding the reference to the strongly connected component of the ER network. The authors correctly mention at the end of the section that c=ln N=6.21 is the connectedness threshold. I believe it would be useful to mention this background fact at the beginning of the section to remind the readers of this fact. Furthermore, in the example presented in Fig. 3 the authors use c=7 and c=190. These values are already above c=6.21 and hence the issues with the connectedness threshold are ignored. I believe that it would be useful to present a sparser ER network example as well, say for c=4. I understands that in this limit the spectral gap is much smaller and the validity of the formula is weaker, but I believe it will be useful to probe more this limit in addition to the c=7 and 190. Furthermore, the giant component of the ER network exhibits degree-degree correlations (it is disassortative) below the connectedness threshold, and it would be interesting to see the effect of this.
-
In the context of ER as well as Random Regular Graphs, in the limit of large c's - would it be possible to provide a large c approximation for Eq. (5)? As is well known, the spectrum of such dense ER/RRG networks approaches a simple form and it might yield a simple formula for the Mean First Passage Time in this limit.
-
In Fig. 4 the authors present results for c=7 and c=190 for Random Regular Graphs. It would be useful to include c=4 too to probe the less dense limit, where the assumptions of Eq. (5) break down. I thin this could add insight into the result. Furthermore, already for c=7 we see deviations. Can the author explain whether these deviations are systematic? It seems that the approximation tends to underestimate the exact result, hence most dots lie below the line. Is it a consistent result over many realizations or an accidental one?
-
In the captions of Tables 1 and 2, the authors report a comparison of "numerical", exact and approximate results. First, by "numerical" they really mean results of a simulations (as written in the main text). Therefore, it would be better to replace "numerical" by simulations. Also, it would be useful to add a specific reference to the equation used for the approximate and exact results - which equations were precisely used here? Another issue – the table reports a specific pair (i,j) – which pair? What were the degrees of the nodes i and j? why not average over all pairs?
-
On the broader picture - could the authors say anything about the distribution of first passage times? What quantity would be needed, from the random matrix side, in order to provide a similar result for the full distribution? Any comment here would be highly appreciated.
To summarize, this work provides a useful result for the mean first passage time in weighted and directed networks. I would be happy to recommend publication of the paper once the authors consider the comments mentioned above.
Report #1 by Anonymous (Referee 1) on 2021-7-20 (Invited Report)
- Cite as: Anonymous, Report on arXiv:2106.02730v2, delivered 2021-07-20, doi: 10.21468/SciPost.Report.3270
Strengths
-
explicit formula to compute MFPT on weighted graphs
-
the formula requires local calculations, no matrix inversion which can be computationally cumbersome and numerically inaccurate.
-
validity limits are theoretically established exploiting random matrix theory and replica calculations
Weaknesses
- the proposed formula is an approximation that only works on moderately dense graphs, there is no simple way to link the inaccuracy on sparse graphs with their local structural properties (average degree, etc.).
Report
The main result is sound and relevant, it provides a novel link between different research areas such as random matrices, disordered systems and complex networks.
The paper is well organised, clearly written, with a very pedagogical discussion of the physical meaning of the formula, which turns out to be as useful for the general public as the more technical calculations. The introduction to the problem is non-technical and accessible to a broad interdisciplinary audience. Relevant literature is cited and the state of the art is explicitly discussed in the manuscript. All theoretical results are reproducible from the calculations and numerical results could be with a straightforward algorithmic implementation of the main formula. Conclusions are clear, and in addition to summarise results, they provide some perspectives for future work. For all these reasons I think the manuscript meets all journal’s acceptance criteria and strongly recommend its publication almost in the current form (some minor amendments are suggested below).
Minor points:
— In Section 1.2, the sentence “The formula (5) is strikingly simple, not relying on a large-N approximation, …” is a bit misleading. The derivation of the formula uses a saddle-point calculation with a decay condition on some cumulant; in this sense it seems to rely on N being large (on a specific random matrix ensemble). However, I understand that the formula can be applied on any graph, for finite N, and in practice its validity only requires that the spectral gap is large enough. Is this what you mean? Please, just explain a bit better this sentence in order to avoid confusion in the reader.
— In the case of random regular graph the accuracy of the formula deteriorates considerably for low degree. In Table 2, the case for c=4 is surprisingly bad if compared with the errors reported for larger degrees and to the case of Erdos-Renyi random graphs. Do you have any idea about the reason? Could it be related to the fact that in random regular graphs entries of the adjacency matrix are much more correlated than in Erdos-Renyi graphs?
— Again on the same point, a random graph of N=500 nodes with an average degree of O(10) is not very sparse. In order to be more conclusive on the applicability of the formula on sparse graph it would be useful to show results (e.g. the scatterplots or some other metrics) at fixed average degree for different values of N (e.g. N=500,1000, 5000).
— The Authors say that “We do not report here on (dense) scale-free topologies, whose phenomenology is very similar to the other cases, albeit with significantly larger fluctuations: a detailed study of this (and other) heterogeneous cases is deferred to a separate publication.” And also “While our approach continues to work for (sufficiently dense) heterogeneous networks, the intra-row fluctuations around the random matrix assumption ⟨aij⟩ = zi/N may be very significant there and will thus require a more careful treatment“.
I do not criticise the decision of deferring the analysis of heterogenous structures to a different publication, however the present paper would really benefit of a deeper discussion on the limitations to the applicability of the formula in the case of heterogenous networks. In fact, from the fact that the formula only cares about incoming weights, one would expect that it could work pretty well when computing MFPT on hubs and central nodes and possibly much worse on poorly connected regions of the graphs. Is this correct? Is there any known result on the spectral property (spectral gap) of heterogenous graphs which prevents/limits the applicability of the formula to such graphs?
Requested changes
No major changes, some minor revisions are suggested (recommended) in the report.
see attached pdf

Author: Pierpaolo Vivo on 2021-08-18 [id 1681]
(in reply to Report 2 on 2021-07-22)see attached pdf
Attachment:
RebuttalLetter_FINAL.pdf