Hello again Tiago,

First of all thanks for the resources and help. As you may expect, I somewhat lack a specialized background in graph theory and so I sometimes miss the correct terminology.

As a reminder, this is a DAG. I haven’t, yet, implemented the random unbiased generator version of all_paths you kindly shared in those links, although I have it listed as a possible future step.

As a temporary workaround, I’ve been doing the following each time I want to randomly sample paths between two nodes:

  1. I calculate the min and max length of paths between the two nodes (the min using topology.shortest_distance, a good aproximation of the max is trivial).
  2. I sample a number between those min and max`.
  3. I execute topology.all_paths with that number as cutoff argument to obtain the paths generator.
  4. I then execute some sampling from that generator, with an iteration limit.

This is, of course, just a very crude and biased way of sampling, but at least it returns a different set of paths at each time it is executed. Until now, I’ve been assuming each edge has a weight of 1.

I now would like to test giving edges a weight in order for that cutoff argument to use it. I know the shortest_distance function accepts weights of edges in order to do Dijkstra. Could the same be done with all_paths such that the search is stopped if an accumulated weighted-distance is reached? Is there any (alternative) way of controlling which paths I get from the all_paths besides that cutoff argument? Or whatever specialized logic regarding path sampling / filtering I would have to implemented myself (just like the examples you shared)? Would this be something you would consider adding to graph-tool?

For example, I’ve been even wondering if I could just create a temporal view from the graph, by a randomly filtering edges or nodes before samplinh the paths, so as to, again, reduce the graph I will be sampling at each iteration.

as always thanks for your time and help!

Franco Peschiera

Message: 2
Date: Mon, 23 Mar 2020 21:37:52 +0000
From: Tiago de Paula Peixoto <tiago@skewed.de>
To: Main discussion list for the graph-tool project
        <graph-tool@skewed.de>
Subject: Re: [graph-tool] efficient random sampling of paths between
        two nodes
Message-ID: <e7b73784-607d-6644-f613-46af6d9bc1d1@skewed.de>
Content-Type: text/plain; charset=utf-8

Am 23.03.20 um 22:14 schrieb Franco Peschiera:
> Hello Tiago,
>
> First of all, thanks for your time.
>
> I see what you mean by having a biased logic that would prefer shorter
> paths to longer ones, I had not thought about that.
>
> Regarding the self-reference part, I think it would not be a problem
> because of the structure of my particular (directed) graph. In fact,
> each node represents an assignment *at some given time period* and the
> outward neighbors of a node represent assignments *in the future*. In
> this way, a path can never visit a previously visited node since there
> are no possible cycles. In fact I can easily calculate the shortest and
> longest possible path between two nodes (shortest: using graphql's
> `shortest_distance` method, longest= number of periods in between the
> two nodes).

Well, for DAG (directed acyclic graphs) the situation is quite
different, you should have said so in the beginning.

> So the paths I want to create (or sample) are just the different ways
> one can go from a node N1 (in period P1) to node N2 (in period P2 > P1).?
> I think that in my graph I could just sample neighbors with a weight
> that depends on how far they are (in number of periods) from the node:
> the farthest neighbor will have the least probability of being chosen.
> This way, I'd compensate the fact that shorter paths take less hops.
>
> What do you think?

Why do I get the impression I'm using google more than you to answer
your question?

Here is an approach using rejection sampling:

https://math.stackexchange.com/questions/2673132/a-procedure-for-sampling-paths-in-a-directed-acyclic-graph

Another approach is to count the number of paths that go through each
node (this is feasible for DAGs) and use this to sample directly, see:

https://pdfs.semanticscholar.org/0d74/e82c41124f83c842d5432abcb914ed1f410f.pdf

Best,
Tiago


--
Tiago de Paula Peixoto <tiago@skewed.de>