All Pairs Shortest Path with limitations
Hi together, I have a question on how you would tackle the following: I have a graph and want to calculate all possible shortest paths but exclude a few nodes as starting / stopping positions. I can not filter out those vertices, since paths should be allowed to pass them - just not start or terminate there. Calculating all possible shortest paths manually is computationally infeasible. Thanks a lot! Stephan _______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
Hi Stephan, If the number of start-end pairs is not high you could compute all shortest paths and then remove the ones you are not interested in. Best, Aleks On Wed, Nov 17, 2021 at 3:55 PM Monecke, Stephan <stephan.monecke@ds.mpg.de> wrote:
Hi together,
I have a question on how you would tackle the following:
I have a graph and want to calculate all possible shortest paths but exclude a few nodes as starting / stopping positions.
I can not filter out those vertices, since paths should be allowed to pass them - just not start or terminate there.
Calculating all possible shortest paths manually is computationally infeasible.
Thanks a lot!
Stephan
_______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
_______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
Hi Aleks! I thank you for your answer! If already tried that but have not been successful speed-wise. A full APSP of a very small test graph already takes around 200 ms. The fastest I could get as described below runs in about 3.6 s. :/ The main reason is that shortest_distance() has a "single source, all targets" mode but no "single target, all sources" mode (the reverse) - as soon as "source" is empty, it defaults to APSP, neglecting "target". Hence I have to do this part manually using shortest_distance(G, u, v).* My bottleneck is the APSP calculation. Cheers, Stephan *Since the graph is directed, we have the following: Say V is the set of all vertices and B a subset of V I want to exclude as sources / targets from APSP. Then the set of points is [ (V x V) \ [ (V x B) + (B x V) ] ] + (B x B). Or in graph-tool terms: - (V x V): shortest_distance(G) - (V x B): shortest_distance(G, v, b) for b in B for v in V (horrible run-time) - (B x V): shortest_distance(G, b, _) for b in B - (B x B): shortest_distance(G, u, v) for u,v in B x B ________________________________ Von: Aleksandar Trifunovic <akstrfn@gmail.com> Gesendet: Mittwoch, 17. November 2021 17:54 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations Hi Stephan, If the number of start-end pairs is not high you could compute all shortest paths and then remove the ones you are not interested in. Best, Aleks On Wed, Nov 17, 2021 at 3:55 PM Monecke, Stephan <stephan.monecke@ds.mpg.de> wrote:
Hi together,
I have a question on how you would tackle the following:
I have a graph and want to calculate all possible shortest paths but exclude a few nodes as starting / stopping positions.
I can not filter out those vertices, since paths should be allowed to pass them - just not start or terminate there.
Calculating all possible shortest paths manually is computationally infeasible.
Thanks a lot!
Stephan
_______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
_______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de _______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
Am 17.11.21 um 19:22 schrieb Monecke, Stephan:
The main reason is that shortest_distance() has a "single source, all targets" mode but no "single target, all sources" mode (the reverse) - as soon as "source" is empty, it defaults to APSP, neglecting "target".
Of course it does. Just reverse the graph with: GraphView(g, reversed=True) -- Tiago de Paula Peixoto <tiago@skewed.de> _______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
🥳 I didn't know that existed! Thanks a ton, I'm really looking forward to test it out tomorrow! Cheers, Stephan -- Sent from my mobile device. Please excuse my brevity. Am 17. November 2021 19:27:37 MEZ schrieb Tiago de Paula Peixoto <tiago@skewed.de>:
Am 17.11.21 um 19:22 schrieb Monecke, Stephan:
The main reason is that shortest_distance() has a "single source, all targets" mode but no "single target, all sources" mode (the reverse) - as soon as "source" is empty, it defaults to APSP, neglecting "target".
Of course it does. Just reverse the graph with:
GraphView(g, reversed=True)
_______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
This works like a charm! With it, I'm down to 228 ms ± 1.31 ms from 3.6 s ( 82.9 ms ± 2.45 ms for shortest_distance(G) ). It's still very very expensive as it's 2.75 times APSP but approaching reasonability. The costly remaining part is the manual B x B.* I tried to speed it up by the approximation to limit G to B since I know that B is connected but graph-tool segfaults right away (I wrote another ticket for that). :/ If anyone else has any ideas on how to speed this up, I'd be very grateful! Cheers, Stephan - - - o - - - *Since the graph is directed, we have the following: Say V is the set of all vertices and B a subset of V I want to exclude as sources / targets from APSP. Then the set of points is (V\B) x (V\B) = [ (V x V) \ [ (V x B) + (B x V) ] ] U (B x B). Or in graph-tool terms: - (V x V): shortest_distance(G) - (B x V): shortest_distance(G, b, _) for b in B - (V x B): shortest_distance(GraphView(G, reversed=True), b, _) for b in B - (B x B): shortest_distance(G, u, v) for u,v in B x B Von: Stephan Monecke <smoneck@ds.mpg.de> Gesendet: Mittwoch, 17. November 2021 20:24 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations 🥳 I didn't know that existed! Thanks a ton, I'm really looking forward to test it out tomorrow! Cheers, Stephan -- Sent from my mobile device. Please excuse my brevity. Am 17. November 2021 19:27:37 MEZ schrieb Tiago de Paula Peixoto <tiago@skewed.de>:
Am 17.11.21 um 19:22 schrieb Monecke, Stephan:
The main reason is that shortest_distance() has a "single source, all targets" mode but no "single target, all sources" mode (the reverse) - as soon as "source" is empty, it defaults to APSP, neglecting "target".
Of course it does. Just reverse the graph with:
GraphView(g, reversed=True)
_______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de _______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
Hi Tiago and the list, my algorithm is still _infeasibly_ slow and > 90 % of the run-time results from shortest_path(). Do you have any idea on how this could be sped up? I have the graph G, the set of all nodes V and a sub-set B ⊂ V. I want to calculate or approximate the mean shortest distance between all pairs V\B x V\B but on the complete graph - or in other words, calculate the mean of all pairs shortest distance ( mean(shortest_distance()) ) but exclude n∈B from being source or target. Sub-sampling would be okay as well. I currently don't see a way to do this fast with graph-tool besides writing a custom function but that probably exceeds my capacities. Am I missing something? Do you have any ideas? Ideal would be something like a multi-threaded shortest_distance(g=G, source=V\B, target=V\B, weights=G.edge_properties['weight'], samples=N) with N pairs random sub-sampling and source / target being lists. Thanks a ton! Stephan ________________________________ Von: Monecke, Stephan <stephan.monecke@ds.mpg.de> Gesendet: Donnerstag, 18. November 2021 14:59 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations This works like a charm! With it, I'm down to 228 ms ± 1.31 ms from 3.6 s ( 82.9 ms ± 2.45 ms for shortest_distance(G) ). It's still very very expensive as it's 2.75 times APSP but approaching reasonability. The costly remaining part is the manual B x B.* I tried to speed it up by the approximation to limit G to B since I know that B is connected but graph-tool segfaults right away (I wrote another ticket for that). :/ If anyone else has any ideas on how to speed this up, I'd be very grateful! Cheers, Stephan - - - o - - - *Since the graph is directed, we have the following: Say V is the set of all vertices and B a subset of V I want to exclude as sources / targets from APSP. Then the set of points is (V\B) x (V\B) = [ (V x V) \ [ (V x B) + (B x V) ] ] U (B x B). Or in graph-tool terms: - (V x V): shortest_distance(G) - (B x V): shortest_distance(G, b, _) for b in B - (V x B): shortest_distance(GraphView(G, reversed=True), b, _) for b in B - (B x B): shortest_distance(G, u, v) for u,v in B x B Von: Stephan Monecke <smoneck@ds.mpg.de> Gesendet: Mittwoch, 17. November 2021 20:24 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations 🥳 I didn't know that existed! Thanks a ton, I'm really looking forward to test it out tomorrow! Cheers, Stephan -- Sent from my mobile device. Please excuse my brevity. Am 17. November 2021 19:27:37 MEZ schrieb Tiago de Paula Peixoto <tiago@skewed.de>:
Am 17.11.21 um 19:22 schrieb Monecke, Stephan:
The main reason is that shortest_distance() has a "single source, all targets" mode but no "single target, all sources" mode (the reverse) - as soon as "source" is empty, it defaults to APSP, neglecting "target".
Of course it does. Just reverse the graph with:
GraphView(g, reversed=True)
_______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de _______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de _______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
Hi Stephan, I've used graph tool to compute a lot of shortest paths and used a dumb solution of just throwing it into multiprocessing module from python (with some lru_cache tricks). If you have enough memory and you don't call the main procedure often (seems to be the case?) then the overhead of multiprocessing becomes insignificant. A benefit is that this is pretty easy to code and scales perfectly to more cores (until you run out of memory). Best, Aleks On Wed, Nov 24, 2021 at 3:32 PM Monecke, Stephan <stephan.monecke@ds.mpg.de> wrote:
Hi Tiago and the list,
my algorithm is still _infeasibly_ slow and > 90 % of the run-time results from shortest_path().
Do you have any idea on how this could be sped up?
I have the graph G, the set of all nodes V and a sub-set B ⊂ V.
I want to calculate or approximate the mean shortest distance between all pairs V\B x V\B but on the complete graph - or in other words, calculate the mean of all pairs shortest distance ( mean(shortest_distance()) ) but exclude n∈B from being source or target.
Sub-sampling would be okay as well.
I currently don't see a way to do this fast with graph-tool besides writing a custom function but that probably exceeds my capacities. Am I missing something? Do you have any ideas?
Ideal would be something like a multi-threaded shortest_distance(g=G, source=V\B, target=V\B, weights=G.edge_properties['weight'], samples=N) with N pairs random sub-sampling and source / target being lists.
Thanks a ton!
Stephan
________________________________ Von: Monecke, Stephan <stephan.monecke@ds.mpg.de> Gesendet: Donnerstag, 18. November 2021 14:59 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations
This works like a charm!
With it, I'm down to 228 ms ± 1.31 ms from 3.6 s ( 82.9 ms ± 2.45 ms for shortest_distance(G) ).
It's still very very expensive as it's 2.75 times APSP but approaching reasonability. The costly remaining part is the manual B x B.*
I tried to speed it up by the approximation to limit G to B since I know that B is connected but graph-tool segfaults right away (I wrote another ticket for that). :/
If anyone else has any ideas on how to speed this up, I'd be very grateful!
Cheers, Stephan
- - - o - - -
*Since the graph is directed, we have the following: Say V is the set of all vertices and B a subset of V I want to exclude as sources / targets from APSP. Then the set of points is (V\B) x (V\B) = [ (V x V) \ [ (V x B) + (B x V) ] ] U (B x B).
Or in graph-tool terms: - (V x V): shortest_distance(G) - (B x V): shortest_distance(G, b, _) for b in B - (V x B): shortest_distance(GraphView(G, reversed=True), b, _) for b in B - (B x B): shortest_distance(G, u, v) for u,v in B x B
Von: Stephan Monecke <smoneck@ds.mpg.de> Gesendet: Mittwoch, 17. November 2021 20:24 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations
🥳 I didn't know that existed!
Thanks a ton, I'm really looking forward to test it out tomorrow!
Cheers, Stephan -- Sent from my mobile device. Please excuse my brevity.
Am 17. November 2021 19:27:37 MEZ schrieb Tiago de Paula Peixoto <tiago@skewed.de>:
Am 17.11.21 um 19:22 schrieb Monecke, Stephan:
The main reason is that shortest_distance() has a "single source, all targets" mode but no "single target, all sources" mode (the reverse) - as soon as "source" is empty, it defaults to APSP, neglecting "target".
Of course it does. Just reverse the graph with:
GraphView(g, reversed=True)
_______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
_______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de _______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
_______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
Hi Aleks! The multiprocessing module is what I'm currently using as a workaround until something other pops up - I just throw it into a mp.Pool. For the initial debugging / get-to-know phase, so to say. I don't know what exactly you've put into the multiprocessing module but I don't see how this could be of help in my case. I'm running a metropolis algorithm that runs hours on a very small graph. Each step depends on the previous. On the graphs I'm actually - or at least - aiming for, it would probably run at least over a week. I also have a large parameter space and would like to try something like lipo* but same here: Just 100 iterations already approach 2 weeks on the tiny graph and I can't multiprocess it. Cheers, Stephan *https://pypi.org/project/lipo/ ________________________________ Von: Aleksandar Trifunovic <akstrfn@gmail.com> Gesendet: Mittwoch, 24. November 2021 16:44 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations Hi Stephan, I've used graph tool to compute a lot of shortest paths and used a dumb solution of just throwing it into multiprocessing module from python (with some lru_cache tricks). If you have enough memory and you don't call the main procedure often (seems to be the case?) then the overhead of multiprocessing becomes insignificant. A benefit is that this is pretty easy to code and scales perfectly to more cores (until you run out of memory). Best, Aleks On Wed, Nov 24, 2021 at 3:32 PM Monecke, Stephan <stephan.monecke@ds.mpg.de> wrote:
Hi Tiago and the list,
my algorithm is still _infeasibly_ slow and > 90 % of the run-time results from shortest_path().
Do you have any idea on how this could be sped up?
I have the graph G, the set of all nodes V and a sub-set B ⊂ V.
I want to calculate or approximate the mean shortest distance between all pairs V\B x V\B but on the complete graph - or in other words, calculate the mean of all pairs shortest distance ( mean(shortest_distance()) ) but exclude n∈B from being source or target.
Sub-sampling would be okay as well.
I currently don't see a way to do this fast with graph-tool besides writing a custom function but that probably exceeds my capacities. Am I missing something? Do you have any ideas?
Ideal would be something like a multi-threaded shortest_distance(g=G, source=V\B, target=V\B, weights=G.edge_properties['weight'], samples=N) with N pairs random sub-sampling and source / target being lists.
Thanks a ton!
Stephan
________________________________ Von: Monecke, Stephan <stephan.monecke@ds.mpg.de> Gesendet: Donnerstag, 18. November 2021 14:59 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations
This works like a charm!
With it, I'm down to 228 ms ± 1.31 ms from 3.6 s ( 82.9 ms ± 2.45 ms for shortest_distance(G) ).
It's still very very expensive as it's 2.75 times APSP but approaching reasonability. The costly remaining part is the manual B x B.*
I tried to speed it up by the approximation to limit G to B since I know that B is connected but graph-tool segfaults right away (I wrote another ticket for that). :/
If anyone else has any ideas on how to speed this up, I'd be very grateful!
Cheers, Stephan
- - - o - - -
*Since the graph is directed, we have the following: Say V is the set of all vertices and B a subset of V I want to exclude as sources / targets from APSP. Then the set of points is (V\B) x (V\B) = [ (V x V) \ [ (V x B) + (B x V) ] ] U (B x B).
Or in graph-tool terms: - (V x V): shortest_distance(G) - (B x V): shortest_distance(G, b, _) for b in B - (V x B): shortest_distance(GraphView(G, reversed=True), b, _) for b in B - (B x B): shortest_distance(G, u, v) for u,v in B x B
Von: Stephan Monecke <smoneck@ds.mpg.de> Gesendet: Mittwoch, 17. November 2021 20:24 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations
🥳 I didn't know that existed!
Thanks a ton, I'm really looking forward to test it out tomorrow!
Cheers, Stephan -- Sent from my mobile device. Please excuse my brevity.
Am 17. November 2021 19:27:37 MEZ schrieb Tiago de Paula Peixoto <tiago@skewed.de>:
Am 17.11.21 um 19:22 schrieb Monecke, Stephan:
The main reason is that shortest_distance() has a "single source, all targets" mode but no "single target, all sources" mode (the reverse) - as soon as "source" is empty, it defaults to APSP, neglecting "target".
Of course it does. Just reverse the graph with:
GraphView(g, reversed=True)
_______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
_______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de _______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
_______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de _______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
Hi Stephan, On Wed, Nov 24, 2021 at 5:58 PM Monecke, Stephan <stephan.monecke@ds.mpg.de> wrote:
Hi Aleks!
The multiprocessing module is what I'm currently using as a workaround until something other pops up - I just throw it into a mp.Pool.
For the initial debugging / get-to-know phase, so to say.
This is the first time I hear that someone uses multiprocessing for debugging. Usually its the other way around :)
I don't know what exactly you've put into the multiprocessing module but I don't see how this could be of help in my case.
You requested multi-threaded shortest_distance and wanted to compute multiple paths and then get the mean dist? So my suggestion was to throw Pool on shortest_distance(g=G, source=V\B, target=V\B, weights=G.edge_properties['weight'], samples=N) which is trivial to do e.g. def wrap(node): return shortest_distance(g=G, source=node, target=node, weights=G.edge_properties['weight']) res = multiprocessing.Pool().map(wrap, random.choices(V, N)) do_some_processing_on_result(res)... This is not optimal due to various overheads, however from python its the best you can get. But if you know a bit of c++ and want to optimize some more, graph-tool already uses openmp so making a parallel loop over vertices should be easy. Tiago keeps the code very readable and everything is logical so finding a place to make your modification should take much time: https://git.skewed.de/count0/graph-tool/-/blob/master/src/graph/topology/gra...
I'm running a metropolis algorithm that runs hours on a very small graph. Each step depends on the previous. On the graphs I'm actually - or at least - aiming for, it would probably run at least over a week.
I also have a large parameter space and would like to try something like lipo* but same here: Just 100 iterations already approach 2 weeks on the tiny graph and I can't multiprocess it.
You can for sure parallelize parameter opt with dlib. Check the official docs (http://dlib.net/dlib/global_optimization/global_function_search_abstract.h.h...) however its already done in other frameworks: listed by lipo readme: https://optuna.readthedocs.io/en/stable/ I've tested this one (~30 parameters): https://github.com/facebookresearch/nevergrad Good luck, Aleks
Cheers, Stephan
*https://pypi.org/project/lipo/
________________________________ Von: Aleksandar Trifunovic <akstrfn@gmail.com> Gesendet: Mittwoch, 24. November 2021 16:44 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations
Hi Stephan,
I've used graph tool to compute a lot of shortest paths and used a dumb solution of just throwing it into multiprocessing module from python (with some lru_cache tricks). If you have enough memory and you don't call the main procedure often (seems to be the case?) then the overhead of multiprocessing becomes insignificant. A benefit is that this is pretty easy to code and scales perfectly to more cores (until you run out of memory).
Best, Aleks
On Wed, Nov 24, 2021 at 3:32 PM Monecke, Stephan <stephan.monecke@ds.mpg.de> wrote:
Hi Tiago and the list,
my algorithm is still _infeasibly_ slow and > 90 % of the run-time results from shortest_path().
Do you have any idea on how this could be sped up?
I have the graph G, the set of all nodes V and a sub-set B ⊂ V.
I want to calculate or approximate the mean shortest distance between all pairs V\B x V\B but on the complete graph - or in other words, calculate the mean of all pairs shortest distance ( mean(shortest_distance()) ) but exclude n∈B from being source or target.
Sub-sampling would be okay as well.
I currently don't see a way to do this fast with graph-tool besides writing a custom function but that probably exceeds my capacities. Am I missing something? Do you have any ideas?
Ideal would be something like a multi-threaded shortest_distance(g=G, source=V\B, target=V\B, weights=G.edge_properties['weight'], samples=N) with N pairs random sub-sampling and source / target being lists.
Thanks a ton!
Stephan
________________________________ Von: Monecke, Stephan <stephan.monecke@ds.mpg.de> Gesendet: Donnerstag, 18. November 2021 14:59 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations
This works like a charm!
With it, I'm down to 228 ms ± 1.31 ms from 3.6 s ( 82.9 ms ± 2.45 ms for shortest_distance(G) ).
It's still very very expensive as it's 2.75 times APSP but approaching reasonability. The costly remaining part is the manual B x B.*
I tried to speed it up by the approximation to limit G to B since I know that B is connected but graph-tool segfaults right away (I wrote another ticket for that). :/
If anyone else has any ideas on how to speed this up, I'd be very grateful!
Cheers, Stephan
- - - o - - -
*Since the graph is directed, we have the following: Say V is the set of all vertices and B a subset of V I want to exclude as sources / targets from APSP. Then the set of points is (V\B) x (V\B) = [ (V x V) \ [ (V x B) + (B x V) ] ] U (B x B).
Or in graph-tool terms: - (V x V): shortest_distance(G) - (B x V): shortest_distance(G, b, _) for b in B - (V x B): shortest_distance(GraphView(G, reversed=True), b, _) for b in B - (B x B): shortest_distance(G, u, v) for u,v in B x B
Von: Stephan Monecke <smoneck@ds.mpg.de> Gesendet: Mittwoch, 17. November 2021 20:24 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations
🥳 I didn't know that existed!
Thanks a ton, I'm really looking forward to test it out tomorrow!
Cheers, Stephan -- Sent from my mobile device. Please excuse my brevity.
Am 17. November 2021 19:27:37 MEZ schrieb Tiago de Paula Peixoto <tiago@skewed.de>:
Am 17.11.21 um 19:22 schrieb Monecke, Stephan:
The main reason is that shortest_distance() has a "single source, all targets" mode but no "single target, all sources" mode (the reverse) - as soon as "source" is empty, it defaults to APSP, neglecting "target".
Of course it does. Just reverse the graph with:
GraphView(g, reversed=True)
_______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
_______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de _______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
_______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de _______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
Hi Aleks! I thank you for your long answer!
def wrap(node): return shortest_distance(g=G, source=node, target=node, weights=G.edge_properties['weight'])
res = multiprocessing.Pool().map(wrap, random.choices(V, N)) do_some_processing_on_result(res)...
This is not optimal due to various overheads, however from python its the best you can get.
So you basically recommend full APSP manually? In terms of run-time, this is the worst you can get divided by the number of cores - or the worst nightmare of an hpc admin. Just out of curiosity: Even on my small test-graph it's ~ 1200 times slower than full shortest_distance(). I'll have a look at the other resources! Cheers, Stephan Von: Aleksandar Trifunovic <akstrfn@gmail.com> Gesendet: Donnerstag, 25. November 2021 01:16 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations Hi Stephan, On Wed, Nov 24, 2021 at 5:58 PM Monecke, Stephan <stephan.monecke@ds.mpg.de> wrote:
Hi Aleks!
The multiprocessing module is what I'm currently using as a workaround until something other pops up - I just throw it into a mp.Pool.
For the initial debugging / get-to-know phase, so to say.
This is the first time I hear that someone uses multiprocessing for debugging. Usually its the other way around :)
I don't know what exactly you've put into the multiprocessing module but I don't see how this could be of help in my case.
You requested multi-threaded shortest_distance and wanted to compute multiple paths and then get the mean dist? So my suggestion was to throw Pool on shortest_distance(g=G, source=V\B, target=V\B, weights=G.edge_properties['weight'], samples=N) which is trivial to do e.g. def wrap(node): return shortest_distance(g=G, source=node, target=node, weights=G.edge_properties['weight']) res = multiprocessing.Pool().map(wrap, random.choices(V, N)) do_some_processing_on_result(res)... This is not optimal due to various overheads, however from python its the best you can get. But if you know a bit of c++ and want to optimize some more, graph-tool already uses openmp so making a parallel loop over vertices should be easy. Tiago keeps the code very readable and everything is logical so finding a place to make your modification should take much time: https://git.skewed.de/count0/graph-tool/-/blob/master/src/graph/topology/gra...
I'm running a metropolis algorithm that runs hours on a very small graph. Each step depends on the previous. On the graphs I'm actually - or at least - aiming for, it would probably run at least over a week.
I also have a large parameter space and would like to try something like lipo* but same here: Just 100 iterations already approach 2 weeks on the tiny graph and I can't multiprocess it.
You can for sure parallelize parameter opt with dlib. Check the official docs (http://dlib.net/dlib/global_optimization/global_function_search_abstract.h.h...) however its already done in other frameworks: listed by lipo readme: https://optuna.readthedocs.io/en/stable/ I've tested this one (~30 parameters): https://github.com/facebookresearch/nevergrad Good luck, Aleks
Cheers, Stephan
*https://pypi.org/project/lipo/
________________________________ Von: Aleksandar Trifunovic <akstrfn@gmail.com> Gesendet: Mittwoch, 24. November 2021 16:44 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations
Hi Stephan,
I've used graph tool to compute a lot of shortest paths and used a dumb solution of just throwing it into multiprocessing module from python (with some lru_cache tricks). If you have enough memory and you don't call the main procedure often (seems to be the case?) then the overhead of multiprocessing becomes insignificant. A benefit is that this is pretty easy to code and scales perfectly to more cores (until you run out of memory).
Best, Aleks
On Wed, Nov 24, 2021 at 3:32 PM Monecke, Stephan <stephan.monecke@ds.mpg.de> wrote:
Hi Tiago and the list,
my algorithm is still _infeasibly_ slow and > 90 % of the run-time results from shortest_path().
Do you have any idea on how this could be sped up?
I have the graph G, the set of all nodes V and a sub-set B ⊂ V.
I want to calculate or approximate the mean shortest distance between all pairs V\B x V\B but on the complete graph - or in other words, calculate the mean of all pairs shortest distance ( mean(shortest_distance()) ) but exclude n∈B from being source or target.
Sub-sampling would be okay as well.
I currently don't see a way to do this fast with graph-tool besides writing a custom function but that probably exceeds my capacities. Am I missing something? Do you have any ideas?
Ideal would be something like a multi-threaded shortest_distance(g=G, source=V\B, target=V\B, weights=G.edge_properties['weight'], samples=N) with N pairs random sub-sampling and source / target being lists.
Thanks a ton!
Stephan
________________________________ Von: Monecke, Stephan <stephan.monecke@ds.mpg.de> Gesendet: Donnerstag, 18. November 2021 14:59 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations
This works like a charm!
With it, I'm down to 228 ms ± 1.31 ms from 3.6 s ( 82.9 ms ± 2.45 ms for shortest_distance(G) ).
It's still very very expensive as it's 2.75 times APSP but approaching reasonability. The costly remaining part is the manual B x B.*
I tried to speed it up by the approximation to limit G to B since I know that B is connected but graph-tool segfaults right away (I wrote another ticket for that). :/
If anyone else has any ideas on how to speed this up, I'd be very grateful!
Cheers, Stephan
- - - o - - -
*Since the graph is directed, we have the following: Say V is the set of all vertices and B a subset of V I want to exclude as sources / targets from APSP. Then the set of points is (V\B) x (V\B) = [ (V x V) \ [ (V x B) + (B x V) ] ] U (B x B).
Or in graph-tool terms: - (V x V): shortest_distance(G) - (B x V): shortest_distance(G, b, _) for b in B - (V x B): shortest_distance(GraphView(G, reversed=True), b, _) for b in B - (B x B): shortest_distance(G, u, v) for u,v in B x B
Von: Stephan Monecke <smoneck@ds.mpg.de> Gesendet: Mittwoch, 17. November 2021 20:24 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations
🥳 I didn't know that existed!
Thanks a ton, I'm really looking forward to test it out tomorrow!
Cheers, Stephan -- Sent from my mobile device. Please excuse my brevity.
Am 17. November 2021 19:27:37 MEZ schrieb Tiago de Paula Peixoto <tiago@skewed.de>:
Am 17.11.21 um 19:22 schrieb Monecke, Stephan:
The main reason is that shortest_distance() has a "single source, all targets" mode but no "single target, all sources" mode (the reverse) - as soon as "source" is empty, it defaults to APSP, neglecting "target".
Of course it does. Just reverse the graph with:
GraphView(g, reversed=True)
_______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
_______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de _______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
_______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de _______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de _______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
Hi Stephan, On Fri, Nov 26, 2021 at 2:58 PM Monecke, Stephan <stephan.monecke@ds.mpg.de> wrote:
Hi Aleks!
I thank you for your long answer!
def wrap(node): return shortest_distance(g=G, source=node, target=node, weights=G.edge_properties['weight'])
res = multiprocessing.Pool().map(wrap, random.choices(V, N)) do_some_processing_on_result(res)...
This is not optimal due to various overheads, however from python its the best you can get.
So you basically recommend full APSP manually?
In terms of run-time, this is the worst you can get divided by the number of cores - or the worst nightmare of an hpc admin.
I made a typo so it should be `random.choices(V/B, N)`, but typo or not I don't see how is this APSP? I am unsure which part gives HPC admin nightmares, but I'll trust you that this is a bad HPC solution.
Just out of curiosity: Even on my small test-graph it's ~ 1200 times slower than full shortest_distance().
I've missed from your email that you have such a short execution time so definitely multiprocessing makes no sense, sorry for the wasted time. "Even on" in your sentence tells me you don't understand how this works so I'll explain. multiprocessing does binary serialization and deserialization so small problems with short execution time are the worst possible case (exactly your test case). Additionally I hope that you didn't run this example on a machine with a very high core count since that would only exacerbate the problem... In general doing micro optimizations from Python has a very narrow range where its useful, since Python is too slow in comparison to C++. So you'll likely have to write some C++ if you need to go even faster. Have fun, Aleks
I'll have a look at the other resources!
Cheers, Stephan
Von: Aleksandar Trifunovic <akstrfn@gmail.com> Gesendet: Donnerstag, 25. November 2021 01:16 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations
Hi Stephan,
On Wed, Nov 24, 2021 at 5:58 PM Monecke, Stephan <stephan.monecke@ds.mpg.de> wrote:
Hi Aleks!
The multiprocessing module is what I'm currently using as a workaround until something other pops up - I just throw it into a mp.Pool.
For the initial debugging / get-to-know phase, so to say.
This is the first time I hear that someone uses multiprocessing for debugging. Usually its the other way around :)
I don't know what exactly you've put into the multiprocessing module but I don't see how this could be of help in my case.
You requested multi-threaded shortest_distance and wanted to compute multiple paths and then get the mean dist? So my suggestion was to throw Pool on shortest_distance(g=G, source=V\B, target=V\B, weights=G.edge_properties['weight'], samples=N) which is trivial to do e.g.
def wrap(node): return shortest_distance(g=G, source=node, target=node, weights=G.edge_properties['weight'])
res = multiprocessing.Pool().map(wrap, random.choices(V, N)) do_some_processing_on_result(res)...
This is not optimal due to various overheads, however from python its the best you can get. But if you know a bit of c++ and want to optimize some more, graph-tool already uses openmp so making a parallel loop over vertices should be easy. Tiago keeps the code very readable and everything is logical so finding a place to make your modification should take much time: https://git.skewed.de/count0/graph-tool/-/blob/master/src/graph/topology/gra...
I'm running a metropolis algorithm that runs hours on a very small graph. Each step depends on the previous. On the graphs I'm actually - or at least - aiming for, it would probably run at least over a week.
I also have a large parameter space and would like to try something like lipo* but same here: Just 100 iterations already approach 2 weeks on the tiny graph and I can't multiprocess it.
You can for sure parallelize parameter opt with dlib. Check the official docs (http://dlib.net/dlib/global_optimization/global_function_search_abstract.h.h...) however its already done in other frameworks: listed by lipo readme: https://optuna.readthedocs.io/en/stable/ I've tested this one (~30 parameters): https://github.com/facebookresearch/nevergrad
Good luck, Aleks
Cheers, Stephan
*https://pypi.org/project/lipo/
________________________________ Von: Aleksandar Trifunovic <akstrfn@gmail.com> Gesendet: Mittwoch, 24. November 2021 16:44 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations
Hi Stephan,
I've used graph tool to compute a lot of shortest paths and used a dumb solution of just throwing it into multiprocessing module from python (with some lru_cache tricks). If you have enough memory and you don't call the main procedure often (seems to be the case?) then the overhead of multiprocessing becomes insignificant. A benefit is that this is pretty easy to code and scales perfectly to more cores (until you run out of memory).
Best, Aleks
On Wed, Nov 24, 2021 at 3:32 PM Monecke, Stephan <stephan.monecke@ds.mpg.de> wrote:
Hi Tiago and the list,
my algorithm is still _infeasibly_ slow and > 90 % of the run-time results from shortest_path().
Do you have any idea on how this could be sped up?
I have the graph G, the set of all nodes V and a sub-set B ⊂ V.
I want to calculate or approximate the mean shortest distance between all pairs V\B x V\B but on the complete graph - or in other words, calculate the mean of all pairs shortest distance ( mean(shortest_distance()) ) but exclude n∈B from being source or target.
Sub-sampling would be okay as well.
I currently don't see a way to do this fast with graph-tool besides writing a custom function but that probably exceeds my capacities. Am I missing something? Do you have any ideas?
Ideal would be something like a multi-threaded shortest_distance(g=G, source=V\B, target=V\B, weights=G.edge_properties['weight'], samples=N) with N pairs random sub-sampling and source / target being lists.
Thanks a ton!
Stephan
________________________________ Von: Monecke, Stephan <stephan.monecke@ds.mpg.de> Gesendet: Donnerstag, 18. November 2021 14:59 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations
This works like a charm!
With it, I'm down to 228 ms ± 1.31 ms from 3.6 s ( 82.9 ms ± 2.45 ms for shortest_distance(G) ).
It's still very very expensive as it's 2.75 times APSP but approaching reasonability. The costly remaining part is the manual B x B.*
I tried to speed it up by the approximation to limit G to B since I know that B is connected but graph-tool segfaults right away (I wrote another ticket for that). :/
If anyone else has any ideas on how to speed this up, I'd be very grateful!
Cheers, Stephan
- - - o - - -
*Since the graph is directed, we have the following: Say V is the set of all vertices and B a subset of V I want to exclude as sources / targets from APSP. Then the set of points is (V\B) x (V\B) = [ (V x V) \ [ (V x B) + (B x V) ] ] U (B x B).
Or in graph-tool terms: - (V x V): shortest_distance(G) - (B x V): shortest_distance(G, b, _) for b in B - (V x B): shortest_distance(GraphView(G, reversed=True), b, _) for b in B - (B x B): shortest_distance(G, u, v) for u,v in B x B
Von: Stephan Monecke <smoneck@ds.mpg.de> Gesendet: Mittwoch, 17. November 2021 20:24 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations
🥳 I didn't know that existed!
Thanks a ton, I'm really looking forward to test it out tomorrow!
Cheers, Stephan -- Sent from my mobile device. Please excuse my brevity.
Am 17. November 2021 19:27:37 MEZ schrieb Tiago de Paula Peixoto <tiago@skewed.de>:
Am 17.11.21 um 19:22 schrieb Monecke, Stephan:
The main reason is that shortest_distance() has a "single source, all targets" mode but no "single target, all sources" mode (the reverse) - as soon as "source" is empty, it defaults to APSP, neglecting "target".
Of course it does. Just reverse the graph with:
GraphView(g, reversed=True)
_______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
_______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de _______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
_______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de _______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
_______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
_______________________________________________ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-leave@skewed.de
participants (4)
-
Aleksandar Trifunovic -
Monecke, Stephan -
Stephan Monecke -
Tiago de Paula Peixoto